Memory Latency Hiding in CUDA using Streams (CUDA Adventures - 1)

I’m starting a new CUDA project to deepen my understanding of GPU acceleration. I’ll begin with simple tasks like vector addition and move on to more involved projects, including image processing and language model optimizations. While this series won’t be a step-by-step tutorial, I’ll share the interesting parts of my implementations, highlighting the challenges I faced and the reasoning behind my decisions. For the complete code, feel free to check out the project repository.

MultiProcessing in Python to Speed up your Data Science

When you work with large datasets in Python, processing often takes too long, which slows down your analysis workflow and hurts productivity. To cut your code’s running time, you’ll eventually consider parallelization. In this article, we’ll explore how to use parallelization in Python to accelerate your data science work.
