Kingsley Chukwu, PhD.

CUDA Programming. C++/C Programming. CUDA Streams. Asynchronous and synchronous data transfer.

Runtime comparison of asynchronous and synchronous data transfer for a reduction kernel.

CUDA Programming. C++/C Programming. Shared and Global Memory. CUDA Warps.

GPU microbenchmarking was used to identify the group of threads that form a warp.

CUDA Programming. C++/C Programming. CUDA Kernel Profiling. Memory Hierarchy. CUDA Pipelines.

CUDA kernel development and optimization for Euclidean distance matrix calculations.

CUDA Programming. C++/C Programming. PyTorch. Python Programming.

A CUDA implementation of a three-layer neural network was compared with a PyTorch implementation.

Blogs