GPU
GPUs offer many cores, but narrow instruction set and lower clock speed
VS video card: Video Card generated feed of output images; has a GPU in its core. Also usually has dedicated RAM (unlike integrated video cards that share RAM with the rest of the system).
GPUs offer many cores, but narrow instruction set and lower clock speed
Major brands: NVIDIA (GeeForce), AMD Radeon, Intel.
CUDA
CUDA - NVIDIA's platform and API for accessing their GPU's instruction sets. CUDA-powered GPUs also support open standards like OpenMP or OpenCL.
Natively supports C, C++, Fortran. Third-party wrappers exist for Python, Ruby and many other languages.
Parallelize functions by making a function doing only a part of the job depending on its thread id.
Threads are grouped into blocks; blocks form a grid.
Lower-level Driver API and higher-level Runtime API.
DL
Thinks like convolution or even matrix multiplication can easily be parallelized to run much faster on GPU.
Last updated