Until 2006, the only APIs available for programming graphics cards were OpenGL and Direct3D, both designed exclusively for producing graphical output. No widely adopted GPGPU API existed until NVIDIA released CUDA. The CUDA name originally stood for Compute Unified Device Architecture.

CUDA supports a rich set of compiler directives, programming language extensions, and run-time libraries that can be used with the C/C++ and Fortran programming languages. CUDA has several advantages:

  • CUDA is relatively easy to learn for most experienced programmers.
  • CUDA exposes special hardware features, such as shared memory and on-chip registers, that are not accessible through the graphics APIs.
  • Programs can execute on any number of multiprocessors, enabling automatic scaling to the available GPU resources. The same code runs efficiently whether the GPU has few multiprocessors or many, so allocated hardware is not wasted.
  • Large amounts of data can be processed in parallel. With a proper CUDA implementation, a program can achieve speedups of 100x or more over sequential execution on a single CPU core. Domains such as molecular biology research, physics simulation, machine learning, and image rendering and processing benefit greatly from GPUs because of this massive parallelism.
  • CUDA-accelerated libraries provide common building blocks for applications, such as Fast Fourier Transforms (cuFFT) and BLAS routines (cuBLAS), that are accelerated on CUDA GPUs.
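
To make the scaling point above concrete, here is a minimal sketch (not taken from this text) of a CUDA vector-addition kernel. It uses a grid-stride loop, a common idiom that lets the same kernel run correctly on any number of multiprocessors, and managed (unified) memory to keep the host code short; the names `vecAdd`, `n`, and the launch configuration are illustrative choices, not part of any particular application.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread starts at its global index and strides by the total
// number of threads in the grid, so the kernel handles any n with
// any grid size -- this is what lets execution scale automatically
// across however many multiprocessors the GPU provides.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    for (int i = blockIdx.x * blockDim.x + threadIdx.x;
         i < n;
         i += gridDim.x * blockDim.x) {
        c[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 1 << 20;               // one million elements
    const size_t bytes = n * sizeof(float);

    // Managed memory is accessible from both host and device,
    // avoiding explicit cudaMemcpy calls in this small example.
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();             // wait for the kernel to finish

    printf("c[0] = %f\n", c[0]);         // each element should be 3.0

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return 0;
}
```

Note that the launch configuration (256 threads per block here) is a tuning choice; because of the grid-stride loop, correctness does not depend on it.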
 
© Cornell University | Center for Advanced Computing
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)