What is CUDA?

Compute Unified Device Architecture (CUDA) is a parallel computing platform and programming API created by NVIDIA for general-purpose computing on its GPUs (GPGPU). CUDA supports a number of high-level programming languages, such as C and Fortran, and it includes a large set of performance libraries that ease the task of GPU programming. It provides a means of using a heterogeneous computing environment, in which users perform serial or coarse-grained parallel computing on CPUs while offloading massively parallel tasks to GPUs.
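To make the CPU/GPU division of labor concrete, the following is a minimal sketch of element-wise vector addition in CUDA C/C++. It is an illustration rather than code taken from this module: the names vecAdd and runVecAdd, the input values, and the block size of 256 threads are all assumptions chosen for the example. The CPU prepares the data serially, the GPU performs the addition with one thread per element, and the CPU checks the result.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Kernel (hypothetical name): runs on the GPU, one thread per element.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard the last, partial block
        c[i] = a[i] + b[i];
}

// Host helper (hypothetical name): returns true if the GPU result matches
// the expected values. Assumes n is small enough that 3*i is exact in float.
bool runVecAdd(int n)
{
    size_t bytes = (size_t)n * sizeof(float);
    float *hA = (float *)malloc(bytes);
    float *hB = (float *)malloc(bytes);
    float *hC = (float *)malloc(bytes);
    if (!hA || !hB || !hC) { free(hA); free(hB); free(hC); return false; }

    // Serial work on the CPU: initialize the inputs.
    for (int i = 0; i < n; ++i) {
        hA[i] = (float)i;
        hB[i] = 2.0f * (float)i;
    }

    // Allocate GPU memory; cudaFree(NULL) is a no-op, so cleanup stays simple.
    float *dA = NULL, *dB = NULL, *dC = NULL;
    bool ok = cudaMalloc((void **)&dA, bytes) == cudaSuccess &&
              cudaMalloc((void **)&dB, bytes) == cudaSuccess &&
              cudaMalloc((void **)&dC, bytes) == cudaSuccess;

    // Copy the inputs from host (CPU) memory to device (GPU) memory.
    ok = ok && cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    ok = ok && cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        // Offload the massively parallel part: enough blocks to cover n elements.
        int threads = 256;                          // assumed block size
        int blocks = (n + threads - 1) / threads;
        vecAdd<<<blocks, threads>>>(dA, dB, dC, n);
        ok = cudaGetLastError() == cudaSuccess;
    }

    // Copy the result back; this call waits for the kernel to finish.
    ok = ok && cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    // Serial check on the CPU: each element should equal i + 2i = 3i.
    for (int i = 0; ok && i < n; ++i)
        if (hC[i] != 3.0f * (float)i)
            ok = false;

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return ok;
}

int main()
{
    printf("%s\n", runVecAdd(1 << 16) ? "PASSED" : "FAILED");
    return 0;
}
```

Assuming the file is saved as vecadd.cu (a hypothetical name), it can be compiled with NVIDIA's nvcc compiler, for example nvcc vecadd.cu -o vecadd, and run on a machine with a CUDA-capable GPU. The explicit host-to-device and device-to-host copies reflect the separate CPU and GPU memories in this simple model.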

About this module

Because GPU hardware is optimized for highly parallel applications, the programming model differs substantially from the traditional serial programming model for CPUs. If you have no prior parallel programming experience, some CUDA programming concepts and techniques can be confusing. The goal of this module, however, is only to expose you to the basic CUDA programming concepts and techniques, so no in-depth parallel programming experience is required. Most GPU programming concepts and syntax are explained in the context of the module or referenced to an external source.

In this module, we briefly cover general GPU topics such as hardware architecture and application speedup, then introduce CUDA programming and performance optimization. Finally, lists of the C extensions (qualifiers, functions, and mathematical intrinsic functions) are provided for reference.

Philip Nee
Cornell Center for Advanced Computing

July 2013

Reference Bibliography:
Kirk, David B., and Hwu, Wen-mei W. "Programming Massively Parallel Processors: A Hands-on Approach", Second Edition. Morgan Kaufmann, December 28, 2012.