CUDA Higher-Level Constructs
Zilu Wang and Steve Lantz
Cornell Center for Advanced Computing
8/2025 (original)
Up to this point, we have introduced the fundamentals of CUDA, covering enough of the concepts to create a functioning CUDA program. Real-world CUDA programs are typically more complex, often involving multiple concurrent kernel launches and data transfers. In this topic, we will explore higher-level constructs that streamline kernel programming and enable concurrent operations.
Objectives
After you complete this topic, you should be able to:
- Understand the dimensional structure of thread blocks and grids
- Use concurrent streams for asynchronous operations
- Apply synchronization in the appropriate contexts
Prerequisites
This topic covers basic CUDA programming and its connection to GPU architecture using the C programming language. A working knowledge of C/C++ and some understanding of parallel computing are necessary. If you lack either, you may want to complete "An Introduction to C Programming" and "Parallel Programming Concepts and High-Performance Computing" before beginning this topic. GPU terms are explained in the context of CUDA programming, but the specifics of GPU architecture are not covered here; to learn more about those, you may want to complete "Understanding GPU Architecture". No prior experience with CUDA programming or GPUs is assumed.
Should you need an in-depth reference, NVIDIA provides complete documentation for CUDA. Visit their website to see the latest versions of the NVIDIA CUDA Runtime API reference and the CUDA C++ Programming Guide.
The Frontera User Guide and Vista User Guide have just a few short sections on GPUs with information on node types, job submission, and machine learning software. If you're on Frontera or Vista, be sure to load the CUDA module before compiling any programs. The following command will load the most recently installed CUDA version on Frontera or Vista:
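As a minimal sketch, assuming the module is named `cuda` on these systems and that loading it without a version number selects the default (most recently installed) version, the command would look like the first one below. The two follow-up commands are optional checks added here for illustration.

```shell
# Assumption: the CUDA module is named "cuda" on Frontera and Vista.
# Loading it without a version number picks the site's default version.
module load cuda

# Optional checks (added for illustration):
module list       # confirm that a cuda module is now loaded
nvcc --version    # confirm that the CUDA compiler is on your PATH
```

To pin a particular release instead of the default, `module spider cuda` lists the versions available on the system.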
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)