You will use the NVIDIA nvcc compiler to compile CUDA code written in C or C++. The nvcc compiler is suffix-sensitive: it decides how to treat each source file based on its file extension, so be sure to use the proper one. The following table shows common source file suffixes recognized by nvcc. If your device code is short (as it is in the lab exercises in this roadmap), you can put your host and device code (CPU and GPU code) in a single .cu file and compile it with nvcc.

Table: File extensions recognized by nvcc.

Extension          Type of File
.cu                CUDA source file, containing host and device code
.c                 C source code
.cc, .cxx, .cpp    C++ source code
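
To make the single-file approach concrete, here is a minimal sketch that writes a tiny CUDA program to a .cu file and compiles it. The file name hello.cu and the kernel name hello are hypothetical, chosen only for illustration. The compile step runs only if nvcc is found on your PATH.

```shell
# Write a tiny single-file CUDA program; hello.cu is a hypothetical name.
cat > hello.cu <<'EOF'
#include <cstdio>

// Device code: runs on the GPU.
__global__ void hello() {
    printf("Hello from thread %d\n", threadIdx.x);
}

// Host code: runs on the CPU and launches the kernel.
int main() {
    hello<<<1, 4>>>();        // one block of four threads
    cudaDeviceSynchronize();  // wait for the kernel and its output to finish
    return 0;
}
EOF

# The .cu suffix tells nvcc to treat the file as CUDA source.
if command -v nvcc >/dev/null 2>&1; then
    nvcc -o hello hello.cu
else
    echo "nvcc not found; load a CUDA module first" >&2
fi
```

If CUDA code must live in a file with a different suffix, such as .cpp, nvcc's -x cu option asks the compiler to treat the input as CUDA source regardless of its extension.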

The conventional compiler flags can be used to customize compilation. For example, -O sets the optimization level, and the -pg option allows you to instrument the executable for use with the gprof profiler. For more details, see the official nvcc documentation.
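
As a sketch of how these flags combine, the following builds an instrumented executable. The file name mycode.cu is hypothetical, and the choice of -O2 is only an example of an optimization level.

```shell
# mycode.cu is a hypothetical file name; replace it with your own source file.
SRC=mycode.cu
PROFILE_FLAGS="-O2 -pg"   # -O2: optimization level; -pg: instrument for gprof

if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # $PROFILE_FLAGS is intentionally unquoted so that it splits into two flags.
    nvcc $PROFILE_FLAGS -o mycode "$SRC"
else
    echo "Skipping build: nvcc or $SRC not found" >&2
fi
```

Running an executable built this way should write a gmon.out file to the current directory, which gprof can then read (for example, gprof ./mycode gmon.out). Keep in mind that gprof reports on the host (CPU) side of the program.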

In this roadmap, all lab exercises can be compiled using the following command and compiler flags.
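
A command of this form might look like the following sketch. The names mycode.cu and mycode are hypothetical placeholders for your own source file and executable; the -arch, -code, and -o flags are the ones discussed in this section.

```shell
# mycode.cu and mycode are hypothetical names; substitute your own.
SRC=mycode.cu
EXE=mycode

if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    nvcc -arch=compute_75 -code=sm_75 -o "$EXE" "$SRC"
else
    echo "Skipping build: nvcc or $SRC not found" >&2
fi
```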

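In this command, -arch=compute_75 names the virtual architecture (the PTX intermediate code) and -code=sm_75 names the real GPU architecture for which machine code is generated. Together, they instruct the compiler to build for Compute Capability 7.5, which corresponds to Turing-based NVIDIA GPUs. This is the family that the Quadro RTX 5000 GPUs on Frontera belong to, for example. The -o flag specifies the name of the output executable.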

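While it's not necessary to specify the -arch and -code flags with the Compute Capability (CC) of your hardware, doing so allows nvcc to generate low-level instructions tailored to that hardware, which can reduce execution time. Note that machine code generated for sm_75 is binary-compatible only with devices of the same major CC and an equal or higher minor revision. To run on GPUs of a later generation, the executable must also contain PTX code (for example, by adding compute_75 to the -code list), which the driver compiles for the newer device at run time.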
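
If compatibility with GPUs of a higher CC matters to you, one option is to embed PTX code alongside the machine code by listing both a real and a virtual architecture in the -code flag. The sketch below assumes the same hypothetical file names, mycode.cu and mycode, and is one possible variation rather than the command used for the lab exercises.

```shell
# mycode.cu and mycode are hypothetical names; substitute your own.
SRC=mycode.cu
# sm_75 embeds machine code for CC 7.5; compute_75 also embeds PTX,
# which the driver can compile at run time for a newer GPU.
CODE_LIST="sm_75,compute_75"

if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    nvcc -arch=compute_75 -code="$CODE_LIST" -o mycode "$SRC"
else
    echo "Skipping build: nvcc or $SRC not found" >&2
fi
```

The trade-off is a somewhat larger executable and, on newer GPUs, a one-time compilation delay when the program first runs.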

Note that on Frontera, Vista, and similar HPC systems, you must first load the correct environment module to gain access to the nvcc compiler. The following command loads the CUDA 12.2 module on Frontera or the CUDA 12.5 module on Vista:
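
A sketch of that step follows. The module names cuda/12.2 and cuda/12.5 are assumptions inferred from the versions named above; running module avail cuda shows the names that your system actually provides.

```shell
# Module names are assumed from the CUDA versions mentioned in the text.
CUDA_MODULE=cuda/12.2     # on Frontera; on Vista, use cuda/12.5 instead

if command -v module >/dev/null 2>&1; then
    module load "$CUDA_MODULE"
    nvcc --version        # confirm that nvcc is now on your PATH
else
    echo "No module command found; nvcc must already be on your PATH" >&2
fi
```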

 
© Cornell University | Center for Advanced Computing
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)