
Compiling for GPU Offload

To compile code with OpenMP offloading to NVIDIA GPUs, you need to specify the following compiler-specific flags, as recommended by TACC:

Compiler flags for OpenMP offloading to NVIDIA GPUs
| Compiler | Commands | Offload flags |
| --- | --- | --- |
| NVIDIA HPC Compilers | `nvc`, `nvc++`, `nvfortran` | `-mp=gpu` |
| GCC | `gcc`, `g++`, `gfortran` | `-fopenmp -foffload=nvptx-none` |

When compiling with `nvc`, it is recommended to specify the target architecture with `-gpu=cc100` on Horizon or `-gpu=cc90` on Vista. If this option is omitted and the target GPU is not present at compile time (e.g., on a login node), `nvc` compiles the code for a broad range of compute capabilities, producing a larger executable. Targeting only the Blackwell architecture (compute capability 10.0) found on Horizon, or the Hopper architecture (compute capability 9.0) found on Vista, keeps the build smaller.

For GCC, the above flags only work if NVIDIA GPU support was enabled when GCC was built. You can check this with the command `gcc -v 2>&1 | grep "offload"`. If NVIDIA support was enabled, you should see `--enable-offload-targets=nvptx-none` included in the output. At TACC, this is true for all GCC versions on Horizon and Vista, but not on older clusters.

`OMP_TARGET_OFFLOAD`

The `OMP_TARGET_OFFLOAD` environment variable controls what happens when a target region is encountered at runtime. It accepts three values: `default`, `disabled`, and `mandatory`. With `default`, which is also the behavior when the variable is unset, OpenMP offloads to the device if one is available and falls back to the host otherwise. With `disabled`, target regions run on the host. With `mandatory`, the program aborts if offloading is not possible.


© Cornell University, Center for Advanced Computing
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)