
Offload Parallelism

Zilu Wang, Steve Lantz
Cornell Center for Advanced Computing

4/2026 (original)

OpenMP 4.0 introduced two more levels of parallelism through the simd and target constructs. The simd construct lets a single thread operate on multiple data elements simultaneously using vector instructions, while the target construct offloads a region of code to a device such as a GPU. Each level of parallelism offers different performance benefits depending on the hardware resources available to the program. In this topic, we will cover the process of offloading computation to GPUs through the target directive.

Objectives

After you complete this topic, you should be able to:

Offload computation to GPUs with the target directive
Parallelize work on the device with teams, distribute, and loop

Prerequisites

A working knowledge of general programming concepts
A working knowledge of Linux; otherwise, try working through the Linux topic first
Ability to program in a high-level language such as Fortran or C
A basic familiarity with parallel programming concepts and OpenMP

© Cornell University, Center for Advanced Computing
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)