Besides offering thread parallelism on the CPU, OpenMP has supported offloading to GPU devices since version 4.0. With the target directive and its associated constructs, you can parallelize C, C++, and Fortran code for GPUs while keeping the code portable. This roadmap covers the data mapping and parallelization directives used on the device, as well as asynchronous offloading and data transfers.

Objectives

After you complete this roadmap, you should be able to:

  • Offload computation to GPUs with the target directive
  • Control data transfer between host and device with the map clause
  • Parallelize work on the device with teams, distribute, and loop
  • Manage data persistence across regions with target data
  • Compile functions for the device with declare target
  • Initiate asynchronous offloading and data transfer with the nowait clause

Prerequisites

  • A working knowledge of general programming concepts
  • A working knowledge of Linux; otherwise, try working through the Linux topic first
  • Ability to program in a high-level language such as Fortran or C
  • A basic familiarity with parallel programming concepts and OpenMP

Requirements

There are no requirements for this roadmap.

©  |   Cornell University    |   Center for Advanced Computing    |   Copyright Statement    |   Access Statement
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)