Roadmap: GPU Migration and Portability
We explore basic ways of moving computational work and its associated data from CPUs to GPUs for codes that are well suited to GPU architectures. After describing the host-device execution model, we compare some capabilities of the CUDA and OpenMP programming models through simple C++ code examples. Then we survey additional software tools and frameworks that enable portability across different types of GPUs in various languages, to see what is available for writing code that runs on heterogeneous platforms.
Objectives
After you complete this roadmap, you should be able to:
- Describe the kinds of computing tasks that are well suited to GPUs as opposed to CPUs
- Identify the roles of the CPU and GPU in kernel execution, memory allocation, and data transfers
- Write and execute simple C++ programs that offload computations to the GPU using CUDA and OpenMP
- Compare different code portability solutions (e.g., CUDA, OpenMP, HIP, SYCL, Kokkos, Alpaka) in terms of performance and ease of implementation
- Name common GPU strategies for Python developers
- Evaluate the trade-offs between performance portability and development complexity in research applications
Prerequisites
- Familiarity with High Performance Computing (HPC) concepts could be helpful, but most terms are explained in context.
- The roadmaps Parallel Programming Concepts and High-Performance Computing and Understanding GPU Architecture are possible companions to this topic, for those who wish to expand their knowledge of parallel computing and GPUs, respectively.