Chris Myers, Steve Lantz
Cornell Center for Advanced Computing

Revisions: 4/2024, 1/2022, 2/2021 (original)

The computational components of an advanced HPC cluster can be arranged into a hierarchy of three levels: nodes, cores, and vector processing units (VPUs). These three levels can also be visualized as three different dimensions in space. For effective parallelization of an application, all three dimensions may turn out to be comparable in importance, so each should be given due attention. However, the parallelization strategy is different in each case. Typically one employs message passing among nodes, multithreading among cores, and auto-vectorization for the VPUs. We consider each of these dimensions in turn and present a short example code that promotes scaling along the corresponding axis.

Objectives

After you complete this topic, you should be able to:

  • Explain how CPUs participate in both shared and distributed memory parallelism
  • Describe the role of SIMD in computation
  • Define the terms scaling out, scaling up, and scaling deep
  • Compile a simple MPI code and run it across multiple nodes
  • Compile a hybrid OpenMP/MPI code and run it using multiple threads across multiple nodes
  • Identify compiler flags that enable auto-vectorization
  • Describe the features of a code that allow it to auto-vectorize well
  • Explain how memory bottlenecks act to inhibit peak vector performance

Prerequisites

  • Familiarity with High Performance Computing (HPC) concepts. Those who are less conversant with HPC terms and techniques should be prepared to consult the glossary frequently. It may also be helpful to review the Cornell Virtual Workshop content on Parallel Programming Concepts and High-Performance Computing, and on either MPI or OpenMP.
  • Programming experience in C or Fortran. Introductions to C and Fortran are available, though the reader will need to look elsewhere for a full tutorial on these languages.
  • Readers who need an introduction to either Stampede3 or Frontera will find it helpful to first review one or more of the following items: the Stampede3 User Guide, the Frontera User Guide, and the Getting Started on Frontera CVW material.
 