The standard techniques for creating parallel programs assume a uniform architecture with a uniform memory model. As a consequence, they fall into one of two categories:

  • Threads for shared memory (see the first sketch below)
    • parent process uses pthreads or OpenMP to fork multiple threads
    • threads share the same virtual address space
    • also known as SMP = Symmetric Multiprocessing
  • Message passing for distributed memory (see the second sketch below)
    • processes use MPI to pass messages (data) to one another
    • each process has its own virtual address space
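
To make the contrast concrete, here is a minimal sketch of the shared-memory model: an OpenMP program that forks a team of threads, all sharing a single address space. (The file name is illustrative; compile with an OpenMP flag such as gcc -fopenmp hello_omp.c.)

    /* hello_omp.c -- minimal sketch of shared-memory threading with OpenMP */
    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        /* fork a team of threads; all share this process's address space */
        #pragma omp parallel
        {
            printf("Hello from thread %d of %d\n",
                   omp_get_thread_num(), omp_get_num_threads());
        }
        return 0;
    }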
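
And a corresponding minimal sketch of message passing: because each MPI process owns a separate address space, rank 0 must explicitly send data for rank 1 to see it. (Again the file name is illustrative; build with mpicc and run with at least two processes, e.g., mpiexec -n 2 ./hello_mpi.)

    /* hello_mpi.c -- minimal sketch of message passing with MPI */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, value = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* each rank has its own address space, so rank 0 must
               explicitly send its data to rank 1 */
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("Rank 1 received %d from rank 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }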

In contrast, HPC cluster architectures are necessarily characterized by NUMA (Non-Uniform Memory Access): global memory is distributed across the nodes of the cluster, yet shared within each node (though it is divided among the sockets of a node, so even local access times are non-uniform). How do we deal with this situation? We attempt to combine both types of techniques in a hybrid strategy:

  • Hybrid programming
    • try to exploit the whole shared/distributed memory hierarchy

In the Creating Hybrid Configurations topic, we examine ways to write parallel programs that use OpenMP to take advantage of shared memory at the node level, while incorporating MPI to transmit data between nodes (or perhaps between sockets or cores on the same node).
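
As a preview of what such a hybrid program looks like, here is a minimal sketch: each MPI process forks a team of OpenMP threads that share memory locally, while MPI carries data between processes. Note that MPI_Init_thread replaces MPI_Init so the program can request a thread-support level; MPI_THREAD_FUNNELED means only the master thread will make MPI calls. (The file name is illustrative; compile with something like mpicc -fopenmp hello_hybrid.c.)

    /* hello_hybrid.c -- minimal sketch combining MPI and OpenMP */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int provided, rank;
        /* MPI_THREAD_FUNNELED: only the thread that called
           MPI_Init_thread will make MPI calls */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* within each MPI process, OpenMP threads share memory;
           across processes, data must travel via MPI messages */
        #pragma omp parallel
        {
            printf("Rank %d, thread %d of %d\n",
                   rank, omp_get_thread_num(), omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }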
