How OpenMP Works
In its most basic form, OpenMP introduces parallelism into applications by launching a set of threads that execute portions of a code concurrently. There are mechanisms, described below, that determine how many threads are launched and which portions of your code and data to delegate to each thread. The threads that are launched on Stampede2 are pthreads, no different from the kind that you could launch explicitly with pthread_create().
When a parallel region is entered as a result of OpenMP directives, a thread team is created ("forked") by the primary thread, as pictured below.
Once the parallel region ends, synchronization can occur as part of a join. This is known as fork-join parallelism. Note that threads remain available for use after a parallel region, which allows for faster "forks" in the future.
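The fork-join pattern described above can be sketched in C. This is a minimal illustration rather than code from this topic: the function name `count_team_members` is hypothetical, and the team size depends on how the program is built and run.

```c
/* Hypothetical helper: enters one parallel region and returns the number
   of threads that executed it. No OpenMP header is needed because only
   directives are used, not OpenMP runtime library calls. */
int count_team_members(void)
{
    int members = 0;

    /* Fork: on entry to the parallel region, the primary thread
       creates a team of threads. */
    #pragma omp parallel
    {
        /* Every thread in the team executes this block once.
           The atomic directive makes the shared update safe. */
        #pragma omp atomic
        members++;
    }
    /* Join: the region ends with an implicit barrier, after which
       only the primary thread continues. */

    return members;
}
```

Calling `count_team_members()` from `main` returns the team size when the code is compiled with OpenMP enabled, and 1 when the directives are ignored and the function runs serially.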
While not covered in this topic, OpenMP 4.0 and 5.0 include other directives to control SIMD parallelism and the offloading of tasks to an attached accelerator. Here we focus on the multithreading directives that are available in all commonly installed versions of OpenMP.
In all versions of OpenMP, a directive consists of a sentinel, a directive name, and its clauses, if any. Depending on how the code is compiled, the sentinel is interpreted as the beginning of either a comment or an OpenMP directive.
The syntax of valid OpenMP sentinels is:
| C | Fortran 77 Fixed Format | Fortran Free Format |
|---|---|---|
| #pragma omp | !$OMP | !$OMP |
| | C$OMP | |
| | *$OMP | |
The compiler flag that determines whether the OpenMP directives are interpreted by the compiler is -fopenmp for the GNU compilers and -qopenmp for the Intel compilers.
Because OpenMP directives are inserted into your source code as comments (Fortran) or pragmas (C), a compiler that is unaware of OpenMP, or that hasn't been given the appropriate flag, will ignore the directives and compile your code as an ordinary serial program.
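As a small sketch of this behavior, the C code below gives the same result whether or not the OpenMP flag is supplied. The function names `built_with_openmp` and `sum_to` are hypothetical, chosen only for illustration. The `_OPENMP` macro is defined by compilers when OpenMP is enabled, so it can be used to guard anything that exists only in an OpenMP build.

```c
#ifdef _OPENMP
#include <omp.h>   /* only included when OpenMP is enabled */
#endif

/* Hypothetical helper: returns 1 if this file was compiled with
   OpenMP enabled, 0 if the directives were ignored. */
int built_with_openmp(void)
{
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: sums the integers 1..n.
   With OpenMP enabled, the loop iterations are divided among the
   threads and the partial sums are combined by the reduction clause.
   Without it, the pragma is ignored and the loop runs serially. */
long sum_to(int n)
{
    long total = 0;

    #pragma omp parallel for reduction(+:total)
    for (int i = 1; i <= n; i++)
        total += i;

    return total;
}
```

Compiling this file with and without the flag (for example, `-fopenmp` with a GNU compiler) should change the value returned by `built_with_openmp()` but not the value returned by `sum_to()`.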