Parallel Construct
In codes that use OpenMP primarily for multithreading, the first OpenMP construct that typically appears is the parallel construct. An OpenMP construct consists of an OpenMP directive together with the executable code that is affected by it; the latter is referred to as a region. New threads are started when the parallel directive is encountered. These threads continue to exist until the parallel region for which they were started comes to an end. In C/C++, a parallel region is delimited by a block of code enclosed in curly braces { … }; in Fortran, it is terminated by an END PARALLEL directive.
Whatever code is enclosed in a parallel region will be executed by every thread in the current team, exactly as written. This sounds like it would simply lead to a wasteful duplication of effort! However, as we will see, OpenMP provides additional constructs and functions to ensure that every thread can indeed do different, useful work in parallel. To ensure proper coordination among the threads, it is illegal to branch into or out of a parallel region; likewise, if a thread should happen to terminate in a parallel region, all the rest of the threads will be killed.
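To make the "every thread executes the same code" idea concrete, here is a minimal C sketch. The function name `record_thread_ids` and its parameters are hypothetical, chosen only for illustration; `omp_get_thread_num()` and `omp_get_num_threads()` are standard OpenMP runtime functions. Each thread runs the same block, but uses its own ID to write to a different array element.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: runs one parallel region in which every thread
   marks its own slot in seen[]. Returns the size of the team. */
int record_thread_ids(int seen[], int capacity)
{
    int team_size = 1;          /* shared: written by thread 0 only */
    assert(capacity > 0);

    #pragma omp parallel
    {
        /* Variables declared inside the region are private to each thread. */
        int id = 0;
        int n = 1;
#ifdef _OPENMP
        id = omp_get_thread_num();    /* 0 .. n-1, unique per thread */
        n = omp_get_num_threads();    /* size of the current team */
#endif
        if (id < capacity) {
            seen[id] = 1;             /* each thread touches a different element */
        }
        if (id == 0) {
            team_size = n;
        }
    }   /* implicit barrier: all threads finish before execution continues */

    return team_size;
}
```

With GCC, this would be compiled with `-fopenmp`; without that flag the `_OPENMP` guards make the function fall back to a single thread, so the same source still builds and runs serially.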
The behavior of the parallel construct can be influenced by three control variables that an OpenMP runtime maintains. These variables are accessible only through utility functions and environment variables. The environment variables are listed in the table below in ALL CAPS.
| Control variable | Ways to modify | Way to query¹ | Default |
|---|---|---|---|
| nthreads | OMP_NUM_THREADS, omp_set_num_threads() | omp_get_max_threads() | vendor-defined |
| dynamic | OMP_DYNAMIC, omp_set_dynamic() | omp_get_dynamic() | vendor-defined |
| nesting | OMP_NESTED, omp_set_nested() | omp_get_nested() | false |

¹ Note that omp_get_max_threads() is the appropriate query for finding the value of the nthreads control variable; omp_get_num_threads() returns the number of threads in the currently executing parallel region, which may be less.
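The difference between the two queries in the footnote can be seen in a short C sketch. The struct `team_report` and the function `request_threads` are hypothetical names used only for this illustration; the `omp_*` calls are the standard runtime functions named in the table.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical record of what the runtime reported. */
typedef struct {
    int max_threads;   /* value of the nthreads control variable */
    int team_size;     /* threads actually present in the parallel region */
    int dynamic;       /* nonzero if dynamic adjustment is enabled */
} team_report;

/* Hypothetical helper: sets nthreads, then compares the two queries. */
team_report request_threads(int requested)
{
    team_report r = { 1, 1, 0 };
    assert(requested >= 1);

#ifdef _OPENMP
    omp_set_num_threads(requested);          /* modifies nthreads */
    r.max_threads = omp_get_max_threads();   /* queries nthreads */
    r.dynamic = omp_get_dynamic();           /* queries dynamic */
#endif

    #pragma omp parallel
    {
#ifdef _OPENMP
        if (omp_get_thread_num() == 0) {
            /* Inside the region this reports the actual team size,
               which may be smaller than max_threads. */
            r.team_size = omp_get_num_threads();
        }
#endif
    }

    return r;
}
```

Calling `request_threads(4)` would typically report a team of four, but the runtime is allowed to provide fewer threads, for example when dynamic adjustment is enabled, so code should not assume the two values are equal.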
Importantly, the parallel construct can be combined with worksharing constructs that are described in later sections. Combined forms exist for the for and do worksharing loops, as well as for sections and workshare; single, by contrast, is placed inside a parallel region rather than combined with the parallel directive. But let's start by focusing on the parallel construct itself. It begins with the OpenMP sentinel and can be followed on the same line (or a continuation of that line) by a number of clauses that modify its behavior.
Examples:
!$OMP PARALLEL [clause[, clause]]
⋮
!$OMP END PARALLEL

!$OMP PARALLEL DO [clause[, clause]]
DO I=1,1000000
⋮
END DO
!$OMP END PARALLEL DO

#pragma omp parallel for [clause[, clause]]
for (i=0; i<1000000; i++)
{ … }
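As a complete instance of the C form above, the following sketch applies the combined parallel for construct to a simple element-wise loop. The function name `scale_array` and its parameters are hypothetical; no clauses are used, so the defaults apply: the loop index is private to each thread and the arrays are shared.

```c
#include <assert.h>

/* Hypothetical example: out[i] = factor * in[i], with the iterations
   divided among the threads of the team. */
void scale_array(const double *in, double *out, int n, double factor)
{
    int i;
    assert(n >= 0);

    /* One directive both starts the team and shares out the iterations. */
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        /* Iterations are independent: each writes a distinct element. */
        out[i] = factor * in[i];
    }
}
```

Because no iteration reads a value written by another, the result is the same for any number of threads. If the compiler is run without OpenMP support, the pragma is ignored and the loop simply executes serially.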
Next, we will consider how clauses modify the constructs initiated by OpenMP sentinels, taking the parallel construct as our example.