Most modern applications use multiple threads. These programs start with a main function but launch additional threads to handle specific activities, such as obtaining input from the user, updating the user interface, processing database queries, or accessing web services. Each thread is like a serial program in that it has nested function calls and thus a call stack containing functions that are awaiting the completion of subfunctions. The difference is that, at any instant, the operating system can schedule the various threads on different cores and run them concurrently. Multithreading is what allows applications running on a personal computer to have a responsive GUI while computations occur on other threads. On Stampede2, multithreading allows computationally intensive programs to use multiple threads to carry out different parts of the computation simultaneously.

Shared memory programming means that all threads of execution within the same parent process can access the same values. When threads execute in parallel, however, there is no guarantee that instructions from one thread will execute before the instructions in the other unless the code includes specific instructions for thread coordination. For instance, with two threads, both threads could assign a value to an element of an array whose components are in the virtual memory of the parent process.

Two threads accessing a single array element in shared memory. Thread 0 reads the array element, adds 2 to the value and then writes the newly computed sum back to the array element. Thread 1 sets the value of the array element to 3.
In this example, two threads of execution each have read and write access to the same vector in the memory of the parent process.

In the example above, the computer will execute some interleaving of the operations in the two threads, but the particular interleaving is not defined and could change each time the code is executed. In this case, each interleaving produces a different result (for simplicity, assume the vector was initialized with zeros)! The figure below illustrates three possible interleavings. If you want the first thread (thread 0) to use the value placed in the vector by the second thread (thread 1), you need a mechanism that ensures thread 1 has written the value before thread 0 reads it.

Three possible interleavings of the operations within two threads.
The operation in the second thread could occur before, between, or after the operations in the first thread, leading to different results.

In summary, programs for shared memory computers typically use multiple threads in the same process. If more than one thread accesses the same memory location in the process's virtual address space, the program needs some mechanism to guarantee that the contents of that location are what the programmer intends. Note that OpenMP, a commonly used API for shared memory programming, includes mechanisms to ensure that operations like the one shown above occur in the desired order.

©   Cornell University  |  Center for Advanced Computing  |  Copyright Statement  |  Inclusivity Statement