Exercise
Let's start with code that is similar to the exercise from the roadmap on collective communication.
Briefly, the code below takes NPTS data points created in a loop on the rank 0 process, allocates buffers on the other process ranks in the communicator to receive the data, and scatters the data to these processes, where it is then summed and printed.
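The exercise's original listing is not reproduced here, so what follows is a minimal sketch of such a program. It assumes NPTS is 101 (consistent with the sendcounts in the sample output) and that every data value is simply 4, which happens to reproduce the sample sums; the code used in the exercise may generate its data differently.

/* scatterv_intercomm.c -- illustrative sketch only; the exercise's actual
 * starting code may differ in details such as NPTS and the data values. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define NPTS 101   /* assumed total number of data points */

int main(int argc, char *argv[]) {
    int rank, size, i;
    int *data = NULL, *sendcounts, *displs, *recvbuf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Rank 0 creates the data in a loop; the value 4 is a placeholder
     * that happens to reproduce the sample sums shown below. */
    if (rank == 0) {
        data = (int *)malloc(NPTS * sizeof(int));
        for (i = 0; i < NPTS; i++)
            data[i] = 4;
    }

    /* Every rank computes the counts and displacements for the scatter. */
    sendcounts = (int *)malloc(size * sizeof(int));
    displs     = (int *)malloc(size * sizeof(int));
    for (i = 0; i < size; i++) {
        sendcounts[i] = NPTS / size + (i < NPTS % size ? 1 : 0);
        displs[i]     = (i == 0) ? 0 : displs[i-1] + sendcounts[i-1];
    }

    /* Allocate a receive buffer and scatter the data from rank 0. */
    recvbuf = (int *)malloc(sendcounts[rank] * sizeof(int));
    MPI_Scatterv(data, sendcounts, displs, MPI_INT,
                 recvbuf, sendcounts[rank], MPI_INT, 0, MPI_COMM_WORLD);

    /* Sum the received portion and print the result. */
    int sum = 0;
    for (i = 0; i < sendcounts[rank]; i++)
        sum += recvbuf[i];
    printf("Sum from process %d with sendcount %d: %d\n",
           rank, sendcounts[rank], sum);

    free(recvbuf); free(sendcounts); free(displs);
    if (rank == 0) free(data);
    MPI_Finalize();
    return 0;
}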
On Stampede2 or Frontera, copy and paste the code above into a file using a command-line editor, then compile and run it in an interactive session. The Stampede2 and Frontera CVW topics explain these steps in more detail.
- Compile:
  mpicc scatterv_intercomm.c -o scatterv_intercomm
- Start an interactive session using:
  idev -N 1 -n4
- Run the code using the ibrun MPI launcher wrapper:
  ibrun -np 4 scatterv_intercomm
After running the compiled code on four processors as above, you should see output similar to the following:
Sum from process 0 with sendcount 26: 104
Sum from process 3 with sendcount 25: 100
Sum from process 1 with sendcount 25: 100
Sum from process 2 with sendcount 25: 100
Now, let's assume that the original rank 0 process is operating alongside other processes, each of which will only rarely generate some data that needs to be summed (or, in reality, subjected to a more expensive operation than a sum). It wouldn't make sense to sum the data locally, since that could cause synchronization delays in the local process group; we're assuming the sum operation is more expensive than scattering the data to other processes. In our simplified example, the original rank 0 process should belong to one intra-communicator, while the processes performing sums should be part of another intra-communicator. Try using what you've learned so far to modify the example to reflect this design. For now, do not worry about altering anything related to the call to MPI_Scatterv (call it with the same arguments, including communicators, as before): just try to create two new intra-communicators and join them as an inter-communicator. Think carefully about the arguments to MPI_Intercomm_create, as well as how and when to call it. Be aware that this will require an MPI-3 implementation if you plan to create the intra-communicators using a noncollective call.
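If you would like a rough picture of this setup step before attempting it (the hints and full solution at the end of the page show the intended code), here is a minimal, self-contained sketch. It assumes four processes, with world rank 0 alone in one intra-communicator and world ranks 1-3 in the other; names such as my_comm and intercomm are illustrative, not required.

/* intercomm_setup.c -- a rough sketch, not the exercise's official answer.
 * Assumes 4 processes: world rank 0 forms one intra-communicator and world
 * ranks 1-3 form the other; the two are then joined as an inter-communicator. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int world_rank, world_size, i;
    MPI_Group world_group, local_group;
    MPI_Comm  my_comm, intercomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    if (world_rank == 0) {
        /* Group holding only the data-generating process. */
        int root_ranks[1] = {0};
        MPI_Group_incl(world_group, 1, root_ranks, &local_group);
    } else {
        /* Group holding the summing processes (world ranks 1..size-1). */
        int *sum_ranks = (int *)malloc((world_size - 1) * sizeof(int));
        for (i = 1; i < world_size; i++)
            sum_ranks[i - 1] = i;
        MPI_Group_incl(world_group, world_size - 1, sum_ranks, &local_group);
        free(sum_ranks);
    }

    /* Non-collective creation (MPI-3): only the members of each group make
     * this call, so neither side has to construct the other's group. */
    MPI_Comm_create_group(MPI_COMM_WORLD, local_group, 0, &my_comm);

    /* Join the two intra-communicators. The local leader is rank 0 within
     * each intra-communicator; the remote leader is named by its rank in
     * the peer communicator, MPI_COMM_WORLD: world rank 1 leads the summing
     * group and world rank 0 leads the root group. */
    int remote_leader = (world_rank == 0) ? 1 : 0;
    MPI_Intercomm_create(my_comm, 0, MPI_COMM_WORLD, remote_leader,
                         99, &intercomm);

    /* Quick sanity check: print local and remote group sizes. */
    int local_size, remote_size;
    MPI_Comm_size(my_comm, &local_size);
    MPI_Comm_remote_size(intercomm, &remote_size);
    printf("World rank %d: local group size %d, remote group size %d\n",
           world_rank, local_size, remote_size);

    MPI_Comm_free(&intercomm);
    MPI_Comm_free(&my_comm);
    MPI_Group_free(&local_group);
    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}

Run with four processes, this sketch should report a local group size of 1 and a remote group size of 3 on world rank 0, and the reverse on the other ranks.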
You can read the missing lines below, or continue reading this explanatory paragraph for more hints. Unlike the example for subdividing communicators, we did not have to create every group on every process, thanks to our use of a non-collective call (MPI_Comm_create_group) for communicator creation. Compiling and running the modified code should produce results identical to those above, since we are still handling all messages over MPI_COMM_WORLD in the same way. Now that the infrastructure is set up, try changing the message-passing code to reflect the addition of the inter-communicator: modify the program so that the root process and its intra-communicator scatter their data to the other intra-communicator. Although this is conceptually easy, it can be tricky to manage the different ranks a process has in different communicators, especially when trying to remember which communicator a given function argument refers to.
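The trickiest part is usually the root argument on each side of the inter-communicator. The fragment below is not runnable on its own; it assumes the variables from the sketches above (world_rank, intercomm, data, recvbuf), that sendcounts and displs have been recomputed for the three processes of the summing intra-communicator, and that recvcount holds the number of elements each summing process expects.

/* Fragment only: one possible shape of the Scatterv call once the data is
 * sent across the inter-communicator. */
if (world_rank == 0) {
    /* Root group side: the sending process passes MPI_ROOT as the root
     * argument (any other members of this group would pass MPI_PROC_NULL);
     * the receive arguments are ignored on this side. */
    MPI_Scatterv(data, sendcounts, displs, MPI_INT,
                 NULL, 0, MPI_INT, MPI_ROOT, intercomm);
} else {
    /* Summing group side: root is the sending process's rank within the
     * remote (root) intra-communicator -- 0 here -- not its rank in
     * MPI_COMM_WORLD; the send arguments are ignored on this side. */
    MPI_Scatterv(NULL, NULL, NULL, MPI_INT,
                 recvbuf, recvcount, MPI_INT, 0, intercomm);
}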
Solution
Since the changes between the intermediate solution and the final solution are fairly pervasive, we recommend trying it yourself first and revealing the hints and full solution at the end of the page only when you get stuck. The final output should look like the following, since only three of the four processes are now doing summations:
Sum from process 1 with sendcount 34: 136
Sum from process 2 with sendcount 34: 136
Sum from process 3 with sendcount 33: 132
Expand to reveal the missing lines:
Note that my_comm isn't actually used in this code, but keeping a handle that refers to the current process's own intra-communicator can greatly simplify code that must be executed on both intra-communicators.
Expand to reveal the full solution code: