A scan, or prefix-reduction, operation performs partial reductions on distributed data. Specifically, let n be the size of the process group, c the number of data items per process, \(d_k[j]\) the j-th data item in process k before the scan, and \(D_k[j]\) the j-th data item in process k after the scan returns. Ranks are indexed by \(k = 0, 1, \ldots, n-1\), and the local data by \(j = 0, 1, \ldots, c-1\). A scan returns:

\[D_k[j] = d_0[j] * d_1[j] * \ldots * d_k[j]~\text{,}\]

where \(*\) is the reduction function, which again may be either an MPI predefined operation or a user-defined function.

Note that the reduction operator always performs a contraction across ranks, like +; therefore, the local array of results \(D_k\) is always of size c. Furthermore, the local results generally depend on k, because on rank k the reduction is applied, element by element, only to the items \(d_i[j]\) from ranks \(i = 0\) to \(k\).

MPI Scan Syntax

int MPI_Scan(void *sbuf, void *rbuf, int count,
     MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)

MPI_SCAN(sbuf, rbuf, count, datatype, op, comm, ierr)

Scan parameters:

sbuf
    is the starting address of the send buffer
rbuf
    is the address of the receive buffer
count
    is the number of elements in the input buffer
datatype
    is the data type of elements in the input buffer
op
    is the reduction operation
comm
    is the group communicator
Tip: Segmented Scan

Create a subgroup for each segment (e.g., with MPI_Comm_split) to perform a segmented scan.

C MPI_Scan Example

Below is an example illustrating the use of MPI_Scan: a histogram is distributed across processes in exp_pdf_i, and the cumulative frequency histogram (exp_cdf_i) is then calculated across all processes. Although this simple example uses a few floating-point sums as the reduction operation, MPI_Scan is most useful when the reduction is expensive and each partial result is then used in another computation on a single process.
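The original code listing is not reproduced on this page; the following is a sketch consistent with the description and the sample output. The variable names exp_pdf_i and exp_cdf_i come from the text; the exponential-distribution parameters (rate 0.5, bin spacing 0.5) are inferred from the output values and should be treated as assumptions.

```c
#include <math.h>
#include <stdio.h>
#include <mpi.h>

/* Each process holds one bin of an exponential histogram, exp_pdf_i.
 * MPI_Scan with MPI_SUM leaves each rank k holding the cumulative
 * frequency exp_cdf_i over ranks 0..k. */
int main(int argc, char *argv[])
{
    int rank;
    double exp_pdf_i, exp_cdf_i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* local histogram value: lambda*exp(-lambda*x) at x = 0.5*rank,
     * with lambda = 0.5 (assumed parameters) */
    exp_pdf_i = 0.5 * exp(-0.5 * (0.5 * rank));

    /* prefix sum across ranks 0..rank */
    MPI_Scan(&exp_pdf_i, &exp_cdf_i, 1, MPI_DOUBLE, MPI_SUM,
             MPI_COMM_WORLD);

    printf("process %d: cumulative sum = %f\n", rank, exp_cdf_i);

    MPI_Finalize();
    return 0;
}
```

Because each process prints as soon as its scan result is available, the output lines below appear in nondeterministic order.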

Example output
process 1: cumulative sum = 0.889400
process 4: cumulative sum = 1.612789
process 7: cumulative sum = 1.954493
process 6: cumulative sum = 1.867606
process 3: cumulative sum = 1.428849
process 5: cumulative sum = 1.756041
process 0: cumulative sum = 0.500000
process 2: cumulative sum = 1.192666
process 8: cumulative sum = 2.022161
process 9: cumulative sum = 2.074860
 
©   Cornell University  |  Center for Advanced Computing  |  Copyright Statement  |  Inclusivity Statement