MPI-IO Advantages

Two common alternatives to parallel MPI-IO are:

Rank 0 accesses a file; it gathers/scatters its data from/to other ranks.
Each rank opens a separate file on local disk and does I/O to it independently.

These alternative I/O schemes are simple to code, but they have, respectively,

Poor scalability (e.g., the single task is a bottleneck), and
Challenges with file management (e.g., the files must be collected from local disk over multiple nodes).

MPI-IO is a convenient interface for enabling true parallel I/O on systems that support it. It provides

mechanisms for performing synchronization,
syntax for data movement, and
means for defining noncontiguous data layout in a file (MPI datatypes).

One big advantage of MPI-IO over Unix I/O is that a single MPI-IO call can specify noncontiguous accesses, both in the file and in the related memory buffers. This is a common need in parallel applications where, for example, a distributed array may be stored in a single file, but in some rearranged order or layout. A sensible approach is therefore to

read or write such a file by using a derived datatype in an MPI-IO call, and
let the MPI implementation optimize the access.

Collective I/O combined with noncontiguous accesses generally yields the highest performance in MPI-IO.
