MPI-IO Advantages
Two common alternatives to parallel MPI-IO are:
- Rank 0 accesses a file; it gathers/scatters its data from/to other ranks.
- Each rank opens a separate file on local disk and does I/O to it independently.
These schemes are simple to code, but they suffer, respectively, from
- Poor scalability (e.g., the single task is a bottleneck), and
- Challenges with file management (e.g., the files must be collected from local disk over multiple nodes).
MPI-IO is a convenient interface for enabling true parallel I/O on systems that support it. It provides
- mechanisms for performing synchronization,
- syntax for data movement, and
- means for defining noncontiguous data layout in a file (MPI datatypes).
One big advantage of MPI-IO over Unix I/O is its ability to describe noncontiguous accesses, both in the file and in the associated memory buffers, in a single call. This is a common need in parallel applications where, for example, a distributed array is stored in a single file, but in an order or layout different from the one used in memory. A sensible approach is therefore to
- read or write such a file by using a derived datatype in an MPI-IO call, and
- let the MPI implementation optimize the access.
Collective I/O combined with noncontiguous accesses generally yields the highest performance in MPI-IO.
© Cornell University | Center for Advanced Computing
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)