Higher-Level Alternatives to MPI-IO
This roadmap focuses on MPI-IO as a basic API for parallel I/O. But because MPI-IO sits at the middleware layer, it may not be the most convenient API to program against directly; several higher-level alternatives exist. All build upon MPI-IO as their foundation, so most of the concepts presented in this topic remain relevant. Here are a few options you might want to consider:
- Parallel HDF5 (PHDF5) (PHDF5 CVW Topic)
- NetCDF 4 Parallel (requires Parallel HDF5)
- PnetCDF, for working with files in the classic NetCDF formats, CDF-1, CDF-2, and CDF-5 (PnetCDF CVW Topic)
- ADIOS, the ADaptable IO System from ORNL, Georgia Tech, Rutgers, and others (ADIOS CVW Topic)
HDF5 and NetCDF refer to well-known file formats for storing numerical data in self-describing fashion. Serial libraries for doing I/O to files in these specialized formats already existed, so the software was extended to add parallel I/O capabilities. If you'd like to experiment with example codes that illustrate how you might use parallel HDF5 or PnetCDF in a C/C++, Fortran, or Python application, please refer to the Cornell Virtual Workshop roadmap Parallel I/O Libraries.
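To give a first taste of what Parallel HDF5 looks like in practice, here is a minimal sketch in C++, calling HDF5's C API (the usual interface for PHDF5 programs). The file name, dataset name, and sizes are placeholders, and error checking is omitted for brevity.

```cpp
#include <mpi.h>
#include <hdf5.h>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    // Tell HDF5 to use the MPI-IO driver underneath
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);

    // All ranks create/open the file collectively
    hid_t file = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    // One shared dataset spanning all ranks: nprocs * 10 doubles
    hsize_t dims[1] = {static_cast<hsize_t>(nprocs) * 10};
    hid_t filespace = H5Screate_simple(1, dims, NULL);
    hid_t dset = H5Dcreate(file, "data", H5T_NATIVE_DOUBLE, filespace,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    // Each rank selects its own 10-element hyperslab of the dataset...
    hsize_t start[1] = {static_cast<hsize_t>(rank) * 10};
    hsize_t count[1] = {10};
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t memspace = H5Screate_simple(1, count, NULL);

    // ...and writes it collectively (the analog of an MPI-IO collective write)
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    std::vector<double> buf(10, static_cast<double>(rank));
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf.data());

    H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
    H5Dclose(dset); H5Pclose(fapl); H5Fclose(file);
    MPI_Finalize();
    return 0;
}
```

The key steps are setting the MPI-IO file driver on the file access property list and requesting collective transfers on the dataset transfer property list; everything else looks much like serial HDF5.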
Note that TACC typically provides environment modules for most of the above: phdf5, parallel-netcdf, and pnetcdf.
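For comparison with the HDF5 sketch above, here is a minimal PnetCDF version of the same pattern, again in C++ (using PnetCDF's C API) with placeholder names and sizes and no error checking. The NC_64BIT_DATA flag requests the CDF-5 format mentioned earlier.

```cpp
#include <mpi.h>
#include <pnetcdf.h>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    // Create the file collectively; PnetCDF uses MPI-IO underneath
    int ncid, dimid, varid;
    ncmpi_create(MPI_COMM_WORLD, "example.nc", NC_CLOBBER | NC_64BIT_DATA,
                 MPI_INFO_NULL, &ncid);

    // Define one dimension and one variable spanning all ranks
    ncmpi_def_dim(ncid, "x", (MPI_Offset)nprocs * 10, &dimid);
    ncmpi_def_var(ncid, "data", NC_DOUBLE, 1, &dimid, &varid);
    ncmpi_enddef(ncid);  // leave define mode before writing data

    // Each rank writes its own 10-element slice, collectively
    MPI_Offset start[1] = {(MPI_Offset)rank * 10};
    MPI_Offset count[1] = {10};
    std::vector<double> buf(10, (double)rank);
    ncmpi_put_vara_double_all(ncid, varid, start, count, buf.data());

    ncmpi_close(ncid);
    MPI_Finalize();
    return 0;
}
```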
With regard to ADIOS2, the current version of ADIOS, here are some features that may make it an interesting choice for your application:
- Was created collaboratively with several major HPC simulation groups
- Offers scalable I/O on files up to petabytes in size
- Favors transferring groups of variables asynchronously ("streaming-oriented")
- Has Fortran, C, C++, and Python bindings
- Resembles standard POSIX routines for doing file I/O, after initial configuration
- Supports HDF5 as well as the custom BP (binary packed) format
- Can output visualization formats including VTK and BP (recognized by ParaView)
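To give a flavor of the streaming-oriented C++ API, here is a minimal ADIOS2 sketch that writes one distributed array to a BP file in a single output step. It assumes an MPI-enabled ADIOS2 build; the IO label, file name, and sizes are placeholders.

```cpp
#include <mpi.h>
#include <adios2.h>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    {
        // ADIOS2 object tied to the communicator; "SimOutput" is an arbitrary label
        adios2::ADIOS adios(MPI_COMM_WORLD);
        adios2::IO io = adios.DeclareIO("SimOutput");

        // Global array of nprocs*10 doubles; each rank owns a 10-element block
        const std::size_t n = 10;
        adios2::Variable<double> var = io.DefineVariable<double>(
            "data",
            {static_cast<std::size_t>(nprocs) * n},  // global shape
            {static_cast<std::size_t>(rank) * n},    // this rank's offset
            {n});                                    // local count

        std::vector<double> buf(n, static_cast<double>(rank));

        // Open a BP "engine" and write one output step
        adios2::Engine writer = io.Open("example.bp", adios2::Mode::Write);
        writer.BeginStep();
        writer.Put(var, buf.data());
        writer.EndStep();  // data may be buffered/flushed asynchronously until here
        writer.Close();
    }  // ADIOS2 objects are destroyed before MPI_Finalize

    MPI_Finalize();
    return 0;
}
```

Notice that after the initial setup, the Open/BeginStep/Put/EndStep/Close sequence does resemble ordinary file I/O, as noted in the list above.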
Learn more about ADIOS in the Parallel I/O Libraries roadmap.
Based on your application's needs, you may choose to take advantage of one of these higher-level approaches to parallel I/O if you prefer to avoid the details involved in MPI-IO. Even if you use a high-level approach, it is worth understanding how MPI-IO solves the challenges inherent in parallel I/O, because any solution you choose must overcome the same challenges.