Cornell Virtual Workshop: Parallel I/O Libraries

Roadmap: Parallel I/O Libraries

Topics

1. Structured Data
2. PnetCDF
3. PHDF5
4. ADIOS

Many scientific applications work with structured data, and such data often require pre- and post-processing. I/O libraries not only let applications work with portable, self-describing file formats but also provide tools to process the data. In this roadmap, we introduce parallel I/O libraries and techniques that can increase the throughput and efficiency of I/O-bound applications. Efficient parallel I/O becomes extremely important as we scale scientific applications across large numbers of nodes composed of multiple cores and accelerators.

Objectives

After you complete this roadmap, you should be able to:

Summarize the motivation for using I/O libraries like netCDF and HDF5, as well as their parallel counterparts PnetCDF and PHDF5.
Describe the basic use of the PnetCDF and PHDF5 APIs, as well as the rationale for the ADIOS API.
Use a high-level I/O library with parallel code.

Prerequisites

A working knowledge of general programming concepts
Ability to program in a high-level language such as Fortran, C, or C++
A basic familiarity with parallel programming concepts
A basic familiarity with parallel I/O concepts

Requirements

The examples and exercises in this roadmap are designed to run on Frontera or Stampede3. To use these systems, you need:

A TACC account to log in to Frontera or Stampede3
A compute time allocation for Frontera or Stampede3