What Is PAPI?
The Performance Application Programming Interface (PAPI) from the University of Tennessee is a library for aggregating and analyzing hardware performance counters. It relieves programmers of the overhead of learning the implementation details of the hardware performance counters on every platform they wish to use. Because each platform has its own complex specifications for performance counters, this abstraction is of great value. This library is oriented towards statistical sampling-based profiling.
PAPI provides interfaces to counters for the following events:
- Floating point operations and instructions
- Total instructions and cycles
- Cache accesses and misses
- Translation Lookaside Buffer (TLB) counts
- Branch instructions taken, predicted, and mispredicted
PAPI also provides PAPI_flops, a high-level routine that offers a basic yet
powerful way to measure floating point operation counts and floating point
operations per second for individual sections of code. PAPI_flops is not
guaranteed to be thread-safe. Its usage is outlined in the man page on Frontera.
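Independently of that man page, a minimal sketch of bracketing a section of code with PAPI_flops might look like the following. It assumes a PAPI release that still ships PAPI_flops with the four-argument form used here (newer releases may not provide it). The USE_PAPI build flag and the helpers mflops_from_counts, scaled_sum, and report_flops are hypothetical names introduced for illustration; they are not part of PAPI. Without the flag, only the PAPI-free helpers are compiled.

```c
#include <assert.h>
#include <stdio.h>

#ifdef USE_PAPI   /* hypothetical build flag, e.g. cc -DUSE_PAPI demo.c -lpapi */
#include <papi.h>
#endif

/* Hypothetical helper (not part of PAPI): turn an operation count and an
 * elapsed time in seconds into a rate in MFLOP/s. */
double mflops_from_counts(long long flpops, double seconds)
{
    assert(flpops >= 0);
    if (seconds <= 0.0)
        return 0.0;               /* avoid dividing by zero on very short runs */
    return (double)flpops / (seconds * 1.0e6);
}

/* Toy kernel for illustration: nominally one multiply and one add per
 * iteration, although the compiler is free to transform the loop. */
double scaled_sum(int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += 0.5 * (double)i;
    return s;
}

#ifdef USE_PAPI
/* Measure scaled_sum(n) with PAPI_flops.
 * Returns 0 on success, -1 if a PAPI call fails. */
int report_flops(int n)
{
    float rtime, ptime, mflops;   /* wall-clock time, process time, MFLOP/s */
    long long flpops;             /* floating point operation count */

    /* The first call sets up and starts the counters. */
    if (PAPI_flops(&rtime, &ptime, &flpops, &mflops) != PAPI_OK)
        return -1;

    double result = scaled_sum(n);

    /* The second call reads what has accumulated since the first one. */
    if (PAPI_flops(&rtime, &ptime, &flpops, &mflops) != PAPI_OK)
        return -1;

    printf("result=%g  flpops=%lld  ptime=%.6f s\n", result, flpops, ptime);
    printf("PAPI rate=%.2f MFLOP/s  recomputed=%.2f MFLOP/s\n",
           mflops, mflops_from_counts(flpops, ptime));
    return 0;
}
#endif
```

The recomputed rate is printed only for comparison; the two figures need not agree exactly, and the count reported for such a small kernel depends on how the compiler and hardware treat the loop.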
PAPI also provides a set of preset events whose availability depends on the
hardware event counters present on a given platform. These events provide an
abstraction layer for cross-platform performance tuning. To see which PAPI
events are available on a given platform, run the papi_avail command in the bin
directory of the PAPI build.
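The same question can also be asked from inside a program. The sketch below probes a small, arbitrarily chosen handful of presets with PAPI_query_event and prints whether each is available. The function name count_available_presets and the USE_PAPI build flag are hypothetical, introduced only for this example; without the flag the function compiles to a stub that returns -1.

```c
#include <assert.h>
#include <stdio.h>

#ifdef USE_PAPI   /* hypothetical build flag, e.g. cc -DUSE_PAPI demo.c -lpapi */
#include <papi.h>
#endif

/* Returns how many of the probed presets are available on this platform,
 * or -1 if PAPI support is not compiled in or the library fails to
 * initialize. */
int count_available_presets(void)
{
#ifdef USE_PAPI
    /* An arbitrary selection of standard preset events. */
    const int presets[] = {
        PAPI_TOT_CYC, PAPI_TOT_INS, PAPI_FP_OPS,
        PAPI_L1_DCM,  PAPI_TLB_DM,  PAPI_BR_MSP
    };
    const int n = (int)(sizeof presets / sizeof presets[0]);
    char name[PAPI_MAX_STR_LEN];
    int available = 0;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        return -1;

    for (int i = 0; i < n; i++) {
        int ok = (PAPI_query_event(presets[i]) == PAPI_OK);

        /* Fall back to the numeric code if the name lookup fails. */
        if (PAPI_event_code_to_name(presets[i], name) != PAPI_OK)
            snprintf(name, sizeof name, "0x%x", (unsigned)presets[i]);

        printf("%-14s %s\n", name, ok ? "available" : "not available");
        available += ok;
    }

    assert(available <= n);
    return available;
#else
    return -1;    /* built without PAPI support */
#endif
}
```

For a complete listing, papi_avail remains the more convenient tool; a check like this is mainly useful when a program must decide at run time which events to request.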
PAPI provides a "high-level" interface designed to collect coarse-grained
measurements with a minimum of code modification. These high-level functions
take care of calling the individual low-level functions required to generate the
requested data. This convenience comes at the cost of some flexibility.
For example, high-level functions (including PAPI_flops) are not guaranteed to
be thread-safe. They are also restricted to the aforementioned PAPI preset
events, and only to those events that the hardware can count simultaneously.
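One way to find out whether a group of events can be counted at the same time is to try adding them to a single event set through the low-level interface and check whether every addition succeeds. The sketch below does only that and never starts the counters. The function name events_fit_together and the USE_PAPI build flag are hypothetical, introduced for illustration; without the flag the function is a stub that returns -1.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_PAPI   /* hypothetical build flag, e.g. cc -DUSE_PAPI demo.c -lpapi */
#include <papi.h>
#endif

/* Returns 1 if all n event codes could be added to one event set,
 * 0 if some addition was rejected, and -1 if PAPI support is not compiled
 * in or the library could not be set up. */
int events_fit_together(const int *codes, int n)
{
    assert(n >= 0 && (codes != NULL || n == 0));
#ifdef USE_PAPI
    int event_set = PAPI_NULL;
    int fits = 1;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        return -1;
    if (PAPI_create_eventset(&event_set) != PAPI_OK)
        return -1;

    for (int i = 0; i < n; i++) {
        /* A non-OK return here means the event is unavailable or cannot
         * share the counters with the events already in the set. */
        if (PAPI_add_event(event_set, codes[i]) != PAPI_OK) {
            fits = 0;
            break;
        }
    }

    /* Release the event set; nothing was ever started. */
    PAPI_cleanup_eventset(event_set);
    PAPI_destroy_eventset(&event_set);
    return fits;
#else
    (void)codes;
    return -1;    /* built without PAPI support */
#endif
}
```

When built with PAPI, a caller might pass preset codes such as PAPI_TOT_CYC and PAPI_FP_OPS to see whether that pair can be measured in one run on the current hardware.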