Chris Myers, Steve Lantz
Cornell Center for Advanced Computing

Revisions: 12/2018, 1/2018 (original)

The application in this profiling and optimization case study is a pseudo-spectral dynamo code that had been developed and ported across several HPC clusters. When porting the code to the KNL subsystem of the Stampede2 cluster at TACC, the investigators wanted to understand how the architecture and configuration of the cluster would affect the performance and scaling of their code. To characterize and optimize application performance on this new architecture, they first consulted TACC's general guidelines on performance optimization and then turned to more fine-grained analysis tools.


After you complete this topic, you should be able to:

  • Describe some of the features of the target application code used for this case study
  • Highlight some of the compiler and runtime options recommended by TACC to achieve better application performance
  • Understand how optimization for different metrics (e.g., wall clock time, SU usage) can impact decisions about application configuration

There are no specific prerequisites for this topic. To apply these analyses to your own application code, however, you will need access to Intel compilers and the Intel performance analysis tools (VTune, Advisor, Parallel Studio).

© Cornell University | Center for Advanced Computing