Chris Myers, Steve Lantz
Cornell Center for Advanced Computing

Revisions: 12/2018, 1/2018 (original)

The performance of complex scientific codes running on advanced cluster architectures can be characterized along many different axes, including memory usage, memory bandwidth, code hotspots, intra-node and inter-node scaling, and vectorization. The suite of performance analysis tools developed by Intel and used by the investigators provides insight into each of these facets of code performance. The video presentation works through in detail how these analyses are carried out and what lessons can be learned from each type of analysis.

Objectives

After you complete this topic, you should be able to:

  • Describe different types of performance analyses, such as hotspot identification, roofline plots, intra-node and inter-node scaling studies, memory access analyses, and vectorization analyses
  • Understand how to use various tools in the Intel suite to carry out different sorts of performance analyses
  • Summarize some of the lessons learned by the investigators for their particular target application

Prerequisites

There are no specific prerequisites for learning the information contained in this topic. If you want to apply these analyses to your own application code, you will need access to the Intel compilers and the Intel performance analysis tools (VTune, Advisor, Parallel Studio).

 
© Cornell University | Center for Advanced Computing