Douglas Fuller, Arizona State (original author), Steve Lantz
Cornell Center for Advanced Computing

Revisions: 10/2023, 7/2014, 3/2008 (original)

This topic describes several debugging tools and techniques commonly encountered in high performance computing, with a focus on runtime debugging. Specific methods vary by scenario. For example, analyzing a core dump on Frontera may first require adjusting certain settings so that applications are allowed to dump the contents of memory to a file. Likewise, debugging a running MPI program can require specialized software beyond what a compiler vendor typically provides.

Objectives

After you complete this topic, you should be able to:

  • Name several types of errors caused by bugs in a program
  • Tell which types of errors can be diagnosed by a compiler
  • Distinguish between ad hoc debugging and symbolic debugging at runtime
  • Give reasons why a logging framework might be preferable to "printf" debugging
  • Perform basic debugging steps with gdb
  • Define the term "core dump"
  • Explain why optimized code can be harder to debug
  • Explain what makes parallel and distributed applications harder to debug
  • Describe the advantages of using a debugging tool such as DDT

Prerequisites

This topic presumes a basic knowledge of Linux/Unix. Specific programming knowledge is not required, but it will provide helpful context for the material. Most of the tools presented apply equally well to C, C++, and Fortran, and the topic is not constructed with any one of these languages specifically in mind.

 
©   Cornell University  |  Center for Advanced Computing