Debugging
Douglas Fuller, Arizona State (original author), Steve Lantz
Cornell Center for Advanced Computing
Revisions: 10/2023, 7/2014, 3/2008 (original)
This topic describes several debugging tools and techniques commonly encountered in high performance computing, with a focus on runtime debugging. Specific methods vary from one scenario to another. For example, analyzing a core dump on Frontera may first require adjusting certain settings so that applications are allowed to dump the contents of memory into files. Likewise, debugging a running MPI program can require specialized software beyond what a compiler vendor typically provides.
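As a rough illustration of the kind of setting that can prevent core dumps from being written, the sketch below inspects the per-process core file size limit on a Linux system. It assumes that this limit is among the settings in question; the exact configuration needed on Frontera is not specified here. The helper names `core_dump_limits`, `describe`, and `enable_core_dumps` are hypothetical and chosen only for this example.

```python
import resource  # standard library; available on Linux/Unix only


def core_dump_limits():
    """Return the (soft, hard) core file size limits for this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_CORE)


def describe(limit):
    """Render a limit value in a readable form."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit} bytes"


def enable_core_dumps():
    """Raise the soft limit to the hard limit.

    An unprivileged process may raise its soft limit up to, but not
    beyond, its hard limit. Child processes inherit the new limit.
    """
    _, hard = core_dump_limits()
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return core_dump_limits()


if __name__ == "__main__":
    soft, hard = core_dump_limits()
    print("soft limit:", describe(soft))
    print("hard limit:", describe(hard))
    if soft == 0:
        # A soft limit of zero means no core file will be written.
        print("core dumps are currently disabled for this process")
```

A soft limit of zero is a common default, in which case a crashing program leaves no core file behind. Other system settings can also affect whether and where a core file appears, so this check is a starting point rather than a complete diagnosis.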
Objectives
After you complete this topic, you should be able to:
- Name several types of errors caused by bugs in a program
- Tell which types of errors can be diagnosed by a compiler
- Distinguish between ad-hoc debugging and symbolic debugging at runtime
- Give reasons why a logging framework might be preferable to "printf" debugging
- Perform basic debugging steps with gdb
- Define the term "core dump"
- Explain why optimized code can be harder to debug
- Explain what makes parallel and distributed applications harder to debug
- Describe the advantages of using a debugging tool such as DDT
Prerequisites
This topic presumes a basic knowledge of Linux/Unix. Specific programming knowledge is not required, but it will provide useful context and aid understanding. Most of the tools presented apply equally well to C, C++, and Fortran; the topic is not written with any one of these languages specifically in mind.
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)