Roadmap: Scalability
How do you optimize the overall performance of a large, computationally intensive code? On an HPC cluster like Frontera at TACC, with 8,570 nodes and nearly half a million cores of various kinds, the most beneficial optimizations are likely to be those that improve a code's scalability, i.e., those that allow your code to run efficiently on ever larger numbers of processors. With that goal in mind, we start with grand design principles and strategies, then proceed to a discussion of software interfaces, the network interconnect, and even processor microarchitectures. This content should guide you toward the right level at which to concentrate your efforts, and give you some ideas about how to make your code more efficient.
Objectives
After you complete this roadmap, you should be able to:
- Explain the algorithmic and design choices that ensure good application performance in parallel
- Identify experiments to demonstrate an application's scalability on Frontera or other HPC resources
- Prepare a document justifying a request for a large-scale allocation on TACC Frontera, or a similar "Code Performance and Resource Costs" document for an allocation request for NSF ACCESS resources
- List factors that influence the placement of parallel tasks on nodes and cores
- Define hybrid programming and explain its advantages for producing scalable codes
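To make the last objective concrete before you begin: hybrid programming typically means combining MPI for communication between nodes with OpenMP threads for parallelism within each node. The sketch below is a minimal, illustrative example of this pattern, not a Frontera-specific recipe; the compiler invocation shown in the comment and the choice of how ranks and threads map onto nodes and cores are assumptions that later parts of this roadmap examine in detail.

```c
/* Minimal sketch of hybrid MPI+OpenMP: each MPI rank spawns a team of
 * OpenMP threads, so a single rank per node can use all of that node's
 * cores while MPI handles communication between nodes.
 * Example compilation (toolchain-dependent): mpicc -fopenmp hybrid.c -o hybrid
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int provided, rank, nranks;

    /* MPI_THREAD_FUNNELED is enough here, because only the main thread
     * makes MPI calls; the OpenMP region below does not. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    #pragma omp parallel
    {
        /* Each thread reports its identity within its MPI rank. */
        printf("Rank %d of %d, thread %d of %d\n",
               rank, nranks, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
```

On a typical batch system you might launch one rank per node and set OMP_NUM_THREADS to the number of cores per node, but the best layout depends on the task-placement factors discussed in this roadmap.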
Prerequisites
There are no prerequisites for this roadmap. However, before you attempt to optimize your code for any particular HPC system, you should become familiar with the specific architecture and hardware characteristics of that system. You should also be aware of profiling and debugging tools that are available for it.
In the case of Frontera at TACC, good references would be the Virtual Workshop roadmaps on Parallel Programming Concepts and Vectorization. But it is not necessary to complete the roadmaps on those topics before diving into this one.
Requirements
There are no specific system requirements for this roadmap.