Scalable Algorithms
An efficient and scalable algorithm typically has the following characteristics (see the sketch after this list):
- The work can be separated into numerous tasks that proceed almost totally independently of one another
- Communication between the tasks is infrequent or unnecessary
- Considerable computation takes place before messaging or I/O occurs
- There is little or no need for tasks to communicate globally
- The problem provides enough work to justify initiating as many tasks as possible
- Tasks retain all of the above properties as their numbers grow
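As a concrete illustration of these properties, the sketch below (a hypothetical example, not drawn from any particular application) uses MPI to estimate pi by numerical integration. Each task works through its share of the iteration space completely independently, and the only communication is a single global reduction at the very end, so adding more tasks simply shrinks each task's share of the work.

```c
#include <mpi.h>
#include <stdio.h>

/* Hypothetical sketch: each MPI task integrates its own disjoint portion
 * of the interval [0,1] independently; the only communication is a single
 * global reduction at the end. */
int main(int argc, char *argv[])
{
    int rank, size;
    const long nsteps = 100000000;   /* total work, divided among tasks */
    double h, local_sum = 0.0, global_sum;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    h = 1.0 / (double)nsteps;

    /* Each task handles a strided, disjoint block of the iteration space. */
    for (long i = rank; i < nsteps; i += size) {
        double x = (i + 0.5) * h;
        local_sum += 4.0 / (1.0 + x * x);   /* integrand whose integral is pi */
    }

    /* Infrequent communication: one reduction after all the computation. */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM,
               0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi ~= %.12f\n", global_sum * h);

    MPI_Finalize();
    return 0;
}
```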
Selecting a scalable algorithm (or a suite of them) is an important aspect of the design phase, but significant choices must also be made when implementing the algorithm(s). The most basic choice is which platform to use for the problem at hand. For example, would the envisioned algorithm map well to Frontera, a massive machine with thousands of general-purpose processors, each with dozens of cores featuring extra-wide vector units?
If so, then the next questions involve details of how to construct the code: how to split the problem into distinct processes (e.g., MPI tasks) and separate threads (e.g., via OpenMP directives); how many processes and threads to run on each node; and how to place processes or threads on particular cores.
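One common arrangement, shown in the minimal sketch below, is a hybrid code that assigns a few MPI tasks to each node and lets OpenMP threads fill the cores within each task. The sketch is an illustrative assumption rather than a recipe for any particular system; it simply shows how each process and thread identifies itself within the combined decomposition.

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical hybrid MPI+OpenMP sketch: each MPI task spawns a team of
 * OpenMP threads, and each (task, thread) pair would own a distinct piece
 * of the overall work. */
int main(int argc, char *argv[])
{
    int provided, rank, nranks;

    /* Request thread support, since OpenMP threads coexist with MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    if (provided < MPI_THREAD_FUNNELED) {
        fprintf(stderr, "Insufficient MPI thread support\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        int nthreads = omp_get_num_threads();
        printf("MPI task %d of %d, thread %d of %d\n",
               rank, nranks, tid, nthreads);
    }

    MPI_Finalize();
    return 0;
}
```

In a setup like this, the number of tasks per node is typically set through the batch script or MPI launcher, the thread count per task through OMP_NUM_THREADS, and thread placement through standard OpenMP variables such as OMP_PLACES and OMP_PROC_BIND; the exact mechanism varies by system and MPI implementation.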
In general, though, these implementation decisions are secondary to the choice of algorithm.