Ways to Run Jobs

To succeed in using a major HPC resource like Frontera, it is essential to have a workable plan for parallelizing computations. There is no one right way to do this; in fact, it may be best to rely on a combination of techniques. As you might expect, Frontera comes with an array of tools to facilitate the various styles of parallel processing. The main methods that are available are summarized in the table below.

Main tools available for parallel processing.
Program type	# of nodes	How to run in parallel on Frontera
Multithreaded program (OpenMP, TBB)	Single node	Set number of threads and run program
High throughput computing with serial or multithreaded code	1+ nodes	Use `launcher`, `launcher_gpu`, `pylauncher`, or `gnuparallel` (the `parallel` command)
MPI program	1+ nodes	Start program with `ibrun`
Hybrid of the above	1+ nodes	Use any or all methods in combination

You'll first want to ensure that your program can make good use of the available cores and memory on a single compute node. By itself, one node can accommodate up to 56 OpenMP threads, MPI tasks, or independent serial processes. To go beyond one node, you need either MPI or some other means of launching independent processes that run in parallel on multiple nodes.
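
For the single-node, multithreaded case, here is a minimal sketch of "set number of threads and run program". It relies on OMP_NUM_THREADS, the standard OpenMP environment variable for the thread count. The executable name my_openmp_app is hypothetical, and the value 56 assumes one thread per core on a node like the one described above.

```shell
# Minimal sketch: run an OpenMP program with one thread per core on one node.
# Assumption: ./my_openmp_app is a hypothetical executable; substitute your own.
export OMP_NUM_THREADS=56

if [ -x ./my_openmp_app ]; then
    ./my_openmp_app
else
    # Nothing to run here, so just report what would have happened.
    echo "Would run ./my_openmp_app with ${OMP_NUM_THREADS} threads"
fi
```

Fewer threads than cores may perform better for memory-bound codes, so it is worth timing a few different values.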

You will almost certainly need MPI to run an application at scale on Frontera. TACC systems feature a special MPI starter called ibrun that streamlines the process for users. Among other benefits, ibrun works with Slurm's batch environment to produce suitable hostlists for jobs. It also provides a uniform interface for different MPI stacks. Otherwise, you would need to remember the varied run commands you see here:

Intel MPI: mpiexec.hydra
MVAPICH2: mpirun_rsh
OpenMPI: mpirun (not yet available on Frontera)
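
By contrast, a launch through ibrun might look like the sketch below. It assumes the command is issued from inside a Slurm job on a TACC system, where ibrun takes the task count and host list from the batch environment. The executable name my_mpi_app and the helper function run_mpi are hypothetical, introduced only for illustration.

```shell
# Minimal sketch: launch an MPI program through ibrun from inside a Slurm job.
# Assumption: ./my_mpi_app is a hypothetical executable; substitute your own.
run_mpi() {
    app="$1"
    if command -v ibrun >/dev/null 2>&1; then
        # On a TACC system, ibrun works with Slurm to build the host list.
        ibrun "$app"
    else
        # Anywhere else, just report what would have been run.
        echo "ibrun not found; on Frontera this would run: ibrun $app"
    fi
}

run_mpi ./my_mpi_app
```

The same line works regardless of which MPI stack is loaded, which is the point of having a uniform starter.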

Later, we'll look at an example batch script showing ibrun usage.

In addition, TACC systems provide several non-MPI launchers for high-throughput computing, in which many independent workers grab tasks from a workpile until all tasks are done. A parameter sweep is the prototypical example of this type of computing. Frontera's launcher utilities are not covered in depth here; to learn about any of them, consult module help, for example:

$ module help launcher

This will print some clues about usage. After loading one of the modules, you can also find relevant documentation in the directory $TACC_<modulename>_DIR (or a subdirectory), where <modulename> is the name of the module in all caps: launcher, launcher_gpu, pylauncher, or gnuparallel. In general, these utilities do not come with man pages; the exception is gnuparallel, for which you can run either man parallel or parallel --help for more information.
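
The lookup described above can be sketched as follows. The helper tacc_dir_var is hypothetical; it only builds the variable name from the $TACC_<modulename>_DIR pattern, and the sketch assumes the module actually sets a variable of that name once loaded.

```shell
# Minimal sketch: load a launcher module and list its documentation directory.
# Assumption: the module sets a variable named per the $TACC_<modulename>_DIR pattern.
tacc_dir_var() {
    # Build the variable name, e.g. launcher -> TACC_LAUNCHER_DIR
    echo "TACC_$(printf '%s' "$1" | tr 'a-z' 'A-Z')_DIR"
}

mod=launcher
var=$(tacc_dir_var "$mod")

if type module >/dev/null 2>&1; then
    module load "$mod"
    eval "docdir=\${$var}"      # read the value of the variable named in $var
    ls "$docdir"
else
    echo "module command not found; try this on a Frontera node (would list \$$var)"
fi
```

Changing mod to launcher_gpu, pylauncher, or gnuparallel would point the same steps at the other utilities.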
