Running via Slurm
On typical HPC systems, you submit parallel jobs to the compute nodes through Slurm, a batch scheduling system. Running MPI programs directly on the login nodes is generally not permitted, because doing so could degrade login node performance for other users.
The compute nodes are commonly grouped into different queues (called partitions in Slurm) according to the characteristics that the nodes in each group share.
The relatively brief time limit on jobs in the development queue is meant to keep wait times short.
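To see which partitions are available on a given system, along with their time limits and node counts, you can query Slurm directly. The following is just one way to format the query; the columns shown are the partition name, its time limit, and its node count.

# Summarize the available partitions (queues), their time limits, and node counts
sinfo -o "%P %l %D"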
Jobs can be submitted to Slurm using the sbatch command and a job script. The name of the script file appears as the first argument to sbatch. Special comments at the head of the job script provide Slurm with the parameters for the job. The purpose of many of these parameters can be inferred simply by examining the following sample script:
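The script below is a representative sketch; the queue, node and task counts, time limit, allocation name, and executable name (my_mpi_program) are placeholders that you would adjust for your own job and system.

#!/bin/bash
#SBATCH -J mpi_test            # job name
#SBATCH -o mpi_test.o%j        # standard output file (%j expands to the job ID)
#SBATCH -e mpi_test.e%j        # standard error file
#SBATCH -p development         # queue (partition) to submit to
#SBATCH -N 2                   # number of nodes requested
#SBATCH -n 112                 # total number of MPI tasks
#SBATCH -t 00:10:00            # wall-clock time limit (hh:mm:ss)
#SBATCH -A my_allocation       # project allocation to charge (placeholder)

# Launch the MPI executable on the allocated nodes
ibrun ./my_mpi_program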
The key command above is the final one, ibrun, which is a TACC-specific front end to the mpirun command that actually initiates the MPI processes on one or more nodes. By default, ibrun will distribute the full number of tasks across the full number of nodes requested via Slurm. On HPC systems other than TACC's, you would simply use mpirun directly (or equivalently, mpiexec) for the same purpose. The entire process of job initiation through Slurm is illustrated in the figure below:


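On a non-TACC Slurm cluster, the launch line in a comparable batch script might look like the sketch below instead. Using $SLURM_NTASKS to pass the task count is one common convention; the exact launcher invocation depends on the site's MPI installation.

# Equivalent launch line on a generic Slurm cluster (sketch)
mpirun -np $SLURM_NTASKS ./my_mpi_program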
Assuming the above batch file is saved in mpi_batch.sh, you would submit it by running the following on a TACC system like Frontera or Vista:
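# Submit the job script to Slurm
sbatch mpi_batch.sh

# Optionally, check the status of your queued and running jobs
squeue -u $USER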
The output and error files from the job are generated in your current directory. Note that the MPI environment in the batch job should match the MPI environment that you used to compile the code. At TACC, this will happen automatically if the correct environment modules are loaded when you submit your batch job.
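As a sketch, on a TACC system you might load and verify a matching compiler and MPI pair before submitting; the module names and versions shown here are illustrative and depend on the system.

# Load a compiler and a matching MPI module (names are system-dependent)
module load intel impi

# Confirm which modules are active before submitting the job
module list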
For a more in-depth explanation of how to use Slurm for running your MPI jobs on TACC systems, refer to the appropriate user guide, e.g., the Frontera User Guide or the Vista User Guide. Also, the Cornell Virtual Workshop roadmap Getting Started on Frontera offers a complete introduction to using compilers, libraries, and batch jobs on Frontera.
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)