Example Batch Script
The best place to look for Slurm scripts relevant to Frontera is the User Guide, which offers a number of customizable batch scripts that you may be able to adapt to fit your needs. They cover all the scenarios described previously in Ways to Run Jobs.
Here, we examine just one sample batch script that illustrates several of the techniques described in Optional Job Attributes. We also introduce a new trick for launching multiple MPI runs within a single batch job using ibrun:
#!/bin/bash
#SBATCH -J meaningfulName        # job name
#SBATCH -o meaningfulName.o%j    # standard output file (%j expands to the job ID)
#SBATCH -e meaningfulName.e%j    # standard error file
#SBATCH -n 70                    # total number of MPI tasks
#SBATCH -N 5                     # total number of nodes (14 tasks per node)
#SBATCH -p normal                # queue (partition)
#SBATCH -t 00:30:00              # wall-clock time limit, hh:mm:ss
#SBATCH -A P123456               # allocation (project) to charge

# Launch 5 MPI runs of 14 tasks each, offset so that each run gets its own node;
# the trailing & puts each run in the background so they execute concurrently.
ibrun -n 14 -o 0  ./mpi_prog1 &
ibrun -n 14 -o 14 ./mpi_prog2 &
ibrun -n 14 -o 28 ./mpi_prog3 &
ibrun -n 14 -o 42 ./mpi_prog2 &
ibrun -n 14 -o 56 ./mpi_prog3 &
wait                             # pause here until all background runs finish
Usually, ibrun ./mpi_prog is sufficient to start an MPI program. This command will automatically use the full number of tasks and nodes assigned to the batch job. Here, however, the -n and -o flags to ibrun are used to specify the task counts and hostfile offsets, respectively, for 5 separate MPI runs. Given that the above batch script requests a total of 70 tasks on 5 nodes, Slurm's hostfile will be constructed so that the first 14 entries refer to the first node, the next 14 entries refer to the second node, and so on; thus, each of the 5 runs is confined to a single node. (You can view Slurm's hostfile for a given job by issuing srun hostname in your batch script or interactive session.)
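For example, a quick sanity check on this layout (a minimal sketch; the hostnames shown will of course differ) is to tally how many tasks land on each node:

# Print one hostname per task, then count how many times each node appears.
# With -n 70 tasks on -N 5 nodes, each hostname should appear 14 times.
srun hostname | sort | uniq -c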
Other things to notice above are the ampersands (&) that launch the MPI runs in the background, allowing them to run simultaneously, and the wait command that pauses the script until all the background runs complete.
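If you also want the script to notice when one of the background runs fails, one possible refinement (not shown in the example above) is to record each run's process ID and wait on the runs individually, since wait with a PID argument returns that run's exit status:

pids=()                          # collect the PID of each background ibrun
ibrun -n 14 -o 0  ./mpi_prog1 &
pids+=($!)                       # $! is the PID of the most recent background command
ibrun -n 14 -o 14 ./mpi_prog2 &
pids+=($!)
# ...and so on for the remaining runs...
for pid in "${pids[@]}"; do
    wait "$pid" || echo "run with PID $pid exited with a nonzero status"
done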
Of course, the 5 MPI runs initiated by the script should all take about the same amount of time to complete; otherwise, you would be wasting compute nodes, as well as your allocation. The example also assumes that each MPI task needs 4 cores in order to run successfully (56 cores per node divided among 14 tasks), perhaps because each task is multithreaded with OpenMP, or because each task uses 4 cores' worth of memory. If not, each task would use only a fraction of the resources available on a 56-core Frontera node.
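If the tasks really are OpenMP-multithreaded, you would typically tell each one how many threads it may spawn before launching the runs. A minimal sketch, assuming the programs honor the standard OMP_NUM_THREADS environment variable (a step the original example does not show):

# 56 cores per node / 14 tasks per node = 4 cores available to each task
export OMP_NUM_THREADS=4         # let each MPI task spawn 4 OpenMP threads
ibrun -n 14 -o 0 ./mpi_prog1 &
# ...remaining ibrun commands and wait, exactly as in the script above...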