Steve Lantz (2021 author), Aaron Birkland (2014 author), with contributions from TACC
Cornell Center for Advanced Computing

Revisions: 8/2021, 4/2014 (original)

In this topic, we present a quick-start review of how Slurm functions as a resource manager, touching on how it allocates resources, executes workloads, and enforces security, before moving on to more advanced topics.

Objectives

After you complete this topic, you should be able to:

  • Explain in general terms how Slurm allocates computational resources, executes specified workloads, and maintains security
  • Explain the purposes of the sbatch, srun, and salloc commands
  • Identify different scenarios where the above commands would be used
  • Describe the sequence of events that occur within the sbatch and srun commands
  • Name the advantages conferred by using idev at TACC
  • Explain how Slurm ensures secure access to the resources that are assigned to a batch job

Prerequisites

  • Basic understanding of what Slurm is, how it generally works, and how it is used to submit jobs on an HPC cluster
  • Knowledge of the roles of MPI and OpenMP in applications that run on HPC systems
  • Familiarity with Linux and scripting languages (e.g., scripting in bash or Python)
 
© Cornell University | Center for Advanced Computing