Security
Aside from scheduling and queuing jobs from multiple users in order to fairly divide work among the compute nodes of a cluster, Slurm provides additional measures to ensure that jobs and users do not interfere with one another. Perhaps the most useful to understand are the features related to SSH access, job cleanup, and networking.
SSH Access
Slurm provides an optional Pluggable Authentication Module (PAM) to allow logins to compute nodes under certain circumstances. Policies can be implemented that allow users to ssh into nodes allocated to their own jobs, while denying access to nodes allocated to other users. In other words, if you attempt to ssh into a compute node that is assigned to someone else's job, you will be denied; if you attempt to ssh into compute nodes that are running your job(s), you will succeed.
SSH can be a useful tool for interacting with running jobs. When invoked from within a batch script, ssh can be used to execute commands on particular nodes within the allocation. Tools such as the TACC launcher are based on this principle. Likewise, it is possible to manually ssh into allocated nodes to access the node in an interactive shell. This can be useful for inspecting the state of a node running a job (e.g., with the top utility). CPU usage, I/O waiting, and other characteristics can be quickly and informally observed this way.
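As a rough sketch of the batch-script use described above, the following shell function runs one command on every node of the current allocation. It assumes it is called inside a Slurm job, where the SLURM_JOB_NODELIST variable is set and scontrol is on the path. The function name run_on_nodes is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: run one command on each node of the current allocation.
# Assumes it is called inside a Slurm job, where SLURM_JOB_NODELIST is set.
run_on_nodes() {
    cmd="$1"
    # scontrol expands a compact node list (e.g. c101-[001-003]) to one name per line.
    for host in $(scontrol show hostnames "$SLURM_JOB_NODELIST"); do
        # -n keeps ssh from reading the batch script's stdin.
        ssh -n "$host" "$cmd"
    done
}

# Example call from a batch script: a one-shot snapshot of top on each node.
# run_on_nodes "top -b -n 1 | head -n 15"
```

Because the PAM policy only admits you to nodes running your own jobs, a loop like this is expected to work only for nodes in your current allocation.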
Job Cleanup
After a job completes, Slurm may be configured to clean up each node of the allocation by running an epilog script as root. Many clusters use this ability to clear the /tmp storage on a node and to stop any leftover processes once a job terminates. Any processes owned by the previous user will be terminated (even those not directly initiated by Slurm, including the independent ssh sessions described above).
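To make the idea concrete, here is a minimal sketch of what such a cleanup step might look like. It is not the epilog used by any particular cluster. The function name cleanup_job_user and the DRY_RUN switch are hypothetical, and the sketch assumes that Slurm exports the job owner to the epilog as SLURM_JOB_USER and that GNU find is available.

```shell
#!/bin/bash
# Hypothetical epilog sketch; a real site epilog would need more safeguards
# (for example, skipping users who still have another job on the node).
cleanup_job_user() {
    user="$1"
    tmpdir="$2"
    # With DRY_RUN set, print each command instead of running it.
    run=${DRY_RUN:+echo}
    # Kill every remaining process owned by the job's user.
    $run pkill -KILL -u "$user"
    # Delete that user's files under the temporary directory.
    $run find "$tmpdir" -mindepth 1 -user "$user" -delete
}

# In an epilog, Slurm is assumed to export the job owner as SLURM_JOB_USER:
# cleanup_job_user "$SLURM_JOB_USER" /tmp
```

Setting DRY_RUN=1 before calling the function prints the two commands without executing them, which is a safer way to try the sketch on a shared system.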
Networking
While SSH access is restricted via PAM, Slurm has no influence over arbitrary communication ports that might be opened by applications that are running as part of a job. The remote desktop application VNC is a good example of a convenient tool that can open a port for incoming connections. Rather than have security depend entirely on the ability of applications like VNC to manage connections, Stampede2 and Frontera impose networking rules that disallow any direct network connections between the compute nodes and the outside world. If it is necessary to connect to a compute node from outside Stampede2 or Frontera, SSH tunneling must be used: first to a login node, then to the desired compute node.
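One possible way to set up such a tunnel is sketched below. It is only an illustration: the function name open_tunnel, the host names, and the port are placeholders, and the sketch assumes an OpenSSH client recent enough to support the -J (jump host) option and a site policy that permits this kind of forwarding.

```shell
#!/bin/bash
# Hypothetical helper: forward a local port to the same port on a compute node,
# hopping through a login node. Assumes OpenSSH 7.3 or later (for -J) and that
# you have a job running on the compute node, so the PAM policy lets you in.
open_tunnel() {
    login="$1"   # e.g. user@login.example.org (placeholder)
    node="$2"    # compute node name, optionally user@node (placeholder)
    port="$3"    # port the application listens on
    # -N: no remote command; -J: jump through the login node;
    # -L: local port -> localhost:port as seen from the compute node.
    ssh -N -J "$login" -L "$port:localhost:$port" "$node"
}

# Example (all names are placeholders):
# open_tunnel user@login.example.org c101-001 5901
```

With the tunnel open, a client on your own machine would connect to localhost on the chosen port, and the traffic would travel over SSH to the login node and then to the compute node.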