Managing Jobs
Zilu Wang, Steve Lantz
Cornell Center for Advanced Computing
12/2025 (original)
Vista is the forerunner to Horizon, which will ultimately be one of the largest academic supercomputers in the world. It is a resource provided through The University of Texas at Austin's Texas Advanced Computing Center (TACC), where it serves as a bridge to the NSF's Leadership Class Computing Facility (LCCF). Vista, like Horizon, is targeted towards scientific computing projects that require highly capable resources for AI and other HPC applications. It accordingly features NVIDIA "superchips" that closely couple CPUs with GPUs.
This topic shows you how to use different Slurm commands to track and control the progress of your batch job, and suggests what to try if something goes wrong.
Objectives
After you complete this topic, you should be able to:
- Explain how to monitor a job’s progress while it is running
- Discuss creating job dependencies
- Explain how to assign job attributes and why doing so may be useful
- Name key troubleshooting measures when a job does not run as expected
Prerequisites
Vista is intended as a bridge to a leadership-class system, so its prospective users are likely to have considerable experience with HPC and parallel computing already. For that reason, the pace of this topic is relatively brisk.
That said, there are no formal prerequisites for this Virtual Workshop topic. A working knowledge of Linux is recommended; if you need more preparation in Linux, try working through the Linux roadmap first.
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University).