Submit a Job to Slurm
Running GPU code on a personal computer is very different from running it on a large cluster system like Frontera or Vista. On a personal computer, you install the required packages, then compile and execute the application. On an HPC system like Frontera or Vista, there are more steps: first you run the module command to load CUDA-related utilities such as nvcc; then you compile your GPU code on one of the login nodes; then you prepare a batch script and submit the job to one of the GPU queues using Slurm.
This section provides step-by-step instructions for running a simple GPU job on Frontera or Vista. With some modification, the same steps should work on any cluster that features NVIDIA GPUs and a Slurm scheduler. Programs for the other lab exercises in this roadmap can be compiled and run in a similar manner.
Compilation
The first and most important step is to load the CUDA module, which gives you access to the NVIDIA nvcc compiler used to compile CUDA code on Frontera and Vista. The following command loads the CUDA module:
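```
# Load the default CUDA toolkit module, which puts nvcc on your PATH
module load cuda
```

(If an exercise calls for a particular CUDA version, module spider cuda lists the versions installed on the system.)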
In this roadmap, all lab exercises can be compiled using the following command and compiler flags.
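A representative invocation looks like this; gpu_example.cu and gpu_example are placeholder names for a lab's source file and executable:

```
# Compile CUDA source for the Turing GPUs (compute capability 7.5) on Frontera
nvcc -arch=compute_75 -code=sm_75 -o gpu_example gpu_example.cu
```

On Vista, whose GPUs are Hopper-based, the corresponding architecture values would be compute_90 and sm_90.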
The compute_75 and sm_75 values instruct the compiler to generate code for Turing-based NVIDIA GPUs, the family to which the Quadro RTX 5000 GPUs on Frontera belong. The -o flag specifies the name of the output executable.
Batch File Preparation
Your GPU job batch script looks like any other job script, but GPU jobs can only be submitted to a GPU queue:
- No GPU device is attached to the login nodes, so you cannot test your code there
- Not every Frontera and Vista compute node is equipped with a CUDA device—only the nodes in the GPU queues
- Note that you can request more than one GPU node
- In Frontera's GPU queues, each node has 4 Quadro RTX 5000 cards and 2 Intel Xeon E5-2620 v4 (“Broadwell”) CPUs
- In Vista's GPU queues, each node has 1 GH200 Grace Hopper Superchip, consisting of 1 "Hopper" (H200) GPU and 1 "Grace" CPU
Frontera and Vista schedule batch jobs using the Slurm resource manager. A sample Slurm batch script follows. The queue name used here is rtx-dev, Frontera's GPU development queue. The rtx queue also works for GPU jobs, but you may have to wait longer in the queue.
The following tables give the names and time limits of the Slurm partitions (queues) suitable for GPU jobs on Frontera and Vista, respectively.
Frontera:

| Partition Name | Time Limit | Description |
|---|---|---|
| rtx | 48 hrs | GPU nodes |
| rtx-dev | 2 hrs | GPU development nodes |

Vista:

| Partition Name | Time Limit | Description |
|---|---|---|
| gh | 48 hrs | GPU nodes |
| gh-dev | 2 hrs | GPU development nodes |
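To run the same job on Vista, only the partition line of the sample script needs to change (along with recompiling for the Hopper architecture, as noted above), for example:

```
#SBATCH -p gh-dev             # Vista's GPU development queue
```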
Job Submission
Once your source code is compiled and your batch file is prepared, submit your job using the sbatch command:
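Assuming the script above is saved as gpu_test.sh (a placeholder name):

```
# Submit the batch script to Slurm
sbatch gpu_test.sh
```

Slurm responds with the ID of the submitted job, and the job's standard output will appear in the file named by the -o directive once the job runs.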