Scripting can be helpful for looping over a set of parameters (numerical values, input file names, etc.) to produce one job submission per parameter. Of course, this technique can only succeed if the computations for the different parameter values are completely independent of one another.

Given the queue limits on Stampede2 and Frontera—where even the most expansive queues allow no more than 50 and 100 jobs per user, respectively—this approach is most useful when:

  • the number of parameter values is small (i.e., fewer than 50 or 100), but
  • the combined run time for all parameter values is expected to exceed the time limit of a single job, and
  • the run time varies strongly from one parameter value to the next, so that a single unified parallel job would likely be inefficient due to poor load balance.

As a side note for Frontera: if each individual run is serial, the jobs must be submitted to the "small" queue, where the per-user job limit is much more restrictive (20).

Parameters are commonly passed to batch jobs via environment variables. In other words, the batch script is constructed to look at the values of certain environment variables to obtain the parameters it needs. Therefore, it becomes the responsibility of the submission script to define such variables appropriately for each sbatch invocation.

Alternatively, the parameters may be passed to the batch script as command-line arguments. This method works much like the one described below, and it is covered in the exercise at the end of this section.
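As a rough sketch of the argument-passing variant (not necessarily the form used in the exercise): sbatch forwards any words that follow the script name to the batch script as positional arguments, so the batch script can read its parameter from "$1". The function name submit_with_arg and the file name my_batch_script.sh are placeholders invented for this illustration.

```shell
# Submission side: pass one parameter value per job as a positional argument.
# Inside the batch script, the value arrives as "$1", for example:
#     my_program --some-param "$1"
submit_with_arg() {
    local script=$1   # batch script to submit (placeholder name supplied by the caller)
    shift             # remaining arguments are the parameter values
    local p
    for p in "$@"; do
        sbatch "$script" "$p"
    done
}
```

Calling submit_with_arg my_batch_script.sh 0.1 0.5 1.0 would issue three sbatch commands, one per value.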

The following example illustrates an effective mechanism for scripting batch parameter sweeps:

First, write a batch script that expects to receive a parameter through an environment variable. In this example, that variable is MY_PARAM. The script runs my_program with the command-line option --some-param, using ${MY_PARAM} as the option's argument. It is assumed that the value of this environment variable is passed to the batch script at launch time.
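A batch script of this kind might look roughly like the sketch below. It is written as a here-document so that pasting the snippet into a terminal creates the file. The file name my_batch_script.sh, the job name, the output file pattern, the queue, and the time limit are all placeholder assumptions, and my_program is assumed to be on the PATH; adjust them for your own system and allocation.

```shell
# Create the batch script (the file name is a placeholder chosen for this sketch).
# The quoted 'EOF' keeps ${MY_PARAM} from being expanded while the file is written.
cat > my_batch_script.sh <<'EOF'
#!/bin/bash
#SBATCH -J sweep            # job name (arbitrary)
#SBATCH -o sweep.%j.out     # stdout file; %j expands to the job ID
#SBATCH -p small            # queue; assumes Frontera's "small" queue for a serial run
#SBATCH -N 1                # one node
#SBATCH -n 1                # one task
#SBATCH -t 00:30:00         # wall-clock limit (placeholder value)

# MY_PARAM is expected to come from the environment in which sbatch was invoked.
my_program --some-param "${MY_PARAM}"
EOF
```

Outside of Slurm, the #SBATCH lines are ordinary comments, so a quick local check such as MY_PARAM=0.5 bash my_batch_script.sh exercises the same command line.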

Next, write a shell script (or use your favorite scripting language) that loops through the set of parameter values. On each iteration, the loop exports the environment variable MY_PARAM with the current value and then runs sbatch on the batch script.
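A minimal sketch of such a loop, wrapped in a shell function so that the batch script and the parameter values can be supplied by the caller. The function name submit_sweep and the file name my_batch_script.sh are hypothetical, chosen only for illustration.

```shell
# Submit one job per parameter value; each submission sees its own MY_PARAM.
submit_sweep() {
    local script=$1   # batch script that reads ${MY_PARAM}
    shift             # remaining arguments are the parameter values
    local p
    for p in "$@"; do
        export MY_PARAM="$p"   # placed in the environment before sbatch runs
        sbatch "$script"
    done
}
```

For example, submit_sweep my_batch_script.sh 0.1 0.5 1.0 would submit three jobs, with MY_PARAM set to 0.1, 0.5, and 1.0 in turn.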

The real question is whether each submission of the batch script will inherit the environment that existed in the submitting shell at the time sbatch was invoked. The answer is yes on Stampede2 and Frontera, because on those systems the environment at submission time is automatically exported to the batch script when the job launches. This may not be true in every Slurm configuration, however; on other systems, it may be necessary to use the --export option of sbatch to ensure that one or more (or all) environment variables are passed along to the batch script.
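Where the submission environment is not forwarded automatically, the loop can name the variable explicitly with --export. The sketch below uses the form --export=ALL,NAME=value, which asks Slurm to forward the whole submission environment and to set MY_PARAM for that job. The function name submit_sweep_explicit and the file name my_batch_script.sh are hypothetical, and a given site may prefer a narrower export list than ALL.

```shell
# Submit one job per parameter value, passing MY_PARAM explicitly via --export.
submit_sweep_explicit() {
    local script=$1   # batch script that reads ${MY_PARAM}
    shift             # remaining arguments are the parameter values
    local p
    for p in "$@"; do
        # ALL forwards the submission environment; MY_PARAM=... sets the variable for this job.
        sbatch --export=ALL,MY_PARAM="$p" "$script"
    done
}
```

With this form, no separate export statement is needed in the submitting shell, because the value travels on the sbatch command line.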

 
© Cornell University | Center for Advanced Computing