Computational work on Frontera is scheduled through the Slurm Workload Manager. You use Slurm commands to submit, monitor, and control your jobs, even interactive ones. Jobs submitted to the scheduler are queued, then run on the compute nodes when resources become available. Frontera's compute nodes are inaccessible to you until they have been allocated to your job by Slurm.
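
For instance, a typical sequence of Slurm commands might look like the sketch below; myjob.sh stands for a batch script you have written, and <jobid> for the job ID that sbatch reports back:

$ sbatch myjob.sh     # submit a batch script to the scheduler
$ squeue -u $USER     # check the status of your queued and running jobs
$ scancel <jobid>     # cancel a job if necessary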

Warning: Don't run jobs on the login node

If you simply launch your application code from the command line without going through the scheduler (or an equivalent mechanism), you are actually running it on the front-end (login) nodes.

Running on the login nodes violates the Good Conduct guidelines mentioned earlier, and it may even result in your account being suspended. Therefore:

  • Never launch a resource-heavy executable directly from the command line of a login node
  • Note that this policy applies to building your codes, too: make -j 4 is fine, but make -j 64 isn't

In this topic, we focus on the process of submitting jobs to run on the compute nodes. In the subsequent Managing Jobs topic, we look at how to monitor and adjust your jobs after they have been submitted.
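
As a preview, a minimal batch script for Frontera's normal queue might look something like the sketch below; the job name, file names, executable, task count, and account name are placeholders that you would replace with your own values:

#!/bin/bash
#SBATCH -J myjob          # job name (placeholder)
#SBATCH -o myjob.o%j      # output file name; %j expands to the job ID
#SBATCH -p normal         # queue (partition) to run in
#SBATCH -N 4              # number of nodes requested
#SBATCH -n 224            # total number of MPI tasks (placeholder)
#SBATCH -t 02:00:00       # wall clock time limit (hh:mm:ss)
#SBATCH -A myproject      # project/allocation to charge (placeholder)

ibrun ./my_executable     # ibrun is TACC's MPI launcher; my_executable is a placeholder

Submitting this script with sbatch places the job in the queue, and Slurm runs it on the compute nodes once they become available.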

Job Accounting

In Frontera's accounting system, 1 Service Unit (SU) equals the use of one typical compute node for one hour; in other words, 1 SU is 1 node-hour on a normal node. Basic node-hours are multiplied by a charge rate that may differ from 1.0 for jobs that run in specialized queues.

Info: Thus, for any given job, the total cost in SUs is computed as:

\[ \text{SUs billed} = (\text{number of nodes}) \times (\text{job duration in wall clock hours}) \times (\text{charge rate per node-hour}) \]

For example, a job that runs in the normal queue for 2 hours using 4 nodes will cost 8 SUs: \[4 \text{ nodes} \times 2 \text{ hours} \times 1.0 = 8 \text{ SUs}\]
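
If you want to sanity-check the cost of a planned job from the command line, the same arithmetic can be done with a quick calculation; the numbers below simply repeat the example above:

$ echo "4 * 2 * 1.0" | bc
8.0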

Note that the accounting system charges you only for the resources allocated while your job actually runs, not for the full amount of time you requested. Even so, your job will generally spend less time waiting in the queue if you set a time limit fairly close to what the job truly needs: it is easier for the scheduler to find a slot for the 2 hours you really need than for the 24 hours you might have requested arbitrarily in your job script.
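
For example, if you know your job finishes in well under 2 hours, a time limit directive along these lines (the value is only illustrative) is a better choice than requesting a full day:

#SBATCH -t 02:00:00     # request 2 hours of wall clock time rather than an arbitrary 24:00:00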

To display a summary of your TACC project balances and disk quotas at any time, execute:

$ /usr/local/etc/taccinfo
 