

A job is an allocation of resources such as compute nodes assigned to a user for an amount of time. Jobs can be interactive or batch (e.g. a script) scheduled for later execution.


NERSC provides an extensive set of example job scripts.

Once a job is assigned a set of nodes, the user is able to initiate parallel work in the form of job steps (sets of tasks) in any configuration within the allocation.

When you log in to a NERSC system you land on a login node. Login nodes are for editing, compiling, and preparing jobs; they are not for running jobs. From a login node you can interact with Slurm to submit job scripts or start interactive jobs.

NERSC supports a diverse workload including high-throughput serial tasks, full system capability simulations and complex workflows.


NERSC uses Slurm for cluster/resource management and job scheduling. Slurm is responsible for allocating resources to users, providing a framework for starting, executing, and monitoring work on allocated resources, and scheduling work for future execution.

Submitting jobs


sbatch is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
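For example, a minimal batch script might look like the following sketch. The option values mirror the defaults and node types discussed below; `./my_app` and the `m1234` account are placeholders:

```shell
#!/bin/bash
#SBATCH --qos=debug
#SBATCH --nodes=2
#SBATCH --time=10:00
#SBATCH --constraint=haswell
#SBATCH --account=m1234

# Launch the parallel tasks across the allocated nodes
srun ./my_app
```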

When you submit the job, Slurm responds with the job's ID, which will be used to identify this job in reports from Slurm.

nersc$ sbatch
Submitted batch job 864933

Slurm will also check your file system usage and reject the job if you are over your quota in your scratch or home file system. See the Quota Enforcement section below for more details.


salloc is used to allocate resources for a job in real time. Typically this is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.
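A typical interactive session might look like the following sketch; the specific option values are illustrative, and `./my_app` is a placeholder:

```shell
# Request one Haswell node for 30 minutes; once the allocation is
# granted, salloc spawns a shell within it
nersc$ salloc --nodes=1 --constraint=haswell --qos=interactive --time=30:00

# From that shell, launch parallel tasks with srun
nersc$ srun -n 32 ./my_app
```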


srun is used to submit a job for execution or initiate job steps in real time. A job can contain multiple job steps executing sequentially or in parallel on independent or shared resources within the job's node allocation. This command is typically executed within a script which is submitted with sbatch or from an interactive prompt on a compute node obtained via salloc.
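As a sketch, a batch script can run two job steps concurrently on disjoint halves of the allocation by backgrounding each srun and waiting for both; the program names and task counts are placeholders:

```shell
#!/bin/bash
#SBATCH --nodes=4

# Each job step gets 2 of the 4 allocated nodes; the trailing &
# runs them concurrently as independent job steps
srun -N 2 -n 64 ./app_one &
srun -N 2 -n 64 ./app_two &

# Do not let the batch script exit until both job steps finish
wait
```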


At a minimum a job script must include number of nodes, time, type of nodes (constraint), and quality of service (QOS). If a script does not specify any of these options then a default may be applied.


It is good practice to always set the account option (--account=<NERSC Repository>).

The full list of directives is documented in the man pages for the sbatch command (see man sbatch). Each option can be specified either as a directive in the job script:
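For example, a directive requesting two nodes looks like this at the top of the script:

```shell
#!/bin/bash
#SBATCH --nodes=2
```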


Or as a command line option when submitting the script:

nersc$ sbatch -N 2 ./

The command line and directive versions of an option are equivalent and interchangeable. If the same option is present both on the command line and as a directive, the command line will be honored. If the same option or directive is specified twice, the last value supplied will be used.

Also, many options have both a long form, e.g. --nodes=2, and a short form, e.g. -N 2. These are equivalent and interchangeable.

Many options are common to both sbatch and srun. For example, sbatch -N 4 ./ allocates 4 nodes to the job script, and srun -N 4 uname -n inside the job runs a copy of uname -n on each of the 4 nodes. If you don't specify an option in the srun command line, srun will inherit the value of that option from sbatch.

In these cases the default behavior of srun is to assume the same options as were passed to sbatch. This is achieved via environment variables: sbatch sets a number of environment variables with names like SLURM_NNODES and srun checks the values of those variables. This has two important consequences:

  1. Your job script can see the settings it was submitted with by checking these environment variables

  2. You should not override these environment variables. Also be aware that if your job script does certain tricky things, such as using ssh to launch a command on another node, the environment might not be propagated and your job may not behave correctly
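As a minimal sketch of the first point, a job script can read these variables directly:

```shell
#!/bin/bash
#SBATCH --nodes=2

# sbatch exports the submission settings into the job environment;
# SLURM_JOB_ID and SLURM_NNODES are set by Slurm
echo "Job ${SLURM_JOB_ID} was allocated ${SLURM_NNODES} nodes"

# srun inherits -N 2 from these environment variables
srun hostname
```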


If not specified, the following defaults apply:

| Option     | Cori        |
|------------|-------------|
| nodes      | 1           |
| time       | 10 minutes  |
| qos        | debug       |
| constraint | _           |
| account    | set in Iris |

Available memory for applications on compute nodes

Since the OS uses some of the memory on each compute node (out of 128 GB total on a Haswell node and 96 GB on a KNL node), the memory Slurm makes available to applications is 118 GB on a Haswell node and 87 GB on a KNL node.
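If a job needs the full usable memory of a node, it can request it explicitly with the standard sbatch --mem option, using the limits above. A sketch for a Haswell node:

```shell
# Request all 118 GB of usable memory on a Haswell node
#SBATCH --constraint=haswell
#SBATCH --mem=118GB
```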

Monitoring jobs

Continuously running squeue or sqs with e.g. watch, and especially running multiple instances of watch squeue or watch sqs, is not allowed. When many users do this at once it degrades the performance of the job scheduler, which is a shared resource.

If you must monitor your workload, run only a single instance of squeue or sqs. If watch is essential to your workflow, limit the refresh interval to at least one minute (watch -n 60) and be sure to terminate the process when you are not actively using it.

Additionally, the sacct command (sacct -X -s pd,r) uses less expensive queries for much of the same information, but the same advice about watch applies.

For users who are interested in monitoring their job's resource usage while the job is running, NERSC provides the ssh_job command.


sacct is used to report job or job step accounting information about active or completed jobs.

nersc$ sacct -j 864932
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode 
------------ ---------- ---------- ---------- ---------- ---------- -------- 
864932                     regular      m1234         64  COMPLETED      0:0 

More information about sacct output format options can be found at the SchedMD website.


sqs is used to view job information for jobs managed by Slurm. It is a custom script provided by NERSC which incorporates information from several sources.


See man sqs for details about all available options, and man squeue for information about job state and reason codes.

nersc$ sqs
864933  PD  elvis  first-job.*  2     10:00     0:00  2018-01-06T14:14:23  regular   avail_in_~48.0_days  None


sstat is used to display various status information of a running job or job step. For example, one may wish to see the maximum memory usage (resident set size) of all tasks in a running job.

nersc$ sstat -j 864934 -o JobID,MaxRSS
       JobID     MaxRSS 
------------ ---------- 
864934.0          4333K 

Information about sstat format options and other sstat examples are located at the SchedMD website.

Email notification

Slurm can send email when a job begins, ends, or fails. Add a mail-type directive to the job script:

#SBATCH --mail-type=begin,end,fail
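The destination address is set with the standard sbatch --mail-user option; the address below is a placeholder:

```shell
#SBATCH --mail-user=user@example.com
```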


To allow users to check resource usage of a job while it is running (e.g., by running top), NERSC provides the ssh_job command. This command must be run from a login node, and the syntax is

nersc$ ssh_job <job ID>

where the argument to the command is the Slurm job ID of interest. For multi-node jobs, this will SSH into the first node in the allocation. From this node, you can then SSH to other nodes allocated to the job.

Updating jobs

The scontrol command allows certain characteristics of a job to be updated while it is still queued. Once the job is running, most changes will not be applied.

Change timelimit

nersc$ scontrol update jobid=$jobid timelimit=$new_timelimit

Change QOS

nersc$ scontrol update jobid=$jobid qos=$new_qos

Change repository

nersc$ scontrol update jobid=$jobid account=$new_repo_to_charge


The new repo must be eligible to run the job.

Hold jobs

Prevent a pending job from being started:

nersc$ scontrol hold $jobid

Release held jobs

Allow a held job to accrue priority and run:

nersc$ scontrol release $jobid

Cancel jobs

Cancel a specific job:

nersc$ scancel $jobid

Cancel all jobs owned by a user:

nersc$ scancel -u $USER


This only applies to jobs which are associated with your accounts.

Quota Enforcement

Users will not be allowed to submit jobs if they are over quota in their scratch or home directories. This quota check is done twice, first when the job is submitted and again when the running job invokes srun. This could mean that if you went over quota after submitting the job, the job could fail when it runs. Please check your quota regularly and delete or archive data as needed.