
Queues and Charges

This page details the QOS and queue usage policies. Examples for each type of Cori job are available.


When a job runs on a NERSC supercomputer, charges accrue against one of the user's projects. The unit of accounting for these charges is the "NERSC Hour". The total number of NERSC hours a job costs is a function of:

  • the number of nodes and the walltime used by the job,
  • the QOS of the job, and
  • the "charge factor" for the system upon which the job was run.

Job charging policies, including the intended use of each queue, are outlined in more detail under "Policies". This page summarizes the limits and charges applicable to each queue.

Selecting a Queue

Jobs are submitted to different queues depending on the queue constraints and the user's desired outcomes. Each queue corresponds to a "Quality of Service" (QOS): each queue has a different service level in terms of priority, run and submit limits, walltime limits, node-count limits, and cost. At NERSC, the terms "queue" and "QOS" are often used interchangeably.

Most jobs are submitted to the "regular" queue, but a user with an urgent scientific need may submit to the premium queue for faster turnaround. Another user who will not need the results for many weeks may elect to use the low queue to cut down on costs. And a user who needs fast turnaround while operating a telescope could prearrange with NERSC to use the realtime queue for those runs. The user with the urgent need incurs a higher-than-regular charge to use the premium queue, while a user who can be flexible about turnaround time is rewarded with a substantial discount.
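
A specific QOS is requested with the Slurm -q (equivalently --qos) flag, either in the batch script preamble, e.g.,

#SBATCH -q premium

or on the command line, e.g., sbatch -q flex ./myscript.sl (the script name here is a placeholder).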

Assigning Charges

Users who are members of more than one project can select which one should be charged for their jobs by default. In Iris, under the "Compute" tab in the user view, select the project you wish to make default.

To charge to a non-default project, use the -A projectname flag in Slurm, either in the Slurm directives preamble of your script, e.g.,

#SBATCH -A myproject

or on the command line when you submit your job, e.g., sbatch -A myproject ./myscript.sl.

To use GPU nodes on Perlmutter, the project's GPU allocation account name (that is, the name that ends in _g, as in myproject_g) must be used. Otherwise, job submission will fail.
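
For example, a minimal Perlmutter GPU batch script preamble could look like the sketch below; the project name, node count, walltime, and executable are placeholders.

#!/bin/bash
# GPU allocation account name (note the _g suffix); "myproject_g" is a placeholder
#SBATCH -A myproject_g
# request Perlmutter GPU nodes and the regular QOS
#SBATCH -C gpu
#SBATCH -q regular
# placeholder node count and walltime
#SBATCH -N 1
#SBATCH -t 00:30:00

srun ./my_gpu_app   # placeholder executable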

Warning

For users who are members of multiple NERSC projects, charges are made to the default project, as set in Iris, unless the #SBATCH --account=<NERSC project> flag has been set.

Calculating Charges

The cost of a job is computed in the following manner: $$ \text{walltime in hours} \times \text{number of nodes} \times \text{QOS factor} \times \text{charge factor} $$.

Example

The charge for a job that runs for 40 minutes on 3 Haswell nodes in the premium QOS (QOS factor of 2) would be calculated as $$ \frac{40\ \text{mins}}{60\ \text{min/hr}} \times 3\ \text{nodes} \times 2 \times 140\ \text{NERSC-hours/node-hour} = \frac{2}{3} \times 3 \times 2 \times 140 = 560\ \text{NERSC-hours}.$$

Example

A job that ran for 35 minutes on 3 KNL nodes on Cori with the regular QOS (QOS factor of 1) would be charged: $$ \frac{35\ \text{mins}}{60\ \text{min/hr}} \times 3\ \text{nodes} \times 1 \times 80\ \text{NERSC-hours/node-hour} = 140\ \text{NERSC-hours}. $$

Note

Jobs in the "shared" QOS are only charged for the fraction of the node used.

Example

A job that ran for 12 hours on 4 physical cores (each core has 2 hyperthreads) on Cori Haswell with the shared QOS would be charged: $$ 12\ \text{hours} \times \frac{2 \times 4\ \text{CPUs}}{64\ \text{CPUs/node}} \times 140\ \text{NERSC-hours/node-hour} = 210\ \text{NERSC-hours}. $$

Note

Jobs are charged according to the resources they made unavailable for other jobs, i.e., the number of nodes reserved (regardless of use) and the actual walltime used (regardless of the specified limit).
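
The same arithmetic can be scripted. The shell sketch below recomputes the premium Haswell example above with bc; the variable names are illustrative.

# cost = walltime in hours x number of nodes x QOS factor x charge factor
walltime_hours="(40/60)"   # 40 minutes expressed in hours
nodes=3
qos_factor=2               # premium
charge_factor=140          # Cori Haswell, AY2021
echo "$walltime_hours * $nodes * $qos_factor * $charge_factor" | bc -l
# prints 559.999..., i.e. 560 NERSC hours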

Charge Factors for AY2021

These are incorporated into the total Charge per Node-Hour in the tables below.

| Architecture | Charge Factor |
| --- | --- |
| Cori Haswell | 140 |
| Cori KNL | 80 |
| Cori Large Memory Nodes (cmem, bigmem) | 140 |
| Perlmutter (pre-production in 2021) | 0 |

Note

During Allocation Year 2021, jobs run on Perlmutter will be free of charge.

Charge Factors for AY2022

Charge factors for Allocation Year 2022 are being renormalized around the performance of Perlmutter CPU and GPU nodes.

| Architecture | Charge Factor | Conversion: AY21 to AY22 |
| --- | --- | --- |
| Cori Haswell | 0.35 | Multiply by 0.0025 or divide by 400 |
| Cori KNL | 0.2 | Multiply by 0.0025 or divide by 400 |
| Cori Large Memory Nodes (cmem, bigmem) | 0.35 | Multiply by 0.0025 or divide by 400 |
| Perlmutter CPU | 1 | N/A |
| Perlmutter GPU | 1 | N/A |
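
As a consistency check, applying the conversion rule to the AY2021 charge factors reproduces the AY2022 values: $$ 140 \times 0.0025 = 0.35, \qquad 80 \times 0.0025 = 0.2. $$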

Note

Perlmutter GPU is allocated separately from the rest of the resources.

QOS Cost Factor: Charge Multipliers and Discounts

The QOS cost factor is a function of which queue a job runs in. If a job must be run urgently, a user might submit it to the premium queue and incur a 2x QOS factor. Jobs in the flex queue, on the other hand, receive a substantial discount in exchange for flexibility about walltime.

| QOS | QOS Factor | Conditions |
| --- | --- | --- |
| regular | 1 | (standard charge factor) |
| flex | 0.25 | uses Cori KNL nodes |
| flex | 0.5 | uses Cori Haswell nodes |
| premium | 2 | less than 20% of allocation has been used in premium queue |
| premium | 4 | more than 20% of allocation has been used in premium queue |
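
As an illustration, a flex submission on Cori KNL could begin like the sketch below. The node count and times are placeholders, and the --time-min directive (a standard Slurm option giving the minimum acceptable walltime) is shown because flexible-walltime jobs are typically expressed this way.

# request the flex QOS on Cori KNL nodes
#SBATCH -q flex
#SBATCH -C knl
# placeholder node count and walltime request
#SBATCH -N 4
#SBATCH -t 12:00:00
# minimum acceptable walltime (placeholder value)
#SBATCH --time-min=02:00:00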

QOS Limits and Charges

Perlmutter

| QOS | Max nodes | Max time (hrs) | Submit limit | Run limit | Priority | QOS Factor | Charge per Node-Hour¹ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| regular | 128 | 6 | 5000 | - | medium | - | 0 |
| interactive | 4 | 4 | 5000 | 2 | high | - | 0 |
| debug | 4 | 2 | 5000 | 5 | medium | - | 0 |
| preempt | 128 | 24 (preemptible after two hours) | 5000 | - | medium | - | 0 |
| early_science | - | 6 | 5000 | 10 | medium-high | - | 0 |
| sow | - | - | 5000 | - | - | - | 0 |

  • Nodes allocated by a "regular" QOS job are exclusively used by the job.

  • Jobs in the "preempt" QOS can be preempted after two hours. Preempted jobs can be requeued automatically by adding the --requeue sbatch flag; see the sketch after this list and the Preemptible Jobs section for details.

  • Nodes allocated by an "early_science" job are not shared with other jobs. Access is gated by membership in the early_science QOS in Iris.

  • The "sow" QOS is reserved for system testing by NERSC staff.
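
As an illustration, a preempt-QOS script that opts into automatic requeueing could begin as follows; the constraint, node count, walltime, and executable are placeholders.

#!/bin/bash
# request the preempt QOS; preemption becomes possible after two hours
#SBATCH -q preempt
# placeholder constraint and resource request
#SBATCH -C gpu
#SBATCH -N 2
#SBATCH -t 24:00:00
# let Slurm requeue the job automatically if it is preempted
#SBATCH --requeue

srun ./my_app   # placeholder executable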

Cori Haswell

| QOS | Max nodes | Max time (hrs) | Submit limit | Run limit | Priority | QOS Factor | Charge per Node-Hour |
| --- | --- | --- | --- | --- | --- | --- | --- |
| regular | 1932² | 48 | 5000 | - | 4 | 1 | 140 |
| shared³ | 0.5 | 48 | 10000 | - | 4 | 1 | 140³ |
| interactive | 64⁴ | 4 | 2 | 2 | - | 1 | 140 |
| debug | 64 | 0.5 | 5 | 2 | 3 | 1 | 140 |
| premium | 1772 | 48 | 5 | - | 2 | 2 -> 4⁵ | 280⁵ |
| flex | 64 | 48 | 5000 | - | 6 | 0.5 | 70 |
| overrun | 1772 | 4 | 5000 | - | 5 | 0 | 0 |
| xfer | 1 (login) | 48 | 100 | 15 | - | - | 0 |
| bigmem | 1 (login) | 72 | 100 | 1 | - | 1 | 140 |
| realtime | custom | custom | custom | custom | 1 | custom | custom |
| compile | 1 (login) | 24 | 5000 | 2 | - | - | 0 |

Cori KNL

| QOS | Max nodes | Max time (hrs) | Submit limit | Run limit | Priority | QOS Factor | Charge per Node-Hour |
| --- | --- | --- | --- | --- | --- | --- | --- |
| regular | 9489 | 48 | 5000 | - | 4 | 1 | 80 |
| interactive | 64⁴ | 4 | 2 | 2 | - | 1 | 80 |
| debug | 512 | 0.5 | 5 | 2 | 3 | 1 | 80 |
| premium | 9489 | 48 | 5 | - | 2 | 2 -> 4⁵ | 160⁵ |
| low | 9489 | 48 | 5000 | - | 5 | 0.5 | 40 |
| flex | 256 | 48 | 5000 | - | 6 | 0.25 | 20 |
| overrun | 9489 | 4 | 5000 | - | 7 | 0 | 0 |

JGI Accounts

There are 192 Haswell nodes reserved for the "genepool" and "genepool_shared" QOSes combined. Jobs run with the "genepool" QOS use these nodes exclusively; jobs run with the "genepool_shared" QOS can share nodes.

| QOS | Max nodes | Max time (hrs) | Submit limit | Run limit | Priority |
| --- | --- | --- | --- | --- | --- |
| genepool | 16 | 72 | 500 | - | 3 |
| genepool_shared | 0.5 | 72 | 500 | - | 3 |
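
For example, a whole-node genepool job could be requested with a preamble like the sketch below; the node count and walltime are placeholders, and the constraint line simply reflects that the reserved nodes are Haswell.

# request the JGI genepool QOS on the reserved Haswell nodes
#SBATCH -q genepool
#SBATCH -C haswell
# placeholder node count and walltime
#SBATCH -N 2
#SBATCH -t 12:00:00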

Discounts

  • Big job discount: "regular" QOS charges on Cori KNL are discounted by 50% when a job uses 1024 or more nodes. This discount applies only to the regular QOS on Cori KNL; a worked example follows the table below.

    | System Architecture | Big Job Discount | Conditions |
    | --- | --- | --- |
    | Cori KNL | 0.5 | Job using 1024 or more nodes in regular queue |
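
    For instance, a hypothetical 2-hour regular-QOS job on 1200 Cori KNL nodes (AY2021 charge factor 80) would qualify for the discount and be charged $$ 2\ \text{hours} \times 1200\ \text{nodes} \times 1 \times 80 \times 0.5 = 96{,}000\ \text{NERSC-hours}. $$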

In addition, several QOSes offer reduced charging rates:

  • The "low" QOS (available on Cori KNL only) is charged at 50% of the "regular" QOS rate, but the big job discount does not additionally apply.

  • The "flex" QOS is charged at 50% of the "regular" QOS rate on Cori Haswell and at 25% of the "regular" QOS rate on Cori KNL.

  • The "overrun" QOS is free of charge and is only available to projects that are out of allocation time. Please refer to the overrun section for more details.


  1. Until Perlmutter is officially in production, users will not be charged for time used on these resources. The start date for charging will be announced in advance. 

  2. Batch jobs submitted to the Haswell partition requesting more than 512 nodes must go through a compute reservation. 

  3. Shared jobs are only charged for the fraction of the node resources used. 

  4. Batch job submission is not enabled, and the 64-node limit applies per project, not per user. 

  5. The charge factor for "premium" QOS will be doubled once a project has spent more than 20 percent of its allocation in "premium".