NERSC Resource Usage Policies¶
NERSC allocates time on compute nodes and space on its file systems and HPSS system. Accounting and charging for the use of these resources are addressed below under Compute Usage Charging and HPSS Charges.
Policies for the use of shared login node resources are described below in the NERSC Login Node Policy.
Queue usage policies outlined below include Intended Purpose of Available QOSes and Held and InvalidQOS jobs are deleted after 12 weeks.
NERSC Login Node Policy¶
Do not run compute- or memory-intensive applications on login nodes. These nodes are a shared resource. NERSC may terminate processes that negatively affect other users or the systems.
On login nodes, typical user tasks include
- Compiling codes (limit the number of threads, e.g., make -j 8)
- Editing files
- Submitting jobs
Some workflows require interactive use of applications such as IDL, MATLAB, NCL, Python, and ROOT. For small datasets and short runtimes, it is acceptable to run these on login nodes. For extended runtimes or large datasets, these should be run in the batch queues.
NERSC has implemented usage limits on Perlmutter login nodes via Linux cgroup limits. These usage limits prevent inadvertent overuse of resources and ensure a better interactive experience for all NERSC users.
The following memory and CPU limits have been put in place on a per-user basis (i.e., all processes combined from each user) on Perlmutter.
Processes will be throttled to CPU limits.
Processes may be terminated with a message like "Out of memory" when exceeding memory limits.
If you must use the watch command, please use a much longer interval such as 5 minutes (=300 sec), e.g., watch -n 300 <your_command>.
NERSC provides a wide variety of QOSes.
To help identify processes that make heavy use of resources, you can use:
top -u $USER
/usr/bin/time -v ./my_command
In JupyterLab, there is a widget on the lower left-hand side of the UI that shows aggregate memory usage.
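In the same spirit as /usr/bin/time -v, peak memory can also be checked from a script before deciding whether a task belongs on a login node. The following is a minimal sketch, not a NERSC tool: the function name is hypothetical, and it assumes Linux, where `ru_maxrss` is reported in KiB.

```python
import resource
import subprocess
import sys


def peak_child_rss_kib(command: list[str]) -> int:
    """Run `command` to completion and return the peak resident set size.

    The value comes from RUSAGE_CHILDREN, so it is the largest peak among
    all child processes this Python process has waited for so far
    (KiB on Linux; other platforms may use different units).
    """
    subprocess.run(command, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Illustrative child process that allocates roughly 50 MB.
    cmd = [sys.executable, "-c", "data = bytearray(50_000_000)"]
    print(peak_child_rss_kib(cmd), "KiB")
```

Run it with a small test input first; the measurement itself starts the command, so it counts against the same per-user limits.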
To access NERSC systems, see Connecting to Login Nodes.
File System Allocations¶
Each user has a personal quota in their home directory and on the scratch file system, and each project has a shared quota on the Community File System. NERSC imposes quotas on space utilization as well as inodes (number of files). For more information about these quotas please see the file system quotas page.
HPSS Charges¶
HPSS charging is based on allocations of space in GBs which are awarded into accounts called HPSS projects. If a login name belongs to only one HPSS project, all its usage is charged to that project. If a login name belongs to multiple HPSS projects, its daily charge is apportioned among the projects using the project percents for that login name. Default project percents are assigned by Iris based on the size of each project's storage allocation.
Users can view their project percents on the "Storage" tab in the user view in Iris. To change your project percents, change the numbers in the "% Charge to Project" column.
For more detailed information about HPSS charging please see HPSS charging.
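The apportioning of a daily HPSS charge across projects can be sketched as follows. This is an illustration only: the function name and the project names are hypothetical, and the sketch assumes that a login name's project percents sum to 100.

```python
def apportion_daily_charge(daily_charge: float,
                           project_percents: dict[str, float]) -> dict[str, float]:
    """Split one login name's daily HPSS charge among its projects.

    Assumption (not stated in the policy text): the percents sum to 100.
    """
    total = sum(project_percents.values())
    if abs(total - 100.0) > 1e-9:
        raise ValueError("project percents must sum to 100")
    return {project: daily_charge * percent / 100.0
            for project, percent in project_percents.items()}


# Hypothetical projects "m0001" and "m0002" with a 75/25 split.
shares = apportion_daily_charge(200.0, {"m0001": 75.0, "m0002": 25.0})
print(shares)  # {'m0001': 150.0, 'm0002': 50.0}
```

A login name that belongs to a single project is the special case of one entry at 100 percent.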
Compute Usage Charging¶
When a job runs on a NERSC supercomputer, charges accrue against one of the user's projects (sometimes referred to as "repos"). The total charge for a job is the product of:
- the number of nodes used by the job
- the amount of (wallclock) time in hours used by the job
- a job priority charge factor
- a machine charge factor, which depends on which node architecture the job used
Sometimes, NERSC offers a big job discount on its systems. For details, refer to the policy page.
The job priority charge factor is set by selecting a QOS for the job.
Machine charge factors are set by NERSC to account for the relative performance of the architecture and the scarcity of the resource.
For 2022, the Machine Charge Factors are:
- Perlmutter CPU Nodes: 1.0
- Perlmutter GPU Nodes: 1.0
and the charge units are "CPU Node Hours" or "GPU Node Hours", respectively.
For example, a job that runs for 5 hours and uses 100 Perlmutter CPU nodes in the regular QOS (priority factor = 1) is charged 100 nodes × 5 hours × 1 (priority factor) × 1.0 (machine charge factor) = 500 CPU Node Hours.
The job-cost formula, along with charge factors for each system and queue, is outlined in Queues and Charges.
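The charge formula described in this section is a plain product of four terms, which can be sketched in a few lines. The function and parameter names are illustrative and not part of any NERSC tool; the default factors of 1.0 match the regular QOS and Perlmutter values quoted above.

```python
def job_charge(nodes: int, hours: float,
               qos_factor: float = 1.0,
               machine_factor: float = 1.0) -> float:
    """Return the charge in node hours.

    charge = nodes x wallclock hours x job priority (QOS) factor x machine factor
    """
    if nodes < 0 or hours < 0:
        raise ValueError("nodes and hours must be non-negative")
    return nodes * hours * qos_factor * machine_factor


# The example from the text: 100 Perlmutter CPU nodes for 5 hours, regular QOS.
print(job_charge(nodes=100, hours=5))  # 500.0
```

For any other QOS or system, the factors should be taken from the current Queues and Charges page rather than from this sketch.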
Charges are based on resources allocated for the job¶
Job charges are based on the footprint of the job: the space (in terms of nodes) and time (in terms of wallclock hours) that the scheduler has reserved for the job based on what the user requested. A job that was allocated 100 nodes and ran on only one of the nodes will still be charged for the use of 100 nodes. However, the job is charged only for the wall time actually used, that is, until the job script terminates.
Because a reservation takes up space and time that could be otherwise used by other users' jobs, users are charged for the entirety of any reservation they request, including any time spent rebooting nodes and any gaps in which no jobs are running in the reservation.
Reservations are always charged at standard rates and are not eligible for any discounts, no matter the size.
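The difference between job and reservation charging can be made concrete with a small sketch. The function names are hypothetical, and charge factors are left out (treated as 1) so that only the node and time footprint is shown.

```python
def job_node_hours(allocated_nodes: int, elapsed_hours: float) -> float:
    """Jobs are charged for every allocated node, but only for elapsed wall time."""
    return allocated_nodes * elapsed_hours


def reservation_node_hours(reserved_nodes: int, reserved_hours: float) -> float:
    """Reservations are charged for the whole reserved window, idle gaps included."""
    return reserved_nodes * reserved_hours


# A job allocated 100 nodes that computes on only one node and ends after 2 hours.
print(job_node_hours(100, 2.0))          # 200.0
# A 100-node reservation for 6 hours, however much of it is actually used.
print(reservation_node_hours(100, 6.0))  # 600.0
```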
Running out of Allocation¶
Accounting information for the previous day is finalized in Iris once daily (in the early morning, Pacific Time). At this time actions are taken if a project or user balance is negative.
If a project runs out of time (or space in HPSS) all login names which are not associated with another active project are restricted:
- On computational machines, restricted users are able to log in, but cannot submit batch jobs or run parallel jobs, except to the "overrun" partition.
- For HPSS, restricted users are able to read data from HPSS and delete files, but cannot write any data to HPSS.
Login names that are associated with more than one project (for a given resource -- compute or HPSS) are checked to see if the user has a positive balance in any of their projects (for that resource). If they do have a positive balance (for that resource), they will not be restricted and the following will happen:
- On computational machines the user will not be able to charge to the restricted project. If the restricted project had been the user's default project, they will need to change their default project through Iris, or specify a different project with sufficient allocation when submitting a job, or run jobs in overrun only.
Likewise, when a user goes over their individual user quota in a given project, that user is restricted if they have no other project to charge to. A PI or Project Manager can change the user's quota.
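The restriction rules above reduce to one check per resource (compute or HPSS): a login name is restricted only when none of its projects has a positive balance. The following sketch uses hypothetical function and project names and is not how Iris is implemented.

```python
def is_restricted(balances: dict[str, float]) -> bool:
    """True when no project for this resource has a positive balance."""
    return not any(balance > 0 for balance in balances.values())


def chargeable_projects(balances: dict[str, float]) -> list[str]:
    """Projects the user can still charge to for this resource."""
    return sorted(name for name, balance in balances.items() if balance > 0)


# Hypothetical user in two projects; "m0001" has run out of time.
compute_balances = {"m0001": -12.5, "m0002": 340.0}
print(is_restricted(compute_balances))        # False
print(chargeable_projects(compute_balances))  # ['m0002']
```

In this example the user is not restricted, but would need to charge jobs to the second project or run in overrun.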
In Iris, users can view graphs of their own compute and storage usage under the "Jobs" and "Storage" tabs in the user view, respectively. Likewise a user can view the compute and storage usage of their projects under the same tabs in the project view in Iris.
In addition, there is a "Reports" menu at the top of the page from which users can create reports of interest. For more information please see the Iris Users Guide.
Intended Purpose of Available QOSes¶
There are many different QOSes at NERSC, each with a different purpose. Most jobs should use the "regular" QOS.
The regular QOS is the standard queue for most production workloads.
The debug QOS is for code development, testing, and debugging. Production runs are not permitted in the debug QOS. User accounts are subject to suspension if they are determined to be using the debug QOS for production computing. In particular, job "chaining" in the debug QOS is not allowed. Chaining is defined as using a batch script to submit another batch script.
The interactive QOS is for code development, testing, debugging, analysis, and other workflows in an interactive session. Jobs should be submitted as interactive jobs, not batch jobs.
Jobs which can handle premature termination (via checkpoint/restart or other methods) are well-suited for the preempt QOS. To encourage use of the preempt QOS, job costs are discounted; see our QOS cost table for the most up-to-date charge factors. To deter misuse, preempt QOS jobs are charged for at least the guaranteed non-preemptable walltime available via the QOS. That is, based on the current job limits and charges, preempt QOS jobs are charged for a minimum of 2 hours of walltime, regardless of actual walltime elapsed.
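The minimum charge for preempt jobs can be sketched as a floor on the charged walltime. The 2-hour floor is the value quoted above; the function names are hypothetical, and the 0.5 QOS factor in the example is a placeholder, not the actual preempt discount (see the QOS cost table for that).

```python
PREEMPT_MIN_CHARGED_HOURS = 2.0  # from the policy text; subject to change


def preempt_charged_hours(elapsed_hours: float) -> float:
    """Walltime used for charging: actual elapsed time, but never below the floor."""
    return max(elapsed_hours, PREEMPT_MIN_CHARGED_HOURS)


def preempt_charge(nodes: int, elapsed_hours: float,
                   qos_factor: float, machine_factor: float = 1.0) -> float:
    """Charge in node hours for a preempt QOS job."""
    return nodes * preempt_charged_hours(elapsed_hours) * qos_factor * machine_factor


# A 10-node job that ends after 30 minutes is still charged for 2 hours.
print(preempt_charge(nodes=10, elapsed_hours=0.5, qos_factor=0.5))  # 10.0
```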
The intent of the premium QOS is to allow for faster turnaround for unexpected scientific emergencies where results are needed right away. NERSC has a target of keeping premium usage at or below 10 percent of all usage. Premium should be used infrequently and with care; the charge factor for premium will increase once a project has used 20 percent of its allocation on premium. PIs will be able to control which of their users can use premium for their allocation. Note that premium jobs are not eligible for discounts.
The overrun QOS is for projects whose NERSC-hours balance is zero or negative. The charging rate for this QOS is 0 and it has the lowest priority on all systems. Overrun QOS jobs must be submitted with the --time-min flag and are subject to preemption after 2 hours by higher priority workloads under certain circumstances.
Users who have a zero or negative balance in a project that has a positive balance cannot submit to the overrun QOS. PIs can adjust a project member's share of the project allocation in Iris; instructions are in the Iris guide for PIs.
The "realtime" QOS is available only via special request. It is intended for jobs that are connected with an external realtime component that requires on-demand processing.
Held and InvalidQOS jobs are deleted after 12 weeks¶
Jobs held by users and jobs in the InvalidQOS state are deleted after 12 weeks in the queue.