Opening an SSH connection to a NERSC system connects you to a login node. Systems typically have multiple login nodes behind a load balancer, and new connections are assigned a random node. If an account has connected recently, the load balancer will attempt to assign the same login node as the previous connection.
Do not run compute- or memory-intensive applications on login nodes. These nodes are a shared resource; NERSC may terminate processes that negatively impact other users or the systems.
On login nodes, typical user tasks include:

- Compiling codes (limit the number of threads, e.g., `make -j 8`)
- Editing files
- Submitting jobs
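The parallel-build advice above can be sketched as a small script that picks a conservative `-j` value from the number of visible cores. The quarter-of-cores heuristic and the cap of 8 jobs are illustrative assumptions, not NERSC policy:

```shell
#!/bin/sh
# Pick a conservative parallel build width for a shared login node.
# Uses at most a quarter of the visible cores, capped at 8 jobs
# (both values are illustrative choices, not NERSC-mandated limits).
cores=$(nproc)
jobs=$(( cores / 4 ))
if [ "$jobs" -lt 1 ]; then jobs=1; fi
if [ "$jobs" -gt 8 ]; then jobs=8; fi
echo "building with -j $jobs"
# make -j "$jobs"    # uncomment to run the actual build
```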
Some workflows require interactive use of applications such as IDL, MATLAB, NCL, Python, and ROOT. For small datasets and short runtimes it is acceptable to run these on login nodes; for extended runtimes or large datasets, run them through the batch queues.
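On a Slurm-based system such as Cori, one way to move such work off a login node is an interactive allocation with `salloc`. The flags below are placeholder values to adapt, not a prescription; consult the current queue policies for valid QOS names and time limits:

```shell
# Request an interactive session on a compute node instead of running
# a long MATLAB/Python analysis on a login node. Node count, QOS name,
# and time limit shown here are illustrative examples only.
salloc --nodes=1 --qos=interactive --time=30:00
```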
NERSC has implemented usage limits on Cori login nodes via Linux control groups (cgroups). These limits prevent inadvertent overuse of resources and ensure a better interactive experience for all NERSC users.
The following memory and CPU limits are applied on a per-user basis (i.e., to all processes combined from each user) on Cori:

| Node type | Memory limit | CPU limit |
|-----------|--------------|-----------|
Processes that exceed the CPU limit will be throttled. Processes that exceed the memory limit may be terminated with a message such as "Out of memory".
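To see how close your session is to its memory limit, you can read the cgroup accounting files directly. The sketch below tries a common cgroup v2 path and then a v1 path; both paths are assumptions, and the exact hierarchy on a given system may differ:

```shell
#!/bin/sh
# Report cgroup memory usage, trying a cgroup v2 path then a v1 path.
# These paths are common Linux defaults and may differ on NERSC systems.
usage=""
for f in /sys/fs/cgroup/memory.current \
         /sys/fs/cgroup/memory/memory.usage_in_bytes; do
    if [ -r "$f" ]; then
        usage=$(cat "$f")
        src=$f
        break
    fi
done
if [ -n "$usage" ]; then
    echo "cgroup memory usage: $usage bytes ($src)"
else
    echo "no readable cgroup memory file found"
fi
```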
If you must use the `watch` command, please use a much longer interval, such as 5 minutes (300 seconds), e.g., `watch -n 300 <your_command>`.
NERSC provides a wide variety of QOS (quality of service) options for running such workloads through the batch system instead.
To help identify processes that make heavy use of resources, you can use:

- `top -u $USER`
- `/usr/bin/time -v ./my_command`
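As a complement to `top` and `/usr/bin/time`, the one-liner below (an illustrative `ps`/`awk` combination, not a NERSC-provided tool) sums the resident memory of all your processes, which is roughly what the per-user memory limit is measured against:

```shell
# Sum resident set size (RSS, reported by ps in KiB) across all of
# your processes, and print the total in MiB.
ps -u "$USER" -o rss= | awk '{ total += $1 } END { printf "total RSS: %.1f MiB\n", total / 1024 }'
```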
On Jupyter, there is a widget on the lower left-hand side of the JupyterLab UI that shows aggregate memory usage.