Interactive Jobs

Allocation

salloc is used to allocate resources in real time to run an interactive batch job. Typically, this is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.
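For example, a minimal interactive session might look like this (./my_app is a placeholder for your own executable):

salloc --nodes 1 --qos interactive --time 00:30:00 --constraint cpu --account=mxxxx
srun --ntasks 4 ./my_app
exit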

"interactive" QOS on Perlmutter

Perlmutter has a dedicated interactive QOS to support medium-length interactive work. This queue is intended to deliver nodes for interactive use within 6 minutes of the job request.

Warning

On Perlmutter, if you have not set a default account, salloc may fail with the following error message:

salloc: error: Job request does not match any supported policy.
salloc: error: Job submit/allocate failed: Unspecified error
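Specifying the account explicitly with --account, as in the examples below, avoids this error. You can also set a default account through Iris, NERSC's allocation management interface.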

Perlmutter GPU nodes

salloc --nodes 1 --qos interactive --time 01:00:00 --constraint gpu --gpus 4 --account=mxxxx_g

When using srun, you must explicitly request GPU resources

One must use the --gpus (-G), --gpus-per-node, or --gpus-per-task flag to make the allocated node's GPUs visible to your srun command.

Otherwise, you will see errors similar to:

 no CUDA-capable device is detected

 No Cuda device found
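For example, within the GPU allocation requested above, a quick way to confirm that all four GPUs are visible is:

srun --ntasks 1 --gpus-per-node 4 nvidia-smi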

When requesting an interactive node on the Perlmutter GPU compute nodes

One must use a project name that ends in _g (e.g., mxxxx_g) to submit any jobs to run on the Perlmutter GPU nodes. The constraint flag must also be set to gpu for any interactive jobs (-C gpu or --constraint gpu).

Otherwise, you will see errors such as:

sbatch: error: Job request does not match any supported policy.
sbatch: error: Batch job submission failed: Unspecified error
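If you are unsure which project accounts you belong to, a standard Slurm query such as the following will list them (assuming the site exposes sacctmgr to users):

sacctmgr show associations user=$USER format=Account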

Perlmutter CPU nodes

salloc --nodes 1 --qos interactive --time 01:00:00 --constraint cpu --account=mxxxx

Perlmutter "debug" QoS

To get access to a CPU node, you can run the following command:

salloc --nodes 1 --qos debug --time 20:00 --constraint cpu --account=mxxxx

For a GPU node, you can run:

salloc --nodes 1 --qos debug --time 20:00 --constraint gpu --account=mxxxx_g

Limits

There is a 6-minute limit on the time to wait for an interactive job; if the reservation is not granted within the allotted time, the job will be cancelled.

Interactive jobs are limited to a maximum of 4 nodes on both the CPU and GPU partitions. For more details, see QOS Limits and Charges.