Migrating from Cori to Perlmutter

Cori Retirement Plan

Cori had its first users in 2015, and since then, NERSC's longest running system has been a valuable resource for thousands of users and projects. With the complete Perlmutter system to be operational during the 2023 allocation year, NERSC plans to decommission Cori on May 31, 2023 at noon, Pacific time.

We will begin decommissioning auxiliary systems associated with Cori at the end of March. On March 31 at noon, the Cori large memory nodes will be taken offline in preparation for their migration to Perlmutter, and the Cori GPU nodes will be retired. The Haswell and KNL nodes will continue to operate through May 31, at noon, Pacific time.

Cori Retirement Timeline

  • Oct 2022: Software freeze (no new user-facing software installed by NERSC)
  • Allocation Year 2023: All allocations based on Perlmutter’s capacity only
  • Nov 2022 - Jan 2023: Cori to Perlmutter transition training focus & office hours
  • Jan 30, 2023: Cori retirement date (end of Apr 2023) announced
  • Feb - May 2023: More Cori to Perlmutter transition training focus & office hours
  • Mar 31, 2023, at noon: Retire Cori GPU nodes, and take large memory nodes offline
  • Apr 17, 2023: Final date (T, i.e., May 31, 2023 at noon, Pacific time) for decommissioning announced
  • T - 1 week: Implement reservation, preventing new jobs from running effective T
  • T: Delete all jobs from queue, no new jobs can be submitted; continue to allow login to retrieve files from Cori scratch
  • T + 1 week: Close login nodes permanently
  • T + 1 month: Disassembly begins

System Architectures

Cori has 1,900 Intel Haswell and 9,300 Intel KNL CPU nodes. Perlmutter has 1,536 Nvidia A100 GPU nodes and 3,072 AMD CPU-only nodes.

Detailed system architecture information can be found at: Cori Architecture and Perlmutter Architecture. Below is a quick comparison table:

| Attribute | Cori | Perlmutter |
|---|---|---|
| Peak Performance | ~30 PF | ~120 PF |
| System Memory | >1 PB | >2 PB |
| Node Performance | >3 TF | >70 TF |
| Node Processors | Intel KNL + Intel Haswell | AMD EPYC (Milan) + Nvidia A100 GPUs |
| Number of Nodes | 9,300 KNL + 1,900 Haswell | 1,536 GPU-accelerated + 3,072 CPU-only |
| Intra-Node Interconnect | N/A | NVLink across GPUs; PCIe |
| Inter-Node Interconnect | Aries | Slingshot |
| File System | 28 PB, 0.75 TB/s | 35 PB all-flash, >4 TB/s |

Cori / Perlmutter Comparison: Similarities

Perlmutter and Cori have similar Cray user environments. Both provide PrgEnv-xxx modules, and compiler wrappers (ftn, cc, and CC) are used to build applications. On Perlmutter, xxx can be gnu, nvidia, or cray.

The batch scheduler on both systems is Slurm, with familiar interactive, debug, regular, premium, shared, and overrun queues, etc.

Both Perlmutter and Cori have CPU nodes with standard CPU architectures that function similarly. Perlmutter CPU nodes use AMD processors, while Cori CPU nodes use Intel processors. Perlmutter CPU nodes have a clock speed similar to that of Cori Haswell nodes, and a number of cores per node similar to that of Cori KNL nodes.

There are Python, Jupyter, various profiling and debugging tools, workflow tools, science application packages, Data Analytics, and ML/DL packages installed on Perlmutter CPUs, as we have on Cori. The Extreme-scale Scientific Software Stack (E4S) is also available on Perlmutter.

Migrating applications from Cori Haswell to Perlmutter CPU-only nodes is straightforward.
Jobs running on Perlmutter CPU nodes are charged against your CPU allocations, like runs on Cori Haswell or KNL nodes.

Cori / Perlmutter Comparison: Differences

Perlmutter uses Lmod modules, which differ slightly from the Tcl modules on Cori. Most module commands are the same, but the modules are organized differently: Lmod is hierarchical. For example, a module may not be visible initially because the modules it depends on are not yet loaded, so use module spider rather than module avail to search the full hierarchy.
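
As a rough illustration of the search pattern described above, the sketch below wraps module spider in a small helper. The helper name find_module and the module name in the usage comment are illustrative assumptions, not names taken from NERSC documentation; the guard simply avoids an error when the script runs on a machine without a module command.

```shell
# Search the full Lmod hierarchy for a module, including modules that
# "module avail" does not list until their prerequisites are loaded.
# find_module is a hypothetical helper name used only for this sketch.
find_module() {
    local name="$1"
    if ! type module >/dev/null 2>&1; then
        echo "module command not found; run this on a Perlmutter login node" >&2
        return 1
    fi
    module spider "$name"
}

# Usage (module name chosen only as an example):
#   find_module cray-hdf5
```

Running module spider with a full name and version typically also reports which modules must be loaded first.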

Cori supports Intel (default), CCE, and GCC compilers. Perlmutter supports GCC (default), Nvidia, CCE, LLVM, and Intel compilers.

Perlmutter also has Nvidia GPU nodes, which require substantially different programming models in order to exploit the GPU. User codes may have different GPU-compatible and CPU-only versions. More profiling and debugging tools, science application packages, Data Analytics, and ML/DL packages are installed on Perlmutter for use on GPUs.

Jobs running on Perlmutter GPU nodes are charged against your GPU allocations.

Compiling and Running on CPU Nodes

We recommend running module load cpu to set up a cleaner CPU environment, because the gpu environment is loaded by default (see the GPU section below). In most cases the GPU settings have no effect on CPU applications, so this step may not be necessary. Compiling and running on Perlmutter CPU nodes is very similar to doing so on Cori Haswell nodes.

To compile on Perlmutter:

  • The default compiler is GCC.
  • Using the compiler wrappers (ftn, cc, and CC) will link the default cray-mpich libraries.
  • Use module load PrgEnv-xxx to switch to another compiler (where xxx can be nvidia, cray, or gnu). There is no need to do module swap PrgEnv-xxx PrgEnv-yyy as on Cori.
  • To enable OpenMP, use the -fopenmp flag for the GCC and CCE compilers, and the -mp flag for the Nvidia compiler.
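
The OpenMP flag choice in the list above can be captured in a small helper for build scripts. This is a minimal sketch: the function name openmp_flag and the file names in the usage comment are hypothetical, and the mapping covers only the three programming environments named above.

```shell
# Map a programming environment name (the xxx in PrgEnv-xxx) to the
# OpenMP flag listed above. openmp_flag is a hypothetical helper name.
openmp_flag() {
    case "$1" in
        gnu|cray) echo "-fopenmp" ;;   # GCC and CCE compilers
        nvidia)   echo "-mp" ;;        # Nvidia compiler
        *)        echo "unknown programming environment: $1" >&2
                  return 1 ;;
    esac
}

# Usage with the C compiler wrapper (source and output names are examples):
#   module load PrgEnv-nvidia
#   cc $(openmp_flag nvidia) -o mycode.exe mycode.c
```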

A few quick tips may be helpful for compiling older codes (that worked on Cori) on Perlmutter with the default GCC compiler:

  • Fortran: Try -fallow-argument-mismatch first, followed by the more extensive flag -std=legacy to reduce strictness.
  • C/C++: Look for flags that reduce strictness, such as -fpermissive.
  • C/C++: -Wpedantic can warn you about lines that break code standards.
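
The Fortran tip above suggests trying the narrower flag before the broader one. A minimal sketch of that escalation follows; the function name first_working_flags is hypothetical, and the FC override is an assumption added so the helper can be tried with a compiler other than the ftn wrapper.

```shell
# Print the first flag set, in the order suggested above, with which the
# given Fortran source compiles. first_working_flags is a hypothetical name.
# FC defaults to the ftn wrapper; set FC to try another compiler.
first_working_flags() {
    local src="$1"
    local compiler="${FC:-ftn}"
    local flags
    for flags in "" "-fallow-argument-mismatch" "-std=legacy"; do
        # $flags is intentionally unquoted so the empty case adds no argument.
        if "$compiler" $flags -c "$src" -o /dev/null >/dev/null 2>&1; then
            echo "${flags:-(no extra flags)}"
            return 0
        fi
    done
    return 1   # none of the flag sets produced a successful compile
}

# Usage (file name is an example):
#   first_working_flags legacy_solver.f90
```

Compiler messages are discarded here to keep the output short; when nothing works, rerun the compile by hand to see the actual errors.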

Running jobs on Perlmutter CPU nodes is very similar to running on Cori Haswell nodes. One particular thing to point out is how to set the -c value in the srun line. The table below shows the compute node comparisons and how the -c value is calculated for Cori Haswell, Cori KNL, Perlmutter CPU nodes, and the CPU on Perlmutter GPU nodes. (Note: "tpn" = tasks per node)

| Attribute | Cori Haswell | Cori KNL | Perlmutter CPU | CPU on Perlmutter GPU |
|---|---|---|---|---|
| Physical cores | 32 | 68 | 128 | 64 |
| Logical CPUs per physical core | 2 | 4 | 2 | 2 |
| Logical CPUs per node | 64 | 272 | 256 | 128 |
| NUMA domains | 2 | 1 | 8 | 4 |
| -c value for srun | floor(32/tpn)*2 | floor(68/tpn)*4 | floor(128/tpn)*2 | floor(64/tpn)*2 |
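
The -c formula in the table can be evaluated directly with shell integer arithmetic, which truncates and so matches floor() for positive values. This is a small sketch; the function name c_value is hypothetical, and the core counts passed to it come from the table.

```shell
# Compute the srun -c value:
#   floor(physical_cores / tasks_per_node) * logical_cpus_per_core
# c_value is a hypothetical helper name used only for this sketch.
c_value() {
    local physical_cores="$1"   # physical cores per node
    local tpn="$2"              # MPI tasks per node
    local cpus_per_core="$3"    # logical CPUs per physical core
    # Integer division truncates, which equals floor() for positive values.
    echo $(( (physical_cores / tpn) * cpus_per_core ))
}

c_value 32 32 2     # Cori Haswell, 32 tasks per node: prints 2
c_value 128 16 2    # Perlmutter CPU, 16 tasks per node: prints 16
c_value 64 4 2      # CPU on Perlmutter GPU node, 4 tasks per node: prints 32
```

When tpn does not divide the core count evenly, the division rounds down first, so c_value 68 3 4 gives 88 on a Cori KNL node.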

Below are some sample batch script comparisons:

Cori-Haswell Pure MPI, 40 nodes, 1280 MPI tasks

#SBATCH --qos=regular
#SBATCH --constraint=haswell
#SBATCH --time=1:00:00
#SBATCH --nodes=40

srun -n 1280 -c 2 --cpu-bind=cores ./mycode.exe

Perlmutter Pure MPI, 10 nodes, 1280 MPI tasks

#SBATCH --qos=regular
#SBATCH --constraint=cpu
#SBATCH --time=1:00:00
#SBATCH --nodes=10

srun -n 1280 -c 2 --cpu-bind=cores ./mycode.exe

Cori-Haswell MPI/OpenMP, 40 nodes, 160 MPI tasks, 8 OpenMP threads per MPI task

#SBATCH --qos=regular
#SBATCH --constraint=haswell
#SBATCH --time=1:00:00
#SBATCH --nodes=40

export OMP_NUM_THREADS=8
export OMP_PLACES=threads
export OMP_PROC_BIND=spread
srun -n 160 -c 16 --cpu-bind=cores ./mycode.exe

Perlmutter CPU MPI/OpenMP, 10 nodes, 160 MPI tasks, 8 OpenMP threads per MPI task

#SBATCH --qos=regular
#SBATCH --constraint=cpu
#SBATCH --time=1:00:00
#SBATCH --nodes=10

export OMP_NUM_THREADS=8
export OMP_PLACES=threads
export OMP_PROC_BIND=spread

srun -n 160 -c 16 --cpu-bind=cores ./mycode.exe
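
The node counts in the sample scripts above follow from the physical cores per node: the same 1280 MPI tasks need 40 Cori Haswell nodes but only 10 Perlmutter CPU nodes. A minimal sketch of that conversion, assuming one MPI task per physical core (the function name nodes_needed is hypothetical):

```shell
# Nodes needed for a pure-MPI job with one task per physical core,
# rounded up to a whole node. nodes_needed is a hypothetical helper name.
nodes_needed() {
    local tasks="$1"
    local cores_per_node="$2"
    # Adding (cores_per_node - 1) before dividing rounds the result up.
    echo $(( (tasks + cores_per_node - 1) / cores_per_node ))
}

nodes_needed 1280 32     # Cori Haswell: prints 40
nodes_needed 1280 128    # Perlmutter CPU: prints 10
```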

For more information, see Migrating from Cori to Perlmutter: CPU Codes. Example batch scripts for Perlmutter CPU nodes are also available. The Job Script Generator in Iris can help you create a job script template using the job parameters you choose.

Compiling and Running on GPU Nodes

CUDA-aware MPI is enabled by default. Modules cudatoolkit, craype-accel-nvidia80, and gpu are loaded by default. The gpu module also sets MPICH_GPU_SUPPORT_ENABLED to 1.
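
Because the gpu module sets MPICH_GPU_SUPPORT_ENABLED, a job script can check that variable before launching a GPU-aware MPI run, for example after the environment has been switched with module load cpu. The following is a sketch only; the function name check_gpu_mpi_env and its messages are hypothetical.

```shell
# Report whether the environment variable set by the gpu module is present.
# check_gpu_mpi_env is a hypothetical helper name used only for this sketch.
check_gpu_mpi_env() {
    if [ "${MPICH_GPU_SUPPORT_ENABLED:-0}" = "1" ]; then
        echo "CUDA-aware MPI enabled"
    else
        echo "MPICH_GPU_SUPPORT_ENABLED is not 1; try: module load gpu" >&2
        return 1
    fi
}

# Usage near the top of a GPU job script:
#   check_gpu_mpi_env || exit 1
```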

This table summarizes the GPU programming models supported on Perlmutter's GPU nodes. Nvidia, CCE, and GNU are vendor-supported, while LLVM and Intel are available, with support in progress through NERSC effort:

- Fortran/C/C++ CUDA OpenACC 2.x OpenMP 5.x CUDA Fortran Kokkos/Raja MPI HIP DPC++/SYCL
nvidia x x x x x x x
CCE x x x x x
GNU x x x x x x
LLVM x x x x x x x
Intel x x x x x x

And the table below shows the recommended Programming Environment for various Programming Models:

| Programming Model | Programming Environment |
|---|---|
| CUDA | PrgEnv-nvidia or PrgEnv-gnu |
| Kokkos | PrgEnv-nvidia or PrgEnv-gnu |
| OpenMP offload | PrgEnv-nvidia or PrgEnv-gnu |
| OpenACC | PrgEnv-nvidia |
| stdpar | PrgEnv-nvidia |

For details on transitioning your applications to Perlmutter GPU nodes, see Migrating from Cori to Perlmutter: GPU Codes and the Transitioning Applications to Perlmutter webpage.

For running jobs, refer to the Running Jobs on Perlmutter page for many example job scripts on Perlmutter GPU nodes and more information on GPU affinity. The Job Script Generator in Iris can help you create a job script template with the job parameters you select. To determine the -c value, use the "CPU on Perlmutter GPU" column of the compute node comparison table in the CPU section above.

File Systems and Data Considerations

The files you see on Cori in your home directory, and in the global common and community file systems can be accessed from Perlmutter in the same way that they are accessed from Cori, so there is no need to do anything special with them before Cori retires.

Your files on Cori scratch are inaccessible directly from Perlmutter, since Perlmutter and Cori have separate scratch file systems. We will retire Cori scratch along with Cori, so be sure to back up Cori scratch files before Cori retires, or migrate Cori scratch data onto CFS or HPSS via Globus or scp first, then access them on Perlmutter.

One mechanism for large file transfers from Cori scratch to Perlmutter scratch is Globus, using the Globus endpoint on Cori for Cori scratch and the Globus endpoint on the DTNs for Perlmutter scratch. Please see more information at Transferring Data to and from Perlmutter scratch.
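
For a small directory, a plain copy from Cori scratch to CFS, run on Cori where both file systems are mounted, may be enough; Globus remains the mechanism described above for large transfers. The sketch below assumes the destination directory is new or empty, and both the function name copy_and_verify and the paths in the usage comment are hypothetical.

```shell
# Copy a directory tree, preserving permissions and timestamps, then
# compare file counts as a basic sanity check.
# Assumes the destination is new or empty. copy_and_verify is a
# hypothetical helper name used only for this sketch.
copy_and_verify() {
    local src="$1" dest="$2"
    mkdir -p "$dest" && cp -a "$src"/. "$dest"/ || return 1
    local n_src n_dest
    n_src=$(find "$src" -type f | wc -l | tr -d ' ')
    n_dest=$(find "$dest" -type f | wc -l | tr -d ' ')
    [ "$n_src" -eq "$n_dest" ]
}

# Usage (directory and project names are examples):
#   copy_and_verify "$SCRATCH/run42" /global/cfs/cdirs/myproject/run42
```

A matching file count does not prove the contents are identical; compare checksums for data that cannot be regenerated.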

Remove references to the project file system from old scripts

The old symlink /global/project/projectdirs to CFS on Cori does not exist on Perlmutter; be sure to replace it with /global/cfs/cdirs in any script you are porting to Perlmutter.
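
One way to apply the path replacement described above is a sed filter. This is a minimal sketch: the function name fix_project_paths and the file names in the usage comments are hypothetical, and the rewritten script should be reviewed before it is used.

```shell
# Rewrite old project-directory paths to their CFS equivalents.
# Reads a script on stdin and writes the updated script to stdout.
# fix_project_paths is a hypothetical helper name used only for this sketch.
fix_project_paths() {
    sed 's#/global/project/projectdirs#/global/cfs/cdirs#g'
}

# Usage (file and directory names are examples):
#   grep -rl '/global/project/projectdirs' ~/scripts    # list affected files
#   fix_project_paths < old_job.sh > new_job.sh
```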

Data Analytics on Perlmutter

Users with more advanced data workflow needs should refer to the New User Training, Sept 2022 afternoon materials and the Data Day, Oct 2022 talks and tutorials, which cover topics such as Workflows, Python/Julia, Jupyter, IO, Containers/Shifter, and Deep Learning, and to the Perlmutter user documentation.

More Resources

Slides and videos are available from the training events below: