Perlmutter Timeline

This page records a brief timeline of significant events and user environment changes on Perlmutter.

January 11, 2022

  • Upgraded to CPE 21.12. Major changes include:
    • MPICH upgraded to v8.1.12 (from 8.1.11)
  • The previous programming environment can now be accessed using the cpe module (a minimal sketch follows this list).
  • Numerous internal upgrades to improve configuration and performance.
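
A minimal sketch of switching back to the previous release via the cpe module; the exact version string below is an assumption, so check what is actually installed first:

    # See which CPE releases are available (the version below is an assumed example)
    module avail cpe
    # Load the previous programming environment release
    module load cpe/21.11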

December 21, 2021

  • GPUs are back in "Default" mode (fixes the Known Issue "GPUs are in "Exclusive_Process" instead of "Default" mode"); a quick verification sketch follows this list
  • User access to hardware counters restored (fixes the Known Issue "Nsight Compute or any performance profiling tool requesting access to h/w counters will not work")
  • CUDA 11.5 compatibility libraries installed and incorporated into Shifter
  • QOS priority modified to encourage wider job variety
  • Numerous internal upgrades
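
A quick way to confirm the compute mode from a compute node, using standard nvidia-smi query options (not a NERSC-specific tool):

    # Expect "Default" rather than "Exclusive_Process" for each GPU
    nvidia-smi --query-gpu=compute_mode --format=csv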

December 6, 2021

  • Major changes to the user environment. All users should recompile their code following our compile instructions
  • The cuda, cray-pmi, and cray-pmi-lib modules have been removed from the default environment
  • The darshan v3.3.1 module has been added to the default environment
  • Default NVIDIA compiler upgraded to v21.9
    • Users must load a cudatoolkit module to compile GPU codes (see the sketch after this list)
  • Upgraded to CPE 21.11
    • MPICH upgraded to v8.1.11 (from 8.1.10)
    • PMI upgraded to v6.0.16 (from 6.0.14)
    • FFTW upgraded to 3.3.8.12 (from 3.3.8.11)
    • Python upgraded to 3.9 (from 3.8)
  • Upgraded to SLES15sp2 OS
  • Numerous internal upgrades
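
A minimal rebuild sketch under the new defaults; the source and output file names are hypothetical, and the full procedure is in the compile instructions referenced above:

    # The CUDA toolkit is no longer in the default environment; load it before building GPU code
    module load cudatoolkit
    # Recompile the GPU code (saxpy.cu is a placeholder source file)
    nvcc -o saxpy saxpy.cu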

November 30, 2021

  • Upgraded Slingshot (internal high speed network) to v1.6
  • Upgraded Lustre server
  • Internal configuration upgrades

November 16, 2021

This was a rolling update where the whole system was updated with minimal interruptions to users.

  • Set MPICH_ALLGATHERV_PIPELINE_MSG_SIZE=0 to improve MPI communication speed for large buffer sizes (see the sketch after this list).
  • Added gpu and cuda-mpich Shifter modules to better support Shifter GPU jobs
  • Deployed a fix for the "CUDA Unknown Error" failures that occasionally occur in Shifter jobs using the GPUs
  • Changed ssh settings to reduce frequency of dropped ssh connections
  • Internal configuration updates
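
The variable is now set system-wide; the fragment below is a hedged sketch of how a user could set it explicitly in their own batch script, for example to restore it after clearing the environment:

    # Disable the pipeline message-size threshold so large MPI_Allgatherv buffers use the faster path
    export MPICH_ALLGATHERV_PIPELINE_MSG_SIZE=0
    # Launch the application as usual (my_mpi_app is a placeholder executable)
    srun ./my_mpi_app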

November 2, 2021

  • Updated to CPE 21.10. A recompile is recommended but not required. See the documentation of CPE changes from HPE for a full list of changes. Major changes of note include:
    • Upgraded MPICH to 8.1.10 (from 8.1.9)
    • Upgraded DSMML to 0.2.2 (from 0.2.1)
    • Upgraded PMI to 6.0.14 (from 6.0.13)
  • Adjusted QOS configurations to facilitate Jupyter notebook job scheduling.
  • Added preempt QOS. Jobs submitted to this QOS may get preempted after two hours, but may start more quickly. Please see our instructions for running preemptible jobs for details.
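
A minimal submission sketch for the preempt QOS, using a placeholder job script and account name; real jobs should follow the preemptible-jobs instructions for requeue and checkpoint options:

    # Submit to the preempt QOS; job.sh and m9999_g are placeholders
    sbatch -q preempt -A m9999_g job.sh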

October 20, 2021

External ssh access enabled for Perlmutter login nodes.

October 18, 2021

  • Updated slurm job priorities to more efficiently utilize the system and improve the diversity of running jobs.

October 14, 2021

  • Updated NVIDIA driver (to 450.162). This is not expected to have any user impact.
  • Upgraded internal management framework.

October 9, 2021

  • Screen and tmux installed
  • Installed boost v1.66
  • Upgraded nv_peer_mem driver to 1.2 (not expected to have any user impact)

October 5, 2021

Deployed sparewarmer QOS to assist with node-level testing. This is not expected to have any user impact.

October 4, 2021

Limited the wall time of batch jobs to 6 hours to allow a variety of jobs to run during testing. If you need to run jobs for longer than 6 hours, please open a ticket.
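
In a job script this limit corresponds to a wall-time request of at most six hours, for example:

    # Request no more than the temporary 6-hour limit
    #SBATCH --time=06:00:00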

September 29, 2021

  • Numerous internal network and management upgrades.

New batch system structure deployed

  • Users will need to specify a QOS (with -q regular, debug, interactive, etc.) as well as a project GPU allocation account name that ends in _g (e.g., -A m9999_g); see the sketch after this list
  • Please see our Running Jobs Section for examples and an explanation of new queue policies
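
A minimal sketch of the new submission syntax, using the example account name from above and a placeholder job script; node count, wall time, and other options are omitted for brevity:

    # Explicit QOS plus a GPU allocation account (note the _g suffix)
    sbatch -q regular -A m9999_g job.sh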

September 24, 2021

  • Upgraded internal management software
  • Upgraded system I/O forwarding software and moved it to a more performant network
  • Fixed csh environment
  • Performance profiling tools that request access to hardware counters (such as Nsight Compute) should now work

September 16, 2021

  • Deployed numerous network upgrades and changes intended to increase responsiveness and performance
  • Increased robustness for login node load balancing

September 10, 2021

  • Updated to CPE 21.09. A recompile is recommended but not required. Major changes of note include:
    • Upgrade MPICH to 8.1.9 (from 8.1.8)
    • Upgrade DSMML to 0.2.1 (from 0.2.0)
    • Upgrade PALS to 1.0.17 (from 1.0.14)
    • Upgrade OpenSHMEMX to 11.3.3 (from 11.3.2)
    • Upgrade craype to 2.7.10 (from 2.7.9)
    • Upgrade CCE to 12.0.3 (from 12.0.2)
    • Upgrade HDF5 to 1.12.0.7 (from 1.12.0.6)
    • GCC 11.2.0 added
  • Added cuda module to the list of default modules loaded at startup
  • Set BASH_ENV to Lmod setup file
  • Deployed numerous network upgrades and changes intended to increase responsiveness and performance
  • Performed kernel upgrades to login nodes for better fail over support
  • Added the latest CMake release as cmake/git-20210830 and set it as the default cmake on the system
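
Two quick checks of the new defaults in a fresh login session, using standard module commands:

    # The cuda module should now appear among the modules loaded at startup
    module list
    # cmake/git-20210830 is now the system default; confirm the version in use
    cmake --version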

September 2, 2021

  • Updated NVIDIA driver (to nvidia-gfxG04-kmp-default-450.142.00_k4.12.14_150.47-0.x86_64). This is not expected to have any user impact.

August 30, 2021

Numerous changes to the NVIDIA programming environment

  • Changed default NVIDIA compiler from 20.9 to 21.7
  • Installed needed CUDA compatibility libraries
  • Added support for multi-CUDA HPC SDK
  • Removed the cudatoolkit and craype-accel-nvidia80 modules from default

Tips for users:

  • Please use module av cuda to see the available versions and module load cuda to get the CUDA Toolkit, including the CUDA C compiler nvcc and associated libraries and tools.
  • CMake may have trouble picking up the correct mpich include files. If it does, you can use set(CMAKE_CUDA_FLAGS "-I/opt/cray/pe/mpich/8.1.8/ofi/nvidia/20.7/include") to force it to pick up the correct ones.
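
A combined sketch of both tips, driven from the shell; the build layout is hypothetical and the include path is the one quoted in the tip above:

    # See the available CUDA Toolkit versions, then load one
    module av cuda
    module load cuda
    # If CMake picks up the wrong MPI headers, pass the flag on the command line
    # (equivalent to the set(CMAKE_CUDA_FLAGS ...) workaround above)
    cmake -S . -B build -DCMAKE_CUDA_FLAGS="-I/opt/cray/pe/mpich/8.1.8/ofi/nvidia/20.7/include"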

June 2021

Perlmutter achieved 64.6 Pflop/s, putting it at No. 5 in the Top500 list.

May 27, 2021

Perlmutter supercomputer dedication.

November 2020 - March 2021

Perlmutter Phase 1 delivered.