
How to use Shifter and NVIDIA GPUs

This guide provides a brief overview of using Shifter with NVIDIA GPUs, with a specific focus on Perlmutter.

Summary of GPU-relevant Shifter modules on Perlmutter

Module Name        Function
gpu (default)      Loads CUDA user driver, compatibility libraries, and tools like nvidia-smi
mpich (default)    Allows communication between nodes using the high-speed interconnect
cuda-mpich         Allows CUDA-aware communication using the high-speed interconnect (experimental)
none               Turns off all modules

Loading one or more specific Shifter modules will unload defaults

To unload one of the default Shifter modules, specify only the modules that you do want. For example, shifter --module=gpu loads only the gpu module, so the default mpich module is not loaded.
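The module selection rules above can be sketched as a small dry-run script. The run helper and the DRY_RUN variable are illustrative names, not part of Shifter; by default the script only prints the commands, and it assumes a Shifter image has already been selected for the job.

```shell
#!/bin/bash
# Hypothetical helper: print each command instead of running it,
# unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# No --module flag: the defaults (gpu and mpich) are loaded.
run shifter nvidia-smi

# Naming a module unloads the defaults, so only gpu is loaded here.
run shifter --module=gpu nvidia-smi

# none turns off all modules. Since the gpu module is what provides
# nvidia-smi, this call will likely fail when actually executed.
run shifter --module=none nvidia-smi
```

Setting DRY_RUN=0 inside a job on a GPU node would execute the commands rather than print them.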

Shifter gpu module

The default gpu module provides tools like nvidia-smi, a CUDA user driver, and the corresponding CUDA compatibility libraries. The compatibility libraries are designed to provide backwards compatibility for CUDA versions running inside Shifter. For example, running shifter nvidia-smi may report CUDA 11.4, but the compatibility libraries enable Shifter to also run older versions of CUDA like 11.0. If for some reason you are unable to run your CUDA application with our current compatibility configuration, please let us know at

Shifter mpich module

The default mpich module provides CPU-only (i.e. non-CUDA-aware) Cray MPICH functionality.

Shifter Open MPI users

Open MPI users (or anyone who does not want the mpich module functionality) can unload it by specifying shifter --module=gpu. Shifter Open MPI users should also specify --mpi=pmi2. A sample srun command could look like srun -N 2 --mpi=pmi2 --module=gpu shifter <Open MPI.program>
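As a minimal sketch, the sample launch line can be assembled by a small helper so that the node count and program are the only parts that change. The function name openmpi_launch_cmd and the program path ./my_openmpi_program are placeholders invented for illustration; the flags are the ones from the sample command above. The helper only prints the command and does not submit anything.

```shell
#!/bin/bash
# Hypothetical helper: build the Open MPI launch line as an argument
# array and print it. Nothing is executed.
openmpi_launch_cmd() {
    local nodes="$1" program="$2"
    if [ -z "$nodes" ] || [ -z "$program" ]; then
        echo "usage: openmpi_launch_cmd <nodes> <program>" >&2
        return 1
    fi
    # --mpi=pmi2 selects PMI2 for the launch; --module=gpu loads only
    # the gpu module, which drops the default mpich module.
    local cmd=(srun -N "$nodes" --mpi=pmi2 --module=gpu shifter "$program")
    echo "${cmd[*]}"
}

# Placeholder program path; replace with your own Open MPI executable.
openmpi_launch_cmd 2 ./my_openmpi_program
```

Printing the line first makes it easy to confirm that both --mpi=pmi2 and --module=gpu are present before running it inside a job allocation.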

Shifter cuda-mpich module

Although the mpich module is loaded by default, it does not provide any support for CUDA-aware MPI. For this, users will need to load the experimental cuda-mpich module. To use this module, users can build against MPICH as in our Shifter and mpi4py example. However, when running the Shifter container, users will need to ensure the HPE GTL library is manually linked. For example, a user could write the following wrapper file called


LD_PRELOAD=/opt/udiImage/modules/cuda-mpich/lib64/ $@

Using this wrapper will ensure the library gets linked to the libraries within the Shifter container at runtime. Our example is a Python mpi4py test.

MPICH_GPU_SUPPORT_ENABLED=1 srun -C gpu -n 1 --gpus-per-node=1 --module=cuda-mpich shifter ./ python3
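A slightly more defensive variant of the wrapper idea is sketched below. It is an assumption-laden illustration rather than the recommended wrapper: the function name preload_and_run and the GTL_LIB variable are hypothetical, and the full path to the GTL library must be supplied by the user because it is not spelled out here. The sketch checks that the library file exists before preloading it, so a wrong path fails with a clear message.

```shell
#!/bin/bash
# Hypothetical helper: run a command with one shared library preloaded
# through LD_PRELOAD. The caller supplies the library path.
preload_and_run() {
    local lib="$1"
    shift
    if [ ! -f "$lib" ]; then
        echo "error: preload library not found: $lib" >&2
        return 1
    fi
    # LD_PRELOAD is set for this one command only, not for the calling shell.
    LD_PRELOAD="$lib" "$@"
}

# When used as a wrapper script with arguments, GTL_LIB (a hypothetical
# variable) must hold the full path to the GTL library in the container.
if [ "$#" -gt 0 ]; then
    preload_and_run "${GTL_LIB:-}" "$@"
fi
```

Saved as an executable script, it would take the place of the wrapper in the srun line above, with GTL_LIB exported in the job environment.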

We are working with our vendor to determine if a more seamless solution is possible. If you have trouble using the cuda-mpich module or linking to GTL, please let us know at