How to use Python on NERSC systems

There are 4 options for using and configuring your Python environment at NERSC. We provide a brief overview here and will explain each option in greater detail below. These options are the same for both Cori and Perlmutter.

  1. Module only
  2. Module + conda activate (most popular)
  3. Use a Shifter container (best practice for 10+ nodes)
  4. Install your own Python

If you intend to run at large scale (10+ nodes), Shifter is the best option. Alternatively, you can install your own Python or conda environment on our faster /global/common/software filesystem. We provide more discussion about how to achieve good performance by choosing the right filesystem.

Option 1: Module only

In this mode, you just module load python and use it however you like. This is the simplest option but also the least flexible. If you require a package that is not in our default modules, this option will not work for you.

Who should use Option 1?

Option 1 is best for users who want to get started quickly and who do not require special libraries or custom packages.

Option 2: Module + conda activate

In this mode, you first module load python and then build and use a conda environment on top of our module. To use this method:

module load python
conda activate myenv

To leave your environment, run

conda deactivate

and you will return to the base Python environment.

To create a custom environment using Option 2:

module load python
conda create --name myenv python=3.8
conda activate myenv
conda install <your package>

Who should use Option 2?

This is our most popular option. It is good for anyone who would like to use packages that are not available in the Python module.
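
After activating an environment, it can help to confirm which interpreter is actually running. The following is a minimal sketch, not a NERSC-provided tool; the function name describe_python_environment is hypothetical. It relies only on the standard library and on the CONDA_DEFAULT_ENV variable that conda sets when an environment is activated.

```python
import os
import sys


def describe_python_environment():
    """Return basic facts about the interpreter that is currently running."""
    return {
        # Full path of the python executable in use.
        "executable": sys.executable,
        # Root directory of the installation or conda environment.
        "prefix": sys.prefix,
        # conda sets this variable on activation; None means that no
        # conda environment appears to be active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_python_environment().items():
        print(f"{key}: {value}")
```

If myenv is active, the conda_env entry should read myenv and the prefix should point at that environment's directory.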

Option 3: Install/Use Python inside a Shifter container

We strongly suggest this option for any user who needs to run Python on 10+ nodes. This will result in better performance for your own application, make you less vulnerable to filesystem slowdowns caused by other users, and keep you from causing filesystem slowdowns for other users. Please see our Python in Shifter documentation and examples.

Who should use Option 3?

Option 3 is suitable for users willing to build their own software stack inside of a container. mpi4py works best at scale in Shifter.

Option 4: Install your own Python

You don't have to use any of the Python options we described above. You are free to install your own Python via Miniconda, Anaconda, Intel Python, or a custom collaboration install to have complete control over your stack.

Collaborations, projects, or experiments may wish to install a shareable, managed Python stack to /global/common/software independent of the NERSC modules. You are welcome to use the Anaconda installer script for this purpose. In fact, you may want to consider the more "stripped-down" Miniconda installer as a starting point. That option allows you to start with only the bare essentials and build up. Be sure to select the Linux version in either case. For instance:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b \
    -p /global/common/software/myproject/env
[installation messages]
source /global/common/software/myproject/env/bin/activate
conda install <only-what-my-project-needs>

The -p argument sets the installation path; without it, the installation above would go to $HOME/miniconda3. You should also consider the PYTHONSTARTUP environment variable, which you may wish to unset altogether. It is mainly relevant to the system Python, which we advise against using.
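
To see whether variables such as PYTHONSTARTUP are set in your session, a short check along these lines may help. This is an illustrative sketch: the function name find_python_overrides and the list of watched variables are assumptions. PYTHONSTARTUP and PYTHONUSERBASE appear on this page, while PYTHONPATH and PYTHONHOME are added here because they also affect any Python interpreter.

```python
import os

# Variables that can change how any Python interpreter behaves.
# PYTHONSTARTUP and PYTHONUSERBASE are mentioned on this page;
# PYTHONPATH and PYTHONHOME are included here as an assumption.
WATCHED_VARIABLES = ("PYTHONSTARTUP", "PYTHONUSERBASE", "PYTHONPATH", "PYTHONHOME")


def find_python_overrides(environ=None):
    """Return the watched variables that are set, with their values."""
    if environ is None:
        environ = os.environ
    return {name: environ[name] for name in WATCHED_VARIABLES if name in environ}


if __name__ == "__main__":
    overrides = find_python_overrides()
    if not overrides:
        print("No watched Python variables are set.")
    for name, value in overrides.items():
        print(f"{name}={value}")
```

An empty result suggests that none of these variables will interfere with a custom installation.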

Who should use Option 4?

Option 4 is suitable for individuals or collaborations who would like to install, maintain, and control their own Python stack. Users who choose Option 4 should not combine their custom Python installations with our NERSC Python modules.

Using conda, mamba, and pip to install packages and manage environments

Overview of conda

Anaconda provides a conda cheat sheet you may find helpful.

To find available packages, you can use the conda search tool. To install packages, you can use the conda install command.

conda search numpy
conda install numpy

Conda has several default channels that will be used first for package installation. If you want to use another channel beyond the defaults channel, you can, but we suggest that you select your channel carefully. We also suggest that you choose channels as you need them rather than permanently adding them to your conda config or .condarc file. For example, conda install numpy --channel conda-forge is better than conda config --add channels conda-forge.

Conda will search the default channels first. This is good because it means that MKL-enabled NumPy will be installed which generally performs well on Cori's Intel hardware.

If, however, you have added other channels to your search path, for example conda-forge, the packages that conda decides to install from those channels may not be optimal for NERSC. In this example, you will likely get a version of NumPy that uses OpenBLAS instead of MKL, and this can be substantially slower on Cori.
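
One way to check which BLAS library your NumPy uses is to inspect the output of numpy.show_config(). The sketch below is a heuristic, not an official check: the function name detect_numpy_blas is hypothetical, and it only searches the printed build information for the strings "mkl" and "openblas".

```python
import contextlib
import io


def detect_numpy_blas():
    """Guess which BLAS library the installed NumPy was built against.

    Returns "mkl", "openblas", "unknown", or "numpy not installed".
    """
    try:
        import numpy as np
    except ImportError:
        return "numpy not installed"

    # numpy.show_config() prints build information to stdout; capture it.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    config_text = buffer.getvalue().lower()

    # Simple substring search; this is a heuristic, not a guarantee.
    for backend in ("mkl", "openblas"):
        if backend in config_text:
            return backend
    return "unknown"


if __name__ == "__main__":
    print(detect_numpy_blas())
```

If the result is not what you expect, reading the full output of numpy.show_config() directly is more reliable than this summary.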

If you find conda is slow, try mamba instead

The conda tool can sometimes be very slow when it's resolving packages in large and complex environments. You can try mamba instead of conda by simply replacing conda with mamba.

Installing libraries via pip

We support pip at NERSC via our Python modules, but users should be aware of several features of pip behavior that can cause problems. Anaconda provides some Best practices for using pip with conda. Our suggested use of pip is inside a conda environment. This makes it very easy to know exactly where packages are installed and also easy to clean them up completely when you are done. We suggest the following:

module load python
conda activate myenv
pip install numpy

The following pip install options are useful for situations where you need to build a package from source on NERSC systems (such as mpi4py or parallel h5py).

  • -v: verbose output, useful for debugging and confirming expected behavior.
  • --force-reinstall: forces a reinstall/rebuild in case the package is already installed.
  • --no-cache-dir: don't use the local package cache, we want a fresh download of the source code.
  • --no-binary: we want to build the package from source so don't use existing binaries. This option takes an argument, either a package name or :all:.
  • --no-build-isolation: build the package using dependencies from the current environment.
  • --no-deps: don't install dependent packages, we want to use the ones in the current environment.

See the pip documentation for more information.
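
The options listed above can be combined into a single command. The following sketch only assembles the argument list and does not run it; the function name build_source_install_command is hypothetical. Calling pip as sys.executable -m pip is a choice made here so that the pip belonging to the active interpreter is the one used.

```python
import shlex
import sys


def build_source_install_command(package):
    """Build a pip command that rebuilds `package` from source."""
    return [
        sys.executable, "-m", "pip", "install",
        "-v",                        # verbose output
        "--force-reinstall",         # rebuild even if already installed
        "--no-cache-dir",            # fetch a fresh copy of the source
        f"--no-binary={package}",    # refuse prebuilt wheels for this package
        "--no-build-isolation",      # build against the current environment
        "--no-deps",                 # leave existing dependencies untouched
        package,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it.
    print(shlex.join(build_source_install_command("mpi4py")))
```

To execute the command, pass the list to subprocess.run(command, check=True). Packages such as mpi4py may need additional compiler settings that are not shown here.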

pip search path can find incompatible packages

When you pip install <package>, the pip tool will traverse its search path and may discover that an old version of the package is already installed. However, this package may be incompatible with your current setup. It may even have been built on a different system. To be safe, it's best to pip install with the --force-reinstall and --no-cache-dir options to ensure a new and compatible package will be installed. This is even more important now that our Cori and Perlmutter systems are sharing filesystems.
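
To see which copy of a package Python has found, you can ask the standard library where the installed distribution lives. This is a minimal sketch using importlib.metadata (Python 3.8 or newer); the function name locate_distribution is hypothetical.

```python
from importlib import metadata


def locate_distribution(name):
    """Return (version, location) for an installed distribution, or None."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    # locate_file("") resolves to the directory that holds the package files,
    # typically a site-packages directory.
    return dist.version, str(dist.locate_file(""))


if __name__ == "__main__":
    print(locate_distribution("numpy"))
```

If the reported location is outside your active environment, for example in a user site directory, that copy may be the stale one described above.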

Using conda clone

Cloning conda environments gives you the ability to copy a preexisting conda environment and modify it as you like. One example of a good use of conda clone is to copy the NERSC machine learning modules like TensorFlow so you can install your own packages. You can find the location of the environment you'd like to clone by using module show tensorflow, for example.

module load python
conda create --name my-tensorflow --clone /usr/common/software/tensorflow/intel-tensorflow/2.2.0-py37
source activate my-tensorflow
python -m ipykernel install --user --name my-tensorflow --display-name my-tensorflow
conda install <new package>

If you have questions about this, please don't hesitate to submit a ticket.

Moving your conda setup to /global/common/software

For better performance or if you plan to run your application at scale, consider installing your custom environment in your project's directory on /global/common/software:

conda create --prefix /global/common/software/myproject/myenv python=3.8
source activate /global/common/software/myproject/myenv
conda install numpy scipy astropy

You can also change your default conda location to /global/common/software. An easy way to do this is to change the settings in your $HOME/.condarc file:

envs_dirs:
  - /global/common/software/<your project>/conda

pkgs_dirs:
  - /global/common/software/<your project>/conda

channels:
  - defaults

This will place all of your environments in this directory by default, and you won't have to worry about specifying the full prefix to your environment when installing it or activating it.
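
To confirm that the environment you are running really lives on /global/common/software, you can compare its prefix against that path. The sketch below is illustrative; the function name is_on_common_software is hypothetical, and the check is purely lexical, so it does not follow symbolic links.

```python
import sys
from pathlib import PurePosixPath

COMMON_SOFTWARE = PurePosixPath("/global/common/software")


def is_on_common_software(path):
    """Return True if `path` is located under /global/common/software.

    This compares path components only and does not resolve symlinks.
    """
    candidate = PurePosixPath(path)
    return candidate == COMMON_SOFTWARE or COMMON_SOFTWARE in candidate.parents


if __name__ == "__main__":
    # sys.prefix is the root of the active installation or environment.
    print(sys.prefix, is_on_common_software(sys.prefix))
```

Running this inside an activated environment shows its prefix and whether that prefix is under /global/common/software.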

We are aware the project directory quotas on /global/common/software are small. Please open a ticket at help.nersc.gov if you need more space.

Now deprecated: conda init + conda activate

We previously supported a different Option 3 in which a user could configure their conda setup via

module load python
conda init

We have now deprecated this and do not recommend it for several reasons:

  • .bashrc file is shared between Cori and Perlmutter
  • No NERSC-provided settings like PYTHONUSERBASE
  • Confusing interactions between conda init Python setup and python module

To stop using this option, you can run the command conda init --reverse or simply delete the lines that conda init has added to your .bashrc file.

These may look like:

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/etc/profile.d/conda.sh" ]; then
        . "/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/etc/profile.d/conda.sh"
    else
        export PATH="/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
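
If you prefer to remove these lines programmatically, the block can be located by its start and end marker comments. The following is a sketch under that assumption, not a NERSC-provided tool; the function name strip_conda_init_block is hypothetical. It works on text only, so you decide whether and where to write the result, and it leaves the input unchanged if the end marker is missing.

```python
START_MARKER = "# >>> conda initialize >>>"
END_MARKER = "# <<< conda initialize <<<"


def strip_conda_init_block(text):
    """Return `text` without the block that `conda init` manages."""
    kept = []
    inside_block = False
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped == START_MARKER:
            inside_block = True
            continue
        if inside_block and stripped == END_MARKER:
            inside_block = False
            continue
        if not inside_block:
            kept.append(line)
    if inside_block:
        # The end marker was never found; do not risk dropping real content.
        return text
    return "".join(kept)


if __name__ == "__main__":
    sample = (
        "export EDITOR=vim\n"
        "# >>> conda initialize >>>\n"
        "eval \"$__conda_setup\"\n"
        "# <<< conda initialize <<<\n"
        "alias ll='ls -l'\n"
    )
    print(strip_conda_init_block(sample), end="")
```

Back up your .bashrc before replacing it with cleaned text, and compare the two versions to confirm that only the conda block was removed.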

You can continue to use your existing conda environments with our directions in Option 2. If you have questions about this, please open a ticket at our helpdesk.