How to use Python on NERSC systems¶
There are four options for using and configuring your Python environment at NERSC. We give a brief overview here and explain each option in greater detail below.
- Module only
- Module + source activate
- Conda init + conda activate
- Install your own Python
Our data show that about 80 percent of NERSC Python users use custom conda environments (Options 2 and 3). You may find that these are a good solution for you, too.
Option 1: Module only¶
In this mode, you just module load python and use it however you like. This is the simplest option but also the least flexible. If you require a package that is not in our default modules, this option will not work for you.
Who should use Option 1?
Option 1 is best for users who want to get started quickly and who do not require special libraries or custom packages.
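One way to decide whether Option 1 is enough is to check, after loading the module, whether everything you need can already be imported. The following is a minimal sketch using only the standard library; the package list is a hypothetical example and the helper name missing_packages is ours, not part of any NERSC tooling.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list of packages a project needs; edit it to match your own.
    required = ["numpy", "scipy", "astropy"]
    print("Interpreter:", sys.executable)
    missing = missing_packages(required)
    if missing:
        print("Not available in this environment:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If the script reports missing packages, one of the conda-environment options described below is probably a better fit.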
Option 2: Module + source activate¶
In this mode, you first module load python and then build and use a conda environment on top of our module. To use this method:
module load python
source activate myenv
To leave your environment:
source deactivate
and you will return to the base Python environment.
Who should use Option 2?
Option 2 is a good choice for any user who doesn't want a specific version of Python loaded automatically when they log on to Cori. It is also good for users who prefer to use the most recent Python module.
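To confirm which environment is actually active after activation, you can ask the interpreter itself. This sketch assumes that conda exports the CONDA_DEFAULT_ENV and CONDA_PREFIX variables when an environment is activated; the function name describe_environment is illustrative only.

```python
import os
import sys


def describe_environment(environ=None):
    """Summarize which interpreter and conda environment are in use."""
    environ = os.environ if environ is None else environ
    return {
        "executable": sys.executable,                   # path of the running python
        "prefix": sys.prefix,                           # root of the active installation
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),  # None if no env is active
        "conda_prefix": environ.get("CONDA_PREFIX"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

After source activate myenv, you would expect conda_env to name your environment and prefix to point inside it rather than at the module's base installation.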
Option 3: Conda init + conda activate¶
In this mode, you will configure your environment one time via:
module load python
conda init
This will add the following to your .bashrc file:
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/etc/profile.d/conda.sh" ]; then
        . "/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/etc/profile.d/conda.sh"
    else
        export PATH="/global/common/cori_cle7/software/python/3.7-anaconda-2019.10/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
After you have configured your environment, you only need to run the following when you log on to Cori:
conda activate myenv
To leave your environment:
conda deactivate
and you will return to the base Python environment.
What should you do if you decide you don't like Option 3? You can simply delete the lines that conda init has added to your .bashrc file and choose another Python option.
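If you would rather not edit the file by hand, the managed block can be located by the marker comments that conda init writes around it. The sketch below works on a string and does not touch any file; the function name is hypothetical, and you should back up your .bashrc before writing anything back.

```python
START_MARKER = "# >>> conda initialize >>>"
END_MARKER = "# <<< conda initialize <<<"


def strip_conda_init_block(text):
    """Return text with the conda-init managed block removed.

    If the start marker has no matching end marker, the text is returned
    unchanged so that nothing is dropped by accident.
    """
    kept = []
    inside = False
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        if not inside and stripped == START_MARKER:
            inside = True
            continue
        if inside:
            if stripped == END_MARKER:
                inside = False
            continue
        kept.append(line)
    return text if inside else "".join(kept)


if __name__ == "__main__":
    sample = (
        "export EDITOR=vim\n"
        "# >>> conda initialize >>>\n"
        "__conda_setup=...\n"
        "# <<< conda initialize <<<\n"
        "alias ll='ls -l'\n"
    )
    print(strip_conda_init_block(sample), end="")
```

To apply this to a real file, you could read its contents, pass them through the function, review the result, and only then write it back.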
Who should use Option 3?
Option 3 is suitable for any user who would like a particular Python environment loaded by default whenever they access Cori. However, the user must be willing to manually monitor and update their configuration. Users who choose Option 3 should not combine their conda-init configured Python environment with our NERSC Python modules.
Option 4: Install your own Python¶
You don't have to use any of the Python options described above. You are free to install your own Python via Miniconda, Anaconda, Intel Python, or a custom collaboration install to have complete control over your stack. Furthermore, you are free to build this installation with or without containers.
Option 4a: Install your own Python without containers¶
Individuals may prefer to install and maintain their own Python stack. Collaborations, projects, or experiments may wish to install a shareable, managed Python stack to /global/common/software independent of the NERSC modules. You are welcome to use the Anaconda installer script for this purpose. You may also want to consider the more "stripped-down" Miniconda installer as a starting point, which lets you start with only the bare essentials and build up. Be sure to select the Linux version in either case. For instance:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b \
    -p /global/common/software/myproject/env
[installation messages]
source /global/common/software/myproject/env/bin/activate
conda install <only-what-my-project-needs>
You can customize the path with the -p argument. Without it, the installation above would go to $HOME/miniconda3. You should also consider the PYTHONSTARTUP environment variable, which you may wish to unset altogether. It is mainly relevant to the system Python, which we advise against using.
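A quick way to see whether settings such as PYTHONSTARTUP are leaking into your own installation is to list them from inside the interpreter. This is a sketch only: the set of variables watched here is our assumption about what is worth checking, not an official NERSC list.

```python
import os

# Variables that can influence interpreter start-up and module search paths.
# This selection is an assumption; extend it as needed.
WATCHED = ("PYTHONSTARTUP", "PYTHONPATH", "PYTHONHOME", "PYTHONUSERBASE")


def inherited_python_settings(environ=None):
    """Return the watched Python-related variables that are currently set."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in WATCHED if name in environ}


if __name__ == "__main__":
    settings = inherited_python_settings()
    if not settings:
        print("No watched Python environment variables are set.")
    for name, value in settings.items():
        print(f"{name}={value}")
```

Anything reported here was inherited from your shell, so it can be cleared with unset in the shell before starting your custom Python.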
Who should use Option 4a?
Option 4a is suitable for individuals or collaborations who would like to install, maintain, and control their own Python stack. Users who choose Option 4a should not combine their custom Python installations with our NERSC Python modules.
Option 4b: Install your own Python in a container¶
Users may prefer to build their own software stack inside of a container for improved portability, performance, and configurability. As in Option 4a, users may choose to install Miniconda, Anaconda, or Intel Python, or to start with a pre-built container, for example one from NVIDIA.
To get started using Docker containers, see here.
To use Docker containers at NERSC via Shifter, see here.
Who should use Option 4b?
Option 4b is suitable for users willing to build their own software stack inside of a container. Anyone who plans to run mpi4py jobs at scale is strongly encouraged to use Option 4. Please see here for more information.
Creating conda environments¶
Creating custom conda environments is usually quick and easy. If you require a package that is not available in our default module, this is the option you must use.
If you are using Option 2 (source activate):
module load python
conda create --name myenv python=3.8
source activate myenv
conda install numpy scipy astropy
If you are using Option 3 (conda activate):
conda create --name myenv python=3.8
conda activate myenv
conda install numpy scipy astropy
Installing libraries via conda channels¶
Conda has several default channels that will be used first for package installation. If you want to use another channel beyond the defaults channel, you can, but we suggest that you select your channel carefully.
Here is an example that demonstrates why your channels matter. If we run
conda install numpy
conda will search the default channels first. This is good because it means that an MKL-enabled NumPy will be installed, which generally performs well on Cori's Intel hardware.
If, however, you have added other channels to your search path, for example conda-forge, the packages that conda decides to install from them may not be optimal for NERSC. In this example, you will likely get a version of NumPy that uses OpenBLAS instead of MKL, and this can be substantially slower on Cori.
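To check which BLAS library an installed NumPy was built against, you can inspect the output of numpy.show_config(). The sketch below captures that output and looks for the strings "mkl" or "openblas"; this text matching is a heuristic of ours, since the exact format of the report differs between NumPy versions.

```python
import contextlib
import io


def classify_blas(config_text):
    """Guess the BLAS backend from the text of a NumPy build report."""
    text = config_text.lower()
    if "mkl" in text:
        return "mkl"
    if "openblas" in text:
        return "openblas"
    return "unknown"


def numpy_blas():
    """Return the guessed BLAS backend, or None if NumPy is not installed."""
    try:
        import numpy
    except ImportError:
        return None
    buffer = io.StringIO()
    # show_config() prints its report, so capture standard output.
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return classify_blas(buffer.getvalue())


if __name__ == "__main__":
    print("NumPy BLAS backend:", numpy_blas())
```

Running this inside each conda environment gives a quick indication of whether a channel change has swapped the MKL build for an OpenBLAS one.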
Don't permanently add other channels to your conda config, i.e.
conda config --add channels conda-forge
Do this instead:
conda install numpy --channel conda-forge
It's better to specify the channel you need on the command line with --channel conda-forge. This uses conda-forge only when you ask for it and not all the time.
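If you are unsure whether a channel was added permanently at some point, you can look at the channels list in your conda configuration file. The sketch below is a deliberately simple line-based reader, assuming the file uses the usual layout of a top-level channels: key followed by "- name" entries; it is not a full YAML parser, and the function name is hypothetical.

```python
def permanent_channels(condarc_text):
    """Return the channel names listed under a top-level 'channels:' key."""
    channels = []
    in_channels = False
    for raw in condarc_text.splitlines():
        line = raw.rstrip()
        if not line or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        is_list_item = line.lstrip().startswith("-")
        if not raw[0].isspace() and not is_list_item:
            # A new top-level key starts; track whether it is 'channels'.
            in_channels = line.split(":")[0].strip() == "channels"
            continue
        if in_channels and is_list_item:
            channels.append(line.lstrip()[1:].strip())
    return channels


if __name__ == "__main__":
    example = "channels:\n  - conda-forge\n  - defaults\nssl_verify: true\n"
    print(permanent_channels(example))
```

In practice you would pass in the contents of your own configuration file; any entry other than defaults is a channel that conda consults on every install.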
Installing libraries via pip¶
Pip is available under Anaconda Python. If you create a conda environment but you are unable to find a conda build of whatever package (or version of that package) you want to install, then pip is one viable alternative. However, pip users at NERSC should be aware of the following:
Users of the pip command may want to use the "--user" flag for per-user site-package installation following the PEP 370 standard. On Linux systems this defaults to $HOME/.local, and packages can be installed to this path with "pip install --user package_name". This can be overridden by defining the PYTHONUSERBASE environment variable.
To prevent per-user site-package installations from conflicting across machines and module versions, at NERSC we have configured our Python modules so that PYTHONUSERBASE is set to $HOME/.local/$NERSC_HOST/version, where "version" corresponds to the version of the Python module loaded.
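You can see where "pip install --user" will place packages for the interpreter you are currently running by asking the standard-library site module. This is a small sketch; the function name user_site_info is ours.

```python
import site
import sys


def user_site_info():
    """Report the per-user base and site-packages directories in effect."""
    user_site = site.getusersitepackages()
    return {
        "user_base": site.getuserbase(),      # honors PYTHONUSERBASE if set
        "user_site": user_site,               # where 'pip install --user' installs
        "enabled": site.ENABLE_USER_SITE,     # whether Python uses this directory
        "on_sys_path": user_site in sys.path,
    }


if __name__ == "__main__":
    for key, value in user_site_info().items():
        print(f"{key}: {value}")
```

With a NERSC Python module loaded, user_base should reflect the PYTHONUSERBASE value described above rather than the plain $HOME/.local default.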
Mixing pip and conda: an example¶
We have observed that users often don't realize that the per-user site-package directories are included in the search path from all their conda environments created with the same module. What does this mean? We'll demonstrate with an example. If you have done the following:
module load python
pip install numpy --user
Any conda environment you have created based on this Python module will have this pip-installed NumPy in its search path.
It can be easy to forget you've done "pip install --user" and then create a new conda environment and be confused by how it works (or doesn't).
If you're using a conda environment anyway, think about whether you really want a pip-installed package to be accessible to multiple conda environments. If you don't, just drop the "--user" part and install it into your conda environment:
module load python
source activate myenv
pip install numpy
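When an environment behaves unexpectedly, it helps to check where a package is actually being picked up from. The following sketch reports the file a module would be loaded from and whether that file lives in the per-user site-packages directory; the function names are illustrative, and the lookup does not import the package.

```python
import importlib.util
import os
import site


def module_origin(name):
    """Return the file a module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def comes_from_user_site(name):
    """Return True if the module resolves to the per-user site-packages."""
    origin = module_origin(name)
    if origin is None or not os.path.isabs(origin):
        return False  # not found, built-in, or frozen module
    user_site = os.path.realpath(site.getusersitepackages())
    return os.path.realpath(origin).startswith(user_site + os.sep)


if __name__ == "__main__":
    for name in ("json", "numpy"):
        print(name, module_origin(name), comes_from_user_site(name))
```

If a package reports True here while a conda environment is active, the environment is picking up the "pip install --user" copy rather than a copy installed into the environment.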