Brief introduction to Python at NERSC¶
The Python we provide at NERSC is Anaconda Python. We believe that Anaconda provides a good compromise between productivity and performance. What does this mean for you?
You have 4 options for using Python at NERSC:
- Module only
- Module + source activate
- Conda init + conda activate
- Install your own Python
For more details about these 4 options, please see this page. Our data show that about 80 percent of our NERSC Python users are using custom conda environments (Options 2 and 3); you might find that these are a good solution for you, too.
If you have Python questions or problems, you can always submit a ticket at help.nersc.gov. We also encourage you to take a look at our FAQ and troubleshooting page. If you would like to make any edits or contributions to our docs, please see here.
Python on your laptop vs. Python at NERSC¶
There are a few key differences between using Python on your laptop/desktop and using it on our large supercomputing systems.
- To take advantage of our large systems, you'll want to parallelize your code in some way. Please see our parallel-python page for more information.
- To improve performance within Anaconda, you should use conda channels and libraries that can take advantage of our architecture (for example, by using the Intel MKL library). For more information about conda channels at NERSC, please see this page.
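As a quick sanity check (not NERSC-specific), you can ask NumPy which BLAS/LAPACK backend it was built against; on an MKL-based Anaconda installation the printed configuration mentions `mkl`:

```python
# Quick check of which BLAS/LAPACK backend this NumPy build links against.
# On an MKL-based Anaconda installation the printed config mentions "mkl".
import numpy as np

np.show_config()  # prints build/link information for BLAS, LAPACK, etc.
```

If the output shows a generic or reference BLAS instead, consider installing NumPy from a channel that provides MKL-linked builds.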
- You should consider the location of your software stack and data. The best and fastest place for your code and conda environment is /global/common/software. The best and fastest place for your data is
How to run Python jobs at NERSC¶
You have many options for running Python at NERSC:
- Our login nodes (only for very small testing and debugging). Please see our login node policies here.
- Jupyter for interactive notebooks, well-suited to visualization and machine learning tasks.
- Compute nodes for any substantial computation (either interactively or via a batch job)
Running Python on an interactive compute node¶
To get an interactive Haswell node:

```bash
salloc -N 1 -t 30 -C haswell -q interactive
```
You can source python either via a module or your conda environment (see here for more info).
To run a serial Python job, invoke the interpreter directly, e.g. `python hello-world.py`.
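A minimal `hello-world.py` (the file name comes from the batch examples below) might look like this sketch:

```python
# hello-world.py: a minimal serial script for testing your Python setup.
import platform

def message():
    """Return a greeting that includes the interpreter version."""
    return f"Hello from Python {platform.python_version()}"

if __name__ == "__main__":
    print(message())
```

Running `python hello-world.py` prints a single greeting from the one interpreter process.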
To run an MPI-enabled job, you must use `srun` to launch it:

```bash
srun -n 10 python hello-world-mpi.py
```
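The `hello-world-mpi.py` script referenced above could be a minimal mpi4py sketch like the following (assuming the mpi4py package is available in your environment; the try/except fallback just lets the script also run where MPI is not installed):

```python
# hello-world-mpi.py: a minimal MPI hello world, sketched with mpi4py.
try:
    from mpi4py import MPI   # provided by the mpi4py package
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # this process's rank, 0 .. size-1
    size = comm.Get_size()   # total number of MPI ranks
except ImportError:
    # Fallback so the sketch still runs where mpi4py is unavailable.
    rank, size = 0, 1

print(f"Hello from rank {rank} of {size}")
```

Launched with `srun -n 10`, each of the 10 ranks prints its own line.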
Running Python in a batch job¶
To run a serial job in a conda environment via a batch script:

```bash
#!/bin/bash
#SBATCH --constraint=haswell
#SBATCH --nodes=1
#SBATCH --time=5

module load python
source activate myenv
python hello-world.py
```
And then submit the script with `sbatch`.
To run an MPI-enabled job on 3 nodes using our python module, you can create a batch script like this:

```bash
#!/bin/bash
#SBATCH --constraint=haswell
#SBATCH --nodes=3
#SBATCH --time=5

module load python
srun -n 96 -c 2 python hello-world-mpi.py
```
And then submit it with `sbatch`.
For more information about running jobs at NERSC please see this page.