Environment

Note

This page outlines tips on managing the user shell environment and startup scripts for NERSC systems. Please see the shell startup page for a detailed explanation of shells, startup shell files, and different shell modes.

NERSC User Environment

NERSC Shell Support Policy

By default, the login shell for a user account is bash. Users also have the option to change their default shell to csh, tcsh, zsh, or ksh. Only bash is fully supported by NERSC at this time; all other shells are supported on a basic level.

For fully supported shells (bash):

  • NERSC will test tools, modules, and scripts upon system updates, and fix errors with high priority.
  • We reserve the right to take unscheduled outages to fix critical issues.

For basic supported shells:

  • NERSC will deploy the version supplied by the underlying OS and a basic skeleton configuration.
  • We do not guarantee that tools and scripts deployed by NERSC staff will work with these shells, and fixes for reported issues will be treated as lower priority.

Dotfiles and Customizing Your Environment

NERSC does not populate shell initialization files (also known as "dotfiles") in users' home directories. The same home file system is mounted on all NERSC resources, meaning that the same dotfile is used across all compute systems.

NERSC provides template dotfiles at https://software.nersc.gov/NERSC/dotfiles. Before copying the NERSC dotfiles, we recommend backing up your existing dotfiles. Then copy the contents of the template dotfiles into your ~/.bashrc or ~/.tcshrc.
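One way to make that backup is sketched below. The function name backup_dotfiles, the list of files, and the backup directory name are illustrative assumptions, not NERSC tooling; adjust them to the dotfiles you actually use.

```shell
# backup_dotfiles is a hypothetical helper, not a NERSC-provided command.
# It copies common dotfiles from a source directory into a backup directory.
backup_dotfiles() {
    local src="$1" dest="$2" f
    mkdir -p "$dest" || return 1
    for f in .bashrc .bash_profile .tcshrc .cshrc .login; do
        # Only copy files that exist; -p preserves timestamps and modes.
        if [ -f "$src/$f" ]; then
            cp -p "$src/$f" "$dest/$f" || return 1
        fi
    done
}

# Example usage (the backup directory name is an assumption):
# backup_dotfiles "$HOME" "$HOME/dotfiles-backup-$(date +%Y%m%d)"
```

Keeping the backup in a dated directory makes it easy to restore a single file later with cp.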

Note

Home directories are shared between Perlmutter and DTN, which means your dotfiles must be compatible with both systems; otherwise, you will run into errors.

You can create your own dotfiles instead of using our template files. We recommend testing your changes by starting a new shell and checking that the configuration matches your expectations.
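Before starting a new shell, it can help to catch syntax errors first. The sketch below uses bash -n, which reads commands without executing them. The helper name check_dotfile_syntax is hypothetical, and a clean result only means the file parses; it does not prove that the settings behave as intended.

```shell
# check_dotfile_syntax is a hypothetical helper: it parses a startup file
# with "bash -n" (read commands, do not execute them) and reports the result.
check_dotfile_syntax() {
    local file="$1"
    if bash -n "$file"; then
        echo "syntax OK: $file"
    else
        echo "syntax error: $file"
        return 1
    fi
}

# Example usage:
# check_dotfile_syntax "$HOME/.bashrc"
```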

No more .ext dotfiles at NERSC since February 21, 2020.

NERSC used to reserve the standard dotfiles (~/.bashrc, ~/.bash_profile, ~/.cshrc, ~/.login, etc.) for system use, and users put their shell modifications into the corresponding .ext files (e.g., ~/.bashrc.ext, ~/.bash_profile.ext, etc.). This is not the case any more! You can now modify the standard dotfiles for your personal use.

The actual dotfile transition occurred during the center maintenance on February 21-25, 2020. To mitigate any interruptions to existing workloads, we preserved shell environments by replacing dotfiles with template dotfiles that source .ext files. For example, we changed the ~/.bashrc file to look like this:

# begin .bashrc
if [ -z "$SHIFTER_RUNTIME" ]
then
    . $HOME/.bashrc.ext
fi
# end .bashrc

We recommend that users whose accounts were created before February 2020 move the contents of their ~/.bashrc.ext file into their ~/.bashrc file (and remove the .ext files afterwards).

Changing Your Default Login Shell

Use Iris to change your default login shell. Log in, then under the "Profile" tab look for the "Server Logins" section. Click on "Edit" under the "Actions" column.

Customizing Your Shell Environment

bash users can add startup configurations in the ~/.bashrc file, e.g., environment variables, aliases, and functions, to make them accessible in subshells. The ~/.bashrc file is sourced in non-interactive shell invocations (an example of this is running a shell script). csh users should specify their configuration in ~/.tcshrc, which will be available in interactive login and interactive non-login mode.
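As a minimal sketch, the three kinds of customization mentioned above might look like this in a ~/.bashrc. Every name and value here (MY_PROJECT_DIR, ll, mkcd) is a placeholder chosen for illustration.

```shell
# Example ~/.bashrc additions; all names and values below are placeholders.

# Environment variable, exported so that child processes inherit it.
export MY_PROJECT_DIR="$HOME/projects/demo"

# Alias for interactive convenience.
alias ll='ls -l'

# Function: create a directory (including parents) and change into it.
mkcd() {
    mkdir -p "$1" && cd "$1"
}
```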

System-specific Customizations

All NERSC systems share the same global home file system; a user's $HOME environment variable points to the same directory on every NERSC platform. To make system-specific customizations, use the pre-defined environment variable NERSC_HOST.

Don't set NERSC_HOST

Some older dotfiles set NERSC_HOST without checking whether this variable is set first. Generally you should not need to do this, so we advise you not to set NERSC_HOST in your dotfiles. If you must set NERSC_HOST for some reason, it's good practice to check whether this variable is already set before overwriting it. In bash you can do this with if [ -z "$var" ]; then var="mysettings"; fi.

Example

case $NERSC_HOST in
    "perlmutter")
        : # settings for Perlmutter
        export MYVARIABLE="value-for-perlmutter"
        ;;
    "datatran")
        : # settings for DTN nodes
        export MYVARIABLE="value-for-dtn"
        ;;
    *)
        : # default value for other nodes
        export MYVARIABLE="default-value"
        ;;
esac

Shifter

If you run shifter applications, you may want to skip the dotfiles. You can use the following if block in your dotfiles:

if [ -z "$SHIFTER_RUNTIME" ]; then
    : # Settings for when *not* in shifter
fi

Missing NERSC Variables

If any NERSC-defined environment variables, such as $SCRATCH, are missing in your shell invocations, you can add them in your ~/.bashrc file as follows:

if [ -z "$SCRATCH" ]; then
    export SCRATCH=/global/pscratch/sd/${USER:0:1}/$USER
fi

scrontab

Crontab functionality is provided on NERSC HPC systems via scrontab. If you run bash scripts with scrontab, you may want to invoke a login shell (#!/bin/bash -l) in order to get the NERSC-defined environment variables, such as NERSC_HOST, SCRATCH, PSCRATCH, and to get the module command defined.
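A minimal sketch of such a script follows. The function name report_env is hypothetical; the script only reports whether the variables and the module command are available, which can be a useful first step when a scheduled job misbehaves.

```shell
#!/bin/bash -l
# Sketch of a script intended to be launched from scrontab.
# The -l on the first line requests a login shell when the script is
# executed directly, so login startup files are read first.

# Report variables the job depends on; "unset" marks anything missing.
report_env() {
    echo "NERSC_HOST=${NERSC_HOST:-unset}"
    echo "SCRATCH=${SCRATCH:-unset}"
    # "type" succeeds only if the module command is defined in this shell.
    if type module >/dev/null 2>&1; then
        echo "module=defined"
    else
        echo "module=missing"
    fi
}

report_env
```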

For more information about using scrontab at NERSC, please see our scrontab documentation page.

Troubleshooting User Environment Issues

If you are facing issues with your user environment, we have some recommendations to help you diagnose the problem.

First, we recommend you check the shell startup files used by your shell type (bash, sh, csh, zsh, tcsh). Most user environment issues can be resolved by reviewing the content of your user startup files. For bash users, check your $HOME/.bashrc file to see if an environment issue is caused by this file. For csh, check $HOME/.cshrc and for zsh, check $HOME/.zshrc. If you update your startup files, you can source the files to apply the changes to the current shell (source $HOME/.bashrc) or log out and log back in.

If you want to know where environment variables are set, you will need to understand the shell startup files. When you ssh into NERSC systems, you are in an interactive login shell. Bash users will want to look at the table outlined in bash startup files. The /etc/profile script, which is typically sourced during shell login, is available on any Linux distribution, but its contents may vary by distribution. During shell initialization, the shell will source files in /etc/profile.d/* -- startup files added by the site administrator to provide system-wide defaults to all users. We encourage you to review the content of each file if you need to troubleshoot your environment. Note that /etc/profile and the files in /etc/profile.d/* are owned by the root user, so you cannot edit them, but it's good to check these files when tracing issues related to the startup environment.
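One way to locate where a variable is set is to search the startup files for its name. The helper below is a hypothetical sketch: the function name and the default file list are assumptions, and a plain text search can miss variables that are set indirectly (for example, by a modulefile).

```shell
# find_in_startup_files is a hypothetical helper. It prints file:line:text
# for each match of NAME in the given files, or in a default list of
# common bash startup files when no files are given.
find_in_startup_files() {
    local name="$1" f
    shift
    if [ "$#" -eq 0 ]; then
        set -- /etc/profile /etc/profile.d/* "$HOME/.bash_profile" "$HOME/.bashrc"
    fi
    for f in "$@"; do
        # Skip anything that is not a readable regular file.
        if [ -f "$f" ] && [ -r "$f" ]; then
            # grep exits non-zero when nothing matches; that is not an error here.
            grep -Hn -- "$name" "$f" || true
        fi
    done
    return 0
}

# Example usage:
# find_in_startup_files LD_LIBRARY_PATH
```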

Second, you can review the modules loaded at startup. All user environments are initially loaded with a pre-determined set of modulefiles selected by the site administrators. You should review the content of your active modules by running module list, then analyze the content of each modulefile by running module show <modulefile>. Many users include module load statements in their ~/.bashrc to customize their startup modules, but this can cause unexpected side-effects when loading other modules.

Here are some additional tips to help you troubleshoot environment issues:

  • Check for environment variables such as PATH and LD_LIBRARY_PATH in startup scripts such as ~/.bashrc that may cause issues. A common mistake is to reset one of these environment variables instead of prepending or appending additional paths. Setting export PATH=/path/to/dir replaces the entire search path and will break your shell -- instead set export PATH=/path/to/dir:$PATH, which will prepend a directory to $PATH.
  • Check all environment variables set in your terminal via env or printenv. If you are looking for a particular pattern, you can grep for it within the long output, e.g., printenv | grep -i petsc (the -i ignores capitalization).
  • Always check the path to the binary that is being run. For instance, if you want to run a python script, double check the path to the python wrapper by invoking which python and see if the path makes sense.
  • Make sure you are on the right machine! The environment variable NERSC_HOST will show you which machine you are logged into. The expected value should be the following for Perlmutter:
elvis@perlmutter> echo $NERSC_HOST
perlmutter
  • Check whether you are on a login node or a compute node by invoking hostname. If the output starts with nid, then chances are you are on a compute node.
  • If your shell prompt gets clobbered, try running reset, which will reset your terminal settings.
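To illustrate the PATH tip above, here is a sketch of a helper that prepends a directory without discarding existing entries and without adding duplicates. The name path_prepend and the example directory are hypothetical.

```shell
# path_prepend is a hypothetical helper: it adds a directory to the front
# of PATH unless it is already present, and never discards existing entries.
path_prepend() {
    local dir="$1"
    case ":$PATH:" in
        *":$dir:"*) ;;                        # already in PATH, nothing to do
        *) PATH="$dir${PATH:+:$PATH}" ;;      # prepend, keeping the old value
    esac
    export PATH
}

# Example usage (the directory is a placeholder):
# path_prepend "$HOME/bin"
```

Because the helper skips directories that are already present, sourcing ~/.bashrc repeatedly does not make PATH grow.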

Troubleshooting Shell Scripts

Running Shell Scripts

You can run a shell script by passing it to your preferred shell (e.g., bash script.sh, csh script.sh, sh script.sh), or you can run it directly by specifying a full or relative path to the script. To run a script directly by its path, the file must be executable. In the example below there is a permission error because the file doesn't have execute permission (x). You can fix this by running chmod +x script.sh.

elvis@login24> ./script.sh
bash: ./script.sh: Permission denied

elvis@login24> ls -l script.sh
-rw-rw---- 1 elvis elvis 126 Apr  1 08:43 script.sh
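The fix can be sketched end to end as follows. The temporary directory and the script contents are illustrative only.

```shell
# Create a throwaway script in a temporary directory (paths are illustrative).
workdir=$(mktemp -d)
cat > "$workdir/script.sh" <<'EOF'
#!/bin/bash
echo "hello from script.sh"
EOF

# A newly created file normally has no execute bit; add it.
chmod +x "$workdir/script.sh"

# The script can now be run by its path.
output=$("$workdir/script.sh")
echo "$output"
```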

Using Strict Running Modes

Running a script in a stricter mode can help in the debugging process. For example, the default behavior of the bash shell is to run a script to completion regardless of the success of any commands within the script. Using set -e makes the script terminate immediately when a simple command exits with a non-zero exit status (effectively, upon encountering an error).

The set command is a shell builtin that changes shell options, and therefore shell behavior, in bash and sh.

Note

In csh, the set command is used for setting variables (set FOO=BAR). This is very different from how set works in bash or sh: in these shells' syntax, set changes the behavior of the current shell.
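In bash, you can confirm which options set has changed by inspecting the special parameter $-, which holds the current single-letter option flags. The sketch below uses a hypothetical helper, errexit_state, and toggles the option inside subshells so that the calling shell is left unchanged.

```shell
# Report whether errexit (set -e) is active in the shell that calls this.
# $- holds the current single-letter option flags; "e" means errexit is on.
errexit_state() {
    case $- in
        *e*) echo "on" ;;
        *)   echo "off" ;;
    esac
}

# Each $( ... ) runs in a subshell, so these changes do not leak out.
state_after_enable=$(set -e; errexit_state)
state_after_disable=$(set +e; errexit_state)

echo "after set -e: $state_after_enable"
echo "after set +e: $state_after_disable"
```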

In the following example, bash stops execution after running XYZ (which is an invalid command). The command whoami is not run because the script terminates immediately after the invalid command. Note the non-zero script exit code, retrieved by $?.

elvis@login24> cat script.sh
#!/bin/bash
set -e
hostname

# invalid command. Bash will terminate immediately
XYZ

# This command won't be executed
whoami

elvis@login24> bash script.sh
login24
script.sh: line 6: XYZ: command not found

elvis@login24> echo $?
127

The shebang is a character sequence (#!) at the beginning of a script used to indicate which shell interpreter to use when processing the script. You can also pass shell options in the shebang line. In the previous example, we specified set -e within the script to modify the behavior of the bash shell. This option can instead be passed on the shebang line as #!/bin/bash -e, which is equivalent to invoking the script with /bin/bash -e <script>.sh. Likewise, to enable strict mode for the csh/tcsh shell, you can use #!/bin/csh -e and #!/bin/tcsh -e.
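One caveat is worth checking for yourself: options on the shebang line apply only when the script is executed by its path. When you run bash script.sh, bash treats the first line as an ordinary comment. The sketch below demonstrates the difference with a throwaway script; the file name strict.sh is a placeholder.

```shell
# Create a throwaway script whose shebang line enables errexit.
workdir=$(mktemp -d)
cat > "$workdir/strict.sh" <<'EOF'
#!/bin/bash -e
false
echo "reached"
EOF
chmod +x "$workdir/strict.sh"

# Run by path: the interpreter is started as "/bin/bash -e <script>", so
# the script stops at "false" and prints nothing.
direct_rc=0
direct_output=$("$workdir/strict.sh") || direct_rc=$?

# Run with an explicit interpreter: the shebang is only a comment here,
# so -e is not applied and the script continues past "false".
explicit_rc=0
explicit_output=$(bash "$workdir/strict.sh") || explicit_rc=$?

echo "by path:   output='$direct_output' exit=$direct_rc"
echo "with bash: output='$explicit_output' exit=$explicit_rc"
```

Putting set -e inside the script body avoids this difference, because it takes effect regardless of how the script is started.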

If we were to source this script, the setting would be applied to the current shell. When set -e is enabled in the current shell or set as a result of sourcing some script, an invalid command (even a typo!) will terminate your shell. Watch out for this behavior if you source any script that enables set -e.

elvis@login24> source script.sh
login24
 XYZ: command not found
Connection to perlmutter.nersc.gov closed.
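If you are unsure whether a script enables set -e, one cautious approach (a suggestion, not an official NERSC recommendation) is to source it in a separate bash process first. The sketch below uses a throwaway script; errexit ends only the child process, and the calling shell survives.

```shell
# Throwaway script that enables errexit and then fails (illustrative only).
demo_script=$(mktemp)
cat > "$demo_script" <<'EOF'
set -e
echo "before failure"
false
echo "not reached"
EOF

# Source the script inside a child bash process instead of the current
# shell. errexit ends the child; the calling shell keeps running.
rc=0
output=$(bash -c 'source "$1"' _ "$demo_script") || rc=$?

echo "child output: $output"
echo "child exit code: $rc"
```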

Terminating a script as soon as a command returns a non-zero exit status helps you determine what went wrong. You can check the exit code of your last command as follows:

# bash, sh, zsh
echo $?

# csh, tcsh
echo $status

For complicated commands, set -e may not be sufficient to determine whether there was an error. For example, in bash the exit code of a pipeline (commands joined with |) is the exit code of the last command in the pipeline. Below we show two pipelines that contain the same failing command. The command grep123 is a typo -- we meant grep. In the first example we see a non-zero exit code because grep123 is the last command; in the second example we see a 0 exit code because the last command, wc -l, exited successfully:

elvis@login24> ls -ld | grep123 $user
 grep123: command not found
elvis@login24> echo $?
127

elvis@login24> ls -ld | grep123 $user | wc -l
 grep123: command not found
0
elvis@login24> echo $?
0

If you want bash to report the piped command as a failure, consider also running set -o pipefail. If we add this setting and rerun the same example, we now see the exit code is 127 instead of 0.

elvis@login24> set -o pipefail
elvis@login24> ls -ld | grep123 $user | wc -l
 grep123: command not found
0
elvis@login24> echo $?
127
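The same comparison can be reproduced in a self-contained way, with false standing in for the failing command. The sketch also shows bash's PIPESTATUS array, which is not covered above; it records one exit status per command in the most recent pipeline and can help identify which stage failed.

```shell
# "false" stands in for a failing command at the start of a pipeline.
# Each case runs in its own child bash so the options do not leak.

# Default: the pipeline's status is that of the last command (true -> 0).
status_default=$(bash -c 'false | true; echo $?')

# pipefail: the status is that of the last command that failed (false -> 1).
status_pipefail=$(bash -c 'set -o pipefail; false | true; echo $?')

# PIPESTATUS holds one exit status per command in the most recent pipeline.
per_command=$(bash -c 'false | true; echo "${PIPESTATUS[@]}"')

echo "default:    $status_default"
echo "pipefail:   $status_pipefail"
echo "PIPESTATUS: $per_command"
```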