File Systems

It is the responsibility of every user to organize and maintain software and data critical to their science. Managing your data requires understanding NERSC's storage systems.

NERSC provides several file systems. Making efficient use of NERSC computing facilities requires an understanding of the strengths and limitations of each file system.

Note

JAMO is JGI's in-house-built hierarchical file system, which has functional ties to NERSC's file systems and tape archive. JAMO is not maintained by NERSC.

All NERSC file systems have per-user quotas on total storage and number of files and directories (known as inodes). To check your file system usage, use the showquota command from any login node or check the Data Dashboard on my.nersc.gov.
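To see which directory tree is consuming a quota, a rough per-directory estimate can complement showquota. The following is a minimal sketch using standard `du` and `find`; the helper name `usage_report` is hypothetical and not a NERSC tool, and showquota remains the authoritative source for quota figures.

```shell
#!/bin/bash
# Sketch: approximate disk and inode usage for one directory tree.
# usage_report is a hypothetical helper, not a NERSC-provided command.

usage_report() {
    local dir="${1:-$HOME}"
    local kib inodes
    # Total size in KiB (-s summarizes the tree, -k reports 1024-byte units).
    kib=$(du -sk "$dir" 2>/dev/null | awk '{print $1}')
    # Count every file and directory in the tree, including the top directory.
    inodes=$(find "$dir" 2>/dev/null | wc -l | tr -d ' ')
    echo "dir=$dir kib=$kib inodes=$inodes"
}

# Example: usage_report "$HOME"
```

Walking a large tree generates metadata traffic of its own, so it is best run on one subdirectory at a time rather than repeatedly over a whole file system.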

$HOME

Your home directory is mounted across all NERSC systems. You should refer to this home directory as $HOME wherever possible, and you should not change the environment variable $HOME.

Each user's $HOME directory has a quota of 40GB and 1M inodes. $HOME quotas will not be increased. Use other file systems for large volumes of data, whether the data is being stored long term or is part of a computation in progress.

Warning

As a global file system, $HOME is shared by all users and is not configured for performance. Do not write scripts or run software in a way that causes high-bandwidth I/O to $HOME. High volumes of reads and writes can congest the file system metadata servers and slow NERSC systems significantly for all users. Always direct large I/O operations to scratch file systems.
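One way to follow this advice is to have each job create and work in its own directory under $SCRATCH. The sketch below assumes a hypothetical helper `make_run_dir` and a hypothetical `runs/<name>` layout; neither is a NERSC convention.

```shell
#!/bin/bash
# Sketch: keep heavy job I/O off $HOME by working in a per-run scratch directory.
# make_run_dir and the "runs/<name>" layout are hypothetical conventions.

make_run_dir() {
    local name="$1"
    # Fall back to a temporary directory only where $SCRATCH is not defined
    # (for example when trying this snippet off NERSC systems).
    local base="${SCRATCH:-${TMPDIR:-/tmp}}"
    local run_dir="$base/runs/$name"
    mkdir -p "$run_dir"
    echo "$run_dir"
}

# Example use inside a job script (my_program is a placeholder):
#   run_dir=$(make_run_dir my_analysis)
#   cd "$run_dir" && ./my_program > output.log
```

Results that need to be kept can then be copied to a longer-term file system once the run finishes.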

DnA (Data n' Archive)

DnA is a 2.4PB GPFS file system for the JGI's archive, shared databases, and project directories.

|              | DnA Projects | DnA Shared | DnA DM Archive |
|--------------|--------------|------------|----------------|
| Location     | /global/dna/projectdirs/ | /global/dna/shared | /global/dna/dm_archive |
| Quota        | 5TB default | Defined by agreement with the JGI Management | Defined by agreement with the JGI Management |
| Backups      | Daily, only for projectdirs with quota <= 5TB | Backed up by JAMO | Backed up by JAMO |
| File purging | Files are not automatically purged | Purge policy set by users of the JAMO system | Files are not automatically purged |

The DnA "Project" and "Shared" spaces hold data that multiple users collaborating on a project need to read easily. The "Project" space is owned and managed by the JGI. The "Shared" space is a collaborative effort between the JGI and NERSC. Write access to DnA is restricted to protect performance; data can only be written to DnA from Data Transfer Nodes or by loading the esslurm module and then using the --qos=dna_xfer QOS.
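The second write path (esslurm plus the dna_xfer QOS) might look like the sketch below. It only illustrates the workflow: the project name `myproject`, the source directory, the one-hour time limit, the job script file name, and the use of `rsync` are all assumptions, and the QOS may require options not shown here.

```shell
#!/bin/bash
# Sketch: write a small Slurm job script that copies results from scratch to DnA.
# "myproject", the paths, and the time limit are hypothetical placeholders.

cat > dna_xfer_job.sh <<'EOF'
#!/bin/bash
#SBATCH --qos=dna_xfer
#SBATCH --time=01:00:00
#SBATCH --job-name=copy_to_dna

# Copy finished results from scratch into the (hypothetical) project directory.
rsync -a "$SCRATCH/results/" /global/dna/projectdirs/myproject/results/
EOF

# Submit only where Slurm is available (for example on a NERSC login node).
if command -v sbatch >/dev/null 2>&1; then
    module load esslurm
    sbatch dna_xfer_job.sh
else
    echo "sbatch not found; wrote dna_xfer_job.sh without submitting"
fi
```

Keeping the copy in its own job separates the transfer from the computation, so the compute job itself never needs write access to DnA.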

If you would like a project directory, please contact JGI management to discuss the use case and requirements for the proposed directory.

The "DM Archive" is a data repository maintained by the JAMO system. Files are stored here during migration using the JAMO system. The files can remain in this space for as long as the user specifies. Any file that is in the "DM Archive" has also been placed in the HPSS tape archive. This section of the file system is owned by the JGI data management team.

$SCRATCH

Each user has a "scratch" directory. Scratch directories are NOT backed up, and files can be purged if they have not been accessed for 90 days. Find your scratch directory using the environment variable $SCRATCH, for example:

elvis@cori02:~> cd $SCRATCH
elvis@cori02:/global/cscratch1/sd/elvis> 

Scratch environment variables:

| Environment Variable | Value | NERSC Systems |
|----------------------|-------|---------------|
| $SCRATCH | Best-connected file system | All NERSC computational systems |
| $CSCRATCH | /global/cscratch[1,2,3]/sd/$username | Cori |

$SCRATCH will always point to the best-connected scratch space available for the NERSC machine you are accessing.

Scratch space is intended for staging, running, and completing calculations on NERSC systems. These file systems are therefore configured to perform best under large-scale file reading and writing from many compute nodes. They are not intended for long-term file storage or archival: data is not backed up, and files that have not been accessed for 90 days can be purged.
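Because purging is based on access time, it can help to list files that are approaching the 90-day window stated for scratch before they disappear. The sketch below uses standard `find`; the helper name `purge_candidates` is hypothetical, and the 80-day default is an arbitrary early-warning margin, not a NERSC setting.

```shell
#!/bin/bash
# Sketch: list files whose last access time is older than a threshold.
# purge_candidates is a hypothetical helper, not a NERSC-provided command.

purge_candidates() {
    local dir="${1:-$SCRATCH}"
    local days="${2:-80}"
    # -atime +N matches files last accessed more than N whole days ago.
    find "$dir" -type f -atime +"$days" 2>/dev/null
}

# Example: purge_candidates "$SCRATCH" 80
```

Files that still matter should be copied to a file system meant for longer-term storage rather than kept in scratch.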

Policies for $SCRATCH are described in the NERSC Data Management Policy.

Other file systems

Other file systems used by JGI may also be mounted on NERSC systems:

  • /usr/common - a file system where NERSC staff build software for user applications. This is the principal site for the modular software installations.
  • /global/cfs - a GPFS-based file system that is accessible on almost all of NERSC's other compute systems and is used by all other NERSC users.