It is the responsibility of every user to organize and maintain software and data critical to their science. Managing your data requires understanding NERSC's storage systems.
NERSC provides several file systems. Making efficient use of NERSC computing facilities requires an understanding of the strengths and limitations of each file system.
JAMO is JGI's in-house-built hierarchical file system, which has functional ties to NERSC's file systems and tape archive. JAMO is not maintained by NERSC.
All NERSC file systems have per-user quotas on total storage and on the number of files and directories (known as inodes). To check your file system usage, use the showquota command from any login node or check the Data Dashboard on my.nersc.gov.
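Because the quota covers both bytes and inodes, it can help to see how much a particular directory tree contributes to each. The following is a minimal sketch, assuming bash and the standard find, du, wc, and cut utilities. The function names count_inodes, usage_kb, and usage_summary are hypothetical helpers, not NERSC tools, and the entry count is only an approximation of inode usage (hard links are counted once per name). showquota remains the authoritative source.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not NERSC tools): estimate how many file system
# entries and how many kilobytes a directory tree uses.

count_inodes() {
    # Each line printed by find is one file or directory, including the
    # top-level directory itself. -xdev keeps find on one file system.
    find "$1" -xdev | wc -l | tr -d ' '
}

usage_kb() {
    # du -sk prints "<kilobytes><TAB><path>"; keep only the number.
    du -sk "$1" | cut -f1
}

usage_summary() {
    # Default to $HOME when no directory is given.
    local dir="${1:-$HOME}"
    printf '%s: %s entries, %s KB\n' \
        "$dir" "$(count_inodes "$dir")" "$(usage_kb "$dir")"
}
```

After sourcing the file, `usage_summary "$HOME/some_project"` (a placeholder path) prints one summary line for that tree, which can then be compared against the limits reported by showquota.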
Your home directory is mounted across all NERSC systems. You should refer to this home directory as $HOME wherever possible, and you should not change the environment variable $HOME. Your $HOME directory has a quota of 40GB and 1M inodes.
$HOME quotas will not be increased. Other file systems should be used for storage of large volumes of data which are either being stored long term or are a part of computation in progress.
As a global file system, $HOME is shared by all users and is not configured for performance. Do not write scripts or run software in a way that causes high-bandwidth I/O to $HOME. High volumes of reads and writes can congest the file system metadata servers and slow NERSC systems significantly for all users. Always direct large I/O operations to the scratch file systems.
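One way to follow this advice in a job script is to create the working directory under $SCRATCH and refuse to run if $SCRATCH is unavailable, rather than silently falling back to $HOME. The sketch below assumes bash; pick_workdir, the runs subdirectory, and my_program are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run I/O-heavy work from scratch rather than from $HOME.
# pick_workdir and the "runs" subdirectory are hypothetical names.

pick_workdir() {
    local job_name="$1"
    # Refuse to continue when $SCRATCH is unset or empty, so that heavy
    # I/O never lands in $HOME by accident.
    if [ -z "${SCRATCH:-}" ]; then
        echo "SCRATCH is not set; refusing to use \$HOME for heavy I/O" >&2
        return 1
    fi
    local dir="$SCRATCH/runs/$job_name"
    # Create the directory (and parents) and print its path on success.
    mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Example use inside a job script (commented out so that sourcing this
# file has no side effects):
#   workdir=$(pick_workdir my_job) || exit 1
#   cd "$workdir"
#   ./my_program > output.log    # my_program is a placeholder
```

Failing loudly when $SCRATCH is missing is a deliberate choice here: a job that quietly writes large outputs to $HOME can hit the quota and affect other users.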
DnA (Data n' Archive)
DnA is a 2.4PB GPFS file system for the JGI's archive, shared databases, and project directories.
| |DnA Projects|DnA Shared|DnA DM Archive|
|---|---|---|---|
|Location| | | |
|Quota|5TB default|Defined by agreement with JGI management|Defined by agreement with JGI management|
|Backups|Daily, only for projectdirs with quota <= 5TB|Backed up by JAMO|Backed up by JAMO|
|File purging|Files are not automatically purged|Purge policy set by users of the JAMO system|Files are not automatically purged|
The DnA "Project" and "Shared" spaces are intended for data needed by multiple users collaborating on a project, so that shared data is easy to read. The "Project" space is owned and managed by the JGI. The "Shared" space is a collaborative effort between the JGI and NERSC. Write access to DnA is restricted to protect performance; data can only be written to DnA from Data Transfer Nodes or by loading the
esslurm module and then using the
If you would like a project directory, please contact JGI management to discuss the use case and requirements for the proposed directory.
The "DM Archive" is a data repository maintained by the JAMO system. Files are stored here during migration using the JAMO system. The files can remain in this space for as long as the user specifies. Any file that is in the "DM Archive" has also been placed in the HPSS tape archive. This section of the file system is owned by the JGI data management team.
Each user has a "scratch" directory. Scratch directories are NOT backed up, and files can be purged if they have not been accessed for 90 days. Find your scratch directory using the environment variable $SCRATCH, for example:
elvis@cori02:~> cd $SCRATCH
elvis@cori02:/global/cscratch1/sd/elvis>
Scratch environment variables:
|Environment Variable|Value|NERSC Systems|
|---|---|---|
|$SCRATCH|Best-connected file system|All NERSC computational systems|
| | |Cori|
$SCRATCH will always point to the best-connected scratch space available for the NERSC machine you are accessing.
The intention of scratch space is for staging, running, and completing calculations on NERSC systems. Thus, these file systems are configured for best performance when usage is wide-scale file reading and writing from many compute nodes. The scratch file systems are not intended for long-term file storage or archival. Data is not backed up, and files not accessed for a significant time period can be purged.
The policies for $SCRATCH are described in the NERSC Data Management Policy.
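Since scratch files that have not been accessed for 90 days can be purged, it can be useful to list such files ahead of time and move anything important elsewhere. The following is a minimal sketch, assuming bash and a standard find; list_stale_files is a hypothetical helper, not a NERSC command, and it relies on the access times reported by the file system. The purge itself is carried out by NERSC under its own policy.

```shell
#!/usr/bin/env bash
# Sketch: list regular files whose last access time is more than N days
# old. list_stale_files is a hypothetical helper, not a NERSC command.

list_stale_files() {
    local dir="$1"
    local days="${2:-90}"   # default threshold taken from the 90-day rule
    # find measures age in whole 24-hour periods and discards the
    # fraction, so "-atime +90" matches files last accessed at least
    # 91 days ago. -xdev keeps the search on one file system.
    find "$dir" -xdev -type f -atime "+$days" -print
}

# Example (commented out so that sourcing this file has no side effects):
#   list_stale_files "$SCRATCH" 90
```

Running find over a very large scratch tree is itself a metadata-heavy operation, so a sketch like this is better pointed at a specific project subdirectory than at all of $SCRATCH.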
Other file systems
Other file systems used by JGI may also be mounted on NERSC systems:
/usr/common is a file system where NERSC staff build software for user applications. It is the principal site for the modular software installations.
/global/cfs is a GPFS-based file system that is accessible on almost all of NERSC's compute systems and is used by other NERSC users.