Darshan I/O profiler

Darshan is an open-source, lightweight I/O profiler developed by ANL. It collects I/O statistics from several widely used HPC I/O frameworks, such as MPI-IO, HDF5, and PnetCDF, as well as standard POSIX calls. We use Darshan at NERSC to examine file system utilization and to provide advice on improving the performance of users' applications.

Darshan is available as a module on Perlmutter for all users, and is included at link time in users' applications via the Cray compiler wrappers (cc, CC, ftn) (see the related page in the docs for more details on compilers at NERSC). If you wish to use darshan on Perlmutter, run module load darshan before compiling.

Darshan starts automatically when an MPI session is initiated, and creates a log file in a defined log directory, which NERSC staff then use to extract usage metrics for the different file systems. Read on to learn how to enable darshan for non-MPI applications, or how you can use darshan log files to study the I/O behavior of your application.

To check whether your dynamically linked application has been compiled to instrument data with darshan at runtime, use ldd and look for darshan among the results:

$ ldd your-application |grep darshan
    libdarshan.so => /path/to/darshan/x.y.z/lib/libdarshan.so

For statically built applications you can list the symbols contained in your executable with nm.
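
For example, a quick check could look like the following (a sketch, assuming the instrumentation symbols contain "darshan" in their names, as in recent darshan releases):

$ nm your-application | grep -i darshan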

Tip

The default darshan/3.4.0 module, loaded automatically in users' environments, only instruments POSIX and MPI-IO calls. We also provide darshan/3.4.0-hdf5, which can instrument applications using HDF5 and can be swapped in with:

module swap darshan/3.4.0-hdf5

Opting out of darshan

Should darshan cause you any issues, you can disable it by unloading the darshan module and rebuilding your application. We believe darshan is stable for most applications at NERSC, but we invite users who experience problems to contact us via the online help desk.

Injecting darshan into your application

If you're not using the Cray compiler wrappers, or want to compile a statically linked non-MPI application, refer to the official Darshan documentation for instructions on how to generate Darshan-capable compiler wrappers.

For all other cases, using cc, CC or ftn should work out of the box.

Enabling darshan at runtime

Darshan is automatically injected into users' applications at compile time, but it can also be enabled at runtime for dynamically linked executables: applications built before darshan went into production, applications built without the Cray compiler wrappers, or applications in interpreted languages (e.g. Python). This may also be useful for applications not built on Perlmutter, such as executables on CVMFS or other pre-compiled binaries.

You can enable darshan by setting the LD_PRELOAD variable for your application, for example:

LD_PRELOAD="$DARSHAN_BASE_DIR/lib/libdarshan.so" your-application-here

Do not export LD_PRELOAD globally

export-ing LD_PRELOAD in your session will instrument every application you execute, which may impact both your workflow and the file system where the darshan logs are stored.

To instrument code you execute through srun, export the LD_PRELOAD variable only to the application launched by srun, to avoid instrumenting srun's internal calls:

srun --export=ALL,LD_PRELOAD=$DARSHAN_BASE_DIR/lib/libdarshan.so your-application-here

Warning

The ALL token in srun --export=ALL,LD_PRELOAD=... is required to instruct Slurm to add LD_PRELOAD to the existing environment variables; without ALL, your application will ignore the current environment and may crash because required environment variables are missing. See man srun for more details.

Warning

Darshan doesn't interact correctly with multiple Python processes spawned via multiprocessing, due to how Python clones processes internally.
Related bug tracker.

Instrumenting non-MPI code

Darshan can also be used to instrument non-MPI code. To enable this feature, set the environment variable DARSHAN_ENABLE_NONMPI to any value, e.g.:

DARSHAN_ENABLE_NONMPI=1 LD_PRELOAD="$DARSHAN_BASE_DIR/lib/libdarshan.so" your-application-here

Producing reports

The darshan modules save the data they collect to a shared directory, organized by date, username, application name, etc., according to the following "mask" on Perlmutter:

/pscratch/darshanlogs/${YEAR}/${MONTH}/${DAY}/${USER}_${APPLICATION-NAME}_${JOB-ID}_${TIME}.darshan

This means you can find the logs of your applications by searching the directory for the day your application ran and filtering on your NERSC username.
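
For example, to list your logs for a given day, fill in the mask above with the date your job ran (a sketch; YEAR, MONTH and DAY stand for the actual date components):

ls /pscratch/darshanlogs/${YEAR}/${MONTH}/${DAY}/${USER}_*.darshan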

Darshan log files can be processed to produce a plain-text or PDF report containing relevant insights about your application.

For example, given $LOGFILE, an environment variable holding the path to a compressed darshan log file, you can parse it with darshan-parser, a command available in the darshan module loaded by default:

darshan-parser $LOGFILE

The output can be quite long if the application accessed many files during a long run: redirect it to a file (e.g. > $PARSED_LOGFILE) or pipe it to another command (e.g. | less) for easier reading.
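
For example:

darshan-parser $LOGFILE > $PARSED_LOGFILE
darshan-parser $LOGFILE | less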

Excessive computing on login nodes harms other users

Please submit a job or use the interactive queue if you plan to parse several log files: parsing on the login nodes may impact other users' experience and workflows.

To produce a PDF report, first load the texlive module, then use darshan-job-summary.pl as follows:

module load texlive
darshan-job-summary.pl $LOGFILE

You can control the output file name with --output /path/to/output.pdf; otherwise the report is saved in the current directory, named after the input darshan log file with a .pdf suffix.
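
For example (my-report.pdf is an illustrative name):

darshan-job-summary.pl --output my-report.pdf $LOGFILE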

Here's an example of a report produced by darshan for an MPI application: it shows many details of how the application accesses and uses the file system, along with several plots.

Difference between darshan-parser text output and PDF report

The PDF report does not contain everything that can be extracted with the darshan-parser tool, but new darshan releases may improve the PDF report; see e.g. this thread.

Build options

To build darshan 3.4.0 on Perlmutter, these scripts were used.

In particular, the PrgEnv-gnu and craype-haswell modules are used because the GNU compiler produces a more "compact" darshan library with fewer dependencies, which can be used to instrument applications built with many combinations of compilers and MPI frameworks.

The MPI framework used to build darshan is the Cray-optimized MPICH, automatically provided by the cc compiler wrapper: all users' applications built against MPICH or MVAPICH should work fine. Users building their applications against Open MPI or derivatives (Intel MPI, Spectrum MPI, etc.) may need to disable darshan, or build their own version.

Darshan can also instrument PnetCDF I/O calls; enable this mode by adding --enable-pnetcdf-mod=${PNETCDF_DIR} at the configure step, after loading one of the available cray-parallel-netcdf modules.
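
A minimal sketch of that step, assuming the cray-parallel-netcdf module sets ${PNETCDF_DIR} and omitting the other configure options used in the build scripts:

module load cray-parallel-netcdf
./configure --enable-pnetcdf-mod=${PNETCDF_DIR} # plus the remaining configure options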

HDF5-aware darshan build

The default cray-hdf5-parallel version 1.12.1.1 was used to build the HDF5-aware darshan on Perlmutter. HDF5 1.10 introduced ABI changes that are incompatible with HDF5 1.8 and earlier and can cause darshan to break such applications; only HDF5 1.10 or higher is currently available on Perlmutter, so if you only use NERSC-provided HDF5 modules you should not experience issues.

If your application was built against HDF5 1.8 or earlier, you cannot rebuild it against a newer HDF5 release, and you still want to instrument your code with darshan, then you need to build your own darshan against the HDF5 release you're using: feel free to reuse the scripts above or contact us for support.

A caveat of building an application with the HDF5-capable darshan is that the HDF5 library will always appear in the application's list of library dependencies, even when your code contains no HDF5 calls; this means the library will always be loaded by the operating system at execution time, but apart from the minor overhead of locating the library, your application should work normally.
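
You can verify this with ldd, as in the darshan check shown earlier:

$ ldd your-application | grep hdf5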

Known issues

  1. If you build an application with gcc and HDF5, and you load the darshan built with HDF5, the linker may complain with the following message:

    /usr/bin/ld: warning: libhdf5_parallel_gnu_82.so.103, needed by $DARSHAN_BASE_DIR/lib/libdarshan.so, may conflict with libhdf5_parallel_gnu_82.so.200
    

    This is caused by a mismatch between the HDF5 provided by the loaded cray-hdf5-parallel module and the one used to build darshan: the warning has no impact on your application and can be ignored, since the ABI used by darshan to instrument I/O calls should be the same across all the cray-hdf5 modules available on Perlmutter.

  2. While the hdf5-aware darshan module usually works fine for compiled applications, it may produce incompatibility warnings when used with interpreted programs, for example Python environments where conda provides an external HDF5 build. This is probably caused by h5py directly loading the HDF5 dependencies it was built against, instead of using those provided via the LD_PRELOAD variable. This causes the following warning message to appear:

    $ module load python
    $ conda create -y -n testdarshan python=3.8 h5py hdf5=1.10.6
    $ conda activate testdarshan

    $ python -c 'import h5py; print(h5py.version.hdf5_version)'
    1.10.6
    
    $ module swap darshan/3.4.0-hdf5
    $ LD_PRELOAD="$DARSHAN_BASE_DIR/lib/libdarshan.so" python -c 'import h5py; print(h5py.version.hdf5_version)'
    testdarshan/lib/python3.8/site-packages/h5py/__init__.py:37: UserWarning: h5py is running against HDF5 1.10.5 when it was built against 1.10.6, this may cause problems
    Warning! ***HDF5 library version mismatched error***
    The HDF5 header files used to compile this application do not match the version used by the HDF5 library to which this application is linked.
    Data corruption or segmentation faults may occur if the application continues.
    This can happen when an application was compiled by one version of HDF5 but linked with a different version of static or shared HDF5 library.
    You should recompile the application or check your shared library related settings such as 'LD_LIBRARY_PATH'.
    You can, at your own risk, disable this warning by setting the environment variable 'HDF5_DISABLE_VERSION_CHECK' to a value of '1'.
    Setting it to 2 or higher will suppress the warning messages totally.
    Headers are 1.10.6, library is 1.10.5
    
    $ HDF5_DISABLE_VERSION_CHECK=2 LD_PRELOAD="$DARSHAN_BASE_DIR/lib/libdarshan.so" python -c 'import h5py; print(h5py.version.hdf5_version)'
    1.10.5
    

    Setting the variable HDF5_DISABLE_VERSION_CHECK to 1 or higher suppresses the warning, but this seems to cause h5py to use the HDF5 library darshan was built with, instead of the HDF5 library installed with conda.

    Please refer to the Build options section above to build your own darshan release on top of the HDF5 library installed with conda. In particular, you need to specify the HDF5 path at the configure step, which is the prefix of the conda environment you created; in the example above it would be --enable-hdf5-mod="/path/to/conda/env/testdarshan/".

    You can then use your own darshan to instrument your python code.

  3. When instrumenting interpreted languages (e.g. Python), you may get errors like undefined symbol: H5get_libversion. Explicitly adding the HDF5 library to LD_PRELOAD after the darshan library fixes this error, for example:

    LD_PRELOAD="$DARSHAN_BASE_DIR/lib/libdarshan.so:/path/to/your/libhdf5.so" your-application-here
    

    And similarly for variables exported to srun; see the sketch after this list.

  4. The HDF5-aware darshan library we provide was built with the MPICH supplied by the Cray compiler wrapper, and may cause some applications to break with the following message: Attempting to use an MPI routine before initializing MPICH

    If you're interested in tracing your non-MPI application, consider building your own version of darshan as shown in the section above, adding --without-mpi at the configure step, and then setting the DARSHAN_ENABLE_NONMPI environment variable to enable darshan for non-MPI applications.

    If you're not interested in darshan, you can opt out and simply rebuild your application.

  5. Darshan aggregates the data collected during an MPI run only when MPI_Finalize() is called inside the application; this means that data won't be collected for applications lacking the Finalize call, nor for applications that crash during execution. A fix for this issue is currently being developed.

  6. Applications built with darshan are usually less portable than those built without it, because the library loader will try to load libdarshan.so at every execution. You can opt out of darshan to make your application more portable.
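
As mentioned in known issue 3, the same fix applies when launching through srun: a sketch combining the --export form shown earlier with the extra HDF5 library (the libhdf5.so path is illustrative):

srun --export=ALL,LD_PRELOAD=$DARSHAN_BASE_DIR/lib/libdarshan.so:/path/to/your/libhdf5.so your-application-here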