Hyperparameter optimization (HPO) is the process of tuning the hyperparameters of a machine learning model, such as the learning rate or filter sizes. Popular HPO algorithms include grid search, random search, Bayesian optimization, and genetic optimization. Several libraries and tools implement these algorithms, each with its own tradeoffs in usability, flexibility, and feature support.
On this page we will collect recommendations and examples for running distributed HPO tasks on our HPC systems.
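To make the idea concrete, here is a minimal sketch of random search, one of the algorithms listed above. It is library-independent and only illustrative: `toy_validation_loss` is a hypothetical stand-in for training a model and returning its validation loss, and the search ranges for `learning_rate` and `filter_size` are assumptions chosen for the example.

```python
import math
import random


def toy_validation_loss(learning_rate, filter_size):
    """Hypothetical stand-in for a real training run.

    Returns a synthetic loss whose minimum is at
    learning_rate=1e-2 and filter_size=5.
    """
    return (math.log10(learning_rate) + 2.0) ** 2 + 0.1 * (filter_size - 5) ** 2


def random_search(n_trials, seed=0):
    """Sample hyperparameters at random and keep the best trial."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {
            # Learning rates are usually sampled log-uniformly.
            "learning_rate": 10 ** rng.uniform(-5, 0),
            # Discrete choices are sampled uniformly (assumed values).
            "filter_size": rng.choice([3, 5, 7]),
        }
        loss = toy_validation_loss(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


if __name__ == "__main__":
    best_params, best_loss = random_search(n_trials=50)
    print("best hyperparameters:", best_params)
    print("best loss:", best_loss)
```

In a real HPO task each trial is an independent training run, which is why random search and grid search distribute easily across the nodes of an allocation.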
Cray provides an HPO library that integrates well with Cray systems. It can use Slurm to request and use an allocation, and it provides genetic search, random search, grid search, and population-based training.
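For readers unfamiliar with genetic search, the following pure-Python sketch shows the general idea of selection, crossover, and mutation. It does not use the Cray HPO API and is not how that library is implemented. The function `toy_loss`, the bounds in `BOUNDS`, and the population settings are all hypothetical and chosen for illustration only.

```python
import random

# Assumed search space: (log10 of the learning rate, momentum).
BOUNDS = [(-5.0, 0.0), (0.0, 1.0)]


def toy_loss(params):
    """Hypothetical stand-in for a training run; minimum at (-2.0, 0.9)."""
    lr_exp, momentum = params
    return (lr_exp + 2.0) ** 2 + (momentum - 0.9) ** 2


def crossover(a, b, rng):
    """Build a child by picking each hyperparameter from one of two parents."""
    return [rng.choice(pair) for pair in zip(a, b)]


def mutate(params, rng, rate=0.3):
    """Perturb each hyperparameter with probability `rate`, clipped to bounds."""
    child = []
    for value, (lo, hi) in zip(params, BOUNDS):
        if rng.random() < rate:
            value += rng.gauss(0.0, 0.1 * (hi - lo))
        child.append(min(hi, max(lo, value)))
    return child


def genetic_search(pop_size=20, generations=10, seed=0):
    """Evolve a population, keeping the better half each generation."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in BOUNDS]
                  for _ in range(pop_size)]
    history = []  # best loss seen at the start of each generation
    for _ in range(generations):
        population.sort(key=toy_loss)
        history.append(toy_loss(population[0]))
        parents = population[: pop_size // 2]  # selection (elitist)
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = min(population, key=toy_loss)
    return best, toy_loss(best), history


if __name__ == "__main__":
    best, loss, history = genetic_search()
    print("best hyperparameters:", best)
    print("best loss:", loss)
```

Because the better half of each generation is carried over unchanged, the best loss in this sketch never gets worse from one generation to the next.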
The official Cray HPO documentation can be found here:
You can load the latest version on Cori with:
module load cray-hpo
You can find an example Jupyter notebook for genetic search here: