Livermore Computing (LC) provides a wide variety of High Performance Computing (HPC) clusters. Slurm is the batch scheduler and resource manager on almost all LC clusters. Some LC clusters have been or are being transitioned to the Flux workload manager. The IBM Sierra clusters (aka CORAL systems) run the Spectrum LSF scheduler.
Batch Schedulers and Workload Managers
Over its history, LC has deployed, supported, and/or developed several tools to help schedule jobs and manage HPC workloads.
In 2006, the NNSA Tri-Labs selected the Moab Workload Manager to be the standard batch system on clusters across all three labs. It was used for over ten years.
Slurm on TOSS 3
With the rollout of the TOSS 3 Linux operating system onto LC clusters starting in 2016, Slurm took over all batch scheduling responsibilities. For Tri-Lab users who favor Moab commands, all LC clusters provide wrappers that emulate Moab’s commands but interact with the Slurm scheduler.
Flux and TOSS 4
With the TOSS 4 rollout beginning in 2022, LC is moving towards using Flux as the system-level workload manager on its clusters. Early-access TOSS 4 clusters are either partially or completely managed by Flux. Additionally, users can begin exploring Flux by starting it inside a Slurm or LSF allocation on any LC cluster.
To find out whether an LC cluster runs TOSS 3 or TOSS 4 and what its primary resource manager is, see its login header announcements.
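As a rough illustration of starting Flux inside a Slurm allocation, the Python sketch below only builds the command line; it does not run anything. The helper name flux_start_command is hypothetical, and launching one Flux broker per node with srun's --nodes, --ntasks, and --pty options is an assumption about a typical setup, not LC's documented procedure. A given cluster may need additional options, so check the Flux user guide before relying on it.

```python
import shlex
from typing import List


def flux_start_command(nodes: int) -> List[str]:
    """Build an srun command that launches `flux start` with one task per node.

    Hypothetical helper for illustration: the flag set is an assumed
    typical invocation, not a procedure confirmed by LC.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return [
        "srun",
        f"--nodes={nodes}",    # number of nodes in the Slurm allocation
        f"--ntasks={nodes}",   # one task (one Flux broker) per node
        "--pty",               # attach a terminal for an interactive session
        "flux",
        "start",
    ]


if __name__ == "__main__":
    # Print the command as a single shell-quoted string for inspection.
    print(shlex.join(flux_start_command(2)))
```

Building the command as a list and printing it with shlex.join keeps the sketch safe to run anywhere, including on machines where neither Slurm nor Flux is installed.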
The Batch System Primer provides an introduction to the concepts and terms used for running jobs on LC’s HPC clusters. From there, the links on the left provide the user guides to running jobs using Slurm, Flux, and LSF.