Training

IBM Spectrum LSF is a batch scheduler that allows users to run their jobs on Livermore Computing’s (LC) Sierra (CORAL) high performance computing (HPC) clusters. IBM Cluster System Management (CSM) is the resource manager for the Sierra systems.

Both Slurm and IBM's CSM provide a way to launch the tasks (that is, Linux processes) of a user's application in parallel across the resources allocated to the job. For Slurm, the command is srun; for CSM, the command is jsrun. This page presents their similarities and differences. It also details lrun, an LLNL-developed wrapper script for jsrun.

This page lists available online tutorials related to parallel programming and using LC's HPC systems.

May 1, 2019
 

Date / Time: May 1, 2019, 10:00am - 11:30am
Location: Building 453, Room 1001 (Armadillo Room)
Note: This is a Property Protection Area. Foreign national temporary escorted building access procedures apply.

This three-day workshop covers the "getting started" basics of using Livermore Computing's (LC) High Performance Computing (HPC) systems. It is intended to help LLNL summer interns (and others) get up to speed quickly on LC's supercomputers. Topics include an introduction to parallel computing, an overview of LC's HPC resources and computing environment, using LC's HPC Linux clusters (with hands-on exercises), running jobs and using LC's Moab/SLURM batch systems (with hands-on exercises), introductory parallel programming with MPI (with hands-on exercises), introductory parallel programming with OpenMP (with hands-on exercises), and debugging serial and parallel applications using the TotalView debugger (with hands-on exercises).

June 6–8, 2017
 

Dates: This event has already occurred. Page is archival.
Description

This presentation by the Rogue Wave software development team will demonstrate some of the high-impact improvements made as part of projects conducted in collaboration with LLNL. These projects focus on improving CORAL NVIDIA CUDA GPU and OpenMP 4 debugging with TotalView, and on TotalView/Python debugging integration.

Max Katz, Steve Rennich, and Peng Wang from NVIDIA will present a workshop on how to use NVIDIA performance tools, how Unified Memory works today on Pascal, and how CUDA and Unified Memory will evolve in the Volta time frame. See agenda below for content.

Hari Subramoni and Mark Arnold from the MVAPICH development team will be on site to discuss how to get the most out of MVAPICH MPI on LC clusters, including both TOSS and CORAL systems. Drop in for any or all of the sessions. Time is scheduled for Q&A after each presentation, so come ready with your questions and feedback.

