Training

This presentation by the Rogue Wave software development team will demonstrate some of the high-impact improvements made as part of projects conducted in collaboration with LLNL. These projects focus on improving NVIDIA CUDA GPU and OpenMP 4 debugging on CORAL systems with TotalView, as well as TotalView/Python debugging integration.

Hari Subramoni and Mark Arnold from the MVAPICH development team will be on site to discuss how to get the most out of MVAPICH MPI on LC clusters, including both TOSS and CORAL systems. Drop in for any and all sessions you'd like. Time is scheduled for Q&A after each presentation, so come ready with your questions and feedback.

This 3-day workshop covers the "getting started" basics for using Livermore Computing's (LC) High Performance Computing (HPC) systems. It is intended to help LLNL summer interns (and others) get up to speed quickly with using LC's supercomputers. Topics include an introduction to parallel computing, an overview of LC's HPC resources and computing environment, using LC's HPC Linux clusters (with hands-on exercises), running jobs and using LC's Moab/SLURM batch systems (with hands-on exercises), introductory parallel programming with MPI (with hands-on exercises), introductory parallel programming with OpenMP (with hands-on exercises), and debugging serial and parallel applications using the TotalView debugger (with hands-on exercises).

This page lists available online tutorials related to parallel programming and using LC's HPC systems.

Both Slurm and IBM's CSM provide a way to launch the tasks (i.e., Linux processes) of a user's application in parallel across the resources allocated to a job. For Slurm, the command is srun; for CSM, the command is jsrun. This page presents their similarities and their differences. It also details lrun, an LLNL-developed wrapper script for jsrun.
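As an illustrative sketch only (node and task counts are arbitrary, `my_app` is a placeholder, and exact flags and defaults vary by system, so consult LC's srun/jsrun/lrun documentation), a comparable 8-task launch with each command might look like:

```shell
# Slurm: launch 8 tasks across 2 allocated nodes
srun -N 2 -n 8 ./my_app

# CSM/jsrun: the same layout expressed in terms of resource sets;
# here, 8 resource sets, each with 1 task, 4 CPU cores, and 1 GPU
jsrun -n 8 -a 1 -c 4 -g 1 ./my_app

# lrun (LLNL wrapper around jsrun): Slurm-like syntax;
# 2 nodes with 4 tasks per node
lrun -N 2 -T 4 ./my_app
```

The main conceptual difference is that srun and lrun describe the layout in terms of nodes and tasks, while jsrun describes it in terms of resource sets (bundles of CPUs, GPUs, and memory).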

This workshop is intended for code teams and developers who will be using LLNL's future CORAL Sierra supercomputer. In preparation for Sierra, LC provides Early Access (EA) systems with a similar hardware and software environment. The focus of this introductory workshop is to provide basic "getting started" information for prospective users of these EA systems. Materials presented include a Sierra overview, EA hardware, accounts and access, selected user environment topics, compilers, MPI, running jobs and the LSF scheduler, NVIDIA GPU topics, tools and debuggers, documentation, and getting help.

IBM Spectrum LSF is a batch scheduler that allows users to run their jobs on Livermore Computing's (LC) Sierra (CORAL) high performance computing (HPC) clusters. IBM Cluster System Management (CSM) is the resource manager for the Sierra systems.
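As a minimal sketch of what an LSF batch script for such a system might look like (the queue name, time limit, and executable here are illustrative assumptions, not site defaults):

```shell
#!/bin/bash
#BSUB -nnodes 2            # request 2 nodes (CORAL-style LSF option)
#BSUB -W 30                # 30-minute wall-clock limit
#BSUB -q pbatch            # queue name is an assumption; check your site
#BSUB -J demo_job          # job name
#BSUB -o demo_job.%J.out   # stdout/stderr file; %J expands to the job ID

# launch the application with jsrun once the allocation starts;
# my_app is a placeholder executable
jsrun -n 8 -a 1 -g 1 ./my_app
```

A script like this would typically be submitted with `bsub < job.lsf`; see LC's LSF documentation for the options supported on Sierra.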
