Catalyst

The 150 teraflop/s Catalyst, a unique high-performance computing (HPC) cluster, serves as a proving ground for new HPC and big data technologies, architectures, and applications. Developed by a partnership of Cray, Intel, and Lawrence Livermore, this Cray CS300 system is available for collaborative projects with industry through Livermore’s HPC Innovation Center. Catalyst also supports LLNL’s Advanced Simulation and Computing program.

Catalyst features include 128 gigabytes of dynamic random access memory (DRAM) and 800 gigabytes of non-volatile memory (NVRAM) per compute node. The increased storage capacity of the system represents a major departure from the classic simulation-based computing architectures common at Department of Energy laboratories and enables researchers to explore the potential of combining floating-point-focused capability with data analysis in one environment. In addition, the machine’s expanded DRAM and fast, persistent NVRAM are particularly well suited to solving big data problems, such as those found in bioinformatics, graph networks, machine learning, and natural language processing, and to exploring new approaches to application checkpointing, in-situ visualization, out-of-core algorithms, and data analytics. Catalyst should help extend the range of possibilities for the processing, analysis, and management of the ever-larger and more complex data sets that many areas of business and science now confront.

Catalyst is a limited-access system.

* 2 login nodes: catalyst[159,160]

Notes:

  • Local NVRAM storage, mounted on each compute node as /l/ssd: 800 GB (see the staging sketch below)
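
The sketch below illustrates one way a batch job might use this node-local NVRAM: stage input onto /l/ssd, run against the local copy, and copy results back to a shared parallel file system before the job ends. The script, file names, and directories are placeholders, not an LC-prescribed workflow.

  #!/bin/bash
  #SBATCH --partition=pbatch
  #SBATCH --nodes=1
  #SBATCH --time=01:00:00

  # Stage input onto the node-local 800 GB SSD (placeholder paths)
  cp /p/lustre1/$USER/input.dat /l/ssd/input.dat

  # Run against the local copy to take advantage of fast node-local I/O
  srun -N1 -n24 ./my_analysis /l/ssd/input.dat /l/ssd/output.dat

  # Copy results back to the shared file system so they remain accessible
  # after the allocation ends (placeholder destination)
  cp /l/ssd/output.dat /p/lustre1/$USER/output.dat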

Job Limits

Each LC platform is a shared resource. Users are expected to adhere to the following usage policies to ensure that the resources can be used effectively and productively by everyone. You can view the policies on the system itself by running:

news job.lim.MACHINENAME
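
On Catalyst, for example, substituting the machine name gives:

news job.lim.catalyst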

A web version of the Catalyst job limits is also available.

Hardware

There are 302 compute nodes, each with 128 GB of memory. All compute nodes have Intel Xeon E5-2695 v2 processors with 24 cores per node. The nodes are connected via QLogic InfiniBand QDR.

Scheduling

Catalyst jobs are scheduled through SLURM.

Jobs are scheduled per node. There is one scheduled pool of nodes:

pbatch - 300 nodes (7,200 cores); batch use only.

      Catalyst limits:

                      Max nodes/job   Max runtime
      -------------------------------------------
      pbatch               *          24 hours
      -------------------------------------------

*  There are currently no limits set on the maximum nodes per job; however, in general LC prefers that users adhere to a good-neighbor policy of limiting their use of a system or queue to 50% of the available nodes. In this case, limiting the maximum job size to 150 nodes, or limiting the aggregate number of nodes across all running jobs to 150, is good practice.
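
As a concrete sketch of a job that respects these limits, the batch script below requests the pbatch pool, stays within the 150-node good-neighbor guideline, and uses the 24-hour maximum runtime. The node count, job name, and executable are placeholders.

  #!/bin/bash
  #SBATCH --partition=pbatch     # Catalyst batch pool
  #SBATCH --nodes=128            # within the 150-node good-neighbor guideline
  #SBATCH --time=24:00:00        # pbatch maximum runtime
  #SBATCH --job-name=my_sim      # placeholder job name

  # Launch one MPI task per core (24 cores per node on Catalyst)
  srun --ntasks-per-node=24 ./my_app

The script would then be submitted with sbatch, e.g. sbatch my_sim.sbatch.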

Documentation

Scratch Disk Space

Consult CZ File Systems Web Page: https://lc.llnl.gov/fsstatus/fsstatus.cgi

Zone: CZ
Vendor: Cray

User-Available Nodes
  Login Nodes*: 2
  Batch Nodes: 300
  Debug Nodes: 4
  Total Nodes: 324

CPUs
  CPU Architecture: Intel Xeon E5-2695 v2
  Cores/Node: 24
  Total Cores: 7,776

Memory Total (GB): 41,472
CPU Memory/Node (GB): 128

Peak Performance
  Peak TFLOPS (CPUs): 149.3

Clock Speed (GHz): 2.4
Peak single CPU memory bandwidth (GB/s): 60
OS: TOSS 3
Interconnect: IB QDR
Parallel job type: multiple nodes per job
Run Command: srun
Recommended location for parallel file space:
Program: ASC, M&IC
Password Authentication: OTP
Compilers:
Documentation: