
Job Limits

Each LC platform is a shared resource. Users are expected to adhere to the following usage policies to ensure that the resources can be effectively and productively used by everyone. You can view the policies on a system itself by running:

news job.lim.MACHINENAME
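
For example, on RZVernal:

news job.lim.rzvernal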

RZVernal

RZVernal is a CORAL-2 early access system. There are 2 login nodes and 36 compute nodes in a pdebug partition. Each compute node has 64 AMD EPYC cores, 4 AMD gfx90a GPUs, and 512 GB of memory. RZVernal is running TOSS 4.

There is 1 scheduling pool:

  • pdebug: 2304 cores, 144 GPUs (36 nodes)
   Pool             Max nodes/job       Max runtime
   ---------------------------------------------------
   pdebug                 3                4 hours
   ---------------------------------------------------
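
As a sketch, an interactive allocation within these limits might look like the following; salloc and the -N/-p/-t flags are standard Slurm options, and the node count and time here are only illustrative:

# request 2 nodes in the pdebug pool for up to 2 hours (within the 3-node/4-hour limits)
salloc -N 2 -p pdebug -t 2:00:00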

Scheduling

RZVernal jobs are scheduled using SLURM. Scheduling is not technically enforced, so users are expected to monitor their own usage and keep within the current limits while following the policies below:

  • Users will not compile on the login nodes during daytime hours.
  • Daytime hours are from 8:00am to 8:00pm Monday through Friday.
  • No production runs allowed, only development and debugging.
  • Users will avoid computationally intensive work on the login nodes.
  • We are all family and expect developers to play nice. However, if someone's jobs have taken over the machine:
    • Call them or send them an email.
    • Call Ines Heinz and she will call them and/or kill the job.
    • Call Ines's backup (Ellen Tarwater) and she will get the job killed.
  • This approach will be revisited later and additional limits will be set if necessary.

The queue (sorted by user) can be found by typing "squeue -S u" at the prompt after setting the environment variable:
SQUEUE_FORMAT='%.7i %.9P %.8j %.8u %.2t %.10M %.6D %.4C %R'
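
Assuming a bash-style shell, for example:

export SQUEUE_FORMAT='%.7i %.9P %.8j %.8u %.2t %.10M %.6D %.4C %R'
squeue -S u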

Contact

Please call or send email to the LC Hotline if you have questions.

LC Hotline phone: 925-422-4531
SCF email: lc-hotline@pop.llnl.gov

Zone                        RZ
Vendor                      HPE Cray

User-Available Nodes
  Login Nodes*              2 nodes: rzvernal[10,11]
  Batch Nodes               0
  Debug Nodes               36
  Total Nodes               38

CPUs
  CPU Architecture          AMD Trento
  Cores/Node                64
  Total Cores               2,432

GPUs
  GPU Architecture          AMD MI-250X
  Total GPUs                152
  GPUs per compute node     4
  GPU peak performance      45.00 TFLOP/s (double precision)
  GPU global memory         128.00 GB

Memory Total                16,384 GB

Peak Performance
  Peak TFLOPS (CPUs)        512.0
  Peak TFLOPS (GPUs)        6,840.0
  Peak TFLOPS (CPUs+GPUs)   6,916.00

Clock Speed                 1.9 GHz
OS                          TOSS 4
Interconnect                HPE Slingshot 11
Parallel job type           multiple nodes per job
Program                     ASC
Class                       ATS-4/EA, CORAL-2
Password Authentication     OTP, Kerberos, ssh keys
Year Commissioned           2022

*Login nodes: rzvernal10, rzvernal11
Compilers

See the Compilers page.

Documentation