Job Limits

Each LC platform is a shared resource. Users are expected to adhere to the following usage policies to ensure that the resources can be effectively and productively used by everyone. You can view the policies on a system itself by running:

news job.lim.MACHINENAME
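
For example, on RZhound (machine names in news items are typically lowercase):

news job.lim.rzhound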

Hardware

Each RZhound node is based on the Intel Sapphire Rapids processor, with 56 cores per socket, 2 sockets per node (112 cores total), and 256 GB of DDR5 memory.

Scheduling

Batch jobs are scheduled through SLURM.

  • pdebug: 16 nodes (1,792 cores), interactive use only.
  • pbatch: 358 nodes (40,096 cores), batch use only.
Pools               Max nodes/job       Max runtime
---------------------------------------------------
pdebug                    4(*)            1 hour
pbatch                   32(**)          24 hours
---------------------------------------------------

(*) Please limit your use of pdebug to 8 nodes PER USER, across all of your jobs, not per job, so that other users retain access. Pdebug is scheduled using fairshare, and jobs are core-scheduled, not node-scheduled. To allocate whole nodes, add the '--exclusive' flag to your sbatch or salloc command.

(**) In addition to the per-job limit, there is a limit of 32 nodes per user, per bank, across all of an individual user's jobs.
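
As an illustration, a minimal pbatch script that stays within the limits above might look like the following sketch; the job name and executable are placeholders, while the partition, node count, and time limit come from the table:

#!/bin/bash
#SBATCH --partition=pbatch
#SBATCH --nodes=32
#SBATCH --time=24:00:00
#SBATCH --job-name=myrun

# my_app stands in for your executable.
srun ./my_app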

Do NOT run computationally intensive work on the login nodes. There are only a few login nodes, and they are meant primarily for editing files and launching jobs. Most of the time, when a login node is laggy it is because a user has started a compile on it.
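
One way to keep large builds off the login nodes is to run them inside a short interactive allocation instead; a sketch with illustrative values (one pdebug node for 30 minutes, 16-way parallel build):

srun -p pdebug -N 1 -t 30 --pty bash
make -j 16    # runs on the allocated compute node once the shell starts there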

Pdebug is intended for debugging, visualization, and other inherently interactive work.  It is not intended for production work. Do not use pdebug to run batch jobs.  Do not chain jobs to run one after the other. Individuals who misuse the pdebug queue in this or any similar manner may be denied access to running jobs in the pdebug queue.

Pdebug is core scheduled. To allocate whole nodes, add a '--exclusive' flag to your sbatch or salloc command.
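
For instance, a sketch of an interactive request for two whole pdebug nodes at the one-hour maximum:

salloc --partition=pdebug --nodes=2 --time=1:00:00 --exclusive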

Interactive access to a batch node is allowed while you have a batch job running on that node, and only for the purpose of monitoring your job. When logging into a batch node, be mindful of the impact your work has on the other jobs running on the node.
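
For example, to see which nodes your job occupies and then watch only your own processes there (the node name is a placeholder taken from squeue's NODELIST column):

squeue -u $USER    # list your jobs and their assigned nodes
ssh <nodename>     # log in to a node where your job is running
top -u $USER       # monitor your own processes only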

Scratch Disk Space: consult the CZ File Systems web page at https://lc.llnl.gov/fsstatus/fsstatus.cgi

Contact

Please call or send email to the LC Hotline if you have questions. LC Hotline | phone: 925-422-4531 | email: lc-hotline@llnl.gov

Zone                                            RZ
Vendor                                          Dell
User-Available Nodes
    Login Nodes*                                2
    Batch Nodes                                 374
    Total Nodes                                 386
CPUs
    CPU Architecture                            Intel Sapphire Rapids
    Cores/Node                                  112
    Total Cores                                 41,888
Memory Total (GB)                               95,744
CPU Memory/Node (GB)                            256
Peak Performance
    Peak TFLOPS (CPUs)                          2,655.4
    Clock Speed (GHz)                           2.0
Interconnect                                    Cornelis Networks
Parallel job type                               multiple nodes per job
Recommended location for parallel file space
Program                                         ASC, M&IC
Class                                           CTS-2
Password Authentication                         OTP, Kerberos, ssh keys
Year Commissioned                               2023
Compilers

See Compilers page
