Job Limits

Each LC platform is a shared resource. Users are expected to adhere to the following usage policies to ensure that the resources can be effectively and productively used by everyone. You can view the policies on a system itself by running:

news job.lim.poodle

Web version of Poodle Job Limits

Poodle is an M&IC resource to be used for serial and on-node parallelism only. Jobs are scheduled per core. This system does not have a high-speed interconnect.

Hardware

There are 2 login nodes, 3 pdebug nodes, 29 pbatch nodes, and 4 phighmem nodes. Each Poodle batch node contains dual-socket Intel Xeon Platinum 8479 processors (2.0 GHz), each with 56 cores, for a total of 112 cores and 256 GiB of DDR5 memory per node.

UPDATE 8/2023: Poodle also has High Bandwidth Memory (HBM) nodes, which contain dual-socket Intel(R) Xeon(R) CPU Max 9480 processors (1.9 GHz), each with 56 cores, for a total of 112 cores and 128 GiB of HBM memory per node.

Scheduling

Batch jobs are scheduled through SLURM.

  •     Interactive login: poodle[17,18]
  •     Pdebug: poodle[1-3]
  •     Batch use: poodle[4-16,22-37]
  •     HBM use: poodle[38-41]
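
These node classes map onto SLURM partitions. The sketch below assumes the partition names match the class names above (pdebug, pbatch, phighmem) and that phighmem covers the HBM nodes poodle[38-41]; verify with sinfo before relying on them.

    sinfo --format="%P %D %N"                        # partition, node count, node list
    salloc --partition=phighmem --ntasks=112 -t 60   # one full HBM node for one hour (at most one phighmem node per user)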

Poodle jobs are scheduled per core through SLURM. Scheduling limits are not technically enforced, so users are expected to monitor their own behavior and keep themselves within the current limits while following these policies (a sketch of a compliant batch script follows the list):

  • Users will not compile on the login nodes during daytime hours
  • Users can only use up to one node of phighmem at a time
  • A user can have a maximum of 336 processors with a runtime of up to 4 hours in the queue during the day, with the following exception:
    • An occasional debugging job of up to one hour using 337-560 processors, as long as it is the user's only job in the queue.
  • Daytime is 0800-2000, Monday-Friday, not including holidays
  • No production runs allowed, only development and debugging
  • Users will not run computationally intensive work on the login nodes
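
The script below is a minimal sketch of a daytime-compliant batch job under these limits: one full node's worth of cores (112, under the 336-core ceiling) for the 4-hour daytime runtime cap. The partition name pbatch and the executable my_app are assumptions; substitute your own. Submit it with sbatch and check on it with squeue -u $USER.

    #!/bin/bash
    #SBATCH --partition=pbatch        # assumed partition name for the batch nodes
    #SBATCH --ntasks=112              # Poodle schedules per core; 112 cores = one full node
    #SBATCH --time=04:00:00           # daytime runtime limit
    #SBATCH --job-name=dev_test
    #SBATCH --output=dev_test.%j.out

    srun -n 112 ./my_app              # my_app is a placeholder for your executable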

Pdebug is intended for debugging, visualization, and other inherently interactive work. It is NOT intended for production work. Do not use pdebug to run batch jobs. Do not chain jobs to run one after the other. Individuals who misuse the pdebug queue in this or any similar manner will be denied access to it.
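
For the interactive work pdebug is meant for, a short allocation is usually enough. A sketch, again assuming the partition is named pdebug, with my_app standing in for your own code:

    salloc --partition=pdebug --ntasks=4 --time=00:30:00   # 4 cores for 30 minutes
    srun -n 4 ./my_app                                     # run inside the allocation, then exit to release it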

Interactive access to a batch node is allowed only while you have a batch job running on that node, and only for the purpose of monitoring your job. When logging into a batch node, be mindful of the impact your work has on the other jobs running on the node.
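
One way to find the node a running job landed on before logging in to monitor it (poodle9 is just an example name taken from the squeue output):

    squeue -u $USER --format="%i %P %N %T %M"   # job id, partition, node list, state, elapsed time
    ssh poodle9                                 # log in to the node reported in the %N column
    top -u $USER                                # watch only your own processes, then log out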

Jobs are limited to a single node. Multiple users can run on the same node. If the number of cores is not specified, the job is allocated a single core.
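
In practice the allocation is whatever core count you ask for, up to one node. A few srun examples (the executable names are placeholders):

    srun ./my_serial_app                   # no core count given: one core
    srun -n 16 ./my_mpi_app                # 16 tasks on 16 cores of a single node
    srun --cpus-per-task=16 ./my_omp_app   # one task with 16 cores for a threaded code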

We are all family and expect developers to play nice. However, if someone's jobs have taken over the machine:

  • Call them or send them an email.
  • Email ramblings-help@llnl.gov with a screenshot so we can take care of the situation by killing work that violates policy.

This approach will be revisited later and additional limits will be set if necessary. If someone monopolizes the machine, developers can always shift to other CZ resources.
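
Before contacting anyone, it helps to confirm what is actually running. A standard squeue query (again assuming the batch partition is named pbatch) lists each job's owner, core count, and elapsed time:

    squeue --partition=pbatch --format="%u %i %C %M %T"   # user, job id, cores, elapsed time, state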

Scratch Disk Space: consult the CZ File Systems web page at https://lc.llnl.gov/fsstatus/fsstatus.cgi

Documentation

Contact

Please call or send email to the LC Hotline if you have questions. LC Hotline | phone: 925-422-4531 | email: lc-hotline@llnl.gov

Zone: CZ
Vendor: Dell

User-Available Nodes
  Login Nodes*: 2
  Batch Nodes: 33
  Debug Nodes: 3
  Total Nodes: 41

CPUs
  CPU Architecture: Intel(R) Xeon(R) Platinum 8479, Intel(R) Xeon(R) CPU Max 9480
  Cores/Node: 112
  Total Cores: 4,592
  Memory Total (GB): 10,496
  CPU Memory/Node (GB): 256

Peak Performance
  Peak TFLOPS (CPUs): 293.9

OS: TOSS 4
Interconnect: Cornelis
Parallel job type: Serial
Recommended location for parallel file space:
Program: ASC + M&IC (poss)
Class: CTS-2
Password Authentication: OTP
Year Commissioned: 2022

Compilers

See Compilers page