Job Limits
Each LC platform is a shared resource. Users are expected to adhere to the following usage policies to ensure that the resources can be effectively and productively used by everyone. You can view the policies on a system itself by running:
news job.lim.poodle
Web version of Poodle Job Limits
Poodle is an M&IC resource to be used for serial and on-node parallelism only. Jobs are scheduled per core. This system does not have a high-speed interconnect.
Hardware
There are 2 login nodes, 3 pdebug nodes, 29 pbatch nodes, and 4 phighmem nodes. Each Poodle batch node contains two Intel Xeon Platinum 8479 processors (2.0 GHz), each with 56 cores, for a total of 112 cores and 256 GiB of DDR5 memory per node.
UPDATE 8/2023: Poodle also has High Bandwidth Memory (HBM) nodes, which contain two Intel(R) Xeon(R) CPU Max 9480 processors (1.9 GHz), each with 56 cores, for a total of 112 cores and 128 GiB of HBM per node.
Scheduling
Batch jobs are scheduled through SLURM.
- Interactive login: poodle[17,18]
- Pdebug: poodle[1-3]
- Batch use: poodle[4-16,22-37]
- HBM use: poodle[38-41]
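As a quick cross-check of the node ranges above against the counts in the Hardware section, the following is a minimal sketch that expands the bracketed ranges and multiplies by 112 cores per node. The helper `expand_hostlist` and the dictionary keys are illustrative names, not LC tools, and treating the "HBM use" nodes as the phighmem nodes is an assumption.

```python
import re

def expand_hostlist(expr: str) -> list[str]:
    """Expand a bracketed range such as 'poodle[4-16,22-37]' into host names."""
    match = re.fullmatch(r"(\w+)\[([\d,\-]+)\]", expr)
    if match is None:
        return [expr]  # plain host name with no bracket range
    prefix, body = match.groups()
    hosts = []
    for part in body.split(","):
        lo, _, hi = part.partition("-")
        hi = hi or lo  # a single number such as '17' is a range of one
        hosts.extend(f"{prefix}{n}" for n in range(int(lo), int(hi) + 1))
    return hosts

CORES_PER_NODE = 2 * 56  # two sockets, 56 cores per socket

# Ranges copied from the list above; the keys are illustrative labels.
partitions = {
    "login": "poodle[17,18]",
    "pdebug": "poodle[1-3]",
    "pbatch": "poodle[4-16,22-37]",
    "hbm": "poodle[38-41]",  # assumed to correspond to phighmem
}

node_counts = {name: len(expand_hostlist(expr)) for name, expr in partitions.items()}
for name, count in node_counts.items():
    print(f"{name}: {count} nodes, {count * CORES_PER_NODE} cores")
```

The resulting counts (2, 3, 29, and 4 nodes) agree with the figures given under Hardware.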
Jobs are scheduled per core. The limits below are not technically enforced, so users are expected to monitor their own usage and stay within the current limits while following these policies:
- Users will not compile on the login nodes during daytime hours
- Users can only use up to one node of phighmem at a time
- A user can have a maximum of 336 processors with a runtime of up to 4 hours in the queue during the day, with the following exception:
  - An occasional debugging job of at most one hour that uses 337-560 processors, as long as it is the user's only job in the queue.
- Daytime is 0800-2000 Mondays-Fridays not including holidays
- No production runs allowed, only development and debugging
- Users will not run computationally intensive work on the login nodes
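Because these limits are not enforced by the scheduler, it can help to check a planned request before submitting. The sketch below encodes the daytime rules as read from the list above; it is one interpretation, not an official LC tool. The function names are hypothetical, the processor count is treated as a per-user total, holidays are not modeled, and allowing anything outside daytime is an assumption, since only daytime limits are stated here.

```python
from datetime import datetime

DAY_START_HOUR, DAY_END_HOUR = 8, 20         # daytime is 0800-2000
MAX_CORES, MAX_HOURS = 336, 4.0              # normal daytime limit
DEBUG_MAX_CORES, DEBUG_MAX_HOURS = 560, 1.0  # occasional debugging exception

def is_daytime(when: datetime) -> bool:
    """True Monday-Friday between 0800 and 2000. Holidays are not modeled."""
    return when.weekday() < 5 and DAY_START_HOUR <= when.hour < DAY_END_HOUR

def within_daytime_limits(total_cores: int, hours: float,
                          other_jobs_in_queue: int, when: datetime) -> bool:
    """Check a planned request against the daytime limits listed above."""
    if not is_daytime(when):
        return True  # assumption: no limit is stated for nights and weekends
    if total_cores <= MAX_CORES:
        return hours <= MAX_HOURS
    if total_cores <= DEBUG_MAX_CORES:
        # Exception: one hour at most, and it must be the user's only job.
        return hours <= DEBUG_MAX_HOURS and other_jobs_in_queue == 0
    return False

weekday_morning = datetime(2024, 1, 10, 10, 0)  # a Wednesday
print(within_daytime_limits(336, 4.0, 0, weekday_morning))
print(within_daytime_limits(560, 1.0, 1, weekday_morning))
```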
Pdebug is intended for debugging, visualization, and other inherently interactive work. It is NOT intended for production work. Do not use pdebug to run batch jobs. Do not chain jobs to run one after the other. Individuals who misuse the pdebug queue in this or any similar manner will be denied access to running jobs in the pdebug queue.
Interactive access to a batch node is allowed only while you have a batch job running on that node, and only for the purpose of monitoring your job. When logging into a batch node, be mindful of the impact your work has on the other jobs running on the node.
Jobs are limited to a single node. Multiple users can run on the same node. If the number of cores is not specified, the job is allocated one core.
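To illustrate the single-node, per-core model, here is a minimal sketch that assembles an `sbatch` command line with an explicit core count. It only builds and prints the command and does not submit anything. The helper name `build_sbatch_command` and the script name `run_case.sh` are hypothetical, and mapping cores to `--ntasks` is an assumption about how a given job should be laid out; `--nodes`, `--ntasks`, `--time`, and `--partition` are standard sbatch options.

```python
import shlex

CORES_PER_NODE = 112  # from the Hardware section

def build_sbatch_command(script: str, cores: int = 1, minutes: int = 30,
                         partition: str = "pbatch") -> list[str]:
    """Build an sbatch argument list for a single-node job.

    The default of one core mirrors what a job receives when no core
    count is specified.
    """
    if not 1 <= cores <= CORES_PER_NODE:
        raise ValueError(f"cores must be between 1 and {CORES_PER_NODE}")
    hours, mins = divmod(minutes, 60)
    return [
        "sbatch",
        "--nodes=1",                         # jobs are limited to a single node
        f"--ntasks={cores}",                 # one task per requested core
        f"--time={hours:02d}:{mins:02d}:00",
        f"--partition={partition}",
        script,
    ]

cmd = build_sbatch_command("run_case.sh", cores=8, minutes=90)
print(shlex.join(cmd))
```

Rejecting requests above 112 cores up front reflects the single-node limit, since a job cannot span nodes on this system.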
We are all family and expect developers to play nice. However, if someone's jobs have taken over the machine:
- Call them or send them an email.
- Email ramblings-help@llnl.gov with a screenshot so we can take care of the situation by killing work that violates policy.
This approach will be revisited later and additional limits will be set if necessary. If someone monopolizes the machine, developers can always shift to other CZ resources.
Scratch Disk Space: Consult the CZ File Systems web page: https://lc.llnl.gov/fsstatus/fsstatus.cgi
Contact
Please call or send email to the LC Hotline if you have questions. LC Hotline | phone: 925-422-4531 | email: lc-hotline@llnl.gov