NOTE: This system currently has limited availability.

Job Limits

Each LC platform is a shared resource. Users are expected to adhere to the following usage policies to ensure that the resources can be effectively and productively used by everyone. You can view the policies on a system itself by running:

news job.lim.MACHINENAME

Web Version of Matrix Job Limits

Hardware

Each Matrix node is based on the Intel Sapphire Rapids processor, with 56 cores per socket, 2 sockets per node, and 512 GB of DDR5 memory, as well as 4 NVIDIA H100 GPUs.

Scheduling

Matrix is GPU scheduled. This means that the minimum allocation includes 1 GPU, 28 cores, and 128 GB of memory, and all allocations are multiples of that unit. Please use the Slurm '-G', '--gpus', or '--gpus-per-task' flags to specify how many GPUs your job needs. If you do not specify a GPU option, your allocation may not have the resources you expect. You can also allocate whole nodes with '--exclusive'.
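As an illustrative sketch of these flags in a batch script (the time limit, job name, task count, and executable below are placeholders, not values from this page):

#!/bin/bash
#SBATCH -G 2                # request 2 GPUs; the 56 cores and 256 GB
                            # of memory tied to them come automatically
#SBATCH -t 00:30:00         # placeholder time limit
#SBATCH -J matrix-gpu-job   # placeholder job name

# Launch one task per allocated GPU (placeholder executable).
srun -n 2 ./my_gpu_app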

Additional scheduling examples can be found below.

Matrix Scheduling Examples

Matrix is a GPU-scheduled machine where resources are allocated based on GPUs rather than entire nodes. Below are the key principles of how resources are allocated:

  • Smallest Allocation: Includes 1 GPU and its 28 local CPU cores.
  • Larger Allocations: Include multiple GPUs and all CPU cores local to those GPUs.
  • Shared Nodes: Multiple users can share a node, each with dedicated GPUs, memory, and cores.
  • Multi-Node Jobs: A job requesting multiple GPUs may span multiple nodes, even if the request totals less than a full node.
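The unit-based arithmetic above can be sketched in shell. The per-GPU figures (28 cores, 128 GB) come from this page; the script itself is only an illustration, not an LC tool:

```shell
#!/bin/sh
# Per-GPU allocation unit on Matrix, per the scheduling policy above.
CORES_PER_GPU=28
MEM_GB_PER_GPU=128

# Resources implied by a 3-GPU request.
gpus=3
echo "cores=$(( gpus * CORES_PER_GPU ))"    # cores=84
echo "mem_gb=$(( gpus * MEM_GB_PER_GPU ))"  # mem_gb=384
```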

Flags and Their Behavior

  1.  -G (Number of GPUs)

    Specifies the number of GPUs needed for the job. Each GPU comes with its associated CPU cores and memory.
    •  Examples:
      • salloc -G 1 : Allocates 1 GPU and 28 CPU cores.
      • salloc -G 2 : Allocates 2 GPUs and 56 CPU cores.
    • Notes:
      1. If the requested GPUs are available on the same node, the job will remain on one node.
      2. If the requested GPUs are not available on the same node, the job may span multiple nodes.

  2. -n (Number of Tasks)

    Specifies the number of tasks (processes) required for the job, with one CPU core allocated per task.
    • Examples:
      • salloc -n 1 : Allocates 1 GPU and 28 CPU cores (smallest allocation).
      • salloc -n 28: Allocates 1 GPU and 28 CPU cores (fully utilizes the cores associated with the GPU).
      • salloc -n 29 : Allocates 2 GPUs and 56 CPU cores (spans multiple GPUs when more than 28 tasks are requested).
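The task-to-GPU rounding above is just ceiling division by 28 (one core per task, 28 cores per GPU); a quick shell sketch, illustrative only:

```shell
#!/bin/sh
# GPUs implied by a task count: ceiling division by the
# 28 cores that come with each GPU.
for tasks in 1 28 29; do
  gpus=$(( (tasks + 27) / 28 ))
  echo "$tasks tasks -> $gpus GPU(s)"
done
# Prints:
# 1 tasks -> 1 GPU(s)
# 28 tasks -> 1 GPU(s)
# 29 tasks -> 2 GPU(s)
```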

  3. -N (Number of Nodes)

    Specifies the number of nodes over which the allocation will be spread.
    • Examples:
      • salloc -N 1 : Allocates 1 GPU and 28 CPU cores (smallest possible allocation on one node).
      • salloc -N 2 : Allocates 2 GPUs and 56 CPU cores, distributed across two nodes.
    • Notes:
      • salloc -G 2 -N 1 : Allocates 2 GPUs on the same node.
      • salloc -G 2 -N 2 : Allocates 1 GPU on each of 2 nodes.

  4. --exclusive  (Exclusive Node Access)

    Allocates whole nodes for the entire request, ensuring no other jobs share those nodes.
    • Examples:
      • salloc -G 1 --exclusive : Allocates 4 GPUs and 112 CPU cores on the same node (entire node).
      • salloc -G 5 --exclusive : Allocates 8 GPUs and 224 CPU cores, split across 2 nodes.
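With --exclusive, the node count is the GPU request rounded up to whole 4-GPU, 112-core nodes (figures from the hardware description above); sketched in shell, illustrative only:

```shell
#!/bin/sh
# Whole-node arithmetic under --exclusive:
# nodes = ceil(requested_gpus / 4), with 4 GPUs and 112 cores per node.
requested_gpus=5
nodes=$(( (requested_gpus + 3) / 4 ))
echo "nodes=$nodes"               # nodes=2
echo "gpus=$(( nodes * 4 ))"      # gpus=8
echo "cores=$(( nodes * 112 ))"   # cores=224
```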

Additional Notes

  • Multi-GPU Allocation:
    • salloc -G 2 allocates 2 GPUs and 56 cores on the same node if possible. 
    • If the GPUs are not available on the same node, the job will be split across 2 nodes.
  • Node-Specific Allocation:
    • salloc -G 2 -N 1 : Ensures 2 GPUs are allocated from the same node.
    • salloc -G 2 -N 2 : Allocates 1 GPU on each of 2 nodes.


Zone: CZ
Vendor: Dell

User-Available Nodes
  • Login Nodes*: matrix[1,2]
  • Batch Nodes: 12
  • Debug Nodes: 2
  • Total Nodes: 16

CPUs
  • CPU Architecture: Intel(R) Xeon(R) Platinum 8480+
  • Cores/Node: 112
  • Total Cores: 1,792

GPUs
  • GPU Architecture: NVIDIA H100
  • Total GPUs: 56
  • GPUs per compute node: 4
  • GPU peak performance (TFLOP/s, double precision): 30.00
  • GPU global memory (GB): 320.00

Memory
  • Memory Total (GB): 8,064
  • CPU Memory/Node (GB): 504

Peak Performance
  • Peak TFLOPS (CPUs): 198.0
  • Peak TFLOPS (GPUs): 3,800.0
  • Peak TFLOPS (CPUs+GPUs): 4,000.00

Clock Speed (GHz): 3.7
OS: TOSS 4
Interconnect: IB
Parallel job type: multiple nodes per user
Recommended location for parallel file space:
Class: CTS-2
Password Authentication: OTP, Kerberos
Year Commissioned: 2025

Compilers

See Compilers page