Livermore Computing Resources and Environment

Table of Contents

  1. Abstract
  2. Organization
    1. What Is Livermore Computing?
    2. History of Livermore Computing
  3. Terminology
  4. LC's Hardware
    1. Systems Summary
    2. Intel Xeon Systems
    3. CORAL Systems
    4. Future Systems
    5. Typical LC Linux Cluster
    6. Interconnects
    7. Facilities, Machine Room Tours, Photos
  5. LC Accounts
  6. Accessing LC Systems
    1. Passwords, Authentication, and OTP Tokens
    2. SSH and Access Methods
    3. A Few More Words About SSH
    4. Where to Login
    5. VPN Remote Access Service
    6. SecureNet
  7. LC File Systems
    1. Home Directories and Login Files
    2. /usr/workspace File Systems
    3. Temporary File Systems
    4. Parallel File Systems
    5. Archival HPSS Storage
    6. /usr/gapps, /usr/gdata File Systems
    7. Quotas
    8. Purge Policies
    9. Backups
    10. File Transfer and Sharing
    11. File Interchange Service (FIS)
  8. System Status and Configuration Information
    1. System Configuration Information
    2. System Configuration Commands
    3. System Status Information
  9. Exercise 1
  10. Software and Development Environment Overview
    1. Development Environment Group (DEG)
    2. TOSS Operating System
    3. Software Lists
    4. Modules
    5. Atlassian Tools - Confluence, JIRA, etc.
    6. Spack Package Manager
  11. Compilers
    1. Available Compilers and Invocation Commands
    2. Compiler Versions and Defaults
    3. Compiler Options
    4. Compiler Documentation
    5. Optimizations
    6. Floating-point Exceptions
    7. Precision, Performance, and IEEE 754 Compliance
    8. Mixing C and Fortran
  12. Debuggers
    1. TotalView
    2. DDT
    3. STAT - Stack Trace Analysis Tool
    4. Debugging in Batch: mxterm / sxterm
    5. Other Debuggers
    6. A Few Additional Useful Debugging Hints
  13. Performance Analysis Tools
    1. We Need a Book!
    2. Memory Correctness Tools
    3. Profiling, Tracing, and Performance Analysis
    4. Beyond LC
  14. Graphics Software and Resources
    1. Consulting
    2. Video Production
    3. Visualization Machine Resources
    4. Power Walls
  15. Running Jobs
    1. Where to Run?
    2. Batch Versus Interactive
    3. Starting Jobs - srun
    4. Interacting with Jobs
    5. Other Topics of Interest
  16. Batch Systems
  17. Miscellaneous Topics
    1. Clusters with GPUs
    2. Big Data at LC
    3. Green Data Oasis
    4. Security Reminders
  18. Where to Get Information & Help
    1. LC Hotline
    2. LC Users Home Page: hpc.llnl.gov
    3. Lorenz User Dashboard: mylc.llnl.gov
    4. Login Banner
    5. News Items
    6. Machine Email Lists
    7. LC User Meetings
  19. Exercise 2

Abstract

This is the second tutorial in the "Livermore Computing Getting Started" workshop. It provides an overview of Livermore Computing's (LC) supercomputing resources and how to use them effectively. It is intended as a "getting started" document for new users and for those who want to know "in a nutshell" what supercomputing at LC is all about from a practical user's perspective. It also provides essential, practical information for those planning to attend the other tutorials in this workshop.

A wide variety of topics are covered in what is hopefully a logical progression, starting with a description of the LC organization, a summary of the available supercomputing hardware resources, how to obtain an account, and how to access LC systems. Important aspects of the user environment are then addressed, such as the user's home directory, various files and file systems, how to transfer/share files, quotas, archival storage, and getting system status/configuration information. A brief description of the software development environment (compilers, debuggers, and performance tools), a summary of video and graphics services, and the basics of how to run jobs follow. Several miscellaneous topics are discussed. Finally, this tutorial concludes with a discussion on where to obtain more information and help. Note: This tutorial provides only an overview of LC's Slurm/Moab batch systems; these topics are covered in detail in the EC4045 "Slurm and Moab" tutorial.

Level/Prerequisites: This tutorial is geared to new users of LC systems and might actually be considered a prerequisite for using LC systems and attending other tutorials that describe parallel programming on LC systems in more detail.

Organization

What Is Livermore Computing?

History of Livermore Computing

  • The history of Livermore Computing has its origins over 65 years ago when LLNL acquired its first computer, a Univac 1, in 1953.
  • A pictorial history of LLNL's computers is available at: computing.llnl.gov/history.

Terminology

"The acronyms can be a bit overwhelming"
- Excerpt from a workshop attendee evaluation form

Acronym | Meaning
LC | Livermore Computing - the division / program directly responsible for LLNL's supercomputers.
HPC | High Performance Computing. Supercomputing. Computation using the largest scale computers available.
OCF | LC's Open Computing Facility - unclassified computing
SCF | LC's Secure Computing Facility - closed or classified computing
CZ | Collaboration Zone - the open "green" part of the OCF
RZ | Restricted Zone - the internal "yellow" part of the OCF
ASC | The Department of Energy's Advanced Simulation and Computing program. Funding supports HPC at Livermore, Los Alamos and Sandia National Laboratories. ASC website: asc.llnl.gov
M&IC | Multiprogrammatic & Institutional Computing (mic.llnl.gov)
Slurm, Moab, LSF | Batch systems employed on LC machines
Node | A single computer in a networked HPC system/cluster. Contains multiple CPUs/cores. Linked to other nodes via at least one type of network.
CPU, Core | CPU = Central Processing Unit. The component within a node that executes programs and performs computations. Currently, CPUs are composed of multiple, identical computational subunits called cores.
CTS, CTS-1 | Commodity Technology Systems. Linux clusters procured under a Tri-lab proposal process for capacity computing systems. See asc.llnl.gov/computers/commodity.
TFLOPS, PFLOPS | Measure of a supercomputer's power/speed. Teraflops: Trillion Floating Point Operations per Second. Petaflops: Quadrillion Floating Point Operations per Second
Lustre | Linux cluster parallel file system used on most LC clusters (wiki.lustre.org/index.php/Main_Page)
TOSS | Tri-Laboratory Operating System Stack - the operating system and software stack used on LC Linux clusters. Derived from Red Hat Enterprise Linux.
Newbie | Who this tutorial is intended for

"Everything changes except the fact that everything changes."

DISCLAIMER: All information presented today is subject to change! This information was current as of Jan 2019.

Hardware

Systems Summary

Images: IBM POWER with NVIDIA GPUs; Intel Xeon; Intel Xeon

Mix of Resources

  • IBM POWER with NVIDIA GPUs: LC's newest and largest systems. Include LLNL's CORAL Early Access and Sierra systems.
  • Intel Xeon: Comprise the majority of LC's clusters; several different types
  • Size: Wide range, from dozens of cores to 1.6 million cores; from a few teraflops to 125 petaflops
  • Networks: InfiniBand, Intel Omni-Path, 5D Torus interconnect, or no interconnect
  • Uses: Capability Computing, Grand Challenge, routine production work, visualization work, file transfer, test-bed
  • Funding: ASC, M&IC, mixed

Primary Systems

LC's primary HPC computing systems are summarized in the table below.

For GPU systems, each cell lists the CPU value first and the GPU value second, separated by a semicolon.

Cluster | OCF/SCF | Architecture | Clock Speed (GHz) | Nodes; GPUs | Cores / Node; Cores / GPU | Cores Total | Memory / Node; Memory / GPU (GB) | Memory Total (GB) | TFLOPS Peak | Switch | ASC/M&IC | Notes
agate | SCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 48 | 36 | 1,728 | 128 | 6,144 | 58.1 | No | ASC |
borax | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 48 | 36 | 1,728 | 128 | 6,144 | 58.1 | No | ASC/M&IC |
catalyst | OCF | Intel 12-core Xeon E5-2695 v2 | 2.4 | 324 | 24 | 7,776 | 128 | 41,472 | 149.3 | IB QDR | ASC/M&IC | 1,4
cslic | SCF | Intel 8-core Xeon E5-2670 | 2.6 | 10 | 16 | 160 | 128 | 1,280 | 3.3 | No | ASC | 2
jade, jadeite, jadedev | SCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 2,688 | 36 | 96,768 | 128 | 344,064 | 3,251.4 | Omni-Path | ASC | 5
lassen | OCF | IBM Power9; NVIDIA Tesla V100 (Volta) | 2.3-3.8; 1530 MHz | 774; 774*4 | 44; 5120 | 34,056; 15,851,520 | 256; 16*4 | 198,144; 49,536 | 22,508 | IB EDR | ASC/M&IC | 1
max | SCF | Intel 8-core Xeon E5-2670; NVIDIA Tesla K20x | 2.6; 732 MHz | 324; 20*2 | 16; 2688 | 5,184; 107,520 | 256; 6*2 | 78,336; 240 | 108; 52.4 | IB QDR | ASC VIZ | 3
mica | SCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 384 | 36 | 13,824 | 128 | 49,152 | 464.5 | Omni-Path | ASC |
oslic | OCF | Intel 8-core Xeon E5-2670 | 2.6 | 10 | 16 | 160 | 128 | 1,280 | 3.3 | No | ASC | 2
pascal | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 171; 163*2 | 36; 3484 | 6,156; 1,135,784 | 256; 16*2 | 18,176; 5,216 | 206.8; 1,727.8 | Omni-Path | ASC/M&IC |
pinot | ISNSI | Intel 8-core Xeon E5-2670 | 2.6 | 162 | 16 | 2,592 | 64 | 10,368 | 53.9 | IB QDR | M&IC | 1
quartz | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 3,072 | 36 | 110,592 | 128 | 393,216 | 3,715.9 | Omni-Path | ASC/M&IC |
ray | OCF | IBM Power8; NVIDIA Tesla P100 (Pascal) | 2.0-4.0; 1481 MHz | 62; 54*4 | 20; 3484 | 1,240; 752,544 | 256; 16*4 | 15,872; 3,456 | 39.7; 1,144.8 | IB EDR | ASC/M&IC | 1
rzalastor | OCF | Intel 10-core Xeon E5-2670 v2 | 2.8 | 36 | 20 | 720 | 64 | 2,304 | 16.1 | IB QDR | ASC | 1
rzansel | OCF | IBM Power9; NVIDIA Tesla V100 (Volta) | 2.3-3.8; 1530 MHz | 54; 54*4 | 44; 5120 | 2,376; 1,105,920 | 256; 16*4 | 13,824; 3,456 | 1,570 | IB EDR | ASC | 1
rzgenie | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 48 | 36 | 1,728 | 128 | 6,144 | 58.1 | Omni-Path | ASC |
rzhasgpu | OCF | Intel 8-core Xeon E5-2667 v3; NVIDIA Tesla K80 GPU | 3.2; 824 MHz | 20; 16*4 | 16; 2496 | 320; 159,744 | 128; 24*2 | 2,560; 768 | 8.2; 59.8 | IB QDR | ASC/M&IC | 1
rzmanta | OCF | IBM Power8; NVIDIA Tesla P100 (Pascal) | 2.0-4.0; 1481 MHz | 44; 36*4 | 20; 3484 | 880; 501,696 | 256; 16*4 | 11,264; 2,304 | 28.2; 763.2 | IB EDR | ASC | 1
rzslic | OCF | Intel 8-core Xeon E5-2670 | 2.6 | 10 | 16 | 160 | 128 | 1,280 | 3.3 | No | ASC | 2
rztopaz | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 768 | 36 | 27,648 | 128 | 98,304 | 929 | Omni-Path | ASC |
rztrona | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 20 | 36 | 720 | 128 | 2,560 | 24.2 | No | ASC |
shark | SCF | IBM Power8; NVIDIA Tesla P100 (Pascal) | 2.0-4.0; 1481 MHz | 44; 36*4 | 20; 3484 | 880; 501,696 | 256; 16*4 | 11,264; 2,304 | 28.2; 763.2 | IB EDR | ASC | 1
sierra | SCF | IBM Power9; NVIDIA Tesla V100 (Volta) | 2.3-3.8; 1530 MHz | 4320; 4320*4 | 44; 5120 | 190,080; 88,473,600 | 256; 16*4 | 1,101,920; 276,480 | 125,000 | IB EDR | ASC | 1
surface | OCF | Intel 8-core Xeon E5-2670; NVIDIA Tesla K40 GPU | 2.6; 745 MHz | 162; 158*2 | 16; 2880 | 2,592; 910,080 | 256; 12*2 | 41,472; 3,792 | 53.9; 451.9 | IB QDR | ASC/M&IC, VIZ |
syrah | OCF | Intel 8-core Xeon E5-2670 | 2.6 | 324 | 16 | 5,184 | 64 | 20,736 | 107.8 | IB QDR | ASC/M&IC | 1
zin | SCF | Intel 8-core Xeon E5-2670 | 2.6 | 2,916 | 16 | 46,656 | 32 | 93,312 | 970.4 | IB QDR | ASC |

Notes:

  1. Limited access, not Generally Available, or not a production system.
  2. Primary use is for file transfer to storage.
  3. Two Tesla K20x GPUs on 20 nodes; 6 GB memory per GPU
  4. Login nodes have 48 cores; compute nodes have additional 800 GB NVRAM.
  5. Jade is split into 3 subsystems - compute nodes are: jade (1302), jadeite (1270), jadedev (32)

Peak Comparisons

Intel Xeon Systems

  • The majority of LC's systems are Intel Xeon based Linux clusters, and include the following processor architectures:
    • Intel Xeon 18-core E5-2695 v4 (Broadwell)
    • Intel Xeon 8-core E5-2670 (Sandy Bridge - TLCC2) with or without NVIDIA GPUs
    • Intel Xeon 12-core E5-2695 v2 (Ivy Bridge)
  • Mix of resources:
    • 8, 12, and 18 core processors
    • OCF and SCF
    • ASC, M&IC, VIZ
    • Capacity, Grand Challenge, visualization, testbed
    • Several GPU enabled clusters
  • 64-bit architecture
  • TOSS operating system stack
  • InfiniBand and Intel Omni-Path interconnects
  • Hyper-threading enabled (2 threads/core)
  • Vector/SIMD operations
  • For detailed hardware information, please see the "Additional Information" references below.

Additional Information


Quartz Intel Cluster
 


Zin Intel Cluster

System Details

For GPU systems, each cell lists the CPU value first and the GPU value second, separated by a semicolon.

Cluster | OCF/SCF | Architecture | Clock Speed (GHz) | Nodes; GPUs | Cores / Node; Cores / GPU | Cores Total | Memory / Node; Memory / GPU (GB) | Memory Total (GB) | TFLOPS Peak | Switch | ASC/M&IC
agate | SCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 48 | 36 | 1,728 | 128 | 6,144 | 58.1 | No | ASC
borax | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 48 | 36 | 1,728 | 128 | 6,144 | 58.1 | No | ASC/M&IC
catalyst | OCF | Intel 12-core Xeon E5-2695 v2 | 2.4 | 324 | 24 | 7,776 | 128 | 41,472 | 149.3 | IB QDR | ASC/M&IC
cslic | SCF | Intel 8-core Xeon E5-2670 | 2.6 | 10 | 16 | 160 | 128 | 1,280 | 3.3 | No | ASC
jade, jadeite, jadedev | SCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 2,688 | 36 | 96,768 | 128 | 344,064 | 3,251.4 | Omni-Path | ASC
max | SCF | Intel 8-core Xeon E5-2670; NVIDIA Tesla K20x | 2.6; 732 MHz | 324; 20*2 | 16; 2688 | 5,184; 107,520 | 256; 6*2 | 78,336; 240 | 108; 52.4 | IB QDR | ASC VIZ
mica | SCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 384 | 36 | 13,824 | 128 | 49,152 | 464.5 | Omni-Path | ASC
oslic | OCF | Intel 8-core Xeon E5-2670 | 2.6 | 10 | 16 | 160 | 128 | 1,280 | 3.3 | No | ASC
pascal | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 171; 163*2 | 36; 3484 | 6,156; 1,135,784 | 256; 16*2 | 18,176; 5,216 | 206.8; 1,727.8 | Omni-Path | ASC/M&IC
pinot | ISNSI | Intel 8-core Xeon E5-2670 | 2.6 | 162 | 16 | 2,592 | 64 | 10,368 | 53.9 | IB QDR | M&IC
quartz | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 3,072 | 36 | 110,592 | 128 | 393,216 | 3,715.9 | Omni-Path | ASC/M&IC
rzalastor | OCF | Intel 10-core Xeon E5-2670 v2 | 2.8 | 36 | 20 | 720 | 64 | 2,304 | 16.1 | IB QDR | ASC
rzgenie | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 48 | 36 | 1,728 | 128 | 6,144 | 58.1 | Omni-Path | ASC
rzhasgpu | OCF | Intel 8-core Xeon E5-2667 v3; NVIDIA Tesla K80 GPU | 3.2; 824 MHz | 20; 16*4 | 16; 2496 | 320; 159,744 | 128; 24*2 | 2,560; 768 | 8.2; 59.8 | IB QDR | ASC/M&IC
rzslic | OCF | Intel 8-core Xeon E5-2670 | 2.6 | 10 | 16 | 160 | 128 | 1,280 | 3.3 | No | ASC
rztopaz | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 768 | 36 | 27,648 | 128 | 98,304 | 929 | Omni-Path | ASC
rztrona | OCF | Intel 18-core Xeon E5-2695 v4 | 2.1 | 20 | 36 | 720 | 128 | 2,560 | 24.2 | No | ASC
surface | OCF | Intel 8-core Xeon E5-2670; NVIDIA Tesla K40 GPU | 2.6; 745 MHz | 162; 158*2 | 16; 2880 | 2,592; 910,080 | 256; 12*2 | 41,472; 3,792 | 53.9; 451.9 | IB QDR | ASC/M&IC, VIZ
syrah | OCF | Intel 8-core Xeon E5-2670 | 2.6 | 324 | 16 | 5,184 | 64 | 20,736 | 107.8 | IB QDR | ASC/M&IC
zin | SCF | Intel 8-core Xeon E5-2670 | 2.6 | 2,916 | 16 | 46,656 | 32 | 93,312 | 970.4 | IB QDR | ASC

Note: The pinot cluster is dedicated to ISNSI use.
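
The "TFLOPS Peak" figures for the Intel Xeon clusters can be approximated as total cores x clock speed x floating-point operations per cycle per core. The sketch below is illustrative only. The per-cycle values (8 for Sandy Bridge and Ivy Bridge, 16 for Broadwell) are assumptions about double-precision vector throughput and are not stated in this tutorial; the core counts and clock speeds come from the System Details table.

```python
# Assumed double-precision FLOPs per cycle per core (not stated in the tutorial):
# 8 for Sandy Bridge / Ivy Bridge (AVX), 16 for Broadwell (AVX2 with FMA).
FLOPS_PER_CYCLE = {"sandy_bridge": 8, "ivy_bridge": 8, "broadwell": 16}


def peak_tflops(total_cores: int, clock_ghz: float, microarch: str) -> float:
    """Theoretical peak in TFLOPS = cores x GHz x FLOPs/cycle / 1000."""
    gflops = total_cores * clock_ghz * FLOPS_PER_CYCLE[microarch]
    return gflops / 1000.0


# (total cores, clock in GHz, microarchitecture) taken from the table
systems = {
    "quartz": (110_592, 2.1, "broadwell"),
    "zin": (46_656, 2.6, "sandy_bridge"),
    "catalyst": (7_776, 2.4, "ivy_bridge"),
}

for name, spec in systems.items():
    print(f"{name}: {peak_tflops(*spec):.1f} TFLOPS")
```

Under these assumptions the results line up with the table entries for quartz (3,715.9), zin (970.4), and catalyst (149.3). Real application performance is typically well below theoretical peak.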

CORAL Systems

CORAL

  • CORAL = Collaboration of Oak Ridge, Argonne, and Livermore
  • A first-of-its-kind U.S. DOE collaboration between the NNSA's ASC Program and the Office of Science's Advanced Scientific Computing Research program (ASCR).
  • CORAL is the next major phase in the DOE's scientific computing roadmap and path to exascale computing.
  • Will culminate in three ultra-high performance supercomputers at Lawrence Livermore, Oak Ridge, and Argonne national laboratories.
  • Will be used for the most demanding scientific and national security simulation and modeling applications, and will enable continued U.S. leadership in computing.
  • The three CORAL systems are sited at LLNL, ORNL, and Argonne. The LLNL and ORNL systems were delivered in the 2017-18 timeframe. The Argonne system's planned delivery (revised) is in 2021.

CORAL Early Access (EA) Systems

  • In preparation for delivery of the final Sierra systems, LLNL has implemented three "early access" systems, one on each network:
    • ray - OCF-CZ
    • rzmanta - OCF-RZ
    • shark - SCF
  • Primary purpose was to provide platforms where Tri-lab users can begin porting and preparing for the hardware and software that will be delivered with the final Sierra systems. Still available for development and testing purposes.
  • Similar to the final delivery Sierra systems but use the previous generation IBM Power processors and NVIDIA GPUs.
  • IBM Power Systems S822LC Server: Hybrid architecture using IBM POWER8+ processors and NVIDIA Pascal GPUs.
  • IBM POWER8+ processors:
    • 2 per node (dual-socket)
    • 10 cores/socket; 20 cores per node
    • 8 SMT threads per core; 160 SMT threads per node
    • Clock: due to adaptive power management options, the clock speed can vary depending upon the system load. At LC speeds can vary from approximately 2 GHz - 4 GHz.
  • NVIDIA GPUs:
    • 4 NVIDIA Tesla P100 (Pascal) GPUs per compute node (not on login/service nodes)
    • 3584 CUDA cores per GPU; 14,336 per node
  • Memory:
    • 256 GB DDR4 per node
    • 16 GB HBM2 (High Bandwidth Memory 2) per GPU; 732 GB/s peak bandwidth
  • NVLINK 1.0:
    • Interconnect for GPU-GPU and CPU-GPU shared memory
    • 4 links per GPU with 160 GB/s total bandwidth
  • NVRAM: 1.6 TB NVMe PCIe SSD per compute node (CZ ray system only)
  • Network:
    • Mellanox 100 Gb/s Enhanced Data Rate (EDR) InfiniBand
    • One dual-port 100 Gb/s EDR Mellanox adapter per node
  • Parallel File System: IBM Spectrum Scale (GPFS)
    • ray: 1.3 PB
    • rzmanta: 431 TB
    • shark: 431 TB
  • Batch System: IBM Spectrum LSF
  • Additional information:

CORAL EA Ray Cluster

System Details
For each cell with two values, the CPU value is listed first and the GPU value second, separated by a semicolon.

Cluster | OCF/SCF | Architecture | Clock Speed (GHz) | Nodes; GPUs | Cores / Node; Cores / GPU | Cores Total | Memory / Node; Memory / GPU (GB) | Memory Total (GB) | TFLOPS Peak | Switch | ASC/M&IC
ray | OCF | IBM Power8; NVIDIA Tesla P100 (Pascal) | 2.0-4.0; 1481 MHz | 62; 54*4 | 20; 3484 | 1,240; 752,544 | 256; 16*4 | 15,872; 3,456 | 39.7; 1,144.8 | IB EDR | ASC/M&IC
rzmanta | OCF | IBM Power8; NVIDIA Tesla P100 (Pascal) | 2.0-4.0; 1481 MHz | 44; 36*4 | 20; 3484 | 880; 501,696 | 256; 16*4 | 11,264; 2,304 | 28.2; 763.2 | IB EDR | ASC
shark | SCF | IBM Power8; NVIDIA Tesla P100 (Pascal) | 2.0-4.0; 1481 MHz | 44; 36*4 | 20; 3484 | 880; 501,696 | 256; 16*4 | 11,264; 2,304 | 28.2; 763.2 | IB EDR | ASC
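
The per-node figures quoted for the CORAL EA systems (cores, SMT threads, CUDA cores per node) follow directly from the per-socket and per-GPU values. As a minimal sketch, the hypothetical `NodeSpec` class below (a name introduced here for illustration only) derives them from the values listed in the bullets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical helper holding the per-node hardware counts quoted in the text."""
    sockets: int
    cores_per_socket: int
    smt_per_core: int
    gpus: int
    cuda_cores_per_gpu: int
    gpu_mem_gb: int

    @property
    def cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def smt_threads(self) -> int:
        return self.cores * self.smt_per_core

    @property
    def cuda_cores(self) -> int:
        return self.gpus * self.cuda_cores_per_gpu

    @property
    def total_gpu_mem_gb(self) -> int:
        return self.gpus * self.gpu_mem_gb


# CORAL EA compute node: 2 POWER8+ sockets, 10 cores each, SMT8, 4 P100 GPUs
ea_node = NodeSpec(sockets=2, cores_per_socket=10, smt_per_core=8,
                   gpus=4, cuda_cores_per_gpu=3584, gpu_mem_gb=16)

print(ea_node.cores, ea_node.smt_threads, ea_node.cuda_cores, ea_node.total_gpu_mem_gb)
```

The derived values (20 cores, 160 SMT threads, 14,336 CUDA cores, 64 GB of GPU memory per node) match the figures given in the bullets.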

Sierra Systems

  • Sierra is a classified, 125-petaflop, IBM Power Systems AC922 hybrid architecture system comprised of IBM POWER9 nodes with NVIDIA Volta GPUs. Sierra is a Tri-lab resource sited at LLNL.
  • Unclassified Sierra systems are similar, but smaller, and include:
    • Lassen - a 22.5-petaflop system located on LC's CZ zone
    • rzansel - a 1.5-petaflop system located on LC's RZ zone
  • IBM Power Systems AC922 Server: Hybrid architecture using IBM POWER9 processors and NVIDIA Volta GPUs.
  • IBM POWER9 processors (compute nodes):
    • 2 per node (dual-socket)
    • 22 cores/socket; 44 cores per node
    • 4 SMT threads per core; 176 SMT threads per node
    • Clock: due to adaptive power management options, the clock speed can vary depending upon the system load. At LC speeds can vary from approximately 2.3 - 3.8 GHz. LC can also set the clock to a specific speed regardless of workload.
  • NVIDIA GPUs:
    • 4 NVIDIA Tesla V100 (Volta) GPUs per compute, login, launch node
    • 5120 CUDA cores per GPU; 20,480 per node
  • Memory:
    • 256 GB DDR4 per compute node
    • 16 GB HBM2 (High Bandwidth Memory 2) per GPU; 900 GB/s peak bandwidth
  • NVLINK 2.0:
    • Interconnect for GPU-GPU and CPU-GPU shared memory
    • 6 links per GPU with 300 GB/s total bandwidth
  • NVRAM: 1.6 TB NVMe PCIe SSD per compute node
  • Network:
    • Mellanox 100 Gb/s Enhanced Data Rate (EDR) InfiniBand
    • One dual-port 100 Gb/s EDR Mellanox adapter per node
  • Parallel File System: IBM Spectrum Scale (GPFS)
  • Batch System: IBM Spectrum LSF
  • Water (warm) cooled compute nodes
  • Additional information:

Sierra

System Details
For each cell with two values, the CPU value is listed first and the GPU value second, separated by a semicolon.

Cluster | OCF/SCF | Architecture | Clock Speed (GHz) | Nodes; GPUs | Cores / Node; Cores / GPU | Cores Total | Memory / Node; Memory / GPU (GB) | Memory Total (GB) | TFLOPS Peak | Switch | ASC/M&IC
sierra | SCF | IBM Power9; NVIDIA Tesla V100 (Volta) | 2.3-3.8; 1530 MHz | 4320; 4320*4 | 44; 5120 | 190,080; 88,473,600 | 256; 16*4 | 1,101,920; 276,480 | 125,000 | IB EDR | ASC
lassen | OCF | IBM Power9; NVIDIA Tesla V100 (Volta) | 2.3-3.8; 1530 MHz | 774; 774*4 | 44; 5120 | 34,056; 15,851,520 | 256; 16*4 | 198,144; 49,536 | 22,508 | IB EDR | ASC/M&IC
rzansel | OCF | IBM Power9; NVIDIA Tesla V100 (Volta) | 2.3-3.8; 1530 MHz | 54; 54*4 | 44; 5120 | 2,376; 1,105,920 | 256; 16*4 | 13,824; 3,456 | 1,570 | IB EDR | ASC
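
The system-wide core and GPU memory totals in the table scale linearly with node count. The following sketch recomputes them from the per-node values given for the Sierra systems (44 CPU cores, 4 GPUs, 5120 CUDA cores and 16 GB per GPU). The function and dictionary names are illustrative, not part of any LC tool.

```python
# Per-node values for Sierra-class systems, as quoted in the text above
CPU_CORES_PER_NODE = 44
GPUS_PER_NODE = 4
CUDA_CORES_PER_GPU = 5120
GPU_MEM_GB = 16

# Node counts from the System Details table
NODES = {"sierra": 4320, "lassen": 774, "rzansel": 54}


def system_totals(nodes: int) -> dict:
    """Return system-wide totals derived from the node count."""
    gpus = nodes * GPUS_PER_NODE
    return {
        "cpu_cores": nodes * CPU_CORES_PER_NODE,
        "gpus": gpus,
        "cuda_cores": gpus * CUDA_CORES_PER_GPU,
        "gpu_mem_gb": gpus * GPU_MEM_GB,
    }


for name, count in NODES.items():
    totals = system_totals(count)
    print(f"{name}: {totals['cpu_cores']:,} CPU cores, "
          f"{totals['cuda_cores']:,} CUDA cores, {totals['gpu_mem_gb']:,} GB GPU memory")
```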

Future Systems

Advanced Technology Systems (ATS)

  • Supercomputers dedicated to the largest and most complex calculations critical to stockpile stewardship; "capability computing"
  • Typically include leading-edge/novel architecture components, custom engineering
  • Shared across the Tri-labs; accounts granted to projects via a formal proposal process
  • ATS-3 "Crossroads": Will be sited at LANL
  • ATS-4 "El Capitan": Will be sited at LLNL

Commodity Technology Systems (CTS)

  • Robust, cost-effective systems to meet the day-to-day simulation workload needs of the ASC program; "work-horse, capacity computing"
  • Common Tri-Lab procurement with platforms delivered to all three labs; accounts handled independently by each lab.
  • CTS-1 systems are currently in production at all three labs.
  • CTS-2: TBA

Typical LC Linux Cluster

Basic Components

  • Currently, LC has several types of production Linux clusters based on the following processor architectures:
    • Intel Xeon 18-core E5-2695 v4 (Broadwell)
    • Intel Xeon 8-core E5-2670 (Sandy Bridge - TLCC2) with or without NVIDIA GPUs
    • Intel Xeon 12-core E5-2695 v2 (Ivy Bridge)
  • LC's Linux clusters differ in their configuration details; however, they share the same basic hardware building blocks:
    • Nodes
    • Frames / racks
    • High speed interconnect (most clusters)
    • Other hardware (file systems, management hardware, etc.)

Nodes

  • The basic building block of a Linux cluster is the node. A node is essentially an independent computer. Key features:
    • Self-contained, diskless, multi-core computer.
    • Low form-factor - Cluster nodes are very thin to save space.
    • Rack Mounted - Nodes are mounted compactly in a drawer fashion to facilitate maintenance, reduce footprint, etc.
    • Remote Management - There is no keyboard, mouse, monitor or other device typically used to interact with a computer. All node management occurs over the network from a "management" node.
  • Example:

Single compute node - CTS-1

  • In general, an LC production cluster has four types of nodes, based upon function, which can differ in configuration details:
    • Login
    • Interactive/debug
    • Batch
    • I/O and service nodes (unavailable to users)

Login nodes:

  • Every system has a designated number of login nodes - depends upon the size of the system. Some examples:
    agate = 2
    sierra = 5
    quartz = 14
    zin = 20
  • Login nodes are shared by multiple users
  • Primarily used for interactive work such as editing files, submitting batch jobs, compiling, running GUIs, etc.
  • Interactive use exclusively - login only nodes do not permit any batch jobs.
  • DO NOT run production jobs on login nodes! Remember, you are sharing login nodes with other users.

Interactive/debug (pdebug) nodes:

  • Most LC systems have nodes that are designated for interactive work.
  • Meant for testing, prototyping, debugging, and small, short jobs
  • Cannot be logged into (rsh) unless you already have a job running on them
  • Nodes run one job at a time - not shared like login nodes
  • Can also be used through the batch system

Batch (pbatch) nodes:

  • Comprise the majority of nodes on each system
  • Meant for production work
  • Work is submitted via a batch scheduler (Slurm, Moab, LSF)
  • Cannot be logged into (rsh) unless you already have a job running on them
  • Nodes run one job at a time - not shared like login nodes

Frames / Racks

  • Frames are the physical cabinets that hold most of a cluster's components:
    • Nodes of various types
    • Switch components
    • Other network and cluster management components
    • Parallel file system disk resources (usually in separate racks)
  • Vary in size/appearance between the different Linux clusters at LC.
  • Power and console management - frames include hardware and software that allow system administrators to perform most tasks remotely.
  • Example images below:


Frames - Sierra
 


Frames - Quartz

Scalable Unit

  • The basic building block of LC's production Linux clusters is called a "Scalable Unit" (SU). An SU consists of:
    • Nodes (compute, login, management, gateway)
    • First stage switches that connect to each node directly
    • Miscellaneous management hardware
    • Frames sufficient to house all required hardware
    • Additionally, second stage switch hardware is needed to connect multi-SU clusters (not shown).
  • The number of nodes in an SU depends upon the type of switch hardware being used. For example:
    • QLogic = 162 nodes
    • Intel Omni-Path = 192 nodes
  • Multiple SUs are combined to create a cluster. For example:
    • 2 SU = 324 / 384 nodes
    • 4 SU = 648 / 768 nodes
    • 8 SU = 1296 / 1536 nodes
  • The SU design is meant to:
    • Standardize configuration details across the enterprise
    • Easily "grow" clusters in incremental units
    • Leverage procurements and reduce costs across the Tri-labs
  • An example of a 2 SU cluster is shown below for illustrative purposes. Note that a frame holding the second level switch hardware is not shown.
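
Because clusters are built from whole SUs, the node count of a cluster is simply the number of SUs multiplied by the SU size for its switch type. The sketch below uses the SU sizes quoted above; the function names are hypothetical, and the inverse calculation assumes a cluster consists only of complete SUs.

```python
# Nodes per Scalable Unit, by switch type (values from the list above)
NODES_PER_SU = {"qlogic": 162, "omni-path": 192}


def cluster_nodes(num_su: int, switch: str) -> int:
    """Total nodes in a cluster built from num_su Scalable Units."""
    return num_su * NODES_PER_SU[switch]


def su_count(total_nodes: int, switch: str) -> int:
    """Number of SUs implied by a node count; assumes only whole SUs."""
    count, remainder = divmod(total_nodes, NODES_PER_SU[switch])
    if remainder:
        raise ValueError(f"{total_nodes} nodes is not a whole number of {switch} SUs")
    return count


print(cluster_nodes(2, "qlogic"), cluster_nodes(2, "omni-path"))   # 2 SU clusters
print(cluster_nodes(14, "omni-path"))                              # a 14 SU cluster
print(su_count(3072, "omni-path"))                                 # SUs implied by 3,072 nodes
```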

Interconnects

  • Types of interconnects:
    • Varies by cluster; a few clusters do not have interconnects.
    • Intel Xeon CTS-1 clusters use Intel Omni-Path.
    • Most other Intel Xeon clusters use 4x QDR (Quad Data Rate) QLogic InfiniBand
    • CORAL/Sierra clusters use Mellanox EDR (Enhanced Data Rate) InfiniBand
  • Bandwidths:
    • QLogic 4x QDR = 40 Gbits/sec
    • Intel Omni-Path = 100 Gbits/sec
    • Mellanox EDR = 100 Gbits/sec
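
Link rates are quoted in gigabits per second, while measured bandwidths are usually reported in gigabytes per second. As a rough illustration, the conversion is a division by 8. This sketch treats the quoted figure as a raw one-direction rate and ignores encoding and protocol overhead, so achievable data rates will be lower.

```python
# Quoted link rates in Gbits/sec (from the list above)
LINK_RATE_GBITS = {
    "QLogic 4x QDR": 40,
    "Intel Omni-Path": 100,
    "Mellanox EDR": 100,
}


def gbits_to_gbytes(gbits_per_sec: float) -> float:
    """Convert Gbits/sec to GB/sec (8 bits per byte), ignoring overhead."""
    return gbits_per_sec / 8.0


for name, rate in LINK_RATE_GBITS.items():
    print(f"{name}: {rate} Gbits/sec = {gbits_to_gbytes(rate)} GB/sec (raw, one direction)")
```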

Primary Components

Note: For additional details on Sierra systems see: hpc.llnl.gov/training/tutorials/using-lcs-sierra-system

Adapter Card

  • Communications processor packaged on network PCI Express adapter card.
  • Remote Direct Memory Access (RDMA) improves communication bandwidth by off-loading communications from the CPU.
  • Provides the interface between a node and a two-stage network.
  • Connected to a first stage switch by copper cable (most cases).
  • Types: Intel Omni-Path, QLogic 4x QDR IB, Mellanox EDR InfiniBand

Omni-Path Fabric Adapter
(Image source: Intel)

QLogic IB Adapter
(Image source: QLogic)

Mellanox EDR InfiniBand Adapter
(Image source: Mellanox)

1st Stage Switch:

  • Intel Omni-Path 48-port: 32 ports connect to adapters in nodes and 16 ports connect to second stage switches.
  • QLogic QDR 36-port: 18 ports connect to adapters in nodes and 18 ports connect to second stage switches.
  • Mellanox Switch-IB 36-port: 18 ports connect to adapters in nodes and 12 ports connect to second stage switches.

2nd Stage Switch:

  • Intel Omni-Path 768-port: all used ports connect to a first stage switch via optic fiber cabling.
  • QLogic QDR 18-864 port: all used ports connect to a first stage switch via optic fiber cabling.
  • Mellanox CS7500 648-port: all used ports connect to a first stage switch via optic fiber cabling.
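
The port splits above determine how many first stage switches a group of nodes needs. The sketch below is a simplified estimate that assumes each node has a single link to exactly one first stage switch; the function name is illustrative. It uses the node-facing port counts for the Intel Omni-Path and QLogic switches together with the SU sizes quoted earlier (192 and 162 nodes).

```python
import math


def first_stage_switches(nodes: int, node_ports_per_switch: int) -> int:
    """Minimum first stage switches needed, assuming one link per node."""
    return math.ceil(nodes / node_ports_per_switch)


# Omni-Path: 32 node-facing ports per 48-port switch, 192-node SU
print(first_stage_switches(192, 32))

# QLogic QDR: 18 node-facing ports per 36-port switch, 162-node SU
print(first_stage_switches(162, 18))
```

Under this assumption, one SU maps onto 6 Omni-Path or 9 QLogic first stage switches with no node-facing ports left over.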

QLogic 1st and 2nd Stage Switches (back)

Topology

  • Two-stage, federated, bidirectional, fat-tree.
  • Examples:

2688-way Interconnect
Jade - 14 SU

Sierra cluster

Performance

  • The inter-node bandwidth measurements below were taken on live, heavily loaded LC machines using a simple MPI non-blocking test code. One task on each of two nodes. Message size = 1 MB. Not all systems are represented. Your mileage may vary.
System Type | Latency | Bandwidth
Intel Xeon Clusters with QDR QLogic | ~1-2 us | ~4.1 GB/sec
Intel Xeon Clusters with QDR QLogic (TLCC2) | ~1 us | ~5.0 GB/sec
Intel Xeon Clusters with Intel Omni-Path (CTS-1) | ~1 us | ~21 GB/sec
Sierra Clusters with Mellanox EDR Infiniband | ~1 us | ~21 GB/sec
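
To get a feel for what these numbers mean for a single message, a common first-order estimate is time = latency + size / bandwidth. This simple model is an assumption introduced here for illustration; it is not the test code used for the measurements, and it treats 1 MB as 1,000,000 bytes.

```python
def transfer_time_us(nbytes: int, latency_us: float, bandwidth_gb_s: float) -> float:
    """Estimated message time in microseconds: latency + size / bandwidth."""
    # 1 GB/sec = 1e9 bytes/sec = 1e3 bytes per microsecond
    return latency_us + nbytes / (bandwidth_gb_s * 1e3)


MESSAGE_BYTES = 1_000_000  # 1 MB message, assuming decimal megabytes

# (approximate latency in us, approximate bandwidth in GB/sec) from the table
measurements = {
    "QDR QLogic (TLCC2)": (1.0, 5.0),
    "Intel Omni-Path (CTS-1)": (1.0, 21.0),
}

for system, (latency, bandwidth) in measurements.items():
    estimate = transfer_time_us(MESSAGE_BYTES, latency, bandwidth)
    print(f"{system}: ~{estimate:.0f} us per 1 MB message")
```

For messages this large the bandwidth term dominates; latency matters mainly for small messages.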

Facilities, Machine Room Tours, Photos

Facilities

  • Most of LC's computing resources are located in the Livermore Computing Complex (LCC) building 453, and buildings 451 and 654. The LCC was formerly known as the Terascale Simulation Facility (TSF).
  • Map available here
  • LCC highlights:
    • Four-story office tower with 121,600 square feet for 285 offices, a visualization theater, a 150-seat auditorium, and several conference rooms on each floor.
    • Machine room with 48,000 square feet of unobstructed computer room floor
    • 30 megawatts machine power capacity
    • Mechanical cooling system with cooling towers boasting total capacity of 12,600 gallons per minute, a chiller plant with total capacity of 7,200 tons, and air handlers with a total capacity of 2,720,000 cubic feet per minute
    • 3,600-gallon-per-minute, closed-loop, liquid-cooling system for Sequoia that can cool up to 9.6 megawatts.
  • LC's building 654 comprises 6,000 square feet of computer floor space and is scalable up to 7.5 MW. B654 schematic drawing
  • Additional reading/viewing:

Machine Room Tours

  • LLNL hosts can request tours of the B453 machine room for visitors and groups. Hosts are responsible for providing Administrative Escorts (AE) and ensuring AE policies/rules are followed.
  • Tour participants must be U.S. citizens.
  • For LCC building 453 tour information, contact hpc-tours@llnl.gov.
  • Summer students: "Virtual" tours are offered for summer students. See the Lab Events Calendar for details and registration: ebb.llnl.gov.

Machine Photos

  • Photo collections of some LC systems, present and past, are available at this internal URL that requires authentication: lc.llnl.gov/confluence/display/gallery/Photo+Gallery+of+LC+Systems

Accounts

  • The process for obtaining an LC account varies, depending upon factors such as:
    • Lab employee?
    • Collaborator (non-employee)?
    • Foreign national?
    • Classified or unclassified?
  • It also involves more than one account processing system:
  • Because things can get a little complex, you should consult the LC accounts documentation at: hpc.llnl.gov/accounts.
  • One Time Password (OTP) Tokens:
    • For OCF accounts, you will receive an RSA One-time Password (OTP) token via US mail. Instructions on how to activate and use this token are included with your account notification email.
    • For OCF RZ accounts, you will also receive an RZ RSA OTP token.
    • For SCF accounts, you will be asked to visit the LC Hotline to obtain your OTP token and set up your PIN.
  • Required training: All account requests require completion of online training before they are activated.
  • Annual renewal: Accounts are subject to annual revalidation and completion of online training.
  • Foreign national accounts require additional processing and take longer to set up.
  • Virtual Private Network (VPN) account: May also be required for remote access. Discussed later under "VPN Remote Access Service."
  • Questions? Contact the LC Hotline: (925) 422-4533 lc-support@llnl.gov

Accessing LC Systems

Passwords, Authentication, and OTP Tokens

One-time Passwords (OTP)

  • Single-use passwords are mandatory on all LC machines: classified and unclassified.
  • Based upon "two-factor" authentication:
    • static, 4-8 character alphanumeric PIN for every user
    • 6-digit random number generated by an RSA SecurID token device (similar to a CRYPTOcard).
  • OTP authentication is also used for other services:
    • Access to internal web pages
    • Remote Access Services such as VPN (discussed later)
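
As a rough illustration of the two factors, the sketch below checks that a string has the shape of a 4-8 character alphanumeric PIN followed by a 6-digit token code. It assumes the passcode is entered as the PIN immediately followed by the token code, which this tutorial does not state explicitly, and it only checks the format; it performs no authentication. The function name is hypothetical.

```python
import re

# Assumed passcode layout: 4-8 alphanumeric PIN characters, then 6 token digits
PASSCODE_PATTERN = re.compile(r"[A-Za-z0-9]{4,8}\d{6}")


def looks_like_passcode(text: str) -> bool:
    """Return True if text has the shape PIN (4-8 alphanumerics) + 6 digits."""
    return PASSCODE_PATTERN.fullmatch(text) is not None


print(looks_like_passcode("abcd123456"))   # 4-character PIN + 6 digits
print(looks_like_passcode("abc123456"))    # too short to hold a 4-character PIN
```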

OCF Collaboration Zone (CZ) or Restricted Zone (RZ)

  • LC's unclassified HPC systems are configured into two separate zones:
  • Collaboration Zone (CZ):
    • Most unclassified HPC clusters are in CZ, which will permit Foreign Nationals.
    • CZ machines can be accessed directly from anywhere on the Internet.
    • Authenticate with your LC username and PIN + OTP RSA token.
    • LLNL's VPN service is not required.
    • LANL/Sandia users: see the LANL/Sandia Access Methods section for differences.
  • Restricted Zone (RZ):
    • For programmatic and export control reasons, selected machines are in the RZ.
    • RZ machines have names that begin with "rz", such as rzansel, rztopaz, rzslic, etc.
    • Foreign National access is very limited.
    • Authentication requires your LC username and a separate RZ PIN and RZ OTP token.
    • Access to the RZ from outside LLNL also requires using LLNL's VPN service (discussed later).
    • LANL/Sandia users: see the LANL/Sandia Access Methods section for differences.
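
The CZ/RZ rules above can be summarized in a few lines. This sketch is a simplification for OCF hosts only: it relies on the "rz" naming convention described above and on the stated VPN requirement for reaching the RZ from outside LLNL. It does not cover SCF systems or the LANL/Sandia differences, and the function names are hypothetical.

```python
def ocf_zone(hostname: str) -> str:
    """Classify an OCF host as 'RZ' or 'CZ' using the 'rz' name prefix."""
    return "RZ" if hostname.lower().startswith("rz") else "CZ"


def vpn_required(hostname: str, outside_llnl: bool) -> bool:
    """RZ access from outside LLNL needs VPN; CZ access does not."""
    return ocf_zone(hostname) == "RZ" and outside_llnl


for host in ("quartz", "rzansel", "rztopaz"):
    print(host, ocf_zone(host), "VPN needed from outside LLNL:", vpn_required(host, True))
```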

SCF Authentication

  • LC's classified systems use the same RSA OTP token as the CZ.
  • Authentication requires your LC username, SCF PIN and RSA OTP token.
  • LANL/Sandia users: see the LANL/Sandia Access Methods section for differences.

Problems?

  • Under certain circumstances, an OTP server and your token may get out of sync. In such cases it is necessary to enter two consecutive token codes so the server can resynchronize itself.
  • You may also need/want to change your PIN.
  • Both of these actions can be performed via the OTP web pages listed below:
  • Contact the LC Hotline if problems persist, or for other token related issues/questions: (925) 422-4533 lc-support@llnl.gov

SSH and Access Methods

SSH Required

  • Secure Shell (SSH) is required for access to all LC systems, whether you are internal to LC or external, whether you are on the OCF or the SCF.
  • The main advantages of SSH are:
    • No clear text password goes over network
    • The data stream is encrypted
    • Use of RSA/DSA authentication between LC clusters
  • Mac and Linux users:
    • SSH is included on Mac and Linux platforms
    • Can simply be used from a terminal window command line. Examples:
ssh joeuser@quartz.llnl.gov
ssh -l joeuser sierra.llnl.gov
  • Windows PC users:
    • Typically need to install an SSH app such as X-Win32 (provided by LLNL via LANDESK Portal Manager) or PuTTY. Searching the web will reveal other options.
    • X-Win32 instructions are available at: hpc.llnl.gov/manuals/access-lc-systems/x-win32-configuration.
    • Windows 10 provides an OpenSSH client, which can be used from a Command Prompt window or PowerShell window. Note that you will probably need to specify the MAC (message authentication code) type. Examples:
ssh -m hmac-sha2-256 joeuser@quartz.llnl.gov
ssh -m hmac-sha2-512 -l joeuser sierra.llnl.gov
    • To avoid the need to enter a MAC type each time, simply create a C:\Users\joeuser\.ssh\config file and add the following line to it:
MACs hmac-sha2-256,hmac-sha2-512

Collaboration Zone (CZ) Access Methods

  • CZ machines can be accessed directly from anywhere on the Internet.
  • Simply use SSH (or for Windows, use your favorite SSH app) and connect to a cluster where you have an account.
  • Authenticate with your LC username and CZ PIN + RSA OTP token.

Restricted Zone (RZ) Access Methods

  • Requires the use of an LC RZ OTP token, not to be confused with the RSA OTP token.
  • From inside LLNL:
    • You must be inside the RZ network or inside the LLNL institutional network. Access from the CZ is not permitted.
    • Use SSH (or for Windows, use your favorite SSH app) and connect to an RZ cluster where you have an account.
    • Authenticate with your LC username and RZ PIN + RZ OTP token
  • From outside LLNL:
    • Must first have a VPN Remote Access Service account and software setup (discussed later)
    • Then, start up and authenticate to VPN using your LLNL OUN (Official User Name) and your CZ PIN + RSA OTP token
    • Use SSH (or for Windows, use your favorite SSH app) to connect to an RZ cluster where you have an account.
    • Authenticate with your LC username and RZ PIN + RZ OTP token

SCF Access Methods

  • From inside the LLNL classified (iSRD) network:
    • Simply use SSH (or for Windows, use your favorite SSH app) and connect to a cluster where you have an account.
    • Authenticate with your LC username and SCF PIN + RSA OTP token
  • From outside the LLNL classified (iSRD) network:
    • Must be part of the DOE SecureNet network.
    • Then, simply use SSH (or for Windows, use your favorite SSH app) and connect to a cluster where you have an account.
    • Authenticate with your LC username and SCF PIN + RSA OTP token

Storage and FIS Access Methods

  • CZ-only users may access Storage and FIS from CZ machines and desktops
  • RZ-only users may access Storage and FIS from RZ machines and desktops
  • CZ+RZ users may access Storage and FIS from RZ machines and desktops, but not from CZ machines.
  • SCF users may access Storage and FIS from SCF machines and iSRD desktops.
  • For details see the following:

Web Page Access

  • The majority of LC's web pages at hpc.llnl.gov are publicly available over the Internet without the need for authentication.
  • Web pages located on LC's Confluence Wikis (CZ/RZ/SCF) require authentication with an LC username and the relevant domain PIN + OTP token.
  • Likewise, web pages located on the MyLC portals require the appropriate LC authentication method.
  • Notes:
    • LANL/Sandia users: see LANL/Sandia Access Methods below.
    • LLNL's institutional web pages (e.g., LITE, LAPIS, LTRAIN) are unrelated to LC web pages and usually require LLNL OUN/PAC or Active Directory authentication (not covered here).

LANL/Sandia Access Methods

  • Tri-lab access methods differ from those used by other users, as described below.
  • CZ Access:
    • Begin on a LANL/Sandia iHPC login node. For example, at LANL start from ihpc-gate1.lanl.gov; at Sandia start from ihpc.sandia.gov.
    • Then connect to an LC cluster in the CZ using your LC username
    • Authentication: No password required
ssh -l lc-username loginmachine.llnl.gov
  • RZ Access:
    • Begin on a LANL/Sandia iHPC login node. For example, at LANL start from ihpc-gate1.lanl.gov; at Sandia start from ihpc.sandia.gov.
    • Then connect to an LC cluster in the RZ using your LC username
    • Authentication: RZ PIN + RZ OTP token
ssh -l lc-username loginmachine.llnl.gov
  • SCF Access:
    • Login to a local, classified HPC system
    • Then connect to an LC SCF cluster using your LC username
    • Authentication: No password required
ssh -l lc-username loginmachine.llnl.gov

A Few More Words About SSH

OpenSSH

  • All OCF and SCF production machines use OpenSSH.
  • OpenSSH supports both RSA and DSA authentication.
  • OpenSSH home page: www.openssh.com

RSA/DSA Authentication (SSH Keys)

  • By default, SSH will authenticate in secure password mode. That is, when host1 does an ssh to host2, and is prompted for a userid and password, the information will be sent in encrypted form to host2. That way, passwords cannot be "sniffed" or sent "clear text" over the network.
  • One of the features of SSH is that it allows you to bypass this usual login method (userid/password) by setting up RSA/DSA authentication keys. Both are supported by OpenSSH.
  • The RSA/DSA key authentication methods allow you to optionally:
    • Improve security even more by requiring a login passphrase, which can be much longer than the typical UNIX password
    • Relax the need to enter a userid/password at all. There are known security risks with this convenience.
  • In a nutshell, creating RSA/DSA keys with OpenSSH is a one-time deal that can be done as follows:
    • Execute ssh-keygen -t type where type is either "rsa" or "dsa". Take your pick.
    • When prompted, enter a passphrase if you want improved security. If you want the convenience of being able to ssh into other LC OpenSSH machines without entering a userid/password, don't enter anything.
    • After the command completes, cd to your .ssh directory and copy the file which ends in .pub to a file named authorized_keys. This is your public key. For example:
cp id_dsa.pub authorized_keys
  • Because all OCF/SCF machines share the same home directory, you don't need to copy your public key file to each host. One copy does the trick.
  • Make sure that your .ssh files are readable only by you!!!
  • Use of ssh keys is permitted only between LC machines - not from outside the LC network or from desktop office machines.
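
The permissions warning above can be applied with a couple of chmod calls. The following is a minimal sketch, assuming the default ~/.ssh location; the function name secure_ssh_dir is made up for this example and is not an LC utility.

```shell
# secure_ssh_dir: hypothetical helper, not an LC command.
# Restrict an SSH directory so that only its owner can read it.
secure_ssh_dir() {
  dir=${1:-$HOME/.ssh}
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  chmod 700 "$dir"                           # directory: owner only
  # Every regular file inside: owner read/write only
  find "$dir" -type f -exec chmod 600 {} +
}
```

After generating keys with ssh-keygen and copying the .pub file to authorized_keys, calling secure_ssh_dir with no argument tightens ~/.ssh in place.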

SSH Timeouts

  • If you find that your sessions are being disconnected too quickly due to lack of keyboard interaction try either of the following:
  • Use the two options below with your ssh command:
-o ServerAliveInterval=60 -o ServerAliveCountMax=30
  • Create a .ssh/config file and include the two lines below in it:
ServerAliveInterval=60
ServerAliveCountMax=30
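
For the config-file route, one possible way to add the settings without duplicating them is sketched below. The function name add_keepalive is hypothetical. Wrapping the two settings in a "Host *" section is a choice made for this sketch: OpenSSH applies options to the most recent Host section, so bare lines appended to a file that already ends with a host-specific section would only affect that host.

```shell
# add_keepalive: hypothetical helper, not an LC command.
# Appends the keepalive settings to an ssh client config file
# unless a ServerAliveInterval setting is already present.
add_keepalive() {
  cfg=${1:-$HOME/.ssh/config}
  if [ -f "$cfg" ] && grep -qi '^[[:space:]]*ServerAliveInterval' "$cfg"; then
    return 0    # already configured, leave the file alone
  fi
  # "Host *" makes the settings apply to every host, even when the
  # file already ends with a host-specific section.
  printf 'Host *\n    ServerAliveInterval 60\n    ServerAliveCountMax 30\n' >> "$cfg"
  chmod 600 "$cfg"
}
```

With these values the client probes the server after 60 seconds of silence and gives up after 30 unanswered probes, roughly 30 minutes.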

SSH and X11

  • If you are logged into an LC cluster from your desktop, and are running applications that generate graphical displays, you will need to have X11 setup on your desktop.
  • Linux: automatic - nothing special needs to be done in most cases
  • Macs: you'll need X server software installed. XQuartz is commonly used (www.xquartz.org/).
  • Windows: you'll need X server software installed. LLNL provides X-Win32, which can be downloaded/installed from your desktop's LANDesk Management software. Xming is a popular, free X server available for non-LLNL systems.
  • Helpful Hints:
    • X-Win32 setup instructions for LLNL: hpc.llnl.gov/manuals/access-lc-systems/x-win32-configuration
    • It's usually not necessary to define your DISPLAY variable in an SSH session between LC hosts. It should be picked up automatically.
    • Make sure your X server is setup to allow tunneling/forwarding of X11 connections BEFORE you connect to the LC host.
    • Often, you need to supply the -X or -Y flag to your ssh command to enable X11 forwarding.
    • May also try setting the two parameters below in your .ssh/config file:
ForwardX11=yes
ForwardX11Trusted=yes
  • Use the verbose option to troubleshoot problems:
ssh -v [other options] [host]
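
A simple first check after logging in is whether DISPLAY was picked up at all. The sketch below is only an illustration (check_display is a made-up name). It relies on the usual OpenSSH behavior of setting DISPLAY on the remote side when X11 forwarding is active.

```shell
# check_display: hypothetical helper, not an LC command.
# Reports whether the current session has a DISPLAY to draw on.
check_display() {
  if [ -n "${DISPLAY:-}" ]; then
    echo "X11 forwarding looks active: DISPLAY=$DISPLAY"
  else
    echo "DISPLAY is not set: reconnect with ssh -X or ssh -Y"
  fi
}
```

If DISPLAY is set but windows still fail to appear, the ssh -v output is the next place to look.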

Need SSH?

  • Linux, Macs: included as part of the operating system
  • Windows: LLNL provides X-Win32 for lab machines. Can be downloaded from the LLNL Software Portal via your LANDesk Management software:
All Programs --> LANDesk Management --> Desktop Manager
  • Free versions, such as PuTTY, are available for most platforms - search the web

More Information

Where to Login

Login Nodes

  • LC clusters have specific nodes dedicated to user login sessions.
  • Login nodes are shared by multiple users.
  • LC provides a "generic" login alias (cluster login) for each cluster. The cluster login automatically rotates between available login nodes for load balancing purposes.
  • For example: sierra.llnl.gov is the cluster login alias - which could be any of the physical login nodes.
  • Users don't need to know (in most cases) the actual login node they are rotated onto - unless there are problems. Using the hostname command will indicate the actual login node name for support purposes.
  • If the login node you are on is having problems, you can ssh directly to another one. To find the list of available login nodes, use the command: nodeattr -c login
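
The last two hints can be combined into a small helper that picks a different login node from the one you are on. This is a sketch under stated assumptions: other_login_node is a made-up name, nodeattr -c is assumed to print a comma-separated node list, hostname is assumed to print the short node name, and the node names in the demonstration line are invented.

```shell
# other_login_node: hypothetical helper, not an LC command.
# Reads a comma-separated node list on stdin and prints the first
# node that is not the one named in $1.
other_login_node() {
  tr ',' '\n' | grep -vxF -- "$1" | head -n 1
}

# On an LC login node this might be used as:
#   ssh "$(nodeattr -c login | other_login_node "$(hostname)")"

# Demonstration with invented node names:
echo "quartz188,quartz380,quartz386" | other_login_node quartz188   # prints quartz380
```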

Logging into Compute Nodes

  • LC permits users to login to compute nodes on Linux clusters while they have a job running there.
  • Accessing BG/Q compute nodes is not permitted.
  • Very useful for debugging running jobs
  • Several commonly used commands can be used to determine which nodes your job is using, such as: squeue, checkjob, sview
  • Nodes are named as: [system][#]. For example:
borax8
zin223
sierra309
jade122
quartz1022
  • Note: You can use either rsh or ssh to access compute nodes.
  • You can also use LC's mxterm/sxterm utilities to acquire compute nodes for "interactive" work.
  • How to use mxterm/sxterm:
  1. Starting from your desktop machine, make sure you have your X11 environment setup correctly
  2. ssh to an LC cluster login node
  3. Issue the command as follows:
mxterm #nodes #tasks #minutes
sxterm #nodes #tasks #minutes

Where:
#nodes = number of nodes your job requires
#tasks = number of tasks your job requires
#minutes = how much time your job needs

  4. This will submit a batch job for you that will open an xterm on your desktop when it starts to run.
  5. After the xterm appears, you will be on a compute node and can do your work interactively.
  6. These utilities do not have man pages; however, you can view the usage information by simply typing the name of the command.
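
To make the argument order concrete, the sketch below checks three values and prints the mxterm command line they would produce, without submitting anything. The wrapper name mxterm_cmd and the values used are illustrative only. The real utility may accept additional options, which its usage message will show.

```shell
# mxterm_cmd: hypothetical dry-run helper, not an LC command.
# Validates #nodes #tasks #minutes and prints the resulting command.
mxterm_cmd() {
  if [ "$#" -ne 3 ]; then
    echo "usage: mxterm_cmd nodes tasks minutes" >&2
    return 2
  fi
  for arg in "$@"; do
    case "$arg" in
      ''|0|*[!0-9]*) echo "not a positive integer: $arg" >&2; return 2 ;;
    esac
  done
  echo "mxterm $1 $2 $3"
}

mxterm_cmd 2 72 30    # prints: mxterm 2 72 30  (2 nodes, 72 tasks, 30 minutes)
```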

VPN Remote Access Service

  • Use of a Remote Access Service (usually VPN) is required if you are outside of the LLNL internal network, and wish to access:
    • Institutional network services (LITE, LTRAIN, email, etc.)
    • Livermore Computing Restricted Zone (RZ) compute resources
  • Provided by the Cyber Security Program
  • Does not apply to the SCF
  • Not required for access to LC OCF Collaboration Zone (CZ) machines
  • To request LLNL VPN access, download software and see setup instructions, go to: access.llnl.gov/vpn/.
  • LLNL also offers a browser-based SSL VPN Web Portal:
    • The web portal should be used for Internet kiosks, such as at an airport or a conference, to access LLNL systems from off-site.
    • This service can be used for submitting your timecard or sending unencrypted email.
    • For details, see the link provided above.

SecureNet

  • SecureNet is the network that provides access between classified systems at DOE national laboratories and facilities:
    • LLNL
    • LANL
    • Sandia (New Mexico)
    • Sandia (California)
    • Honeywell Kansas City Plant
    • Pantex Plant
    • Westinghouse Savannah River Site
    • Y-12 National Security Complex
  • All LC classified systems must be accessed over SecureNet from non-LLNL systems.
  • Non-Tri-Lab users who wish to access LLNL classified resources require a SecureNet account in addition to an SCF account.
  • For a SecureNet account application form see: hpc.llnl.gov/accounts/forms.

File Systems

Home Directories and Login Files

Home Directories

  • LC user home directories are global to their network partition: 1 home directory system for the SCF, 1 for the OCF-CZ and 1 for the OCF-RZ.
  • Naming scheme: /g/g#/user_name. Examples:

/g/g15/joeuser
/g/g0/joestaff

  • Backups:
    • Online: .snapshot directories - twice daily
    • Daily incremental
    • Monthly
    • Bi-annual offsite disaster recovery
    • See the Backups section for details
  • NFS mounted:
    • Access is slower than local or parallel file systems
    • Not recommended for parallel I/O - may cause NFS server problems
  • Quota in effect - see the Quotas section for details.

LC's Login Files

  • Your login shell is established when your LC account is initially setup. The usual login shells are supported:
    /bin/bash
    /bin/csh
    /bin/ksh
    /bin/sh
    /bin/tcsh
    /bin/zsh
  • All LC users automatically receive a set of login files. These include:
.cshrc        .kshenv       .login        .profile
              .kshrc        .logout
.cshrc.linux  .kshrc.linux  .login.linux  .profile.linux
  • The files that are "sourced" when you log in depend upon your shell.
  • Note for bash and zsh users: LC does not provide .bashrc, .bash_profile, .zprofile or .zshrc files at this time.

Operating System Specific Dot Files

  • LC also provides the ability for users to create dot files that are specific to a particular operating system.
  • These are not automatically provided - you need to create them yourself if you desire this feature.
  • Naming of these files is based upon the SYS_TYPE environment variable. For example, the file .cshrc.chaos_5_x86_64_ib will only be sourced on systems running the chaos_5_x86_64_ib operating system.
  • How to find SYS_TYPE?
    • echo $SYS_TYPE
    • Look in the /etc/home.config file
  • Note: operating systems change, so it is up to the user to keep such files current.
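
The naming rule can be sketched as a one-line helper that prints the name a system-specific dot file would need on the current machine. The function name sys_dotfile is hypothetical; the only assumption about LC's mechanism is the base-name.$SYS_TYPE pattern described above.

```shell
# sys_dotfile: hypothetical helper, not an LC command.
# Prints the path of the SYS_TYPE-specific variant of a dot file.
sys_dotfile() {
  if [ -z "${SYS_TYPE:-}" ]; then
    echo "SYS_TYPE is not set" >&2
    return 1
  fi
  echo "$HOME/$1.$SYS_TYPE"
}

# Example: create an empty system-specific .cshrc for this machine
#   touch "$(sys_dotfile .cshrc)"
```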

A Few Hints

  • Login files contain some important settings that should not be modified. Read the comments inside the file for guidance.
  • Place your modifications carefully, especially interactive commands. Again, read the comments inside the file for guidance.
  • Some of the more insidious and "odd" behaviors users encounter occur due to modifications to dot files.

Need a New Copy?

  • If you accidentally delete or clobber a dot file, a fresh copy can be obtained:
    • Architecture specific: /gadmin/etc/arch/skel/ directory, where arch matches one of the supported architectures.
    • Master dot files can be copied from /gadmin/etc/skel/
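
A cautious way to pull a fresh copy is to keep the damaged file alongside the new one. The sketch below assumes the master files are stored under their usual dot names in the skeleton directory; restore_dotfile is a made-up name.

```shell
# restore_dotfile: hypothetical helper, not an LC command.
# Copies a master dot file into $HOME, saving any existing file as .bak.
restore_dotfile() {
  name=$1
  skel=${2:-/gadmin/etc/skel}
  if [ ! -r "$skel/$name" ]; then
    echo "no master copy: $skel/$name" >&2
    return 1
  fi
  # Keep the damaged file for reference instead of overwriting it.
  if [ -e "$HOME/$name" ]; then
    mv "$HOME/$name" "$HOME/$name.bak"
  fi
  cp "$skel/$name" "$HOME/$name"
}

# Example: restore_dotfile .cshrc
```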

/usr/workspace File Systems

  • LC provides 2 terabytes of NFS mounted file space for each user and group.
  • Located under /usr/workspace/username and /usr/workspace/groupname
  • /usr/workspace/username is accessible by the user only. /usr/workspace/groupname may be accessed by the group members.
  • Similar to home directory:
    • Cross mounted from appropriate clusters
    • Not purged
    • Includes .snapshot directory for twice-daily online backups
    • Not intended for parallel I/O
  • Different from home directory:
    • Not backed up
    • 7 days of .snapshot backups

Temporary File Systems

  • /tmp
    /usr/tmp
    /var/tmp
    • Different names for the same /tmp file system
    • Local to each individual node; very small compared to other temporary file systems
    • Note: Uses the node's local memory, which may impact the amount of memory left for the job running on the node.
    • Faster than NFS
    • No quota, no backups
    • Purged between batch jobs
  • /p/lustre#
    • Lustre parallel file systems
    • Global temporary file systems - shared by all users
    • Very large, multi-petabyte in size - varies by file system
    • Available on most OCF and SCF systems
    • Quotas are in place for the /p/lustre# file systems
    • Not subject to purging
    • No backups
  • /p/gpfs#
    /p/gscratch#
    • Large, temporary parallel file systems found on Sierra and CORAL EA systems
    • IBM Spectrum Scale product (formerly known as GPFS)
    • No quotas, subject to purging, no backups

Useful Commands

  • The following commands are useful for determining which file systems are mounted, how full a file system is, and how much space your files are consuming.
Command          Description
bdf              Easy-to-read listing of mounted file systems
df               Same as bdf but not as easy to read
df -h            Easier to read version of df
df filesystem    Displays info for a specified file system. Useful if the file system is not a mount point and doesn't show up on usual df list
du               Listing of space used by all files in the current directory
du -k            Same as du with size in Kbyte blocks. Kbytes is the default on Linux systems.
du -s            Summary of space used for all files in the current directory
du -ks           Combination of du -k and du -s
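
These commands combine well in pipelines. As one example, the sketch below lists the largest entries in a directory using du -sk and sort. The function name biggest is made up, and hidden files are skipped because the shell glob does not match them.

```shell
# biggest: hypothetical helper, not an LC command.
# Lists the largest entries (in Kbytes) directly under a directory.
biggest() {
  dir=${1:-.}
  count=${2:-10}
  # One summary line per entry, sorted numerically, largest last.
  du -sk -- "$dir"/* | sort -n | tail -n "$count"
}

# Example: the five largest items in the current directory
#   biggest . 5
```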

Parallel File Systems

  • In a typical cluster, most nodes are compute nodes where programs actually run. A subset of the system's nodes are dedicated to serve as I/O nodes. I/O nodes are also referred to as gateway nodes.
  • I/O nodes are the interface to disk resources. All I/O performed on compute nodes is routed to the I/O nodes over the internal switch network (such as InfiniBand).
  • The I/O nodes then send the I/O requests to storage servers over the SAN (Storage Area Network) which can be 10Gbit Ethernet or InfiniBand. The storage servers then perform the actual I/O to attached physical disk resources.
  • Individual files are stored as a series of "blocks" that are striped across the disks of different storage servers. This permits concurrent access by a multi-task application when tasks read/write to different segments of a common file.
  • Internally, file striping is set to a specific block size that is configurable. At LC, the most efficient use of parallel file systems is with large files. The use of many small files is not advised if performance is important.
  • Parallelism:
    • Simultaneous reads/writes to non-overlapping regions of the same file by multiple tasks
    • Concurrent reads and writes to different files by multiple tasks
    • I/O will be serial if tasks attempt to use the same stripe of a file simultaneously.
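
The first form of parallelism, several tasks writing non-overlapping regions of one file, can be imitated on any file system with dd. This is only a toy sketch: write_region is a made-up helper, the "tasks" are background shell jobs rather than MPI ranks, and the 4-byte block is chosen so the result is easy to read.

```shell
# write_region: hypothetical helper for illustration only.
# Writes one task's block into a shared file at byte offset
# rank * block, leaving every other region untouched.
write_region() {
  file=$1 rank=$2 block=$3
  # Fill the block with the rank digit (0-9) so regions are easy to see.
  head -c "$block" /dev/zero | tr '\0' "$rank" |
    dd of="$file" bs="$block" seek="$rank" conv=notrunc 2>/dev/null
}

# Four "tasks" write their own 4-byte region of one file concurrently.
shared=$(mktemp)
for rank in 0 1 2 3; do
  write_region "$shared" "$rank" 4 &
done
wait
cat "$shared"; echo    # prints 0000111122223333
rm -f "$shared"
```

Because each task seeks to its own offset and conv=notrunc prevents truncation, the order in which the tasks finish does not change the result.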

Parallel File Systems - Lustre

  • Most of LC's Linux clusters use Lustre parallel file systems.
  • To the user, it simply appears as another mounted file system
  • Naming scheme: /p/lustre# for Linux. For example:
% bdf | grep lustre
172.19.1.165@o2ib100:172.19.1.    4.9P   1.1P   3.9P  22%  /p/lustre3
172.19.3.1@o2ib600:172.19.3.2@     15P   3.8P    12P  25%  /p/lustre2
172.19.3.98@o2ib600:172.19.3.9     15P   622T    15P   5%  /p/lustre1
  • LC's Lustre parallel file systems are usually mounted by more than one Linux cluster.
  • No backups
  • /p/lustre# enforces quotas and is NOT subject to purging
  • For additional information also see: wiki.lustre.org

Parallel File Systems - IBM Spectrum Scale

  • LC's Sierra and CORAL EA systems use IBM's Spectrum Scale parallel file systems (formerly known as GPFS).
  • From a user perspective, they look and feel like Lustre parallel file systems on other LC clusters.
  • Naming scheme: /p/gpfs# for Sierra clusters, and /p/gscratch# for CORAL EA clusters. For example:
% bdf | grep gpfs
gpfs1                             140P    13P   127P   9%  /p/gpfs1
% bdf | grep gscratch
gpfs0                             1.3P   379T   915T  30%  /p/gscratchr
  • No backups, subject to purging, no quotas

LC Parallel File Systems Summary

  • Shows which clusters mount which parallel file systems. File system capacities are also shown.
  • As of Jan 2020. Subject to change.
OCF-CZ      lustre1   lustre2   lustre3   gscratchr   gpfs1
            15 PB     15 PB     4.9 PB    1.3 PB      24 PB
borax       X         X         -         -           -
boraxo      X         X         -         -           -
catalyst    X         X         X         -           -
corona      X         X         X         -           -
lassen      -         -         -         -           X
oslic       X         X         X         -           X - lassengpfs1
pascal      X         X         -         -           -
quartz      X         X         -         -           -
ray         -         -         -         X           -
surface     X         X         X         -           -
syrah       X         X         -         -           -

 

OCF-RZ      lustre1   czlustre1   czlustre2   czlustre3   gscratchrzm   gpfs1
            7.5 PB    15 PB       15 PB       4.9 PB      431 TB        1.5 PB
rzalastor   X         -           -           -           -             -
rzansel     -         -           -           -           -             X
rzgenie     X         -           -           -           -             -
rzhasgpu    X         -           -           -           -             -
rzmanta     -         -           -           -           X             -
rzslic*     X         X           X           X           -             X - rzanselgpfs1, lassengpfs1
rztopaz     X         -           -           -           -             -
rztrona     X         -           -           -           -             -

* For convenience, rzslic mounts CZ lustre and gpfs file systems to facilitate RZ-CZ data transfer.

SCF         lustre1   lustre2   gscratch9   gpfs1
            15 PB     15 PB     431 TB      140 PB
agate       X         X         -           -
cslic       X         X         -           X - sierragpfs1
jade        X         X         -           -
jadeita     X         X         -           -
magma       X         X         -           -
max         X         X         -           -
mica        X         X         -           -
shark       -         -         X           -
sierra      -         -         -           X
zin         X         X         -           -

Archival HPSS Storage

  • High Performance Storage System (HPSS) archival storage is available on both the OCF and SCF.
    • Provides "virtually unlimited" tape archive storage in the petabyte range. Both capacity and performance are continually increasing to keep up with the ever increasing user demand.
    • GigE connectivity to all production clusters
  • Primary components:
    • Server machines
    • RAID disk cache
    • Magnetic tape libraries
    • Jumbo frame GigE network
  • FTP client on LC production machines defaults to an enhanced parallel HPSS FTP client
  • No backups, no purge

Access Methods and Usage:

  • The HPSS system is named storage.llnl.gov on both the OCF and SCF
  • All LC users automatically receive an HPSS storage account with their regular production machine account.
  • Data Transfer Tools: The more commonly used ones are simply listed here and described in more detail in the File Transfer and Sharing section that follows later.
  • Recommendation: Initiate your file transfers from one of LC's special purpose clusters, which have been optimized for high-speed data movement to storage:
    oslic on the OCF-CZ
    rzslic on the OCF-RZ
    cslic on the SCF
  • These clusters exist solely for the purpose of offloading data to storage:
    • Multiple GigE connections to the network.
    • Users can start file transfers on multiple nodes, and have them running concurrently.
    • Observed transfer rates to storage are around 120 MB/sec per HTAR session.
    • A variety of file transfer tools in addition to HTAR are supported.
  • OCF RZ / CZ Restrictions:
    • Unlike most other resources, LC decided not to duplicate the HPSS system on the RZ. There is a single HPSS system on the OCF which serves both the CZ and RZ.
    • OCF-CZ only users can access storage from:
      • CZ machines
      • Desktop machines
    • OCF-RZ only and OCF-RZ+CZ users can access storage from:
      • RZ machines
      • Desktop machines via ftp rzarchive or ftp rzstorage. Authentication is with your RZ PIN + CRYPTOCard passcode.
    • For convenience, the rzslic cluster mounts both CZ and RZ parallel (lscratch) file systems. oslic mounts only the CZ parallel file systems.
  • HPSS can also be accessed from Tri-lab and other remote sites. Note that for remote access to OCF storage, VPN is required.
  • Storing dual-copy files in HPSS archival storage: For mission critical files, it is possible to store two copies at once using FTP, HSI, HTAR or NFT. Technical Bulletin 435 discusses how to accomplish this, located at: lc.llnl.gov/computing/techbulletins/bulletin435.pdf (requires authentication).
  • Quotas:
  • How much storage am I using? The aquota command provides this information. For example:
oslic5% kinit
[authenticate here]

oslic5% aquota
Welcome to HPSS Quota Server oslici.llnl.gov

aq> show allowance

Pool Name                 Pool Manager  Allowance
------------------------  ------------  ---------------
lcreserve                 lc-hotline             0.0 B
default                   lc-hotline             1.5 TB
   Total                                         1.5 TB

From 10/01/2019 through 02/06/2020:
     5 files created.
     46.8 GB of data used. 3.12% of total.
     Avg. Per Month:  11.1 GB

Total Data:      85.1 TB
Total Files:     2127240

aq>

Usage notes:

  • Currently, you must be logged into oslic / rzslic / cslic to use this command.
  • Use the aquota help subcommand for additional options.
  • You may need to authenticate with the kinit command first.

Additional Information

/usr/gapps, /usr/gdata File Systems

  • LC provides shared, collaborative, NFS file space for user developed and supported applications and data on LC systems:
File System           CZ   RZ   SCF   Notes
/usr/gapps            X    X    X     User applications. Distinct between CZ, RZ and SCF.
/collab/usr/gapps/    X    X    -     User applications. Shared between RZ and CZ.
/usr/gapps2           -    -    X     User applications.
/usr/gdata            X    X    X     User data. Distinct between CZ, RZ and SCF.
/collab/usr/gdata     X    X    -     User data. Shared between RZ and CZ.
  • Unlike your home directory, these file systems can be used (with approval) to share file space within a group or even the world.
  • For convenience, OCF-RZ users can use /collab/usr/gapps and /collab/usr/gdata to share files with OCF-CZ users.
  • Backups:
    • Online: .snapshot directories.
    • Daily incremental
    • Monthly
    • Bi-annual offsite disaster recovery
    • See the Backups section for details.
  • Never purged
  • Multiple architectures are handled through the $SYS_TYPE variable:
    • Every LC machine sets this environment variable to a specific string that matches its architecture. For example:
toss_3_x86_64_ib
blueos_3_ppc64le_ib
  • Versions of code built for specific architectures are placed in subdirectories named to match $SYS_TYPE strings
  • User scripts can select the appropriate code versions based upon the $SYS_TYPE setting. For example: cd /usr/gapps/myApp/$SYS_TYPE/bin
  • Requesting a directory within /usr/gapps: submit the LC USR_GAPPS form to create/change/delete a directory.
  • Sharing files and directories in your /usr/gapps directory with a group:
    • Create and manage UNIX groups: lc-idm.llnl.gov
    • Then use UNIX permissions to permit group sharing
  • For additional information see the /usr/gapps web page
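
The "use UNIX permissions" step for group sharing might look like the sketch below. It assumes the group already exists and that you own the directory; share_with_group is a made-up name, and the exact permissions your project needs may differ.

```shell
# share_with_group: hypothetical helper, not an LC command.
# Gives a UNIX group read access to a directory tree.
share_with_group() {
  dir=$1 group=$2
  chgrp -R "$group" "$dir" || return 1
  # g+rX: the group may read files and enter directories; files become
  # group-executable only if they were already executable.
  chmod -R g+rX "$dir"
  # setgid on directories so files created later inherit the group.
  find "$dir" -type d -exec chmod g+s {} +
}

# Example (directory and group names are placeholders):
#   share_with_group /usr/gapps/myApp mygroup
```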

Quotas

Home Directories

  • To check usage and limits: quota -v
  • Example:
% quota -v

Disk quotas for joeuser:
Filesystem     used   quota  limit    timeleft  files  quota  limit    timeleft
/g/g0          1.1G   24.0G  24.0G              3.9K   n/a    n/a     
/g/g10         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g11         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g12         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g13         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g14         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g15         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g16         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g17         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g18         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g19         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g20         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g21         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g22         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g23         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g24         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g90         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g91         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g92         -0-    24.0G  24.0G              -0-    n/a    n/a     
/g/g99         -0-    24.0G  24.0G              -0-    n/a    n/a     
/usr/gapps     -0-    n/a    n/a                -0-    n/a    n/a     
/collab/usr/gapps
               -0-    n/a    n/a                -0-    n/a    n/a     
/usr/give      -0-    25.0G  25.0G              -0-    n/a    n/a     
/usr/global    -0-    32.0G  32.0G              -0-    n/a    n/a     
/collab/usr/global
               -0-    32.0G  32.0G              -0-    n/a    n/a     
/usr/workspace/wsa
               -0-    n/a    n/a                -0-    n/a    n/a     
/usr/workspace/wsb
               -0-    n/a    n/a                -0-    n/a    n/a         
/p/lustre1     24.5K  18.0T  20.0T              0.0K   900.0K 1.0M    
/p/lustre2     24.5K  18.0T  20.0T              0.0K   900.0K 1.0M
  • Requests for additional disk space should be directed to the LC Hotline through your computer coordinator or PI.
  • To view usage by user per file system, see the log files located in /usr/global/docs/filerUsageInfo
  • Exceeding quota:
    • A warning appears in login messages if usage is over 90% of quota
    • Heed quota warnings - risk of data loss if quota is exceeded!
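
To see how close each file system is to its quota, the used and quota columns can be turned into a percentage. The sketch below assumes the column layout and unit suffixes (K, M, G, T) shown in the example output above, and it skips entries that wrap onto two lines; quota_usage is a made-up name.

```shell
# quota_usage: hypothetical helper, not an LC command.
# Reads quota -v style output on stdin and prints, for each file
# system with a space quota, the percentage of that quota in use.
quota_usage() {
  awk '
    # Convert values such as 1.1G, 24.5K or -0- to bytes.
    function bytes(s,   n, u) {
      if (s == "-0-") return 0
      n = s + 0
      u = substr(s, length(s))
      if (u == "K") return n * 1024
      if (u == "M") return n * 1024 ^ 2
      if (u == "G") return n * 1024 ^ 3
      if (u == "T") return n * 1024 ^ 4
      return n
    }
    # Data lines look like: filesystem used quota limit ...
    $1 ~ /^\// && NF >= 4 && $3 != "n/a" && bytes($3) > 0 {
      printf "%s %.1f%%\n", $1, 100 * bytes($2) / bytes($3)
    }'
}

# Example: quota -v | quota_usage
```

For the /g/g0 line in the example above (1.1G used of a 24.0G quota) this prints "/g/g0 4.6%".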

Other File Systems

  • HPSS archival storage: Check out our Quotas overview for up-to-date information

Purge Policies

  • When file systems become full, performance can be significantly degraded. Because of this, LC maintains policies for purging temporary file systems.
  • The following temporary file systems are subject to purging:
/tmp
/usr/tmp
/var/tmp
/nfs/tmp#
/p/lscratch#
  • The /p/lustre# temporary file systems are NOT subject to purging since they enforce quotas. Likewise for the /p/gpfs# file systems once quotas are implemented for them.
  • When are files purged?
    • /tmp, /var/tmp, /usr/tmp: node-local temporary file space is purged daily and/or in between batch jobs.
    • /p/lscratch#: as needed
  • Files in temporary file systems are not backed up
  • Don't forget: tmp or scratch in the name means temporary!

Backups

Online .snapshot Directories

  • User home directories, /usr/workspace, /usr/gapps and /usr/gdata have a special, online directory for regular, automatic backups.
  • Hidden .snapshot subdirectory
    • It is not listed by the ls command but you can cd .snapshot
    • Contains multiple subdirectories, each containing a full backup and a timestamp when the backup was created.
    • .snapshot is a read-only directory
  • If you delete or mangle a file, .snapshot may save you:
    • If the file existed before the last .snapshot backup was done
    • Just use the cp command to copy the replacement back
  • This is a feature of Network Appliance NFS servers. See their documentation at netapp.com.
  • Example:
% ls -l .snapshot
total 80
drwx------ 98 joeuser joeuser 20480 May 8 13:27 2_per_day.2019-05-08_1900
drwx------ 98 joeuser joeuser 20480 May 8 13:27 2_per_day.2019-05-09_1200
drwx------ 98 joeuser joeuser 20480 May 8 13:27 2_per_day.2019-05-09_1900
drwx------ 98 joeuser joeuser 20480 May 8 13:27 2_per_day.2019-05-10_1200
drwx------ 97 joeuser joeuser 20480 May 3 15:39 weekly.2019-05-05_0015
% ls -l .snapshot/2_per_day.2019-05-08_1900
total 24712
-rw-------  1 joeuser joeuser   31575 Aug 30 12:19 Batch_Limits.doc
-rw-------  1 joeuser joeuser 2120192 Sep 01 12:04 FY01Blueprint.doc
drwx------  2 joeuser joeuser    4096 May 07 15:44 Mail
drwx------  2 joeuser joeuser    4096 Nov 07 2000  Misc
drwx------ 16 joeuser joeuser    4096 Oct 24 1998  NPB2.3
-rw-------  1 joeuser joeuser 3039744 Aug 30 10:22 WhitePIX.ppt
drwx------  2 joeuser joeuser    4096 Mar 29 13:09 bin
-rw-------  1 joeuser joeuser      39 May 09 09:20 blank.html
-r--------  1 joeuser joeuser 2433035 Aug 24 14:01 cforaix.pdf
....
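
Recovering a file then comes down to a single cp from the chosen snapshot. The sketch below wraps that step with an existence check. snapshot_restore is a made-up name, and the function must be run from the directory that contains the .snapshot subdirectory.

```shell
# snapshot_restore: hypothetical helper, not an LC command.
# Copies one file from a named snapshot back into the current directory.
snapshot_restore() {
  snap=$1 file=$2
  src=".snapshot/$snap/$file"
  if [ ! -f "$src" ]; then
    echo "not in snapshot: $src" >&2
    return 1
  fi
  # -p keeps the original timestamps and permissions.
  cp -p "$src" "$file"
}

# Example:
#   snapshot_restore 2_per_day.2019-05-10_1200 Batch_Limits.doc
```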

Livermore Computing System Backups

  • LC performs regular backups of the following file systems:
    • /g/g##: User home directories
    • /usr/gapps, /usr/gdata: User application and data directories
    • /usr/local, /usr/global: LC developed or maintained application directories
    • Atlassian Tools: Jira, Confluence, Bitbucket, etc.
  • Daily backups of new or changed files
  • Full monthly backup of all files. Retained onsite for 6 months.
  • Disaster recovery backups:
    • Performed every 6 months
    • Data (both OCF and SCF) is stored offsite at the Nevada Test Facility
    • Retained for 2 years
  • For detailed information on LC backups, see the internal wiki document located at: lc.llnl.gov/confluence/display/LCBackups/LC+Backups+Home (requires authentication)

Note: Temporary file systems are not backed up:

  • /tmp, /var/tmp, /usr/tmp
  • /p/lustre#
  • /p/gscratch#
  • /p/gpfs#

Archival HPSS Storage

  • Users are responsible for backing up all other data they wish to preserve, particularly any files residing in temporary file systems.
  • The preferred location for these backups is the archival HPSS storage system available on both the OCF and SCF.
  • See the Archival HPSS Storage section for details.

File Transfer and Sharing

File Transfer Tools

  • There are a number of ways to transfer files - depending upon what you want to do.
  • hopper - A powerful, interactive, cross-platform tool that allows users to transfer and manipulate files and directories by means of a graphical user interface. Users can connect to and manage resources using most of the major file transfer protocols, including FTP, SFTP, SSH, NFT, and HTAR. See the hopper web pages (hpc.llnl.gov/software/hopper), hopper man page or use the hopper -readme command for more information.
  • ftp - Is available for file transfer between LC machines. The ftp client at LC is an optimized parallel ftp implementation. It can be used to transfer files with machines outside LLNL if the command originates from an LLNL machine and the foreign host will permit it. FTP to LC machines from outside LLNL is not permitted unless the user is connected via an appropriate Remote Access service such as OTS or VPN. Documentation is available via the ftp man page or the FTP Usage Guide (hpc.llnl.gov/manuals/ezstorage/ftp).
  • scp - (secure copy) is available on all LC machines. Example:
    scp thisfile user@host2:thatfile
  • sftp - Performs ftp-like operations over encrypted ssh.
  • MyLC - Livermore Computing's user portal provides a mechanism for transferring files to/from your desktop machine and your home directory on an LC machine. See the "utilities" tab. Available at mylc.llnl.gov
  • nft - (Network File Transfer) is LC's utility for persistent file transfer with job tracking. This is a command line utility that assumes transfers with storage and has a specific syntax. Documentation is available via its man page or the NFT Reference Manual (hpc.llnl.gov/manuals/ezstorage/nft).
  • htar - Is highly optimized for creation of archive files directly into HPSS, without having to go through the intermediate step of first creating the archive file on local disk storage, and then copying the archive file to HPSS via some other process such as ftp. The program uses multiple threads and a sophisticated buffering scheme in order to package member files into in-memory buffers, while making use of the high-speed network striping capabilities of HPSS. Syntax resembles that of the UNIX tar command. Documentation is available via its man page or the HTAR Reference Manual (hpc.llnl.gov/manuals/ezstorage/htar).
  • hsi - Hierarchical Storage Interface. HSI is a utility that communicates with HPSS via a user-friendly interface that makes it easy to transfer files and manipulate files and directories using familiar UNIX-style commands. HSI supports recursion for most commands as well as CSH-style support for wildcard patterns and interactive command line and history mechanisms. Documentation is available via its man page.
  • Tri-lab high bandwidth file transfers over SecureNet:
    • All three Labs support wrapper scripts for enhanced data transfer between sites - classified side only.
    • Three different protocols can be used: hsi, htar and pftp.
    • Transfers can be from host to storage or host to host
    • Commands are given names that are self-explanatory - see the accompanying image at right.
    • At LLNL, these scripts should already be in your path
    • For additional information please see aces.sandia.gov/tri_lab_home.html#file_xfer (requires authentication)
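
As an illustration of the htar description above, the following sketch archives a directory directly into HPSS and then lists the archive's contents. It assumes only htar's tar-like -c, -t, -v and -f options. The wrapper names run and archive_to_hpss and the RUN environment variable are hypothetical; by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical): print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Hypothetical helper: create an archive in HPSS directly from a local
# directory, then list its table of contents. No local tar file is made.
archive_to_hpss() (
    dir="$1"
    archive="${2:-$(basename "$dir").tar}"
    run htar -cvf "$archive" "$dir" || exit 1   # create the archive in HPSS
    run htar -tvf "$archive"                    # list what was stored
)
```

Calling archive_to_hpss /p/lustre1/joeuser/myrun (an example path) prints the two htar commands; setting RUN=1 on a machine where htar is installed would execute them. Members can later be retrieved with htar -xvf.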

File Sharing Rules

  • User home directories are required to be accessible to the user only. No group or world sharing is permitted.
  • Likewise, /usr/workshare directories are accessible by the user only.
  • Group sharing is permitted in /usr/workspace/groupname directories.
  • Group sharing is permitted in lustre directories.
  • The collaborative /usr/gapps file systems permit group sharing. World sharing is permitted with Associate Director approval.

Give and Take Utilities

  • LC provides the give and take utilities for sharing files between users.
  • Syntax:
give user file
take user file
  • Examples:
    • Give one file: give jsmith input1     
    • Give multiple files: give jsmith input1 input2     
    • Give multiple files via wildcard: give jsmith in*     
    • "ungive" (remove) a file given to jsmith: give -u jsmith input2     
    • Take one file: take ljones data     
    • Take multiple files: take ljones data2 data3   
    • Takes all ljones files - do not use asterisk: take ljones     
    • Lists files to be taken: take     
    • Lists files you have given: give   
  • Files are spooled to the /usr/give directory
    • Mounted and visible on all production clusters
    • Separate spool directories for OCF-CZ and OCF-RZ machines
    • Limited in size - if you plan on giving large files, check how full the directory is first using the df or bdf commands.
    • Currently, a 25GB quota per user
  • Files which have been given, but not taken, will be purged from the spool directory after a week or so
  • Cannot give a directory structure; tar it up and then give
  • Files must be taken on a machine where both users (giver and taker) have accounts.
  • For options and additional information, see the give and take man pages.
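
Because give cannot pass a directory structure, the "tar it up and then give" step can be sketched as below. The helper name give_dir is hypothetical; the give command itself is only invoked where it is installed, otherwise the command is printed.

```shell
#!/bin/sh
# Hypothetical helper: bundle a directory into a tar file and give it.
# Usage: give_dir USER DIRECTORY
give_dir() (
    user="$1"
    dir="$2"
    tarball="$(basename "$dir").tar"
    # -C keeps archive paths relative to the directory's parent
    tar -cf "$tarball" -C "$(dirname "$dir")" "$(basename "$dir")" || exit 1
    if command -v give >/dev/null 2>&1; then
        give "$user" "$tarball"
    else
        echo "would run: give $user $tarball"
    fi
)
```

The recipient would then run take with the giver's username and the tar file name, followed by tar -xf on the result. Check the tar file size against the spool quota before giving large directories.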

Anonymous FTP Server

File Interchange Service (FIS)

  • Use of LC's File Interchange Service (FIS) is required to move files between the OCF and the SCF
  • Requires that an FIS account be setup first
  • Two different types of FIS:
    • fastfis: Uses an automated, unidirectional One Way Link (OWL) from OCF to SCF. Transfers initiate quickly.
    • tapefis: Uses a manual transfer of tape by operator from OCF to SCF. Transfers initiate more slowly because of manual involvement.
  • Fastfis has one channel for small files and one channel for large (over 1 GB) files. Helps reduce blocking of small file transfer by large files.
  • Recommendation: use a tar file if many small files are to be transferred.
  • File size limits and transfer speeds:
    • No file size limits per se, on either fastfis or tapefis
    • System space is currently 1 TB for fastfis and 3 TB for tapefis (8/18)
    • Transfer speed of 100 GB/hr for fastfis and 200 GB/hr for tapefis
  • Purging: be sure to complete your transfer in a timely manner on the FROM side, as files are periodically purged from the TO and FROM directories.
  • Transferring files from the SCF to the OCF requires an Associate Director's approval and occurs via tapefis.
  • Documentation is available at:

Usage

Sending Files | Hostname | Alias | Valid Protocols | Transfer Method | Authentication Method | Notes
--- | --- | --- | --- | --- | --- | ---
CZ only user: from OCF-CZ or desktop machine to SCF | fis.llnl.gov | fis | ftp, sftp | OWL | RSA OTP | cd to the TO directory and then put your files there. You will be notified by email when your files have been moved to the SCF.
 | fastfis.llnl.gov | fastfis | | | |
 | tapefis.llnl.gov | tapefis | | tape | |
RZ or RZ/CZ user: from OCF-RZ or desktop machine to SCF | rzfis.llnl.gov | rzfis | ftp, sftp | OWL | CRYPTOcard OTP |
 | rzfastfis.llnl.gov | rzfastfis | | | | cd to the TO directory and then put your files there. You will be notified by email when your files have been moved to the SCF.
 | rztapefis.llnl.gov | rztapefis | | tape | |
SCF user: from SCF to OCF | tapefis.llnl.gov | tapefis | ftp | tape | RSA OTP | cd to the TO directory and then put your files there. Requires review/approval by an Authorized Derivative Classifier (ADC) from your department/program. See ocec-r.llnl.gov/ to find yours.
iSNSI CZ only user: from OCF-CZ or desktop machine to iSNSI (pinot) | snsifis.llnl.gov | snsifis | ftp, sftp | OWL | LC username + RSA OTP | cd to the TO directory and then put your files there.
 | snsitapefis.llnl.gov | snsitapefis | | tape | |
iSNSI RZ or RZ/CZ user: from OCF-RZ or desktop machine to iSNSI (pinot) | rzsnsifis.llnl.gov | rzsnsifis | ftp, sftp | OWL | LC username + CRYPTOcard OTP | cd to the TO directory and then put your files there.
 | rzsnsitapefis.llnl.gov | rzsnsitapefis | | tape | |
iSNSI user: from iSNSI (pinot) to OCF | tapefis.llnl.doe.sgov.gov | tapefis | ftp | tape | OUN + SNSI PIN with RSA OTP | cd to the TO directory and then put your files there. Requires review/approval by an Authorized Derivative Classifier (ADC) from your department/program. See ocec-r.llnl.gov to find yours.

 

Retrieving Files | Hostname | Alias | Valid Protocols | Transfer Method | Authentication Method | Notes
--- | --- | --- | --- | --- | --- | ---
SCF user: after transfer from either OCF-CZ or OCF-RZ machine | fis.llnl.gov | fis | ftp, sftp | OWL | RSA OTP |
 | fastfis.llnl.gov | fastfis | | | | cd to the FROM directory and then get your files from there.
 | tapefis.llnl.gov | tapefis | ftp | tape | |
iSNSI user: after transfer from either OCF-CZ or OCF-RZ machine | fastfis.llnl.doe.sgov.gov | fastfis | ftp, sftp | OWL | OUN + SNSI PIN with RSA OTP | cd to the FROM directory and then get your files from there.
 | fis.llnl.doe.sgov.gov (tape) | fis | ftp | tape | |
OCF CZ user: after transfer from SCF | tapefis.llnl.gov | tapefis | ftp | tape | RSA OTP | cd to the FROM directory and then get your files from there. Requires previous review/approval by an Authorized Derivative Classifier (ADC) from your department/program. See ocec-r.llnl.gov/ to find yours.
OCF RZ or RZ/CZ user: after transfer from SCF | rztapefis.llnl.gov | rztapefis | ftp | tape | CRYPTOcard OTP |
OCF CZ user: after transfer from iSNSI | snsitapefis.llnl.gov | snsitapefis | ftp | tape | LC username + RSA OTP | cd to the FROM directory and then get your files from there. Requires previous review/approval by an Authorized Derivative Classifier (ADC) from your department/program. See ocec-r.llnl.gov/ to find yours.
OCF RZ or RZ/CZ user: after transfer from iSNSI | snsirztapefis.llnl.gov | snsirztapefis | ftp | tape | LC username + CRYPTOcard OTP |

System Status and Configuration Information

  • Before you attempt to run your parallel application, it is important to know a few details about the way the system is configured. This is especially true at LC where every system is configured differently and where things change frequently.
  • It is also useful to know the status of the machines you intend to use. Are they available or down for maintenance?
  • System configuration and status information for all LC systems is readily available from the LC Homepage and the MyLC Portal.

System Configuration Information

  • LC Homepage:
    • hpc.llnl.gov (User Portal toggle) ==> Hardware ==> Compute Platforms
    • Direct link: hpc.llnl.gov/hardware/platforms
    • All production systems appear in a summary table showing basic hardware information.
    • Clicking on a machine's name will take you to a page of detailed hardware and configuration information for that machine.
  • MyLC Portal:
    • mylc.llnl.gov
    • Click on a machine name in the "machine status" portlet, or the "my accounts" portlet.
    • Then select the "details", "topology" and/or "job limits" tabs for detailed hardware and configuration information.
  • LC Tutorials:
  • Systems Summary Tables:

System Configuration Commands

  • After logging into a machine, there are a number of commands that can be used for determining detailed, real-time machine hardware and configuration information.
  • A table of some useful commands is provided below.

Command | Description
--- | ---
news job.lim.machinename | LC command for displaying system configuration, job limits and usage policies, where machinename is the actual name of the machine.
lscpu | Basic information about the CPU(s), including model, cores, sockets, threads, clock and cache.
lscpu -e | One line of basic information per logical CPU: cores, sockets, threads and clock.
cat /proc/cpuinfo | Model and clock information for each thread of each core.
lstopo | Display a graphical topological map of node hardware.
lstopo --only cores | List the physical cores only.
lstopo -v | Detailed (verbose) information about a node's hardware components.
vmstat -s | Memory configuration and usage details.
cat /proc/meminfo | Memory configuration and usage details.
uname -a, distro_version, cat /etc/redhat-release, cat /etc/toss-release | Display operating system details, version.
bdf, df -h | Show mounted file systems.
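
Several of the commands above can be combined into a quick node summary. The following is a minimal sketch; the function name node_summary is hypothetical, and any command or file that is missing on a given system is simply skipped.

```shell
#!/bin/sh
# Hypothetical helper: print a short OS / CPU / memory / file system summary
# using commands from the table above.
node_summary() {
    echo "== OS =="
    uname -a
    for f in /etc/toss-release /etc/redhat-release; do
        if [ -r "$f" ]; then cat "$f"; fi
    done
    if command -v lscpu >/dev/null 2>&1; then
        echo "== CPU =="
        # Keep only the model, socket, core and thread lines
        lscpu | grep -E '^(Model name|Socket\(s\)|Core\(s\) per socket|Thread\(s\) per core)' || true
    fi
    if [ -r /proc/meminfo ]; then
        echo "== Memory =="
        grep -E '^(MemTotal|MemFree)' /proc/meminfo || true
    fi
    echo "== File systems =="
    df -h
    return 0
}
```

Running node_summary on a login node and again on a compute node is a quick way to see whether the two are configured differently.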

System Status Information

  • LC Homepage:
    • hpc.llnl.gov (User Portal toggle) - just look on the main page for the System Status links (shown at right).
    • The same links appear under the Hardware menu.
    • Unclassified systems only
  • MyLC Portal:
    • mylc.llnl.gov
    • Several portlets provide system status information:
      • machine status
      • login node status
      • scratch file system status
      • enclave status
    • Classified MyLC is at: lc.llnl.gov/lorenz/
  • Machine status email lists:
    • Provide the most timely status information for system maintenance, problems, and system changes/updates
    • ocf-status and scf-status cover all machines on the OCF / SCF
    • Additionally, each machine has its own status list - for example: sierra-status@llnl.gov
  • Login banner & news items - always displayed immediately after logging in
    • Login banner includes basic configuration information, announcements and news items. Example login banner HERE.
    • News items (unread) appear at the bottom of the login banner. For usage, type news -h.
  • Direct links for systems and file systems status pages:
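Description | Network | Links
--- | --- | ---
System status web pages | OCF CZ | lc.llnl.gov/cgi-bin/lccgi/customstatus.cgi
System status web pages | OCF RZ | rzlc.llnl.gov/cgi-bin/lccgi/customstatus.cgi
System status web pages | SCF | lc.llnl.gov/cgi-bin/lccgi/customstatus.cgi
File systems status web pages | OCF CZ | lc.llnl.gov/fsstatus/fsstatus.cgi
File systems status web pages | OCF RZ | rzlc.llnl.gov/fsstatus/fsstatus.cgi
File systems status web pages | OCF CZ+RZ | rzlc.llnl.gov/fsstatus/allfsstatus.cgi
File systems status web pages | SCF | lc.llnl.gov/fsstatus/fsstatus.cgi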

Examples

[Screenshots: CZ Machine Status; CZ + RZ File System Status; mylc.llnl.gov]

Exercise 1

Logging in, basic configuration, and file systems information:

  • Log in to an LC cluster with X11 forwarding enabled
  • Test X11
  • Identify and SSH to other login nodes
  • Familiarize yourself with the cluster's configuration
  • Try the mxterm utility to access compute nodes
  • Learn where/how to obtain hardware, OS and other configuration information for LC clusters
  • Review basic file system info
  • Try moving files to the HPSS storage system
  • View file system status information

Software and Development Environment Overview

Development Environment Group (DEG)

  • LC's Development Environment Group (DEG) provides a stable, usable, leading-edge parallel application development environment that enables users to improve the reliability and scalable performance of LLNL applications.
  • DEG installs and supports the following LC software:
    • Compilers and Preprocessors
    • Debuggers
    • Memory Tools
    • Profiling Tools
    • Tracing Tools
    • Performance Analysis
    • Correctness Tools
    • Utilities
  • Additionally, DEG's mission includes:
    • Working to make computing tools reliable, scalable and to help users make effective use of these tools.
    • Partnering with its application development user community to identify user requirements and evaluate tool effectiveness.
    • Collaborating with vendors and other third party software developers to ensure a complete environment in the most cost effective way possible and meet the needs of today's code developers utilizing emerging technologies.
  • DEG Home Page: computing.llnl.gov/livermore-computing/development-environment-group

TOSS Operating System

  • TOSS = Tri-Laboratory Operating System Stack
  • Based on Red Hat Enterprise Linux (RHEL) with modifications to support targeted HPC hardware and cluster computing
  • Used by most LC (and Tri-lab) production Linux clusters:
    • For Blue Gene systems, the login nodes use TOSS, but the compute nodes run a special Linux-like Compute Node Kernel (CNK).
    • CORAL EA and Sierra clusters use a "TOSS-like" OS/software stack, called blueos by LC.
  • The primary components of TOSS include:
    • RHEL kernel optimized for large scale cluster computing
    • OpenFabrics Enterprise Distribution InfiniBand software stack including MVAPICH and OpenMPI libraries
    • Slurm workload manager
    • Integrated Lustre and Panasas parallel file system software
    • Scalable cluster administration tools
    • Cluster monitoring tools
    • C, C++ and Fortran90 compilers (GNU, Intel, PGI)
    • Testing software framework for hardware and operating system validation
  • See Red Hat's documentation for details on the RHEL kernel.
  • Version information for LC's clusters:
    • TOSS: distro_version or cat /etc/toss-release
    • Red Hat: cat /etc/redhat-release

Software Lists, Documentation, and Downloads

The table below lists and provides links to the majority of software available through LC and related organizations.

Software Category | Description and More Information
--- | ---
Compilers | Lists which compilers are available for each LC system: hpc.llnl.gov/software/development-environment-software/compilers
Supported Software and Computing Tools | Development Environment Group supported software includes compilers, libraries, debugging, profiling, trace generation/visualization, performance analysis tools, correctness tools, and several utilities: hpc.llnl.gov/software/development-environment-software
Graphics Software | Graphics Group supported software includes visualization tools, graphics libraries, and utilities for the plotting and conversion of data: hpc.llnl.gov/data-vis/vis-software
Mathematical Software Overview | Lists and describes the primary mathematical libraries and interactive mathematical tools available on LC machines: hpc.llnl.gov/software/mathematical-software
LINMath | The Livermore Interactive Numerical Mathematical Software Access Utility is a Web-based access utility for math library software. The LINMath Web site also has pointers to packages available from external sources: www-lc.llnl.gov/linmath/
Center for Applied Scientific Computing (CASC) Software | A wide range of software available for download from LLNL's CASC. Includes mathematical software, language tools, PDE software frameworks, visualization, data analysis, program analysis, debugging, and benchmarks: computing.llnl.gov/hpc/software, computing.llnl.gov/projects
LLNL Software Portal | Lab-wide portal of software repositories: software.llnl.gov/
If you are unable to find software you would like to run on LC systems, please check that it has been approved (or submit a request) at https://smdb.llnl.gov prior to installing it yourself.
 

Modules

  • Most LC clusters support Lmod modules for software packaging:
    • Provide a convenient, uniform way to select among multiple versions of software installed on LC systems.
    • Many LC software applications require that you load a particular "package" in order to use the software.
  • Using Modules:
List available modules:     module avail
Load a module:              module add|load modulefile
Unload a module:            module rm|unload modulefile
List loaded modules:        module list
Read module help info:      module help
Display module contents:    module display|show modulefile
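
A typical session strings these commands together: check what is installed, load a package, then confirm what is loaded. The sketch below assumes only the module subcommands listed above; the helper name use_module is hypothetical, and the package name passed to it (for example gcc) is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: inspect, load and confirm a module in one step.
# Usage: use_module PACKAGE
use_module() {
    name="$1"
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not available on this system" >&2
        return 1
    fi
    module avail "$name"    # list installed versions of the package
    module load "$name"     # load the default version
    module list             # confirm what is now loaded
}
```

The function runs in the current shell rather than a subshell, so the environment changes made by module load remain in effect after it returns.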

Atlassian Tools - Confluence, JIRA, etc.

  • LC supports a suite of web-based collaboration tools from Atlassian:
    • Confluence Wiki: used for documentation, collaboration, knowledge sharing, file sharing, mockups, diagrams... anything you can put on a webpage.
    • JIRA: issue tracking and project management system
    • Bitbucket: for git repository hosting. Similar to popular sites like GitHub, but intended for internal use on intranets.
    • Bamboo: a continuous integration and delivery tool that combines automated builds, tests, and releases in a single workflow.
  • These collaboration tools:
    • Are based on LC usernames / groups and are intended to foster collaboration between LC users working on HPC projects.
    • Are installed on the CZ, RZ and SCF networks
    • Require authentication with your LC username and RSA PIN + token
    • Have a User Guide for usage information
  • Locations:
Network | Confluence Wiki | JIRA | BitBucket
--- | --- | --- | ---
CZ | lc.llnl.gov/confluence/ | lc.llnl.gov/jira/ | lc.llnl.gov/bitbucket/
RZ | rzlc.llnl.gov/confluence/ | rzlc.llnl.gov/jira/ | rzlc.llnl.gov/bitbucket/
SCF | lc.llnl.gov/confluence/ | lc.llnl.gov/jira/ | lc.llnl.gov/bitbucket/

Spack Package Manager

  • Spack is a flexible package manager for HPC
  • Easy to download and install. For example:
% git clone https://github.com/spack/spack
% source spack/share/spack/setup-env.csh     (csh/tcsh)
% . spack/share/spack/setup-env.sh           (sh/bash)
  • There is an increasing number of software packages (over 4,200 as of May 2020) available for installation with Spack. Many open source contributions from the international community.
  • To view available packages: spack list
  • Then, to install a desired package: spack install packagename
  • Additional Spack features:
    • Allows installations to be customized. Users can specify the version, build compiler, compile-time options, and cross-compile platform, all on the command line.
    • Allows dependencies of a particular installation to be customized extensively.
    • Non-destructive installs - Spack installs every unique package/dependency configuration into its own prefix, so new installs will not break existing ones.
    • Creation of packages is made easy.
  • Extensive documentation is available at: spack.readthedocs.io
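
Customizing an installation on the command line is done through a Spack "spec" of the general form name@version %compiler@compiler-version +variant. The sketch below only assembles such a string; the helper name spack_spec is hypothetical, and the package, version, compiler and variant values shown in the usage note are placeholders rather than LC recommendations.

```shell
#!/bin/sh
# Hypothetical helper: assemble a Spack spec string from its parts.
# Usage: spack_spec PACKAGE [VERSION] [COMPILER] [VARIANTS]
spack_spec() {
    spec="$1"
    if [ -n "$2" ]; then spec="$spec@$2"; fi    # package version
    if [ -n "$3" ]; then spec="$spec %$3"; fi   # build compiler, e.g. gcc@8.3.1
    if [ -n "$4" ]; then spec="$spec $4"; fi    # variants, e.g. +mpi
    echo "$spec"
}
```

For example, spack install $(spack_spec hdf5 1.10.5 gcc@8.3.1 +mpi) expands to spack install hdf5@1.10.5 %gcc@8.3.1 +mpi. Running spack spec with the same arguments first shows what would be built without installing anything.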

Compilers

Available Compilers and Invocation Commands

Linux Cluster Compilers
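Compiler | Language | Serial Command | Parallel Commands
--- | --- | --- | ---
Intel | C | icc | mpicc
Intel | C++ | icpc | mpicxx, mpic++
Intel | Fortran | ifort | mpif77, mpif90, mpifort
GNU | C | gcc | mpicc
GNU | C++ | g++ | mpicxx, mpic++
GNU | Fortran | gfortran | mpif77, mpif90, mpifort
PGI | C | pgcc | mpipgcc, mpicc
PGI | C++ | pgc++ | mpicxx, mpic++
PGI | Fortran | pgf77, pgf90, pgfortran | mpif77, mpif90, mpifort
LLVM/Clang | C | clang | mpicc
LLVM/Clang | C++ | clang++ | mpicxx, mpic++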

Compiler Versions and Defaults

  • LC maintains multiple versions of each compiler.
  • The Modules module avail command is used to list available compilers and versions:

    module avail intel
    module avail gcc
    module avail pgi
    module avail clang

  • Versions: to determine the actual version you are using, issue the compiler invocation command with its "version" option. For example:
Compiler | Option | Example
--- | --- | ---
Intel | --version | ifort --version
GNU | --version | g++ --version
PGI | -V | pgf90 -V
Clang | --version | clang --version
  • To use an alternate version issue the Modules command: module load module-name
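
The version check can be scripted across all compilers at once. This is a minimal sketch based on the option table above; the function names version_flag and show_versions are hypothetical, and compilers not found in the current PATH are skipped.

```shell
#!/bin/sh
# Hypothetical helper: return the "version" option for a compiler command.
version_flag() {
    case "$1" in
        pgcc|pgc++|pgf77|pgf90|pgfortran) echo "-V" ;;         # PGI
        *)                                echo "--version" ;;  # Intel, GNU, Clang
    esac
}

# Hypothetical helper: print the first lines of version output for every
# compiler command that is currently in the PATH.
show_versions() {
    for c in icc icpc ifort gcc g++ gfortran pgcc pgc++ pgfortran clang clang++; do
        if command -v "$c" >/dev/null 2>&1; then
            echo "== $c =="
            "$c" "$(version_flag "$c")" 2>&1 | head -n 2
        fi
    done
    return 0
}
```

Running show_versions before and after a module load is a simple way to confirm that the intended compiler version is the one actually being picked up.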

Compiler Options

  • Each compiler has hundreds of options that determine what the compiler does and how it behaves.
  • Option names and meanings mostly differ from one compiler to another.
  • Additionally, compilers have different default options.
  • An in-depth discussion of compiler options is beyond the scope of this tutorial.
  • See the compiler's documentation, man pages, and/or -help or --help option for details.

Compiler Documentation

  • Intel and PGI: compiler docs are included in the /opt/compilername directory. Otherwise, see Intel or PGI web pages.
  • GNU: see the web pages at gcc.gnu.org/
  • LLVM/Clang: see the web pages at clang.llvm.org/docs/
  • Man pages may or may not be available

Optimizations

  • All compilers can perform optimizations, but the optimizations actually applied differ between compilers, even when the compiler flags appear to be the same.
  • Optimizations are intended to make codes run faster, though this isn't guaranteed.
  • Some optimizations "rewrite" your code, and can make debugging difficult, since the source may not match the executable.
  • Optimizations can also produce wrong results, reduced precision, increased compile times and increased executable size.
  • The table below summarizes common compiler optimization options. See the compiler documentation for details and other optimization options.
Optimization | Intel | GNU | PGI
--- | --- | --- | ---
-O | Same as O2 | Same as O1 | O1 + global optimizations. No SIMD.
-O0 | No optimization | DEFAULT. No optimization. Same as omitting any -O flag. | No optimization
-O1 | Optimize for size: basic optimizations to create smallest code | Reduce code size and execution time, without performing any optimizations that take a great deal of compilation time. | Local optimizations, block scheduling and register allocation.
-O2 | DEFAULT. Optimize for speed: O1 + additional optimizations such as basic loop and vectorization | Optimize even more. O1 + nearly all supported optimizations that do not involve a space-speed tradeoff. | DEFAULT. O1 + global optimizations + advanced optimizations including SIMD.
-O3 | O2 + aggressive loop optimizations. Recommended for loop dominated codes. | O2 + further optimizations | O2 + aggressive global optimizations
-O4 | n/a | n/a | O3 + hoisting of guarded invariant floating point expressions
-Ofast | Same as O3 (mostly) | Same as O3 + optimizations that disregard strict standards compliance. | n/a
-fast | O3 + several additional optimizations | n/a | Generally specifies global optimization. Actual optimizations vary from release to release.
-Og | n/a | Enables optimizations that do not interfere with debugging. | n/a
Optimization / Vectorization report | -opt-report, -vec-report | -ftree-vectorizer-verbose=[1-7] (e.g. -ftree-vectorizer-verbose=7) | -Minfo=[option] (e.g. -Minfo=all)

Floating-point Exceptions

  • The IEEE floating point standard defines several exceptions (FPEs) that occur when the result of a floating point operation is unclear or undesirable:
    • overflow: an operation's result is too large to be represented as a float. Can be trapped, or else returned as a +/- infinity.
    • underflow: an operation's result is too small to be represented as a normalized float. Can be trapped, or else represented as a denormalized float (zero exponent w/ non-zero fraction) or zero.
    • divide-by-zero: attempting to divide a float by zero. Can be trapped, or else returned as a +/- infinity.
    • inexact: result was rounded off. Can be trapped or returned as rounded result.
    • invalid: an operation's result is ill-defined, such as 0/0 or the sqrt of a negative number. Can be trapped or returned as NaN (not a number).
  • By default, the Xeon processors used at LC mask/ignore FPEs. Programs that encounter FPEs will not terminate abnormally, but instead, will continue execution with the potential of producing wrong results.
  • Compilers differ in their ability to handle FPEs. See the relevant compiler documentation for details.
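
Because FPEs are masked by default, trapping usually has to be requested when the program is built. The sketch below maps a few Fortran compiler commands to trap options as commonly documented for the GNU, Intel and PGI compilers. Treat the exact flags as assumptions to verify against the documentation for the compiler version you load; the helper name fpe_trap_flags is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a compile option that makes invalid,
# divide-by-zero and overflow exceptions abort the program.
# The flags below are assumptions; confirm them in your compiler's docs.
fpe_trap_flags() {
    case "$1" in
        gfortran)              echo "-ffpe-trap=invalid,zero,overflow" ;;
        ifort)                 echo "-fpe0" ;;
        pgf77|pgf90|pgfortran) echo "-Ktrap=fp" ;;
        *)
            echo "no trap option known for $1; see its documentation" >&2
            return 1
            ;;
    esac
}
```

A debug build might then look like gfortran -g $(fpe_trap_flags gfortran) -o myprog myprog.f90, where myprog is a placeholder. With trapping enabled, the program stops at the first FPE instead of continuing with NaN or infinity values, which makes the offending line much easier to locate in a debugger.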

Precision, Performance, and IEEE 754 Compliance

  • Typically, most compilers do not guarantee IEEE 754 compliance for floating-point arithmetic unless it is explicitly specified by a compiler flag. This is because compiler optimizations are performed at the possible expense of precision.
  • Unfortunately for most programs, adhering to IEEE floating-point arithmetic adversely affects performance.
  • If you are not sure whether your application needs this, try compiling and running your program both with and without it to evaluate the effects on both performance and precision.
  • See the relevant compiler documentation for details.
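
The "try it both ways" advice above can be sketched as a small comparison script. It assumes you have already built two executables from the same source, one with and one without your compiler's IEEE compliance option; the helper name compare_builds is hypothetical, and timing is only to the nearest second.

```shell
#!/bin/sh
# Hypothetical helper: run two builds of a program, report coarse run
# times, and show whether their output differs.
# Usage: compare_builds ./prog_default ./prog_ieee
compare_builds() (
    out_a=$(mktemp) || exit 1
    out_b=$(mktemp) || exit 1
    trap 'rm -f "$out_a" "$out_b"' EXIT
    for side in a b; do
        if [ "$side" = a ]; then prog="$1"; out="$out_a"; else prog="$2"; out="$out_b"; fi
        start=$(date +%s)
        "$prog" > "$out" || exit 1
        end=$(date +%s)
        echo "$prog: $((end - start)) s"
    done
    if cmp -s "$out_a" "$out_b"; then
        echo "outputs identical"
    else
        echo "outputs differ"
        diff "$out_a" "$out_b" | head -n 20   # show the first differences
    fi
)
```

If the outputs differ only in the last few digits and the faster build is significantly quicker, the decision comes down to how much precision the application actually needs.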

Mixing C and Fortran

  • If you are linking C/C++ and Fortran code together, you may need to explicitly specify the Fortran or C/C++ libraries on the link line.
  • All of the other issues involved with mixed language programming apply, such as:
    • Column-major vs. row-major array ordering
    • Routine name differences - appended underscores
    • Arguments passed by reference versus by value
    • Common blocks vs. extern structs
    • Memory alignment differences
    • File I/O - Fortran unit numbers vs. C/C++ file pointers
    • C++ name mangling
    • Data type differences
  • Some useful references:

Debuggers

  • Note: This section only touches on selected highlights. For more information users will definitely need to consult the relevant documentation mentioned below. Also, please consult the "Supported Software and Computing Tools" web page located at hpc.llnl.gov/software.

TotalView

  • TotalView is probably the most widely used debugger for parallel programs. It can be used with C/C++ and Fortran programs and supports all common forms of parallelism, including pthreads, OpenMP, MPI, accelerators and GPUs.
  • Starting TotalView for serial codes: simply issue the command:

        totalview myprog

  • Starting TotalView for interactive parallel jobs:
    • Some special command line options are required to run a parallel job through TotalView under SLURM. You need to run srun under TotalView, and then specify the -a flag followed by 1) srun options, 2) your program, and 3) your program flags (in that order). The general syntax is:

        totalview srun -a -n #processes -p pdebug myprog [prog args]

    • To debug an already running interactive parallel job, simply issue the totalview command and then attach to the srun process that started the job.
    • Debugging batch jobs is covered in LC's TotalView tutorial and in the "Debugging in Batch" section below.
  • Documentation:

DDT

  • DDT stands for "Distributed Debugging Tool", a product of Allinea Software Ltd.
  • DDT is a comprehensive graphical debugger designed specifically for debugging complex parallel codes. It is supported on a variety of platforms for C/C++ and Fortran. It can be used to debug multi-process MPI programs and multi-threaded programs, including OpenMP.
  • Currently, LC has a limited number of fixed and floating licenses for OCF and SCF Linux machines.
  • Usage information: see LC's DDT Quick Start information located at: hpc.llnl.gov/software/development-environment-software/allinea-ddt
  • Documentation: see the vendor website: www.allinea.com

STAT - Stack Trace Analysis Tool

  • The Stack Trace Analysis Tool gathers and merges stack traces from a parallel application's processes.
  • STAT is particularly useful for debugging hung programs.
  • It produces call graphs: 2D spatial and 3D spatial-temporal
    • The 2D spatial call prefix tree represents a single snapshot of the entire application (see image).
    • The 3D spatial-temporal call prefix tree represents a series of snapshots from the application taken over time.
  • In these graphs, the nodes are labeled by function names. The directed edges, showing the calling sequence from caller to callee, are labeled by the set of tasks that follow that call path. Nodes that are visited by the same set of tasks are assigned the same color, giving a visual reference to the various equivalence classes.
  • This tool should be in your default path as:
    • /usr/local/bin/stat-gui - GUI
    • /usr/local/bin/stat-cl - command line
    • /usr/local/bin/stat-view - viewer for DOT format output files
    • /usr/local/tools/stat - install directory, documentation
  • More information: see the STAT web page at: hpc.llnl.gov/software/development-environment-software/stat-stack-trace-analysis-tool

Debugging in Batch: mxterm / sxterm

  • Debugging batch parallel jobs on LC production clusters is fairly straightforward. The main idea is that you need to submit a batch job that gets your partition allocated and running.
  • Once you have your partition, you can rsh to any of the nodes within it, and then start running as though you’re in the interactive pdebug partition.
  • For convenience, LC has developed the mxterm / sxterm utilities, which make the process even easier.
  • How to use mxterm / sxterm:
    • If you are on a Windows PC, start your X11 application (such as X-Win32)
    • Make sure you enable X11 tunneling for your ssh session
    • ssh and login to your cluster
    • Issue the command as follows:
      mxterm #nodes #tasks #minutes
      sxterm #nodes #tasks #minutes

      Where: #nodes = number of nodes your job requires
      #tasks = number of tasks your job requires
      #minutes = how long you need to keep your partition for debugging
    • This will submit a batch job for you that will open an xterm when it starts to run.
    • After the xterm appears, cd to the directory with your source code and begin your debug session.
    • This utility does not have a man page; however, you can view the usage information by simply typing the name of the command.

Other Debuggers

  • Several other common debuggers are available on LC Linux clusters, though they are not recommended for parallel programs when compared to TotalView and DDT.
  • PGDBG: the Portland Group Compiler debugger. Documentation: www.pgroup.com/products/pgdbg.htm
  • GDB: GNU GDB debugger, a command-line, text-based, single process debugger. Documentation: www.gnu.org/software/gdb
  • DDD: GNU DDD debugger is a graphical front-end for command-line debuggers such as GDB, DBX, WDB, Ladebug, JDB, XDB, the Perl debugger, the bash debugger, or the Python debugger. Documentation: www.gnu.org/software/ddd

A Few Additional Useful Debugging Hints

  • Core Files:
    • Your shell's core file size setting may well limit core files to a size that is inadequate for debugging, especially with TotalView.
    • To check your shell's limit settings, use either the limit (csh/tcsh) or ulimit -a (sh/ksh/bash) command. For example:

        limit
        cputime      unlimited
        filesize     unlimited
        datasize     unlimited
        stacksize    unlimited
        coredumpsize 16 kbytes
        memoryuse    unlimited
        vmemoryuse   unlimited
        descriptors  1024
        memorylocked 7168 kbytes
        maxproc      1024

        ulimit -a
        address space limit (kbytes)   (-M)  unlimited
        core file size (blocks)        (-c)  32
        cpu time (seconds)             (-t)  unlimited
        data size (kbytes)             (-d)  unlimited
        file size (blocks)             (-f)  unlimited
        locks                          (-L)  unlimited
        locked address space (kbytes)  (-l)  7168
        nofile                         (-n)  1024
        nproc                          (-u)  1024
        pipe buffer size (bytes)       (-p)  4096
        resident set size (kbytes)     (-m)  unlimited
        socket buffer size (bytes)     (-b)  4096
        stack size (kbytes)            (-s)  unlimited
        threads                        (-T)  not supported
        process size (kbytes)          (-v)  unlimited

    • To override your default core file size setting, use one of the following commands:

        csh/tcsh:     unlimit
                      -or-
                      limit coredumpsize 64
        sh/ksh/bash:  ulimit -c 64

  • Some users have complained that for many-process jobs, they actually don't want core files or only want small core files because normal core files can fill up their disk space. The limit (csh/tcsh) or ulimit -c (sh/ksh/bash) commands can be used as shown above to set smaller / zero sizes.
  • Add the sinfo and squeue commands to your batch scripts to assist in diagnosing problems. In particular, these commands will document which nodes your job is using.
  • Also add the -l option to your srun command so that output statements are prepended with the task number that created them.
  • Be sure to check the exit status of all I/O operations when reading or writing files in Lustre. This will allow you to detect any I/O problems with the underlying OST servers.
  • If you know/suspect that there are problems with particular nodes, you can use the srun -x option to skip these nodes. For example: srun -N12 -x "cab40 cab41" -ppdebug myjob
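
Several of the hints above can be combined into a short prologue for a sh/bash batch script. The following is a minimal sketch, not an LC-provided script: the helper names limit_core_files and checked_copy are hypothetical, and it assumes that SLURM_JOB_ID is set only when the script runs inside a Slurm job.

```shell
# Sketch only: limit_core_files and checked_copy are hypothetical helper names.

# Cap (or disable, with 0) core files for this shell and everything it launches.
limit_core_files() {
    ulimit -c "$1"
}

# Copy a file and report a failure instead of silently continuing,
# following the advice to check the exit status of every I/O operation.
checked_copy() {
    if ! cp "$1" "$2"; then
        echo "I/O error copying $1 to $2" >&2
        return 1
    fi
}

limit_core_files 0

# Inside a Slurm job, record the job's queue entry, including its node list.
if [ -n "${SLURM_JOB_ID:-}" ] && command -v squeue >/dev/null 2>&1; then
    squeue -j "$SLURM_JOB_ID"
fi
```

The job launch itself (for example, srun with the -l option) would follow this prologue in the script.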

Performance Analysis Tools

We Need a Book!

  • The subject of Performance Analysis Tools is far too broad and deep to cover here. Instead, a few pointers are provided for those who are interested in further research.
  • The first place to check is the "Development Environment Software" web pages at hpc.llnl.gov/software/development-environment-software to see what may be available here. Some example tools are listed below.

Memory Correctness Tools

Memcheck: Valgrind's Memcheck tool detects a comprehensive set of memory errors, including reads and writes of unallocated or freed memory and memory leaks.

TotalView: Allows you to stop execution when heap API problems occur, list memory leaks, paint allocated and deallocated blocks, identify dangling pointers, hold onto deallocated memory, graphically browse the heap, identify the source line and stack backtrace of an allocation or deallocation, summarize heap use by routine, filter and dump heap information, and review memory usage by process or by library.

memP: The memP tool provides heap profiling through the generation of two reports: a summary of the heap high-water-mark across all processes in a parallel job as well as a detailed task-specific report that provides a snapshot of the heap memory currently in use, including the amount allocated at specific call sites.

Intel Inspector: Primarily a thread correctness tool, but memory debugging features are included.

Profiling, Tracing, and Performance Analysis

Open|SpeedShop: Open|SpeedShop is a comprehensive performance tool set with a unified look and feel that covers most important performance analysis steps. It offers various different interfaces, including a flexible GUI, a scripting interface, and a Python class. Supported experiments include profiling using PC sampling, inclusive and exclusive execution time analysis, hardware counter support, as well as MPI, I/O, and floating point exception tracing. All analysis is applied on unmodified binaries and can be used on codes with MPI and/or thread parallelism.

TAU: TAU is a robust profiling and tracing tool from the University of Oregon that includes support for MPI and OpenMP. TAU provides an instrumentation API, but source code can also be automatically instrumented and there is support for dynamic instrumentation as well. TAU is generally viewed as having a steep learning curve, but experienced users have applied the tool with good results at LLNL. TAU can be configured with many feature combinations. If the features you are interested in are not available in the public installation, please request the appropriate configuration through the hotline. TAU developer response is excellent, so if you are encountering a problem with TAU, there is a good chance it can be quickly addressed.

HPCToolkit: HPCToolkit is an integrated suite of tools for measurement and analysis of program performance on computers ranging from multicore desktop systems to the largest supercomputers. It uses low overhead statistical sampling of timers and hardware performance counters to collect accurate measurements of a program's work, resource consumption, and inefficiency and attributes them to the full calling context in which they occur. HPCToolkit works with C/C++/Fortran applications that are either statically or dynamically linked. It supports measurement and analysis of serial codes, threaded codes (pthreads, OpenMP), MPI, and hybrid (MPI + threads) parallel codes.

mpiP: A lightweight MPI profiling library that provides time spent in MPI functions by callsite and stacktrace. This tool is developed and maintained at LLNL, so support and modifications can be quickly addressed. New run-time functionality can be used to generate mpiP data without relinking through the srun-mpip and poe-mpip scripts on Linux and AIX systems.

gprof: Displays call graph profile data. The gprof command is useful in identifying how a program consumes CPU resources. Gprof does simple function profiling and requires that the code be built and linked with -pg. For parallel programs, in order to get a unique output file for each process, you will need to set the undocumented environment variable GMON_OUT_PREFIX to some non-null string. For example: setenv GMON_OUT_PREFIX 'gmon.out.'`/bin/uname -n`
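
For sh/ksh/bash users, an equivalent of the csh setenv example above might look like the following sketch. The prefix value is only an illustration. With glibc, the process ID is appended to the prefix, so each process writes its own profile file.

```shell
# sh/bash equivalent of the setenv example; the prefix value is only an illustration.
export GMON_OUT_PREFIX="gmon.out.$(uname -n)"
echo "$GMON_OUT_PREFIX"
```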

pgprof: PGI profiler - pgprof is a tool which analyzes data generated during execution of specially compiled programs. This information can be used to identify which portions of a program will benefit the most from performance tuning.

PAPI: Portable hardware performance counter library.

PapiEx: A PAPI-based performance profiler that measures hardware performance events of an application without having to instrument the application.

VTune Amplifier: The Intel VTune Amplifier tool is a performance analysis tool for finding hotspots in serial and multithreaded codes. Note the installation on LC machines does not include the advanced hardware analysis capabilities.

Intel Profiler: The Intel Profiler tool is built into the Intel compiler along with a simple GUI to display the collected results.

Vampir / Vampirtrace: Full featured trace file visualizer and library for generating trace files for parallel programs.

Beyond LC

  • Beyond LC, the web offers endless opportunities for discovering tools that aren't available here.
  • In many cases, users can install tools in their own directories if LC doesn't have/support them.

Graphics Software and Resources

Consulting

  • Consulting on scientific visualization issues (demos, classes, getting started, visualization advice)
  • Maintaining a graphics environment that is current and consistent across LC computing platforms
  • Graphics software support
  • Troubleshooting graphics related problems
  • Also available to work with customers to develop custom data visualization applications

Video Production

  • State-of-the-art unclassified and classified facilities
  • 3D modeling and computer animation
  • Video and audio editing; quick turnaround of videos
  • Also, photo quality output and DVD/CDROM creation

Visualization Machine Resources

  • LC provides dedicated clusters for visualization work.
  • Accounts must be requested through the LC Hotline for use of the visualization machines.

Power Walls

Large screen displays for viewing high-resolution images and data comparisons

Running Jobs

Where to Run?

Note: This section only provides a general overview on running jobs on LC systems. Details associated with running jobs are covered in depth in other LC tutorials (Slurm and Moab, MPI, BG/Q, OpenMP, Pthreads, etc.).

Determining Your Job's Requirements

  • The first thing that needs to be done before deciding where to run your jobs is to determine your jobs' requirements.
  • There are several important factors to consider, because not all LC clusters may match your jobs' requirements:
    • Machine architecture
      • Sierra? Linux?
      • GPU?
      • Specific processor model?
    • Number of nodes/cores
      • How many nodes/cores will your jobs require?
      • What job queues are available on a cluster and will your jobs fit into their node limits?
      • Are you running a serial application?
    • Wall clock run time
      • How much time will your jobs require to complete?
      • What job queues are available on a cluster and will your jobs fit into their time limits?
    • Memory
      • Will your jobs need more memory than is available on a cluster?
      • May need to use a memory profiling tool to accurately assess your application's requirements.
    • Communication/network
      • Will your jobs require the ability to communicate between nodes?
    • Accounts/banks
      • Which machines are you able to get an account on?
      • Which banks are configured and available for you to use on a given machine?

Getting Machine Configuration Information

  • After determining your jobs' requirements, the next step is to see which LC machines match those requirements.
  • LC's production clusters vary widely in their configurations.
  • Fortunately, getting configuration details for any production LC machine is easy to do.
  • See the previous System Status and Configuration Information section for details.

Job Limits

  • For all production clusters, there are defined job limits which vary from cluster to cluster. The primary job limits apply to:
    • How many nodes/cores a job may use
    • How long a job may run
    • How many simultaneous jobs a user may run
    • What time of the day/week jobs may run
  • Most job limits are enforced by the batch system
  • Some job limits are enforced by a "good neighbor" policy
  • An easy way to determine the job limits for a machine where you are logged in is to use the command: news job.lim.machinename where machinename is the name of the machine you are logged into.
  • Job limits are also documented on the "MyLC" web pages:
    OCF-CZ: mylc.llnl.gov
    OCF-RZ: rzmylc.llnl.gov
    SCF: lc.llnl.gov/lorenz
    Just click on any machine name in the "machine status" or "my accounts" portlets. Then select the "job limits" tab.
  • Further discussion and a summary table of job limits for all production machines are available in the Queue Limits section of the Slurm and Moab tutorial.
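
Because the news item name includes the machine name, the lookup can be scripted. The sketch below assumes that node hostnames consist of the cluster name followed by a node number (as in quartz1); cluster_name is a hypothetical helper, not an LC command.

```shell
# Hypothetical helper: strip the trailing node number from a hostname.
cluster_name() {
    echo "$1" | sed 's/[0-9]*$//'
}

# On an LC login node this would display the job limits for the current cluster.
if command -v news >/dev/null 2>&1; then
    news "job.lim.$(cluster_name "$(hostname -s)")"
fi
```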

Accounts and Banks

  • In order to run on a machine, you need a valid login account, and at least one "bank" to charge your usage against.
  • Getting information about your assigned bank(s) is covered in the Banks and Banks and Usage Information sections of the Slurm and Moab tutorial.

Serial vs Parallel

  • Parallel jobs can range in size from a single node (multi-core) to the full system.
  • All but a few of LC's production clusters are intended to be used for parallel jobs.
  • Serial jobs by definition require only one core on a node.
  • Because running serial jobs on parallel clusters would waste compute resources, LC provides several clusters where serial jobs can be run without wasting compute resources.
  • Running serial jobs is discussed here: /training/tutorials/slurm-and-moab#Serial

Dedicated Application Time (DAT)

  • Some LC systems allow users to request dedicated machine use.
  • DAT is primarily for "big runs" that require a larger number of nodes and/or longer time than the normal batch limits permit.
  • During DATs, other user jobs are drained/removed and the batch queue is suspended while the dedicated job runs.
  • DATs are posted to the relevant machine status news list.
  • DATs are requested through the LC Hotline. Forms for requesting "Expedited Priority Runs" are available at: hpc.llnl.gov/accounts/forms.

Batch Versus Interactive

Interactive Jobs

  • Refers to jobs that you launch and interact with real-time from the command line.
  • Don't use Login Nodes (in most cases):
    • Login nodes are shared, oftentimes by many users.
    • Intended for interactive tasks that are not CPU-intensive, such as editing, working with files, running GUIs, browsers, debuggers, etc.
    • Not intended for running CPU-intensive production jobs, which can negatively impact other users.
  • Most clusters have a pdebug partition of compute nodes dedicated to small, short, interactive jobs. Another common interactive partition is pvis.
  • There are several ways to run interactive jobs on LC's Linux clusters, as described below.
  • Method 1: Use srun on a login node specifying compute nodes in the pdebug or other interactive partition.
    • To view the partitions on a cluster use the sinfo -s command. It shows which partitions are configured, their time limit and how the nodes are currently being used (Allocated / Idle / Other / Total). For example, on Quartz note the pdebug partition:
      % sinfo -s
      PARTITION AVAIL  TIMELIMIT   NODES(A/I/O/T)  NODELIST
      pdebug       up      30:00       20/26/0/46  quartz[1-46]
      pbatch*      up 1-00:00:00  2865/52/13/2930  quartz[47-186,193-378,...,2887-3072]
      pall       down   infinite  2885/78/13/2976  quartz[1-186,193-378,...,2887-3072]
      
    • Then use the srun command to run the job on compute nodes in pdebug. For example, to run your executable on 2 pdebug nodes using 16 tasks:
      % srun -p pdebug -N 2 -n 16 myexe
  • Method 2: Use salloc to acquire compute nodes for interactive use.
    • On a login node, request an allocation of compute nodes, and optionally the partition to acquire them from. The default is usually pbatch, but small short jobs usually get better turnaround time in pdebug. For example, to request 2 pdebug compute nodes:
      % salloc -p pdebug -N 2
      salloc: Pending job allocation 3891090
      salloc: job 3891090 queued and waiting for resources
      salloc: job 3891090 has been allocated resources
      salloc: Granted job allocation 3891090
    • After your allocation is granted, you will be placed on the first node of the allocation. You can verify this as follows:
      % squeue -u joeuser
        JOBID PARTITION     NAME      USER ST       TIME  NODES NODELIST(REASON)
      3891090    pdebug       sh   joeuser  R       0:07      2 quartz[1-2]
      % hostname
      quartz1
    • You can now run whatever you'd like interactively on your job's compute nodes.
  • Method 3: Use sxterm / mxterm to acquire compute nodes for interactive use.
    • Similar to salloc, except that it creates a batch job for your request "behind the scenes", queues it, and then, when the job starts to run on a compute node, pops open an xterm window on your desktop.
    • For example requesting 1 node / 1 task / for 30 minutes in the pdebug partition:
      % sxterm 1 1 30 -p pdebug
      Submitted batch job 3048379
    • Wait until the xterm window appears on your desktop. You will be on the first node of your job's allocation and can run whatever you'd like interactively.
    • There's no man page - just invoke "sxterm" with no options and a usage message will display.
    • The mxterm command is similar, with different options. Just invoke "mxterm" for usage information.
  • Method 4: Submit a batch job and then rsh/ssh to the compute nodes where it is running.
    • You can submit a batch job (covered next) that really doesn't do anything except "sleep".
    • The squeue command can be used to show when and where your job is running.
    • After it starts running, you can rsh or ssh to any of its compute nodes.
    • Once on a compute node, you can run whatever you'd like interactively.
  • Important usage notes about the pdebug partition:
    • As the name pdebug implies, interactive jobs should be short, small debugging jobs, not production runs.
    • Shorter time limit
    • Fewer number of nodes permitted
    • There is usually a "good neighbor" policy in effect - don't monopolize the queue or set up streams of jobs
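
Before choosing among the methods above, it can help to check how many nodes are idle in the interactive partition and, once a job is running, which nodes it holds. The sketch below parses output in the default sinfo -s and squeue formats shown in this section. The helpers idle_nodes and job_nodelist are hypothetical names, and the column positions are assumptions based on those sample outputs.

```shell
# Hypothetical helpers that read command output on stdin.

# Print the idle-node count for a partition from `sinfo -s` output.
# The default partition is marked with a trailing "*", so both forms are matched.
idle_nodes() {
    awk -v p="$1" '$1 == p || $1 == (p "*") { split($4, n, "/"); print n[2] }'
}

# Print the node list for a job ID from default `squeue` output.
job_nodelist() {
    awk -v id="$1" '$1 == id { print $8 }'
}

# Typical use on a login node:
#   sinfo -s | idle_nodes pdebug
#   squeue -u "$USER" | job_nodelist 3891090
```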

Batch Jobs

Note: This section only provides a quick summary of batch usage on LC's Linux clusters. For details, see the Slurm and Moab tutorial. For CORAL/Sierra systems see the Sierra tutorial.

  • Typically, most of a cluster's compute nodes are configured into a pbatch partition.
  • The pbatch partition is intended for production work:
    • Longer time limits
    • Larger number of nodes per job
    • Limits enforced by batch system rather than "good neighbor" policy
  • The pbatch partition is managed by the workload manager
  • Batch jobs must be submitted in the form of a job control script. Example batch job script:
#!/bin/tcsh
##### These lines are for Slurm
#SBATCH -N 16
#SBATCH -J parSolve34
#SBATCH -t 2:00:00
#SBATCH -p pbatch
#SBATCH --mail-type=ALL
#SBATCH -A myAccount
#SBATCH -o /p/lustre1/joeuser/par_solve/myjob.out

##### These are shell commands
date
cd /p/lustre1/joeuser/par_solve
##### Launch parallel job using srun - assumes 36 cores/node
srun -n576 a.out
echo 'Done'
  • Example batch job script submission commands:
sbatch myjobscript
sbatch -p pdebug -A mic myjobscript
sbatch -t 45:00 myjobscript
  • After successfully submitting a job, you may then check its progress and interact with it (hold, release, alter, terminate) by means of other batch commands - discussed in the Interacting with Jobs section.
  • Some clusters have additional partitions permitting batch jobs.
  • Interactive use of pbatch nodes is facilitated by using the mxterm command - discussed in the Where to Login section.
  • Interactive debugging of batch jobs is possible - covered in the Debuggers section.

Starting Jobs - srun

Note: This section only provides a quick summary of batch usage on LC's Linux clusters. For details, see the Slurm and Moab tutorial.

The srun Command

  • The SLURM srun command is required to launch parallel jobs - both batch and interactive.
  • It should also be used to launch serial jobs in the pdebug and other interactive queues.
  • Syntax:

    srun   [option list]   [executable]   [args]

    Note that srun options must precede your executable.
  • Interactive use example, from the login node command line. Specifies 2 nodes (-N), 16 tasks (-n) and the interactive pdebug partition (-p):
% srun -N2 -n16 -ppdebug myexe
  • Batch use example requesting 16 nodes and 256 tasks (assumes nodes have 16 cores):
    • First create a job script that requests nodes via #SBATCH -N and uses srun to specify the number of tasks and launch the job.
#!/bin/csh
#SBATCH -N 16
#SBATCH -t 2:00:00
#SBATCH -p pbatch
# Run info and srun job launch
cd /p/lscratch3/joeuser/par_solve
srun -n256 a.out
echo 'Done'
  • Then submit the job script from the login node command line:
% sbatch myjobscript
  • Primary differences between batch and interactive usage:
    • Where used
      • Interactive: From login node command line
      • Batch: In batch script
    • Partition
      • Interactive: Requires specification of an interactive partition, such as pdebug, with the -p flag
      • Batch: pbatch is default
    • Scheduling
      • Interactive: If there are available interactive nodes, job will run immediately. Otherwise, it will queue up (fifo) and wait until there are enough free nodes to run it.
      • Batch: The batch scheduler handles when to run your job regardless of the number of nodes available.
  • More Examples:
    srun -n64 -ppdebug my_app              64 process job run interactively in pdebug partition
    srun -N64 -n512 my_threaded_app        512 process job using 64 nodes. Assumes pbatch partition.
    srun -N4 -n32 -c2 my_threaded_app      4 node, 32 process job with 2 cores (threads) per process. Assumes pbatch partition.
    srun -N8 my_app                        8 node job with a default value of one task per node (8 tasks). Assumes pbatch partition.
    srun -n128 -o my_app.out my_app        128 process job that redirects stdout to file my_app.out. Assumes pbatch partition.
    srun -n32 -ppdebug -i my.inp my_app    32 process interactive job; each process accepts input from a file called my.inp instead of stdin
  • Figure: behavior of the srun -N and -n flags, using 4 nodes in batch, each of which has 16 cores.

srun Options

  • srun is a powerful command with about 100 options affecting a wide range of job parameters.
  • For example:
    • Accounting
    • Number and placement of processes/threads
    • Process/thread binding
    • Job resource requirements; dependencies
    • Mail notification options
    • Input, output options
    • Time limits
    • Checkpoint, restart options
    • and much more....
  • Some srun options may also be set through SLURM environment variables, of which there are about 60. For example, SLURM_NNODES behaves like the -N option.
  • See the srun man page for details.

Interacting with Jobs

Note: This section only provides a quick summary of batch usage on LC's Linux clusters. For details, see the Slurm and Moab tutorial. For CORAL/Sierra systems see the Sierra tutorial.

Monitoring Jobs and Displaying Job Information

  • There are several different job monitoring commands. Some are based on Moab, some on Slurm, and some on other sources.
  • The more commonly used job monitoring commands are summarized in the table below with links to additional information and examples.
    Command                   Description
    squeue                    Displays one line of information per job by default. Numerous options.
    showq                     Displays one line of information per job. Similar to squeue. Several options.
    mdiag -j                  Displays one line of information per job. Similar to squeue.
    mjstat                    Summarizes queue usage and displays one line of information for active jobs.
    checkjob jobid            Provides detailed information about a specific job.
    sprio -l / mdiag -p -v    Displays a list of queued jobs, their priority, and the primary factors used to calculate job priority.
    sview                     Provides a graphical view of a cluster and all job information.
    sinfo                     Displays state information about a cluster's queues and nodes.

Holding / Releasing Jobs

  • Jobs can be placed "on hold" when they are submitted.
  • They can also be placed on hold while they are waiting to run.
  • Held jobs can then be released to run later
  • More information/examples: see Holding and Releasing Jobs in the Slurm and Moab tutorial.
    Command                                      Description
    sbatch -H jobscript / msub -h jobscript      Put job on hold when it is submitted
    scontrol hold jobid / mjobctl -h jobid       Place a specific idle jobid on hold
    scontrol release jobid / mjobctl -u jobid    Release a specific held jobid

Modifying Jobs

  • After a job has been submitted, certain attributes may be modified using the scontrol update and mjobctl -m commands.
  • Examples of parameters that can be changed: account, queue, job name, wall clock limit...
  • More information/examples: see Changing Job Parameters in the Slurm and Moab tutorial.

Terminating / Canceling Jobs

  • Interactive srun jobs launched from the command line should normally be terminated with a SIGINT (CTRL-C):
    • The first CTRL-C will report the state of the tasks
    • A second CTRL-C within one second will terminate the tasks
  • For batch jobs, the scancel command (Slurm) or the mjobctl -c and canceljob commands (Moab) can be used.
  • More information/examples: see Canceling Jobs in the Slurm and Moab tutorial.

Other Topics of Interest

Optimizing Core Usage

  • Fully utilizing the cores on a node requires that you use the right combination of srun and Slurm/Moab options, depending upon what you want to do and which type of machine you are using.
  • MPI only: for example, if you are running on a cluster that has 36 cores per node, and you want your job to use all 36 cores on 4 nodes (36 MPI tasks per node), then you would do something like:
    Interactive:  srun -n144 -ppdebug a.out
    Batch:        #SBATCH -N 4
                  srun -n144 a.out
  • MPI with Threads: If your MPI job uses POSIX or OpenMP threads within each node, you will need to calculate how many cores will be required in addition to the number of tasks. For example, running on a cluster having 16 cores per node, an 8-task job where each task creates 4 OpenMP threads, would need a total of 32 cores, or 2 nodes:
    8 tasks * 4 threads / 16 cores/node = 2 nodes

    Interactive:  srun -N2 -n8 -ppdebug a.out
    Batch:        #SBATCH -N 2
                  srun -N2 -n8 a.out
  • You can include multiple srun commands within your batch job command script. For example, suppose that you were conducting a scalability run on a 36 core/node Linux cluster. You could allocate the maximum number of nodes that you would use with #SBATCH -N and then have a series of srun commands that use varying numbers of nodes:
#SBATCH -N 8

srun -N1 -n36 myjob
srun -N2 -n72 myjob
srun -N3 -n108 myjob
....
srun -N8 -n288 myjob
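
The node arithmetic in this section can be checked with a few lines of shell. This is an illustrative sketch: nodes_needed is a hypothetical helper that rounds up (tasks * threads per task) / cores per node, and scaling_commands (also hypothetical) prints, rather than runs, one srun line per node count for a scalability series that fills every core.

```shell
# Hypothetical helper: nodes required for tasks * threads on nodes with a given core count.
# $1 = tasks, $2 = threads per task, $3 = cores per node. Rounds up to whole nodes.
nodes_needed() {
    echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

nodes_needed 8 4 16     # 8 tasks x 4 threads on 16-core nodes
nodes_needed 144 1 36   # 144 MPI-only tasks on 36-core nodes

# Hypothetical helper: print one srun line per node count, using every core on each node.
# $1 = maximum node count, $2 = cores per node.
scaling_commands() {
    n=1
    while [ "$n" -le "$1" ]; do
        echo "srun -N$n -n$(( n * $2 )) myjob"
        n=$(( n + 1 ))
    done
}

scaling_commands 8 36
```

The two nodes_needed calls print 2 and 4, matching the 2-node and 4-node calculations above.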

Diskless Nodes

  • Because LC's Linux clusters employ a 64-bit architecture, 16 exabytes of memory can be addressed - which is about 4 billion times more than the 4 GB limit of 32-bit architectures. By current standards, this is virtually unlimited memory.
  • In reality, machines are usually configured with only GBs of memory, so any address access that exceeds physical memory will result (on most systems) in paging and degraded performance.
  • However, LC machines are diskless and have no paging space.
  • This has very important implications for programs that exceed physical memory. For example, most compute nodes have only 16-64 GB of physical memory.
  • Programs that exceed physical memory will terminate with an OOM (out of memory) error and/or segmentation fault.
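
Because there is no paging space, it is worth estimating a per-task memory budget before launching a job. The following is a rough sketch, assuming memory is shared evenly among the tasks on a node and ignoring what the operating system itself uses. The helper mem_per_task_mb is hypothetical and the node sizes are examples only.

```shell
# Hypothetical helper: even share of node memory per task, in MB (integer division).
# $1 = node memory in GB, $2 = tasks per node.
mem_per_task_mb() {
    echo $(( $1 * 1024 / $2 ))
}

mem_per_task_mb 64 36   # 64 GB node, 36 tasks per node
mem_per_task_mb 16 16   # 16 GB node, 16 tasks per node

# On a Linux compute node, the installed physical memory can be read with:
#   grep MemTotal /proc/meminfo
```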

Process/Thread Binding to Cores

  • By default, jobs run on LC's Linux clusters have their processes bound to the available cores on a node. For example:
% srun -l -n2 numactl -s | grep physc | sort
0: physcpubind: 0 1 2 3 4 5 6 7
1: physcpubind: 8 9 10 11 12 13 14 15

% srun -l -n4 numactl -s | grep physc | sort
0: physcpubind: 0 1 2 3
1: physcpubind: 4 5 6 7
2: physcpubind: 8 9 10 11
3: physcpubind: 12 13 14 15

% srun -l -n8 numactl -s | grep physc |sort
0: physcpubind: 0 1
1: physcpubind: 2 3
2: physcpubind: 4 5
3: physcpubind: 6 7
4: physcpubind: 8 9
5: physcpubind: 10 11
6: physcpubind: 12 13
7: physcpubind: 14 15
  • Binding processes to cores may improve performance by keeping cache local to cores.
  • If a process is multi-threaded (such as with OpenMP), the threads will run on any of the cores bound to their process.
  • To bind an OpenMP thread to a single core, the OMP_PROC_BIND environment variable can be used (set to "TRUE").
  • Additionally, LC provides a couple of useful utilities for binding processes and threads to cores.
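
Separately from those utilities, the default pattern in the numactl output above can be read as an even, contiguous split of a node's cores among its tasks. The following sketch reproduces that pattern for illustration only. The helper binding_map is hypothetical, it does not query Slurm, and it assumes the task count divides the core count evenly.

```shell
# Hypothetical helper: print the expected core set for each task on one node.
# $1 = cores per node, $2 = tasks on the node.
binding_map() {
    cores=$1
    tasks=$2
    per_task=$(( cores / tasks ))
    t=0
    while [ "$t" -lt "$tasks" ]; do
        first=$(( t * per_task ))
        last=$(( first + per_task - 1 ))
        line="$t: physcpubind:"
        c=$first
        while [ "$c" -le "$last" ]; do
            line="$line $c"
            c=$(( c + 1 ))
        done
        echo "$line"
        t=$(( t + 1 ))
    done
}

binding_map 16 4
```

For a threaded code, each task's threads would share the cores on its line unless OMP_PROC_BIND is set as described above.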

Vectorization

  • Historically, the Xeon architecture has provided support for SIMD (Single Instruction Multiple Data) vector instructions through Intel's Streaming SIMD Extensions (SSE, SSE2, SSE3, SSE4) instruction sets.
  • AVX - Advanced Vector Extension instruction set (2011) improved on SSE instructions by increasing the size of the vector registers from 128-bits to 256-bits. AVX2 (2013) offered further improvements, such as fused multiply-add (FMA) instructions to double FLOPS.
  • The primary purpose of these instructions is to increase CPU throughput by performing operations on vectors of data elements, rather than on single data elements. For example:
    Instruction Set    Single-precision Flops/Clock    Double-precision Flops/Clock
    SSE4               8                               4
    AVX                16                              8
    AVX2               32                              16
  • Sandy Bridge-EP (TLCC2) processors support SSE and AVX instructions.
  • Broadwell (CTS-1) processors support SSE, AVX and AVX2.
  • To take advantage of the potential performance improvements offered by vectorization, all you need to do is compile with the appropriate compiler flags. Some recommendations are shown in the table below.
    Compiler    SSE Flag           AVX Flag                 Reporting
    Intel       -vec (default)     -axAVX, -axCORE-AVX2     -qopt_report
    PGI         -Mvect=simd:128    -Mvect=simd:256          -Minfo=all
    GNU         -O3                -mavx, -mavx2            -fopt-info-all
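
The Flops/Clock figures for SSE4, AVX and AVX2 given earlier in this section follow from the vector register width. As a sketch of the arithmetic: flops per clock = (register bits / element bits) * operations per lane per clock. The operations-per-lane values used here are assumptions that the text does not state: 2 for SSE and AVX (one add plus one multiply per clock) and 4 for AVX2 (two FMA instructions per clock, each counted as two operations). The helper flops_per_clock is hypothetical.

```shell
# Sketch: flops per clock = (register bits / element bits) * operations per lane per clock.
# The operations-per-lane values are assumptions, not taken from the text above.
# $1 = register bits, $2 = element bits, $3 = operations per lane per clock.
flops_per_clock() {
    echo $(( $1 / $2 * $3 ))
}

flops_per_clock 128 32 2   # SSE4, single precision
flops_per_clock 256 64 2   # AVX, double precision
flops_per_clock 256 32 4   # AVX2, single precision
```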

Hyper-threading

  • On Intel processors, hyper-threading enables 2 hardware threads per core.
  • Hyper-threading benefits some codes more than others. Tests performed on some LC codes (pF3D, IMC, Ares) showed improvements in the 10-30% range. Your mileage may vary.
  • On TOSS 3 systems, hyper-threading is turned on by default. Details are available at: lc.llnl.gov/confluence/display/TCE/Hyper-Threading (authentication required).

Clusters Without an Interconnect (serial and single-node jobs)

  • The agate, borax and rztrona clusters fall into this category.
  • Limited to serial and single-node parallel jobs.
  • Multiple users and jobs can run on a single node (shared node).
  • Jobs are allocated ONE core by default. For this reason, it is very important that you tell the scheduler how many cores your job actually requires:
    • For batch jobs, be sure to include in your job script:
      • Slurm: #SBATCH -n #cores
      • Moab: #MSUB -l ttc=#cores
    • For interactive jobs, be sure to use the srun -c flag.
  • MPI jobs: these are started with the srun command as described previously, but because there is no interconnect, jobs are limited to a single node and communications are done in shared memory.
  • Pthreads and OpenMP jobs can be run as usual - keeping in mind the required number of cores per node, and that a node may be shared with other users.
  • Important: Please see the Running on Serial Clusters section of the Slurm and Moab tutorial for further discussion on running jobs on these clusters.

Batch Systems

  • The batch system used depends upon the cluster:
    • Linux TOSS 3 clusters: Run Slurm stand-alone with Moab wrappers
    • CORAL Early Access clusters: Run Spectrum LSF with no wrappers
  • Because there is so much to discuss, we won't say much more here.
  • The Slurm and Moab batch systems are covered in-depth in LC's Slurm and Moab tutorial and Running Jobs.
  • Spectrum LSF is covered under Running Jobs.

Miscellaneous Topics

Clusters with GPUs

Big Data at LC

  • Recently "Big Data" cluster computing frameworks have become popular. There has been interest in running some frameworks, such as Hadoop and Spark, on LC clusters.
  • LC clusters differ from most "Big Data" cluster architectures:
    • LC uses a different scheduler/resource manager (Moab/Slurm)
    • LC uses a networked file system (Lustre)
    • Most LC systems have no local disk
  • LC provides Magpie, a set of scripts to provide a framework to run several popular "Big Data" frameworks on LC systems.
    • Set up a Hadoop, Spark, etc. environment to run Big Data jobs
    • Work with LC file systems and scheduler/resource manager
    • Hide a number of LC-isms from users
    • Provide secure mechanisms to run within LC environment
    • Magpie instructions and information can be found on LC Confluence: lc.llnl.gov/confluence/display/BigData/Magpie+Guide
  • Catalyst is a special LC cluster dedicated to Big Data research and testing.

Green Data Oasis (GDO)

  • Green Data Oasis (GDO) is a large data store (1.4 PB as of Nov 2019) on the LLNL open network.
  • Purpose:
    • Facilitate the sharing of scientific data with external collaborators by providing an easy way to share data
    • GDO is designed as a data portal and is not intended for data analysis or other CPU-intensive activities.
  • An LC-managed program for the Multiprogrammatic and Institutional Computing (M&IC) Initiative and Laboratory Science and Technology Office (LSTO).
  • More information: hpc.llnl.gov/data-vis/green-data-oasis

Security Reminders

  • Follow procedures when escorting into limited areas
  • Restrictions on cell phones, cameras, laptops and other items
  • Separation between classified and unclassified equipment, including media
  • Never share your password/account with anyone (duh!!). SCF passwords should be regarded as Secret Restricted Data (SRD)
  • Keep classified information off unclassified systems - including email
  • Ask an Authorized Derivative Classifier (ADC) if you're not sure
  • LC can assist with the disposal of classified data
  • For the full story, visit the LLNL security web page located at: security-r.llnl.gov (LLNL internal)

Where to Get Information & Help

LC Hotline

  • The LC Hotline staff provide walk-in, phone and email assistance weekdays 8:00am - noon, 1:00pm - 4:45pm.
  • Walk-in Consulting:
    • On-site users can visit the LC help desk consultants in Building 453, Room 1103.
    • Note that this is a Q-clearance area.
  • Phone:
    • (925) 422-4531 - Main number
    • 422-4532 - Direct phone line for technical consulting help
    • 422-4533 - Direct phone line for support help (accounts, passwords, forms, etc)
  • Email:
 

LC Users Home Page: hpc.llnl.gov

  • hpc.llnl.gov: LC maintains extensive web documentation for all systems and for computing in the LC environment.
  • A few highlights:
    • Accounts - how to request an account; forms
    • Access Information - how to access and log in to LC systems
    • Training - online tutorials and workshops
    • Compute Platforms - complete list with details for all LC systems
    • Machine Status - shows current OCF machine status with links to detailed information such as MOTD, currently running jobs, configuration, announcements, etc.
    • Software - including compilers, tools, debuggers, visualization, math libs
    • Running Jobs - including using LC's workload managers
    • Documentation - user manuals for a range of topics, technical bulletins, user meetings slides
    • Getting Help - how to contact the LC Hotline
  • Some web pages are password protected. If prompted to enter a userid/password, use your OTP login.
  • Some web pages may only be accessed from LLNL machines or by using one of the LC Remote Access Services covered previously.

Lorenz User Dashboard: mylc.llnl.gov

  • Provides a wealth of real-time information in a user-friendly dashboard
  • Simply enter "mylc" into your browser's address bar. The actual URL is: lc.llnl.gov/lorenz/mylc/mylc.cgi

Login Banner

  • Login banner / MOTD may be very important!
    • News topics for LC, for the login system
    • Some configuration information
    • Useful references and contact information
    • System status information
    • Quota and password expiration warnings also appear when you log in

News Items

  • News postings on each LC System:
    • Unread news items appear with login messages
    • news -l - list all news items
    • news -a - display content of all news messages
    • news -n - lists unread messages
    • news -s - shows number of unread items
    • news item - shows specified news item
    • You can also list/read the files in /var/news on any system. This is useful when you're searching for a topic you've already read and can't remember the news item name. You can also "grep" on these files.
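
The grep approach mentioned above might look like the following sketch. The function name search_news and the NEWS_DIR override are hypothetical conveniences; the only detail taken from the text is that news items live as files under /var/news.

```shell
#!/bin/bash
# Sketch: list news items whose text mentions a keyword (case-insensitive).
# NEWS_DIR is an assumed override for testing; it defaults to /var/news.

search_news() {
    local keyword="$1"
    local dir="${NEWS_DIR:-/var/news}"
    # -r: recurse, -i: ignore case, -l: print only the names of matching files
    grep -ril -- "$keyword" "$dir"
}
```

For example, `search_news lustre` would print the paths of any news files that mention "lustre"; the file name can then be passed to `news item` to read it.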

Machine Email Lists

  • Machine status email lists exist for all LC machines
  • Provide important, timely information not necessarily announced elsewhere
  • ocf-status@llnl.gov and scf-status@llnl.gov are general lists for all users
  • Plus each machine has its own list, for example: zin-status@llnl.gov.
  • The LC Hotline initially populates a list with subscribers, but you can subscribe/unsubscribe yourself anytime using the listserv.llnl.gov website.

LC User Meetings

  • When held, meetings are usually scheduled for the first Tuesday of the month at 9:30 am
  • Building 132 Auditorium (or as otherwise announced)
  • Agenda and viewgraphs are posted on the LC Home Page (hpc.llnl.gov). See "Documentation" and look for "User Meeting Viewgraphs". Note that these are LLNL internal web pages.

Exercise 2

Compiling, running, job and system status information:

  • Get information about running and queued jobs
  • Get compiler information
  • Compile and run serial programs
  • Compile and run parallel MPI and OpenMP programs, both interactively and in batch
  • Check hyper-threading
  • Get online system status information (and more)