Start here: Preferred, policy-aligned options (recommended)
Use Option 1 or Option 2 first. They are designed to address security controls, policy compliance, and operational support with minimal user burden.
Option 1 (Preferred): LivAI Endpoints
What
Centralized, managed access to commercial and advanced LLMs (e.g., OpenAI) via a secure API.
How to access
Pros
- Fast setup
- Access to recent commercial models
- Institution tracks which models meet policy guidelines
- Usage tracking and budget management
- Access a subset of models and LLNL data interactively via the LivChat interface
Caveats / constraints:
- $500/year/user cap (can request more for projects)
- CUI only; no PII or classified data allowed
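If the LivAI endpoints expose an OpenAI-compatible API, a request might be assembled as in the sketch below. The base URL, environment variable names, and model name are placeholders, not confirmed LivAI values; check the LivAI documentation for the actual endpoint and authentication details.

```python
import json
import os
import urllib.request

# Placeholder values -- substitute the real base URL and key variable
# from the LivAI documentation.
LIVAI_BASE_URL = os.environ.get("LIVAI_BASE_URL", "https://livai.example.llnl.gov/v1")
LIVAI_API_KEY = os.environ.get("LIVAI_API_KEY", "")


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for a LivAI endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{LIVAI_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {LIVAI_API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Build (but do not send) a request; it could then be dispatched with
# urllib.request.urlopen(req) once the real endpoint and key are set.
req = build_chat_request("gpt-4o", "Summarize this abstract in one sentence.")
```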
Option 2 (Preferred): LLamaMe (LC-Hosted Open Weight LLMs)
What
Locally hosted open-source LLMs (for example, Llama, Mistral, Codestral) available in LC’s Collaboration, Restricted, and Secure zones.
How to access
- Request an API key via the LaunchIT catalog
Pros
- Fast setup
- Data stays within LC zones: process data up to the classification supported by each zone
- LC tracks which models meet policy guidelines
- No additional spend
- Good for workflows requiring open-source models
Caveats / constraints:
- Rate limits (default: 20 requests/min)
- Keys expire every 30 days
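The default limit of 20 requests/min can be respected client-side with a small sliding-window throttle before each API call. This is an illustrative sketch, not part of the LLamaMe service; the class name and its defaults are assumptions.

```python
import time
from collections import deque
from typing import Optional


class RateLimiter:
    """Client-side throttle for LLamaMe's default 20 requests/minute limit."""

    def __init__(self, max_requests: int = 20, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()  # send times within the current window

    def wait_time(self, now: Optional[float] = None) -> float:
        """Seconds to wait before the next request is allowed (0 if clear)."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._timestamps[0])

    def record(self, now: Optional[float] = None) -> None:
        """Record that a request was just sent."""
        self._timestamps.append(time.monotonic() if now is None else now)


# Usage: throttle, then send.
limiter = RateLimiter()
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()
# ... send the API request here ...
```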
Option 3 (Fallback): Self-Hosting on LC Compute Nodes
Use this approach only when the preferred options cannot satisfy a specific technical requirement (for example, custom model builds, unusual model sizes, specialized frameworks, or multi-node inference). This option carries the most user responsibility and should be pursued in coordination with your ISSO.
What
Run your own LLMs on LC batch compute nodes using tools like vLLM, Llama.cpp, or Mastodon for multi-node, multi-GPU distributed inference.
How to get started
- User responsibility: You must ensure any model you download or run is approved for use at LLNL. (See: LC LLM Model Download Decision Guide)
- Pick an inference framework, e.g., vLLM or Llama.cpp
- Work with your ISSO on model approval and any security or policy questions
Pros
- Full control over models, parameters, and data
- Supports very large models (multi-node, multi-GPU)
Caveats / constraints:
- Requires exclusive resources, such as an LC batch compute node
- Do not run services that open ports on shared resources, such as login nodes
- You manage the toolchain, containers, dependencies, and scaling
- You monitor which models meet policy requirements, including whether approvals are revoked
- Contact your ISSO with questions about model approval
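As one illustration of the self-hosting path, a `vllm serve` launch command for a batch allocation might be assembled as below. The model path and flag values are hypothetical; confirm them against the vLLM documentation and your allocation's GPU count. Binding to 127.0.0.1 keeps the service's port off shared interfaces, consistent with the caveat above.

```python
import shlex


def vllm_serve_command(model_path: str, port: int = 8000, tp_size: int = 1) -> list:
    """Compose a `vllm serve` command for an LC batch allocation.

    The model path is a placeholder; --tensor-parallel-size should match
    the number of GPUs available in the allocation.
    """
    return [
        "vllm", "serve", model_path,
        "--host", "127.0.0.1",  # never expose ports beyond the compute node
        "--port", str(port),
        "--tensor-parallel-size", str(tp_size),
    ]


# Example: a hypothetical local model directory, served across 4 GPUs.
cmd = vllm_serve_command("/p/models/llama-3-8b", tp_size=4)
print(shlex.join(cmd))
```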
