We are excited to announce that we have expanded the selection of on-prem large language models (LLMs) available through the LLamaMe service in the SCF. Existing LLamaMe API keys provisioned through LaunchIT now provide access to all of the models listed below.

For more information, visit the LLamaMe documentation.

SCF LLamaMe Models

Large Language Model               Max Context Length  Infrastructure GPU
meta-llama/Llama-3.3-70B-Instruct  110000              2 H100 80GB RAM
gpt-oss-20b                        128000              1 H100 80GB RAM
intfloat/e5-mistral-7b-instruct*   32768               1 H100 80GB RAM
Meta-Llama-3.1-8B-Instruct         32768               24 AMD MI250 120GB
Llama-4-Scout-17B-16E-Instruct     128000
gpt-oss-120b                       131072

*embedding model
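
As a rough illustration of how the maximum lengths in the table might guide model choice, the sketch below records the listed models in a small Python structure and filters them by the length a request needs. The names ModelInfo, MODELS, and models_fitting are hypothetical helpers written for this example and are not part of the LLamaMe service; the sketch also assumes the listed lengths are measured in tokens, which the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    """One row of the model table (hypothetical helper, not a LLamaMe API)."""
    name: str
    max_length: int          # maximum length from the table (assumed to be tokens)
    embedding: bool = False  # True for models marked with * in the table


# Values copied from the table above.
MODELS = [
    ModelInfo("meta-llama/Llama-3.3-70B-Instruct", 110000),
    ModelInfo("gpt-oss-20b", 128000),
    ModelInfo("intfloat/e5-mistral-7b-instruct", 32768, embedding=True),
    ModelInfo("Meta-Llama-3.1-8B-Instruct", 32768),
    ModelInfo("Llama-4-Scout-17B-16E-Instruct", 128000),
    ModelInfo("gpt-oss-120b", 131072),
]


def models_fitting(required_length, include_embedding=False):
    """Return names of models whose maximum length covers required_length."""
    return [
        m.name
        for m in MODELS
        if m.max_length >= required_length
        and (include_embedding or not m.embedding)
    ]


if __name__ == "__main__":
    # List the non-embedding models that can hold a 120000-length request.
    for name in models_fitting(120000):
        print(name)
```

The embedding model is excluded by default because it is marked separately in the table; pass include_embedding=True to list it as well.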