LLamaMe: SCF LLM Additions

We are excited to announce an expanded selection of on-prem large language models (LLMs) available through the LLamaMe service in the SCF. Existing LLamaMe API keys provisioned through LaunchIT now provide access to all of the models listed below.

For more information, visit the LLamaMe documentation.

SCF LLamaMe Models

Large Language Model	Max Context Length (tokens)	Infrastructure GPU
meta-llama/Llama-3.3-70B-Instruct	110000	2 H100 80GB RAM
gpt-oss-20b	128000	1 H100 80GB RAM
intfloat/e5-mistral-7b-instruct*	32768	1 H100 80GB RAM
Meta-Llama-3.1-8B-Instruct	32768	24 AMD MI250 120GB
Llama-4-Scout-17B-16E-Instruct	128000
gpt-oss-120b	131072

*embedding model
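
As a rough illustration of how the limits in the table might be used on the client side, the sketch below copies the values above into a lookup table and checks whether a request would fit before it is sent. The names MAX_CONTEXT_TOKENS, fits_in_context, and models_that_fit are hypothetical and not part of the LLamaMe API. It also assumes that each limit is a token count covering both the prompt and the generated output, which the table does not state.

```python
# Hypothetical client-side lookup; the numbers are copied from the table above.
MAX_CONTEXT_TOKENS = {
    "meta-llama/Llama-3.3-70B-Instruct": 110000,
    "gpt-oss-20b": 128000,
    "intfloat/e5-mistral-7b-instruct": 32768,  # embedding model
    "Meta-Llama-3.1-8B-Instruct": 32768,
    "Llama-4-Scout-17B-16E-Instruct": 128000,
    "gpt-oss-120b": 131072,
}


def fits_in_context(model: str, prompt_tokens: int, max_new_tokens: int = 0) -> bool:
    """Return True if prompt plus requested output stays within the model's limit.

    Assumption: the limit covers prompt and output tokens combined.
    Raises KeyError for a model name that is not in the table.
    """
    limit = MAX_CONTEXT_TOKENS[model]
    return prompt_tokens + max_new_tokens <= limit


def models_that_fit(prompt_tokens: int, max_new_tokens: int = 0) -> list[str]:
    """List the models from the table whose limit can hold the request."""
    return [
        name
        for name in MAX_CONTEXT_TOKENS
        if fits_in_context(name, prompt_tokens, max_new_tokens)
    ]
```

For example, under these assumptions a 120000-token prompt with room for 4000 output tokens would fit only gpt-oss-20b, Llama-4-Scout-17B-16E-Instruct, and gpt-oss-120b. Token counts depend on each model's tokenizer, so the same text may measure differently across models.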