We are excited to announce that we have expanded the selection of on-prem large language models (LLMs) available through the LLamaMe service in the SCF. Existing LLamaMe API keys provisioned through LaunchIT now provide access to all of the models listed below.
For more information, visit the LLamaMe documentation.
SCF LLamaMe Models
| Large Language Model | Max Context Length (tokens) | GPU Infrastructure |
|---|---|---|
| meta-llama/Llama-3.3-70B-Instruct | 110000 | 2 H100 80GB RAM |
| gpt-oss-20b | 128000 | 1 H100 80GB RAM |
| intfloat/e5-mistral-7b-instruct* | 32768 | 1 H100 80GB RAM |
| Meta-Llama-3.1-8B-Instruct | 32768 | 24 AMD MI250 120GB |
| Llama-4-Scout-17B-16E-Instruct | 128000 | |
| gpt-oss-120b | 131072 | |
*embedding model
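
As a minimal sketch of how these models can be reached with an existing LLamaMe API key, the example below assumes the LLamaMe endpoint exposes an OpenAI-compatible chat completions API; the base URL and environment variable name are placeholders rather than actual service values, so check the LLamaMe documentation for the real ones.

```python
# Minimal sketch, not official client usage: assumes an OpenAI-compatible
# LLamaMe endpoint. The base URL and environment variable name are placeholders.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://llamame.example.scf/v1",  # placeholder endpoint URL
    api_key=os.environ["LLAMAME_API_KEY"],      # key provisioned via LaunchIT
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",  # any chat model from the table
    messages=[{"role": "user", "content": "Say hello from the SCF."}],
)
print(response.choices[0].message.content)
```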
