LLamaMe is LC's locally hosted LLM service. LC now provides limited-availability API access to locally hosted open-source LLMs served with the vLLM library in the CZ and RZ. Note that the hosted models are subject to change, as they may be upgraded in the future.
Getting a LLamaMe API Key
Provision an API key to access the LLamaMe endpoint through the LaunchIT catalog. For further information, please visit our documentation on LaunchIT.
Once in the LaunchIT catalog, select the workspace for the project that will use API access. Note that keys can also be provisioned directly from a workspace.
![Persistent Data Services](/sites/default/files/styles/large/public/2025-02/Screenshot%202025-01-10%20at%201.07.32%E2%80%AFPM.png?itok=8eIwzffD)
Once your API key has been created, you may access it at any time through your workspace dashboard. Your LLamaMe API key will be listed as a separate resource under your workspace dashboard, and the LLamaMe endpoint and model you have provisioned the key for will be displayed along with the key.
![](/sites/default/files/styles/large/public/2025-02/Screenshot%202025-02-07%20at%204.49.23%E2%80%AFPM.png?itok=szUF3A2P)
Note that keys expire every 30 days and must be regenerated to maintain your access to the LLamaMe API. API keys may be regenerated at any time through your LaunchIT workspace dashboard.
![](/sites/default/files/styles/large/public/2025-02/Screenshot%202025-02-07%20at%204.46.43%E2%80%AFPM.png?itok=WieUb4ZP)
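Because keys expire, long-running scripts can begin failing with authentication errors once a key lapses. Below is a minimal sketch, assuming the same OpenAI-compatible Python client used later on this page, of one way to detect an expired key up front; the endpoint placeholder is the one shown in your LaunchIT workspace dashboard.

```python
import os

import openai
from openai import OpenAI

# Placeholder: substitute the endpoint from your LaunchIT workspace dashboard.
client = OpenAI(base_url="<LLamaMe endpoint>", api_key=os.environ["API_KEY"])

try:
    client.models.list()
    print("API key is valid.")
except openai.AuthenticationError:
    # A 401 from the server typically means the key has lapsed (keys expire every 30 days).
    print("API key rejected; regenerate it from your LaunchIT workspace dashboard.")
```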
Getting Started with the LLamaMe API
Set your API_KEY as an environment variable (it can be copied from your LaunchIT workspace dashboard):
```bash
export API_KEY=<your API key>
```
Here's an example Python script that checks which model is being hosted and asks the LLM to tell you a joke. Replace the endpoint and model placeholders with the endpoint and model displayed in your LaunchIT workspace dashboard.
```python
import os

from openai import OpenAI

API_KEY = os.environ.get("API_KEY")

client = OpenAI(base_url="<LLamaMe endpoint>", api_key=API_KEY)

# Check which model LC is hosting
print(client.models.list())

chat_response = client.chat.completions.create(
    model="<LLamaMe model>",
    messages=[
        {"role": "user", "content": "Tell me a joke."},
    ],
)
print("Chat response:", chat_response)  # Enjoy!
```
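Because LLamaMe is served with vLLM's OpenAI-compatible server, standard OpenAI client features such as streaming generally work as well. The sketch below, under that assumption, prints tokens as they arrive rather than waiting for the complete reply; if your deployment does not support streaming, use the non-streaming call above.

```python
import os

from openai import OpenAI

client = OpenAI(base_url="<LLamaMe endpoint>", api_key=os.environ["API_KEY"])

# Request a streamed response: chunks arrive as the model generates tokens.
stream = client.chat.completions.create(
    model="<LLamaMe model>",
    messages=[{"role": "user", "content": "Tell me a joke."}],
    stream=True,
)

for chunk in stream:
    # Each chunk carries a small delta of the assistant's reply; the final
    # chunk may carry no content, so guard against None.
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```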