How do I use an online LLM API instead of a locally loaded vLLM model?
Answered by cpfiffer, Dec 2, 2024
Please review this doc. vLLM is OpenAI compliant, meaning you can just use the openai Python library and point it at a different base_url for whatever your inference server is. For example:

import openai
from pydantic import BaseModel

class Testing(BaseModel):
    """
    A class representing a testing schema.
    """
    name: str
    age: int

openai_client = openai.OpenAI(
    base_url="http://0.0.0.0:1234/v1",
    api_key="dopeness"
)

# Make a request to the local LM Studio server
response = openai_client.beta.chat.completions.parse(
    model="hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF",
    messages=[
        {"role": "system", "content": "You are like so good at whatever you do."},
        {"role": "user", "content": "My name is Cameron and I am 28 years old. What's my name and age?"}
    ],
    response_format=Testing
)
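To target an online OpenAI-compatible API instead of a local vLLM or LM Studio server, only base_url, api_key, and the model name need to change. The sketch below is a minimal example, assuming a hosted endpoint at https://api.openai.com/v1, an API key stored in the OPENAI_API_KEY environment variable, and "gpt-4o-mini" as a placeholder model name; substitute whatever your provider documents.

import os
import openai
from pydantic import BaseModel

class Testing(BaseModel):
    """A class representing a testing schema."""
    name: str
    age: int

# Point the client at a hosted, OpenAI-compatible endpoint instead of a
# local server. The URL, environment variable, and model name here are
# placeholders -- use your provider's values.
openai_client = openai.OpenAI(
    base_url="https://api.openai.com/v1",
    api_key=os.environ["OPENAI_API_KEY"],
)

response = openai_client.beta.chat.completions.parse(
    model="gpt-4o-mini",  # placeholder: any model your provider serves
    messages=[
        {"role": "user", "content": "My name is Cameron and I am 28 years old. What's my name and age?"},
    ],
    response_format=Testing,
)

# The parsed attribute holds a validated Testing instance.
print(response.choices[0].message.parsed)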