this post was submitted on 08 Nov 2023

LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.

Hi all, I admit I didn't pay much attention to OpenAI's dev day, so I got tripped up this evening when I did a virtual env refresh and all my local LLM access code broke. Turns out they released a major new version of their Python library (1.0) with breaking API changes. This is mostly news for folks like me who either maintain an LLM-related project or just prefer to write their own API access clients. I think we have enough of us here to share some useful notes -- I see y'all posting Python code now and then.

Anyway, the best news is that they deprecated the nasty old "mutate the imported module's globals" approach in favor of something more encapsulated: you now instantiate a client object and everything hangs off that.
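
For contrast, here's roughly what the deprecated pattern looked like, from memory -- treat the details as approximate:

# Deprecated pre-1.0 style: configure the imported module's global state
import openai
openai.api_key = 'dummy'
openai.api_base = 'http://127.0.0.1:8000/v1'

# Module-level call rather than a method on a client instance
response = openai.ChatCompletion.create(
    model='dummy',
    messages=[{'role': 'user', 'content': 'Say this is a test'}],
)
print(response['choices'][0]['message']['content'])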

Here's a quick example of using the updated API with a local LLM (at localhost). Works with my llama-cpp-python hosted LLM. Their new API docs are a bit on the sparse side, so I had to do some spelunking in the upstream code to straighten it all out.

from openai import OpenAI

# Local servers generally ignore the API key, but the client requires one
client = OpenAI(api_key='dummy', base_url='http://127.0.0.1:8000/v1/')

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        }
    ],
    # llama-cpp-python serves whatever model it was launched with, so the
    # name here is effectively ignored; other servers may check it
    model="dummy",
)

print(chat_completion.choices[0].message.content)

The final line prints just the response text of the first choice -- the same thing you'd find at choices[0].message.content in the underlying JSON. Note that the response is now a typed object with attribute access, not a plain dict.
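
Streaming changed shape too: you now iterate over typed chunk objects rather than raw dicts. Here's a minimal sketch, assuming the same client and local server as above:

# Request a streamed response; create() then returns an iterator of chunks
stream = client.chat.completions.create(
    messages=[{"role": "user", "content": "Say this is a test"}],
    model="dummy",
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental delta; content can be None
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()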

I'll continue to maintain notes in this ticket as I update OgbujiPT (open source client-side LLM toolkit), but I'll also update this thread with any other really interesting bits I come across.
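
One such bit already: there's now an AsyncOpenAI class mirroring the sync client, handy if your toolkit is asyncio-based. A minimal sketch, same dummy-server assumptions as above:

import asyncio
from openai import AsyncOpenAI

async def main():
    # Same constructor arguments as the synchronous client
    client = AsyncOpenAI(api_key='dummy', base_url='http://127.0.0.1:8000/v1/')
    chat_completion = await client.chat.completions.create(
        messages=[{"role": "user", "content": "Say this is a test"}],
        model="dummy",
    )
    print(chat_completion.choices[0].message.content)

asyncio.run(main())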

DreamGenX@alien.top:

I have been using the 1.0 preview of the Python API client for some time with vLLM's OpenAI-compatible server, and it worked well -- at least I did not notice any issues.
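
If anyone wants to try the same against vLLM, one difference from the llama-cpp-python setup above, as far as I can tell, is that vLLM checks the model field against the model it was launched with, so a dummy name won't fly. A minimal sketch, assuming a vLLM server on the default port serving mistralai/Mistral-7B-Instruct-v0.1 (an arbitrary example; substitute your own):

from openai import OpenAI

# vLLM's OpenAI-compatible server also ignores the API key...
client = OpenAI(api_key='dummy', base_url='http://127.0.0.1:8000/v1')

chat_completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "Say this is a test"}],
    # ...but the model name must match what the server is actually serving.
    # Example name only; use whatever your vLLM instance was launched with.
    model="mistralai/Mistral-7B-Instruct-v0.1",
)
print(chat_completion.choices[0].message.content)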