this post was submitted on 24 Nov 2023

LocalLLaMA


Community for discussing Llama, the family of large language models created by Meta AI.


Hey guys,

I'm running the quantized version of Mistral-7B-Instruct and it's pretty fast and accurate for my use case. On my PC I'm generating approximately 4 tokens per second, which is good enough for my goal of generating one-sentence responses for my NPC characters.

After fiddling around with oobabooga a bit, I found out that you can make API calls against localhost and print out the generated text, which is exactly what I need for this to work.
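For reference, here's a minimal sketch of such a call in Python, assuming text-generation-webui is running locally with its API enabled on the default port. The exact endpoint path and payload fields vary by version (older builds used `/api/v1/generate`, newer ones expose an OpenAI-compatible API), so treat the URL and fields below as assumptions to check against your install:

```python
import requests

# Assumes text-generation-webui is running locally with the OpenAI-compatible
# API enabled (launched with --api); host, port, and fields vary by version.
URL = "http://127.0.0.1:5000/v1/completions"

def npc_line(prompt: str) -> str:
    payload = {
        "prompt": prompt,
        "max_tokens": 48,    # one-sentence NPC responses stay short
        "temperature": 0.7,
        "stop": ["\n"],      # cut generation off after the first line
    }
    resp = requests.post(URL, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"].strip()

print(npc_line("The blacksmith greets the player warily:"))
```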

The issue I'm running into is: if I were to make a game with AI-generated content, how can I make it easy for players to run their own local server and make API calls from the game this way? For the inexperienced, setting all this up would be a nightmare, and I don't want to alienate non-tech-savvy players.

DarthNebo@alien.top · 11 months ago

The fastest way would be to embed ggerganov's server.cpp module (the llama.cpp server example) and make HTTP calls to it. It's way easier to package into other apps and supports parallel decoding, at ~30 tok/s on Apple Silicon (M1 Pro).
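To make that concrete: the llama.cpp server is a standalone binary, so a game could bundle it alongside its assets, launch it in the background, and query it, meaning players never touch a terminal. A rough Python sketch under those assumptions (the binary name, model path, and port below are placeholders; the `/completion` endpoint and `n_predict`/`content` fields match the llama.cpp server at the time of writing, but check your build):

```python
import subprocess
import time
import requests

# Hypothetical paths: the game would ship these alongside its own assets.
SERVER_BIN = "./bin/llama-server"  # compiled llama.cpp server binary
MODEL_PATH = "./models/mistral-7b-instruct.Q4_K_M.gguf"
PORT = 8080

# Launch the bundled server in the background on game start.
server = subprocess.Popen(
    [SERVER_BIN, "-m", MODEL_PATH, "--port", str(PORT)],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)

# Poll until the server accepts connections (model loading takes a while).
for _ in range(60):
    try:
        requests.get(f"http://127.0.0.1:{PORT}/health", timeout=1)
        break
    except requests.ConnectionError:
        time.sleep(0.5)

def npc_line(prompt: str) -> str:
    # /completion is the llama.cpp server's generation endpoint;
    # n_predict caps the length for one-sentence NPC dialogue.
    resp = requests.post(
        f"http://127.0.0.1:{PORT}/completion",
        json={"prompt": prompt, "n_predict": 48, "stop": ["\n"]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["content"].strip()

print(npc_line("The innkeeper tells the player about the storm:"))
server.terminate()  # shut the server down when the game exits
```

The parallel decoding mentioned above is handled server-side (recent builds expose batching/slot options on the command line), which is what makes serving several NPCs at once viable from a single local instance.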