swagonflyyyy

joined 10 months ago
[–] swagonflyyyy@alien.top 1 points 9 months ago

Inb4 TheBloke quantizes it down to about 100B in size.

[–] swagonflyyyy@alien.top 1 points 9 months ago (2 children)

Mistral-7B-Instruct (Q4_K quant) and OpenHermes-2.5-Mistral-7B (Q4_K quant). Still testing the waters, but starting with these two.
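
(Rough sketch of how quants like these can be loaded locally with llama-cpp-python; the model path, prompt, and settings below are placeholders, not my exact setup, and oobabooga works just as well as a front end.)

    # Rough sketch: loading a Q4_K GGUF quant with llama-cpp-python.
    # The model path is a placeholder -- point it at whatever quant you downloaded.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # placeholder path
        n_ctx=4096,    # context window
        n_threads=8,   # tune for your CPU
    )

    out = llm(
        "[INST] Give a one-sentence greeting from a village blacksmith. [/INST]",
        max_tokens=64,
        temperature=0.7,
    )
    print(out["choices"][0]["text"])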

[–] swagonflyyyy@alien.top 1 points 9 months ago

Hermes is the messenger of the gods; it's a metaphor. I'm sure the rest of the names have their own meanings as well.

[–] swagonflyyyy@alien.top 1 points 10 months ago (1 children)

Mistral-7B-Instruct can get you pretty far. Even the quantized model has been pretty useful for me.

 

Hey guys,

I'm running the quantized version of Mistral-7B-Instruct and it's pretty fast and accurate for my use case. On my PC I'm generating approximately 4 tokens per second, which is good enough for the one-sentence responses I want my NPC characters to give.

After fiddling around with oobabooga a bit, I found out that you can make API calls against localhost and print out the generated text, which is exactly what I need for this to work.
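
(For reference, roughly what that call looks like; this assumes the OpenAI-compatible API the webui exposes on port 5000 when launched with --api, and the exact port/endpoint may differ depending on your version and flags. The prompt is just an example.)

    # Rough sketch of hitting the local text-generation-webui API with requests.
    # Assumes the server was started with --api and is listening on port 5000;
    # adjust the URL/port for your setup.
    import requests

    URL = "http://127.0.0.1:5000/v1/chat/completions"  # OpenAI-compatible endpoint

    payload = {
        "messages": [
            {"role": "user", "content": "Respond in one sentence as a grumpy innkeeper."}
        ],
        "max_tokens": 60,
        "temperature": 0.7,
    }

    resp = requests.post(URL, json=payload, timeout=60)
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])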

The issue I'm running into is this: if I were to make a game with AI-generated content, how can I make it easy for players to run their own localhost server and have the game perform API calls against it? For the inexperienced, setting all of this up would be a nightmare, and I don't want to alienate non-tech-savvy players.

[–] swagonflyyyy@alien.top 1 points 10 months ago

Hugging Face Transformers has such models available.
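
(For example, something along these lines; the model ID below is just one of the instruct models on the Hub, and you'd swap in whichever one fits your hardware.)

    # Rough sketch: pulling an instruct model through the transformers pipeline.
    # The model ID is just an example; any text-generation model on the Hub works.
    import torch
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="mistralai/Mistral-7B-Instruct-v0.1",  # example model ID
        torch_dtype=torch.float16,
        device_map="auto",  # requires accelerate
    )

    out = pipe("[INST] Say hello in one sentence. [/INST]", max_new_tokens=40)
    print(out[0]["generated_text"])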

[–] swagonflyyyy@alien.top 1 points 10 months ago

I barely use search anymore, unless it's for images or websites or something. I've found a lot of use for Bing Chat despite the hate it gets; it gives me a lot of utility value.