overview for Charuru

Venus-120b: A merge of three different models in the style of Goliath-120b in c/localllama@poweruser.forum

[–] Charuru@alien.top 1 points 9 months ago

I don't think so. This is something you do when you're GPU-poor; closedai would just not undertrain their models in the first place.

Why is Mistral-7b so capable? Any ideas re: dataset? in c/localllama@poweruser.forum

[–] Charuru@alien.top 1 points 10 months ago

The results are okay, but I'm hard-pressed to call it "very capable". My perspective on it is that other bigger models are making mistakes they shouldn't be making because they were "trained wrong".

What is the best code generation model aside from gpt-4? in c/localllama@poweruser.forum

[–] Charuru@alien.top 1 points 10 months ago

I've not heard of text-generator.io. Is it as performant as vLLM on multibatch, or is it a wrapper around it?

What is the best code generation model aside from gpt-4? (alien.top)

submitted 10 months ago by Charuru@alien.top to c/localllama@poweruser.forum

23 comments

I'm using gpt-4 at the moment and losing a lot of money on it. It works great, but for the amount of code I'm generating I'd rather have a self-hosted model. What should I look into?