I'm hosting Goliath-120B with a much better quant (4.5bpw exl2, needs 3x3090) and it's scary; it feels alive sometimes. Also, with exllamav2 it runs at about the same speed as a 70B model.
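For reference, loading an exl2 quant split across three 3090s looks roughly like this with the exllamav2 Python API. This is a minimal sketch; the model path, per-GPU memory split, and sampling settings are my assumptions, not from the post:

```python
# Minimal sketch, assuming the exllamav2 Python API.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/Goliath-120B-exl2-4.5bpw"  # hypothetical local path
config.prepare()

model = ExLlamaV2(config)
# 120B params at 4.5 bits/weight is roughly 68 GB of weights; spread the
# weights plus KV cache across three 24 GB RTX 3090s.
model.load(gpu_split=[20, 23, 23])  # GB reserved per GPU; tune to your setup

tokenizer = ExLlamaV2Tokenizer(config)
cache = ExLlamaV2Cache(model)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

print(generator.generate_simple("The model wrote back:", settings, 128))
```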
Because Llama-2-70B is similar or better in most metrics, and it's small enough not to need distributed inference.
LLMs on neuroengine.ai should support way more than 400 words; I don't know the exact limit.
Check Panchovix's repo on Hugging Face.