this post was submitted on 28 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


There has been a lot of movement around and below the 13b parameter bracket in the last few months, but it's wild to think the best 70b models are still Llama 2 based. Why is that?

We have 13b models like the 8-bit bartowski/Orca-2-13b-exl2 approaching or even surpassing the best 70b models now.

[–] zBlackVision11@alien.top 1 points 2 years ago (3 children)

Qwen 72b is coming in 2 days 👍 Will be a real beast.

[–] ninjasaid13@alien.top 1 points 2 years ago

2 days? Bro if they said November and haven't released it by now, it's not two days.

[–] a_beautiful_rhind@alien.top 1 points 2 years ago (1 children)

I heard. If it comes out, then it might finally be worth exllama supporting it. I heard the 14b was fairly strong.

[–] zBlackVision11@alien.top 1 points 2 years ago

Yes, I also hope it gets exllamav2 support. Here is an issue regarding it: (Qwen model not supported) · Issue #160 · turboderp/exllamav2 (github.com)

[–] FaustBargain@alien.top 1 points 2 years ago (1 children)

Qwen 72b

I can't seem to find anything about Qwen 72b except two tweets from a month ago that said it was coming out. Who makes it? What's it trained on? Any details?

[–] Thireus@alien.top 1 points 2 years ago

Curiously, none of the previous comment's upvoters has provided an answer to your question.