this post was submitted on 25 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.

Suppose I have multiple 7B models, each trained on one specific topic (e.g. roleplay, math, coding, history, politics...), and an interface that decides, depending on the context, which model to use. Could this outperform bigger models while being faster?
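
The "interface which decides which model to use" is essentially a router in front of the specialists. A minimal sketch of that idea, assuming a naive keyword-based router (the topic names and keyword lists here are made up for illustration; a real system would use a classifier or an embedding model):

```python
# Hypothetical topic router that dispatches a prompt to a specialist model.
# Topics and keyword lists are illustrative, not from the post.
TOPIC_KEYWORDS = {
    "coding": ["python", "function", "compile", "bug"],
    "math": ["integral", "equation", "solve", "proof"],
    "roleplay": ["character", "scene", "pretend"],
}

def route(prompt: str, default: str = "general") -> str:
    """Pick the specialist whose keywords best match the prompt."""
    text = prompt.lower()
    scores = {
        topic: sum(word in text for word in keywords)
        for topic, keywords in TOPIC_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to a general model when no specialist matches.
    return best if scores[best] > 0 else default

print(route("Can you solve this equation for x?"))  # -> math
```

The catch is that the router itself has to be accurate: a misrouted prompt lands on a specialist that may do worse than a single generalist model would have.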

[–] feynmanatom@alien.top 1 points 11 months ago (1 children)

Lots of rumors, but tbh I think it’s highly unlikely they’re using an MoE. MoE sparsity helps at batch size = 1, where each token only activates a few experts. At larger batch sizes, the tokens in a batch collectively route to nearly every expert, so you still need enough RAM to keep all the expert weights resident and you miss out on the point of using an MoE.

[–] remghoost7@alien.top 1 points 11 months ago

Lots of rumors...

Very true.

We honestly have no clue what's going on behind ClosedAI's doors.

I don't know enough about MoEs to say one way or the other, so I'll take your word for it. I'll have to do more research on them.