RWKV v5 7b, Fully Open-Source, 60% trained, approaching Mistral 7b in abilities or surpassing it. (alien.top)

submitted 2 years ago by vatsadev@alien.top to c/localllama@poweruser.forum

So RWKV v5 7B is 60% trained now. I saw that its multilingual performance is better than Mistral's now, and its English capabilities are close to Mistral's, except on HellaSwag and ARC, where it's a little behind. All the benchmarks are on the RWKV Discord, and you can google the pros/cons of RWKV, though most of them are about v4.

Thoughts?

Maykey@alien.top 1 point 2 years ago

I don't think a linear transformer has a serious chance to beat a standard transformer with the same number of parameters.

I do. Transformers are not good on Long Range Arena. They perform well only when they are backed by better architectures, as in the case of MEGA.