this post was submitted on 25 Nov 2023

LocalLLaMA

Community to discuss about Llama, the family of large language models created by Meta AI.

So RWKV 7B v5 is 60% trained now. I saw that its multilingual performance is better than Mistral's, and its English capabilities are close to Mistral's, except for HellaSwag and ARC, where it's a little behind. All the benchmarks are on the RWKV Discord, and you can Google the pros/cons of RWKV, though most of those cover v4.
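
For anyone who wants to poke at the current v5 World checkpoints themselves, here's a minimal sketch using the Hugging Face transformers route. The checkpoint id below is just a placeholder (I'm assuming whatever v5 upload is current on the RWKV HF org), so swap in the real 7B v5 name once it's up.

```python
# Minimal sketch: load an RWKV v5 "World" checkpoint via transformers.
# Assumption: the model id below is illustrative; check the RWKV org on
# the Hugging Face Hub for the actual 7B v5 upload.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "RWKV/rwkv-5-world-1b5"  # placeholder id, swap in the 7B v5 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("The RWKV architecture is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```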

Thoughts?

[–] ambient_temp_xeno@alien.top 1 points 11 months ago (2 children)
[–] _Lee_B_@alien.top 1 points 11 months ago (2 children)

The source is actually available (which is good), but sadly the dataset is not (which is bad, and makes it not truly open, since you can't reliably reproduce it).

[–] Disastrous_Elk_6375@alien.top 1 points 11 months ago (2 children)

Not looking to start drama, but I feel we're moving the goalposts a bit here... Source available and under a permissive license is open source.

I feel the discussion around training sets is too risky at this point. Everyone is doing at least gray-area stuff, using dubiously sourced material, and I feel like everyone wants to wait out some lawsuits before we can get honest disclosures about datasets.

[–] _Lee_B_@alien.top 1 points 11 months ago (1 children)

No, we're not. Not really.

You could call this "open source", yes, but only by a very narrow and worthless definition of the term, one that has always been controversial and abusive. What people MEAN when they say open source is "like Linux". Linux is based on, and follows, the principles of Free Software:

0) The freedom to run the program as you wish, for any purpose.
1) The freedom to study how the program works, and change it so it does your computing as you wish. Access to the source code is a precondition for this.
2) The freedom to redistribute copies so you can help others.
3) The freedom to distribute copies of your modified versions to others.
-- gnu.org/philosophy

When an LLM's weights are free but the model is censored, you have half of freedom 0.

When an LLM gives you the weights but not the code or the data, AND it's an uncensored model, you have freedom 0 but none of the others.

When you have the source code but no weights or data, you only have half of freedom 1 (you can study it, but not rebuild and run it without the data and a supercomputer).

When you have the source code, the weights, AND the data, you have all four freedoms, assuming that you have the compute to rebuild the weights, or can pool resources to rebuild them.

[–] Disastrous_Elk_6375@alien.top 1 points 11 months ago

So you list the GNU stuff and then add "censored", but that's not goalpost moving? Come on.

0, 1, 2 and 3 ALL apply with an Apache 2.0 license. Saying this is not open source at this point is being contrarian for the sake of being contrarian, and I have no energy left to type on this subject.

Quoting your own post from gnu.org: take the source code, plug in C4 or RedPajama or whatever, pay for the compute, and you can get your own product. With the posted source code. I got nothing else.
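
To make that concrete, here's a minimal sketch of the "plug in a public corpus" step, assuming the Hugging Face datasets library. The dataset ids are the public C4 and RedPajama uploads (verify the exact names/configs on the Hub before relying on them), streamed so nothing has to fit on disk.

```python
# Hypothetical sketch: stream public corpora as a stand-in for the original mix.
from datasets import load_dataset

# English split of C4, streamed so it isn't downloaded up front
c4 = load_dataset("allenai/c4", "en", split="train", streaming=True)

# RedPajama 1T (assumption: the "default" config; check the Hub page)
redpajama = load_dataset("togethercomputer/RedPajama-Data-1T", "default",
                         split="train", streaming=True)

# Peek at a few documents before wiring either stream into a training loop
for sample in c4.take(3):
    print(sample["text"][:200])
```

Getting weights back out of either stream still means paying for the training compute, which is the whole point.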

[–] Slimxshadyx@alien.top 1 points 11 months ago

You are right, but I think a big part of open source is being able to modify it however you like.

You can't really modify anything here, except by fine-tuning, without the original dataset.

[–] vatsadev@alien.top 1 points 11 months ago (1 children)

Um, the dataset is open source; it's all public HF datasets.

[–] _Lee_B_@alien.top 1 points 11 months ago

"World = Some_Pile + Some_SlimPajama + Some_StarCoder + Some_OSCAR + All_Wikipedia + All_ChatGPT_Data_I_can_find"

"some" as in customized.

[–] satireplusplus@alien.top 1 points 11 months ago

The models are Apache 2.0 AFAIK; there are not that many base models that can be used commercially without restrictions.