This post was submitted on 09 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

PS: This is text from Bing AI.

[–] SpaceCockatoo@alien.top 1 points 10 months ago

one week to train it
six months to finetune it until it's safely and ethically useless

[–] Ilforte@alien.top 1 points 10 months ago (2 children)

Maybe, if it isn't killed because safety. I also think it's implausible it'll be much better than what we have, or even fully catch up to 3.5.

[–] Amgadoz@alien.top 1 points 10 months ago

Well, two areas where Llama can improve are:

  1. Multilingual capabilities
  2. Mixture of Experts architecture (see the sketch below)
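
For anyone unfamiliar with the second point, here's a minimal sketch of the MoE idea: each token gets routed to a couple of small expert networks instead of one big feed-forward block. This is illustrative PyTorch only, not Llama's actual architecture; every name and size here is made up.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Toy top-2 mixture-of-experts FFN: a router picks experts per token."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)  # learned gating network
        self.top_k = top_k

    def forward(self, x):
        # x: (n_tokens, d_model)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)          # mixing weights for chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):                   # only top_k experts run per token,
            for e, expert in enumerate(self.experts): # so compute stays roughly constant
                sel = idx[:, k] == e                  # while total parameter count grows
                if sel.any():
                    out[sel] += weights[sel, k, None] * expert(x[sel])
        return out

moe = MoEFeedForward()
print(moe(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```

A real MoE implementation also needs load-balancing losses and expert capacity limits; this only shows the routing mechanism.
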
[–] dogesator@alien.top 1 points 10 months ago

Mistral 7B fine-tunes are already reaching parity with GPT-3.5 in most benchmarks.

I’d be very surprised if Llama-3 70B fine tunes don’t significantly outperform GPT-3.5 in nearly every metric.

[–] FPham@alien.top 1 points 10 months ago (1 children)

All they need to do is make it 180B and most people will have no way to abuse it. (Or use it)

[–] gfth45fghmnfs@alien.top 1 points 10 months ago (1 children)

A Llama 3 180B model on every quantization & compression steroid available might bang. It would also show how far (or close) open-source LLMs are from GPT-4.
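
For anyone wondering what those "steroids" do mechanically, here's a toy symmetric int8 quantization of one weight matrix in NumPy. This is a deliberately naive sketch, not any particular scheme like GPTQ or GGUF k-quants:

```python
import numpy as np

w = np.random.randn(4096, 4096).astype(np.float32)  # one fp32 weight matrix

# Symmetric per-tensor int8 quantization: map the fp32 range onto [-127, 127].
scale = np.abs(w).max() / 127.0
w_q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_dq = w_q.astype(np.float32) * scale                # dequantized view used at inference

print(f"fp32: {w.nbytes / 2**20:.0f} MiB, int8: {w_q.nbytes / 2**20:.0f} MiB")  # 64 vs 16
print(f"mean abs rounding error: {np.abs(w - w_dq).mean():.6f}")
```

Real schemes quantize in small groups and use calibration data to keep the error acceptable at 4 bits and below, but the size arithmetic is the same.
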

[–] koehr@alien.top 1 points 10 months ago

Assuming it will be open.

[–] Bulb93@alien.top 1 points 10 months ago

Awesome. I hope there's a ~30B model

[–] llamaShill@alien.top 1 points 10 months ago

That's the prevailing idea based on all the info we have so far:

  • Llama 1 training was from around July 2022 to January 2023, Llama 2 from January 2023 to July 2023, so Llama 3 could plausibly be from July 2023 to January 2024.
  • In August, a credible rumor from an OpenAI researcher claimed that Meta talked about having the compute to train Llama 3 and Llama 4, with Llama 3 being as good as GPT-4.
  • In an interview with Lex Fridman published Sept. 28, Mark Zuckerberg said that, speaking about Llama, they're always training another model and are already working on the next generation.
  • At Meta Connect on Sept. 27–28, they said more news about Llama would come next year.

WSJ published an exclusive on Sept. 10 that said Meta's next LLM won't start training until early 2024, meaning a release wouldn't happen until much later, but they may have been mistaken since this seems to contradict Mark's recent words. Meta could have also accelerated their plans to stay relevant in the LLM race, especially since leaks about their LLM development have shown they've put more emphasis on productizing Llama and incorporating it within their apps.

[–] ttkciar@alien.top 1 points 10 months ago

Dell's LLM-for-businesses plan is a joke, btw. They seem not to know that quantized models even exist, or perhaps they're pretending not to, so that their customers have to buy more Dell hardware.

Half of the regulars in this sub could set up a business with a better in-house LLM inference system than what Dell's offering.
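
To put rough numbers on the quantization point, weight memory scales linearly with bit width (this ignores KV cache and activation overhead, which add more on top):

```python
# Weight memory for a 70B-parameter model at common precisions (weights only).
params = 70e9
for name, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{name}: ~{params * bits / 8 / 2**30:.0f} GiB")
# fp16: ~130 GiB, int8: ~65 GiB, 4-bit: ~33 GiB
```

That gap is the difference between a rack of enterprise servers and a couple of consumer GPUs.
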

[–] diggingbighole@alien.top 1 points 10 months ago (1 children)

Personally I'm waiting for the Llama Box 2 - Zuckerberg Signature Edition.

[–] Plusdebeurre@alien.top 1 points 10 months ago

Heard the manufacturing plant burned down

[–] herozorro@alien.top 1 points 10 months ago

It should have a plug-in system that makes fine-tuning a breeze.
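
Adapter-style fine-tuning (LoRA and friends) already behaves a lot like a plug-in system. A minimal sketch of the idea in PyTorch; the class and argument names here are made up for illustration:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base layer plus a small trainable low-rank adapter (LoRA-style)."""

    def __init__(self, base: nn.Linear, rank=8, alpha=16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                   # original weights stay untouched
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)            # adapter starts as a no-op
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))

layer = LoRALinear(nn.Linear(4096, 4096))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 65536 trainable
```

Only the two small adapter matrices get trained, so "plugging in" a different fine-tune is just swapping a few megabytes of adapter weights.
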

[–] Dazzling_Ad1507@alien.top 1 points 10 months ago (1 children)
[–] mrjackspade@alien.top 1 points 10 months ago

I tried coaxing an answer out of it, and the furthest I got was:

  1. One random redditor's comment claiming "next year" was announced at Meta Connect
  2. "It makes sense" given the spacing between Llama 1 and Llama 2
[–] Sabin_Stargem@alien.top 1 points 10 months ago (1 children)

My speculation is that the "safety" of major LLM providers like Facebook won't be substantial. The models are probably designed so that "popping the lock" isn't difficult: a fig leaf and a shrug to placate outsiders, while their actual audience carries on.

[–] Available_Screen_922@alien.top 1 points 10 months ago

I'm not sure it's plausible to add meaningful safety to open-source LLMs.

Incidentally, that does worry me some in the long term.

[–] FPham@alien.top 1 points 10 months ago (2 children)

Ooooh, I really want to see the marvelous "mechanism" that prevents an open-source model from being misused.

[–] cvdbdo@alien.top 1 points 10 months ago

*Provided you use the non-fine-tuned model straight from Meta's download page, so that they're safe.

[–] a_beautiful_rhind@alien.top 1 points 10 months ago (1 children)

They can ingrain those refusals deep enough that the model is irritating to use and hard to fine-tune out. Vicuna has a bit of this.

[–] FPham@alien.top 1 points 10 months ago

The point is that Meta has, until now, always released base models as well.

That would require them to release only a fine-tuned model, like Llama-2-chat was. Then they can bork it, which would be irreversible. But if they give out a base model without some stuff, it can easily be added.

[–] obvithrowaway34434@alien.top 1 points 10 months ago (1 children)

I'm more interested in the next Mistral release; none of that corporate "safety" BS. It would also be good to have a truly open-source model (one that releases both the weights and the training data).

[–] ExternalOpen372@alien.top 1 points 10 months ago

The funny thing is that the next Mistral is rumored not to be open-source.

[–] Monkey_1505@alien.top 1 points 10 months ago

What are these imaginary 'abuses' this random picture of some text is talking about?

[–] dethorin@alien.top 1 points 10 months ago

Bing AI as a source is shit. Many times the quoted primary sources don't support whatever Bing AI claims.

[–] xRolocker@alien.top 1 points 10 months ago

Always check Bing’s sources. It hallucinates out the wazoo and won’t change its mind if corrected or confronted.

[–] stddealer@alien.top 1 points 10 months ago (1 children)

Putting "safety" mechanisms in foundation models is dumb, imo. They are not just text generators; they are statistical models of human language, and they shouldn't have arbitrary made-up biases about what language should look like.

[–] api@alien.top 1 points 10 months ago (1 children)

It's not hard to fine-tune base models toward any bias you want. "Zero bias" isn't possible anyway; there's always some bias in the training data.

[–] stddealer@alien.top 1 points 10 months ago

Sure, but that's not a reason to purposefully add more biases into it.

[–] TheWildOutside@alien.top 1 points 10 months ago

"Meta had partnered with Dell" Oh so if you hit a key to hard while inputting a prompt the entire model will break?

[–] yahma@alien.top 1 points 10 months ago

YES! This is the good news I needed this morning!

[–] l33chy@alien.top 1 points 10 months ago