this post was submitted on 20 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.

[–] sergeant113@alien.top 1 points 11 months ago

Can’t wait!!!

[–] mcmoose1900@alien.top 1 points 11 months ago (1 children)
[–] a_beautiful_rhind@alien.top 1 points 11 months ago

Which is a shame because the same performance + the extra context would have been huge.

[–] kristaller486@alien.top 1 points 11 months ago (2 children)

Is there a code for distillation?

[–] llama_in_sunglasses@alien.top 1 points 11 months ago

I had okayish results blowing up layers from 70b... but messing with the first or last 20% lobotomizes the model, and I didn't snip more than a couple layers from any one place. By the time I got the model far enough down in size that q2_K could load in 24GB of VRAM it fell apart, so I didn't consider mergekit all that useful of a distillation/parameter reduction process.
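For anyone curious what that layer-snipping looks like, here is a minimal Python sketch of the idea described above (the helper name and the cut ranges are made up for illustration; mergekit itself does this via a YAML config of layer slices):

```python
# Hypothetical sketch of mid-stack layer pruning: remove a few
# contiguous slices from the middle of the decoder stack while
# leaving the first and last ~20% of layers untouched, since
# cutting there tends to lobotomize the model.
def prune_layers(num_layers, cut_ranges):
    """Return the indices of layers kept after removing cut_ranges
    (each a half-open (start, end) pair of layer indices)."""
    protected = int(num_layers * 0.2)  # never touch head/tail 20%
    cut = set()
    for start, end in cut_ranges:
        if start < protected or end > num_layers - protected:
            raise ValueError("refusing to cut protected head/tail layers")
        cut.update(range(start, end))
    return [i for i in range(num_layers) if i not in cut]

# A 70B Llama-2 has 80 decoder layers; snip two small mid-stack slices.
kept = prune_layers(80, [(30, 34), (50, 54)])
print(len(kept))  # 72 layers remain
```

The resulting index list would then be used to copy the surviving layers into a smaller checkpoint; whether the pruned model stays coherent is exactly the open question the comment above raises.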

[–] mcmoose1900@alien.top 1 points 11 months ago

Oh yeah, it be busted.

[–] roselan@alien.top 1 points 11 months ago (1 children)

and of course TheBloke already prepped everything for our fine consumption.

[–] LocoMod@alien.top 1 points 11 months ago

Had the same problem last night and I promptly deleted it.

[–] mpasila@alien.top 1 points 11 months ago (2 children)

Did anyone manage to get them working? I tried GGUF/GPTQ and running them unquantized with trust-remote-code, and they just produced garbage. (I tried removing BOS tokens too; same thing.)

[–] Jelegend@alien.top 1 points 11 months ago

Yeah, exactly the same thing. It produced absolute rubbish whatever I tried. I tried 8B, 15B and 23B.

[–] watkykjynaaier@alien.top 1 points 11 months ago (1 children)

I've completely fixed gibberish output on Yi-based and other models by setting the RoPE frequency scale to a number less than one (one seems to be the default). I have no idea why that works, but it does.

What I find even more strange is the models often keep working after setting the frequency scale back to 1.
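For reference, a frequency scale below one compresses the position indices fed into the rotary embedding, which is the same trick as linear RoPE scaling for extended context. A minimal sketch of the math (the function name is illustrative, not any particular library's API):

```python
def rope_angles(pos, dim, base=10000.0, freq_scale=1.0):
    """Rotary-embedding angles for one token position.

    freq_scale < 1 shrinks the effective position, so positions
    beyond the trained context map back into the trained range.
    """
    return [(pos * freq_scale) / (base ** (2 * i / dim))
            for i in range(dim // 2)]

# With freq_scale = 0.5, position 4096 produces exactly the angles
# the model saw at position 2048 during training.
assert rope_angles(4096, 128, freq_scale=0.5) == rope_angles(2048, 128)
```

If a model was fine-tuned with a scaled RoPE (as some extended-context Yi variants were), loading it with the wrong frequency scale would plausibly explain gibberish output.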

[–] Aaaaaaaaaeeeee@alien.top 1 points 11 months ago

What value specifically worked?

[–] vasileer@alien.top 1 points 11 months ago (1 children)

did you test the model before advertising it?

[–] bearbarebere@alien.top 1 points 11 months ago
[–] ltduff69@alien.top 1 points 11 months ago (1 children)
[–] No_Afternoon_4260@alien.top 1 points 11 months ago (1 children)

You took a picture of Nous Capybara...

[–] ltduff69@alien.top 1 points 11 months ago

Yeah I am kinda petty lol.