this post was submitted on 30 Nov 2023

LocalLLaMA


Community for discussing Llama, the family of large language models created by Meta AI.

 

How come Llama 2 70B is so much worse than many Code Llama 34B models?

I'm not talking specifically about coding questions, but the 70B seems utterly stupid: it repeats nonsense patterns, starts talking about unrelated stuff, and sometimes gets stuck in a loop repeating the same word. It seems like utter garbage, and I downloaded the official model from the Meta HF repo.

Has anyone experienced the same? Am I doing something wrong with the 70B model?

4 comments
Saofiqlord@alien.top 1 point 9 months ago

Did you forget to unset the rope settings?

Code Llama requires a different RoPE base frequency than regular Llama.

Also check your sampler settings.
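
For context, here is a minimal sketch of what "unsetting the rope settings" means in practice, assuming the llama-cpp-python bindings; the model filenames are placeholders, not paths from the thread:

```python
# Minimal sketch (not from the thread) using the llama-cpp-python bindings.
# Code Llama was trained with a RoPE base frequency of 1,000,000, while
# regular Llama 2 uses 10,000. Carrying the Code Llama value over to a
# Llama 2 model scrambles its positional encoding and produces exactly the
# looping, off-topic output described in the post.
from llama_cpp import Llama

# rope_freq_base=0 tells llama.cpp to use the value stored in the model
# file's metadata, which is the safe default for both models.
llama2_70b = Llama(
    model_path="llama-2-70b.Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,
    rope_freq_base=0,
)

codellama_34b = Llama(
    model_path="codellama-34b.Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,
    rope_freq_base=0,  # model metadata already carries the 1e6 base
)
```

The equivalent llama.cpp CLI flag is --rope-freq-base: leaving it unset lets the model's own metadata decide, whereas hard-coding the Code Llama value when loading a Llama 2 model is a plausible cause of the symptoms above. The sampler point is separate: repetition loops can also come from settings like repeat penalty.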

Specialist_Ice_5715@alien.top 1 point 9 months ago

No, I didn't even know RoPE was a thing; I'm reading about it now... if you have a tl;dr, please post it, this stuff seems pretty complicated.

I was loading the model with a plain llama.cpp invocation and didn't know about RoPE. What would change if I left the default values on?

Paulonemillionand3@alien.top 1 point 9 months ago

worked great for me

dinoaide@alien.top 1 point 9 months ago

Snowflake has a very nice comparison of the two:

Fine-Tuning Improves the Performance of Meta’s Code Llama on SQL Code Generation

The answer is that you need more fine-tuning.