this post was submitted on 30 Nov 2023

LocalLLaMA


Community for discussing Llama, the family of large language models created by Meta AI.

 

How come Llama 2 70B is so much worse than many Code Llama 34B models?

I'm not talking specifically about coding questions, but the 70B seems utterly stupid: it repeats nonsense patterns, starts talking about unrelated stuff, and sometimes gets stuck in a loop repeating the same word. It seems like utter garbage, and I downloaded the official model from the Meta HF repo.

Has anyone experienced the same? Am I doing something wrong with the 70B model?

4 comments
Saofiqlord@alien.top 1 point 9 months ago

Did you forget to unset the rope settings?

Code Llama requires a different RoPE base frequency than regular Llama.

Also check your sampler settings.
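
For context, here is a minimal sketch of what "unsetting the rope settings" means in practice, assuming the llama-cpp-python bindings; the model filenames are placeholders, not paths from the thread:

```python
# Minimal sketch (not from the thread) using the llama-cpp-python bindings.
# Code Llama was trained with a RoPE base frequency of 1,000,000, while
# regular Llama 2 uses 10,000. Carrying the Code Llama value over to a
# Llama 2 model scrambles its positional encoding and produces exactly the
# looping, off-topic output described in the post.
from llama_cpp import Llama

# rope_freq_base=0 tells llama.cpp to use the value stored in the model
# file's metadata, which is the safe default for both models.
llama2_70b = Llama(
    model_path="llama-2-70b.Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,
    rope_freq_base=0,
)

codellama_34b = Llama(
    model_path="codellama-34b.Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,
    rope_freq_base=0,  # model metadata already carries the 1e6 base
)
```

The equivalent llama.cpp CLI flag is --rope-freq-base: leaving it unset lets the model's own metadata decide, whereas hard-coding the Code Llama value when loading a Llama 2 model is a plausible cause of the symptoms above. The sampler point is separate: repetition loops can also come from settings like repeat penalty.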

Specialist_Ice_5715@alien.top 1 point 9 months ago

No, I didn't even know RoPE was a thing; I'm reading about it now... if you have a tl;dr, please post it, this stuff seems pretty complicated.

I was loading the model with a plain llama.cpp invocation and didn't know about RoPE. What would change if I left the default values on?

Paulonemillionand3@alien.top 1 point 9 months ago

worked great for me

dinoaide@alien.top 1 point 9 months ago

Snowflake has a very nice comparison of the two:

Fine-Tuning Improves the Performance of Meta’s Code Llama on SQL Code Generation

The answer is that you need more fine-tuning.