this post was submitted on 22 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


I am using kobold.cpp and it couldn't code anything outside of hello world. Am I doing something wrong?

https://preview.redd.it/xdo6q7a25z1c1.png?width=1454&format=png&auto=webp&s=30d0eaed2c6d4d95070f2312a4bc3add0dcc2840

top 7 comments
[–] FullOf_Bad_Ideas@alien.top 1 points 10 months ago (1 children)

Is that a base model or some instruct-tuned fine-tune? It wouldn't be too much out of the ordinary if it's a base model; they tend to go off the rails. You can try setting repetition penalty to 1, which might help a touch.

[–] AfterAte@alien.top 1 points 10 months ago

Also, set the temperature to 0.1 or 0.2. Those two things helped me get it to work nicely.
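For anyone wondering why low temperature helps: temperature divides the logits before the softmax, so values like 0.1 push almost all probability onto the top token, making a small coding model far less likely to wander. This is a minimal illustration of the math, not kobold.cpp's actual sampler code:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature, then softmax.
    Low temperature (0.1-0.2) sharpens the distribution toward the
    top token; repetition penalty = 1 simply disables that penalty."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 1.0))  # mass spread across tokens
print(softmax_with_temperature(logits, 0.1))  # nearly all mass on the top token
```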

[–] ttkciar@alien.top 1 points 10 months ago

It is only 1.3B :-) I have noticed that smaller models work a lot better with longer, more detailed prompts (at least 440 characters, better with twice that many).

[–] LocoLanguageModel@alien.top 1 points 10 months ago

Try setting temperature to 0.1.

I've had really good luck with this model at 6.7b and 33b. The 1.3b is more of a novelty because of how fast it runs on ancient GPUs; it's not nearly as good as the other two sizes in my attempts, though it is amazing for its size.

[–] ButlerFish@alien.top 1 points 10 months ago

Looks like a very small model. Maybe better for a code completion usecase.

[–] vasileer@alien.top 1 points 10 months ago (1 children)

Two ideas:

- use deepseek-coder-1.3b-instruct not the base model

- check that you use the correct prompting template for the model
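On the second point: instruct fine-tunes expect their training-time chat template, and kobold.cpp won't apply it for you automatically. A minimal sketch of wrapping a request in the `### Instruction:` / `### Response:` markers that the deepseek-coder-instruct model card describes (verify the exact wording against your local model card; the system line here is a shortened placeholder):

```python
def format_deepseek_instruct(user_request: str) -> str:
    # Markers follow the deepseek-coder-instruct template; the system
    # line is abbreviated here and should be checked against the model card.
    return (
        "You are an AI programming assistant.\n"
        "### Instruction:\n"
        f"{user_request}\n"
        "### Response:\n"
    )

print(format_deepseek_instruct("Write a Python function that reverses a string."))
```

Pasting the fully formatted prompt into kobold.cpp's prompt box (or setting it as the instruct template) usually makes the difference between gibberish and actual code from small instruct models.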

[–] East-Awareness-249@alien.top 1 points 10 months ago

It is the instruct model. You can see underneath the prompt box that it's the deepseek-coder-1.3b-instruct_Q5_K_s model. I used the prompting template in the model, and it slightly improved answers.

But if I ask it to write some code, it almost never does, and it outputs gibberish instead.

Does your GPU/CPU quality affect the AI's output? My device is a potato.