Is that a base model or some instruct-tuned fine-tune? It wouldn't be too much out of the ordinary if it's base; base models tend to get crazy. You can try setting the repetition penalty to 1, might help a touch.
Also, set the temperature to 0.1 or 0.2. Those two things helped me get it to work nicely.
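For anyone wanting to try those sampler settings programmatically, here's a minimal sketch using llama-cpp-python. The model filename is taken from this thread; the path and the test prompt are assumptions, so adjust to your setup:

```python
from llama_cpp import Llama

# Load the quantized GGUF mentioned in this thread
# (path is an assumption; point it at your actual download).
llm = Llama(model_path="./deepseek-coder-1.3b-instruct_Q5_K_S.gguf")

output = llm(
    "Write a Python function that reverses a string.",
    max_tokens=256,
    temperature=0.1,     # low temperature, as suggested above
    repeat_penalty=1.0,  # 1.0 effectively disables the repetition penalty
)
print(output["choices"][0]["text"])
```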
It is only 1.3B :-) I have noticed that smaller models work a lot better with longer, more detailed prompts (at least 440 characters, better with twice that many).
Try setting temperature to 0.1.
I've had really good luck with this model at 6.7B and 33B. The 1.3B is more of a novelty because of how fast it runs on ancient GPUs, and not nearly as good as the other two sizes in my attempts, though it is amazing for its size.
Looks like a very small model. Maybe better suited to a code-completion use case.
2 ideas:
- use deepseek-coder-1.3b-instruct, not the base model
- check that you're using the correct prompting template for the model (see the sketch after this list)
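For reference, a sketch of the Alpaca-style template that deepseek-coder instruct variants are generally documented to use; double-check it against your model card, since a mismatched template is a classic cause of gibberish. The short system line here is an assumption for illustration (the official one is longer):

```python
# Alpaca-style "### Instruction / ### Response" format commonly documented
# for deepseek-coder instruct models; verify against the model card.
# The abbreviated system line is an assumption, not the official text.
PROMPT_TEMPLATE = (
    "You are an AI programming assistant.\n"
    "### Instruction:\n"
    "{instruction}\n"
    "### Response:\n"
)

prompt = PROMPT_TEMPLATE.format(
    instruction="Write a Python function that reverses a string."
)
```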
It is the instruct model. You can see underneath the prompt box that it's the deepseek-coder-1.3b-instruct_Q5_K_s model. I used the prompting template that came with the model, and it slightly improved the answers.
But if I ask it to write some code, it almost never does; it just outputs gibberish.
Does GPU/CPU quality affect the AI's output? My device is a potato.