Try FP16, or 8-bit at most; a 13B model probably suffers too much at 4-bit.
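To see why lower bit-widths hurt, here is a toy sketch of symmetric uniform rounding to an n-bit grid (not GPTQ's actual algorithm, just an illustration of how the representable levels shrink and the rounding error grows as bits drop):

```python
def quantize(w: float, bits: int, max_abs: float = 1.0) -> float:
    """Round a weight to a symmetric uniform n-bit grid covering [-max_abs, max_abs]."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 at 8-bit, only 7 at 4-bit
    scale = max_abs / levels
    q = round(w / scale)
    q = max(-levels, min(levels, q))      # clamp to the representable range
    return q * scale

w = 0.1234
for bits in (16, 8, 4):
    err = abs(w - quantize(w, bits))
    print(f"{bits}-bit error: {err:.6f}")
```

With only 7 positive levels at 4-bit, the rounding error is roughly an order of magnitude larger than at 8-bit, which is why smaller models tend to degrade noticeably there.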
this post was submitted on 21 Nov 2023
LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
I was not impressed with Orca2 13B GPTQ for Java and JavaScript coding questions. It almost seemed reluctant to give answers; I had to explicitly prompt it to print out the code, and even then it did not do a very good job.