I have tried to set up 3 different versions of it: TheBloke's GPTQ and AWQ versions, and the original deepseek-coder-6.7b-instruct.

I have tried the 33B as well.

My specs: 64 GB RAM, RTX 3090 Ti, i7-12700K.

With AWQ I just get a bugged response (a stream of " characters repeated until max tokens).

GPTQ works much better, but all versions seem to add an unnecessary * at the end of some lines,

and give worse results than the website (deepseek.com). Say I ask for a snake game in Pygame: it usually produces an unusable version, and only after 5-6 tries do I get a somewhat working version, and even then I need to ask for a lot of changes.

On the official website, I get working code on the first try, without any problems.

I am using the Alpaca template, adjusted to match the DeepSeek prompt format (oobabooga webui).
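For reference, my adjusted template looks roughly like the sketch below. The system line is my own paraphrase and the "### Instruction:" / "### Response:" markers follow the format shown on the model card, so double-check it against the official prompt format rather than treating this as exact.

```python
# Rough sketch of a DeepSeek-Coder-style instruct prompt (Alpaca-like layout).
# The system line below is a paraphrase, not the exact official wording; the
# "### Instruction:" / "### Response:" markers follow the model card format.
SYSTEM = (
    "You are an AI programming assistant. "
    "You only answer questions related to computer science."
)

def build_prompt(instruction: str) -> str:
    """Assemble a single-turn prompt in the DeepSeek-Coder instruct style."""
    return f"{SYSTEM}\n### Instruction:\n{instruction}\n### Response:\n"

print(build_prompt("Write a snake game in pygame."))
```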

What could cause this? Is the website version different from the Hugging Face model?

[–] mantafloppy@alien.top 1 points 10 months ago (1 children)

I haven't been able to run the .gguf in LM Studio, Ollama, or oobabooga/text-generation-webui.

I had to run it directly with llama.cpp on the command line to get it working.

Something about it using a special end token and not being a standard transformer, or something like that...

https://huggingface.co/TheBloke/deepseek-coder-33B-instruct-GGUF/discussions/2
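For what it's worth, my invocation looks roughly like this. The model filename and the -ngl layer count are placeholders for my setup; the flags themselves are standard llama.cpp main options, but adjust them to taste:

```
./main -m deepseek-coder-33b-instruct.Q4_K_M.gguf \
  -c 4096 -n 1024 -ngl 40 --temp 0.2 --color \
  -p "### Instruction:
Write a snake game in pygame.
### Response:
"
```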

[–] nullnuller@alien.top 1 points 10 months ago (1 children)

Do you just copy and paste the terminal output containing \n and whitespace from the .\main output into VSCode or a similar IDE, and it works?

[–] mantafloppy@alien.top 1 points 10 months ago

Now that you point it out, they are there because I copy/pasted this from a code block somewhere.

But that's what I type in my command line and it doesn't seem to cause issues.
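If the literal \n and stray whitespace do get in the way when pasting into VSCode, a small script like the sketch below can clean them up first. It's purely illustrative: pasted_output.txt is a made-up filename for wherever you saved the copied text.

```python
# Illustrative sketch: turn literal "\n" escape sequences from pasted terminal
# output into real newlines before dropping the code into an editor.
# "pasted_output.txt" is a hypothetical file holding the copied text.
from pathlib import Path

raw = Path("pasted_output.txt").read_text()

# Replace literal backslash-n with real newlines and strip trailing spaces.
cleaned = "\n".join(line.rstrip() for line in raw.replace("\\n", "\n").splitlines())

Path("cleaned_output.py").write_text(cleaned + "\n")
print(cleaned)
```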