this post was submitted on 18 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

I have tried setting up three different versions of it: TheBloke's GPTQ and AWQ builds and the original deepseek-coder-6.7b-instruct.

I have tried the 33B as well.

My specs: 64 GB RAM, RTX 3090 Ti, i7-12700K.

With AWQ I just get a bugged response (""""""""""""""") repeated until max tokens.

GPTQ works much better, but all versions seem to add an unnecessary * at the end of some lines.

They also give worse results than the website (deepseek.com). Let's say I ask for a snake game in pygame: it usually gives an unusable version, and after 5-6 tries I'll get a somewhat working version, but I still need to ask for a lot of changes.

On the official website, I get the code working on the first try, without any problems.

I am using the Alpaca template, adjusted to match the DeepSeek format (oobabooga webui).
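
For reference, this is roughly the prompt format I'm adjusting the template towards (reconstructed from the DeepSeek Coder model card as best I remember it, so the exact system prompt wording may differ):

```
You are an AI programming assistant, utilizing the DeepSeek Coder model, developed by DeepSeek Company, and you only answer questions related to computer science.
### Instruction:
Write a snake game in pygame.
### Response:
```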

What could be causing this? Is the website version different from the Hugging Face model?

[–] mantafloppy@alien.top 1 points 1 year ago (2 children)

I haven't been able to run the .gguf in LM Studio, Ollama, or oobabooga/text-generation-webui.

I had to run it directly with llama.cpp on the command line to get it working.

Something about it using a special end token and not following the standard transformers setup, or something like that...

https://huggingface.co/TheBloke/deepseek-coder-33B-instruct-GGUF/discussions/2
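
Something along these lines worked for me (the file name, GPU layer count, and sampling settings below are examples from memory, not my exact command):

```
# model path points at whichever GGUF quant you downloaded from TheBloke's repo
./main -m ./models/deepseek-coder-33b-instruct.Q4_K_M.gguf \
  -ngl 40 -c 4096 -n 1024 --temp 0.2 \
  -p "You are an AI programming assistant.
### Instruction:
Write a snake game in pygame.
### Response:
"
```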

[–] nullnuller@alien.top 1 points 1 year ago (1 children)

Do you just copy and paste the terminal output containing \n and whitespace from the .\main output into VSCode or a similar IDE, and it works?

[–] mantafloppy@alien.top 1 points 1 year ago

Now that you point it out, they are there because I copy/pasted this from a code block somewhere.

But that's what I write in my command line, and it doesn't seem to cause issues.