this post was submitted on 13 Nov 2023
1 points (100.0% liked)

LocalLLaMA

1 readers
1 users here now

Community to discuss about Llama, the family of large language models created by Meta AI.

founded 10 months ago
MODERATORS
 

I'm trying to run zephyr-7b, on my local machine with an RX580 8G using Text generation web UI. It works for the most part but sometimes gets into giving unrelated responses. After which I have to restart the app! Sometimes it even prints out right out gibberish..

I'm running zephyr-7b-beta.Q4\_K\_M.gguf\. With the following options:

n-gpu-layers: > 35
n_ctx: 8000

And parameters:

max_new_tokens: 2000
top_p: 0.95
top_k: 40
Instruction Template: ChatML

But if I run the above exact setup on a cloud GPU (vast.ai) it runs perfect.. What am I doing wrong?

no comments (yet)
sorted by: hot top controversial new old
there doesn't seem to be anything here