Currently running them on CPU:
While running on CPU with GPT4All, I'm getting 1.5-2 tokens/sec. It finishes, but man is there a lot of waiting.
What's the most affordable way to get a faster experience? The models I play with the most are Wizard-Vicuna 30B, WizardCoder, and CodeLlama 34B.
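For reference, here's roughly how I'm measuring the tokens/sec number, so answers can be compared apples-to-apples. This is a minimal sketch assuming llama-cpp-python and a quantized GGUF build of the model (the file name and the n_gpu_layers value are placeholders, not what the GPT4All desktop app does internally); n_gpu_layers=0 is the all-CPU case, and raising it would offload layers to whatever GPU a suggested setup has.

```python
import time
from llama_cpp import Llama  # pip install llama-cpp-python

# Hypothetical path to a quantized GGUF build of the model being benchmarked.
MODEL_PATH = "wizard-vicuna-30b.Q4_K_M.gguf"

# n_gpu_layers=0 keeps everything on the CPU; raising it offloads that many
# transformer layers to the GPU (requires a CUDA/Metal/ROCm build of llama.cpp).
llm = Llama(model_path=MODEL_PATH, n_ctx=2048, n_gpu_layers=0)

prompt = "Explain the difference between a list and a tuple in Python."
start = time.time()
out = llm(prompt, max_tokens=128)
elapsed = time.time() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.2f} tokens/sec")
```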