Currently running them on CPU:
While running on CPU with GPT4All, I'm getting 1.5-2 tokens/sec. It finishes, but man is there a lot of waiting.
What's the most affordable way to get a faster experience? The models I play with the most are Wizard-Vicuna 30B, WizardCoder, and CodeLlama 34B.
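For reference, here's roughly how I'm measuring the tokens/sec number, so answers can be compared apples-to-apples. This is a minimal sketch assuming llama-cpp-python and a quantized GGUF build of the model (the file name and the n_gpu_layers value are placeholders, not what the GPT4All desktop app does internally); n_gpu_layers=0 is the all-CPU case, and raising it would offload layers to whatever GPU a suggested setup has.

```python
import time
from llama_cpp import Llama  # pip install llama-cpp-python

# Hypothetical path to a quantized GGUF build of the model being benchmarked.
MODEL_PATH = "wizard-vicuna-30b.Q4_K_M.gguf"

# n_gpu_layers=0 keeps everything on the CPU; raising it offloads that many
# transformer layers to the GPU (requires a CUDA/Metal/ROCm build of llama.cpp).
llm = Llama(model_path=MODEL_PATH, n_ctx=2048, n_gpu_layers=0)

prompt = "Explain the difference between a list and a tuple in Python."
start = time.time()
out = llm(prompt, max_tokens=128)
elapsed = time.time() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.2f} tokens/sec")
```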