LocalLLaMA


most powerful model for an A6000? (alien.top)

submitted 2 years ago by crackinthekraken@alien.top to c/localllama@poweruser.forum


So I got this shiny new GPU and I want to push it to the limit. What's the most powerful, smartest model out there? Ideally something with as much long-term memory (context) as possible. I'm coming off of ChatGPT 4 and want something local and uncensored.


Sea_Particular_4014@alien.top 1 point 2 years ago

I'd try Goliath 120B and lzlv 70B. Those are the absolute best I've used, assuming you're doing story writing / RP and stuff.

lzlv should be as speedy as can be and fit entirely in VRAM.

Goliath won't quite fit at 4-bit, but you could either drop to a lower-precision quant or sacrifice some speed and run the Q4_K_M GGUF with most of the layers offloaded to the GPU. That'd be my choice, but I have a high tolerance for slow generation.
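For a rough idea of why one fits and the other doesn't, here's a back-of-the-envelope sketch. Everything numeric in it is an assumption rather than a measured value: roughly 4.85 bits per weight for Q4_K_M, roughly 69B parameters / 80 layers for lzlv 70B and 118B parameters / 137 layers for Goliath 120B, and about 4 GB of the A6000's 48 GB kept free for context cache and overhead. Treat the output as a starting guess for the layer count, not a guarantee.

```python
import math

VRAM_GB = 48.0      # RTX A6000
Q4_K_M_BPW = 4.85   # ASSUMPTION: approximate bits per weight for Q4_K_M

# ASSUMPTION: approximate parameter counts and layer counts, for illustration only
MODELS = {
    "lzlv 70B": (69e9, 80),
    "Goliath 120B": (118e9, 137),
}


def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def layers_that_fit(size_gb: float, n_layers: int, vram_gb: float,
                    reserve_gb: float = 4.0) -> int:
    """Rough number of layers that fit in VRAM, treating all layers as equal in size.

    reserve_gb is VRAM held back for the context cache and other overhead
    (ASSUMPTION: 4 GB is a guess, tune it for your setup).
    """
    per_layer_gb = size_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, math.floor(usable_gb / per_layer_gb))


for name, (n_params, n_layers) in MODELS.items():
    size = model_size_gb(n_params, Q4_K_M_BPW)
    fit = layers_that_fit(size, n_layers, VRAM_GB)
    print(f"{name}: ~{size:.0f} GB of weights, ~{fit} of {n_layers} layers on the GPU")
```

If generation runs out of memory at the estimated layer count, lower it a few layers at a time until it loads.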

crackinthekraken@alien.top 1 point 2 years ago

I'm willing to wait for quality so that's no problem!

Where can I go to find these models? And how do I set them up and get them running?

Sea_Particular_4014@alien.top 1 point 2 years ago

If you're on Windows, I'd download KoboldCPP and TheBloke's GGUF models from HuggingFace.

Then you just launch KoboldCPP, select the .gguf file, select your GPU, enter the number of layers to offload, set the context size (4096 for those models), etc., and launch it.
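If you'd rather skip the launcher window, the same settings can be passed on the command line. Here's a minimal sketch that just builds the command. The `--model`, `--usecublas`, `--gpulayers` and `--contextsize` flags are what I'd expect KoboldCPP to accept, but check `--help` on your build. The model filename and the layer count are placeholders, not real recommendations.

```python
def build_koboldcpp_command(model_path: str, gpu_layers: int,
                            context_size: int = 4096,
                            executable: str = "koboldcpp.exe") -> list[str]:
    """Build the argument list for launching KoboldCPP with GPU offloading."""
    if gpu_layers < 0:
        raise ValueError("gpu_layers must be zero or positive")
    return [
        executable,
        "--model", model_path,             # the .gguf file you downloaded
        "--usecublas",                     # use the NVIDIA (CUDA) backend
        "--gpulayers", str(gpu_layers),    # number of layers to offload
        "--contextsize", str(context_size),
    ]


# Placeholder filename and layer count, for illustration only.
cmd = build_koboldcpp_command("goliath-120b.Q4_K_M.gguf", gpu_layers=80)
print(" ".join(cmd))

# To actually launch it:
#   import subprocess
#   subprocess.run(cmd, check=True)
```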

Then you're good to start messing around. You can use the Kobold interface that pops up, or use it through the API with something like SillyTavern.
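If you want to poke at the API yourself instead of going through SillyTavern, something like the sketch below should work. It assumes KoboldCPP is listening on its usual default of `http://localhost:5001` and exposes the KoboldAI-style `/api/v1/generate` endpoint. The sampler values are arbitrary examples, and the actual network call is left commented out so the snippet runs without a server.

```python
import json
import urllib.request


def build_generate_request(prompt: str,
                           base_url: str = "http://localhost:5001",
                           max_length: int = 200,
                           max_context_length: int = 4096) -> urllib.request.Request:
    """Build a POST request for the KoboldAI-style text generation endpoint."""
    payload = {
        "prompt": prompt,
        "max_length": max_length,                  # tokens to generate
        "max_context_length": max_context_length,  # should match the launch setting
        "temperature": 0.7,                        # arbitrary example value
    }
    return urllib.request.Request(
        f"{base_url}/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def parse_generate_response(body: bytes) -> str:
    """Pull the generated text out of a response shaped like {"results": [{"text": ...}]}."""
    return json.loads(body)["results"][0]["text"]


request = build_generate_request("Once upon a time")

# With KoboldCPP running:
#   with urllib.request.urlopen(request) as response:
#       print(parse_generate_response(response.read()))
```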