overview for simcop2387

Motherboards for AMD EPYC 9004 series build? in c/localllama@poweruser.forum

[–] simcop2387@alien.top 1 points 9 months ago

Not sure which models you'd want specificially, but take a look at Asrock Rack, they've got a lot of ATX compatible server motherboards. https://www.asrockrack.com/minisite/EPYC9004/

Buying a p40 for 70b-120b in c/localllama@poweruser.forum

[–] simcop2387@alien.top 1 points 10 months ago (1 children)

The big issue is that you're going to have to disable 16bit floats for doing all the work and do it all in 32bit floats (not storing weights, but the calculations themselves) once you try to combine with a P40, you can still get alright performance on them (I'm using 4 of them) but you'll cripple the performance of the 4090 doing that. I don't know if any of the libraries for running things will handle conversion and different kernels on different cards to avoid that since it's a completely different set of code for that.

You'd do much much better with adding a used 3090 from ebay (assuming it works) really.

Why are you running local models? What are you doing with them? in c/localllama@poweruser.forum

[–] simcop2387@alien.top 1 points 10 months ago

A reliable AI assistant that I know is safe, secure and private. Any information about myself, my household or my proprietary ideas won't be saved on some company's server to be reviewed and trained upon. I don't want to ask sensitive questions about stuff like taxes or healthcare or whatnot, just to have some person review it and it end up in a model

I'm slowly working on a change to Home Assistant (https://www.home-assistant.io/) to take the OpenAI conversation addon that they have and make it support connecting to any base url. Along with that I'm going to make some more addons for other inference servers (particularly koboldcpp, exllamav2, and text-gen-webui) so that with all their new voice work this year I can plug things in and have a conversation with my smart home and other data that I provide it.