this post was submitted on 27 Nov 2023

I'm interested in applications where memory capacity matters more than speed: high-quality models with a big context. I'm talking 100GB or more. Speed is still an important consideration, though. I don't need snappy conversations, but getting through more stuff 'overnight' is still valuable.

3090s are affordable, but it would take 4 to 8 to get into the big memory category, and the primary issue is energy use. For batch use the PC could shut down after finishing, so idle power use wouldn't be an issue. Are there motherboards that can completely shut off power to extra cards when they aren't needed?
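The 4 to 8 figure follows from the 3090's 24GB of VRAM per card. A rough sketch of the arithmetic, assuming all of each card's VRAM is usable (in practice some goes to overhead, so treat the result as a lower bound); the function name is just for illustration:

```python
import math

VRAM_PER_3090_GB = 24  # an RTX 3090 has 24 GB of VRAM


def cards_needed(target_gb: float, vram_per_card_gb: float = VRAM_PER_3090_GB) -> int:
    """Minimum number of cards whose combined VRAM reaches target_gb.

    Assumes every GB on every card is usable, which is optimistic.
    """
    return math.ceil(target_gb / vram_per_card_gb)


if __name__ == "__main__":
    for target in (96, 100, 192):
        print(f"{target} GB -> {cards_needed(target)} cards")
```

So 4 cards give 96GB, just short of 100GB, and 8 cards match the 192GB of the largest Mac Studio configuration.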

A Mac Studio with the M2 Ultra can be configured with up to 192GB of unified memory, with about 140GB usable. It isn't as fast, obviously, but is reportedly acceptable for many applications.

What about PCs or servers with lots of system RAM? Is that much slower than the Macs because of the different architecture? If not, it's probably a lot cheaper. The CPU would have to do all the work, and I don't know how the energy efficiency would compare.

I would be grateful if anyone has data comparing speeds or joules per token for these broad options.
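To make numbers from different setups comparable: joules per token is just average power draw (watts, ideally measured at the wall) divided by generation speed (tokens per second). A minimal sketch, with hypothetical function names and placeholder inputs rather than real measurements:

```python
def joules_per_token(avg_power_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token: watts are joules per second,
    so W / (tokens/s) gives joules per token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return avg_power_watts / tokens_per_second


def kwh_for_run(avg_power_watts: float, tokens_per_second: float, total_tokens: int) -> float:
    """Total energy for a batch run in kWh (1 kWh = 3.6e6 J)."""
    return joules_per_token(avg_power_watts, tokens_per_second) * total_tokens / 3.6e6


if __name__ == "__main__":
    # Placeholder inputs only, NOT measurements of any real system.
    power_w, speed_tps, tokens = 300.0, 10.0, 120_000
    print(joules_per_token(power_w, speed_tps), "J/token")
    print(kwh_for_run(power_w, speed_tps, tokens), "kWh for the run")
```

If you post results, measured wall power plus tokens per second for a given model and quantization is enough to fill this in.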

EvokerTCG@alien.top, 11 months ago

A valid option. I haven't looked into rental prices, but it could make sense unless I end up using it a lot.