this post was submitted on 12 Nov 2023
1 point (100.0% liked)

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

Hi all, I'm wondering if there is a way to spread the load of a local LLM across multiple hosts instead of adding GPUs to speed up responses. My hosts don't have GPUs, since I want to stay power-efficient, but they each have a decent amount of RAM (128 GB). Thanks for any ideas.
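For anyone wondering what this could look like in practice: the usual approach is pipeline-style sharding, where each host holds a slice of the model's layers and only the activations travel over the network. Below is a minimal sketch of that idea using PyTorch's CPU-only gloo backend; the two-host split, the toy layer stack, and the launch details are assumptions for illustration, not any particular project's implementation.

```python
# Minimal sketch of pipeline-style sharding across two CPU-only hosts:
# host 0 runs the first slice of the layer stack, sends the activations
# over the network, and host 1 finishes the forward pass. The tiny MLP,
# the 512-wide hidden size, and the two-way split are placeholders, not
# a real model.
import torch
import torch.distributed as dist
import torch.nn as nn


def main() -> None:
    # gloo is PyTorch's CPU-friendly backend; torchrun supplies
    # MASTER_ADDR / MASTER_PORT / RANK / WORLD_SIZE via the environment.
    dist.init_process_group(backend="gloo")
    rank = dist.get_rank()

    hidden = 512
    # Each host constructs only its own slice of the layers.
    stage = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())

    with torch.inference_mode():
        if rank == 0:
            x = torch.randn(1, hidden)   # stand-in for token embeddings
            out = stage(x)               # run the first slice locally
            dist.send(out, dst=1)        # ship activations to the next host
        else:
            buf = torch.empty(1, hidden)
            dist.recv(buf, src=0)        # receive activations from host 0
            out = stage(buf)             # finish the forward pass
            print("final activation shape:", tuple(out.shape))

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Each host would be launched as one rank, e.g. `torchrun --nnodes=2 --nproc_per_node=1 --node_rank=<0 or 1> --master_addr=<host0> --master_port=29500 pipeline_sketch.py` (the script name is made up). In practice the extra network hop adds latency to every token, so this trades speed for the ability to fit a model that no single box could hold.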

you are viewing a single comment's thread

On another note, these GPU manufacturers need to get their heads out of their asses and start cranking out cards with much higher memory capacities. The first one to do it cost-effectively will gain massive market share and huge profits. Nvidia's A100 etc. doesn't qualify, since it's prohibitively expensive.