erick-fear


Hi all, I'm wondering whether it is possible to spread the load of a local LLM across multiple hosts, instead of adding GPUs, to speed up responses. My hosts do not have GPUs, since I want them to be power-efficient, but they have a decent amount of RAM (128 GB). Thanks for any ideas.