chewbie

joined 2 years ago

Macs with 32GB of memory can run 70B models with the GPU. in c/localllama@poweruser.forum

chewbie@alien.top 1 points 2 years ago

Does anyone know how many Llama 2 70B streams an Apple Mac Studio can run in parallel? Does each completion need the same amount of RAM, or does llama.cpp manage to share memory between the different streams?
