chewbie@alien.top 1 points 2 years ago

Does anyone know how many streams of Llama 2 70B an Apple Studio can run in parallel? Does each completion need the same amount of RAM, or does llama.cpp manage to share it between the different streams?