chewbie@alien.top 1 points 11 months ago

Does anyone know how many streams of Llama 2 70B an Apple Mac Studio can run in parallel? Does each completion need the same amount of RAM, or does llama.cpp manage to share it between streams?