Does anyone know how many streams of Llama 2 70B an Apple Mac Studio can run in parallel? Does each completion need the same amount of RAM, or does llama.cpp manage to share it between different streams?