nhbis0n

joined 1 year ago
[–] nhbis0n@alien.top 1 points 11 months ago

I run 7B models on my 1070; ollama run llama2 produces between 20 and 30 tokens per second on Ubuntu.

Does anyone know the largest model size that will fit on the new M3 Pro with 36 GB of RAM? I am looking to run some 23 GB models with long context.
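A rough way to answer this kind of question is to compare the quantized model's file size plus runtime overhead against the GPU-visible share of unified memory. The sketch below is back-of-envelope only: the 0.7 GPU-visible fraction and the 2 GB overhead are assumptions I'm plugging in for illustration, not figures from this thread.

```python
# Back-of-envelope check for whether a quantized model fits in Apple
# unified memory. The gpu_fraction (share of RAM the GPU may use) and
# overhead_gb (KV cache, activations, OS headroom) are assumed values.

def fits_in_memory(model_gb, ram_gb=36, gpu_fraction=0.7, overhead_gb=2.0):
    """True if model weights plus runtime overhead fit in the
    GPU-visible share of unified RAM."""
    return model_gb + overhead_gb <= ram_gb * gpu_fraction

print(fits_in_memory(23.0))  # a 23 GB model sits right at the edge on 36 GB
print(fits_in_memory(30.0))  # a 30 GB model clearly does not fit
```

Long context makes the overhead term grow: at fp16, a Llama-2-7B-shaped model needs roughly 0.5 MB of KV cache per token, so a 32k-token context alone would add on the order of 16 GB on top of the weights.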

[–] nhbis0n@alien.top 1 points 11 months ago

Did the AI coordinate his sacking?