I run 7B models on my 1070. ollama run llama2 produces between 20 and 30 tokens per second on Ubuntu.
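If anyone wants to reproduce that number, ollama has a --verbose flag that prints eval timing after each response. A minimal sketch (the prompt and the figures in the comments are just illustrative, and the exact output labels may vary by version):

    # run a single prompt non-interactively and show timing stats
    ollama run llama2 --verbose "Explain TCP slow start in one paragraph."
    # the summary at the end includes lines like:
    #   eval count:    256 token(s)
    #   eval rate:     25.3 tokens/s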
Does anyone know the largest model that will fit on the new M3 Pro with 36GB of RAM? I'm looking to run some 23GB models with long context.
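Rough sizing logic, in case it helps frame the question: the weights aren't the whole story, since long context adds a per-token KV cache on top, and macOS by default only lets the GPU use a portion of unified memory. A back-of-the-envelope sketch where every number is an assumption (~23 GiB of quantized weights, llama-style dims of 48 layers / 8 KV heads / head_dim 128 with an fp16 cache, 16k context, and roughly 75% of RAM usable by the GPU):

    # KV cache bytes per token = 2 (K and V) * layers * kv_heads * head_dim * bytes
    KV_GIB=$(( 2 * 48 * 8 * 128 * 2 * 16384 / 1024 / 1024 / 1024 ))
    echo "KV cache at 16k ctx: ${KV_GIB} GiB"                  # -> 3 GiB
    echo "weights + KV: $(( 23 + KV_GIB )) GiB vs ~27 GiB usable"

So under those assumptions a 23GB model plus 16k of context lands around 26 GiB, which is right at the edge of what 36GB of unified memory leaves for the GPU.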
Did the AI coordinate his sacking?