Just as open weight models are getting good. Qwen 3.6 27B just dropped with claimed performance approaching Opus 4.6, but it can run on a Mac with a M-series SoC. I tested it out today on a M4 Pro with Ollama and Cline and was impressed with its reasoning, but it was slow. Going to try with llama.cpp tomorrow and mess around tweaking it for speed.
https://ai.rs/ai-developer/qwen-3-6-27b-local-coding-model
AI coding agents are useful, but it’s time for the cloud-based models to chill out so we can get cheap RAM again to run our shit locally.
