this post was submitted on 28 May 2026
Programming
I run this setup with 36GB (32+4). Local LLMs can be really effective BUT you are constrained by context size in a way you aren't on cloud services.
Cline supports running a local model through LM Studio, but in my experience, when I feed it any significant task, it just can't read and hold enough context to build components for enterprise-scale applications.
I use Claude to write a lot of one-off utility scripts. Even with a maximum window of 1M tokens, I can hit 30+% context just writing Python scripts. That context goes to API contracts, development standards, existing reusable modules, and sometimes the code/documentation of the services I'm going to be calling.
My MacBook can't handle 300k token contexts. 30k seems doable. I should see how it handles my utility script folder...
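For anyone who wants to run the same check on their own script folder, here is a minimal sketch. It assumes roughly 4 characters per token, which is only a rule of thumb (real tokenizers vary by model and by language), and the function names and the 30k default budget are illustrative, taken from the numbers above rather than from any tool.

```python
from pathlib import Path

# Assumption: ~4 characters per token. This is a rough heuristic only;
# use the model's real tokenizer if you need an accurate count.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for a piece of text."""
    return len(text) // CHARS_PER_TOKEN


def folder_token_estimate(folder: str, pattern: str = "*.py") -> int:
    """Sum the rough token estimates of all matching files under folder."""
    total = 0
    for path in sorted(Path(folder).rglob(pattern)):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            total += estimate_tokens(text)
    return total


def fits(tokens: int, budget: int = 30_000) -> bool:
    """Check whether an estimated token count fits a context budget."""
    return tokens <= budget
```

Calling `fits(folder_token_estimate("scripts"))` (with `"scripts"` standing in for your own folder) gives a quick yes/no against the 30k budget. It ignores the system prompt, tool definitions, and the model's own output, so treat a "yes" near the limit as a "probably not".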
Anyway, that's still no Claude, but if you need a cheaper model and can afford for developers to spend time on it before ultimately deciding they need to pay for Claude or Codex or Gemini, then running a local model on a beefy MacBook is 100% an option.
Stepping up from there to building a shared, locally hosted LLM server is probably the worst of all worlds. It will be a beefy CapEx, prone to saturation by all the users, and you will most likely still have to punt the hardest jobs to cloud AI. It can certainly be done and done well, but the best example I know of runs on $250-500k worth of hardware (to serve a pretty big number of users, to be fair).