tl;dr: I'm considering building a budget machine for tinkering with LLMs, but I'm not sure if this is a good idea and how to go about it.
For context: I work in a university department. I currently have access to a 2080 Ti on a shared machine, and we're in the process of acquiring a small server with 2 L40 cards. So for any larger experiments, I will be able to use that server.
However, I think I would like to have my own small machine for tinkering: trying different models and techniques, just playing around, and preparing larger experiments to be run on the server. My focus is on teaching and education, not on state-of-the-art research.
Since I'm aiming for a good amount of VRAM, the 4060 Ti 16GB seems to be the most obvious choice; I also like its low power requirements (regarding energy and cooling). But this card seems to have a poor reputation overall. I'm also not sure what the current sweet spot is w.r.t. the CPU and memory; I have completely lost track of Intel's and AMD's generations over the last few years.
Some additional comments regarding common suggestions:
- I simply like to have my own hardware, and cloud services seem to be more expensive in the long run.
- There is not really a good market for used GPUs where I'm located (Singapore), so the common suggestion "go with a used 3090" does not really work.
Any good suggestions, or am I naive with my idea of a budget machine? Thanks a lot!
Mistral 7B is very good and can be run on 8 GB of VRAM. It was blazing fast on my 3070. I have a 4090 as well, and for all intents and purposes it's indistinguishable.
Right now, Mistral 7B competes with the best 13B-parameter models. Unless you plan on using code LLMs, there aren't many new 30B-parameter models that matter that much.
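To put rough numbers on which of these model sizes fit on which card, here is a back-of-the-envelope sketch. It only counts the weights (parameters times bits per weight) and adds a flat overhead allowance; the 1.5 GiB overhead figure and the helper names `weight_gib` and `fits` are my own assumptions for illustration, and real usage also depends on context length, KV cache, and the runtime you use.

```python
# Rough VRAM estimate for LLM weights: parameters * bits per weight / 8 bytes.
# Assumption: KV cache, activations, and framework overhead are lumped into a
# flat allowance (overhead_gib), which is a guess rather than a measured value.

GIB = 1024 ** 3


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / GIB


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """True if weights plus the assumed overhead fit into the given VRAM."""
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # Model sizes mentioned in the thread, at common precisions.
    for params in (7, 13, 30):
        for bits in (16, 8, 4):
            size = weight_gib(params, bits)
            print(f"{params:>2}B @ {bits:>2}-bit: {size:5.1f} GiB weights | "
                  f"8 GB card: {fits(params, bits, 8)} | "
                  f"16 GB card: {fits(params, bits, 16)}")
```

By this estimate, a 7B model needs 4-bit quantization (or similar) to sit comfortably in 8 GB, while a 16 GB card leaves room for 7B at 16-bit or 13B at 8-bit.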
I have a 3070 in my Proxmox home server with, I think, only 2 physical cores and 16 GB of RAM allocated, and I'm getting 40+ tokens per second.
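If you want to compare numbers like this across machines, a minimal timing sketch could look like the following. The `generate` callable is a placeholder for whatever backend you use (it is assumed to take a prompt and return the sequence of generated tokens); `tokens_per_second` and `fake_generate` are hypothetical names, not part of any library.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(generate: Callable[[str], Sequence],
                      prompt: str,
                      clock: Callable[[], float] = time.perf_counter) -> float:
    """Time one generation call and return generated tokens per second.

    Assumption: generate(prompt) returns only the newly generated tokens,
    so len() of the result is the token count to divide by.
    """
    start = clock()
    tokens = generate(prompt)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return len(tokens) / elapsed


if __name__ == "__main__":
    # Stand-in for a real model call: sleeps briefly and returns 20 dummy tokens.
    def fake_generate(prompt: str) -> list:
        time.sleep(0.5)
        return ["tok"] * 20

    rate = tokens_per_second(fake_generate, "Hello")
    print(f"{rate:.1f} tokens/s")
```

Note that this lumps prompt processing and generation together, so short prompts and longer outputs give a fairer picture of generation speed.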
It wouldn't be future-proof, but it would work fine for now.