this post was submitted on 28 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


Yes, this has to be the worst RAM setup you guys have ever seen, but hear me out. Is it possible? I want to run the full 70B model, but that's far out of the question and I'm not even going to bother. Can I at least run the 13B, or failing that, the 7B?

m18coppola@alien.top · 9 months ago

I have run 7B models with Q2_K quantization on my Raspberry Pi with 4GB of RAM, lol. It's kinda slow (still faster than I bargained for), but Q2_K models tend to be pretty stupid at the 7B size, no matter the speed. You can theoretically run a bigger model using swap space (kind of like using your storage drive as extra RAM), but then token generation speed comes crawling to a halt.
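
If you want to try the same thing, here's a minimal sketch using llama-cpp-python (the Python bindings for llama.cpp). The model filename is a placeholder for whatever Q2_K GGUF file you actually download, and the thread count assumes a 4-core Pi:

```python
# Minimal sketch: run a Q2_K-quantized 7B model on low-RAM hardware
# with llama-cpp-python. The model path below is hypothetical --
# substitute the GGUF file you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-7b.Q2_K.gguf",  # assumed filename, ~3GB at Q2_K
    n_ctx=512,       # small context window keeps the KV cache light on RAM
    n_threads=4,     # match the Pi's four cores
)

out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```

One thing worth knowing: llama.cpp memory-maps the model file by default, so if the file is bigger than your RAM the OS will page weights in from storage as needed. That's roughly the swap-space effect described above: it runs, but every token ends up waiting on disk I/O.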