DarthInfinix

joined 1 year ago
[–] DarthInfinix@alien.top 1 points 11 months ago

Hmm, theoretically, if you switch to a super light Linux distro and grab a Q2-quantized 7B model, llama.cpp (where mmap is on by default) should be able to run it. I say that because I can run a 7B on a shitty $150 Android phone with about 3 GB of RAM free using llama.cpp.
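For a rough sanity check on why that can fit, here is a back-of-the-envelope sketch of the weight size. The numbers are my assumptions, not measurements: 7e9 parameters and roughly 3 bits per weight for a Q2-style quant (the real average depends on the exact quant type), and it ignores the KV cache and runtime overhead. `approx_weight_bytes` is just a hypothetical helper name.

```python
def approx_weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Rough size of the quantized weights alone, in bytes."""
    return n_params * bits_per_weight / 8


GIB = 1024 ** 3

# Assumed values (not from the post): 7e9 parameters, ~3 bits per weight.
size_gib = approx_weight_bytes(7e9, 3.0) / GIB
print(f"approx. weights: {size_gib:.2f} GiB")
```

With mmap the weights are paged in from disk on demand, so the file doesn't have to sit fully in RAM at once, though it gets slow if it doesn't.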

[–] DarthInfinix@alien.top 1 points 11 months ago

It would probably be more economical to rent an A100.

[–] DarthInfinix@alien.top 1 points 11 months ago (1 children)

First off, why is your title formatted like an article? You mention being a student, so do you mean you want resources for studying the underlying architecture of large language models? In that case you could watch Andrej Karpathy's channel, which was pretty enlightening to a layman like me, but it's pretty hard to progress further than that on the 'science' of LLMs without a CS degree.

Other than that, there really isn't a 'study' to be done of LLMs, as it's a pretty new field, unless you want to get into hardcore ML stuff with a CS degree and all. As for models, with your not quite 'cutting edge' PC, you could try a Yi 34B finetune for longer context, though it's prone to breaking in my testing, or you could try one of the many smaller 7B models. Of those, I have been enjoying the whole family of Mistral finetunes the most (OpenOrca, OpenHermes, etc.).

For roleplay and stuff, the LLaMA Tiefighter model is pretty cool. If you truly want access to cutting-edge hardware capable of running the best open-source models, e.g. Llama 70B or Goliath 120B, you could look into paid cloud GPU services like RunPod, which are pretty easy to use; I had a mostly positive experience running LLMs there. Hope this answer helps.