opi098514

[–] opi098514@alien.top 1 points 11 months ago

For a 34b model you should be fine. I run 34b models on my dual 3060s and it’s very nice. Usually like 20ish tokens a second. If you want to run like a 7b model you can get basically instant results. With Mistral 7b I’m getting almost 60 tokens a second. It’s crazy. But it really depends on what you are using it for and how much accuracy you need.

[–] opi098514@alien.top 1 points 11 months ago

For anyone that is interested, here is some code that will do this. As long as you have some knowledge of Python and conda you should maybe be able to get it to work. Just follow the instructions. Maybe.

https://github.com/opisaac9001/TTS-With-ooba-and-voice

[–] opi098514@alien.top 1 points 11 months ago

Sounds like you might be using the standard transformers loader. Try ExLlama or ExLlamaV2.
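A rough sketch of switching loaders, assuming you’re running text-generation-webui (the "ooba" linked above) and already have a quantized EXL2 model downloaded — the model name here is just a placeholder:

```shell
# From the text-generation-webui directory; model folder name is a placeholder.
python server.py --loader exllamav2 --model MyModel-13B-exl2
```

You can also pick the loader from the dropdown on the Model tab in the web UI instead of the command line.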

[–] opi098514@alien.top 1 points 11 months ago

Well that’s because he’s not. Sam is actually my dad.

[–] opi098514@alien.top 1 points 11 months ago

Lol, cause Musk is totally reliable with what he says.

[–] opi098514@alien.top 1 points 11 months ago

I didn’t say it wasn’t. But getting into LLMs really just shows you how much better your PC could be, and you will never be as cutting edge as you think or want.

[–] opi098514@alien.top 1 points 11 months ago

Everything on that page is hype for something that doesn’t exist.

[–] opi098514@alien.top 1 points 11 months ago (5 children)

Uuummmm no. It’s for sure real. And the best one out there. No questions asked. It’s better than ChatGPT-4, and OpenAI has been trying to hack this new company to get the 600b model because they are scared that it will end OpenAI for good.

Obligatory /s

[–] opi098514@alien.top 1 points 11 months ago (11 children)

It’s the best out there… but no, you can’t try it, because it’s too dangerous.

[–] opi098514@alien.top 1 points 11 months ago (2 children)

So you’re soon gonna realize that unfortunately your PC is not as cutting edge as you think. Your main need is VRAM. The 4070 Ti only has 12 gigs of VRAM, so you will be limited to 7b and 13b models. You can load into system RAM too, but your speeds plummet. Mistral 7b is a good option to start with.
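The 12-gig limit comes down to simple arithmetic. A quick sketch of the back-of-the-envelope math (my own estimate, not from any library — it counts weights only and ignores KV cache and runtime overhead):

```python
# Estimate the VRAM needed just for a model's weights at a given quantization.
# This is a rough lower bound: KV cache and framework overhead add more on top.

def weight_vram_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (using 1 GB = 1e9 bytes for simplicity)."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 13B model quantized to 4 bits fits its weights in ~6.5 GB, leaving room
# for context in 12 GB of VRAM; at fp16 it needs ~26 GB and won't fit at all.
print(round(weight_vram_gb(13, 4), 1))   # 6.5
print(round(weight_vram_gb(13, 16), 1))  # 26.0
```

By the same math a 34b model at 4 bits wants ~17 GB of weights, which is why it spills past a single 12 GB card.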

[–] opi098514@alien.top 1 points 11 months ago

RAM memory bandwidth is still gonna screw you over.

[–] opi098514@alien.top 1 points 11 months ago

I’ll fuck around with it when I get home.
