rgar132@alien.top 1 points 10 months ago

Eh, I’ll differ here and say yes, it sometimes matters. For example, say you’re writing code that requires loading the model to debug: you watch it break on some mundane thing, tweak it a bit, and then have to wait for the model to reload all over again. I store most models on slow drives but keep a RAM disk set up for the one I’m actively working with. Copying from spinning rust to the RAM disk sometimes takes a few minutes, but after that every reload pulls the model into VRAM quickly.
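If you want to try that workflow, here’s a minimal sketch. It stages a model into /dev/shm, which is a RAM-backed tmpfs on most Linux distros, so repeated loads during a debug loop come from RAM instead of the slow drive. The paths are placeholders, and the final read is just a stand-in for whatever model loader you actually use:

```python
import shutil
import time
from pathlib import Path

# Hypothetical paths -- substitute your own model file and slow drive.
SLOW_COPY = Path("/mnt/slow-drive/models/model.gguf")
RAM_COPY = Path("/dev/shm/model.gguf")  # /dev/shm is tmpfs (RAM) on Linux

# One-time cost: stage the model from the slow drive into the RAM disk.
# Note /dev/shm defaults to half your RAM, so the model has to fit.
if not RAM_COPY.exists():
    start = time.perf_counter()
    shutil.copy2(SLOW_COPY, RAM_COPY)
    print(f"Staged to RAM disk in {time.perf_counter() - start:.1f}s")

# Every debug iteration after this reads from RAM, so the
# tweak-reload-crash loop stays fast.
start = time.perf_counter()
data = RAM_COPY.read_bytes()  # stand-in for your actual model loader
print(f"Read {len(data) / 1e9:.2f} GB in {time.perf_counter() - start:.1f}s")
```

The same idea works with an explicit `mount -t tmpfs` mount point if you’d rather size the RAM disk yourself.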

As for the quality or speed of the output after the model is loaded: no, it doesn’t really matter, as long as the model fully fits in VRAM or other fast storage while you’re using it.

All that said, a T7 is plenty fast anyway, much faster than a spinning platter or even many SATA SSDs, so you’ll be fine.