this post was submitted on 23 Nov 2023

Hi team

I'm new to this and just installed LM Studio (I'm on an M1 Pro with 16GB RAM). When I search for a model I get a lot of options (see the screenshot below). Which one should I go for, and why?

Also, can you help me understand what this machine is capable of, and which models you'd recommend, whether for your own use cases or just for fun?

Thank you!!!


https://preview.redd.it/2jim35m0432c1.png?width=1650&format=png&auto=webp&s=970cecd07f537c5352428436a0f9f1840bf562f2

brobruh211@alien.top 1 points 10 months ago

The options you are seeing are different quants of the same model. For 7Bs, you generally want to stick to Q4_K_M and up. Generally, the bigger the file size, the closer its quality is to the original unquantized model.

For 7B models, your 16GB unified memory should be able to run the Q6_K variant with 8192 context size no problem. The model you're looking at is good but it's slightly dated at this point. Hard to recommend models without knowing your specific use case for it, but here goes nothing:

  • TheBloke/OpenHermes-2.5-Mistral-7B-GGUF (creative, decent at following instructions, good for roleplaying but also as an all-around model).
  • TheBloke/zephyr-7B-beta-GGUF (great at following instructions, good prose, less creative than the above for roleplaying purposes).
  • TheBloke/Synatra-7B-v0.3-RP-GGUF (creative model that seems specialized for roleplaying purposes).
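
If you want to sanity-check whether a given quant and context size will fit in your 16GB, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the bits-per-weight figures are rounded approximations for llama.cpp quant types, the Mistral-7B shape (about 7.24B parameters, 32 layers, 8 KV heads, head dimension 128) is taken from the public model config, and the KV cache is assumed to be stored in 16-bit. It ignores compute buffers and whatever else macOS is using, so treat the result as a lower bound.

```python
# Rough memory estimate for a GGUF model: quantized weights + KV cache.
# All constants are approximations (assumed), not values read from a model file.

# Approximate bits per weight for a few llama.cpp quant types (rounded).
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q5_K_M": 5.69, "Q6_K": 6.59, "Q8_0": 8.5}

GIB = 2**30


def weights_gib(n_params: float, quant: str) -> float:
    """Approximate size of the quantized weights in GiB."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 n_ctx: int, bytes_per_value: int = 2) -> float:
    """Approximate KV cache size in GiB (keys + values, 16-bit by default)."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_value / GIB


# Mistral-7B style model at Q6_K with 8192 context (shape values assumed).
w = weights_gib(7.24e9, "Q6_K")
kv = kv_cache_gib(n_layers=32, n_kv_heads=8, head_dim=128, n_ctx=8192)

print(f"weights  ~{w:.1f} GiB")
print(f"KV cache ~{kv:.1f} GiB")
print(f"total    ~{w + kv:.1f} GiB (before compute buffers and OS overhead)")
```

With these assumptions the total lands well under 16GB, which is consistent with a 7B Q6_K at 8192 context running comfortably. You can plug in other quants or context sizes to see how the numbers move.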

I recommend trying out some 13Bs as well. In my experience, a good 13B is still better than a good 7B (for roleplaying purposes at least). With 13Bs, I recommend using Q5_K_M variants with 6144 context size. Llama 2 based 13Bs are trained at 4096 context, so going up to 6144 relies on RoPE scaling. KoboldCpp sets the RoPE scaling automatically, but I'm not sure how LM Studio handles it. Here are some models you can try out:

  • KoboldAI/LLaMA2-13B-Tiefighter-GGUF (great all-around model for its intelligence and creativity).
  • TheBloke/X-NoroChronos-13B-GGUF (creative merged model that seems specialized for roleplaying purposes).
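
On the RoPE scaling point: if your frontend does not set it for you, the simplest variant (linear scaling) is just a ratio between the context you want and the context the model was trained on. The sketch below is only an illustration of that arithmetic, assuming linear scaling and a 4096-token native context for Llama 2. The function name is made up for this example, and KoboldCpp or LM Studio may well use a different method (such as adjusting the RoPE base frequency) under the hood.

```python
# Linear RoPE scaling arithmetic (illustrative sketch, hypothetical helper name).

def linear_rope_scale(target_ctx: int, native_ctx: int) -> float:
    """Factor by which positions are compressed to fit target_ctx.

    Returns 1.0 when the target context already fits in the native context.
    """
    return max(1.0, target_ctx / native_ctx)


# Llama 2 13B: native context 4096, desired context 6144.
scale = linear_rope_scale(6144, 4096)

# Some llama.cpp-style loaders ask for the inverse, a "frequency scale".
freq_scale = 1.0 / scale

print(f"linear scale factor: {scale}")
print(f"frequency scale:     {freq_scale:.4f}")
```

Under these assumptions, 6144 context on a 4096-context model works out to a scale factor of 1.5.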