LocalLLaMA

A community for discussing Llama, the family of large language models created by Meta AI.

founded 10 months ago

MODERATORS

communick@poweruser.forum

Models Megathread #2 - What models are you currently using? (alien.top)

submitted 9 months ago by Technical_Leather949@alien.top to c/localllama@poweruser.forum

As requested, this is the subreddit's second megathread for model discussion. This thread will now be hosted at least once a month to keep the discussion updated and help reduce identical posts.

I also saw that we hit 80,000 members recently! Thanks to every member for joining and making this happen.

Welcome to the r/LocalLLaMA Models Megathread

What models are you currently using and why? Do you use 7B, 13B, 33B, 34B, or 70B? Share any and all recommendations you have!

Examples of popular categories:

Assistant chatting
Chatting
Coding
Language-specific
Misc. professional use
Role-playing
Storytelling
Visual instruction

Have feedback or suggestions for other discussion topics? All suggestions are appreciated and can be sent to modmail.

P.S. LocalLLaMA is looking for someone who can manage Discord. If you have experience modding Discord servers, your help would be welcome. Send a message if interested.

Previous Thread | New Models

[–] Helpful-Gene9733@alien.top 1 points 9 months ago (2 children)

On a resource-limited machine (2017 i5 iMac, CPU only), I am getting very pleasing results with:

Openhermes2-mistral (7B, 4-bit K_M quant) for general chat, desktop assistant, and some coding assistance - Ollama backend with my own front-end UI and a llama-index implementation. Haven’t tried 2.5 yet but may.

Synatra 7B, a Mistral fine-tune (4-bit K_M quant), seems to produce longer and spicier responses with the same system prompt (same use case as above).

Deepseek-coder 6.7B (4-bit quant) as a coding assistant alternative to GPT-3.5 - just started trying it out in the last week or so, and I'm building a personalized coding assistant front-end UI for fun.

OrcaMini-3B - for chat when I just want something smaller and faster to run on my machine - the 7B quants are about the max for the old iMac. OrcaMini sometimes doesn’t give me great results, though.
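
For anyone curious what an Ollama backend with a custom front end can look like, here is a minimal sketch of the plumbing, not the actual code from this setup. It assumes an Ollama server on its default local port (11434) and uses the `/api/generate` endpoint with streaming turned off. The model tag `openhermes2-mistral` in the usage comment is an assumption about what was pulled locally, and the helper names `build_request` and `ask` are made up for illustration.

```python
import json
import urllib.request

# Default local endpoint of an Ollama server (assumes a stock install).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, system=None):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if system:
        # Optional system prompt, so the same prompt can be reused across models.
        payload["system"] = system
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model, prompt, system=None, timeout=300):
    """Send the request and return the generated text (needs a running server)."""
    request = build_request(model, prompt, system)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example call, assuming the model tag below has been pulled locally:
# print(ask("openhermes2-mistral", "Explain list comprehensions briefly.",
#           system="You are a concise desktop assistant."))
```

Keeping the system prompt as a parameter makes it easy to compare two fine-tunes with the same prompt, which is how the Openhermes and Synatra comparison above was described. The generous timeout is a guess to allow for slow CPU-only generation.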

[–] SideShow_Bot@alien.top 1 points 9 months ago (1 children)

IIUC, for coding you suggest deepseek-coder-6.7b-instruct.Q4_K_M.gguf, right? Can I run it with 16 GB of RAM? I'm on an i5 Windows machine, using LM Studio.

[–] Helpful-Gene9733@alien.top 1 points 9 months ago

Yes, that’s the one from TheBloke. I imagine you could, but try it! I can run it on an old 3.4 GHz i5 with 8 GB of RAM, and it seems to run as long as I’m not keeping a bunch of other stuff open and using up RAM. I haven’t really used it a lot, so I can’t fully judge it yet.
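
A rough back-of-the-envelope check of the RAM question, as a sketch rather than a guarantee. The weights take roughly parameters × bits per weight / 8 bytes. The 4.85 bits per weight used for Q4_K_M is an approximate figure, and the 2 GiB allowance for context and runtime overhead is a guess, so treat both as assumptions. The function names are made up for illustration.

```python
GIB = 2**30  # bytes per GiB


def estimate_model_gib(n_params, bits_per_weight):
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def fits_in_ram(n_params, bits_per_weight, ram_gib, overhead_gib=2.0):
    """Return (fits, needed_gib); overhead_gib is an assumed allowance
    for the context cache and runtime, not a measured value."""
    needed = estimate_model_gib(n_params, bits_per_weight) + overhead_gib
    return needed <= ram_gib, needed


# Deepseek-coder 6.7B at an assumed ~4.85 bits per weight (Q4_K_M)
for ram in (8, 16):
    ok, needed = fits_in_ram(6.7e9, 4.85, ram)
    print(f"{ram} GiB RAM: need about {needed:.1f} GiB -> {'fits' if ok else 'too tight'}")
```

The estimate ignores whatever the OS and other open apps are using, which is why 8 GB only works with little else running, while 16 GB leaves comfortable headroom.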