As requested, this is the subreddit's second megathread for model discussion. This thread will now be hosted at least once a month to keep the discussion updated and help reduce identical posts.
I also saw that we hit 80,000 members recently! Thanks to every member for joining and making this happen.
Welcome to the r/LocalLLaMA Models Megathread
What models are you currently using and why? Do you use 7B, 13B, 33B, 34B, or 70B? Share any and all recommendations you have!
Examples of popular categories:
- Assistant chatting
- Chatting
- Coding
- Language-specific
- Misc. professional use
- Role-playing
- Storytelling
- Visual instruction
Have feedback or suggestions for other discussion topics? All suggestions are appreciated and can be sent to modmail.
^(P.S. LocalLLaMA is looking for someone who can manage Discord. If you have experience modding Discord servers, your help would be welcome. Send a message if interested.)
Previous Thread | New Models
I'm one of those weirdos merging 70b models together for fun. I mostly use my own merges now as they've become quite good. (Link to my Hugging Face page where I share my merges.) I'm mostly interested in roleplaying and storytelling with local LLMs.
What method do you use to merge them? Mixture of experts?
There are several popular methods, all supported by the lovely mergekit project at https://github.com/cg123/mergekit.
The ties merge method is the newest and most advanced of them. It works well because it implements some logic to minimize how much the models step on each other's toes when you merge them together. Mergekit also makes it easy to do "frankenmerges" using the passthrough method, where you interleave layers from different models to extend the resulting model's size beyond the normal limits. For example, that's how goliath-120b was made from two 70b models merged together.
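For anyone curious what this looks like in practice, here's a rough sketch of a mergekit ties config. The model names and parameter values below are placeholders for illustration, not settings I'm recommending:

```yaml
# Sketch of a ties merge config for mergekit.
# Model names and parameter values are placeholders.
merge_method: ties
base_model: meta-llama/Llama-2-70b-hf
models:
  - model: meta-llama/Llama-2-70b-hf
    # the base model itself takes no parameters
  - model: some-org/model-a-70b        # placeholder fine-tune
    parameters:
      density: 0.5   # fraction of each model's delta weights to keep
      weight: 0.5    # how much this model contributes to the merge
  - model: some-org/model-b-70b        # placeholder fine-tune
    parameters:
      density: 0.5
      weight: 0.3
parameters:
  normalize: true    # rescale combined weights
dtype: float16
```

You'd feed a config like this to the `mergekit-yaml` command with an output directory. Passthrough frankenmerges use a `slices` list with `layer_range` entries instead of the `models` list, which is how the layer interleaving is specified.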