this post was submitted on 17 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

top 6 comments
[–] Wonderful_Ad_5134@alien.top 1 points 10 months ago (1 children)

I'm getting tired of all these merges, as if they were the magical solution to everything.

[–] arekku255@alien.top 1 points 10 months ago

At a high level, merges allow you to get a lot of training done cheaply.

If you have a model finetuned on set A and another model finetuned on set B, merging them lets you very cheaply create a model that behaves as if it was trained on both set A and set B.

It is the magical solution to "I can't afford to finetune a model".
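
To make that concrete: in its simplest form a merge is just a weighted average of the two checkpoints' weights. Here's a minimal sketch, assuming both fine-tunes share the same base architecture; the repo names are placeholders, and real merge tools use fancier schemes (SLERP, task vectors, TIES), but the idea is the same:

```python
# Minimal linear-merge sketch: blend two fine-tunes of the same base model.
import torch
from transformers import AutoModelForCausalLM

alpha = 0.5  # 0.5 = plain average of model A and model B

# Placeholder repo ids -- substitute your own fine-tunes of the same base.
model_a = AutoModelForCausalLM.from_pretrained("your-org/llama-finetune-set-A")
model_b = AutoModelForCausalLM.from_pretrained("your-org/llama-finetune-set-B")

state_a = model_a.state_dict()
state_b = model_b.state_dict()

merged = {}
for name, tensor_a in state_a.items():
    tensor_b = state_b[name]
    if torch.is_floating_point(tensor_a):
        merged[name] = alpha * tensor_a + (1.0 - alpha) * tensor_b
    else:
        merged[name] = tensor_a  # integer buffers etc. are copied as-is

model_a.load_state_dict(merged)          # reuse model A's skeleton as the container
model_a.save_pretrained("merged-model")  # no gradient steps, no GPU training run
```

The entire "training" cost is a few minutes of tensor arithmetic on existing checkpoints, which is where the cheapness comes from; tools like mergekit wrap the same idea in more sophisticated merge methods.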

[–] Desm0nt@alien.top 1 points 10 months ago (1 children)

Is it still only 4k context size?

I hope one day someone somehow finds a way to extend Tiefighter's context to at least 8k.
It's the perfect model for real-time RP and stories even on weak PCs. It's smarter than all the 7B and 13B models and smarter than many 30B models, but its modest 4k-token context gets eaten up faster than you can enjoy its potential...
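
For what it's worth, the usual stopgap for stretching a 4k Llama-2 model past its trained length is RoPE scaling rather than retraining. A rough sketch with transformers (the repo id below is an assumption, adjust it to whatever Tiefighter checkpoint you actually use):

```python
# Linear RoPE scaling to push a 4k Llama-2 model to roughly 8k of usable context.
# Quality usually degrades somewhat past the trained length.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KoboldAI/LLaMA2-13B-Tiefighter"  # assumed HF repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    rope_scaling={"type": "linear", "factor": 2.0},  # 4k native * 2 ~= 8k usable
    torch_dtype="auto",
    device_map="auto",
)
```

For GGUF on llama.cpp the rough equivalent is `--rope-freq-scale 0.5 -c 8192`, though how gracefully a given model holds up past its native context varies.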

[–] Majestical-psyche@alien.top 1 points 10 months ago

Sorry, I was playing around with it for the last day… So far I prefer it over 34B Dolphin Yi (GGUF Q4_K_M)… As for context size, I only used 8k and it was pretty good at recalling things far back. It might be able to do 12k, idk… haven't tried it.

[–] FPham@alien.top 1 points 10 months ago

Such déjà vu from CivitAI: randomly merge a lot of stuff and see what you get. In my view MythoMax 13B was probably the best merge and also a lucky strike, because the same formula didn't work as well for other merges, nor did the new MythoMax redo surpass the old one.

It's 40% voodoo and 60% luck.

[–] frozen_tuna@alien.top 1 points 10 months ago

Anyone know a good tutorial for how merges are made?