these are my suggestions.
https://huggingface.co/TheBloke/cat-v1.0-13B-GPTQ
https://huggingface.co/TheBloke/Augmental-Unholy-13B-GPTQ
https://huggingface.co/TheBloke/HornyEchidna-13B-v0.1-GPTQ
and the one i keep coming back to but can barely run.
Here's a link to an up-to-date ranking of models for RP. Currently 400+ models ranked.
What about Cat 13b 1.0? It slipped through here without much attention but it looks really good; with 16gb you could run q8.
with 16gb you could run q8
Not really though. Any kind of context will push you over 16gb. Or I'm doing something wrong.
GGUF? Even on a GTX 1080 you get like 4 t/s with q8, which is almost as fast as the average person's reading speed; with 16gb it should be 4-5x faster.
Hadn't thought of that. I have 24gb so I've always used GPTQ and with that, you really need more than 16gb.
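For anyone wondering why a q8 13B spills past 16gb once context fills up, here's a rough back-of-the-envelope sketch. The layer count and hidden size are assumed Llama-2 13B numbers, and real loaders add extra overhead on top (scratch buffers, quantization metadata), so treat this as an estimate, not an exact figure:

```python
# Rough VRAM estimate: quantized weights + fp16 KV cache.
# Architecture numbers (40 layers, hidden size 5120) are assumptions
# for a Llama-2 13B; actual loaders need additional overhead.

def vram_estimate_gib(n_params, bits_per_weight, n_layers, hidden_size,
                      context_len, kv_bytes=2):
    """Return (weights_gib, kv_cache_gib, total_gib)."""
    weights = n_params * bits_per_weight / 8
    # K and V caches: two tensors per layer, hidden_size values per token
    kv_cache = 2 * n_layers * hidden_size * context_len * kv_bytes
    gib = 2 ** 30
    return weights / gib, kv_cache / gib, (weights + kv_cache) / gib

w, kv, total = vram_estimate_gib(
    n_params=13e9, bits_per_weight=8,    # q8 quantization
    n_layers=40, hidden_size=5120,       # assumed Llama-2 13B shape
    context_len=4096, kv_bytes=2)        # full 4k context, fp16 cache
print(f"weights ~{w:.1f} GiB, KV cache ~{kv:.1f} GiB, total ~{total:.1f} GiB")
```

That comes out to roughly 12 GiB of weights plus about 3 GiB of KV cache at full 4k context, so before any runtime overhead you're already at the edge of a 16gb card, which matches the experience above.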
Chat/RP is one of my main use cases so I test for that specifically - check out my latest LLM Comparison/Test which includes links to my previous tests.
I really liked Echidna-Tiefighter. Characters act way more natural than with any other 13B model I tried.