CosmosisQ
[–] CosmosisQ@alien.top 1 points 11 months ago

He's just alluding to the fact that most enterprise customers are too stupid to use base models: they expect to be interacting with a human-like, dialogue-driven agent or chatbot rather than a supercharged text-completion engine. It's a shame given that, used properly, the GPT-4 base model is far superior to the lobotomized version made generally available through the API.

[–] CosmosisQ@alien.top 1 points 11 months ago

IANAL, but theoretically, it's not possible to copyright model weights (at least in the US). While the licensing of large language models hasn't been specifically tested in court, people have tried and failed to enforce copyright over the weights of other machine learning models. The alleged copyright holder may refuse to do business with you in the future, but you're unlikely to face legal repercussions.

[–] CosmosisQ@alien.top 1 points 11 months ago

Ooh, open or not, I'm seriously excited about this. Given the performance of Mistral-7B, a Mistral-180B model made available over an API of some sort would have a serious chance of dethroning GPT-4.

[–] CosmosisQ@alien.top 1 points 11 months ago

Hell yeah! Two days in a row! We need more people doing format comparisons and benchmarks in general. Again, thank you for all of your hard work, and keep 'em coming!

How would you say EXL2 subjectively compares to GGUF? Have you had the chance to roleplay with both formats outside of Voxta+VaM (i.e., in SillyTavern)? I ask because I'm sure the increased generation speed matters more than anything else when using Voxta+VaM, so it might be easier to compare their output quality in SillyTavern.

On that note, would you say you now prefer using lzlv (70B, EXL2) over OpenChat 3.5 (7B, GGUF) with Voxta+VaM?