overview for Wonderful_Ad

Qwen-72B released in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 11 months ago

If the US keeps going full woke and is too afraid to work as hard as possible on the LLM ecosystem, China won't think twice before winning this battle (which is basically the 21st-century battle in terms of technology)

Feels sad to see the US decline like that...

A new way to speed up the work of transformers. in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 11 months ago (1 children)

"we provide high-level CPU code achieving 78x speedup over the optimized baseline feedforward implementation"

Big if true: we wouldn't need to buy 3090 cards anymore to get sufficient memory, just buying more RAM would be enough

NeuralHermes-2.5: Boosting SFT models' performance with DPO in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 11 months ago

The improvement is so small it could be within the margin of error

Starling-RM-7B-alpha: New RLAIF Finetuned 7b Model beats Openchat 3.5 and comes close to GPT-4 in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 11 months ago (1 children)

"Close to GPT-4" is as true as "Me, close to Usain Bolt in the 100m dash" lol

Why LocalLLaMa when GPT-4 exists? in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 11 months ago

Local models aren't censored lol

Identity-PO: DeepMind takes the ELO out of DPO in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 11 months ago (1 children)

So that means that we can get even better finetunes in the future? Noice!

Sam right now: in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 11 months ago

Why?

The newly released Psyfighter2 13B, A better version of Tiefighter? in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 11 months ago (1 children)

I'm getting tired of all these merges, as if merging were the magical solution to everything

"Base" models were actually trained with some GPT instruct datasets in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 1 year ago

I know, right? Getting that much investment for something you can easily cheat on makes me sick

"Base" models were actually trained with some GPT instruct datasets in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 1 year ago

Llama2 was pre-trained on old data (from before the ChatGPT AI poisoning became significant)

https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md

"Data Freshness: The pretraining data has a cutoff of September 2022, but some tuning data is more recent, up to July 2023."

"Model Dates: Llama 2 was trained between January 2023 and July 2023."

StableLM3b was trained on more recent datasets (cutoff of March 2023), yet it doesn't show this amount of ChatGPT poisoning

https://huggingface.co/stabilityai/stablelm-base-alpha-3b-v2

https://preview.redd.it/gl46fo50n10c1.png?width=518&format=png&auto=webp&s=c7cae52b292dcba45dee735a4ca7efac5630a4ae

"Base" models were actually trained with some GPT instruct datasets (alien.top)

submitted 1 year ago by Wonderful_Ad_5134@alien.top to c/localllama@poweruser.forum

9 comments

Look at this: apart from Llama1, all the other "base" models will likely answer "language" after "As an AI". That suggests Meta, Mistral AI and 01-ai (the company that made Yi) likely trained the "base" models with GPT instruct datasets to inflate the benchmark scores and make it look like the "base" models had a lot of potential. We got duped hard on that one.

https://preview.redd.it/vqtjkw1vdyzb1.png?width=653&format=png&auto=webp&s=91652053bcbc8a7b50bced9bbf8638fa417387bb
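
For anyone who wants to repeat the "As an AI" check on their own models, here is a rough sketch of the idea, not the exact method behind the screenshot. It assumes you can wrap your inference backend in a function that returns next-token probabilities for a prompt; `next_token_probs`, `top_continuation`, `answers_language` and the 0.5 threshold are all made-up names and values for illustration, and the two toy functions use invented numbers just to show the call pattern.

```python
from typing import Callable, Dict, Tuple

# Hypothetical interface: maps a prompt to {next_token: probability}.
# A real backend would have to be wrapped to fit this shape;
# nothing here is an actual library API.
NextTokenFn = Callable[[str], Dict[str, float]]


def top_continuation(next_token_probs: NextTokenFn,
                     prompt: str = "As an AI") -> Tuple[str, float]:
    """Return the most likely next token (whitespace stripped) and its probability."""
    probs = next_token_probs(prompt)
    token = max(probs, key=probs.get)
    return token.strip(), probs[token]


def answers_language(next_token_probs: NextTokenFn,
                     threshold: float = 0.5) -> bool:
    """True if the top pick after "As an AI" is "language" with probability
    at or above `threshold` (the threshold is an arbitrary assumption)."""
    token, prob = top_continuation(next_token_probs)
    return token == "language" and prob >= threshold


# Toy stand-ins with invented numbers, only to demonstrate usage.
def toy_contaminated(prompt: str) -> Dict[str, float]:
    return {" language": 0.91, " system": 0.05, ",": 0.04}


def toy_clean(prompt: str) -> Dict[str, float]:
    return {",": 0.30, " system": 0.25, " language": 0.20, " researcher": 0.25}


if __name__ == "__main__":
    for name, fn in [("toy_contaminated", toy_contaminated),
                     ("toy_clean", toy_clean)]:
        print(name, top_continuation(fn), answers_language(fn))
```

A high probability on "language" would only hint that "As an AI language model" text was in the training data. It can't tell you whether that came from deliberate instruct tuning or from ChatGPT output that ended up in web scrapes.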

RAG - Vectara's Hallucination leaderboard in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 1 year ago

"llama2 7b > llama2 13b"

lol

Looks like Mistral's cooking something tasty... no word on release date yet, though. in c/localllama@poweruser.forum

Wonderful_Ad_5134@alien.top 1 points 1 year ago

Please make a 13B model...