Thistleknot

joined 10 months ago
[–] Thistleknot@alien.top 1 points 9 months ago

rm is the reward model, not the same as the LM. I tried the LM and wasn't impressed; GPT-3.5 did better for summarizing quotes. It was good, but I honestly think OpenHermes and/or Synthia 1.3b do better.

[–] Thistleknot@alien.top 1 points 10 months ago

Yes, I understand all that.

Autoregressive is like ARIMA in time series forecasting.

Then RNNs came along.

Then sequence-to-sequence.

What they all have in common is that the last prediction is used as input for the next prediction.

Hence autoregressive.
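As a rough sketch of that shared loop (a toy illustration; predict() is a hypothetical stand-in for an ARIMA step, an RNN cell, or an LLM's next-token sampler):

```python
# Minimal autoregressive loop: the previous prediction is fed back as input.
def predict(history):
    # Toy model: next value is a weighted blend of the last two observations.
    return 0.7 * history[-1] + 0.3 * history[-2]

def autoregressive_forecast(history, steps):
    history = list(history)
    for _ in range(steps):
        next_value = predict(history)  # predict from everything seen so far...
        history.append(next_value)     # ...and feed that prediction back in as input
    return history[-steps:]

print(autoregressive_forecast([1.0, 2.0, 3.0], steps=5))
```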

[–] Thistleknot@alien.top 1 points 10 months ago (2 children)

I had to read that a few times.

Auto-regressive is like forecasting: it's iterative.

LLM reliability is this vague notion of trying to converge on the right answer.

Hence tree of thoughts as a way to 'plan' toward that vague notion of the right answer.

It works around the univariate next-token prediction limitation with parallel planning.
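A minimal sketch of what I mean by parallel planning (not the exact algorithm from the Tree of Thoughts paper; expand() and score() are hypothetical stand-ins for LLM calls):

```python
# Tree-of-thoughts-style search: instead of committing to a single next-token
# continuation, expand several candidate "thoughts" per step, score them,
# and keep only the most promising branches.
def expand(thought):
    # Hypothetical stand-in: ask the LLM for a few candidate next steps.
    return [f"{thought} -> option {i}" for i in range(3)]

def score(thought):
    # Hypothetical stand-in: ask the LLM (or a heuristic) to rate the partial plan.
    return len(thought)

def tree_of_thoughts(root, depth=3, beam_width=2):
    frontier = [root]
    for _ in range(depth):
        candidates = [child for t in frontier for child in expand(t)]
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]  # prune to the best branches
    return max(frontier, key=score)

print(tree_of_thoughts("problem statement"))
```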

[–] Thistleknot@alien.top 1 points 10 months ago

Sounds like free will vs. determinism.

The real question is: are minds deterministic?

[–] Thistleknot@alien.top 1 points 10 months ago

Btw, instead of starting from a premise and trying to justify it, evaluate the facts before forming your thesis.

Because this sounds like a religious argument: 'Help me defend my belief.'

You shouldn't be asking for facts to support your conclusion, because you shouldn't have a conclusion without facts.

[–] Thistleknot@alien.top 1 points 10 months ago

How would a non-compete work in this agreement?

[–] Thistleknot@alien.top 1 points 10 months ago

I was going to try to knowledge-distill it, but they modified their tokenizer.

Either way, GPT-Neo has a 125M model, so a 248M model is about 2x that. I imagine this could be useful for shorter-context tasks, or maybe for continued training on very tight use cases.

I came across it while looking for tiny Mistral config JSONs to replicate.
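For anyone else hunting for one, a tiny Mistral-architecture config can be sketched with Hugging Face transformers' MistralConfig (the dimensions below are illustrative guesses, not the actual config I was after):

```python
# Sketch of a tiny Mistral-architecture model; dimensions are illustrative only.
from transformers import MistralConfig, MistralForCausalLM

config = MistralConfig(
    vocab_size=32000,
    hidden_size=512,             # vs. 4096 in Mistral-7B
    intermediate_size=1536,
    num_hidden_layers=8,
    num_attention_heads=8,
    num_key_value_heads=4,       # grouped-query attention
    max_position_embeddings=2048,
    sliding_window=1024,
)

model = MistralForCausalLM(config)
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f}M parameters")
```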

https://preview.redd.it/l9l7a39u3a1c1.jpeg?width=720&format=pjpg&auto=webp&s=80589cb6fbb2268b0d8af65b4ec27647185b4780

[–] Thistleknot@alien.top 1 points 10 months ago (1 children)

I imagine that if you created an app, put it on everyone's cell phone, and used a fraction of each device's power, you could easily build an LLM that would surpass any single data center.
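That's roughly the federated-learning idea; a toy federated-averaging sketch (all functions here are hypothetical stand-ins, not a real on-device training stack):

```python
# Toy federated averaging: each "phone" computes a local update on its own data,
# and a central server averages the resulting weights.
import numpy as np

def local_update(weights, local_data, lr=0.1):
    # Toy "training" step: nudge weights toward the mean of the local data.
    gradient = weights - local_data.mean(axis=0)
    return weights - lr * gradient

def federated_round(global_weights, phones):
    # Each device trains locally; only weights (not data) are sent back and averaged.
    local_weights = [local_update(global_weights, data) for data in phones]
    return np.mean(local_weights, axis=0)

rng = np.random.default_rng(0)
phones = [rng.normal(loc=1.0, size=(100, 4)) for _ in range(1000)]  # data stays on-device
weights = np.zeros(4)
for _ in range(10):
    weights = federated_round(weights, phones)
print(weights)  # drifts toward the population mean
```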
