Very cool paper.
Great question, curious about the answer myself.
I think it’s pretty cool that just iteratively reusing an LLM without additional training, i.e. chaining prompts, improves quality across most of these methods. I see quite a few papers like this (e.g. System 2 Attention).
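To make that concrete, here’s a rough sketch of the two-pass pattern from the S2A paper (call_llm is a hypothetical wrapper around whatever completion API you use, not a real client):

    # Hypothetical wrapper around a completion API; plug in your own client.
    def call_llm(prompt: str) -> str:
        raise NotImplementedError

    def s2a_answer(context: str, question: str) -> str:
        # Pass 1: have the model rewrite the context, keeping only what is
        # relevant to the question (the System 2 Attention trick).
        cleaned = call_llm(
            "Rewrite this context, keeping only the parts relevant to the "
            f"question.\nContext: {context}\nQuestion: {question}"
        )
        # Pass 2: answer from the cleaned context alone.
        return call_llm(f"Context: {cleaned}\nQuestion: {question}\nAnswer:")

No gradients anywhere, just two forward passes chained together.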
The Promptbreeder paper has some benchmarking of these methods & proposes an interesting evolutionary prompting strategy.
But like you, I’ve been looking and waiting for the papers that specifically explore fine-tuning the model “nodes”, perhaps with LoRA, or with a meta-network or hypernetwork.
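The LoRA half of that is at least easy to picture. A bare-bones adapter layer looks something like this (a sketch in PyTorch; r and alpha follow the LoRA paper’s convention, everything else is illustrative):

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        # Freeze a base Linear and learn a low-rank update B @ A, so the
        # effective weight becomes W + (alpha / r) * B @ A.
        def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # only the adapter trains
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))  # starts as a no-op
            self.scale = alpha / r

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

You could imagine giving each prompt-chain “node” its own adapter like this and training them jointly.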
Yeah, so largely I think you’ve hit the nail on the head, but just in case you don’t know: the fervour is over a deliberately leaked project name, “Q*”, and the suggestion that it precipitated the OpenAI board drama. Now, is this probably a tactic to keep prices high so stock sells at the $65B valuation OAI had prior to the drama? Sure.
But it’s still fun to speculate.
If you don’t know how many of the 3000 events are detectable by an expert, how do you know your 60% classifier isn’t better than an expert already?
OK, so full speculation: this project could be an implementation of Q-learning (i.e. model-free reinforcement learning) on an internal GPT model. This would no doubt be an agent model.
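For reference, the textbook tabular update is tiny; an LLM-scale version would swap the table for a network, but the rule is the same:

    from collections import defaultdict

    # Q maps state -> {action: value}; Q-learning's one-step update is
    #   Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q = defaultdict(lambda: defaultdict(float))

    def q_update(s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
        best_next = max(Q[s_next][a2] for a2 in actions)
        Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])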
Other evidence? The * suggests a graph-search algorithm in the vein of A*, which obviously plays a huge role in RL exploration; but GPT models are also already doing their own graph traversal via beam search for next-token prediction.
Are they perhaps hooking up an RL-trained model to replace their beam search?
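To make the graph-traversal framing concrete, beam search is just pruned best-first expansion over token sequences. A toy sketch (next_token_logprobs is a hypothetical stand-in for one forward pass of the model):

    def beam_search(next_token_logprobs, start, width=4, steps=10):
        beams = [([start], 0.0)]  # (token sequence, cumulative log-prob)
        for _ in range(steps):
            candidates = []
            for seq, score in beams:
                for tok, lp in next_token_logprobs(seq):
                    candidates.append((seq + [tok], score + lp))
            # keep only the `width` best partial sequences
            beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:width]
        return beams

In this picture, an RL-trained value model would replace the raw log-prob pruning with learned estimates of which branches are worth expanding.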
Well, now I feel almost obligated to click: is the “deep dive” part of the title completely misleading, or is the post really just a LoRA explanation?
I’d add a few others to this list, but I largely agree with the premise that we focus too much on attention. We lavish praise on the Transformer, yet so much extra machinery has to go into it to make it work even a little bit; and now papers are coming out claiming ConvNets scale at the same rate, and the RetNet paper claims you can swap out attention altogether.
Obv. the issue is “emergence” (a terrible term, but I mean non-linear training performance) and the sheer cost of testing permutations of LLM architecture at scale. To what extent has the ML community become a victim of sunk cost?
Elron's track record with product announcements is so woeful that, statistically speaking, you'd be better off assuming Grok has zero chance of being open-sourced now that it's been announced.