LocalLLaMA


A community for discussing Llama, the family of large language models created by Meta AI.


What is Q* and how do we use it? (alien.top)

submitted 11 months ago by georgejrjrjr@alien.top to c/localllama@poweruser.forum


Reuters is reporting that OpenAI achieved an advance with a technique called Q* (pronounced Q-Star).

So what is Q*?

I asked around the AI researcher campfire and…

It’s probably Q-learning MCTS, a reinforcement learning algorithm built on Monte Carlo tree search.
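For anyone who hasn't met the Q-learning half of that guess, here is a minimal tabular sketch on a toy problem. It is not anything OpenAI has described; the chain environment, the function name `q_learning_chain`, and every hyperparameter are assumptions made up for illustration. The behavior policy is uniformly random, which works because Q-learning is off-policy.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a hypothetical chain of states 0..n_states-1.

    Actions: index 0 steps left, index 1 steps right. Reaching the last
    state gives reward 1 and ends the episode. All names are illustrative.
    """
    rng = random.Random(seed)
    steps = (-1, +1)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    terminal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != terminal:
            a = rng.randrange(2)  # random behavior policy (off-policy learning)
            s_next = min(max(s + steps[a], 0), terminal)
            r = 1.0 if s_next == terminal else 0.0
            # Q-learning update: bootstrap from the best action in the next state
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q

q = q_learning_chain()
# Greedy policy read off the learned table (1 means "step right")
policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(4)]
```

The learned table should prefer stepping right in every non-terminal state, since that is the only way to reach the reward.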

Which is right in line with the strategy DeepMind (vaguely) said they’re taking with Gemini.

Another corroborating data-point: an early GPT-4 tester mentioned on a podcast that they are working on ways to trade inference compute for smarter output. MCTS is probably the most promising method in the literature for doing that.
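To make "trade inference compute for smarter output" concrete, here is a rough UCT-style MCTS sketch on a toy problem. The task (choose 6 bits, reward is the fraction of 1s), the names `Node`, `rollout`, and `mcts`, and the constants are all assumptions for illustration; it is not Weave's code or anyone's production method. The `iterations` argument is the compute knob: more iterations means more search per decision.

```python
import math
import random

DEPTH = 6          # toy task: choose 6 bits in sequence
ACTIONS = (0, 1)

class Node:
    def __init__(self, state, parent=None):
        self.state = state      # tuple of bits chosen so far
        self.parent = parent
        self.children = {}      # action -> Node
        self.visits = 0
        self.value_sum = 0.0

def rollout(state, rng):
    """Finish the sequence at random and score it (fraction of 1s)."""
    while len(state) < DEPTH:
        state = state + (rng.choice(ACTIONS),)
    return sum(state) / DEPTH

def select_child(node, c):
    """Pick the child with the highest UCB1 score."""
    def ucb(child):
        mean = child.value_sum / child.visits
        return mean + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children.values(), key=ucb)

def mcts(iterations=2000, c=1.4, seed=0):
    rng = random.Random(seed)
    root = Node(())
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and non-terminal
        while len(node.state) < DEPTH and len(node.children) == len(ACTIONS):
            node = select_child(node, c)
        # Expansion: add one untried child, if any
        if len(node.state) < DEPTH:
            action = rng.choice([a for a in ACTIONS if a not in node.children])
            node.children[action] = Node(node.state + (action,), parent=node)
            node = node.children[action]
        # Simulation: random playout from the new node
        value = rollout(node.state, rng)
        # Backpropagation: update statistics along the path to the root
        while node is not None:
            node.visits += 1
            node.value_sum += value
            node = node.parent
    # Recommend the most-visited first action
    return max(root.children, key=lambda a: root.children[a].visits)

best_first_action = mcts()
```

In an LLM setting the "actions" would presumably be candidate continuations and the rollout score would come from some evaluator, but that mapping is speculation on top of the post, not something the sketch implements.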

So how do we do it? Well, the closest thing I know of that's presently available is Weave, part of a concise, readable, Apache-licensed MCTS RL fine-tuning package called minihf.

https://github.com/JD-P/minihf/blob/main/weave.py

I’ll update the post when I have more info, about Q-learning in particular and about how it differs from Weave.


[–] rarted_tarp@alien.top 1 points 11 months ago (4 children)

Has to be a mix of Q-learning and A* right?
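For reference on the A* half of that guess, a minimal sketch of A* on a small grid, assuming 4-connected moves with unit cost and a Manhattan-distance heuristic. The grid and the function name `a_star` are made up for illustration, and nothing here says how (or whether) A* would be combined with Q-learning.

```python
import heapq

def a_star(grid, start, goal):
    """Length of the shortest 4-connected path, or None if unreachable.

    grid[r][c] == 1 marks a wall. The heuristic is Manhattan distance,
    which never overestimates on a unit-cost grid.
    """
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    best = {start: 0}                    # cheapest known cost to each cell
    frontier = [(h(start), 0, start)]    # entries are (f = g + h, g, cell)
    while frontier:
        _, g, (r, c) = heapq.heappop(frontier)
        if (r, c) == goal:
            return g
        if g > best[(r, c)]:
            continue                     # stale queue entry, skip it
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
steps = a_star(grid, (0, 0), (2, 0))  # has to go around the wall
```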

[–] Local_Beach@alien.top 1 points 11 months ago

Maybe an A* search in vector space

[–] RaiseRuntimeError@alien.top 1 points 11 months ago

I was going to say it seems like it was just yesterday I was learning A* and now I find out that they are already up to Q*

[–] letsburn00@alien.top 1 points 11 months ago (2 children)

I know you're joking, but it's hilarious how many random things in science just got given letters.

A* is the algorithm your phone uses to help you drive home....and the supermassive black hole in the centre of the galaxy.

[–] TheOtherKaiba@alien.top 1 points 11 months ago (1 children)

It's also a star.

[–] Unfair-Emergency-658@alien.top 1 points 11 months ago

What is a star?

[–] KallistiTMP@alien.top 1 points 11 months ago

....and the supermassive black hole in the centre of the galaxy.

What did you think they were gonna use for that? Dijkstra's?

[–] DoubleDisk9425@alien.top 1 points 11 months ago

Can you please ELI-idiot?