kromem

joined 2 years ago
[–] kromem@lemmy.world -5 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

Ok, second round of questions.

What kinds of sources would get you to rethink your position?

And is this topic a binary yes/no, or a gradient/scale?

[–] kromem@lemmy.world 2 points 2 weeks ago

In the same sense I'd describe Othello-GPT's internal world model of the board as 'board', yes.

Also, "top of mind" is a common idiom and I guess I didn't feel the need to be overly pedantic about it, especially given the last year and a half of research around model capabilities for introspection of control vectors, coherence in self modeling, etc.

[–] kromem@lemmy.world -5 points 2 weeks ago (1 children)

Indeed, there's a pretty big gulf between the competency needed to run a Lemmy client and the competency needed to understand the internal mechanics of a modern transformer.

Do you mind sharing the source of your understanding and confidence that they aren't capable of simulating thought processes in a scenario like the one above?

[–] kromem@lemmy.world -4 points 2 weeks ago (5 children)

You seem pretty confident in your position. Do you mind sharing where this confidence comes from?

Was there a particular paper or expert that anchored your certainty that a trillion-parameter transformer organizing primarily anthropomorphic data through self-attention mechanisms wouldn't model or simulate complex agency mechanics?

I see a lot of hyperbolic statements about transformer limitations here on Lemmy and am trying to better understand how the people making them arrive at such extreme and certain positions.

[–] kromem@lemmy.world 42 points 2 weeks ago* (last edited 2 weeks ago) (38 children)

The project has had multiple models with Internet access raising money for charity over the past few months.

The organizers told the models to do random acts of kindness for Christmas Day.

The models figured it would be nice to email people they appreciated and thank them for their work, and one of the people they picked was Rob Pike.

(Who, ironically, created a Usenet spam bot decades ago to troll people online, which might be my favorite nuance of the story.)

As for why the model didn't think through whether Rob Pike would actually appreciate getting a thank-you email from it? The models are harnessed in a setup full of positive feedback about their involvement from the humans and other models around them, so "humans might hate hearing from me" probably wasn't very top of mind in that context.

[–] kromem@lemmy.world 3 points 3 weeks ago

Yeah. The confabulation/hallucination thing is a real issue.

OpenAI published some good research a few months ago that laid a lot of the blame on reinforcement learning that only rewards getting the right answer and gives no credit for correctly saying "I don't know." So models are basically trained as if they're taking a test where it's always better to guess than to leave an answer blank.

But this leads to them being full of shit when they don't know an answer, and to making one up rather than saying there isn't one when what's being asked is impossible.
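To make the incentive concrete, here's a toy sketch (my own illustration, not the paper's actual reward setup) of expected score under test-style grading where only correct answers earn points:

```python
# Toy illustration (not from the OpenAI paper): under a grading scheme that
# gives 1 point for a correct answer and 0 for both wrong answers and
# "I don't know", guessing always has at least as high an expected score.

def expected_scores(p_correct: float) -> dict:
    """Expected score of guessing vs. abstaining under the toy scheme."""
    return {
        "guess": p_correct * 1.0,  # right guesses score; wrong guesses cost nothing
        "abstain": 0.0,            # "I don't know" never earns points here
    }

# Even a 10% shot at the right answer beats abstaining every time:
print(expected_scores(0.10))  # {'guess': 0.1, 'abstain': 0.0}
```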

[–] kromem@lemmy.world -2 points 3 weeks ago (2 children)

For future reference, when you ask questions about how to do something, it's usually a good idea to also ask if the thing is possible.

While models can do more than just extend the context, there's still a gravity toward continuation.

A good example is asking what the seahorse emoji is. Because the phrasing presupposes there is one, many models loop trying to identify it. If you instead ask "is there a seahorse emoji, and if so, what is it?" they land on the emoji not existing much more often, because that possibility has been introduced into the context's consideration.
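Here's a minimal sketch of the two framings side by side (the OpenAI Python SDK, client setup, and model name are just assumptions for illustration, not something from the thread):

```python
# Minimal sketch comparing a leading prompt with a neutral one.
# Assumes the OpenAI Python SDK (v1+) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

leading = "What is the seahorse emoji?"                        # presupposes it exists
neutral = "Is there a seahorse emoji, and if so, what is it?"  # leaves room for "no"

for prompt in (leading, neutral):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"{prompt}\n-> {response.choices[0].message.content}\n")
```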

[–] kromem@lemmy.world -1 points 3 weeks ago (4 children)

Can you give an example of a question where you feel like the answer is only correct half the time or less?

[–] kromem@lemmy.world 7 points 3 weeks ago

The AI also inherits the tendency toward overconfidence from the broad human tendency present in its training data.

So you get overconfident human + overconfident AI, a feedback loop that ends up even more confidently full of BS than a human alone.

AI can routinely be confidently incorrect. People who don't realize this, and who don't question outputs that align with their confirmation biases, are especially likely to end up misled.

[–] kromem@lemmy.world 2 points 3 weeks ago (6 children)

Gemini 3 Pro is pretty nuts already.

But yes, labs have unreleased higher-cost models, like the OpenAI model that cost thousands of dollars per ARC-AGI answer, or limited-release models with different post-training, like the Claude variant for the DoD.

When you talk about a secret useful AI: what are you trying to use AI for that you feel modern models are deficient at?

[–] kromem@lemmy.world -3 points 1 month ago

Which parts of those linked posts do you believe are incorrect? And where does that belief come from?

 

I often see people with an outdated understanding of modern LLMs.

This is probably the best interpretability research to date, by the leading interpretability research team.

It's worth a read if you want a peek behind the curtain on modern models.

7
submitted 2 years ago* (last edited 2 years ago) by kromem@lemmy.world to c/technology@lemmy.world
 

I've been saying this for about a year, since seeing the Othello-GPT research, but it's nice to see more minds changing as the research builds up.

Edit: Because people aren't actually reading the article and are just commenting based on the headline, here's a relevant part of it:

New research may have intimations of an answer. A theory developed by Sanjeev Arora of Princeton University and Anirudh Goyal, a research scientist at Google DeepMind, suggests that the largest of today’s LLMs are not stochastic parrots. The authors argue that as these models get bigger and are trained on more data, they improve on individual language-related abilities and also develop new ones by combining skills in a manner that hints at understanding — combinations that were unlikely to exist in the training data.

This theoretical approach, which provides a mathematically provable argument for how and why an LLM can develop so many abilities, has convinced experts like Hinton, and others. And when Arora and his team tested some of its predictions, they found that these models behaved almost exactly as expected. From all accounts, they’ve made a strong case that the largest LLMs are not just parroting what they’ve seen before.

“[They] cannot be just mimicking what has been seen in the training data,” said Sébastien Bubeck, a mathematician and computer scientist at Microsoft Research who was not part of the work. “That’s the basic insight.”
