currentscurrents

joined 1 year ago
[–] currentscurrents@alien.top 1 points 11 months ago (1 children)

Model-based RL is looking a little more stable in the last year. DreamerV3 and TD-MPC2 claim to be able to train on hundreds of tasks with no per-task hyperparameter tuning, and report smooth loss curves that scale predictably.

Have to wait and see if it pans out though.

[–] currentscurrents@alien.top 1 points 11 months ago

They are all known to be stable, because they have a ground-truth simulator to test with. Stable doesn't necessarily mean useful, but that wasn't the point.

The benefit here is that training a neural network on simulator data allows you to generate instead of search. The simulator is very computationally expensive (even compared to a deep neural network) and the search space is large and high-dimensional.
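A minimal sketch of the amortization idea, with everything made up for illustration: the "simulator" is a cheap stand-in function, and a polynomial least-squares fit stands in for the neural network, just to keep the sketch dependency-free. You pay for a limited number of expensive simulator calls once, then search the cheap learned model densely.

```python
import numpy as np

rng = np.random.default_rng(0)

def expensive_simulator(x):
    # Stand-in for a costly high-fidelity simulator (hypothetical).
    return x**4 - x**2 + 0.2 * x

# 1. Run the expensive simulator a limited number of times to build a dataset.
X = rng.uniform(-2.0, 2.0, size=200)
y = expensive_simulator(X)

# 2. Fit a cheap learned surrogate on that data.
surrogate = np.poly1d(np.polyfit(X, y, deg=7))

# 3. Search the surrogate densely instead of calling the simulator
#    thousands of times, then verify the winner with one real call.
candidates = np.linspace(-2.0, 2.0, 10_000)
best = candidates[np.argmin(surrogate(candidates))]
print(best, expensive_simulator(best))
```

The same pattern scales to the high-dimensional case: the surrogate replaces the inner loop of the search, and the simulator is only consulted to generate training data and to verify final candidates.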

[–] currentscurrents@alien.top 1 points 11 months ago

I have done something similar using gradient descent and this differentiable vector graphics library. It converges much faster than genetic algorithms.

A good initialization, like the other commenter's Voronoi idea, would speed it up considerably.
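A dependency-free illustration of the same idea, with a single soft Gaussian blob standing in for a vector primitive (the renderer and every parameter here are invented for the sketch, not that library's API): gradient descent on the primitive's parameters walks straight to the target, where a genetic algorithm would have to mutate its way there.

```python
import numpy as np

# Toy differentiable "renderer": one soft Gaussian blob whose center we fit
# to a target image by gradient descent. Illustrative stand-in only.
H = W = 32
ys, xs = np.mgrid[0:H, 0:W].astype(float)
s = 6.0  # blob radius

def render(cx, cy):
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / s**2)

target = render(20.0, 12.0)   # the image we want to reconstruct
cx, cy = 12.0, 20.0           # initialization (a better one converges faster)

for _ in range(300):
    img = render(cx, cy)
    err = img - target
    # Analytic gradient of the mean-squared error w.r.t. the blob center.
    gx = (4 / s**2) * np.mean(err * img * (xs - cx))
    gy = (4 / s**2) * np.mean(err * img * (ys - cy))
    norm = np.hypot(gx, gy)
    if norm < 1e-12:
        break
    # Normalized gradient step: move a fixed 0.15 px per iteration.
    cx -= 0.15 * gx / norm
    cy -= 0.15 * gy / norm
```

The blob center lands within a fraction of a pixel of the target in a few hundred steps; a real setup would optimize many primitives' positions, sizes, and colors the same way.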

[–] currentscurrents@alien.top 1 points 11 months ago

Aren't these "natural attacks" (rotating or blurring the maze, etc.) just distribution shifts?

They are in-distribution for us, since we've seen objects under all sorts of optical conditions. But they're out-of-distribution for the RL model, which has only ever seen this exact maze format.

[–] currentscurrents@alien.top 1 points 11 months ago

Word is that AMD support is getting better, but almost everyone is still using NVIDIA.

[–] currentscurrents@alien.top 1 points 11 months ago (1 children)

It really doesn't, because no one has any clue what Q* is or if it's even real.

[–] currentscurrents@alien.top 1 points 11 months ago (1 children)

TBH this sub would be a lot better if they banned OpenAI news/rumors.

I don't even mind the "I'm a noob, why won't my model train" posts - at least those people have genuine interest in ML and are trying to learn. OpenAI news attracts people who have a more science-fiction idea of AI and are just interested in the hype.

[–] currentscurrents@alien.top 1 points 11 months ago

> no ML technique has been shown to do anything more than just mimic statistical aspects of the training set.

Reinforcement learning does far more than mimic.
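As a toy illustration (environment and hyperparameters made up for the sketch): tabular Q-learning on a five-state corridor never sees a single demonstration to imitate. It only gets a reward signal, and discovers the optimal policy through its own trial and error, so there is no training set of behavior to mimic.

```python
import random

# Minimal tabular Q-learning on a 5-state corridor. The agent receives
# reward only at the rightmost state; it is never shown a correct trajectory.
N_STATES, LEFT, RIGHT = 5, 0, 1
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.2
rng = random.Random(0)

def step(s, a):
    s2 = max(0, s - 1) if a == LEFT else min(N_STATES - 1, s + 1)
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

for _ in range(500):
    s = 0
    for _ in range(50):  # cap episode length
        # Epsilon-greedy exploration, then a standard Q-learning update.
        a = rng.randrange(2) if rng.random() < eps else max((LEFT, RIGHT), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) * (not done) - Q[s][a])
        s = s2
        if done:
            break

policy = ["right" if Q[s][RIGHT] > Q[s][LEFT] else "left" for s in range(N_STATES - 1)]
print(policy)
```

The greedy policy it ends up with ("right" in every state) appears nowhere in any dataset; it comes out of exploration and the reward signal.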

[–] currentscurrents@alien.top 1 points 11 months ago

IMO interpretability and debugging are inherently related. The more you know about how the network works, the easier it will be to debug it.

[–] currentscurrents@alien.top 1 points 11 months ago (3 children)

> you won't get published without doing proper evaluation

Idk man, I've seen some pretty sketchy papers this year.

[–] currentscurrents@alien.top 1 points 11 months ago (1 children)

Then you lose the 2D grid structure of the image, which is why you want to use a CNN in the first place.

I think it's possible to apply many of these optimizations to 2D convs as well though. This group is just more interested in language modeling than images.

[–] currentscurrents@alien.top 1 points 11 months ago (3 children)

Just built this to try on my CNN and then realized it was only for 1D convolutions. Whoops.
