"I'm millions of times faster than god was" is such a fantastic take. I'm going to remember that one.
Smart-Emu5581
Yes. The overhead depends on how often you make it store the data and in how much detail. Both of these are configurable. I find the overhead to be negligible in practice.
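To make the frequency/detail trade-off concrete, here is a rough sketch in plain Python. It is not the tool's actual API: `ActivationRecorder`, `every_n_steps`, `detail` and `maybe_record` are all hypothetical names made up for illustration, and the activations are stand-in numbers.

```python
import statistics


class ActivationRecorder:
    """Hypothetical recorder for illustration; not the API of any real tool."""

    def __init__(self, every_n_steps=100, detail="summary"):
        if detail not in ("summary", "full"):
            raise ValueError("detail must be 'summary' or 'full'")
        self.every_n_steps = every_n_steps  # how often to store data
        self.detail = detail                # how much to store each time
        self.records = []

    def maybe_record(self, step, layer_name, values):
        # Skip most steps, so the storage cost is only paid occasionally.
        if step % self.every_n_steps != 0:
            return False
        if self.detail == "full":
            payload = list(values)  # keep every value: more memory
        else:
            # Keep only a couple of summary statistics: much cheaper.
            payload = {"mean": statistics.fmean(values), "max": max(values)}
        self.records.append((step, layer_name, payload))
        return True


recorder = ActivationRecorder(every_n_steps=100, detail="summary")
for step in range(1000):
    fake_activations = [0.1 * step, 0.2 * step, 0.3 * step]  # stand-in data
    recorder.maybe_record(step, "layer1", fake_activations)
```

With these settings only every 100th step gets stored, and only as a small summary, which is the kind of configuration where the overhead stays small.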
Mechanistic Interpretability
It's primarily intended for debugging, but it can also help with mechanistic interpretability. Being able to see the internals of your network for any input and at different stages of training can help a lot with understanding what's going on.
Most problems in machine learning have learning of the form "you did something wrong, here is what you should have done instead". RL has learning of the form "you did something wrong, but I'm not telling you what you should have done instead". Imagine trying to learn anything difficult like that.
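A toy sketch of the difference, in plain Python. Everything here is an assumption made for illustration: the function names are hypothetical, the "task" is just guessing a hidden number, and the reward-only learner uses simple random-search hill climbing as a stand-in rather than a real RL algorithm.

```python
import random


def supervised_feedback(guess, target):
    # "Here is what you should have done": the correct answer is revealed.
    return {"error": target - guess, "correct_answer": target}


def rl_feedback(guess, target):
    # "You did something wrong": only a score comes back, the target stays hidden.
    return {"reward": -abs(target - guess)}


def learn_supervised(target, steps=20, lr=0.5):
    guess = 0.0
    for _ in range(steps):
        # Move directly toward the revealed answer.
        guess += lr * supervised_feedback(guess, target)["error"]
    return guess


def learn_by_trial_and_error(target, steps=20, noise=1.0, seed=0):
    rng = random.Random(seed)
    guess = 0.0
    best_reward = rl_feedback(guess, target)["reward"]
    for _ in range(steps):
        # Try a random nearby guess and keep it only if the score improves.
        candidate = guess + rng.uniform(-noise, noise)
        reward = rl_feedback(candidate, target)["reward"]
        if reward > best_reward:
            guess, best_reward = candidate, reward
    return guess
```

The first learner knows which direction to move after every mistake. The second has to discover that direction by trying things and comparing scores.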