VC Dimension
Kolmogorov Complexity
Causality
It's somewhere in the neurons. Would cost the company a lot to get to it though. Best not to worry about it /s
I feel like I'm seeing a lot on causality these days, for example from Schölkopf's lab.
Learning dynamics and geometry. This definitely gets some attention, but almost always in the context of scaling. It's a pretty interesting topic in its own right.
Hierarchical understanding of classes
Can you elaborate a bit about it? I’m interested
Optimizers. OMG, no one has touched optimizers for decades.
We basically figure it's Adam/SGD and there wasn't really any improvement on it.
I tried finding an improvement to it myself for a few months but failed miserably.
There has been LOADS of research on deep learning optimisation in recent years. However, TL;DR: nothing beats Adam.
Learned optimizers look promising - training a neural network to train neural networks.
Unfortunately they're hard to train and nobody has gotten them to really work yet. The two main approaches are meta-training or reinforcement learning, but meta-training is very expensive and RL has all the usual pitfalls of RL.
Because it's super hard to build something that works better than Adam across many tasks. There's probably no shortage of people trying to come up with something better.
Nothing beats AdamW + compute. Plus, with the current data-centric approach, everything kinda converges at scale.
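Since this whole subthread is about Adam, here's a minimal NumPy sketch of the update rule (the hyperparameters are the usual defaults from the paper; the toy quadratic objective is just my illustration):

```python
import numpy as np

def adam_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moment estimates with bias correction."""
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad        # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)              # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, (m, v, t)

# toy problem: minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3)
theta = np.array([0.0])
state = (np.zeros(1), np.zeros(1), 0)
for _ in range(5000):
    grad = 2 * (theta - 3.0)
    theta, state = adam_step(theta, grad, state, lr=0.05)
print(theta)  # converges near 3
```

The per-coordinate rescaling by `sqrt(v_hat)` is the part that makes Adam so hard to beat in practice: it makes the step size roughly invariant to the gradient's scale.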
Time series. It may not be as flashy as NLP or CV, but it is one of the most widely used AI concepts in industry.
I hate to break it to you, βυτ ΓΓΜσ Ασε τιΜεσεσιεσ Μοδεισ ("but LLMs are timeseries models").
My phone keyboard switched to Greek halfway through so I just rolled with it.
i hope for your sake this is a deliberate and self-aware troll
Time series are certainly very interesting and have lots of cool applications that people don't think of in the traditional "AI" space. That being said, I'd love to hear about these industries using time series so I can finally get a job using them :p
If you are in Europe, energy price/load forecasting can be hot. Super interesting field as well. Because of the completely different regulation in the US, I have no idea if there is similar interest there.
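A classical baseline for this kind of load forecasting is a plain autoregressive model. A minimal sketch, assuming a toy sinusoidal "daily load" series rather than real market data:

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares fit of an AR(p) model: x_t ~ sum_i a_i * x_{t-i}."""
    X = np.column_stack([series[p - i - 1 : len(series) - i - 1] for i in range(p)])
    y = series[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def forecast(series, coeffs, steps):
    """Roll the fitted AR model forward `steps` steps, feeding back predictions."""
    hist = list(series)
    for _ in range(steps):
        hist.append(sum(c * hist[-i - 1] for i, c in enumerate(coeffs)))
    return hist[len(series):]

# toy hourly "load" with a 24-hour cycle; a pure sinusoid is exactly AR(2)
t = np.arange(200)
load = np.sin(2 * np.pi * t / 24)
coeffs = fit_ar(load, p=2)
pred = forecast(load, coeffs, steps=24)   # next day's forecast
```

Real load series have trends, weekly seasonality, and exogenous drivers (weather, prices), so production models are richer, but an AR baseline like this is usually the first thing to beat.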
Modeling the role of the limbic system. Seems like it's pretty essential to be able to track and estimate rewards, and I wouldn't be surprised if figuring it out was critical to memory and improving attention.
C* algebra and its influence on the topology of a neural network
Bahaha, if operator algebras actually made their way into deep learning, that would be awesome! I say that as an ex-operator algebraist :P
Some areas that I hope to make time to explore someday, and that are relatively obscure (to my knowledge): text-to-knowledge-graph extraction, cross-domain input generalization, and schematic synthesis from 3D models/point clouds.
Why does gradient descent have good inductive biases? Do the inductive biases of non-gradient-based optimizers differ?
Bayesian optimization for hyperparameter search.
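A minimal sketch of the idea, with a hand-rolled Gaussian-process surrogate and a lower-confidence-bound acquisition rule (the kernel length scale, the 1-D toy objective, and the candidate grid are all my assumptions, not any library's defaults):

```python
import numpy as np

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two sets of 1-D points."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-4):
    """GP posterior mean and variance at query points (zero prior mean)."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_query, x_obs)
    Kinv = np.linalg.inv(K)
    mu = Ks @ Kinv @ y_obs
    var = 1.0 - np.sum(Ks @ Kinv * Ks, axis=1)
    return mu, np.maximum(var, 1e-12)

def objective(x):                      # stand-in for a slow training run
    return np.sin(3 * x) + x ** 2 - 0.7 * x

rng = np.random.default_rng(0)
x_obs = rng.uniform(-1, 2, size=3)     # a few random initial evaluations
y_obs = objective(x_obs)
grid = np.linspace(-1, 2, 200)         # candidate hyperparameter values

for _ in range(15):                    # BO loop: fit surrogate, pick next point
    mu, var = gp_posterior(x_obs, y_obs, grid)
    lcb = mu - 2.0 * np.sqrt(var)      # lower confidence bound acquisition
    x_next = grid[np.argmin(lcb)]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))

print(x_obs[np.argmin(y_obs)])         # best "hyperparameter" found
```

The point of the surrogate is sample efficiency: each `objective` call is assumed expensive, so the GP spends cheap compute deciding where the next expensive evaluation is most informative.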
CV applications for home, life, and IoT.
Manifold learning! It seems so cool, but every time I dig into it, I feel like I need a PhD in math to understand the theory.
Yes! I think/hope ultimately this will unlock AGI.
State space models and their derivatives.
They have demonstrated better performance than Transformers on very long sequences, with linear instead of quadratic computational cost, and on paper they also generalize better to non-NLP tasks.
However, training them is more difficult, so they perform worse in practice outside of these few very long sequence tasks. But with a bit more development, they could become the most impactful AI technology in years.
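For the curious, the core object is just a linear recurrence, and the trick that makes these models trainable in parallel is that the same computation can be written as a convolution. A toy NumPy sketch with random matrices (real models like S4 use a structured initialization, which this skips):

```python
import numpy as np

rng = np.random.default_rng(0)
n, L = 4, 32                       # state size, sequence length
A = 0.9 * np.eye(n) + 0.05 * rng.standard_normal((n, n))  # stable-ish dynamics
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))
u = rng.standard_normal(L)         # input sequence

# 1) recurrent (RNN-style) view: x_k = A x_{k-1} + B u_k,  y_k = C x_k
x = np.zeros((n, 1))
y_rec = []
for k in range(L):
    x = A @ x + B * u[k]
    y_rec.append((C @ x).item())
y_rec = np.array(y_rec)

# 2) convolutional view: y = K * u with kernel K_j = C A^j B,
#    which can be computed for all timesteps at once (no sequential loop)
K = np.array([(C @ np.linalg.matrix_power(A, j) @ B).item() for j in range(L)])
y_conv = np.convolve(u, K)[:L]

print(np.allclose(y_rec, y_conv))  # True: both views compute the same output
```

Inference uses the cheap recurrent view; training uses the parallel convolutional view. Getting that kernel computed stably and efficiently is where the real engineering in these papers lives.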
Do you have anything in particular you think is worth sharing? I'm trying to implement model predictive control in MATLAB and I'm working on an LSTM surrogate model. Just last week I found MATLAB tools for neural state space models, and I've been wondering if I just uncovered a big blind spot of mine.
It sounds like you've come across exactly what I meant.
I have a couple of papers on the topic if you’re interested in those. There’s also a PyTorch implementation of a neural state space model by the authors of the original paper: https://github.com/HazyResearch/state-spaces
Gonna toot my own research direction: artificial intelligence x complex systems. I’m talking differentiable self-organization (e.g., neural cellular automata), interacting particle systems (e.g., particle Lenia), and other neural dynamical systems where emergent behaviour and self-organization are key characteristics.
Other than Alex Mordvintsev and his co-authors, Sebastian Risi and his co-authors, and I suppose David Ha with his new company, I don’t see much work in this intersection of fields.
I think there's a lot to unlock here, particularly if the task at hand benefits greatly from a decentralized and/or a compute-adaptive approach, with robustness requirements. Swarm Learning already comes to mind. Or generative modelling with/of complex systems, like decentralized flow (or Schrödinger) matching for modelling interacting particle systems (e.g., fluids, gases, pedestrian traffic).
Could you please point to a few recent research papers in this area?
Morality training environments, design game theory environments so that multiple RL agents end up with a strong bias towards cooperation.
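A minimal sketch of what such an environment could look like, using the textbook iterated prisoner's dilemma (the payoff values and both hand-coded agents are standard examples, not learned RL policies):

```python
# payoffs (row player, column player): standard T > R > P > S values
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(my_hist, their_hist):
    """Cooperate first, then mirror the opponent's last move."""
    return 'C' if not their_hist else their_hist[-1]

def always_defect(my_hist, their_hist):
    return 'D'

def play(agent_a, agent_b, rounds=10):
    """Run the iterated game and return total payoffs for both agents."""
    ha, hb, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = agent_a(ha, hb), agent_b(hb, ha)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        ha.append(a)
        hb.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (30, 30): mutual cooperation wins long-run
print(play(tit_for_tat, always_defect))  # (9, 14): exploited once, then mutual defection
```

The design question for morality environments is exactly what the payoff table and horizon should look like so that RL agents discover strategies like tit-for-tat on their own, rather than collapsing to always-defect.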
AI explainability and why it’s garbage
I'm particularly excited about AI accelerating theorem provers and optimization problems (think: traveling salesman). These problems are NP-hard and scale very poorly. We would see huge efficiency gains in most industries if they scaled better. Recently there has been some very exciting research in using neural networks to accelerate and scale MILP and LP solvers.
For reference, optimization problems include: routing (e.g., traveling salesman), scheduling, bin packing, and resource allocation.
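Since the traveling salesman problem came up, here's a toy instance solved exactly by brute force; the O(n!) enumeration is precisely the wall that NP-hardness puts up and that learned solvers try to route around (the city coordinates are made up):

```python
import itertools
import math

# toy traveling-salesman instance: 7 cities on a plane
cities = [(0, 0), (1, 5), (2, 2), (5, 1), (6, 4), (3, 6), (4, 0)]

def tour_length(order):
    """Total length of the closed tour visiting cities in `order`."""
    return sum(math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
               for i in range(len(order)))

# exact answer by brute force over all 6! = 720 tours.
# Fixing city 0 as the start removes rotational duplicates.
best = min(itertools.permutations(range(1, len(cities))),
           key=lambda rest: tour_length((0,) + rest))
print(tour_length((0,) + best))
```

At 7 cities this is instant; at 20 cities the same loop would need ~10^17 evaluations, which is why even modest learned speedups to exact solvers (MILP branching heuristics, etc.) translate into huge practical gains.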
Form parsing. Hugely important topic and the only approach I know about is Microsoft’s LayoutLM model.
Probabilistic programming is also an interesting topic (used for Bayesian Statistics/Probabilistic ML/Probabilistic Graphical Models/Cognitive AI/etc.)
Curriculum Learning.
It just feels intuitive to me, so I wander back to thinking about it every now and then. I think we will increasingly need to get the most bang for our buck w.r.t. data, as naively doing a 10x on the ingest isn't going to scale anymore.
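A minimal sketch of what a curriculum could look like in code, assuming sentence length as a stand-in difficulty score (real curricula typically use model-based or loss-based difficulty):

```python
import math

def difficulty(example):
    """Proxy difficulty score; here just word count (an assumption for the sketch)."""
    return len(example.split())

def curriculum_batches(examples, n_stages=3, batch_size=2):
    """Order examples easy -> hard and widen the training pool stage by stage."""
    ordered = sorted(examples, key=difficulty)
    stage_size = math.ceil(len(ordered) / n_stages)
    batches = []
    for stage in range(1, n_stages + 1):
        pool = ordered[:stage * stage_size]   # the pool grows as "competence" grows
        for i in range(0, len(pool), batch_size):
            batches.append(pool[i:i + batch_size])
    return batches

data = ["a b", "a", "a b c d e", "a b c", "a b c d", "a b c d e f"]
for batch in curriculum_batches(data):
    print(batch)
```

Early stages revisit only the easy examples; hard examples enter the mix last. The interesting research questions are how to score difficulty and how fast to widen the pool.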
What's arg-tech?
Argumentation technologies. A whole sub-branch extending FOL and modal logics. See Java's Tweety or the Argument Interchange Format. Again, my early tests suggest LLMs are very good at building belief sets, running reasoners, and interpreting their results in layman's terms.
Multi-modal systems. Very new and trending in CV and NLP. Personally, I want to see reinforcement learning added to this; I believe that would be the first step towards simple AGI: self-supervised learning from experience with a zero-shot aspect. If I get offered a Ph.D. fellowship I might look into that, but I'm trying to learn multi-modal systems right now.
Model calibration and, more generally, uncertainty estimation.
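One standard tool here is the expected calibration error. A minimal NumPy sketch (the binning scheme and the toy predictions are my own illustration):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin predictions by confidence and average the |accuracy - confidence|
    gap per bin, weighted by how many predictions fall in each bin."""
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

# toy model: very confident but only sometimes right -> badly calibrated
conf = [0.95, 0.95, 0.95, 0.95]
hit = [1, 1, 0, 0]            # 50% accuracy at 95% confidence
ece = expected_calibration_error(conf, hit)
print(ece)                    # ~0.45: the gap between claimed and actual accuracy
```

A perfectly calibrated model would score 0: among its 95%-confidence predictions, exactly 95% would be correct. Modern deep nets tend to be systematically overconfident, which is why this subfield exists.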
Medicine. I think we can stop millions from suffering with the help of AI, but this topic seems to be unimportant to AI people.
How to come up with optimal solutions to highly Kolmogorov-complex problems. This is the nearest thing to what I would call "human intelligence", and what I would want for a massive AI breakthrough.
https://arxiv.org/abs/1911.01547 - Chollet's Abstraction and Reasoning Corpus is a good example of it - it is tremendously hard for AI to solve to this day.
https://www.researchgate.net/publication/2472570_A_Formal_Definition_of_Intelligence_Based_on_an_Intensional_Variant_of_Algorithmic_Complexity - here's a theoretical work elaborating on the importance of this concept.
It is important for the current LLM revolution as well, as it is one of the ways to differentiate actual cognitive skills from a glorified look-up table.
Boltzmann Machines / Deep Boltzmann Machines. I only came across the term recently, so I don't know much about them, but I thought it was fascinating that there is a branch of modelling that seems to run parallel to mainstream NNs, yet I'd never heard of it.
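For reference, a (restricted) Boltzmann machine is defined by an energy function over binary units, and inference runs on Gibbs sampling rather than feedforward passes. A tiny NumPy sketch with made-up weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(v, h, W, b, c):
    """RBM energy: E(v, h) = -v^T W h - b^T v - c^T h. Low energy = likely state."""
    return -(v @ W @ h) - (b @ v) - (c @ h)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, b, c):
    """One block-Gibbs sweep: sample hidden given visible, then visible given hidden.
    This works because the bipartite structure makes the units conditionally independent."""
    p_h = sigmoid(c + v @ W)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(b + W @ h)
    v_new = (rng.random(p_v.shape) < p_v).astype(float)
    return v_new, h

# tiny example: 3 visible units, 2 hidden units, random weights
W = rng.standard_normal((3, 2))
b = np.zeros(3)
c = np.zeros(2)
v = np.array([1.0, 0.0, 1.0])
v, h = gibbs_step(v, W, b, c)
print(energy(v, h, W, b, c))
```

Training (contrastive divergence) nudges `W` so that the data sits in low-energy states, which is a genuinely different paradigm from backprop through a feedforward stack.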
Reinforcement learning.
Inductive learning of logic programs. Assuming the data is encoded properly, you can learn a logic program that is reliable and interpretable.
Neural nets at the edge being Gaussians.
Instilling Inductive biases
Hypernetworks (MAML and networks producing networks)
Neuron weight and layer interpretability (a new paper I heard about used an autoencoder or something to figure out which neurons were responsible for certain changes in behavior; seemed interesting, and could be useful for understanding how to instill information directly into a network)
Rule-based methods
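On the hypernetworks item above: a minimal sketch of one network producing another network's weights, with a made-up task embedding and a one-layer linear target net (all sizes and names are my own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# target network: a 1-layer map R^2 -> R, so it needs 2 weights + 1 bias = 3 params
TARGET_PARAMS = 3

# hypernetwork: a linear map from a task embedding z to the target net's parameters
H = rng.standard_normal((TARGET_PARAMS, 4)) * 0.5

def hypernet(z):
    """Produce the target network's weights from a task embedding z of shape (4,)."""
    params = H @ z
    w, bias = params[:2], params[2]
    return w, bias

def target_forward(x, w, bias):
    """Run the generated target network on a batch of inputs."""
    return x @ w + bias

z_task = rng.standard_normal(4)       # one embedding per task
w, bias = hypernet(z_task)
x = np.array([[1.0, 2.0], [0.5, -1.0]])
print(target_forward(x, w, bias))     # outputs of the generated network
```

The appeal is that only `H` (and the embeddings) are trained: swapping the task embedding swaps the whole target network, which is what connects this line of work to meta-learning approaches like MAML.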