this post was submitted on 09 Nov 2023

Machine Learning

I'm a data engineer who somehow ended up as a software developer. Many of my friends are now working with the OpenAI API to add generative capabilities to their products, but they lack A LOT of context when it comes to how LLMs actually work.

That's why I started writing popular-science-style articles that unpack AI concepts for software developers working on real-world applications. It started off slow — honestly, I wrote a bit too "brainy" for them — but now I've found a voice that resonates with this audience much better, and I want to ramp up my writing cadence.

I would love to hear your thoughts on what concepts I should write about next.
What gets you excited that you also find hard to explain to someone with a different background?

[–] ohell@alien.top 1 points 10 months ago

VC Dimension

Kolmogorov Complexity

[–] Majestij@alien.top 1 points 10 months ago (2 children)
[–] samrus@alien.top 1 points 10 months ago

it's somewhere in the neurons. would cost the company a lot to get to it though. best not to worry about it /s

[–] RandomTensor@alien.top 1 points 10 months ago

I feel like I'm seeing a lot on causality these days, for example from Schölkopf's lab.

[–] DigThatData@alien.top 1 points 10 months ago

learning dynamics and geometry. this definitely gets some attention, but almost always in the context of scaling. it's a pretty interesting topic in its own right.

[–] Ok_Attitude_2376@alien.top 1 points 10 months ago (1 children)

Hierarchical understanding of classes

[–] diegoquezadac21@alien.top 1 points 10 months ago

Can you elaborate a bit about it? I’m interested

[–] rejectedlesbian@alien.top 1 points 10 months ago (4 children)

Optimizers. OMG, no one has touched optimizers in decades.
We basically figured it's Adam/SGD and there hasn't really been any improvement on it.

I tried finding an improvement myself for a few months but failed miserably

[–] charlesGodman@alien.top 1 points 10 months ago

There has been LOADS of research on deep learning optimisation in recent years. However, TL;DR: nothing beats Adam.
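For readers who haven't seen it written out: the Adam update rule the thread keeps referring to fits in a few lines of NumPy. This is a minimal sketch — the hyperparameter names follow the original paper, but the toy objective, learning rate, and step count are illustrative choices of mine:

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. w: parameters, g: gradient, m/v: moment buffers, t: step (1-based)."""
    m = b1 * m + (1 - b1) * g            # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g ** 2       # second-moment (uncentered variance) estimate
    m_hat = m / (1 - b1 ** t)            # bias correction for zero-initialized buffers
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy demo: minimize f(w) = w^2 starting from w = 5.
w, m, v = np.array(5.0), 0.0, 0.0
for t in range(1, 2001):
    g = 2 * w                            # gradient of w^2
    w, m, v = adam_step(w, g, m, v, t, lr=0.05)
```

The per-coordinate rescaling by `sqrt(v_hat)` is the part that makes Adam so robust to learning-rate choice compared to plain SGD.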

[–] currentscurrents@alien.top 1 points 10 months ago

Learned optimizers look promising - training a neural network to train neural networks.

Unfortunately they're hard to train and nobody has gotten them to really work yet. The two main approaches are meta-training or reinforcement learning, but meta-training is very expensive and RL has all the usual pitfalls of RL.
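To make the meta-training idea concrete, here is a deliberately tiny toy sketch. Real learned optimizers parameterize the update rule with a neural network and backpropagate through the unrolled inner loop; this stand-in uses a two-parameter update rule, a quadratic inner task, and finite-difference meta-gradients, all of which are illustrative simplifications of mine:

```python
import numpy as np

def inner_loss(w):                        # toy inner task: minimize a quadratic
    return float(np.sum(w ** 2))

def unroll(theta, w0, steps=20):
    """Run the 'learned optimizer' (update = -theta[0]*g - theta[1]*momentum)
    for a few steps; the final task loss is the meta-objective."""
    w, mom = w0.copy(), np.zeros_like(w0)
    for _ in range(steps):
        g = 2 * w
        mom = 0.9 * mom + g
        w = w - theta[0] * g - theta[1] * mom
    return inner_loss(w)

rng = np.random.default_rng(0)
theta = np.array([0.01, 0.0])             # the optimizer's learnable parameters
w0 = rng.normal(size=5)

# Meta-train theta by finite-difference gradient descent on the meta-objective.
for _ in range(200):
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        e = np.zeros_like(theta); e[i] = 1e-4
        grad[i] = (unroll(theta + e, w0) - unroll(theta - e, w0)) / 2e-4
    theta -= 1e-4 * grad

final = unroll(theta, w0)                 # meta-trained optimizer beats the init
```

Even in this toy, the pitfalls the comment mentions show up: the meta-landscape is badly conditioned, and too large a meta-learning-rate makes the unrolled inner loop diverge.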

[–] satireplusplus@alien.top 1 points 10 months ago

Because it's super hard to build something that works better than Adam across many tasks. There's probably no shortage of people trying to come up with something better.

[–] koolaidman123@alien.top 1 points 10 months ago

Nothing beats AdamW + compute. Plus, with the current data-centric approach, everything kind of converges at scale

[–] Snoo_72181@alien.top 1 points 10 months ago (3 children)

Time series. It may not be as flashy as NLP or CV, but it is one of the most widely used AI concepts in industry

[–] osdd_alt_123@alien.top 1 points 10 months ago (1 children)

I hate to break it to you, βυτ ΓΓΜσ Ασε τιΜεσεσιεσ Μοδεισ ("but LLMs are timeseries models").

My phone keyboard switched to Greek halfway through so I just rolled with it.

[–] samrus@alien.top 1 points 10 months ago

i hope for your sake this is a deliberate and self-aware troll

[–] A_HumblePotato@alien.top 1 points 10 months ago (1 children)

Time series are certainly very interesting and have lots of cool applications that people don’t think of in the traditional “AI” space. With that being said, I’d love to hear about these industries using time series so I can finally get a job using them :p

[–] MCC0nfusing@alien.top 1 points 10 months ago

If you are in Europe, energy price/load forecasting can be hot, and it's a super interesting field as well. Because regulation in the US is completely different, I have no idea if there is similar interest there.

[–] CreationBlues@alien.top 1 points 10 months ago

Modeling the role of the limbic system. Seems like it's pretty essential to be able to track and estimate rewards, and I wouldn't be surprised if figuring it out was critical to memory and improving attention.

[–] tripple13@alien.top 1 points 10 months ago (1 children)

C* algebra and its influence on the topology of a neural network

[–] AbjectDrink3276@alien.top 1 points 10 months ago

bahaha, if operator algebras actually made their way into deep learning, that would be awesome! I say that as an ex operator algebraist :P

[–] michelin_chalupa@alien.top 1 points 10 months ago

Some areas that I hope to make time to explore someday, and that are relatively obscure (to my knowledge), are text-to-knowledge-graph, cross-domain input generalization, and schematic synthesis from 3D models/point clouds.

[–] bestgreatestsuper@alien.top 1 points 10 months ago

Why does gradient descent have good inductive biases? Do the inductive biases of non-gradient-based optimizers differ?
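One of the cleanest known instances of such a bias: on an underdetermined least-squares problem, plain gradient descent initialized at zero converges to the minimum-norm interpolating solution (the pseudoinverse solution), because its updates never leave the row space of the data. A NumPy sketch — problem sizes and learning rate are arbitrary choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))          # 5 equations, 20 unknowns: infinitely many solutions
y = rng.normal(size=5)

# Plain gradient descent on squared error, starting from zero.
w = np.zeros(20)
for _ in range(20000):
    w -= 0.01 * X.T @ (X @ w - y)     # update stays in the row space of X

# Of all interpolating solutions, GD picks the minimum-norm one.
w_min_norm = np.linalg.pinv(X) @ y
```

Swapping in a different optimizer (e.g. sign-based or coordinate descent) generally lands on a *different* interpolant, which is one way to see that optimizers carry distinct inductive biases.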

[–] MelonheadGT@alien.top 1 points 10 months ago

Bayesian optimization for hyperparameter search.

CV applications for Home, life, and IoT.
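For reference, Bayesian optimization for hyperparameter search can be sketched as a bare-bones GP-UCB loop. Everything here is illustrative — the RBF kernel, length scale, UCB coefficient, and the toy 1-D "objective" standing in for a slow cross-validation run; in practice you'd reach for a library like Optuna or scikit-optimize:

```python
import numpy as np

def rbf(a, b, ls=0.3):
    """Squared-exponential kernel between two sets of 1-D points."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(X, y, Xq, noise=1e-6):
    """GP posterior mean/std at query points Xq given observations (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(Xq, X)
    Kinv = np.linalg.inv(K)
    mu = Ks @ Kinv @ y
    var = 1.0 - np.sum((Ks @ Kinv) * Ks, axis=1)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def objective(x):                     # pretend this is an expensive CV run
    return -(x - 0.7) ** 2            # best "hyperparameter" is x = 0.7

grid = np.linspace(0, 1, 201)
X = np.array([0.1, 0.9])              # two initial evaluations
y = objective(X)
for _ in range(10):                   # GP-UCB: evaluate where mean + 2*std is highest
    mu, sd = gp_posterior(X, y, grid)
    xn = grid[np.argmax(mu + 2 * sd)]
    X, y = np.append(X, xn), np.append(y, objective(xn))

best = X[np.argmax(y)]
```

The `mu + 2*sd` acquisition is what trades off exploring uncertain regions against exploiting promising ones — the whole point of BO over grid/random search when each evaluation is expensive.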

[–] -Django@alien.top 1 points 10 months ago (1 children)

Manifold learning! It seems so cool, but every time I dig into it, I feel like I need a PhD in math to understand the theory.

[–] hookers@alien.top 1 points 10 months ago

Yes! I think/hope ultimately this will unlock AGI.

[–] General_Service_8209@alien.top 1 points 10 months ago (1 children)

State space models and their derivatives.

They have demonstrated better performance than Transformers on very long sequences, with linear rather than quadratic computational cost, and on paper they also generalize better to non-NLP tasks.

However, they are more difficult to train, so in practice they perform worse outside of those few very-long-sequence tasks. But with a bit more development, they could become the most impactful AI technology in years.
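The core recurrence behind these models fits in a few lines. This is a single discrete-time linear SSM channel with arbitrary matrices — a sketch of the mechanism only, not the structured HiPPO/S4 parameterization or the FFT-based training trick those papers use:

```python
import numpy as np

# x[k+1] = A x[k] + B u[k]   (hidden state)
# y[k]   = C x[k]            (output)
# Unlike attention, the scan below is linear in sequence length.

def ssm_scan(A, B, C, u):
    """Run the state space recurrence over a length-T input sequence u."""
    x = np.zeros(A.shape[0])
    ys = []
    for uk in u:                      # O(T) in sequence length
        x = A @ x + B * uk
        ys.append(C @ x)
    return np.array(ys)

n = 4
rng = np.random.default_rng(0)
A = 0.9 * np.eye(n)                   # stable: eigenvalues inside the unit circle
B = rng.normal(size=n)
C = rng.normal(size=n)
u = np.zeros(100); u[0] = 1.0         # impulse input
y = ssm_scan(A, B, C, u)              # response decays like 0.9^k
```

The training difficulty the comment mentions largely comes down to parameterizing `A` so the recurrence stays stable while still remembering long-range information.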

[–] BEEIKLMRU@alien.top 1 points 10 months ago (1 children)

Do you have anything in particular you think is worth sharing? I’m trying to implement model predictive control in MATLAB and I’m working on an LSTM surrogate model. Just last week I found MATLAB tools for neural state space models, and I’ve been wondering if I just uncovered a big blind spot of mine.

[–] General_Service_8209@alien.top 1 points 10 months ago

It sounds like you‘ve come across exactly what I meant.

I have a couple of papers on the topic if you’re interested in those. There’s also a PyTorch implementation of a neural state space model by the authors of the original paper: https://github.com/HazyResearch/state-spaces

[–] tesfaldet@alien.top 1 points 10 months ago (3 children)

Gonna toot my own research direction: artificial intelligence x complex systems. I’m talking differentiable self-organization (e.g., neural cellular automata), interacting particle systems (e.g., particle Lenia), and other neural dynamical systems where emergent behaviour and self-organization are key characteristics.

Other than Alex Mordvintsev and his co-authors, Sebastian Risi and his co-authors, and I suppose David Ha with his new company, I don’t see much work in this intersection of fields.

I think there’s a lot to unlock here, particularly if the task at hand benefits greatly from a decentralized and/or compute-adaptive approach, with robustness requirements. Swarm Learning already comes to mind. Or generative modelling with/of complex systems, like decentralized flow (or Schrödinger) matching for modelling interacting particle systems (e.g., fluids, gases, pedestrian traffic).
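The "same local rule applied everywhere" structure that neural CA build on is easy to see in a classical cellular automaton. Below is Conway's Game of Life in NumPy; in a *neural* CA the hand-written rule in `step` is replaced by a small learned network (and the state is continuous), but the emergent, self-organizing behaviour — here, a glider that translates itself — is the same flavour. Grid size and pattern are illustrative:

```python
import numpy as np

def neighbor_sum(grid):
    """Sum of the 8 neighbors of each cell, with wrap-around (toroidal) edges."""
    s = np.zeros_like(grid)
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx or dy:
                s += np.roll(np.roll(grid, dx, 0), dy, 1)
    return s

def step(grid):
    """Life rule: birth on 3 neighbors, survival on 2 or 3.
    A neural CA swaps this fixed rule for a learned per-cell update."""
    n = neighbor_sum(grid)
    return ((n == 3) | ((grid == 1) & (n == 2))).astype(grid.dtype)

g0 = np.zeros((16, 16), dtype=int)
for r, c in [(1, 2), (2, 3), (3, 1), (3, 2), (3, 3)]:
    g0[r, c] = 1                      # a glider
g = g0.copy()
for _ in range(4):                    # after 4 steps the glider has moved by (1, 1)
    g = step(g)
```

Global behaviour (a moving "creature") emerges purely from a local rule — which is exactly the property differentiable self-organization tries to harness and train.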

[–] hophophop1233@alien.top 1 points 10 months ago (1 children)

I think this is the most important topic in this thread so far.

[–] kau_mad@alien.top 1 points 10 months ago

Could you please point to a few recent research papers in this area?

[–] PrincessPiratePuppy@alien.top 1 points 10 months ago

Morality training environments: design game-theory environments so that multiple RL agents end up with a strong bias towards cooperation.

[–] nicolas-gervais@alien.top 1 points 10 months ago

AI explainability and why it’s garbage

[–] PhilsburyDoboy@alien.top 1 points 10 months ago

I'm particularly excited about AI accelerating theorem provers and optimization problems (think: traveling salesman). These problems are NP-hard and scale very poorly. We would see huge efficiency gains in most industries if they scaled better. Recently there has been some very exciting research in using neural networks to accelerate and scale MILP and LP solvers.

For reference, optimization problems include:

  • SpaceX rocket landing
  • Car navigation systems
  • Electric grid operations/markets
  • Portfolio optimization
  • Stock and options trading
  • Airline fleet operations
  • Ship/Truck logistics
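To get a feel for why these problems are hard and why heuristics (increasingly learned ones) matter, here is a classic TSP baseline: greedy nearest-neighbor construction followed by 2-opt local search. The instance is random and purely illustrative — real solvers (and the neural-accelerated MILP/LP work the comment mentions) are far more sophisticated:

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.random((30, 2))                      # 30 random cities in the unit square
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)

def tour_len(tour):
    """Total length of a closed tour."""
    return sum(D[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def nearest_neighbor(start=0):
    """Greedy construction: always visit the closest unvisited city."""
    unvisited, tour = set(range(len(D))) - {start}, [start]
    while unvisited:
        nxt = min(unvisited, key=lambda j: D[tour[-1], j])
        tour.append(nxt); unvisited.remove(nxt)
    return tour

def two_opt(tour):
    """Local search: reverse segments as long as that shortens the tour."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                new = tour[:i] + tour[i:j][::-1] + tour[j:]
                if tour_len(new) < tour_len(tour) - 1e-12:
                    tour, improved = new, True
    return tour

greedy = nearest_neighbor()
improved = two_opt(greedy)                     # never worse than the greedy tour
```

Exact solutions scale exponentially in the worst case, so even modest learned improvements to heuristics like these translate into real savings in the industries listed above.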
[–] coffeecoffeecoffeee@alien.top 1 points 10 months ago

Form parsing. Hugely important topic and the only approach I know about is Microsoft’s LayoutLM model.

[–] Exotic_Zucchini9311@alien.top 1 points 10 months ago

Probabilistic programming is also an interesting topic (used for Bayesian Statistics/Probabilistic ML/Probabilistic Graphical Models/Cognitive AI/etc.)
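The kind of computation probabilistic programming languages automate can be sketched by hand for a toy model. Below is grid-approximation posterior inference for a coin's bias — in a PPL you would only declare the prior and likelihood and let the system do the inference; the model and numbers here are illustrative:

```python
import numpy as np

theta = np.linspace(0, 1, 1001)       # candidate values for P(heads)
prior = np.ones_like(theta)           # flat prior over the bias
heads, flips = 7, 10                  # observed data: 7 heads in 10 flips

# Bayes' rule on a grid: posterior ∝ prior × likelihood.
like = theta ** heads * (1 - theta) ** (flips - heads)
post = prior * like
post /= post.sum()                    # normalize into a distribution

mean = (theta * post).sum()           # posterior mean ≈ (heads+1)/(flips+2) = 2/3
```

Grid approximation only works in low dimensions; the value of PPLs (Stan, PyMC, NumPyro, Gen, ...) is doing the same conditioning with MCMC or variational inference on models where this brute-force sum is impossible.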

[–] 10110110100110100@alien.top 1 points 10 months ago

Curriculum Learning.

It just feels intuitive to me, so I wander back to thinking about it every now and then. I think we will increasingly need to get the most bang for our buck wrt data, as naively doing a 10x on the ingest isn’t going to scale anymore.

[–] Jessynoo@alien.top 1 points 10 months ago (1 children)
  • Symbolic learning (KBIL etc.) has kind of faded away, and the whole chapter was nuked from AIMA. Waiting for its comeback.
  • Game theory has made a lot of progress on both the algorithmic and the societal sides (regret minimisation, Bayesian and differential games, topology of elementary games, mechanism design, social choice theory, etc.). Hopefully it will get democratized at some point, because it is needed.
  • Probabilistic programming does not seem to get as much traction recently, but it seems the corresponding approaches extend ML and provide a bridge to symbolic approaches.
  • Arg-tech and, more generally, the semantic web still seem niche, whereas LLMs are the perfect tools to finally get them done. They could also do some good for our current societal issues.
[–] pfaya@alien.top 1 points 10 months ago (1 children)
[–] Jessynoo@alien.top 1 points 10 months ago (1 children)

Argumentation technologies: a whole sub-branch extending FOL and modal logics. See Java's Tweety or the Argument Interchange Format. Again, my early tests suggest LLMs are very good at building belief sets, running reasoners, and interpreting their results in layman's terms.

[–] pfaya@alien.top 1 points 10 months ago

Ah, you might find this interesting: https://compphil.github.io/truth/

[–] OriginalUser99@alien.top 1 points 10 months ago

Multi-modal systems. Very new and trending in CV and NLP. Personally I want to see reinforcement learning added to this; I believe that would be the first step towards simple AGI: self-supervised learning from experience with a zero-shot aspect. If I get offered a PhD fellowship I might look into that, but I'm trying to learn multi-modal systems right now.

[–] elsrda@alien.top 1 points 10 months ago

Model calibration and, more generally, uncertainty estimation.
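One standard calibration metric is Expected Calibration Error: bin predictions by confidence and compare each bin's average confidence to its empirical accuracy. A minimal sketch using the simplest equal-width binning (bin count and the synthetic data are illustrative):

```python
import numpy as np

def ece(probs, labels, n_bins=10):
    """Expected Calibration Error for binary predictions:
    weighted average of |accuracy - confidence| over equal-width bins."""
    bins = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    total, err = len(probs), 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            conf = probs[mask].mean()          # what the model claims
            acc = labels[mask].mean()          # what actually happened
            err += mask.sum() / total * abs(acc - conf)
    return err

# A perfectly calibrated synthetic "model": P(label=1) equals the stated prob,
# so ECE should be near zero (up to sampling noise).
rng = np.random.default_rng(0)
p = rng.random(100_000)
y = (rng.random(100_000) < p).astype(float)
score = ece(p, y)
```

A model can have high accuracy and still be badly calibrated (e.g. always claiming 99% confidence), which is why uncertainty estimation deserves its own treatment.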

[–] Serasul@alien.top 1 points 10 months ago

Medicine. I think we can stop millions from suffering with the help of AI, but this topic seems to be unimportant to AI people.

[–] FormerIYI@alien.top 1 points 10 months ago

How to come up with optimal solutions to highly Kolmogorov-complex problems. This is the nearest thing to what I would call "human intelligence", and what I would want from a massive AI breakthrough.

https://arxiv.org/abs/1911.01547 - Chollet's Abstraction and Reasoning Corpus is a good example of this - it remains tremendously hard for AI to solve to this day.

https://www.researchgate.net/publication/2472570_A_Formal_Definition_of_Intelligence_Based_on_an_Intensional_Variant_of_Algorithmic_Complexity - here's theoretical work elaborating on importance of this concept.

It is important for the current LLM revolution as well, as it is one of the ways to differentiate actual cognitive skills from a glorified look-up table.

[–] InflationSquare@alien.top 1 points 10 months ago

Boltzmann Machines / Deep Boltzmann Machines. I only came across the term recently, so I don't know much about them, but I thought it was fascinating that there is a branch of modelling that seems to run parallel to NNs which I'd never heard of.
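For the curious: a Restricted Boltzmann Machine is an energy-based model with a layer of visible and a layer of hidden binary units, trained by (approximately) lowering the energy of data and raising it elsewhere. Below is a toy RBM trained with one-step contrastive divergence (CD-1) in NumPy — the sizes, learning rate, iteration count, and the two training patterns are all arbitrary illustrative choices:

```python
import numpy as np

# Energy: E(v, h) = -v·W·h - a·v - b·h, with binary v (visible) and h (hidden).

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    """Sample binary units from their activation probabilities."""
    return (rng.random(p.shape) < p).astype(float)

n_v, n_h, lr = 6, 3, 0.1
W = 0.01 * rng.normal(size=(n_v, n_h))
a, b = np.zeros(n_v), np.zeros(n_h)

data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]], float)   # two complementary patterns

for _ in range(2000):                          # CD-1 training loop
    v0 = data[rng.integers(len(data))]
    ph0 = sigmoid(v0 @ W + b)                  # positive phase: data-driven stats
    h0 = sample(ph0)
    pv1 = sigmoid(W @ h0 + a)                  # one Gibbs step back down
    v1 = sample(pv1)
    ph1 = sigmoid(v1 @ W + b)                  # negative phase: model-driven stats
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    a += lr * (v0 - v1)
    b += lr * (ph0 - ph1)

# Reconstruct the first pattern through the hidden layer (mean-field).
recon = sigmoid(W @ sigmoid(data[0] @ W + b) + a)
```

Unlike a feed-forward net, there is no output target here — the model just learns a distribution over its inputs, which is why deep Boltzmann machines read like a parallel branch of the field.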

[–] UndocumentedMartian@alien.top 1 points 10 months ago

Reinforcement learning.

[–] JanBitesTheDust@alien.top 1 points 10 months ago

Inductive learning of logic programs. Assuming the data is encoded properly, you can learn a logic program that is reliable and interpretable.

[–] 1deasEMW@alien.top 1 points 10 months ago

Neural nets at the edge being Gaussians.

Instilling Inductive biases

HYPERNETWORKS (MAML and networks producing networks)

Neuronal weight and layer interpretability (a new paper I heard about used an autoencoder or something to figure out which neurons were responsible for certain changes in behavior; it seemed interesting, and could be useful for understanding how to instill information directly into a network)

[–] cMonkiii@alien.top 1 points 10 months ago

Rule based methods
