this post was submitted on 09 Nov 2023
1 points (100.0% liked)
Machine Learning
you are viewing a single comment's thread
view the rest of the comments
State space models and their derivatives.
They have demonstrated better performance than Transformers on very long sequences, with linear rather than quadratic computational cost in sequence length, and on paper they also generalize better to non-NLP tasks.
However, they are harder to train, so in practice they perform worse outside of a few very-long-sequence tasks. With a bit more development, though, they could become the most impactful AI technology in years.
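To make the linear-cost point concrete, here is a minimal sketch in plain NumPy of the discretized linear state-space recurrence x_{k+1} = A x_k + B u_k, y_k = C x_k + D u_k. The dimensions and random matrices are toy values, not taken from any particular paper; the point is that one pass over the sequence costs O(L), versus the O(L^2) pairwise attention of a Transformer.

```python
import numpy as np

# Toy discretized linear state space model:
#   x_{k+1} = A x_k + B u_k
#   y_k     = C x_k + D u_k
# Dimensions are illustrative only.
state_dim, in_dim, out_dim, seq_len = 16, 1, 1, 4096

rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(state_dim, state_dim))
B = rng.normal(size=(state_dim, in_dim))
C = rng.normal(size=(out_dim, state_dim))
D = rng.normal(size=(out_dim, in_dim))

u = rng.normal(size=(seq_len, in_dim))   # input sequence
x = np.zeros(state_dim)                  # initial state
ys = []
for u_k in u:                            # one update per time step -> O(L) overall
    ys.append(C @ x + D @ u_k)
    x = A @ x + B @ u_k
y = np.stack(ys)                         # outputs, shape (seq_len, out_dim)
```

Real models like S4 add learned structure to A and parallelize this scan, but the cost still grows linearly with sequence length.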
Do you have anything in particular you think is worth sharing? I'm trying to implement model predictive control in MATLAB and I'm working on an LSTM surrogate model. Just last week I found MATLAB tools for neural state space models, and I've been wondering if I just uncovered a big blind spot of mine.
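For concreteness, the kind of surrogate I mean is roughly the following (sketched here in PyTorch with placeholder names and sizes, since my actual work is in MATLAB): a network that maps a history of plant states and controls to the predicted next outputs, which the MPC optimizer then rolls forward instead of the true plant dynamics.

```python
import torch
import torch.nn as nn

class LSTMSurrogate(nn.Module):
    """Maps a history of (state, control) inputs to predicted plant outputs.
    Placeholder dimensions; a stand-in for the plant model inside the MPC loop."""
    def __init__(self, n_inputs=4, hidden=64, n_outputs=2):
        super().__init__()
        self.lstm = nn.LSTM(n_inputs, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_outputs)

    def forward(self, seq):              # seq: (batch, time, n_inputs)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])     # predict outputs at the last time step

# The MPC optimizer would evaluate candidate control sequences by rolling
# this surrogate forward rather than integrating the real plant.
model = LSTMSurrogate()
pred = model(torch.randn(8, 50, 4))      # batch of 8 histories, horizon 50
```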
It sounds like you've come across exactly what I meant.
I have a couple of papers on the topic if you’re interested in those. There’s also a PyTorch implementation of a neural state space model by the authors of the original paper: https://github.com/HazyResearch/state-spaces