CursedCrystalCoconut

joined 10 months ago
[–] CursedCrystalCoconut@alien.top 1 points 10 months ago

Yeah, those bs ones pop up everywhere. If only there were some model to sort between those and the good ones... And I'm kind of giving up on being caught up, seeing all the answers.

[–] CursedCrystalCoconut@alien.top 1 points 10 months ago

Wow, that is a lot of work. It's awesome that you manage to keep up with the latest and stay on the pulse of AI, as you said. That is the kind of discipline I cannot follow. Just one hour at work in the morning would destroy the rest of my day ^^

[–] CursedCrystalCoconut@alien.top 1 points 10 months ago (1 children)

Hugging Face is for sure a godsend, even though I'm still at a semi-loss with their API. It has changed so much, and there is so much more now, that it has become a little confusing. Nothing a little work can't fix! But that raises a question for me: how do these people manage to get every model out so fast?

[–] CursedCrystalCoconut@alien.top 1 points 10 months ago

You managed to put into words what bugs me about the field nowadays. What kills me most is your third paragraph: no one cares what the model does IRL, only how it improves a metric on a benchmark task and dataset. When the measure becomes the objective, you're not doing proper science anymore.

[–] CursedCrystalCoconut@alien.top 1 points 10 months ago

That helps narrow it down. Though, many discoveries are not published anymore. Reminds me of Mikolov, who was rejected pretty much everywhere and word vectors ended up being such a big deal. Or that OpenAI does not publish their models.

[–] CursedCrystalCoconut@alien.top 1 points 10 months ago (1 children)

Thanks! When I get back (soon) to a full-time ML position, I'll be sure to check it out.

[–] CursedCrystalCoconut@alien.top 1 points 10 months ago

Yes, it seems from all the answers that I'm just trying to go too deep. Unfortunately, it feels like nowadays it's just tweaking and trying architectures, but there is no "red line" or big mechanism to know about, like there were kernels or attention.

[–] CursedCrystalCoconut@alien.top 1 points 10 months ago

Then it's kind of sad, because a lot of discoveries have been made by looking at what other disciplines were doing and cross-pollinating (genetic algorithms, attention, etc.). Plus, how does one then know if they want to branch out to another domain? But you're right, there is too much...

 

I started my PhD in NLP a year or so before the advent of Transformers, and finished it just as ChatGPT was unveiled (literally defended a week before). Halfway through, I felt the sudden acceleration of NLP, where there was so much everywhere all at once. Before, knowing one's domain, and the state-of-the-art GCN, CNN or BERT architectures, was enough.

Since then, I've been working in a semi-related area (computer-assisted humanities) as a data engineer/software developer/ML engineer (it's a small team, so many hats). There's not much in terms of latest news there, so I recently tried to get up to speed with the recent developments.

But there are so many! Everywhere. Even just in NLP, not considering all the other fields such as reinforcement learning, computer vision, all the fundamentals of ML, etc. It is damn near impossible to gain an in-depth understanding of the models, as they are so complex and numerous. All of them are built on top of other ones, so you also need to read up on those to understand anything. I follow some people on LinkedIn who just give new names every week or so. Going to look for papers in top conferences is also daunting, as there is no guarantee that a paper with an award will translate to an actual system, while companies churn out new architectures without the research paper/methodology being made public. It's overwhelming.

So I guess my question is twofold: how does one get up to speed after a year of not being too much in the field? And how does one keep up after that?