yannbouteiller

joined 1 year ago
[–] yannbouteiller@alien.top 1 points 11 months ago

Am I correct to say that "grokking" is essentially an effect of regularization, i.e., the model eventually reaches good generalization because the weights are pushed to be as small as possible, until its effective capacity becomes smaller than the dataset?
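For reference, the setting in which this effect is usually reported is a small network on a synthetic algorithmic task, trained with strong weight decay far past the point where training accuracy saturates. Here is a minimal sketch of that kind of setup in PyTorch; the architecture and hyperparameters are illustrative choices of my own, not tuned values from any paper, so the effect is not guaranteed to show up with these exact numbers:

```python
# Minimal sketch of a "grokking"-style setup: a small network trained on modular
# addition with strong weight decay (AdamW), run far past the point where
# training accuracy saturates. All hyperparameters are illustrative, not tuned.
import torch
import torch.nn as nn

P = 97  # modulus for the synthetic a + b (mod P) task
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
split = len(pairs) // 2
train_idx, val_idx = perm[:split], perm[split:]

model = nn.Sequential(
    nn.Embedding(P, 64),          # shared embedding for both operands
    nn.Flatten(),                 # (N, 2, 64) -> (N, 128)
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, P),
)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(20000):
    logits = model(pairs[train_idx])
    loss = loss_fn(logits, labels[train_idx])
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            train_acc = (logits.argmax(-1) == labels[train_idx]).float().mean().item()
            val_acc = (model(pairs[val_idx]).argmax(-1) == labels[val_idx]).float().mean().item()
        print(f"step {step}: train acc {train_acc:.2f}, val acc {val_acc:.2f}")
```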

[–] yannbouteiller@alien.top 1 points 11 months ago

Your teacher's argument is based on the fact that pseudo-random generators are deterministic, which is entirely irrelevant to ML theory.

If you want to make the point that the "can only be" part is extremely far-reaching, just bring quantum physicists into the discussion.
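For what it's worth, the deterministic part is trivially true and trivially checkable; it just says nothing about what a model can or cannot learn. A quick illustration with NumPy:

```python
# The narrow fact the argument leans on: a seeded pseudo-random generator is a
# deterministic function of its seed, so its "random" draws are fully reproducible.
import numpy as np

gen_a = np.random.default_rng(seed=42)
gen_b = np.random.default_rng(seed=42)

draws_a = gen_a.standard_normal(5)
draws_b = gen_b.standard_normal(5)

assert np.array_equal(draws_a, draws_b)  # same seed -> identical "random" numbers
print(draws_a)
```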

[–] yannbouteiller@alien.top 1 points 11 months ago

It does answer OP's question, but is of limited practical relevance for an ML course IMHO.

We typically use GD because the optimization landscapes are approximately pseudoconvex, not because they contain infinitely many (or even any) saddle points. To escape local optima and saddle points, we rely on other tricks, such as SGD.
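To make the saddle-point part concrete, here is a toy sketch of my own (not anyone's actual training setup): full-batch GD initialized exactly on the saddle of f(x, y) = x^2 - y^2 never moves, while adding small gradient noise, as a stand-in for the minibatch noise of SGD, breaks the symmetry and escapes:

```python
# Toy illustration: plain gradient descent started on the saddle of
# f(x, y) = x^2 - y^2 stays put (zero gradient), while noisy gradient updates
# drift away along the -y^2 direction and keep decreasing f.
import numpy as np

def grad(p):
    x, y = p
    return np.array([2 * x, -2 * y])  # gradient of x^2 - y^2

rng = np.random.default_rng(0)
lr = 0.1

gd = np.zeros(2)   # start exactly on the saddle at the origin
sgd = np.zeros(2)
for _ in range(200):
    gd = gd - lr * grad(gd)
    sgd = sgd - lr * (grad(sgd) + rng.normal(scale=0.01, size=2))

print("plain GD stays at", gd)            # still (0, 0)
print("noisy updates drift away:", sgd)   # |y| grows, f decreases
```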

[–] yannbouteiller@alien.top 1 points 11 months ago

I am not sure; I work in academia and we didn't have this problem.

I suppose it is the backlash from the hype that started in 2015, when random companies started hiring data science teams and later realized they didn't need them. Most notably, the big tech companies fired many ML people at the end of the pandemic, which flooded the market with highly skilled job-seekers and has made it hard for newcomers to this day, as far as I have been told.

[–] yannbouteiller@alien.top 1 points 1 year ago

Actually, he told me they got many applicants with PhDs in unrelated fields like biology, etc.

[–] yannbouteiller@alien.top 0 points 1 year ago (5 children)

The bubble is already here. Since the end of the pandemic, there have been huge layoffs in ML departments, and the market is flooded with job-seekers right now.

My friend's startup - a small no-name ML startup that pays 50k Canadian dollars a year - recently posted a job offer and received more than 1000 applications in 2 days.

[–] yannbouteiller@alien.top 1 points 1 year ago (1 children)

I am not sure I fully understand what your definition of "intelligence" really encompasses, but as far as I understand it, it sounds like a definition of supervised learning rather than of "intelligence"?

Where does intelligence that arises naturally, e.g., from random genetic mutations or from reinforcement learning, stand here?

Natural selection via random adaptation to the current state of the universe is, I think, an example of intelligence constructed without human-generated data, but it doesn't seem to fit what you call "intelligence" here, since it is not trying to imitate anything.
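To illustrate what I mean, here is a toy sketch of my own: a (1+1) evolution strategy that improves a candidate purely through random mutation and selection against an environment, with no human-generated labels or behaviour to imitate. The fitness function and numbers are made up for the example:

```python
# Toy sketch of the "no imitation" point: random mutation plus selection against
# an environment (a fitness function) improves a candidate with no supervision.
import numpy as np

rng = np.random.default_rng(0)

def fitness(genome):
    # "Environment": reward genomes close to some target state of the world.
    target = np.array([1.0, -2.0, 0.5])
    return -np.sum((genome - target) ** 2)

genome = rng.normal(size=3)                          # random initial individual
for generation in range(1000):
    mutant = genome + rng.normal(scale=0.1, size=3)  # random mutation
    if fitness(mutant) >= fitness(genome):           # selection, nothing else
        genome = mutant

print("evolved genome:", genome)  # ends up near the target without imitating anything
```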