this post was submitted on 19 Nov 2023
1 points (100.0% liked)

Machine Learning

When do you think the inflection point was for the technical expertise and credentials required for mid-to-top-tier ML roles? Or was there never one? To be specific, would knowing simple scikit-learn algorithms, or the basics of decision trees/SVMs, have qualified you for full-fledged roles only in the past, or does it still today? At what point did FAANGs boldly start stating in their job postings: preferred (required) to have publications at top-tier venues (ICLR, ICML, CVPR, NIPS, etc.)?

I use the word 'creep' in the same sense as 'power creep' in battle anime, where power levels slowly inflate to such an irrational degree that anything from the past looks extremely weak.

Back in late 2016 I landed my first ML role at a defense firm (lol), but to be fair I had just watched a couple of ML courses on YouTube, taken maybe 2 ML grad courses, and had an incomplete working knowledge of CNNs. I had never used TensorFlow, and had some experience with Theano (not sure if it even exists anymore).

I'm certain that skill set would be insufficient in the 2023 ML industry. But it raises the question: is this skill creep making the job market impenetrable for folks who were already working in the field post-2012-2014?

Neural architectures are becoming increasingly complex. You want to develop a multi-modal architecture for an embodied agent? Well, you'd better know a good mix of DL spanning RL + CV + NLP. Improving latency on edge devices? How well do you know your ONNX/TensorRT/CUDA kernels? Your classes likely didn't even teach you those. A master's is the new bachelor's degree, and that's just to give you a fighting chance.
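To make the edge-latency example concrete, here's a minimal sketch (not any particular company's stack) of the workflow those postings assume: export a toy PyTorch model to ONNX, then run it with ONNX Runtime, which can delegate to TensorRT on supported hardware. The model and file names are placeholders.

    import torch
    import torch.nn as nn

    # Toy model standing in for whatever you'd actually deploy.
    model = nn.Sequential(
        nn.Conv2d(3, 16, 3), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
    )
    model.eval()

    # Export: ONNX records the traced graph so other runtimes can execute it.
    dummy = torch.randn(1, 3, 224, 224)
    torch.onnx.export(model, dummy, "model.onnx",
                      input_names=["input"], output_names=["logits"],
                      opset_version=17)

    # Inference with ONNX Runtime.
    import onnxruntime as ort
    session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    logits = session.run(None, {"input": dummy.numpy()})[0]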

Yeah, not sure if it was after the release of AlexNet in 2012, TensorFlow in 2015, attention/Transformers in 2017, or now ChatGPT - but the skill creep is definitely raising the bar of technical rigor in the field at a growing pace. Close your eyes for 2 years and your models feel prehistoric, and your CUDA, PyTorch, NVIDIA driver, and NumPy versions need a fat upgrade.

Thoughts, y'all?

top 31 comments
[–] velcher@alien.top 1 points 10 months ago

I really don't think there is a power creep. In 2016, knowing just script-kiddy-level ML like scikit-learn would not have qualified you for full-fledged ML roles at FAANGs anyway. In general, FAANGs have always looked for 2 qualities in ML roles: a rigorous, principled mathematical background, and solid SWE skills.

Architectures are becoming simpler - Transformers are being used instead of LSTMs for sequence modeling. Code is becoming easier to run and more streamlined. CUDA is much easier to install these days. JAX is basically a streamlined TensorFlow 1.
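To illustrate the 'streamlined TensorFlow 1' point, a minimal sketch (function and shapes made up for illustration): like TF1, JAX traces your code into a compiled graph, but you write plain Python functions instead of building sessions and placeholders.

    import jax
    import jax.numpy as jnp

    # jax.jit traces this Python function into a compiled XLA graph,
    # much like TF1's graph mode, minus the session/placeholder boilerplate.
    @jax.jit
    def predict(w, x):
        return jnp.tanh(x @ w)

    w = jnp.ones((3, 2))
    x = jnp.ones((4, 3))
    print(predict(w, x))  # compiled on first call, cached afterwards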

[–] m_____ke@alien.top 1 points 10 months ago (4 children)

It's the opposite. I'm doing a ton of interviews for senior roles and there's a flood of ex-crypto bros and web devs who think they're LLM experts because they've used OpenAI APIs or can use Hugging Face's Trainer. Somehow they stuff their resumes with all of the buzzwords and manage to get past recruiters, but in interviews they can't even give a high-level explanation of what BERT is or how RNNs differ from Transformers.

[–] mofoss@alien.top 1 points 10 months ago (1 children)

I think that 'flood' is contributing to a market saturation of applicants, thereby making the process more selective.

We also can't ignore the rapid influx of CS students increasingly choosing to study AI/ML versus, say, general SWE, web development, or mobile app engineering like they did a decade ago.

I fully expect a new standardized ML interview process to emerge, in the same way Leetcode crept up with the tech boom.

[–] 111llI0__-__0Ill111@alien.top 1 points 10 months ago

Sounds like that would be a good thing if “ML leetcode” replaced regular leetcode. The regular leetcode stuff is way harder imo and so pointless to grind

[–] currentscurrents@alien.top 1 points 10 months ago (2 children)

I do know what BERT is and how RNNs differ from transformers. What buzzwords should I be putting on my resume to get these interviews?

[–] new_name_who_dis_@alien.top 1 points 10 months ago (1 children)

I’m not good at this myself (or maybe I’m just lazy) but my friend who’s better at this than me told me that your resume should have all of the buzzwords used in the job posting. That’s how you get through the filters etc.

[–] Username912773@alien.top 1 points 10 months ago

Deep Neural Network, Generative AI, Deep Learning, Large Language Models, GPT, AGI, Universal Function Approximation.

[–] ManuelRios18@alien.top 1 points 10 months ago

And what should I do if I have no work permit in the US? 🙃

[–] depressed-bench@alien.top 1 points 10 months ago (1 children)

Hey, that's what I am seeing in r/experienceddevs :)

[–] thatguydr@alien.top 1 points 10 months ago (1 children)

Watch out for that subreddit. The name attracts people who are not experienced but who think highly of themselves. There are a lot of posts in that subreddit that reek of inexperience (especially w.r.t. soft skills) and often a significant lack of self-reflection.

[–] depressed-bench@alien.top 1 points 10 months ago

That checks out tbh. I have seen stuff >.>

[–] PLxFTW@alien.top 1 points 10 months ago

How? Honestly, how is this possible? I literally have senior experience and responsibilities, but I can't even get a single callback from anywhere.

[–] AltruisticCoder@alien.top 1 points 10 months ago

In fairness, offers and compensation packages have also scaled massively, especially for those research-heavy roles you mentioned, so the publication requirements are not too unexpected.

[–] Delicious-View-8688@alien.top 1 points 10 months ago

I think it never really became a thing. People/teams/orgs treated these hires as SWEs with ML applications, and that is more or less what the role became. Most seem to just throw xgboost and random models from Hugging Face at everything and see what sticks.

[–] Tasty-Rent7138@alien.top 1 points 10 months ago (1 children)

Also, not every domain needs deep learning at all. With tabular data, GBMs are still king. (I'm happy for every example where DL outperforms GBMs on tabular data, as I would also be happy to use more DL, but I can't just use more complex architectures for the sake of complexity.) And not every company has the near-infinite data that data-hungry DL models need. So there are situations where a Transformer architecture would be feasible in principle, but with the available data it is still going to be a GBM. There are still data scientist positions out there where you can have a big impact on the business as a 'sklearn kiddie' (okay, maybe an xgboost or lightgbm kiddie), and knowing all the DL architectures would not help any more.
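To ground the 'GBM kiddie' point, a minimal sketch of such a baseline, using LightGBM on a small built-in sklearn dataset as a stand-in for real business data:

    import lightgbm as lgb
    from sklearn.datasets import load_breast_cancer
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Small built-in dataset standing in for real tabular business data.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # A plain GBM baseline; on modest tabular datasets this is typically hard
    # to beat with deep learning, and it needs no special hardware.
    model = lgb.LGBMClassifier(n_estimators=500, learning_rate=0.05)
    model.fit(X_train, y_train)
    print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))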

In the end it should be all about business impact (if you are working in the competitive sector) and not who is using the latest, freshest architectures.

[–] progressgang@alien.top 1 points 10 months ago

That’s the thing (across all sectors) that people don’t seem to understand. If it makes them money, or keeps the board happy (often intertwined), then they’re doing a fantastic job.

Hence the highly paid LLM stuff. It’s super impactful, and the CEO can’t do it.

[–] liuzicheng1987@alien.top 1 points 10 months ago

I think the industry is maturing. When I started back in 2015, it was totally acceptable to simply produce prototypes that were far from production-level code quality. But the expectation now is to be able to build production-grade prediction systems, which means people pay more attention to software development skills. And if you are applying for such a role (which can pay very well), then people will care about that rather than publications.

[–] fromnighttilldawn@alien.top 1 points 10 months ago (1 children)

Yes.

But "the haves" like to pretend its not in order to make it seem like everything's "fair".

Geoff Hinton's 1986 backpropagation paper is like 4 pages.

Nowadays that would be called a brain-fart.

[–] Creature1124@alien.top 1 points 10 months ago (1 children)

And it had already been invented like a dozen times. Also, it's just the chain rule.
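For anyone who hasn't seen it spelled out, a minimal numpy sketch of that claim (shapes and data made up for illustration): backprop through a one-hidden-layer network is the chain rule applied layer by layer.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 3))       # batch of 4 inputs
    y = rng.normal(size=(4, 1))       # targets
    W1 = rng.normal(size=(3, 5))
    W2 = rng.normal(size=(5, 1))

    # Forward pass.
    h = np.tanh(x @ W1)               # hidden activations
    y_hat = h @ W2                    # predictions
    loss = 0.5 * np.mean((y_hat - y) ** 2)

    # Backward pass: each gradient is the upstream gradient times a local derivative.
    d_yhat = (y_hat - y) / y.size     # dL/dy_hat
    dW2 = h.T @ d_yhat                # chain rule through the output layer
    d_h = d_yhat @ W2.T               # chain rule back through W2
    dW1 = x.T @ (d_h * (1 - h**2))    # chain rule through tanh (tanh' = 1 - tanh^2)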

[–] new_name_who_dis_@alien.top 1 points 10 months ago

Hinton's paper is famous not because he claimed to invent backprop but because (IIRC) it was the first instance of it being used to optimize neural nets.

Like the transformer paper is famous but it didn’t invent attention—just applied it in a novel way.

[–] ProfessionalGoogler@alien.top 1 points 10 months ago

I agree with what most people have said, but it will definitely vary from company to company and even between interviewers.

My personal preference for interviewing candidates has mainly been about someone's thought process around tackling a problem they aren't familiar with. There's such a broad array of people applying for roles that 9 times out of 10 you'll find people can't explain what the feature creation or feature selection process looks like - because in most courses and play datasets these things are done for you.

Many people don't even think about the implications of their answers. You'd be shocked at the number of people who say they would do a grid search to find the optimal parameters on a dataset with millions of rows.
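For what it's worth, that complaint has a standard cheap answer: a fixed budget of randomly sampled configurations (often on a subsample of the data) instead of an exhaustive grid. A minimal sklearn sketch on synthetic placeholder data:

    from scipy.stats import loguniform, randint
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import RandomizedSearchCV

    # Placeholder data; with millions of rows you'd tune on a subsample like this.
    X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)

    # A fixed budget of 20 sampled configs instead of an exploding exhaustive grid.
    search = RandomizedSearchCV(
        GradientBoostingClassifier(),
        param_distributions={
            "learning_rate": loguniform(1e-3, 3e-1),
            "n_estimators": randint(50, 500),
            "max_depth": randint(2, 8),
        },
        n_iter=20,
        cv=3,
        n_jobs=-1,
        random_state=0,
    )
    search.fit(X, y)
    print(search.best_params_)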

I'd also say that in many interviews now, the company is setting you up to fail rather than trying to navigate the interview with you to show how you might be useful to the organisation. I would use that to judge whether the company is a good fit. If a company has 6 interview stages, they don't know what they want. If a company has 1 interview stage, they probably aren't being rigorous enough (they need to establish both personal fit and technical fit). If a company makes you solve random Leetcode problems, or explain the architecture of an RNN, when in reality all they do day to day is use scikit-learn, is that interview really fit for purpose?

[–] Lalalyly@alien.top 1 points 10 months ago

Wait. I have publications in CVPR and in the ACL Anthology. I also have worked with several variants of BERT and optimized models for both ONNX runtime and TensorRT. I haven’t been looking because I’m happy where I am.

What even is the base skill set people are looking for these days?

[–] David202023@alien.top 1 points 10 months ago

I think that as time has passed, the meaning of the DS role has evolved as well. Eventually, stakeholders want a product that works. That hasn't changed. The tools however have changed dramatically. I think that as in other fields of study, where you have to obtain a PhD just to understand the jargon, the DS field experienced the same phenomenon. Yes, it is an industry-biased role, meaning that you don't really have to have a PhD to get hired, but yes, you definitely have to know more than the average DS from a few years ago.

[–] synthphreak@alien.top 1 points 10 months ago (1 children)

I’m in the same boat as you, OP. Got in through the back door in 2019 with a basic, woefully inadequate skill set, all self-taught, somehow landing in a research role lol. Gunning now for my second role, and I think I now have a chance, but boy is it tough out there even with 4 YOE.

Did you have to jump ship from your initial role, and when?

[–] mofoss@alien.top 1 points 10 months ago (1 children)

I'm on my 2nd ML role and they're paying for my part-time PhD. Work-life balance is brutal, making me a weak student, but coming out of it I'll have 9 YOE of pure MLE plus a PhD in ML. So I should be set for life really haha 😗

[–] artoflearning@alien.top 1 points 10 months ago

Where can you do a part-time PhD?

[–] Ok_Cartographer5609@alien.top 1 points 10 months ago

Agree. I got into my 1st ML role last year, all self-taught. I've done all sorts of work - from ETL, CV, and sentiment pipelines (mostly SWE stuff) to now LLM-based information retrieval systems. My work mostly revolves around applied ML, but I do have an interest in knowing the bits and bytes of ML as well, so I'm currently teaching myself all about Transformers and LMs.

But it is also true that getting started with ML used to be easy - no need for heavy machinery/resources. Nowadays, you need serious computing power to even get started learning about large language models.

[–] atf1999@alien.top 1 points 10 months ago

Why the lol at the defense firm bit?

[–] iantimmis@alien.top 1 points 10 months ago

Damn this post resonates so hard.

[–] Different-Student859@alien.top 1 points 10 months ago

Senior level ML scientist at MSFT here.

Getting the job I think is only slightly more difficult than, say, two years ago. But being a top performer is definitely a lot harder, and it requires a level of SWE skills and pure AI knowledge that the ML industry has never seen.

There's a rift happening between the traditional ML types and scientists that have adapted to work equally well on both sides of the R&D equation. The former are getting left behind and are treated like second class citizens. Not pretty.

[–] rejectedlesbian@alien.top 1 points 10 months ago

I literally have a publication and 2 years of experience, and I haven't been able to get an entry-level job for months now.

So I think: very.