Seankala

joined 10 months ago
[–] Seankala@alien.top 1 points 9 months ago

Do CS. It seems like you're looking to do more coding than analysis/forecasting.

[–] Seankala@alien.top 1 points 9 months ago

Nothing about this is novel though; the fact that language models can leak sensitive training information has been known for a while now.

[–] Seankala@alien.top 1 points 9 months ago

Data pre-processing.

The startup I work at is fairly small so we don't really have anyone who deals with the data itself (e.g., data engineers, data scientists, etc.). That leaves the MLEs to do most of the grunt work.

[–] Seankala@alien.top 1 points 10 months ago (1 children)

Well I don't know your situation but I feel like the "never have time" excuse may not necessarily be true. Even creating a page in Notion and writing down one line is enough for me. I feel like what was holding me back before was the trap of perfectionism. I wouldn't want to write anything unless I could make it into some conference-poster-quality page.

[–] Seankala@alien.top 1 points 10 months ago (3 children)

Don't be sad, it's just part of how things are. You just have to choose a method and stick to it.

I personally use Notion. I've created a database and added properties like date, venue, authors, organizations, etc.

For example, the other day I needed to recap what the BLIP paper was about so I just searched the paper in the database and took a look at the page. On that page I've highlighted different text with different colors depending on when I came back to read it.

Took me a while to get this working and into the habit of it though.
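
If you'd rather script the setup than click through Notion, here's a rough sketch of that kind of database schema using the `notion-client` Python package. The token, parent page ID, and exact property names are placeholders, not my actual setup:

```python
from notion_client import Client  # pip install notion-client

notion = Client(auth="YOUR_NOTION_TOKEN")  # placeholder integration token

# Create a "Papers" database with roughly the properties mentioned above.
notion.databases.create(
    parent={"type": "page_id", "page_id": "YOUR_PARENT_PAGE_ID"},  # placeholder page ID
    title=[{"type": "text", "text": {"content": "Papers"}}],
    properties={
        "Title": {"title": {}},                # the paper's name
        "Date": {"date": {}},                  # when I read / re-read it
        "Venue": {"select": {"options": []}},  # e.g., NeurIPS, ACL
        "Authors": {"multi_select": {"options": []}},
        "Organizations": {"multi_select": {"options": []}},
    },
)
```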

[–] Seankala@alien.top 1 points 10 months ago

I'm a little curious why this post has so many upvotes. I guess it shows that things really have changed a lot.

[–] Seankala@alien.top 1 points 10 months ago (5 children)

Repetition and working with them. I hope you're not under the impression that reading a paper once is going to help you remember it. I have to read a paper at least 3-4 times before I feel like I actually really understand it.

I remember reading somewhere that people are only able to retain 10-15% of the information they read in the first go or something.

[–] Seankala@alien.top 1 points 10 months ago

TL;DR The more constraints you have on the model, the more time you should spend analyzing your data and formulating your problem.

I'll agree with the top comment. I've also had to deal with a problem at work where we were trying to perform product name classification for our e-commerce product. The problem was that we couldn't afford anything too large or increase infrastructure costs (i.e., if possible, we didn't want to use any more GPU compute than we were already using).

It turns out that extensive EDA was what saved us. We were able to come up with a string-matching algorithm sophisticated enough that it achieved high precision with practically no latency concerns. Might not be as flexible as something like BERT but it got the job done.
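
If it helps, here's a toy sketch of the kind of keyword/string-matching classifier I mean. The categories and patterns here are made up for illustration; the real ones came out of the EDA:

```python
import re

# Hypothetical keyword patterns per category; real ones would come from EDA.
CATEGORY_PATTERNS = {
    "laptops": [r"\blaptop\b", r"\bnotebook\b", r"\bmacbook\b"],
    "phones": [r"\bphone\b", r"\bsmartphone\b", r"\bgalaxy s\d+\b"],
}

# Compile once up front so classification is just a handful of regex searches.
COMPILED = {
    category: [re.compile(p, re.IGNORECASE) for p in patterns]
    for category, patterns in CATEGORY_PATTERNS.items()
}


def classify(product_name: str) -> str | None:
    """Return the first category whose patterns match, else None (abstain)."""
    for category, patterns in COMPILED.items():
        if any(p.search(product_name) for p in patterns):
            return category
    return None  # abstaining on unmatched names keeps precision high


print(classify("Samsung Galaxy S23 Ultra 256GB"))  # -> "phones"
```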

[–] Seankala@alien.top 1 points 10 months ago (4 children)

The reason is that most researchers can't be bothered, since no one pays attention to it anyway. I'm always doubtful about how many researchers even properly understand statistical testing.

I'd be grateful if a paper ran experiments using 5-10 different random seeds and provided the mean and variance.
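
It's genuinely not much extra work either. A minimal sketch of what I mean (the body of the run function is a dummy stand-in for whatever your training/eval loop actually returns):

```python
import random

import numpy as np
import torch


def run_with_seed(seed: int) -> float:
    """Seed everything, train, and return a single evaluation metric."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # ... build the model, train, and evaluate here; this dummy value
    # stands in for whatever metric your eval loop actually produces.
    return torch.rand(()).item()


scores = [run_with_seed(seed) for seed in range(5)]  # 5 seeds; use 10 if you can afford it
print(
    f"mean={np.mean(scores):.4f} "
    f"variance={np.var(scores, ddof=1):.4f} "
    f"std={np.std(scores, ddof=1):.4f}"
)
```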

[–] Seankala@alien.top 1 points 10 months ago (1 children)

The fact that this is actually getting upvoted is really a sign about what's happened to this community.

[–] Seankala@alien.top 1 points 10 months ago

This sounds like a programming problem that's more suitable for a website like Stack Overflow. You're basically asking how you can call the API in a batch using the same prompt. That's not related to machine learning or GPT per se.
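
That said, if the question is really just "how do I hit the API repeatedly with the same prompt," the usual answer is a plain loop (or the async client / Batch API if throughput matters). A rough sketch with the `openai` Python package; the model name and inputs are placeholders, and it assumes `OPENAI_API_KEY` is set in your environment:

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = "Summarize the following text in one sentence."  # the shared prompt
inputs = ["first document ...", "second document ...", "third document ..."]  # placeholders

outputs = []
for text in inputs:
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder; use whichever model you're actually on
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": text},
        ],
    )
    outputs.append(completion.choices[0].message.content)

print(outputs)
```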
