Me right now
Yes
I mark the expected duration of my experiments in both a Google Calendar and a project journal. That way, I know when it's "time" to check in on the runs.
I also use Weights & Biases, which sends me an email if something crashes, so I can check in when I really need to.
Curve watching is just a waste of time. You should train yourself out of the habit, even if it's difficult for you.
Oooh, the calendar idea is a great tip. I’m going to borrow this from you.
If people wrote documentation instead of watching the training progress chart get updated: *flying cars*
lol
I am stealing this, sorry
Curve watching in the initial steps of training is important, but once you get stable behavior, it's time to go to the kitchen, grab some coffee, and do something else with the rest of the day.
No, it isn’t. Has anyone here run more than one experiment at a time? Clearly not, because you can’t curve watch hundreds of runs at the same time. Anything that can be achieved by curve watching can be easily automated.
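As a rough illustration of what "automated" could look like, here is a minimal sketch that classifies a run from its loss history instead of eyeballing the curve. The function name `check_run`, the `patience` and `blowup_factor` parameters, and the thresholds are all made up for this example, not taken from any library.

```python
import math


def check_run(losses, patience=3, blowup_factor=10.0):
    """Classify one run from its list of evaluation losses.

    Returns "diverged" if any loss is NaN/inf or the latest loss is far
    above the best one, "plateaued" if the best loss is at least
    `patience` evaluations old, and "ok" otherwise.
    The thresholds are illustrative assumptions, not recommendations.
    """
    if not losses:
        return "ok"
    if any(not math.isfinite(x) for x in losses):
        return "diverged"
    best = min(losses)
    latest = losses[-1]
    if best > 0 and latest > blowup_factor * best:
        return "diverged"
    evals_since_best = len(losses) - 1 - losses.index(best)
    if evals_since_best >= patience:
        return "plateaued"
    return "ok"


# Hypothetical loss histories for four runs.
runs = {
    "a": [2.0, 1.5, 1.2, 1.0],
    "b": [2.0, 1.5, float("nan")],
    "c": [1.0, 0.5, 0.6, 0.7, 0.65],
    "d": [1.0, 0.5, 20.0],
}
statuses = {name: check_run(history) for name, history in runs.items()}
flagged = {name: s for name, s in statuses.items() if s != "ok"}
```

The same check scales to hundreds of runs, since only the entries in `flagged` need a human to look at them.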
Instead of relying on Weights & Biases sending me an email I’ve implemented alerts being sent to either a Slack or a Discord channel. They are always sent on error with the error message. I also receive a message after a predetermined interval of training with metrics.
Just another idea if someone doesn’t want to rely on W&B
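For anyone who wants to try this, a minimal sketch of the error-alert part using only the standard library. It assumes an incoming webhook URL you have created yourself. Slack incoming webhooks take a JSON body with a `text` field and Discord webhooks take `content`. The helper names (`build_payload`, `post_webhook`, `run_with_alerts`) and the User-Agent string are invented for this example.

```python
import json
import traceback
import urllib.request


def build_payload(message, service="slack"):
    """Slack incoming webhooks expect {"text": ...}; Discord expects {"content": ...}."""
    key = "text" if service == "slack" else "content"
    return {key: message}


def post_webhook(url, payload):
    """POST a JSON payload to a webhook URL and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Arbitrary identifier; some services reject the default urllib agent.
            "User-Agent": "training-alerts/0.1",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def run_with_alerts(train_fn, url, service="slack", send=post_webhook):
    """Run train_fn(); on any exception, send the traceback and re-raise."""
    try:
        return train_fn()
    except Exception:
        # Keep only the tail of the traceback to stay under message length limits.
        tail = traceback.format_exc()[-1500:]
        send(url, build_payload("Training crashed:\n" + tail, service))
        raise
```

The periodic metrics message can reuse `post_webhook` from inside the training loop, for example every N steps. The `send` argument exists so the alert path can be tested without touching the network.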
I use pushover.net to send my phone a push notification when something noteworthy happens. Just drop a little function in my trainer loop. Extremely useful and you can programmatically set the notification message so it'll tell me what my numbers are when I'm at lunch or whatever.
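A minimal sketch of that "little function", assuming you have a Pushover application token and user key. Pushover's message endpoint accepts a form-encoded POST with `token`, `user`, and `message`. The `BestMetricNotifier` class and its "only notify on a new best validation loss" rule are just one made-up definition of "noteworthy".

```python
import urllib.parse
import urllib.request

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def send_pushover(token, user, message, title="training"):
    """Send one push notification through the Pushover message API."""
    data = urllib.parse.urlencode(
        {"token": token, "user": user, "message": message, "title": title}
    ).encode("utf-8")
    with urllib.request.urlopen(PUSHOVER_URL, data=data, timeout=10) as response:
        return response.status


class BestMetricNotifier:
    """Call `send(message)` only when validation loss reaches a new best."""

    def __init__(self, send, min_improvement=0.0):
        self.send = send
        self.min_improvement = min_improvement
        self.best = None

    def update(self, step, val_loss):
        """Return True if a notification was sent for this step."""
        if self.best is None or val_loss < self.best - self.min_improvement:
            self.best = val_loss
            self.send(f"step {step}: new best val_loss {val_loss:.4f}")
            return True
        return False
```

In a trainer loop this would look something like `notifier = BestMetricNotifier(lambda msg: send_pushover(TOKEN, USER, msg))`, followed by `notifier.update(step, val_loss)` after each evaluation, where `TOKEN` and `USER` are your own credentials.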
Yes. Only when I need debugging. Otherwise, checking after a few hours is not too bad. Sometimes I know the code is correct, I just launch it and forget about it. Enjoying a few hours of doing nothing is better for your mental health than staring at the monitor gaining nothing.
- "Checks how his models are training..."
- "Opens Reddit..."
- "Stumbles upon the question 'Do you obsessively watch your models train?'..."
seriously it's literally on my other monitor right now
As a kid I watched the Kazaa/Limewire/torrent downloads obsessively, now it's model backtesting
It’s pretty unhealthy but yes I do that. A model not improving or eventually converging can literally put me in a bad mood.
I’m trying to break myself of the habit, honestly.
Only to make sure no errors get thrown. Otherwise I let them run and I do something else
Some people intensively watch their model trains choo choo!!!
- Start training models
- Go watch model trains
- Come back to error
Watch model train successfully for hours or even days
it will get done any moment now
some trivial change you made crashes the postprocessing
now the file with your results is corrupted
What about Thomas the tank engine
I train when my models train. It's a form of regularisation
L3 regularisation
Not sure why, but this form of regularisation always leads to an underfit model for me :(
Gradient Descent by Grad Student (GDGS) v2.
Everyone does it, at least at first during a new project.
yes lol
I imagine it to be identical to the life of a day trader save for the desired direction of the curve
Some days yes, others I'm so busy I forget and realize the next day
Yep. I do that with plants too. You should definitely consider getting some plants too 😅
So much wasted time during phd... 😂
Besides, the rule is: if the model's training, you're being productive, go to the pub.
Only when messing around with deepracer
I do LoRA training for most of my stuff as of late, so most of my experiments are in minutes, or maybe sometimes hours, not days. So yeah, I tend to leave the terminal visible. I've been experimenting with narrowing the LoRA alpha lately with promising results, so I'm even more glued to the eval/loss vs train/loss so I don't waste hours on a doomed experiment.
It kind of feels like I'm baking bread or something, with a childlike excitement when it's ready.
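One way to spot a doomed run without staring at the terminal is to compare the two curves programmatically. This is a minimal sketch under a simple assumed rule: flag the run when eval loss has risen at every one of the last few evaluations while train loss kept falling. The function name `is_overfitting` and the `window` parameter are made up for illustration.

```python
def is_overfitting(train_losses, eval_losses, window=3):
    """Return True if eval loss rose and train loss fell at each of the
    last `window` evaluation intervals.

    Both lists must hold one value per evaluation, oldest first.
    """
    if len(train_losses) != len(eval_losses):
        raise ValueError("train_losses and eval_losses must have the same length")
    if len(eval_losses) < window + 1:
        return False  # not enough history to decide yet
    recent_train = train_losses[-(window + 1):]
    recent_eval = eval_losses[-(window + 1):]
    eval_rising = all(b > a for a, b in zip(recent_eval, recent_eval[1:]))
    train_falling = all(b < a for a, b in zip(recent_train, recent_train[1:]))
    return eval_rising and train_falling
```

A trainer could call this after each evaluation and stop early when it returns True. The strict "every interval" rule is deliberately conservative, and noisy curves may need smoothing first.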
When I was young, I used to.
Also, I remember having this feeling that if I didn’t have a model training over the weekend, that I was forgetting something. : )
Yes
I read this 'do you obsessively watch your model train' and I was thinking for a sec this was a weird ADHD hobby post or something, meant for people who build model train dioramas and like to sit and watch them more so than add to them. (Not that I'm subscribed to any subreddit like that...).
Probably thought that because I was just thinking about a timer circuit I'm working on in Factorio for my train logistics system, and... I do like seeing the trains run around, when they're not running my inattentive self over at least.
Anyway. Yes, sometimes it's tempting to watch the machine move, whatever it is you're engineering. But that's part of the challenge of being an engineer I guess. Back to building (for my work, haha. The factory must grow, but only after hours).
For Stable Diffusion, not as much. Get the settings right and the training is done in 5 minutes. See the result, alter the settings and go again. Those settings are max possible batch size, previews off and checkpoint saving off. Otherwise training takes 3 times longer or thrashes an SSD if used.
For voice cloning, extensively. As soon as the loss changes or the loss updates stop, it has to be killed. It's worse for newer ones like Style TTS: they have constant VRAM usage up until a random point where it grows without bound.
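Both failure modes here (loss updates stopping, VRAM suddenly growing) can be checked by a small watchdog rather than by a person. A minimal sketch follows. The class `TrainingWatchdog`, its thresholds, and the idea of feeding it VRAM readings from the training loop are assumptions made for this example. With PyTorch, `torch.cuda.memory_allocated()` would be one possible source for the readings.

```python
import time


class TrainingWatchdog:
    """Track loss heartbeats and VRAM readings; say when a run should be killed."""

    def __init__(self, stall_seconds=600.0, vram_growth_factor=1.5, clock=time.monotonic):
        self.stall_seconds = stall_seconds
        self.vram_growth_factor = vram_growth_factor
        self.clock = clock
        self.last_update = clock()
        self.baseline_vram = None
        self.latest_vram = None

    def on_loss(self, loss):
        """Call whenever the trainer reports a loss value."""
        self.last_update = self.clock()

    def on_vram(self, used_bytes):
        """Call with the current VRAM usage; the first reading is the baseline."""
        if self.baseline_vram is None:
            self.baseline_vram = used_bytes
        self.latest_vram = used_bytes

    def should_kill(self):
        """Return "stalled", "vram_growth", or None if the run looks healthy."""
        if self.clock() - self.last_update > self.stall_seconds:
            return "stalled"
        if (
            self.baseline_vram is not None
            and self.latest_vram > self.vram_growth_factor * self.baseline_vram
        ):
            return "vram_growth"
        return None
```

A supervising process could poll `should_kill()` every few seconds and terminate the training process when it returns something other than None.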
Luckily not anymore. At the start of my current project I checked the progress quite often. Now, after a few months of development, I'll just let it train for a few days and then check back in.
I am usually busy alternating between checking Reddit karma metrics and training metrics.
Very guilty pleasure that I know is silly but yah, obsessively.
Only if there's a bug in the code that needs fixing!
Like a cat watching a laser.
I wrote a telegram bot that sends me the loss after each epoch so I can stay informed when I'm not home, lol
No, but I do obsessively watch my model trains.
Oh I also obsessively watch my model predict (whenever I can), but not on the primary screen.
You guys are so cool, I wanna be just like you. Hopefully I'll make it!
What can I say, I like graphs AND progress bars, so if you mix the two ...
Yes, because one time (years ago) our rickety scaling system went off the rails and spent like $10k on idle EC2 instances before our equally rickety monitoring caught it.
Now I understand the importance of subreddits with this question. The same question can be used in various use cases.
I can’t help myself. I remote in and visit my tensorboard way too much.
If the training time is less than an hour, yes.