this post was submitted on 27 Nov 2023

Machine Learning


I find myself watching TensorBoard more than working. Just wondering if others who have fallen into this pattern have words of advice wrt productivity.

[–] LawfulnessOdd5872@alien.top 1 points 9 months ago

Me right now

[–] 3DHydroPrints@alien.top 1 points 9 months ago
[–] timo_kk@alien.top 1 points 9 months ago (7 children)

I mark the expected duration of my experiments in both a Google Calendar and a project journal. That way, I know when it's "time" to check in on the runs.

I also use Weights & Biases, which sends me a mail if something crashes so I can check up when I really need to.

Curve watching is just a waste of time. You should train yourself to get out of the habit, even if it's difficult for you.

[–] deepneuralnetwork@alien.top 1 points 9 months ago

Oooh, the calendar idea is a great tip. I’m going to borrow this from you.

[–] maleits_gavatxos@alien.top 1 points 9 months ago (2 children)

If people wrote documentation instead of watching the training progress chart get updated: *flying cars*

[–] LtFr0st@alien.top 1 points 9 months ago
[–] Weird-Field6128@alien.top 1 points 9 months ago

I am stealing this, sorry

[–] LoyalSol@alien.top 1 points 9 months ago (1 children)

Curve watching in the initial steps of training is important, but once you get stable behavior, it's time to go to the kitchen, grab some coffee, and do something else with the rest of the day.

[–] lumin0va@alien.top 1 points 9 months ago

No, it isn’t. Has anyone here run more than one experiment at a time? Clearly not, because you can’t curve watch hundreds of runs at the same time. Anything that can be achieved by curve watching can be easily automated.
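
A rough sketch of what automating the curve check could look like, assuming a strictly positive scalar loss reported once per evaluation. The class name `CurveWatchdog` and its thresholds are made up for illustration and do not come from any library:

```python
import math
from collections import deque


class CurveWatchdog:
    """Flags the things people usually look for when staring at a loss curve.

    Hypothetical helper; assumes the loss is strictly positive.
    """

    def __init__(self, window=5, min_delta=1e-3, blowup_factor=3.0):
        self.window = window                # how many recent evaluations to compare
        self.min_delta = min_delta          # smallest drop that counts as progress
        self.blowup_factor = blowup_factor  # loss above factor * best counts as diverged
        self.best = math.inf
        self.recent = deque(maxlen=window)

    def update(self, loss):
        """Return 'nan', 'diverged', 'plateau', or 'ok' for the newest loss value."""
        if math.isnan(loss) or math.isinf(loss):
            return "nan"
        if self.best < math.inf and loss > self.blowup_factor * self.best:
            return "diverged"
        self.recent.append(loss)
        self.best = min(self.best, loss)
        # Plateau: across a full window, the lowest value is barely below the oldest one.
        if len(self.recent) == self.window and self.recent[0] - min(self.recent) < self.min_delta:
            return "plateau"
        return "ok"
```

Calling `update` from the evaluation loop and flagging or killing the run on anything other than "ok" works the same for one run or for hundreds.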

[–] Zemeniite@alien.top 1 points 9 months ago

Instead of relying on Weights & Biases sending me an email I’ve implemented alerts being sent to either a Slack or a Discord channel. They are always sent on error with the error message. I also receive a message after a predetermined interval of training with metrics.

Just another idea if someone doesn’t want to rely on W&B
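
For anyone curious, a minimal sketch of this kind of setup using only the standard library. It assumes an incoming-webhook URL for Slack or Discord. `train_with_alerts` and the `train_steps` iterable of `(step, metrics)` pairs are hypothetical names, not part of either service:

```python
import json
import traceback
import urllib.request


def build_payload(service, text):
    """Slack incoming webhooks read the "text" field; Discord webhooks read "content"."""
    key = "text" if service == "slack" else "content"
    return json.dumps({key: text}).encode("utf-8")


def post_to_webhook(url, service, text):
    """POST a plain-text message to an incoming webhook URL."""
    request = urllib.request.Request(
        url,
        data=build_payload(service, text),
        headers={
            "Content-Type": "application/json",
            # Assumption: an explicit agent string, since some services
            # may reject urllib's default one.
            "User-Agent": "training-alerts/0.1",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def train_with_alerts(train_steps, send, report_every=100):
    """Drive a training loop, reporting metrics periodically and errors always.

    train_steps: hypothetical iterable yielding (step, metrics_dict) pairs.
    send: any callable that takes a message string.
    """
    try:
        for step, metrics in train_steps:
            if step % report_every == 0:
                summary = ", ".join(f"{k}={v:.4f}" for k, v in metrics.items())
                send(f"step {step}: {summary}")
    except Exception:
        send("training crashed:\n" + traceback.format_exc())
        raise  # re-raise so the job still fails visibly
    send("training finished")
```

In practice `send` would be something like `lambda msg: post_to_webhook(WEBHOOK_URL, "discord", msg)`, where `WEBHOOK_URL` is your own webhook address.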

[–] 0ctobogs@alien.top 1 points 9 months ago (1 children)

I use pushover.net to send my phone a push notification when something noteworthy happens. Just drop a little function in my trainer loop. Extremely useful and you can programmatically set the notification message so it'll tell me what my numbers are when I'm at lunch or whatever.
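
A small sketch of what such a function might look like, assuming a Pushover application token and user key. `BestSoFar` is a made-up stand-in for whatever "noteworthy" means in your loop; here it is a new best validation loss:

```python
import urllib.parse
import urllib.request

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_push(app_token, user_key, message, title="training"):
    """Build the form-encoded POST for Pushover's message endpoint."""
    form = urllib.parse.urlencode(
        {"token": app_token, "user": user_key, "message": message, "title": title}
    ).encode("utf-8")
    return urllib.request.Request(PUSHOVER_URL, data=form)


def push(app_token, user_key, message):
    """Send the notification and return the HTTP status code."""
    with urllib.request.urlopen(build_push(app_token, user_key, message), timeout=10) as r:
        return r.status


class BestSoFar:
    """Hypothetical trigger: produce a message only on a new best validation loss."""

    def __init__(self):
        self.best = float("inf")

    def check(self, epoch, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            return f"epoch {epoch}: new best val_loss {val_loss:.4f}"
        return None  # nothing noteworthy, stay quiet
```

Inside the trainer loop this comes down to `msg = tracker.check(epoch, val_loss)` followed by `push(...)` whenever `msg` is not `None`.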

[–] SnooHesitations8849@alien.top 1 points 9 months ago

Yes. Only when I need debugging. Otherwise, checking after a few hours is not too bad. Sometimes I know the code is correct, I just launch it and forget about it. Enjoying a few hours of doing nothing is better for your mental health than staring at the monitor gaining nothing.

[–] Apathiq@alien.top 1 points 9 months ago (1 children)

- "Checks how his models are training..."
- "Opens Reddit..."
- "Stumbles upon the question 'Do you obsessively watch your models train?'..."

[–] swarmed100@alien.top 1 points 9 months ago

seriously it's literally on my other monitor right now

As a kid I watched the Kazaa/Limewire/torrent downloads obsessively, now it's model backtesting

[–] deepneuralnetwork@alien.top 1 points 9 months ago

It’s pretty unhealthy but yes I do that. A model not improving or eventually converging can literally put me in a bad mood.

I’m trying to break myself of the habit, honestly.

[–] Material_Policy6327@alien.top 1 points 9 months ago

Only to make sure no errors get thrown. Otherwise I let them run and I do something else

[–] Jack_Torcello@alien.top 1 points 9 months ago

Some people intensively watch their model trains choo choo!!!

[–] mr_birkenblatt@alien.top 1 points 9 months ago (2 children)
  1. Start training models

  2. Go watch model trains

  3. Come back to error

[–] swarmed100@alien.top 1 points 9 months ago

Watch model train successfully for hours or even days

it will get done any moment now

some trivial change you made crashes the postprocessing

now the file with your results is corrupted

[–] kgmeister@alien.top 1 points 9 months ago

What about Thomas the tank engine

[–] keepthepace@alien.top 1 points 9 months ago
[–] matigekunst@alien.top 1 points 9 months ago (3 children)

I train when my models train. It's a form of regularisation

[–] xignaceh@alien.top 1 points 9 months ago

L3 regularisation

[–] swarmed100@alien.top 1 points 9 months ago

Not sure why, but this form of regularisation always leads to an underfit model for me :(

[–] muntoo@alien.top 1 points 9 months ago

Gradient Descent by Grad Student (GDGS) v2.

[–] jucestain@alien.top 1 points 9 months ago

Everyone does it, at least at first during a new project.

[–] Witty-Elk2052@alien.top 1 points 9 months ago

yes lol

I imagine it to be identical to the life of a day trader save for the desired direction of the curve

[–] KyxeMusic@alien.top 1 points 9 months ago

Some days yes, others I'm so busy I forget and realize the next day

[–] ProgramPrimary2861@alien.top 1 points 9 months ago

Yep. I do that with plants too. You should definitely consider getting some plants too 😅

[–] the_warpaul@alien.top 1 points 9 months ago

So much wasted time during phd... 😂

Besides, the rule is: if the model's training, you're being productive, go to the pub.

[–] AGINSB@alien.top 1 points 9 months ago

Only when messing around with deepracer

[–] rwl4z@alien.top 1 points 9 months ago

I do LoRA training for most of my stuff as of late, so most of my experiments are in minutes, or maybe sometimes hours, not days. So yeah, I tend to leave the terminal visible. I've been experimenting with narrowing the LoRA alpha lately with promising results, so I'm even more glued to the eval/loss vs train/loss so I don't waste hours on a doomed experiment.

It kind of feels like I'm baking bread or something, with a childlike excitement when it's ready.
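
The eval/loss versus train/loss check described above could in principle be handed off to a few lines of code. This is only a sketch under one assumed definition of "doomed" (eval loss rising at every recent evaluation while train loss keeps falling); `is_doomed` and `patience` are illustrative names, not something from a LoRA library:

```python
def is_doomed(train_losses, eval_losses, patience=3):
    """True if eval loss rose at each of the last `patience` evaluations
    while train loss fell over the same evaluations."""
    if len(train_losses) != len(eval_losses):
        raise ValueError("expected one train loss per eval loss")
    if len(eval_losses) <= patience:
        return False  # not enough history to judge yet
    recent_eval = eval_losses[-(patience + 1):]
    recent_train = train_losses[-(patience + 1):]
    eval_rising = all(b > a for a, b in zip(recent_eval, recent_eval[1:]))
    train_falling = all(b < a for a, b in zip(recent_train, recent_train[1:]))
    return eval_rising and train_falling
```

With runs that last minutes rather than days, a check like this after every evaluation costs next to nothing, though it will not replace the excitement of watching the bread rise.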

[–] neanderthal_math@alien.top 1 points 9 months ago

When I was young, I used to.

Also, I remember having this feeling that if I didn’t have a model training over the weekend, that I was forgetting something. : )

[–] Tomsen1410@alien.top 1 points 9 months ago
[–] adventuringraw@alien.top 1 points 9 months ago

I read this 'do you obsessively watch your model train' and I was thinking for a sec this was a weird ADHD hobby post or something, meant for people who build model train dioramas and like to sit and watch them more so than add to them. (Not that I'm subscribed to any subreddit like that...).

Probably thought that because I was just thinking about a timer circuit I'm working on in Factorio for my train logistics system, and... I do like seeing the trains run around, when they're not running my inattentive self over at least.

Anyway. Yes, sometimes it's tempting to watch the machine move, whatever it is you're engineering. But that's part of the challenge of being an engineer I guess. Back to building (for my work, haha. The factory must grow, but only after hours).

[–] PrimaCora@alien.top 1 points 9 months ago

For Stable Diffusion, not as much. Get the settings right and the training is done in 5 minutes. See the result, alter the settings and go again. Those settings are the maximum possible batch size, previews off, and checkpoint saving off. Otherwise training takes 3 times longer or thrashes the SSD, if one is used.

For voice cloning, extensively. As soon as the loss changes or the loss updates stop it has to be killed. Worse is for newer ones like Style TTS, they have a constant VRAM usage up until a random point where it grows infinitely.

[–] dotpoint7@alien.top 1 points 9 months ago

Luckily not anymore. At the start of my current project I checked the progress quite often. Now, after a few months of development, I'll just let it train for a few days and then check back in.

[–] anything_but@alien.top 1 points 9 months ago

I am usually busy alternating between checking Reddit karma metrics and training metrics.

[–] PicaPaoDiablo@alien.top 1 points 9 months ago

Very guilty pleasure that I know is silly but yah, obsessively.

[–] PeteyMax@alien.top 1 points 9 months ago

Only if there's a bug in the code that needs fixing!

[–] TheInfelicitousDandy@alien.top 1 points 9 months ago

Like a cat watching a laser.

[–] PMMEYOURSMIL3@alien.top 1 points 9 months ago

I wrote a Telegram bot that sends me the loss after each epoch so I can stay informed when I'm not home, lol
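
Roughly what the sending side of such a bot can look like, using the Telegram Bot API's `sendMessage` method. This is a guess at the general shape, not the actual bot. The token and chat ID come from your own bot setup, and the function names are made up:

```python
import json
import urllib.request


def build_telegram_message(bot_token, chat_id, text):
    """Build a sendMessage request for the Telegram Bot API."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )


def report_epoch(bot_token, chat_id, epoch, loss):
    """Send one 'epoch N: loss X' line; call this at the end of each epoch."""
    request = build_telegram_message(
        bot_token, chat_id, f"epoch {epoch}: loss {loss:.4f}"
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```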

[–] Unlikely-Loan-4175@alien.top 1 points 9 months ago

No, but I do obsessively watch my model trains.

[–] AllowFreeSpeech@alien.top 1 points 9 months ago

Oh I also obsessively watch my model predict (whenever I can), but not on the primary screen.

[–] Dar7oo@alien.top 1 points 9 months ago

You guys are so cool, I wanna be just like you. Hopefully I'll make it!

[–] pure_whey@alien.top 1 points 9 months ago

What can I say, I like graphs AND progress bars, so if you mix the two ...

[–] the__storm@alien.top 1 points 9 months ago

Yes, because one time (years ago) our rickety scaling system went off the rails and spent like $10k on idle EC2 instances before our equally rickety monitoring caught it.

[–] random_Byzantium@alien.top 1 points 9 months ago

Now I understand the importance of subreddits with this question. The same question can be used in various use cases.

[–] Lalalyly@alien.top 1 points 9 months ago

I can’t help myself. I remote in and visit my tensorboard way too much.

[–] crazymonezyy@alien.top 1 points 9 months ago

If the training time is less than an hour, yes.
