this post was submitted on 26 Nov 2023
Machine Learning
Would it be possible to set up a system where every training run uses a fixed seed and records its exact state, and then share that information along with the dataset the model was trained on, so that others can reproduce the training? This could help manage the randomness in training.
Using a fixed seed means the model's initialization and the random choices made during training come out the same every time. In principle, if we restart training from a given point with the same seed, the model should learn the same way it did before. And by saving and sharing details like the model's architecture, the training stage and step it has reached, and the seed, we are essentially taking a 'snapshot' of the model at that moment.
Others could use this snapshot to resume training exactly where it left off, under the same conditions. For merging different models, this technique could help align how they learn, making it easier and more predictable to combine their training.
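To make the snapshot idea concrete, here is a toy sketch in plain Python. It is not a real training loop: the "model" is a single float, the update rule is made up, and the names `new_run` and `train` are hypothetical. The point it illustrates is that the snapshot has to carry the random generator's state, not just the original seed, for a resumed run to match an uninterrupted one.

```python
import pickle
import random

def new_run(seed):
    """Start a run: the seed fixes both the initial weight and all later noise."""
    rng = random.Random(seed)
    return {"step": 0, "weight": rng.uniform(-1.0, 1.0), "rng_state": rng.getstate()}

def train(snapshot, steps):
    """Continue from a snapshot for `steps` noisy updates (toy stand-in for SGD)."""
    rng = random.Random()
    rng.setstate(snapshot["rng_state"])  # restore the exact RNG position
    weight = snapshot["weight"]
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)      # stand-in for minibatch randomness
        weight -= 0.1 * (weight - 3.0 + noise)
    return {"step": snapshot["step"] + steps, "weight": weight, "rng_state": rng.getstate()}

# One uninterrupted run of 10 steps.
full = train(new_run(seed=1234), steps=10)

# The same run, stopped at step 4, serialized, "shared", and resumed by someone else.
partial = train(new_run(seed=1234), steps=4)
blob = pickle.dumps(partial)             # what you would actually share
resumed = train(pickle.loads(blob), steps=6)

print(resumed["step"], resumed["weight"] == full["weight"])
```

In a real framework the snapshot would also need things like the optimizer state and the position in the data order, and some GPU operations are nondeterministic unless deterministic modes are enabled, so a seed on its own is usually not enough there.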
Am I thinking about this correctly, or am I missing something? This is purely theoretical, and I am not an expert on the subject.
You could use fixed seeds and checkpoints to train a single model serially, handing it from one participant to the next. I don’t know how you could “merge” different models that were trained independently. I think the challenge here is the merging, not necessarily the determinism.
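A small sketch of why merging is the hard part, under the assumption that "merge" means the most naive option, averaging the weights. The two hand-built toy networks below (all names hypothetical) compute exactly the same function, |x|, but with their hidden units in a different order. Averaging their weights gives a network that computes neither.

```python
def relu(z):
    return max(0.0, z)

def predict(model, x):
    """Tiny one-hidden-layer net: sum over hidden units of v * relu(w * x)."""
    return sum(v * relu(w * x) for w, v in zip(model["w"], model["v"]))

def average(a, b):
    """Naive merge: element-wise mean of every parameter."""
    return {key: [(p + q) / 2 for p, q in zip(a[key], b[key])] for key in a}

# Both models compute |x| = relu(x) + relu(-x); model_b just has its hidden units swapped.
model_a = {"w": [1.0, -1.0], "v": [1.0, 1.0]}
model_b = {"w": [-1.0, 1.0], "v": [1.0, 1.0]}
merged = average(model_a, model_b)

for x in (3.0, -2.0):
    print(x, predict(model_a, x), predict(model_b, x), predict(merged, x))
```

Here the averaged input weights cancel to zero, so the merged network outputs 0 for every input even though both parents were correct. Independently trained models have no reason to agree on unit ordering, which is one reason a shared seed or checkpoint helps with reproducibility but does not by itself make merging work.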