Sounds like you want a mixture of experts for the first model, with the categorical distribution a function of the second model's output. You can put this together straightforwardly in TF/PyTorch/whatever, but it will be lower-level to implement (i.e. if you're hoping there's an off-the-shelf Keras layer for this, there probably isn't one).
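To make the idea concrete, here is a minimal NumPy sketch of the arithmetic (all names and shapes are illustrative, not from any library): each expert is a linear layer, and the gate turns the second model's output into a categorical distribution that weights the experts.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d_in, d_out = 4, 3, 2

# One weight matrix per expert (hypothetical shapes).
expert_W = rng.normal(size=(n_experts, d_in, d_out))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mixture(x, gate_logits):
    # x: (batch, d_in); gate_logits: (batch, n_experts), i.e. the second
    # model's output turned into a categorical distribution over experts.
    gates = softmax(gate_logits)
    # Run every expert on every example: (batch, n_experts, d_out).
    expert_out = np.einsum('bi,eio->beo', x, expert_W)
    # Gate-weighted sum over experts: (batch, d_out).
    return np.einsum('be,beo->bo', gates, expert_out)

x = rng.normal(size=(5, d_in))
gate_logits = rng.normal(size=(5, n_experts))
y = mixture(x, gate_logits)
print(y.shape)  # (5, 2)
```

With a (near) one-hot gate this reduces to picking a single expert, which is the hard-routing case discussed below.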
this post was submitted on 28 Nov 2023
Machine Learning
I'm not sure what you mean, to be honest. I have a model where the first part is a submodel that is then used later on in the main model. Since there are different behaviour patterns depending on the category (the second input), I was wondering if it would be possible to have, say, 4 submodels corresponding to 4 classes, and use the corresponding one both during training (when updating weights) and during inference.
something like

value = Input()
categ = Input()
dense0, dense1 = Dense(5), Dense(5)
if categ == 0:
    first_layer = dense0
else:
    first_layer = dense1
# use first_layer accordingly...
Looks good. You probably want something like:

tf.cond(tf.equal(categ, 0), lambda: dense0(inputs), lambda: dense1(inputs))

(note that tf.cond takes zero-argument callables for the branches, not already-computed tensors).
I've got 8 categories and a lot of data, so the problem with this is that it becomes really slow, because it expects the inputs to have been passed through every branch already.
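One way around evaluating every branch is to partition the batch by category so each submodel only sees its own examples (the analogue of tf.dynamic_partition / tf.dynamic_stitch). A NumPy sketch of the idea, with illustrative names and shapes:

```python
import numpy as np

rng = np.random.default_rng(2)
n_classes, d_in, d_out = 8, 3, 2

# One linear submodel per class (hypothetical parameters).
W = [rng.normal(size=(d_in, d_out)) for _ in range(n_classes)]

def partitioned_forward(x, categ):
    # x: (batch, d_in); categ: (batch,) ints in [0, n_classes).
    out = np.empty((x.shape[0], d_out))
    for k in range(n_classes):
        mask = categ == k
        if mask.any():
            out[mask] = x[mask] @ W[k]  # submodel k runs only on its own slice
    return out

x = rng.normal(size=(10, d_in))
categ = rng.integers(0, n_classes, size=10)
out = partitioned_forward(x, categ)
print(out.shape)  # (10, 2)
```

Each submodel's forward (and backward) pass then scales with its own share of the batch rather than the full batch times the number of categories.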