Machine Learning

[R] Understanding the loss function of Diffusion Probabilistic Models vs Denoising Diffusion Probabilistic Models (alien.top)

submitted 11 months ago by juanlucas2@alien.top to c/machinelearning@academy.garden

I'm currently trying to wrap my head around the training loss functions for diffusion probabilistic models (DPMs) and how they differ from those of denoising diffusion probabilistic models (DDPMs). The papers describe the processes differently, which makes the comparison difficult.

The seminal paper for diffusion models, "Deep Unsupervised Learning using Nonequilibrium Thermodynamics," states that "Training amounts to maximizing the model log likelihood," which ultimately leads to equation 14, given below:

[Image: DPMs loss function]

I understand what is happening in this equation without too much issue.

However, later papers that build on this work give quite different equations. In particular, the paper "Denoising Diffusion Probabilistic Models" gives the following equation:

[Image: DDPMs loss function]

I understand that the first equation and this one likely represent an extremely similar process (the second equation is also related to the log likelihood, after all). However, I don't understand why the two representations are so different, or how to interpret the second equation.

Compounding my confusion, Appendix A of "Denoising Diffusion Probabilistic Models" works through a derivation of the L_{VLB} equation they give and attributes the process to the "Deep Unsupervised Learning using Nonequilibrium Thermodynamics" paper, but I was not able to find it there.
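For reference, the kind of derivation mentioned above can be sketched as follows. This is a sketch under assumptions, not a transcription of either paper: it assumes the second equation is the variational bound on the negative log likelihood, a Markov forward process q, and a learned reverse process p_\theta, with the notation x_{0:T} chosen here for illustration.

```latex
\begin{align}
\mathbb{E}\left[-\log p_\theta(x_0)\right]
  &\le \mathbb{E}_q\!\left[-\log \frac{p_\theta(x_{0:T})}{q(x_{1:T}\mid x_0)}\right] =: L \\
L &= \mathbb{E}_q\!\left[-\log p(x_T)
     - \sum_{t>1} \log \frac{p_\theta(x_{t-1}\mid x_t)}{q(x_t\mid x_{t-1})}
     - \log \frac{p_\theta(x_0\mid x_1)}{q(x_1\mid x_0)}\right] \\
  % Bayes' rule on the forward step, conditioned on x_0:
  % q(x_t | x_{t-1}) = q(x_{t-1} | x_t, x_0) q(x_t | x_0) / q(x_{t-1} | x_0)
  &= \mathbb{E}_q\!\left[-\log \frac{p(x_T)}{q(x_T\mid x_0)}
     - \sum_{t>1} \log \frac{p_\theta(x_{t-1}\mid x_t)}{q(x_{t-1}\mid x_t, x_0)}
     - \log p_\theta(x_0\mid x_1)\right] \\
  &= \mathbb{E}_q\!\left[ D_{\mathrm{KL}}\!\left(q(x_T\mid x_0)\,\|\,p(x_T)\right)
     + \sum_{t>1} D_{\mathrm{KL}}\!\left(q(x_{t-1}\mid x_t, x_0)\,\|\,p_\theta(x_{t-1}\mid x_t)\right)
     - \log p_\theta(x_0\mid x_1)\right]
\end{align}
```

The third line uses the telescoping sum \sum_{t>1} \log [q(x_{t-1}\mid x_0) / q(x_t\mid x_0)] = \log [q(x_1\mid x_0) / q(x_T\mid x_0)], which cancels the q(x_1\mid x_0) factor in the last term. Read this way, a lower bound on the log likelihood (to be maximized) and an upper bound on the negative log likelihood (to be minimized) would be the same quantity up to sign, and the two papers would appear to differ mainly in how the first and last steps of the chain are written.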

Which of these equations is the "loss function" for diffusion models? Are they different representations of the same equation? If not, what are they and why are they important? If they are, how can I understand the second equation?
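One way to probe the "same equation?" question is a numerical check on a toy problem. The sketch below is not taken from either paper: it assumes a discrete chain with only two diffusion steps and randomly generated probability tables, and every name in it (`toy_bounds`, `random_stochastic`, `q1`, `p0`, and so on) is hypothetical. It evaluates the bound once as an expectation of log ratios and once as a sum of KL terms, and compares both with the exact negative log likelihood.

```python
import numpy as np


def random_stochastic(rng, rows, cols):
    """Random table whose rows are strictly positive probability vectors."""
    m = rng.random((rows, cols)) + 0.1
    return m / m.sum(axis=1, keepdims=True)


def toy_bounds(seed=0, k=3, x0=0):
    """Return (ratio-form bound, KL-form bound, exact NLL) for a 2-step toy chain."""
    rng = np.random.default_rng(seed)
    q1 = random_stochastic(rng, k, k)     # q1[a, b] = q(x1=b | x0=a), forward step 1
    q2 = random_stochastic(rng, k, k)     # q2[b, c] = q(x2=c | x1=b), forward step 2
    p2 = random_stochastic(rng, 1, k)[0]  # p2[c]    = p(x2=c), prior at the last step
    p1 = random_stochastic(rng, k, k)     # p1[c, b] = p(x1=b | x2=c), reverse step
    p0 = random_stochastic(rng, k, k)     # p0[b, a] = p(x0=a | x1=b), reverse step

    # joint_q[b, c] = q(x1=b, x2=c | x0)
    joint_q = q1[x0][:, None] * q2

    # Form 1: expectation under q of -log p(x0, x1, x2) / q(x1, x2 | x0).
    neg_log_ratio = (
        -np.log(p2)[None, :]
        - np.log(p1.T / q2)
        - np.log(p0[:, x0] / q1[x0])[:, None]
    )
    loss_ratio = (joint_q * neg_log_ratio).sum()

    # Form 2: prior KL + posterior KL + reconstruction term.
    q2_given_0 = joint_q.sum(axis=0)            # q(x2 | x0)
    posterior = joint_q / q2_given_0[None, :]   # posterior[b, c] = q(x1=b | x2=c, x0)
    kl_prior = (q2_given_0 * np.log(q2_given_0 / p2)).sum()
    kl_step = (joint_q * np.log(posterior / p1.T)).sum()
    reconstruction = -(q1[x0] * np.log(p0[:, x0])).sum()
    loss_kl = kl_prior + kl_step + reconstruction

    # Exact negative log likelihood: marginalize x1 and x2 out of the model.
    p_x0 = (p2[:, None] * p1 * p0[:, x0][None, :]).sum()
    nll = -np.log(p_x0)
    return loss_ratio, loss_kl, nll


if __name__ == "__main__":
    loss_ratio, loss_kl, nll = toy_bounds()
    print("ratio form:", loss_ratio)
    print("KL form:   ", loss_kl)
    print("exact NLL: ", nll)
```

In this toy setting the two forms agree to floating-point precision and both sit at or above the exact negative log likelihood, which is consistent with reading them as two ways of writing one variational bound. It says nothing about the continuous Gaussian case beyond illustrating the algebra.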

[–] CatalyzeX_code_bot@alien.top 1 points 11 months ago

Found 2 relevant code implementations for "Deep Unsupervised Learning using Nonequilibrium Thermodynamics".

Ask the author(s) a question about the paper or code.

If you have code to share with the community, please add it here 😊🙏

Found 6 relevant code implementations for "Denoising Diffusion Probabilistic Models".

To opt out from receiving code links, DM me.