this post was submitted on 24 Nov 2023

I'm currently trying to wrap my head around the training loss functions for diffusion probabilistic models (DPMs) and how they differ from those for denoising diffusion probabilistic models (DDPMs). The papers describe the training objective in different ways, which makes the relationship between them hard to follow.

The seminal paper for diffusion models, "Deep Unsupervised Learning using Nonequilibrium Thermodynamics," states that "Training amounts to maximizing the model log likelihood," which ultimately leads to equation 14, given below:

[Image: DPMs loss function (equation 14 of the paper)]

I understand what is happening in this equation without too much issue.

However, later papers that build on this work give quite different equations. In particular, "Denoising Diffusion Probabilistic Models" gives the following:

[Image: DDPMs loss function]

I understand that the two equations likely describe very similar processes (the second is also related to the log likelihood, after all). What I don't understand is why the two representations look so different, or how to interpret the second one.

Compounding my confusion, appendix A of "Denoising Diffusion Probabilistic Models" works through a derivation of the L_{VLB} equation and attributes it to the "Deep Unsupervised Learning using Nonequilibrium Thermodynamics" paper, but I was not able to find that derivation there.
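
For reference, here is a sketch of the appendix A derivation as I read it, written in the DDPM paper's notation. It assumes the second equation above is the KL-decomposed bound (equation 5 in the DDPM paper) and that the forward process q is Markov. The key step rewrites each forward transition with Bayes' rule, q(x_t | x_{t-1}) = q(x_{t-1} | x_t, x_0) q(x_t | x_0) / q(x_{t-1} | x_0), after which the ratios of marginals telescope. The last display is my transcription of equation 14 from the 2015 paper and should be checked against the original.

```latex
\begin{align*}
\mathbb{E}\left[-\log p_\theta(x_0)\right]
 &\le \mathbb{E}_q\!\left[-\log p(x_T)
      - \sum_{t\ge 1}\log\frac{p_\theta(x_{t-1}\mid x_t)}{q(x_t\mid x_{t-1})}\right] =: L \\
 &= \mathbb{E}_q\!\left[-\log p(x_T)
      - \sum_{t>1}\log\left(\frac{p_\theta(x_{t-1}\mid x_t)}{q(x_{t-1}\mid x_t,x_0)}
        \cdot\frac{q(x_{t-1}\mid x_0)}{q(x_t\mid x_0)}\right)
      - \log\frac{p_\theta(x_0\mid x_1)}{q(x_1\mid x_0)}\right] \\
 &= \mathbb{E}_q\!\left[-\log\frac{p(x_T)}{q(x_T\mid x_0)}
      - \sum_{t>1}\log\frac{p_\theta(x_{t-1}\mid x_t)}{q(x_{t-1}\mid x_t,x_0)}
      - \log p_\theta(x_0\mid x_1)\right] \\
 &= \mathbb{E}_q\!\Big[D_{\mathrm{KL}}\big(q(x_T\mid x_0)\,\|\,p(x_T)\big)
      + \sum_{t>1} D_{\mathrm{KL}}\big(q(x_{t-1}\mid x_t,x_0)\,\|\,p_\theta(x_{t-1}\mid x_t)\big)
      - \log p_\theta(x_0\mid x_1)\Big]
\end{align*}

% Equation 14 of the 2015 paper (transcribed, please verify), a LOWER bound K on the log likelihood:
\begin{align*}
K = &-\sum_{t=2}^{T}\int dx^{(0)}\,dx^{(t)}\; q\big(x^{(0)},x^{(t)}\big)\,
      D_{\mathrm{KL}}\Big(q\big(x^{(t-1)}\mid x^{(t)},x^{(0)}\big)\,\Big\|\,p\big(x^{(t-1)}\mid x^{(t)}\big)\Big) \\
    &+ H_q\big(X^{(T)}\mid X^{(0)}\big) - H_q\big(X^{(1)}\mid X^{(0)}\big) - H_p\big(X^{(T)}\big)
\end{align*}
```

Read this way, the sum of KL terms over t = 2, ..., T appears in both papers with opposite sign: the 2015 paper maximizes a lower bound K on the log likelihood, while the DDPM paper minimizes an upper bound L on the negative log likelihood. The two seem to differ mainly in how the first and last steps are handled (entropy terms in the 2015 paper, the prior-matching KL term and the -log p_θ(x_0 | x_1) term in DDPM).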

Which of these equations is the "loss function" for diffusion models? Are they different representations of the same equation? If not, what are they and why are they important? If they are, how can I understand the second equation?
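
One way to probe the "same equation?" question numerically is a toy check. The sketch below is not taken from either paper: it assumes a tiny discrete state space (3 states, 3 steps) with random transition tables standing in for the forward process q and the learned reverse process p, and all names (`Q`, `P`, `pT`, `bound_path_form`, `bound_kl_form`, `exact_nll`) are hypothetical. It computes the variational bound once as an expectation over whole trajectories and once as the sum of KL terms, then compares both with the exact negative log likelihood.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
S, T = 3, 3  # toy sizes (assumption): number of states, number of diffusion steps

def random_stochastic(rows, cols):
    """Random table with strictly positive rows that each sum to 1."""
    m = rng.random((rows, cols)) + 0.1
    return m / m.sum(axis=1, keepdims=True)

q0 = random_stochastic(1, S)[0]                            # data distribution q(x_0)
Q = [None] + [random_stochastic(S, S) for _ in range(T)]   # Q[t][a, b] = q(x_t=b | x_{t-1}=a)
P = [None] + [random_stochastic(S, S) for _ in range(T)]   # P[t][b, a] = p(x_{t-1}=a | x_t=b)
pT = random_stochastic(1, S)[0]                            # prior p(x_T)

def bound_path_form():
    """E_q[-log p(x_T) - sum_t log p(x_{t-1}|x_t) / q(x_t|x_{t-1})], by enumeration."""
    total = 0.0
    for x in itertools.product(range(S), repeat=T + 1):
        prob = q0[x[0]]
        val = -np.log(pT[x[T]])
        for t in range(1, T + 1):
            prob *= Q[t][x[t - 1], x[t]]
            val -= np.log(P[t][x[t], x[t - 1]] / Q[t][x[t - 1], x[t]])
        total += prob * val
    return float(total)

def kl(a, b):
    """KL divergence between two strictly positive discrete distributions."""
    return float(np.sum(a * np.log(a / b)))

def bound_kl_form():
    """E_q[KL(q(x_T|x_0)||p(x_T)) + sum_{t>1} KL(posterior||p) - log p(x_0|x_1)]."""
    total = 0.0
    for x0 in range(S):
        marg = [np.eye(S)[x0]]                 # marg[t] = q(x_t | x_0)
        for t in range(1, T + 1):
            marg.append(marg[-1] @ Q[t])
        val = kl(marg[T], pT)                  # prior-matching term
        for t in range(2, T + 1):
            for xt in range(S):
                # Bayes' rule: q(x_{t-1} | x_t, x_0)
                post = marg[t - 1] * Q[t][:, xt] / marg[t][xt]
                val += marg[t][xt] * kl(post, P[t][xt])
        val -= float(np.sum(marg[1] * np.log(P[1][:, x0])))  # reconstruction term
        total += q0[x0] * val
    return float(total)

def exact_nll():
    """E_{q(x_0)}[-log p(x_0)], marginalising the reverse chain exactly."""
    p = pT
    for t in range(T, 0, -1):
        p = p @ P[t]
    return float(-np.sum(q0 * np.log(p)))

if __name__ == "__main__":
    print("path form:", bound_path_form())
    print("KL form:  ", bound_kl_form())
    print("exact NLL:", exact_nll())
```

If the algebra is right, the first two numbers should agree to floating-point precision and both should be at least as large as the exact negative log likelihood, which is what "upper bound on the negative log likelihood" means in this setting.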

[–] CatalyzeX_code_bot@alien.top 1 points 11 months ago

Found 2 relevant code implementations for "Deep Unsupervised Learning using Nonequilibrium Thermodynamics".

Ask the author(s) a question about the paper or code.

If you have code to share with the community, please add it here 😊🙏

--

Found 6 relevant code implementations for "Denoising Diffusion Probabilistic Models".

--

To opt out from receiving code links, DM me.