I am curious about how you determine the values of $P_{\text{mean}}$ and $P_{\text{std}}$ as per the "Loss weighting" paragraph in Section B.2, where it states "Pmean = −0.4 and Pstd = 1.0 instead of −1.2 and 1.2".
Is there an underlying intuition or is it based on a pure hp search?
It seems they provide a rough intuition in the same paragraph:
> Loss weighting. With the EDM training loss (Equation 14), the quality of the resulting distribution tends to be quite sensitive to the choice of Pmean, Pstd, and λ(σ). The role of Pmean and Pstd is to focus the training effort on the most important noise levels, whereas λ(σ) aims to ensure that the gradients originating from each noise level are roughly of the same magnitude. Referring to Figure 5a of Karras et al. [37], the value of L(Dθ; σ) behaves somewhat unevenly over the course of training: It remains largely unchanged for the lowest and highest noise levels, but drops quickly for the ones in between. Karras et al. [37] suggest setting Pmean and Pstd so that the resulting log-normal distribution (Equation 16) roughly matches the location of this in-between region. When operating with VAE latents, we have observed that the in-between region has shifted considerably toward higher noise levels compared to RGB images. We thus set Pmean = −0.4 and Pstd = 1.0 instead of −1.2 and 1.2, respectively, to roughly match its location.
If you refer to Figure 5a in the original Elucidating paper, it looks like they try to choose Pmean and Pstd so that the resulting log-normal distribution covers the noise levels where the loss "drops quickly for the [noise levels] in between" the highest and lowest. I.e., we want to spend more of the training effort in this middle region.
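I read this as: you first need a trained model, or at least per-noise-level loss curves from a training run, and then you select the P parameters so that the log-normal roughly matches the in-between region.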
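To make the shift concrete, here is a minimal sketch of what the two settings imply for the sampled noise levels. It only assumes that training noise levels are drawn as ln(σ) ~ N(Pmean, Pstd²), as in the EDM formulation; the helper names `sample_sigma` and `sigma_interval` are made up for illustration and are not taken from either codebase.

```python
import math
import random


def sample_sigma(p_mean, p_std, rng):
    """Draw one training noise level with ln(sigma) ~ N(p_mean, p_std^2)."""
    return math.exp(rng.gauss(p_mean, p_std))


def sigma_interval(p_mean, p_std, k=1.0):
    """Return (low, median, high) noise levels at -k, 0, +k std devs in log space."""
    return (
        math.exp(p_mean - k * p_std),
        math.exp(p_mean),
        math.exp(p_mean + k * p_std),
    )


# Settings quoted in the paragraph above: RGB default vs. VAE latents.
rgb = sigma_interval(-1.2, 1.2)
latent = sigma_interval(-0.4, 1.0)

for name, (low, median, high) in (("rgb", rgb), ("latent", latent)):
    print(f"{name}: sigma in [{low:.3f}, {high:.3f}], median {median:.3f}")
```

The median sampled σ is exp(Pmean), so going from −1.2 to −0.4 moves it from about 0.30 to about 0.67, which is the "toward higher noise levels" shift the paragraph describes. This says nothing about how the authors located the in-between region itself; that still seems to require inspecting per-σ loss curves.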