A parameterization for the reverse-process variance in DDPMs where the model outputs a vector v with one component per dimension, and the variance is computed as Σ_θ(x_t, t) = exp(v · log β_t + (1-v) · log β̃_t), an interpolation in log-space between β_t, the theoretical upper bound, and β̃_t, the lower bound given by the forward-process posterior variance. This is more stable than predicting variances directly, since the valid range between the two bounds is very narrow; interpolating in log-space keeps the output positive and well-behaved even where the two bounds differ by orders of magnitude, and keeps the result inside the permissible interval whenever v lies in [0, 1].
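A minimal sketch of this interpolation, using an illustrative linear β schedule (the specific schedule values and the t=0 clipping convention are assumptions for the example, not from the source):

```python
import numpy as np

# Illustrative linear schedule over T steps (values are assumptions).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)
alphas_bar_prev = np.append(1.0, alphas_bar[:-1])

# Lower bound: forward-process posterior variance
# beta-tilde_t = (1 - abar_{t-1}) / (1 - abar_t) * beta_t.
betas_tilde = (1.0 - alphas_bar_prev) / (1.0 - alphas_bar) * betas
# beta-tilde_0 is 0; clip it to the next value so log() is defined,
# as common implementations do.
betas_tilde[0] = betas_tilde[1]

def learned_variance(v, t):
    """Sigma_theta = exp(v * log beta_t + (1 - v) * log beta-tilde_t).

    v is the per-dimension network output; for v in [0, 1] the result
    lies between the lower bound beta-tilde_t and the upper bound beta_t.
    """
    log_var = v * np.log(betas[t]) + (1.0 - v) * np.log(betas_tilde[t])
    return np.exp(log_var)
```

The endpoints recover the bounds exactly: v = 1 gives β_t and v = 0 gives β̃_t.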
The model is trained with the hybrid objective L_hybrid = L_simple + λ · L_vlb with λ = 0.001, applying a stop-gradient to μ_θ in the L_vlb term so that the variational lower bound loss guides only the variance prediction while L_simple continues to drive the mean prediction. This decomposition keeps the noisy VLB gradients from destabilizing the mean estimate.
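The structure of the hybrid loss can be sketched as follows. NumPy has no autograd, so the stop-gradient is only notional here (in PyTorch it would be `mu_pred.detach()`); the function names and argument layout are illustrative assumptions:

```python
import numpy as np

def gaussian_kl(mean1, logvar1, mean2, logvar2):
    """Per-dimension KL(N(mean1, var1) || N(mean2, var2)) for diagonal Gaussians."""
    return 0.5 * (logvar2 - logvar1
                  + (np.exp(logvar1) + (mean1 - mean2) ** 2) / np.exp(logvar2)
                  - 1.0)

def hybrid_loss(eps_pred, eps_true, mu_pred, logvar_pred,
                mu_post, logvar_post, lam=0.001):
    # L_simple: plain MSE on the predicted noise; this term drives the mean.
    l_simple = np.mean((eps_pred - eps_true) ** 2)
    # L_vlb: KL against the true posterior q(x_{t-1} | x_t, x_0). The mean
    # entering this term is treated as a constant (stop-gradient), so its
    # gradient would only reach the variance head in an autograd framework.
    mu_frozen = mu_pred.copy()  # stand-in for mu_pred.detach()
    l_vlb = np.mean(gaussian_kl(mu_post, logvar_post, mu_frozen, logvar_pred))
    return l_simple + lam * l_vlb
```

With identical means and log-variances the KL term vanishes and the loss reduces to L_simple alone.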
A key secondary benefit of learned variances is that they automatically rescale when using fewer diffusion steps, enabling fast sampling with 10-40x fewer steps without the need for hand-tuned variance schedules. This makes learned variances competitive with or superior to DDIM for accelerated generation. Introduced by Nichol & Dhariwal (2021).
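The automatic rescaling works because Σ_θ is defined relative to β_t and β̃_t: when sampling over a strided subsequence of timesteps, recomputing those bounds from the retained ᾱ values rescales the predicted variances with no retraining. A sketch (the 1000-step linear schedule and the 50-step stride are illustrative assumptions):

```python
import numpy as np

# Illustrative training schedule (values are assumptions).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

# Keep an evenly strided subsequence of 50 of the 1000 steps.
stride = T // 50
use_steps = np.arange(0, T, stride)

# Recompute the bounds over the subsequence from the retained abar values:
# beta_{S_i} = 1 - abar_{S_i} / abar_{S_{i-1}}, and beta-tilde follows as usual.
abar_sub = alphas_bar[use_steps]
abar_prev = np.append(1.0, abar_sub[:-1])
betas_sub = 1.0 - abar_sub / abar_prev                                # new upper bounds
betas_tilde_sub = (1.0 - abar_prev) / (1.0 - abar_sub) * betas_sub    # new lower bounds
```

Each rescaled β over the subsequence is larger than the original per-step β (it absorbs the noise of the skipped steps), so the interpolated Σ_θ grows accordingly.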
Key Details
- Interpolation in log-space between β_t and β̃_t
- Trained with hybrid loss using λ=0.001
- Stop-gradient on mean in VLB term
- Enables fast sampling with automatic variance rescaling
- Outperforms DDIM at 50+ steps