The score-based generative framework uses two main families of forward SDEs. The VE-SDE (Variance Exploding) takes the form dx = √(d[σ²(t)]/dt) dw with a geometric σ schedule; it has no drift term to counteract the injected noise, so the variance of the perturbed data explodes as t grows. It is the continuous-time limit of SMLD (Score Matching with Langevin Dynamics). The VP-SDE (Variance Preserving) takes the form dx = -½β(t)x dt + √β(t) dw with a linear β schedule; the drift shrinks x toward zero as noise is added, so the variance remains bounded throughout the diffusion process. It is the continuous-time limit of DDPM.
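
The two SDEs above can be sketched as drift and diffusion functions. This is a minimal illustration, not a reference implementation: the schedule endpoints (SIGMA_MIN, SIGMA_MAX, BETA_MIN, BETA_MAX), the time range t in [0, 1], and all function names are assumptions chosen for the example.

```python
import math

# Assumed schedule endpoints (illustrative values, not taken from this note).
SIGMA_MIN, SIGMA_MAX = 0.01, 50.0
BETA_MIN, BETA_MAX = 0.1, 20.0


def ve_sigma(t):
    """Geometric sigma schedule: sigma_min * (sigma_max / sigma_min) ** t."""
    return SIGMA_MIN * (SIGMA_MAX / SIGMA_MIN) ** t


def ve_sde(x, t):
    """VE-SDE coefficients: zero drift, g(t) = sqrt(d[sigma^2(t)]/dt)."""
    # For the geometric schedule,
    # d[sigma^2]/dt = 2 * sigma(t)^2 * log(sigma_max / sigma_min).
    g = ve_sigma(t) * math.sqrt(2.0 * math.log(SIGMA_MAX / SIGMA_MIN))
    return 0.0 * x, g


def vp_beta(t):
    """Linear beta schedule between BETA_MIN (t = 0) and BETA_MAX (t = 1)."""
    return BETA_MIN + t * (BETA_MAX - BETA_MIN)


def vp_sde(x, t):
    """VP-SDE coefficients: drift -0.5 * beta(t) * x, g(t) = sqrt(beta(t))."""
    beta = vp_beta(t)
    return -0.5 * beta * x, math.sqrt(beta)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        print(t, ve_sde(1.0, t), vp_sde(1.0, t))
```

Each function returns the pair (drift, diffusion) for a scalar x, so the only structural difference between the two families is visible directly: VE has no drift and a fast-growing g(t), while VP pairs a shrinking drift with g(t) = √β(t).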

The sub-VP SDE keeps the VP drift but modifies the diffusion coefficient, dx = -½β(t)x dt + √(β(t)(1 - e^(-2∫₀ᵗβ(s)ds))) dw, so that its variance is always upper-bounded by that of the VP-SDE. It was designed to retain the favorable properties of the VP formulation while achieving better likelihoods.
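
The variance bound can be checked numerically. The sketch below assumes the standard closed forms for the perturbation-kernel variances, 1 - e^(-B(t)) for VP and (1 - e^(-B(t)))² for sub-VP, where B(t) = ∫₀ᵗβ(s)ds, together with a linear β schedule whose endpoints (BETA_MIN, BETA_MAX) are illustrative assumptions. The function names are hypothetical.

```python
import math

# Assumed linear beta schedule endpoints (illustrative, not from this note).
BETA_MIN, BETA_MAX = 0.1, 20.0


def integrated_beta(t):
    """B(t) = integral of beta(s) from 0 to t for the linear schedule."""
    return BETA_MIN * t + 0.5 * (BETA_MAX - BETA_MIN) * t ** 2


def vp_variance(t):
    """Perturbation-kernel variance of the VP-SDE: 1 - exp(-B(t))."""
    return 1.0 - math.exp(-integrated_beta(t))


def subvp_variance(t):
    """Perturbation-kernel variance of the sub-VP SDE: (1 - exp(-B(t)))^2."""
    return (1.0 - math.exp(-integrated_beta(t))) ** 2


if __name__ == "__main__":
    for i in range(11):
        t = i / 10
        print(f"t={t:.1f}  VP={vp_variance(t):.6f}  sub-VP={subvp_variance(t):.6f}")
```

Because 1 - e^(-B(t)) lies in [0, 1), squaring it can only make it smaller, which is why the sub-VP variance never exceeds the VP variance; the two coincide at t = 0 and approach each other again as B(t) grows.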

VE SDEs generally produce better sample quality, as measured by FID and Inception Score (IS), while VP and sub-VP SDEs produce better likelihoods. This trade-off reflects a tension in diffusion model design between perceptual quality and density estimation performance.

Key Details

  • VE = continuous SMLD, VP = continuous DDPM
  • Sub-VP achieves 2.99 bits/dim on CIFAR-10
  • VE perturbation kernel: N(x(0), [σ²(t) - σ²(0)]I)
  • VP perturbation kernel: N(√ᾱ_t x(0), (1-ᾱ_t)I), where ᾱ_t = exp(-∫₀ᵗβ(s)ds) in continuous time
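
Because both perturbation kernels listed above are Gaussian, x(t) can be sampled from x(0) in one step without simulating the SDE. The following is a minimal sketch using only the standard library; the schedule endpoints, the time range t in [0, 1], and the helper names (ve_perturb, vp_perturb, alpha_bar) are assumptions made for illustration.

```python
import math
import random

# Assumed schedule endpoints (illustrative values, not taken from this note).
SIGMA_MIN, SIGMA_MAX = 0.01, 50.0
BETA_MIN, BETA_MAX = 0.1, 20.0


def ve_sigma(t):
    """Geometric sigma schedule for the VE-SDE."""
    return SIGMA_MIN * (SIGMA_MAX / SIGMA_MIN) ** t


def alpha_bar(t):
    """alpha_bar_t = exp(-integral of beta) for a linear beta schedule."""
    return math.exp(-(BETA_MIN * t + 0.5 * (BETA_MAX - BETA_MIN) * t ** 2))


def ve_perturb(x0, t, rng):
    """Sample x(t) ~ N(x(0), [sigma^2(t) - sigma^2(0)] I)."""
    std = math.sqrt(ve_sigma(t) ** 2 - ve_sigma(0.0) ** 2)
    return [xi + std * rng.gauss(0.0, 1.0) for xi in x0]


def vp_perturb(x0, t, rng):
    """Sample x(t) ~ N(sqrt(alpha_bar_t) x(0), (1 - alpha_bar_t) I)."""
    a = alpha_bar(t)
    mean_scale, std = math.sqrt(a), math.sqrt(1.0 - a)
    return [mean_scale * xi + std * rng.gauss(0.0, 1.0) for xi in x0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    x0 = [0.5, -1.0, 2.0]
    print(ve_perturb(x0, 1.0, rng))
    print(vp_perturb(x0, 1.0, rng))
```

The VE kernel leaves the mean at x(0) and only adds noise, whereas the VP kernel scales the mean by √ᾱ_t while adding noise of variance 1 - ᾱ_t, so at t = 0 both return x(0) unchanged.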

concept