Score matching is the estimation of the score function — the gradient of the log probability density, ∇ₓ log p(x) — which is the fundamental building block of score-based generative models. Because a multiplicative normalizing constant Z vanishes under the gradient of the log (∇ₓ log p(x) = ∇ₓ log p̃(x) for any unnormalized density p̃), the score can be estimated from data without ever computing Z.
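A minimal numeric check of this point, using a hypothetical 1-D Gaussian: the score computed from the unnormalized log-density matches the analytic score, with the normalizing constant never appearing.

```python
import numpy as np

# Hypothetical 1-D example: score of N(mu, sigma^2) obtained from the
# *unnormalized* density exp(-(x - mu)^2 / (2 sigma^2)). The normalizing
# constant is a multiplicative factor, so it vanishes under d/dx log(.).
mu, sigma = 1.5, 0.7

def log_unnormalized(x):
    # log p(x) + log Z, with log Z unknown and irrelevant to the score
    return -(x - mu) ** 2 / (2 * sigma ** 2)

def score_analytic(x):
    # d/dx log p(x) for the Gaussian
    return (mu - x) / sigma ** 2

x0 = 0.3
eps = 1e-6
# central finite difference of the unnormalized log-density
score_numeric = (log_unnormalized(x0 + eps) - log_unnormalized(x0 - eps)) / (2 * eps)
print(abs(score_numeric - score_analytic(x0)) < 1e-5)
```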

Denoising score matching (Vincent, 2011) trains a neural network s_θ to predict the score of the noise-perturbed data distribution by minimizing E over x ∼ p_data, x̃ ∼ q_σ(x̃|x) of ½‖s_θ(x̃) − ∇_x̃ log q_σ(x̃|x)‖². For a Gaussian perturbation kernel q_σ(x̃|x) = N(x̃; x, σ²I), the score of the kernel is ∇_x̃ log q_σ(x̃|x) = (x − x̃)/σ², making the denoising objective equivalent to predicting the clean data from noisy observations (connecting to Tweedie's formula, E[x | x̃] = x̃ + σ² ∇_x̃ log q_σ(x̃)).
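The Tweedie connection can be verified numerically in a toy setting (all choices below are illustrative assumptions): with data x ∼ N(0, 1) and Gaussian noise of scale σ, the noisy marginal is N(0, 1 + σ²), so its score is known in closed form and the Tweedie denoiser can be compared against a Monte-Carlo estimate of E[x | x̃].

```python
import numpy as np

# Toy check of Tweedie's formula: E[x | x_tilde] = x_tilde + sigma^2 * score(x_tilde),
# where score is the score of the *noisy marginal* q_sigma. With x ~ N(0, 1) and
# x_tilde = x + sigma * eps, the marginal is N(0, 1 + sigma^2) with score
# -x_tilde / (1 + sigma^2).
rng = np.random.default_rng(0)
sigma = 0.5
n = 200_000
x = rng.standard_normal(n)
x_tilde = x + sigma * rng.standard_normal(n)

t = 0.8                                   # evaluation point for x_tilde
mask = np.abs(x_tilde - t) < 0.02         # small bin around t
denoised_mc = x[mask].mean()              # Monte-Carlo E[x | x_tilde ~ t]

score = -t / (1 + sigma ** 2)             # analytic score of the noisy marginal
denoised_tweedie = t + sigma ** 2 * score # Tweedie denoiser

print(abs(denoised_mc - denoised_tweedie) < 0.05)
```

Here the optimal denoiser and the score determine each other, which is exactly why predicting clean data and predicting the score are interchangeable objectives.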

In the SDE framework (Song et al., 2021), a time-dependent network s_θ(x, t) is trained via a continuous, weighted denoising score matching objective integrated over all noise levels. The learned score enables: (1) reverse-time SDE simulation for generation; (2) a probability flow ODE for deterministic sampling and exact likelihood computation; (3) predictor-corrector sampling combining SDE solvers with Langevin MCMC.
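A sketch of point (1), under assumed toy choices: the learned network is replaced by a known analytic score so the reverse-time SDE (Anderson, 1982) can be integrated with plain Euler-Maruyama. Data is x₀ ∼ N(2, 0.5²) under a VP-type forward SDE dx = −½βx dt + √β dW with constant β, so the noisy marginal at time t is Gaussian with closed-form score.

```python
import numpy as np

# Reverse-time SDE sampling with an analytic score standing in for the
# trained network. Forward (VP-type): x_t = a_t x_0 + sqrt(1 - a_t^2) eps,
# a_t = exp(-beta t / 2). Reverse (Anderson, 1982):
#   dx = [-0.5 beta x - beta * score(x, t)] dt + sqrt(beta) dW_bar.
rng = np.random.default_rng(1)
beta, m, s = 5.0, 2.0, 0.5        # noise schedule; data mean and std
n_steps, n_samples = 1000, 50_000
dt = 1.0 / n_steps

def score(x, t):
    a = np.exp(-beta * t / 2)              # signal scale a_t
    var = a ** 2 * s ** 2 + (1 - a ** 2)   # variance of the noisy marginal
    return (a * m - x) / var               # score of N(a_t * m, var)

x = rng.standard_normal(n_samples)         # start from the prior N(0, 1)
for i in range(n_steps, 0, -1):            # integrate from t = 1 down to t = 0
    t = i * dt
    drift = -0.5 * beta * x - beta * score(x, t)
    x = x - drift * dt + np.sqrt(beta * dt) * rng.standard_normal(n_samples)

# Samples should approach the data distribution N(2, 0.5^2)
print(round(float(x.mean()), 1), round(float(x.std()), 1))
```

Swapping the analytic `score` for a trained s_θ(x, t) gives the actual generative sampler; the probability flow ODE variant drops the noise term and halves the score coefficient in the drift.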

Key Details

  • Score estimation does not require the normalizing constant
  • Denoising score matching connects to Tweedie’s formula
  • Enables reverse-time SDE via Anderson (1982)
  • Continuous version integrates over all noise levels
  • Foundation of SMLD, DDPM, and all SDE-based generative models
  • Hyvärinen (2005) introduced score matching; Vincent (2011) established the denoising connection

concept