Summary

This paper provides a comprehensive treatment of Tweedie’s formula — the result that for mu ~ g(.) and z|mu ~ N(mu, sigma^2), the posterior expectation is E{mu|z} = z + sigma^2 * l’(z), where l’(z) = d/dz log f(z) is the score of the marginal density f(z). The formula is foundational for score-based generative modelling: the term l’(z) = nabla log f(z) is precisely the score function that diffusion models learn to estimate, and the formula shows that this score provides the optimal Bayesian denoising correction.
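As a quick sanity check, the formula can be verified in the conjugate normal case, where the marginal density and the exact posterior mean are both available in closed form. A minimal sketch (the prior N(0, tau^2) and the specific values of tau^2, sigma^2 are illustrative assumptions):

```python
import numpy as np

# Sketch: verify Tweedie's formula E{mu|z} = z + sigma^2 * l'(z) in the conjugate
# case mu ~ N(0, tau^2), z|mu ~ N(mu, sigma^2), where the marginal f is
# N(0, tau^2 + sigma^2) and the exact posterior mean is known in closed form.
tau2, sigma2 = 4.0, 1.0
z = np.linspace(-3.0, 3.0, 7)

score = -z / (tau2 + sigma2)          # l'(z) = d/dz log f(z) for the marginal
tweedie = z + sigma2 * score          # unbiased estimate + Bayes correction
exact = z * tau2 / (tau2 + sigma2)    # conjugate posterior mean

print(np.max(np.abs(tweedie - exact)))  # agreement up to float rounding
```

Here the score is linear in z, so the Bayes correction reduces to the familiar linear shrinkage factor tau^2/(tau^2 + sigma^2).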

Efron places Tweedie’s formula within the broader exponential family framework: for eta ~ g(.) and z|eta ~ f_eta(z) = exp(eta*z - psi(eta))*f_0(z), the posterior mean and variance are E{eta|z} = lambda’(z) and Var{eta|z} = lambda’’(z), where lambda(z) = log(f(z)/f_0(z)). For the normal translation family, this recovers Tweedie’s formula and additionally gives the posterior variance as sigma^2(1 + sigma^2 * l’’(z)), connecting the curvature of the log-marginal density to posterior uncertainty. The paper also extends the formula to the Poisson family — directly relevant to Poisson random bridges — and to gamma families with skewness corrections.
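The variance half of the result can be checked the same way. In the conjugate normal setting l’’(z) is constant, so the formula should reproduce the usual conjugate posterior variance (a minimal sketch, with tau^2 and sigma^2 chosen arbitrarily):

```python
import numpy as np

# Sketch: the posterior-variance companion, Var{mu|z} = sigma^2 * (1 + sigma^2 * l''(z)),
# in the conjugate setting mu ~ N(0, tau^2), z|mu ~ N(mu, sigma^2). The marginal is
# N(0, tau^2 + sigma^2), so l''(z) = -1/(tau^2 + sigma^2) at every z, and the formula
# should reproduce the conjugate posterior variance sigma^2 * tau^2 / (tau^2 + sigma^2).
tau2, sigma2 = 4.0, 1.0
l2 = -1.0 / (tau2 + sigma2)                # curvature of the log-marginal density

post_var = sigma2 * (1.0 + sigma2 * l2)    # Tweedie posterior variance
exact = sigma2 * tau2 / (tau2 + sigma2)    # conjugate posterior variance
print(post_var, exact)
```

Since l’’(z) < 0 here, the posterior variance is strictly below sigma^2, previewing the log-concavity finding later in the note.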

Key Contributions

  • Derives Tweedie’s formula as a special case of exponential family posterior moments, giving both mean (via l’(z)) and variance (via l’’(z))
  • Establishes the connection E{mu|z} = unbiased estimate + Bayes correction (eq. 2.9), the same decomposition underlying denoising in diffusion models
  • Extends the formula to Poisson data: E{mu|z} = (z+1)f(z+1)/f(z), relevant for discrete/counting processes
  • Introduces the concept of empirical Bayes information — quantifying how much each “other” observation contributes to estimating a particular mu_i
  • Shows near-equivalence between Tweedie’s empirical Bayes and James-Stein estimation for normal priors
  • Extends the formula to handle “relevance” (spatially varying priors), connecting to the covariate-dependent score estimation needed in conditional generation
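The Poisson version above is attractive because the marginal f can be estimated directly from raw frequencies. A small simulation sketch; the Gamma prior is an assumption made here so the exact posterior mean (z + a)/(beta + 1) is available for comparison:

```python
import numpy as np

# Sketch: Robbins' Poisson form of Tweedie's formula, E{mu|z} = (z+1) f(z+1)/f(z),
# with f-hat taken as empirical frequencies. Assumed prior: mu ~ Gamma(shape=a,
# rate=beta), so the exact conjugate posterior mean is (z + a)/(beta + 1).
rng = np.random.default_rng(0)
a, beta, n = 3.0, 1.0, 200_000

mu = rng.gamma(a, 1.0 / beta, size=n)   # latent Poisson means
z = rng.poisson(mu)                     # observed counts

freq = np.bincount(z) / n               # empirical marginal f-hat(z)
zs = np.arange(5)
robbins = (zs + 1) * freq[zs + 1] / freq[zs]   # plug-in Robbins estimate
exact = (zs + a) / (beta + 1)                  # conjugate posterior mean

print(np.max(np.abs(robbins - exact)))  # small Monte Carlo error
```

No density model is fit at all: the empirical Bayes estimate comes straight from the count table, which is what makes the discrete case so clean.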

Methodology

The paper derives Tweedie’s formula from exponential family theory. Given the model eta ~ g(.), z|eta ~ exp(eta*z - psi(eta))*f_0(z), Bayes’ rule yields a posterior that is itself an exponential family in eta, with cumulant generating function lambda(z) = log(f(z)/f_0(z)); differentiating lambda(z) yields the posterior cumulants (mean, variance, and higher). For practical implementation, Lindsey’s method estimates l(z) = log f(z) by fitting a Poisson GLM to binned data counts, yielding a smooth, differentiable estimate l-hat(z) whose derivative provides the empirical Bayes correction. The James-Stein estimator emerges as the special case J=2 of the polynomial model (eq. 3.1) with a normal prior.
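A minimal sketch of this pipeline, assuming a normal prior N(0, tau^2) so the ideal shrinkage factor tau^2/(tau^2 + sigma^2) = 0.8 is known; the Poisson GLM is hand-rolled as a few Newton steps rather than calling a GLM library:

```python
import numpy as np

# Sketch of Lindsey's method + Tweedie: bin the z-values, fit a Poisson regression
# of bin counts on a quadratic basis of the bin centers (J = 2), then differentiate
# the fitted log-density to get the empirical Bayes correction l-hat'(z).
rng = np.random.default_rng(1)
n, sigma2, tau2 = 50_000, 1.0, 4.0
z = rng.normal(0.0, np.sqrt(tau2), n) + rng.normal(0.0, np.sqrt(sigma2), n)

edges = np.linspace(z.min(), z.max(), 61)
counts, _ = np.histogram(z, bins=edges)
x = 0.5 * (edges[:-1] + edges[1:])               # bin centers

X = np.vander(x, 3, increasing=True)             # polynomial basis 1, x, x^2
beta = np.linalg.lstsq(X, np.log(counts + 0.5), rcond=None)[0]  # warm start
for _ in range(40):                              # Newton steps for the Poisson MLE
    m = np.exp(X @ beta)
    beta += np.linalg.solve(X.T @ (m[:, None] * X), X.T @ (counts - m))

# Fitted log-density slope l-hat'(z) = beta1 + 2*beta2*z is the Bayes correction
z0 = np.array([-2.0, 0.0, 2.0])
tweedie = z0 + sigma2 * (beta[1] + 2.0 * beta[2] * z0)
print(tweedie)  # close to the Bayes rule 0.8 * z0
```

With the quadratic log-density the correction is linear in z, which is exactly the sense in which Lindsey-plus-Tweedie with J=2 reproduces James-Stein shrinkage.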

Key Findings

  • The Bayes correction sigma^2 * l’(z) is negative in the right tail and positive in the left, so it always pulls extreme observations toward the center, correcting selection bias
  • Empirical Bayes information I(z_0) = 1/c(z_0) measures information per “other” observation, with regret ~ 1/(N*I(z_0))
  • The James-Stein estimator is approximately Tweedie’s formula with a 2-parameter log-density model
  • The formula extends to handle variable sigma^2 (Theorem 7.1), showing the posterior ratio g(mu|z_0)/g_0(mu|z_0) depends on the variance ratio lambda_mu = sigma_0/sigma_mu
  • Log-concavity of f(z) implies l’’(z) <= 0, so the posterior variance sigma^2(1 + sigma^2 * l’’(z)) is less than sigma^2, providing shrinkage
  • Connection to false discovery rates: -d/dz log(fdr(z)) = l’(z) - l_0’(z) = E{eta|z}
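The fdr identity in the last bullet can be checked numerically in a two-groups model with sigma^2 = 1 (so eta = mu); the mixture weights and the alternative mean of 2 below are illustrative assumptions:

```python
import numpy as np

# Sketch: check -d/dz log(fdr(z)) = l'(z) - l_0'(z) = E{eta|z} for the two-groups
# mixture f(z) = pi0*N(0,1) + pi1*N(2,1), where fdr(z) = pi0*f_0(z)/f(z) is the
# local false discovery rate. Weights and alternative mean are illustrative.
npdf = lambda t, m=0.0: np.exp(-0.5 * (t - m) ** 2) / np.sqrt(2.0 * np.pi)

pi0, pi1 = 0.9, 0.1
f = lambda t: pi0 * npdf(t) + pi1 * npdf(t, 2.0)    # marginal density
fdr = lambda t: pi0 * npdf(t) / f(t)                # local fdr

z, h = np.linspace(-2.0, 4.0, 121), 1e-5
lhs = -(np.log(fdr(z + h)) - np.log(fdr(z - h))) / (2 * h)  # -d/dz log fdr(z)
rhs = 2.0 * pi1 * npdf(z, 2.0) / f(z)                       # E{mu|z} by Bayes' rule

print(np.max(np.abs(lhs - rhs)))  # agrees up to finite-difference error
```

The right-hand side is just 2 * P(mu = 2 | z), so the slope of the log-fdr curve directly reads off the posterior mean of the effect size.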

Important References

  1. An Empirical Bayes Approach to Statistics — Robbins (1956), the origin of Tweedie’s formula and of the empirical Bayes framework
  2. Estimation of the Mean of a Multivariate Normal Distribution — Stein (1981), which develops the unbiased risk estimation underlying shrinkage estimators
  3. Controlling the False Discovery Rate — Benjamini & Hochberg (1995), the FDR procedure connected to Tweedie’s formula in Section 6

Atomic Notes


paper