Lindsey’s method is a technique for estimating the log marginal density l(z) = log f(z) by fitting a Poisson generalized linear model (GLM) to binned observation counts. It yields the smooth, differentiable estimate of l(z) needed to apply generalized Tweedie’s formula empirically, since the Bayes correction l’(z) = d/dz log f(z) requires a differentiable density estimate.

The method works as follows: partition the sample space into K bins with centers x_k and width d, count observations y_k = #{z_i in bin k}, and fit a Poisson regression model y_k ~ Poi(nu_k), where nu_k = N · d · f_beta(x_k), N is the total sample size, and log f_beta(z) is modeled as a J-th degree polynomial (or natural spline) in z. The resulting MLE beta-hat provides a smooth estimate l-hat(z), whose derivative l-hat’(z) gives the empirical Bayes correction for generalized Tweedie’s formula.
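The fitting procedure above can be sketched in numpy. This is an illustrative implementation under stated assumptions (function name, bin count K, polynomial degree J, and the Newton/IRLS solver are all choices of this sketch, not part of the original method description; a natural-spline basis would replace the polynomial columns):

```python
import numpy as np

def lindsey_log_density(z, K=60, J=5, n_iter=50):
    """Lindsey's method (sketch): estimate l(z) = log f(z) by fitting a
    Poisson GLM to binned counts with a degree-J polynomial basis."""
    z = np.asarray(z, dtype=float)
    N = len(z)
    edges = np.linspace(z.min(), z.max(), K + 1)
    d = edges[1] - edges[0]                       # bin width
    centers = 0.5 * (edges[:-1] + edges[1:])      # bin centers x_k
    y, _ = np.histogram(z, bins=edges)            # counts y_k

    # Standardize the basis inputs for numerical stability of x^j columns.
    mu, sd = z.mean(), z.std()
    X = np.vander((centers - mu) / sd, J + 1, increasing=True)  # [1, x, ..., x^J]

    def loglik(b):
        eta = np.clip(X @ b, -30, 30)
        return y @ eta - np.exp(eta).sum()        # Poisson log-likelihood (up to const)

    # Newton's method with backtracking for the Poisson MLE beta-hat.
    beta = np.zeros(J + 1)
    beta[0] = np.log(y.mean() + 1e-8)
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)
        nu = np.exp(eta)                          # fitted Poisson means nu_k
        grad = X.T @ (y - nu)
        hess = X.T @ (X * nu[:, None]) + 1e-8 * np.eye(J + 1)
        step = np.linalg.solve(hess, grad)
        t = 1.0
        while loglik(beta + t * step) < loglik(beta) and t > 1e-4:
            t *= 0.5                              # damp the step if it overshoots
        beta = beta + t * step

    def l_hat(x):
        xs = (np.atleast_1d(x) - mu) / sd
        Xq = np.vander(xs, J + 1, increasing=True)
        # nu(x) = N * d * f(x), so log f(x) = log nu(x) - log(N d)
        return Xq @ beta - np.log(N * d)

    return l_hat, beta
```

Because the GLM includes an intercept, the fitted means satisfy sum_k nu_k = sum_k y_k = N, so the implied density estimate approximately integrates to one over the binned range.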

In the context of score-based generative models, Lindsey’s method is a classical precursor to modern score estimation: both aim to estimate nabla log p(x) from samples. The key difference is that neural score networks (as in score matching) use flexible function approximators rather than polynomial/spline bases, and operate in high dimensions. However, the fundamental statistical principle is identical — estimate the score from the marginal density of observations.

Key Details

  • Models log f(z) = sum_{j=0}^{J} beta_j * z^j, i.e., an exponential family with canonical parameter vector beta
  • Poisson GLM on binned counts provides MLE beta-hat
  • Natural splines with J=5 degrees of freedom work well empirically
  • Derivative l-hat’(z) gives the empirical Bayes / score estimate
  • Classical analogue of neural score estimation in diffusion models
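The role of l-hat’(z) as the empirical Bayes correction can be sanity-checked in a conjugate Gaussian model where the score is available in closed form (values of A and sigma^2 below are arbitrary illustrative choices):

```python
import numpy as np

# Conjugate check of Tweedie's formula: theta ~ N(0, A), z | theta ~ N(theta, s2).
A, s2 = 2.0, 1.0
z = np.linspace(-3.0, 3.0, 7)

# Marginal: z ~ N(0, A + s2), so l(z) = -z^2 / (2(A + s2)) + const
# and the score is l'(z) = -z / (A + s2).
score = -z / (A + s2)

# Tweedie's formula: posterior mean = z + s2 * l'(z).
tweedie = z + s2 * score

# Known conjugate posterior mean for comparison.
posterior_mean = A / (A + s2) * z
assert np.allclose(tweedie, posterior_mean)
```

In practice the closed-form score above is replaced by the derivative of the Lindsey fit, l-hat’(z) = sum_j j * beta_j-hat * z^(j-1), which is exactly why a polynomial or spline basis (rather than a raw histogram) is used.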

method