Summary

This paper introduces the formal mathematical framework of (Φ, Ψ)-bridges for generative modelling. A random bridge is a càdlàg stochastic process constrained to attain a prescribed target distribution at a fixed terminal time, acting as a stochastic transport between two distributions. The framework is deliberately abstract — imposing no model-specific assumptions beyond right-continuity with left limits — so it can encompass Markovian or non-Markovian, continuous (Brownian), discontinuous (Poisson, gamma), or hybrid dynamics, depending on the driving process.

For the Gaussian case, the bridge admits a closed-form anticipative representation decomposing into a signal component (the target) and a noise process, echoing information-based models from stochastic filtering. The conditional expectation E[Y | ξ_t] — the L² best-estimate — becomes the central object for both training and sampling. Training amounts to learning this conditional expectation via an MSE loss, while simulation uses Euler-Maruyama discretization of the bridge SDE in a single forward pass, without any backward-time denoising procedure. The information-theoretic analysis proves that the Shannon entropy of the target posterior is a supermartingale converging to zero.
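The training recipe above can be sketched in a few lines. This is a minimal 1-D toy, not the paper's setup: it assumes a Brownian-bridge-type conditional law ξ_t | Y=y ~ N((t/T)·y, σ²·t(T−t)/T) (the standard pinned-bridge form; the paper's exact parameterization may differ), a Gaussian toy target Ψ, and a hypothetical linear model f(ξ, t; θ) in place of a neural network.

```python
import numpy as np

rng = np.random.default_rng(0)
T, sigma = 1.0, 1.0

def sample_xi(y, t):
    # Gaussian conditional of a bridge pinned at Y at time T
    # (standard Brownian-bridge form; assumed, not taken from the paper):
    # xi_t | Y=y ~ N((t/T) * y, sigma^2 * t * (T - t) / T)
    mean = (t / T) * y
    var = sigma**2 * t * (T - t) / T
    return mean + np.sqrt(var) * rng.standard_normal(y.shape)

# toy 1-D targets drawn from Psi (here N(3, 0.5^2), an assumption)
y = 3.0 + 0.5 * rng.standard_normal(4096)

# hypothetical model f(xi, t; theta) = a*xi + b*t + c, fit by MSE via
# plain gradient descent -- a stand-in for the paper's UNet
theta = np.zeros(3)
lr = 0.05
for step in range(2000):
    t = rng.uniform(0.05, 0.95, y.shape)   # stay away from the endpoints
    xi = sample_xi(y, t)
    pred = theta[0] * xi + theta[1] * t + theta[2]
    err = pred - y                          # residual for the MSE gradient
    grad = np.array([np.mean(err * xi), np.mean(err * t), np.mean(err)])
    theta -= lr * grad

# the trained model approximates the best-estimate E[Y | xi_t]
t = np.full_like(y, 0.5)
xi = sample_xi(y, t)
est = theta[0] * xi + theta[1] * t + theta[2]
print(np.mean(est))  # should be close to np.mean(y)
```

The point is the shape of the objective: sample (y, t), draw ξ_t from the known Gaussian conditional, and regress the model output onto y — no score matching or reverse-time pass is involved.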

Experiments on MNIST and CIFAR-10 demonstrate dramatically better FID at very low step counts (2-10 steps) compared to both DDPM and improved DDPM baselines, though at 1000 steps the improved DDPM surpasses the bridge model. This positions random bridges as particularly suitable for high-speed generation tasks.

Key Contributions

  • General mathematical definition of (Φ, Ψ)-bridges as stochastic transports, subsuming diffusion bridges as a special case
  • Forward-only generative framework that eliminates the bi-directional noising-denoising protocol of DDPMs
  • Anticipative and non-anticipative bridge representations with closed-form conditional distributions
  • Simple MSE training objective on E[Y | ξ_t] with analytically tractable conditional sampling
  • Dramatic low-step sampling efficiency: competitive FID with 2-10 steps
  • Lévy random bridge extensions to gamma, stable-1/2, and jump-diffusion processes
  • Shannon entropy supermartingale property formalizing information gain along the bridge

Methodology

The framework builds on a probability space with a right-continuous, complete filtration. For generative modelling, a Φ-initialized random bridge to Ψ is constructed with driving process {Z_t} = {σW_t}. Training minimizes the MSE between the neural network output f(ξ_t, t; θ) and the target y, with ξ_t sampled from its known Gaussian conditional law. Simulation proceeds via Euler-Maruyama discretization of dξ_t = (Ŷ_t − ξ_t)/(T − t) dt + σdW_t, where Ŷ_t = f(ξ_t, t; θ) is the learned best-estimate. The architecture is a modified UNet (64 channels, 2 residual blocks, 4 attention heads) trained for 40k steps.
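The Euler-Maruyama step for the bridge SDE can be sketched as follows. The estimator function here is a placeholder for the learned E[Y | ξ_t], not the paper's trained network; the sanity check substitutes an "oracle" that already knows the target, under which the dynamics reduce to a Brownian bridge pinned at that value.

```python
import numpy as np

rng = np.random.default_rng(1)
T, sigma, n_steps = 1.0, 1.0, 1000

def sample_bridge(y_hat_fn, x0=0.0):
    """Euler-Maruyama for d(xi_t) = (Yhat_t - xi_t)/(T - t) dt + sigma dW_t.

    y_hat_fn(xi, t) stands in for the learned best-estimate E[Y | xi_t].
    """
    dt = T / n_steps
    xi = x0
    for k in range(n_steps - 1):      # stop one step short of t = T,
        t = k * dt                    # since the drift blows up as t -> T
        drift = (y_hat_fn(xi, t) - xi) / (T - t)
        xi += drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return xi

# sanity check: with an oracle estimator the process is a Brownian
# bridge pinned at y_target, so samples concentrate there
y_target = 2.0
samples = [sample_bridge(lambda xi, t: y_target) for _ in range(200)]
print(np.mean(samples))  # close to y_target
```

Note the single forward pass: generation is one integration from t = 0 to t ≈ T, with the network queried once per step, which is why very low step counts remain viable.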

Key Findings

  • 2-step MNIST FID: Bridge 61.9 vs Improved DDPM 299.2 (~5x better)
  • 10-step MNIST FID: Bridge 19.3 vs Improved DDPM 136.9 (~7x better)
  • At 1000 steps, Improved DDPM surpasses Bridge model
  • Shannon entropy of the target posterior is a supermartingale (non-increasing in expectation), converging to zero at the terminal time
  • The Doob h-transform naturally arises in the bridge transition probabilities
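For reference, the h-transform relation in the last finding takes the textbook pinned-bridge form for a Markov process with transition density p (sketched here in standard notation; the paper's notation may differ):

```latex
p^{\mathrm{bridge}}_{s,t}(x, y \mid \xi_T = z)
  \;=\; p_{s,t}(x, y)\,\frac{p_{t,T}(y, z)}{p_{s,T}(x, z)},
  \qquad 0 \le s < t < T,
```

i.e. the free transition density is reweighted by the likelihood of still reaching the pinned terminal value z, which is exactly the Doob h-transform with h(t, y) = p_{t,T}(y, z).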

Important References

  1. Lévy Random Bridges and the Modelling of Financial Information — Original Lévy random bridge theory
  2. Score-Based Generative Modeling through Stochastic Differential Equations — Improved DDPM baseline
  3. Denoising Diffusion Probabilistic Models — Primary DDPM baseline

Atomic Notes


paper