Summary

The stochastic interpolant framework provides a mathematically rigorous unification of flow-based and diffusion-based generative models. A stochastic interpolant is defined as $x_t = I(t, x_0, x_1) + \gamma(t) z$, where $I$ interpolates between endpoints ($I(0, x_0, x_1) = x_0$, $I(1, x_0, x_1) = x_1$), $\gamma(t)$ controls latent noise ($\gamma(0) = \gamma(1) = 0$, $\gamma(t) > 0$ for $t \in (0,1)$), with marginals $x_0 \sim \rho_0$, $x_1 \sim \rho_1$, and $z \sim \mathsf{N}(0, \mathrm{Id})$ independent of the data. The density $\rho(t)$ of $x_t$ is proved to be absolutely continuous and strictly positive, satisfying both a transport equation and forward/backward Fokker–Planck equations with tunable diffusion coefficient $\epsilon(t)$.
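As a concrete illustration (my sketch, not the paper's code), sampling $x_t$ for the linear interpolant $I(t, x_0, x_1) = (1-t)x_0 + t x_1$ with the admissible schedule $\gamma(t) = \sqrt{2t(1-t)}$, which satisfies the boundary conditions above:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_interpolant(t, x0, x1):
    """Draw x_t = I(t, x0, x1) + gamma(t) z for the linear interpolant.

    I(t, x0, x1) = (1 - t) x0 + t x1, and gamma(t) = sqrt(2 t (1 - t)) is one
    admissible choice: gamma(0) = gamma(1) = 0 and gamma(t) > 0 on (0, 1).
    """
    z = rng.standard_normal(x0.shape)        # latent noise, independent of the data
    gamma = np.sqrt(2.0 * t * (1.0 - t))
    return (1.0 - t) * x0 + t * x1 + gamma * z

x0 = rng.standard_normal(1000)               # samples from rho_0
x1 = 3.0 + rng.standard_normal(1000)         # samples from rho_1

# The endpoint conditions pin x_t to the data exactly at t = 0 and t = 1:
assert np.allclose(sample_interpolant(0.0, x0, x1), x0)
assert np.allclose(sample_interpolant(1.0, x0, x1), x1)
```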

The velocity $b(t, x) = \mathbb{E}[\partial_t I + \dot\gamma(t) z \mid x_t = x]$ and score $s(t, x) = \nabla \log \rho(t, x)$ are characterized as unique minimizers of simple quadratic objectives (Theorems 7, 8). This yields three equivalent generative models: a probability flow ODE $\dot X_t = b(t, X_t)$, a forward SDE $\mathrm{d}X_t = b_{\mathrm{F}}(t, X_t)\,\mathrm{d}t + \sqrt{2\epsilon(t)}\,\mathrm{d}W_t$ with $b_{\mathrm{F}} = b + \epsilon s$, and a backward SDE. Crucially, the density $\rho(t)$ is the same for all three: the interpolant construction is decoupled from the sampling process.
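This equivalence can be sketched as two samplers driven by the same pair $(b, s)$; here `b_fn`, `s_fn`, and `eps_fn` are hypothetical callables standing in for learned networks and the chosen diffusion schedule:

```python
import numpy as np

def sample_ode(b_fn, x, n_steps=100):
    """Probability flow ODE dX = b(t, X) dt, integrated with forward Euler."""
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        x = x + dt * b_fn(t, x)
    return x

def sample_sde(b_fn, s_fn, eps_fn, x, n_steps=100, rng=None):
    """Forward SDE dX = (b + eps * s)(t, X) dt + sqrt(2 eps(t)) dW, Euler-Maruyama."""
    rng = rng or np.random.default_rng(0)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        drift = b_fn(t, x) + eps_fn(t) * s_fn(t, x)
        noise = np.sqrt(2.0 * eps_fn(t) * dt) * rng.standard_normal(x.shape)
        x = x + dt * drift + noise
    return x
```

With $\epsilon \equiv 0$ the SDE sampler reduces exactly to the ODE sampler, which is the decoupling at work: $\epsilon(t)$ changes trajectories, not the target density.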

The framework’s key advantages over the random bridge approach: (1) the latent variable $\gamma(t) z$ smooths the intermediate density, eliminating spurious modes that appear in pure linear interpolation; (2) the coupling between $(x_0, x_1)$ can be arbitrary (independent, OT, or data-adapted); (3) the Schrödinger bridge is recovered as the solution to a max-min problem over interpolants (Theorem 41, Section 3.4).

Key Contributions

  • General stochastic interpolant with rigorous density theory (Theorem 6)
  • Velocity and score as unique minimizers of quadratic objectives (Theorems 7, 8)
  • Simultaneous derivation of ODE and SDE generative models with tunable $\epsilon(t)$
  • Likelihood control: SDE models bound KL divergence via velocity+score loss (Theorem 23); ODE models additionally require Fisher divergence control
  • Recovery of score-based diffusion as one-sided interpolant with time reparameterization (Section 5.1)
  • Recovery of rectified flow as the $\gamma = 0$ linear interpolant (Section 5.3)
  • Recovery of Schrödinger bridge via optimization over interpolant function (Section 3.4, Theorem 41)
  • Remark 48 explicitly states: “straight line solutions is a necessary condition for optimal transport, but it is not sufficient” — confirming the rectified-flow-doesn’t-solve-OT result

Methodology

The interpolant $x_t = I(t, x_0, x_1) + \gamma(t) z$ bridges $\rho_0$ and $\rho_1$ by construction. The velocity is learned by minimizing $L_b[\hat b] = \int_0^1 \mathbb{E}\big[\tfrac12 |\hat b(t, x_t)|^2 - (\partial_t I + \dot\gamma(t) z) \cdot \hat b(t, x_t)\big]\,\mathrm{d}t$. The score is learned via $L_s[\hat s] = \int_0^1 \mathbb{E}\big[\tfrac12 |\hat s(t, x_t)|^2 + \gamma(t)^{-1} z \cdot \hat s(t, x_t)\big]\,\mathrm{d}t$. For the spatially linear case $I(t, x_0, x_1) = \alpha(t) x_0 + \beta(t) x_1$, the velocity factorizes as $b(t, x) = \dot\alpha(t)\,\eta_0(t, x) + \dot\beta(t)\,\eta_1(t, x) + \dot\gamma(t)\,\eta_z(t, x)$ where $\eta_0 = \mathbb{E}[x_0 \mid x_t = x]$, $\eta_1 = \mathbb{E}[x_1 \mid x_t = x]$, $\eta_z = \mathbb{E}[z \mid x_t = x]$.
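A toy Monte Carlo version of the velocity objective (my sketch, not the paper's code), for the $\gamma = 0$ linear interpolant between two Gaussians; `b_true` is the closed-form conditional expectation for this Gaussian pair, standing in for a trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

def velocity_loss(b_hat, n=50_000):
    """Monte Carlo estimate of L_b = E[ |b_hat(t, x_t)|^2 / 2 - (dI/dt) b_hat(t, x_t) ]
    for I = (1-t) x0 + t x1 with gamma = 0 and independent endpoints."""
    t = rng.uniform(0.0, 1.0, n)
    x0 = rng.standard_normal(n)          # rho_0 = N(0, 1)
    x1 = 2.0 + rng.standard_normal(n)    # rho_1 = N(2, 1)
    xt = (1.0 - t) * x0 + t * x1
    dI = x1 - x0                         # time derivative of the interpolant
    v = b_hat(t, xt)
    return np.mean(0.5 * v**2 - dI * v)

def b_true(t, x):
    """Closed-form minimizer b(t, x) = E[x1 - x0 | x_t = x] for this Gaussian pair."""
    var = (1.0 - t) ** 2 + t ** 2        # Var(x_t) at time t
    return 2.0 + (2.0 * t - 1.0) * (x - 2.0 * t) / var

# The true conditional expectation beats a perturbed candidate:
assert velocity_loss(b_true) < velocity_loss(lambda t, x: b_true(t, x) + 1.0)
```

The objective is quadratic in the candidate, so its unique minimizer is the conditional expectation; the assertion above checks this numerically on the toy pair.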

Key Findings

  • The latent variable $\gamma(t) z$ smooths intermediate densities, suppressing spurious modes (Figure 4)
  • The diffusion coefficient $\epsilon(t)$ affects sample trajectories but NOT the density $\rho(t)$; this decoupling is not present in standard diffusion models
  • For ODE-based generative models: learning $b$ alone is insufficient for likelihood control; one must also control the Fisher divergence (unlike SDE models)
  • Rectification (Section 5.3) produces straight-line flows but does NOT guarantee optimal transport (Remark 48): “straight line solutions is a necessary condition for optimal transport, but it is not sufficient”
  • Optimizing over gradient velocity fields and iterating rectification converges to Brenier’s polar decomposition (Remark 49, citing Liu 2022)
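The spurious-mode finding can be seen in a toy discrete example (an illustration, not from the paper): with two-point endpoints and an independent coupling, the $\gamma = 0$ linear interpolant develops an atom at $0$ at $t = 1/2$ that lies in neither endpoint support, while any $\gamma > 0$ spreads it into a smooth mixture.

```python
import itertools

# rho_0 and rho_1 each put mass on {-1, +1}; coupling is independent.
support_0 = [-1.0, 1.0]
support_1 = [-1.0, 1.0]

# gamma = 0 linear interpolant at t = 1/2: x_t = (x0 + x1) / 2
midpoints = {(x0 + x1) / 2.0 for x0, x1 in itertools.product(support_0, support_1)}
print(sorted(midpoints))  # -> [-1.0, 0.0, 1.0]; the atom at 0.0 is a spurious mode
```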

Important References

  1. Flow Matching for Generative Modeling — Concurrent work arriving at similar conditional objectives via a different derivation
  2. Score-Based Generative Modeling through Stochastic Differential Equations — Score-based diffusion recovered as one-sided interpolant
  3. Flow Straight and Fast — Rectified flow recovered as γ=0 linear interpolant; rectification analyzed in Section 5.3

Atomic Notes


paper