Stochastic Interpolants: A Unifying Framework for Flows and Diffusions

Abstract

A class of generative models that unifies flow-based and diffusion-based methods is introduced. These models extend the framework proposed in Albergo & Vanden-Eijnden (2023), enabling the use of a broad class of continuous-time stochastic processes called ‘stochastic interpolants’ to bridge any two arbitrary probability density functions exactly in finite time. These interpolants are built by combining data from the two prescribed densities with an additional latent variable that shapes the bridge in a flexible way.

Summary

The stochastic interpolant framework provides a mathematically rigorous unification of flow-based and diffusion-based generative models. A stochastic interpolant is defined as $x_{t} = I (t, x_{0}, x_{1}) + γ (t) z$ where $I$ interpolates between endpoints ( $I (0, x_{0}, x_{1}) = x_{0}$ , $I (1, x_{0}, x_{1}) = x_{1}$ ), $γ (t)$ controls latent noise ( $γ (0) = γ (1) = 0$ , $γ (t) > 0$ for $t \in (0, 1)$ ), $(x_{0}, x_{1}) \sim ν$ with marginals $ρ_{0}, ρ_{1}$ , and $z \sim N (0, I_{d})$ independent of the data. The density $ρ (t)$ of $x_{t}$ is proved to be absolutely continuous and strictly positive, satisfying both a transport equation $\partial_{t} ρ + \nabla \cdot (b ρ) = 0$ and forward/backward Fokker-Planck equations with tunable diffusion coefficient $ϵ (t)$ .
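As a rough illustration of the definition above, the following sketch draws samples of $x_{t}$ for one particular choice of the ingredients. The linear form $I (t, x_{0}, x_{1}) = (1 - t) x_{0} + t x_{1}$, the schedule $γ (t) = t (1 - t)$, the independent coupling, and the two toy Gaussian endpoint densities are all assumptions made for this example, not choices prescribed by the summary.

```python
import numpy as np

def gamma(t):
    # Assumed noise schedule: gamma(0) = gamma(1) = 0 and gamma(t) > 0 on (0, 1).
    return t * (1.0 - t)

def interpolant(t, x0, x1, z):
    # Assumed linear choice I(t, x0, x1) = (1 - t) * x0 + t * x1,
    # plus the latent term gamma(t) * z.
    return (1.0 - t) * x0 + t * x1 + gamma(t) * z

rng = np.random.default_rng(0)
n, d = 10_000, 2
x0 = rng.normal(size=(n, d))              # toy rho_0: standard Gaussian
x1 = 3.0 + 0.5 * rng.normal(size=(n, d))  # toy rho_1: shifted, narrower Gaussian
z = rng.normal(size=(n, d))               # latent variable, independent of (x0, x1)

x_start = interpolant(0.0, x0, x1, z)  # equals x0, since gamma(0) = 0
x_mid = interpolant(0.5, x0, x1, z)    # a sample from the intermediate density rho(0.5)
x_end = interpolant(1.0, x0, x1, z)    # equals x1, since gamma(1) = 0
```

Because $γ$ vanishes at both endpoints, the latent variable only affects the intermediate densities and leaves $ρ_{0}$ and $ρ_{1}$ untouched.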

The velocity $b (t, x) = E [\dot{x}_{t} \mid x_{t} = x]$ and score $s (t, x) = - γ^{- 1} (t) E [z \mid x_{t} = x]$ are characterized as unique minimizers of simple quadratic objectives (Theorems 7, 8). This yields three equivalent generative models: a probability flow ODE $\dot{X}_{t} = b (t, X_{t})$ , a forward SDE $d X_{t}^{F} = b_{F} (t, X_{t}^{F}) d t + \sqrt{2 ϵ (t)} d W_{t}$ with $b_{F} = b + ϵ s$ , and a backward SDE. Crucially, the density $ρ (t)$ is the same for all three: the interpolant construction is decoupled from the sampling process.
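The decoupling between the density and the sampler can be checked numerically in a toy case where $b$ and $s$ are known in closed form. The sketch below assumes one-dimensional Gaussian endpoints $ρ_{0} = N (0, 1)$ and $ρ_{1} = N (3, 0.5^{2})$, an independent coupling, a linear interpolant, and $γ (t) = t (1 - t)$; under these assumptions $x_{t}$ is Gaussian, so the conditional expectations reduce to linear regression formulas. The integrator is plain Euler (Euler-Maruyama when $ϵ > 0$) with a constant $ϵ$, chosen for brevity rather than accuracy.

```python
import numpy as np

M, SIGMA = 3.0, 0.5  # toy target rho_1 = N(M, SIGMA^2); base rho_0 = N(0, 1)

def var_t(t):
    # Var[x_t] for x_t = (1 - t) x0 + t x1 + gamma(t) z, gamma(t) = t (1 - t)
    return (1 - t) ** 2 + (t * SIGMA) ** 2 + (t * (1 - t)) ** 2

def velocity(t, x):
    # b(t, x) = E[dot x_t | x_t = x]; closed form because (dot x_t, x_t) is jointly Gaussian
    cov = -(1 - t) + t * SIGMA**2 + t * (1 - t) * (1 - 2 * t)  # Cov(dot x_t, x_t)
    return M + cov / var_t(t) * (x - t * M)

def score(t, x):
    # s(t, x) = grad log rho(t, x) for the Gaussian density N(t * M, var_t(t))
    return -(x - t * M) / var_t(t)

def sample(eps, n=20_000, steps=400, seed=0):
    # eps = 0 gives the probability flow ODE; eps > 0 gives the forward SDE
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)  # X_0 ~ rho_0
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        drift = velocity(t, x) + eps * score(t, x)  # b_F = b + eps * s
        x = x + drift * dt + np.sqrt(2 * eps * dt) * rng.normal(size=n)
    return x

x_ode = sample(eps=0.0)
x_sde = sample(eps=0.5)
print(x_ode.mean(), x_ode.std(), x_sde.mean(), x_sde.std())
```

In this toy setting both samplers should end near the mean and standard deviation of $ρ_{1}$, up to discretization and Monte Carlo error, even though the individual trajectories differ.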

The framework’s key advantages over your random bridge approach: (1) the latent variable $γ (t) z$ smooths the intermediate density, eliminating spurious modes that appear in pure linear interpolation; (2) the coupling $ν$ between $(x_{0}, x_{1})$ can be arbitrary (independent, OT, or data-adapted); (3) the Schrödinger bridge is recovered as the solution to a max-min problem over interpolants (Theorem 41, Section 3.4).

Key Contributions

General stochastic interpolant $x_{t} = I (t, x_{0}, x_{1}) + γ (t) z$ with rigorous density theory (Theorem 6)
Velocity and score as unique minimizers of quadratic objectives (Theorems 7, 8)
Simultaneous derivation of ODE and SDE generative models with tunable $ϵ (t)$
Likelihood control: SDE models bound KL divergence via velocity+score loss (Theorem 23); ODE models additionally require Fisher divergence control
Recovery of score-based diffusion as one-sided interpolant with time reparameterization (Section 5.1)
Recovery of rectified flow as $γ = 0$ linear interpolant (Section 5.3)
Recovery of Schrödinger bridge via optimization over interpolant function $I$ (Section 3.4, Theorem 41)
Remark 48 explicitly states: “straight line solutions is a necessary condition for optimal transport, but it is not sufficient”, confirming that rectified flow does not in general solve optimal transport

Methodology

The interpolant $x_{t} = I (t, x_{0}, x_{1}) + γ (t) z$ bridges $ρ_{0}$ and $ρ_{1}$ by construction. The velocity $b$ is learned by minimizing $L_{b} = \int_{0}^{1} E [\frac{1}{2} |\hat{b}|^{2} - (\partial_{t} I + \dot{γ} z) \cdot \hat{b}] d t$ . The score is learned via $L_{s} = \int_{0}^{1} E [\frac{1}{2} |\hat{s}|^{2} + γ^{- 1} z \cdot \hat{s}] d t$ . For the spatially linear case $x_{t}^{lin} = α (t) x_{0} + β (t) x_{1} + γ (t) z$ , the velocity factorizes as $b = \dot{α} η_{0} + \dot{β} η_{1} + \dot{γ} η_{z}$ where $η_{0} = E [x_{0} \mid x_{t}]$ , $η_{1} = E [x_{1} \mid x_{t}]$ , $η_{z} = E [z \mid x_{t}]$ .
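The velocity objective can be estimated by Monte Carlo, replacing the time integral with a uniform draw of $t$. The sketch below is a hedged illustration on an assumed toy problem (one-dimensional Gaussian endpoints, independent coupling, $α (t) = 1 - t$, $β (t) = t$, $γ (t) = t (1 - t)$), where the exact $b$ is available in closed form. It compares the loss of the exact velocity with that of a deliberately shifted candidate; the function names are hypothetical.

```python
import numpy as np

M, SIGMA = 3.0, 0.5  # toy setting: rho_0 = N(0, 1), rho_1 = N(M, SIGMA^2)

def exact_velocity(t, x):
    # Closed-form b(t, x) = E[dot x_t | x_t = x] for the Gaussian toy problem
    var = (1 - t) ** 2 + (t * SIGMA) ** 2 + (t * (1 - t)) ** 2
    cov = -(1 - t) + t * SIGMA**2 + t * (1 - t) * (1 - 2 * t)
    return M + cov / var * (x - t * M)

def velocity_loss(b_hat, n=200_000, seed=0):
    # Monte Carlo estimate of L_b; t ~ U(0, 1) stands in for the time integral
    rng = np.random.default_rng(seed)
    t = rng.uniform(size=n)
    x0 = rng.normal(size=n)
    x1 = M + SIGMA * rng.normal(size=n)
    z = rng.normal(size=n)
    x_t = (1 - t) * x0 + t * x1 + t * (1 - t) * z
    target = (x1 - x0) + (1 - 2 * t) * z  # partial_t I + dot gamma * z
    pred = b_hat(t, x_t)
    return np.mean(0.5 * pred**2 - target * pred)

loss_exact = velocity_loss(exact_velocity)
loss_shifted = velocity_loss(lambda t, x: exact_velocity(t, x) + 0.5)
print(loss_exact, loss_shifted, loss_shifted - loss_exact)
```

Since the objective is quadratic, the gap between a candidate $\hat{b}$ and the minimizer equals $\frac{1}{2} \int_{0}^{1} E [|\hat{b} - b|^{2}] d t$, which is $0.125$ for a constant shift of $0.5$; the printed difference should be close to that value. In practice $\hat{b}$ would be a parametric model trained by stochastic gradient descent on the same estimate.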

Key Findings

The latent variable $γ (t) z$ smooths intermediate densities, suppressing spurious modes (Figure 4)
The diffusion coefficient $ϵ (t)$ affects sample trajectories but NOT the density $ρ (t)$ — a decoupling not present in standard diffusion models
For ODE-based generative models: learning $b$ alone is insufficient for likelihood control; one must also control the Fisher divergence (unlike SDE models)
Rectification (Section 5.3) produces straight-line flows but does NOT guarantee optimal transport (Remark 48): “straight line solutions is a necessary condition for optimal transport, but it is not sufficient”
Optimizing over gradient velocity fields and iterating rectification converges to Brenier’s polar decomposition (Remark 49, citing Liu 2022)

Important References

Flow Matching for Generative Modeling — Concurrent work arriving at similar conditional objectives via a different derivation
Score-Based Generative Modeling through Stochastic Differential Equations — Score-based diffusion recovered as one-sided interpolant
Flow Straight and Fast — Rectified flow recovered as γ=0 linear interpolant; rectification analyzed in Section 5.3

Atomic Notes

paper
