Abstract
We introduce a new paradigm for generative modeling built on Continuous Normalizing Flows (CNFs), allowing us to train CNFs at unprecedented scale. Specifically, we present the notion of Flow Matching (FM), a simulation-free approach for training CNFs based on regressing vector fields of fixed conditional probability paths. Flow Matching is compatible with a general family of Gaussian probability paths for transforming between noise and data samples — which subsumes existing diffusion paths as special instances. Furthermore, Flow Matching opens the door to training CNFs with other, non-diffusion probability paths. An instance of particular interest is using Optimal Transport (OT) displacement interpolation to define the conditional probability paths.
Summary
Flow Matching introduces a simulation-free framework for training Continuous Normalizing Flows by directly regressing the velocity field of a target probability path. The key insight is that the intractable marginal Flow Matching objective can be replaced by the tractable Conditional Flow Matching (CFM) objective $\mathcal{L}_{\mathrm{CFM}}(\theta) = \mathbb{E}_{t,\, q(x_1),\, p_t(x|x_1)} \|v_t(x; \theta) - u_t(x|x_1)\|^2$, which has identical gradients (Theorem 2). This allows training without ODE simulation, using only per-sample conditional probability paths.
The paper defines a general family of Gaussian conditional probability paths $p_t(x|x_1) = \mathcal{N}(x \mid \mu_t(x_1), \sigma_t(x_1)^2 I)$ with corresponding conditional velocity fields $u_t(x|x_1) = \frac{\sigma_t'(x_1)}{\sigma_t(x_1)}(x - \mu_t(x_1)) + \mu_t'(x_1)$ (Theorem 3). Two key instances are: (1) diffusion paths recovering the VE- and VP-SDE probability paths; (2) OT paths with $\mu_t(x_1) = t x_1$ and $\sigma_t(x_1) = 1 - (1 - \sigma_{\min}) t$, producing straight-line trajectories with constant-direction velocity fields. The OT conditional velocity is time-constant in direction, making it simpler to learn than the time-varying diffusion score function.
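As a concrete check, a minimal numpy sketch (the helper names and the default `sigma_min` value are ours, not the paper's): the OT conditional flow is a straight line, so its finite-difference velocity is the same at every time.

```python
import numpy as np

def psi_ot(x0, x1, t, sigma_min=1e-4):
    """OT conditional flow map: psi_t(x0) = (1 - (1 - sigma_min) t) x0 + t x1."""
    return (1.0 - (1.0 - sigma_min) * t) * x0 + t * x1

def u_ot(x0, x1, sigma_min=1e-4):
    """Velocity along the OT path: d/dt psi_t(x0) = x1 - (1 - sigma_min) x0."""
    return x1 - (1.0 - sigma_min) * x0

rng = np.random.default_rng(0)
x0, x1 = rng.normal(size=3), rng.normal(size=3)

# Central finite differences at several times: identical everywhere (straight line).
h = 1e-6
for t in (0.1, 0.5, 0.9):
    fd = (psi_ot(x0, x1, t + h) - psi_ot(x0, x1, t - h)) / (2 * h)
    assert np.allclose(fd, u_ot(x0, x1), atol=1e-5)
```

With `sigma_min = 0` the path degenerates to plain linear interpolation between noise and data.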
FM with OT paths achieves state-of-the-art results: 2.99 bits/dim NLL on CIFAR-10, FID 5.02 on ImageNet-32, and FID 20.9 on ImageNet-128, outperforming score matching and DDPM while requiring ~60% fewer function evaluations.
Key Contributions
- Conditional Flow Matching objective providing unbiased, simulation-free training of CNFs (Theorem 2)
- General family of Gaussian conditional probability paths parameterized by a time-dependent mean $\mu_t(x_1)$ and standard deviation $\sigma_t(x_1)$ (Theorem 3)
- OT displacement interpolation as a conditional probability path, producing straight trajectories
- Subsumes diffusion paths (VE, VP) as special cases while enabling non-diffusion paths
- Empirical demonstration that OT paths are faster to train, faster to sample, and produce better samples than diffusion paths
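The Theorem 3 velocity formula, $u_t(x|x_1) = \frac{\sigma_t'(x_1)}{\sigma_t(x_1)}(x - \mu_t(x_1)) + \mu_t'(x_1)$, can be sanity-checked numerically. A hedged numpy sketch (function names are ours): instantiated with the OT schedule and evaluated on the conditional path, it reduces to the constant $x_1 - (1 - \sigma_{\min}) x_0$.

```python
import numpy as np

SIGMA_MIN = 1e-4

# OT schedule: mu_t(x1) = t * x1, sigma_t = 1 - (1 - sigma_min) * t.
def mu(t, x1):    return t * x1
def dmu(t, x1):   return x1
def sigma(t):     return 1.0 - (1.0 - SIGMA_MIN) * t
def dsigma(t):    return -(1.0 - SIGMA_MIN)

def u_theorem3(x, t, x1):
    """Theorem 3: u_t(x|x1) = (sigma_t'/sigma_t) (x - mu_t) + mu_t'."""
    return dsigma(t) / sigma(t) * (x - mu(t, x1)) + dmu(t, x1)

rng = np.random.default_rng(1)
x0, x1 = rng.normal(size=4), rng.normal(size=4)
for t in (0.2, 0.7):
    x_t = sigma(t) * x0 + mu(t, x1)          # point on the conditional path
    expected = x1 - (1.0 - SIGMA_MIN) * x0   # constant OT conditional velocity
    assert np.allclose(u_theorem3(x_t, t, x1), expected)
```

Swapping in a diffusion schedule for `mu`/`sigma` recovers the corresponding VE/VP conditional field from the same formula.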
Methodology
Training minimizes $\mathcal{L}_{\mathrm{CFM}}(\theta) = \mathbb{E}_{t,\, q(x_1),\, p(x_0)} \big\| v_t(\psi_t(x_0); \theta) - \tfrac{d}{dt}\psi_t(x_0) \big\|^2$, where $\psi_t(x) = \sigma_t(x_1) x + \mu_t(x_1)$ is the conditional flow map. For OT paths: $\psi_t(x_0) = (1 - (1 - \sigma_{\min}) t) x_0 + t x_1$, giving loss $\mathbb{E} \| v_t(\psi_t(x_0); \theta) - (x_1 - (1 - \sigma_{\min}) x_0) \|^2$. Sampling uses an adaptive ODE solver (dopri5) with absolute and relative tolerances of 1e-5.
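A minimal Monte-Carlo sketch of the CFM-OT loss in numpy (all names are ours; `model` stands in for the learned network). Note that no ODE is simulated during training, only interpolation between a noise sample and a data sample; an oracle that outputs the true conditional velocity drives the loss to zero.

```python
import numpy as np

SIGMA_MIN = 1e-4

def cfm_ot_loss(model, x1_batch, rng):
    """E_{t, x0} || model(psi_t(x0), t) - (x1 - (1 - sigma_min) x0) ||^2."""
    n, d = x1_batch.shape
    t = rng.uniform(size=(n, 1))
    x0 = rng.normal(size=(n, d))
    x_t = (1.0 - (1.0 - SIGMA_MIN) * t) * x0 + t * x1_batch   # psi_t(x0)
    target = x1_batch - (1.0 - SIGMA_MIN) * x0                # conditional velocity
    return float(np.mean(np.sum((model(x_t, t) - target) ** 2, axis=1)))

rng = np.random.default_rng(2)
x1_batch = rng.normal(size=(16, 2)) + 3.0   # toy "data"

# Oracle: inverting psi_t recovers x0 from (x_t, t) when x1 is known, so the
# true conditional velocity can be reproduced exactly; this hard-wires the
# per-row pairing only to show the loss vanishes at the conditional optimum.
def oracle(x_t, t):
    x0 = (x_t - t * x1_batch) / (1.0 - (1.0 - SIGMA_MIN) * t)
    return x1_batch - (1.0 - SIGMA_MIN) * x0

assert cfm_ot_loss(oracle, x1_batch, rng) < 1e-12
```

In practice `model` would be a neural network trained by stochastic gradient descent on this loss; the point here is only that the regression target is available in closed form per sample.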
Key Findings
- OT paths produce constant-direction conditional VFs — simpler regression target than time-varying diffusion scores
- FM with diffusion paths trains more stably than score matching, even though the two objectives have identical gradients
- OT paths require ~60% fewer NFEs to reach the same error threshold as diffusion paths
- The conditional OT flow is the displacement map between two Gaussians — optimal per-sample but the marginal VF is NOT the global OT map
Important References
- Score-Based Generative Modeling through Stochastic Differential Equations — SDE framework whose probability paths are subsumed as special cases
- Flow Straight and Fast — Concurrent work (rectified flow) arriving at similar linear interpolation objectives
- Building Normalizing Flows with Stochastic Interpolants — Concurrent work by Albergo & Vanden-Eijnden with a similar framework