W∞ distance tracking is a technique for bounding the distance between adjacent PNSGD (projected noisy SGD) learning processes, which is crucial for establishing tight certified unlearning guarantees. The W∞ (infinite-order Wasserstein) distance between distributions μ and ν is defined as W∞(μ,ν) = inf_{γ∈Γ(μ,ν)} ess sup_{(x,y)~γ} ||x−y||, i.e., the smallest worst-case displacement over all couplings γ of μ and ν.

The key result is that for adjacent datasets D, D′ differing in one point belonging to mini-batch B^{j₀}, the W∞ distance between their stationary distributions satisfies W∞(ν_{D|B}, ν_{D′|B}) ≤ min(2ηM/(b(1−c^{n/b})), 2R), where η is the step size, c = 1 − ηm with m the strong-convexity constant, M is the gradient difference bound, b is the mini-batch size, n is the dataset size, and R is the radius of the projection set.
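A minimal sketch of this bound as a function (parameter values in the example are illustrative, not from the source):

```python
def w_inf_bound(eta, M, m, b, n, R):
    """Closed-form W-infinity upper bound between stationary distributions
    of PNSGD on adjacent datasets differing in one point.

    eta: step size, M: gradient-difference bound, m: strong-convexity
    constant, b: mini-batch size, n: dataset size, R: projection radius.
    Requires c = 1 - eta*m in (0, 1) for the contraction argument to apply.
    """
    c = 1.0 - eta * m
    assert 0.0 < c < 1.0, "requires eta*m in (0, 1)"
    tight = 2.0 * eta * M / (b * (1.0 - c ** (n / b)))  # geometric-series bound
    naive = 2.0 * R                                     # projection-set diameter
    return min(tight, naive)

# Illustrative numbers (hypothetical):
print(w_inf_bound(eta=0.01, M=1.0, m=1.0, b=32, n=1024, R=1.0))
```

For small η the tight branch is far below the naive 2R diameter, which is what makes the resulting unlearning guarantees non-vacuous.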

Key Details

  • Naive bound: W∞ ≤ 2R (projection set diameter) — vacuous for unlearning
  • Tight bound: Z_B = O(ηM/b) when c^{n/b} is small — dramatically tighter
  • Per-iteration tracking: Within each epoch, only the mini-batch containing the differing point causes divergence of 2ηM/b
  • Cross-epoch contraction: The gradient map contracts by factor c = 1-ηm each iteration under strong convexity
  • Sequential unlearning: W∞ is a metric, so triangle inequality gives linear accumulation across multiple unlearning requests
  • Converts to Rényi divergence bounds via privacy amplification by iteration (Lemma 2.6)

concept