The shifted Renyi divergence is a hybrid distance measure between probability distributions introduced by Feldman et al. (2018) in Privacy Amplification by Iteration. It interpolates between the infinity-Wasserstein distance W_infinity (a metric notion of distance) and the standard Renyi divergence D_alpha (an information-theoretic divergence). This interpolation is the central technical innovation enabling the proof of privacy amplification by iteration for contractive noisy iterations.

For distributions mu and nu on a Banach space (Z, || . ||), and parameters z >= 0 and alpha >= 1, the z-shifted Renyi divergence of order alpha is defined as:

D_alpha^{(z)}(mu || nu) := inf_{mu': W_infinity(mu, mu') <= z} D_alpha(mu' || nu)

That is, it is the smallest Renyi divergence of order alpha, D_alpha(mu' || nu), over all distributions mu' within infinity-Wasserstein distance z of mu. At z = 0, it reduces to the standard Renyi divergence D_alpha(mu || nu). As z increases, the shifted divergence can only decrease, since the infimum runs over a larger set (monotonicity: for 0 <= z <= z', D_alpha^{(z')}(mu || nu) <= D_alpha^{(z)}(mu || nu)). The "shifting" property states that for any deterministic shift x, D_alpha^{(||x||)}(mu * x || nu) <= D_alpha(mu || nu), where mu * x denotes the distribution of U + x for U ~ mu.
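For equal-variance Gaussians on the real line, the Renyi divergence has the closed form D_alpha(N(m1, sigma^2) || N(m2, sigma^2)) = alpha * (m1 - m2)^2 / (2 * sigma^2), which makes these properties easy to check numerically. The sketch below uses illustrative function names not from the paper; it only upper-bounds the shifted divergence, by restricting the infimum to mean-shift candidates mu' = N(m1 - t, sigma^2) with |t| <= z:

```python
def renyi_gauss(alpha, m1, m2, sigma):
    # Closed-form Renyi divergence between N(m1, sigma^2) and N(m2, sigma^2).
    return alpha * (m1 - m2) ** 2 / (2 * sigma ** 2)

def shifted_renyi_gauss_ub(alpha, m1, m2, sigma, z):
    # Upper bound on D_alpha^{(z)}(N(m1, sigma^2) || N(m2, sigma^2)):
    # restrict the infimum to mean shifts mu' = N(m1 - t, sigma^2) with
    # |t| <= z (the W_infinity distance between two Gaussians that differ
    # only in mean equals the mean gap), then optimize over t.
    gap = max(abs(m1 - m2) - z, 0.0)
    return alpha * gap ** 2 / (2 * sigma ** 2)

alpha, sigma = 2.0, 1.0

# Monotonicity: the bound is non-increasing as the shift budget z grows.
bounds = [shifted_renyi_gauss_ub(alpha, 3.0, 0.0, sigma, z) for z in range(5)]
assert all(b1 >= b2 for b1, b2 in zip(bounds, bounds[1:]))

# Shifting property: translating mu = N(1, 1) by x = 2 and allowing shift
# budget |x| does not exceed the unshifted divergence D_alpha(mu || nu).
assert shifted_renyi_gauss_ub(alpha, 1.0 + 2.0, 0.0, sigma, 2.0) \
    <= renyi_gauss(alpha, 1.0, 0.0, sigma)
```

The shifting-property check is exactly the witness argument: undoing the translation recovers mu itself, which lies within W_infinity distance ||x|| of mu * x, so the infimum is at most D_alpha(mu || nu).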

The shifted Renyi divergence plays a critical role in the proofs of Koloskova et al.'s (2025) Certified Unlearning for Neural Networks, particularly in the analysis of gradient clipping for unlearning. The key mechanism is that the shift parameter z tracks the accumulated "metric distance" between processes that has not yet been converted into information-theoretic divergence by noise addition.

Key Details

  • Definition (Feldman et al., Definition 8): D_alpha^{(z)}(mu || nu) = inf_{mu': W_infinity(mu, mu') <= z} D_alpha(mu' || nu).
  • Noise magnitude function: For a noise distribution zeta on a Banach space, R_alpha(zeta, a) = sup_{||x|| <= a} D_alpha(zeta * x || zeta). For Gaussian noise N(0, sigma^2 I_d) on R^d, R_alpha(N(0, sigma^2 I_d), a) = alpha * a^2 / (2 * sigma^2).
  • Shift-reduction lemma (Lemma 20): D_alpha^{(z)}(mu * zeta || nu * zeta) <= D_alpha^{(z+a)}(mu || nu) + R_alpha(zeta, a) for any a >= 0. Adding noise converts metric distance (shift) into information-theoretic divergence.
  • Contraction lemma (Lemma 21): For contractive maps psi, psi' with sup_x ||psi(x) - psi'(x)|| <= s, D_alpha^{(z+s)}(psi(X) || psi'(X')) <= D_alpha^{(z)}(X || X'). Contractive maps cannot increase the shifted divergence.
  • These two lemmas combine inductively to prove the main amplification theorem (Theorem 22) for Contractive Noisy Iterations.
  • Balle et al. (2019) in Privacy Amplification by Mixing and Diffusion Mechanisms provide a measure-theoretic generalization via explicit couplings (Theorem 2), replacing the W_infinity-based definition with transport operators.
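To see how the contraction and shift-reduction lemmas combine, the following bookkeeping sketch (assuming Gaussian noise N(0, sigma^2 I) and a hypothetical function name) tracks the shift z and the accumulated divergence across T iterations of two contractive noisy iterations whose update maps differ only at step t0, by at most s in norm. Spreading the shift reduction evenly over the remaining steps recovers the alpha * s^2 / (2 * sigma^2 * (T - t0 + 1)) amplification bound of Theorem 22:

```python
def cni_divergence_bound(alpha, sigma, s, t0, T):
    # Inductive bookkeeping behind privacy amplification by iteration.
    # Each iteration first applies the contraction lemma (any per-step
    # drift between the two update maps is added to the shift z), then the
    # shift-reduction lemma with a chosen a_t (z drops by a_t while the
    # divergence grows by the Gaussian term alpha * a_t^2 / (2 * sigma^2)).
    z, d = 0.0, 0.0
    a = s / (T - t0 + 1)       # spread the shift evenly over remaining steps
    for t in range(1, T + 1):
        drift = s if t == t0 else 0.0
        z += drift             # contraction lemma: shift absorbs the drift
        reduce = min(a, z) if t >= t0 else 0.0
        z -= reduce            # shift-reduction lemma: shift -> divergence
        d += alpha * reduce ** 2 / (2 * sigma ** 2)
    assert abs(z) < 1e-9       # shift fully converted by the final step
    return d                   # bound on D_alpha(X_T || X_T')
```

The later the differing step t0, the fewer noise additions remain to absorb the shift, so the bound degrades as t0 approaches T; this is the amplification-by-iteration phenomenon.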

concept