Projected noisy gradient descent (PNGD) and its stochastic variant PNSGD are optimization algorithms that add Gaussian noise to gradient updates and project onto a bounded convex set. These algorithms are central to both differential privacy and certified machine unlearning.

The PNSGD update at epoch t, mini-batch j is: x_t^{j+1} = Π_{C_R}(x_t^j - η g(x_t^j, B^j) + √(2ησ²) W_t^j), where g(·, B^j) is the mini-batch gradient on the j-th mini-batch B^j, η is the step size, σ is the noise standard deviation, W_t^j is a standard Gaussian vector, and Π_{C_R} is the Euclidean projection onto the ball C_R of radius R.
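A minimal NumPy sketch of one such update (the caller supplies the mini-batch gradient; the function names `project_ball` and `pnsgd_step` are illustrative, not from the source):

```python
import numpy as np

def project_ball(x, R):
    """Euclidean projection Pi_{C_R}: scale x back onto the ball if ||x|| > R."""
    norm = np.linalg.norm(x)
    return x if norm <= R else (R / norm) * x

def pnsgd_step(x, grad, eta, sigma, R, rng):
    """One PNSGD update: gradient step, Gaussian noise sqrt(2*eta*sigma^2) * W,
    then projection onto the radius-R ball. `grad` is g(x, B^j) for the
    current mini-batch, evaluated by the caller."""
    noise = np.sqrt(2.0 * eta * sigma**2) * rng.standard_normal(x.shape)
    return project_ball(x - eta * grad + noise, R)
```

With σ = 0 this reduces to ordinary projected gradient descent, which makes the projection easy to check in isolation.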

Algorithm

  1. Initialize x_0^0 on C_R = {x ∈ ℝ^d : ||x|| ≤ R}
  2. For each epoch t = 0, …, T-1:
    • For each mini-batch j = 0, …, n/b-1:
      • Compute the mini-batch gradient g(x_t^j, B^j)
      • Take a noisy gradient step: y = x_t^j - η g(x_t^j, B^j) + √(2ησ²) W_t^j
      • Project: x_t^{j+1} = Π_{C_R}(y)
    • Set x_{t+1}^0 = x_t^{n/b}
  3. Output x_T^0
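The full loop above can be sketched as follows. The function signature, the fixed (unshuffled) batch order, and the toy least-squares gradient in the usage note are assumptions for illustration; the source does not fix these details.

```python
import numpy as np

def project_ball(x, R):
    """Euclidean projection onto C_R = {x : ||x|| <= R}."""
    n = np.linalg.norm(x)
    return x if n <= R else (R / n) * x

def pnsgd(data, grad_fn, d, R, eta, sigma, T, b, seed=0):
    """Run T epochs of PNSGD with mini-batch size b over n = len(data) points.

    grad_fn(x, batch) returns the mini-batch gradient g(x, B^j).
    Batches are taken in a fixed order each epoch; shuffling is a common variant.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    x = project_ball(rng.standard_normal(d), R)      # x_0^0 on C_R
    for t in range(T):                               # epochs t = 0..T-1
        for j in range(n // b):                      # mini-batches j = 0..n/b-1
            batch = data[j * b:(j + 1) * b]
            g = grad_fn(x, batch)
            noise = np.sqrt(2.0 * eta * sigma**2) * rng.standard_normal(d)
            x = project_ball(x - eta * g + noise, R)
    return x                                         # x_T^0
```

For example, with rows of `data` holding features and a final target column, `grad_fn` could be the mean-squared-error gradient `A.T @ (A @ x - y) / len(batch)`.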

Key Properties

  • Converges to a unique stationary distribution ν_{D|B} for each dataset-batch sequence pair
  • The noiseless gradient map x ↦ x - η∇f(x) is (1-ηm)-contractive when f is m-strongly convex and L-smooth and η ≤ 1/L
  • Projection does not increase Rényi divergence (data processing inequality)
  • Noise σ controls privacy-utility trade-off: larger σ → stronger privacy, worse utility
  • Mini-batch size b controls privacy-complexity trade-off: smaller b → better privacy decay, more iterations per epoch
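The contraction and projection properties above can be verified numerically on a quadratic f(x) = ½ xᵀHx whose Hessian eigenvalues lie in [m, L] (the specific constants and helper names below are illustrative):

```python
import numpy as np

m, L, eta = 0.5, 2.0, 0.4          # strong convexity m, smoothness L, eta <= 1/L
H = np.diag([m, L])                 # Hessian of f(x) = 0.5 * x^T H x

def grad_step(x):
    """One noiseless gradient step x - eta * grad f(x)."""
    return x - eta * (H @ x)

def project_ball(x, R):
    """Euclidean projection onto the radius-R ball; 1-Lipschitz, so it
    never increases the distance between two iterates."""
    n = np.linalg.norm(x)
    return x if n <= R else (R / n) * x

def contraction_ratio(x, y):
    """||T(x) - T(y)|| / ||x - y|| for the gradient map T; should be
    at most 1 - eta * m under the assumptions above."""
    return np.linalg.norm(grad_step(x) - grad_step(y)) / np.linalg.norm(x - y)
```

The non-expansiveness of the projection is what lets the data processing inequality pass Rényi divergence bounds through the projection step unchanged.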
