Summary

This paper bridges the gap between certified unlearning — which provides formal (epsilon, delta) indistinguishability guarantees between the unlearned model and a fully retrained model — and deep neural networks whose loss landscapes are inherently nonconvex. Prior certified unlearning methods relied on strong convexity assumptions that do not hold for DNNs.

The authors decompose certified unlearning into two steps: (1) estimating the retrained model via a modified Newton update, and (2) adding calibrated Gaussian noise proportional to the approximation error bound to achieve the certification budget. Two key techniques make this practical for nonconvex objectives: local convex approximation (adding an l2 regularization term to make the Hessian positive definite locally) and a weight norm constraint during training (projected gradient descent with bound C on parameter norms), which together yield a tractable approximation error bound independent of the unlearned set size.
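The weight norm constraint amounts to projected gradient descent onto an l2 ball of radius C. A minimal NumPy sketch (function names are illustrative, not from the paper):

```python
import numpy as np

def project_to_ball(w, C):
    """Project parameters onto the l2 ball of radius C."""
    norm = np.linalg.norm(w)
    return w if norm <= C else w * (C / norm)

def constrained_gd_step(w, grad, lr, C):
    """One projected gradient descent step with weight norm bound C."""
    return project_to_ball(w - lr * grad, C)
```

Bounding ||w|| by C is what keeps the Hessian- and gradient-related quantities in the approximation error bound finite without any convexity assumption.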

For computational efficiency, the authors adopt the LiSSA (Linear time Stochastic Second-order Algorithm) method to approximate the inverse Hessian, reducing complexity from O(np^2 + p^3) to O(sp^2 + n_u p) where s is the number of recursion samples. They further extend the certification analysis to nonconvergence training (where w* is not the exact minimum) and sequential unlearning (multiple successive unlearning requests), proving that the same approximation error bounds hold in both settings. Experiments on MNIST, CIFAR-10, and SVHN with MLP, AllCNN, and ResNet18 architectures show that their method achieves over 10x speedup compared to retraining while maintaining competitive F1-scores and membership inference attack resistance.
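The LiSSA idea can be sketched with the standard Neumann-series recursion, using only Hessian-vector products (here `hvp_fn` stands in for the paper's stochastic mini-batch HVPs; `scale` and `damping` are the usual hyperparameters that make the recursion a contraction):

```python
import numpy as np

def lissa_inverse_hvp(hvp_fn, v, num_iters=300, scale=10.0, damping=0.01):
    """Approximate (H + damping*I)^{-1} v via the LiSSA recursion
        u_{j+1} = v + (I - A) u_j,   A = (H + damping*I) / scale,
    which converges to A^{-1} v when the scaled Hessian has
    spectral norm below 1; dividing by `scale` at the end recovers
    (H + damping*I)^{-1} v."""
    u = v.copy()
    for _ in range(num_iters):
        u = v + u - (hvp_fn(u) + damping * u) / scale
    return u / scale
```

Each iteration needs one Hessian-vector product (O(p) with backprop-through-gradients for a mini-batch), which is where the O(sp^2 + n_u p) overall cost comes from; the full Hessian is never materialized.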

Key Contributions

  • Extension of certified unlearning from convex to nonconvex objectives via local convex approximation and weight norm constraints, without requiring convexity of the loss function
  • Efficient inverse Hessian approximation using LiSSA that achieves 470x speedup over exact Hessian computation while preserving certification guarantees
  • Theoretical analysis of certified unlearning under nonconvergence training and sequential unlearning settings with maintained approximation error bounds
  • Comprehensive experimental validation across three datasets and three DNN architectures demonstrating practical effectiveness

Methodology

The core framework follows a two-step process based on Theorem 2.3 (connecting certified unlearning to differential privacy):

  1. Approximation step: estimate the retrained model via a Newton-like update: w_tilde = w* + (n_u / (n - n_u)) * H_tilde_inv * grad_L(w*, D_u), where H_tilde_inv is the LiSSA-approximated inverse of the locally convexified Hessian (lambda * I added so all eigenvalues are positive)
  2. Noise addition step: Add Gaussian noise Y ~ N(0, sigma^2 I) calibrated to the approximation error bound Delta to achieve (epsilon, delta)-certified unlearning
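The two steps compose as follows. This is a minimal sketch, assuming an `inv_hvp_fn` (e.g. a LiSSA approximation) is supplied; the variable names mirror the update above but are otherwise illustrative:

```python
import numpy as np

def certified_unlearn(w_star, grad_Du, inv_hvp_fn, n, n_u, sigma, rng=None):
    """Two-step certified unlearning sketch:
       1) Newton-like update toward the retrained model,
       2) Gaussian noise calibrated to the approximation error bound."""
    rng = np.random.default_rng() if rng is None else rng
    # Step 1: w_tilde = w* + n_u/(n - n_u) * H_tilde_inv * grad_L(w*, D_u)
    w_tilde = w_star + (n_u / (n - n_u)) * inv_hvp_fn(grad_Du)
    # Step 2: add isotropic Gaussian noise Y ~ N(0, sigma^2 I)
    return w_tilde + rng.normal(0.0, sigma, size=w_tilde.shape)
```

Only the gradient over the unlearned set D_u and inverse-HVPs are needed, so the cost is independent of retraining from scratch on the retained data.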

Key technical components:

  • Local convex approximation: add lambda * I to the Hessian (equivalently, an l2 regularization term) so it is positive definite in a neighborhood of w*
  • Weight norm constraint: train with projected gradient descent keeping ||w|| <= C, which makes the approximation error bound tractable and independent of the unlearned set size
  • LiSSA inverse-Hessian approximation: replaces the O(np^2 + p^3) exact computation with an O(sp^2 + n_u p) stochastic recursion over Hessian-vector products

Key Findings

  • Certified unlearning achieves the closest F1-scores to the retrain-from-scratch baseline on the unlearned set across most experimental configurations
  • The method provides 10x+ speedup over full retraining on DNNs
  • Local convex approximation substantially tightens the approximation error bound and slightly reduces the empirical approximation error
  • The weight norm constraint yields much lower approximation error compared to unconstrained training
  • Under sequential unlearning, the gradient norm and approximation error bound decrease at each step as parameters approach a local optimum
  • A tradeoff exists between certification budget tightness and model utility: tighter budgets require more noise, degrading performance
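The budget/utility tradeoff is visible in the standard Gaussian-mechanism calibration, where the noise scale grows as epsilon shrinks. The constant below is the textbook one (Dwork and Roth); the paper's exact calibration of sigma to its bound Delta may differ:

```python
import math

def gaussian_sigma(Delta, epsilon, delta):
    """Standard Gaussian-mechanism noise scale for an l2 sensitivity
    bound Delta; tighter budgets (smaller epsilon, delta) need larger sigma."""
    return Delta * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
```

Halving epsilon doubles sigma, which is exactly the utility degradation the findings describe.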

Important References

Atomic Notes


paper