Abstract
In the field of machine unlearning, certified unlearning has been extensively studied in convex machine learning models due to its high efficiency and strong theoretical guarantees. However, its application to deep neural networks (DNNs), known for their highly nonconvex nature, still poses challenges. To bridge the gap between certified unlearning and DNNs, we propose several simple techniques to extend certified unlearning methods to nonconvex objectives. To reduce the time complexity, we develop an efficient computation method by inverse Hessian approximation without compromising certification guarantees. In addition, we extend our discussion of certification to nonconvergence training and sequential unlearning, considering that real-world users can send unlearning requests at different time points. Extensive experiments on three real-world datasets demonstrate the efficacy of our method and the advantages of certified unlearning in DNNs.
Summary
This paper bridges the gap between certified unlearning — which provides formal (epsilon, delta) indistinguishability guarantees between the unlearned model and a fully retrained model — and deep neural networks whose loss landscapes are inherently nonconvex. Prior certified unlearning methods relied on strong convexity assumptions that do not hold for DNNs.
The authors decompose certified unlearning into two steps: (1) estimating the retrained model via a modified Newton update, and (2) adding calibrated Gaussian noise proportional to the approximation error bound to achieve the certification budget. Two key techniques make this practical for nonconvex objectives: local convex approximation (adding an l2 regularization term to make the Hessian positive definite locally) and a weight norm constraint during training (projected gradient descent with bound C on parameter norms), which together yield a tractable approximation error bound independent of the unlearned set size.
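The weight norm constraint amounts to projected gradient descent onto an l2 ball of radius C. A minimal numpy sketch of that projection step (function names are mine, not the paper's):

```python
import numpy as np

def project_l2(w, C):
    """Project parameters back onto the l2 ball of radius C, so ||w|| <= C."""
    norm = np.linalg.norm(w)
    return w if norm <= C else w * (C / norm)

def pgd_step(w, grad, lr, C):
    """One projected-gradient-descent step under the weight norm constraint."""
    return project_l2(w - lr * grad, C)
```

Keeping every iterate inside the ball is what makes the gradient and Hessian terms in the error bound uniformly controllable.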
For computational efficiency, the authors adopt the LiSSA (Linear time Stochastic Second-order Algorithm) method to approximate the inverse Hessian, reducing complexity from O(np^2 + p^3) to O(sp^2 + n_u p) where s is the number of recursion samples. They further extend the certification analysis to nonconvergence training (where w* is not the exact minimum) and sequential unlearning (multiple successive unlearning requests), proving that the same approximation error bounds hold in both settings. Experiments on MNIST, CIFAR-10, and SVHN with MLP, AllCNN, and ResNet18 architectures show that their method achieves over 10x speedup compared to retraining while maintaining competitive F1-scores and membership inference attack resistance.
Key Contributions
- Extension of certified unlearning from convex to nonconvex objectives via local convex approximation and weight norm constraints, without requiring convexity of the loss function
- Efficient inverse Hessian approximation using LiSSA that achieves 470x speedup over exact Hessian computation while preserving certification guarantees
- Theoretical analysis of certified unlearning under nonconvergence training and sequential unlearning settings with maintained approximation error bounds
- Comprehensive experimental validation across three datasets and three DNN architectures demonstrating practical effectiveness
Methodology
The core framework follows a two-step process based on Theorem 2.3 (connecting certified unlearning to differential privacy):
- Approximation step: Estimate the retrained model via a Newton-like update: w_tilde = w* + (n_u / (n - n_u)) * H_tilde_inv * grad_L(w*, D_u), where H_tilde_inv is the LiSSA-approximated inverse Hessian with local convex regularization (lambda * I added to make eigenvalues positive)
- Noise addition step: Add Gaussian noise Y ~ N(0, sigma^2 I) calibrated to the approximation error bound Delta to achieve (epsilon, delta)-certified unlearning
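The two steps above can be sketched on a toy scale as follows; the explicit Hessian solve and the function names are illustrative only (the paper replaces the exact inverse with LiSSA, and sigma is calibrated from the error bound Delta and the budget (epsilon, delta)):

```python
import numpy as np

def certified_unlearn(w_star, grad_u, hessian, n, n_u, lam, sigma, rng):
    """Sketch of the two-step certified unlearning update.

    Step 1 (approximation): Newton-like step toward the retrained model,
    using the regularized Hessian (H + lam*I) of the local convex
    approximation so the solve is well-posed.
    Step 2 (noise addition): add Gaussian noise N(0, sigma^2 I), with sigma
    assumed pre-calibrated to the approximation error bound.
    """
    p = w_star.shape[0]
    H_tilde = hessian + lam * np.eye(p)            # local convex approximation
    delta = np.linalg.solve(H_tilde, grad_u)       # H_tilde^{-1} grad_L(w*, D_u)
    w_approx = w_star + (n_u / (n - n_u)) * delta  # approximation step
    return w_approx + rng.normal(0.0, sigma, p)    # noise addition step
```

With sigma = 0 this reduces to the pure Newton-style estimate of the retrained model; the noise is what converts the bounded approximation error into an (epsilon, delta) certification.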
Key technical components:
- local convex approximation for certified unlearning: Adding lambda * I to the Hessian to ensure positive definiteness at the cost of a controllable approximation bias
- LiSSA for inverse Hessian approximation: Stochastic recursive estimation of (H + lambda I)^{-1} using s i.i.d. Hessian samples
- sequential certified unlearning: Processing multiple unlearning requests sequentially while maintaining the same per-step error bounds (Algorithm 2)
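The LiSSA component can be sketched as the standard stochastic Neumann-series recursion for an inverse Hessian-vector product (the interface below is my own; `hvp_fn` stands in for a per-sample Hessian-vector product, and `scale` must upper-bound the largest eigenvalue of the regularized Hessian for the recursion to contract):

```python
import numpy as np

def lissa_inverse_hvp(hvp_fn, v, s, t, scale):
    """Estimate H^{-1} v without ever forming H explicitly.

    Runs the recursion cur <- v + cur - hvp_fn(cur)/scale for t steps,
    whose fixed point is scale * H^{-1} v, and averages s independent
    runs to reduce the variance of stochastic Hessian samples.
    """
    estimates = []
    for _ in range(s):
        cur = v.copy()
        for _ in range(t):
            cur = v + cur - hvp_fn(cur) / scale
        estimates.append(cur / scale)   # undo the scaling at the fixed point
    return np.mean(estimates, axis=0)
```

Because each call to `hvp_fn` costs only a Hessian-vector product rather than a full p x p Hessian, this is what removes the cubic-in-p cost of the exact inverse.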
Key Findings
- Certified unlearning achieves the closest F1-scores to the retrain-from-scratch baseline on the unlearned set across most experimental configurations
- The method provides 10x+ speedup over full retraining on DNNs
- Local convex approximation markedly tightens the approximation error bound and also slightly reduces the actual approximation error
- The weight norm constraint yields much lower approximation error compared to unconstrained training
- Under sequential unlearning, the gradient norm and approximation error bound decrease at each step as parameters approach a local optimum
- A tradeoff exists between certification budget tightness and model utility: tighter budgets require more noise, degrading performance
Important References
- Certified data removal from machine learning models (Guo et al., 2020) — foundational definition of (epsilon, delta)-certified unlearning
- Remember what you want to forget (Sekhari et al., 2021) — algorithms for certified unlearning in convex models
- Descent-to-delete (Neel et al., 2021) — gradient-based methods for machine unlearning
Atomic Notes
- local convex approximation for certified unlearning
- LiSSA for inverse Hessian approximation
- sequential certified unlearning