Abstract
Deep machine unlearning is the problem of ‘removing’ from a trained neural network a subset of its training set. This problem is very timely and has many applications, including the key tasks of removing biases (RB), resolving confusion (RC) (caused by mislabelled data in trained models), as well as allowing users to exercise their ‘right to be forgotten’ to protect User Privacy (UP). This paper is the first, to our knowledge, to study unlearning for different applications (RB, RC, UP), with the view that each has its own desiderata, definitions for ‘forgetting’ and associated metrics for forget quality. For UP, we propose a novel adaptation of a strong Membership Inference Attack for unlearning. We also propose SCRUB, a novel unlearning algorithm, which is the only method that is consistently a top performer for forget quality across the different application-dependent metrics for RB, RC, and UP. At the same time, SCRUB is also consistently a top performer on metrics that measure model utility (i.e. accuracy on retained data and generalization), and is more efficient than previous work. The above are substantiated through a comprehensive empirical evaluation against previous state-of-the-art.
Summary
This paper addresses a fundamental limitation of existing approximate unlearning methods: they rely on assumptions (e.g., stability of SGD, small forget sets) that are often violated in practice, leading to poor empirical performance. The authors argue that the definition of “forgetting” is application-dependent and propose the first framework to study unlearning across three distinct scenarios: Removing Biases (RB), where maximal forget error is desired; Resolving Confusion (RC), where mislabelled data must be unlearned; and User Privacy (UP), where the model must defend against Membership Inference Attacks.
The central methodological contribution is SCRUB (SCalable Remembering and Unlearning unBound), a novel teacher-student formulation where the original model serves as a “teacher” and the student model selectively obeys it — matching the teacher on retained data while diverging on forget data. This is achieved through a contrastive-style min-max objective that simultaneously minimizes KL divergence between student and teacher on the retain set while maximizing it on the forget set, combined with standard cross-entropy loss on retained data. The approach is termed “unbounded” because it does not rely on the limiting assumptions (small forget sets, model closeness to optimum) that constrain prior certified methods.
The paper also contributes NegGrad+, an improved variant of gradient ascent-based unlearning that balances forget and retain objectives, and introduces the first adaptation of the LiRA (Likelihood Ratio Attack) Membership Inference Attack to the unlearning evaluation setting. Through comprehensive experiments on CIFAR-10 and Lacuna-10 with ResNet-18 and All-CNN architectures, SCRUB is shown to be by far the most consistent top performer across all three application scenarios and both model utility metrics.
Key Contributions
- Application-dependent unlearning evaluation: first paper to study unlearning across three distinct application scenarios (RB, RC, UP), each with tailored forgetting definitions and metrics
- SCRUB algorithm: teacher-student framework with contrastive min-max objective that consistently achieves high forget quality without damaging model utility across all applications
- SCRUB+R (SCRUB with Rewind): extension that selects the training checkpoint where forget set error is “just high enough” by calibrating against a validation set, greatly improving defense against Membership Inference Attacks
- NegGrad+: improved gradient ascent baseline that balances retain and forget losses with a tunable weight parameter beta
- LiRA for unlearning: first adaptation of the state-of-the-art LiRA Membership Inference Attack to evaluate unlearning quality in the privacy setting
- Scalability: SCRUB does not require storing the entire training dataset and scales to class-level and selective unlearning without the quadratic cost of NTK-based methods
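The NegGrad+ idea above, descending on the retain loss while ascending on the forget loss under a single tunable weight, can be sketched as follows. This is a minimal illustration assuming a PyTorch-style setup; the function and variable names are ours, not the authors' code.

```python
import torch
import torch.nn.functional as F

def neggrad_plus_loss(model, retain_batch, forget_batch, beta=0.99):
    """NegGrad+ sketch: weighted combination of a descent term on the
    retain set and an ascent term on the forget set. `beta` trades off
    the two objectives; names here are illustrative."""
    xr, yr = retain_batch
    xf, yf = forget_batch
    retain_loss = F.cross_entropy(model(xr), yr)
    forget_loss = F.cross_entropy(model(xf), yf)
    # Gradient ascent on the forget set = minimizing its negated loss.
    return beta * retain_loss - (1.0 - beta) * forget_loss
```

With `beta = 1` this reduces to plain finetuning on the retain set; with `beta = 0` it reduces to the original NegGrad gradient-ascent baseline.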
Methodology
SCRUB’s training objective (Equation 3):
$$\min_{w^u}\; \frac{\alpha}{N_r} \sum_{x_r \in D_r} d(x_r; w^u) \;+\; \frac{\gamma}{N_r} \sum_{(x_r, y_r) \in D_r} \ell\big(f(x_r; w^u), y_r\big) \;-\; \frac{1}{N_f} \sum_{x_f \in D_f} d(x_f; w^u)$$
where $d(x; w^u) = D_{\mathrm{KL}}\big(p(f(x; w^o)) \,\|\, p(f(x; w^u))\big)$ is the KL divergence between the teacher's ($w^o$) and the student's ($w^u$) output distributions. Training alternates between max-steps (gradient ascent on the forget set) and min-steps (gradient descent on the retain set), a schedule inspired by GAN training. A final sequence of min-steps, with the max-steps dropped, guards against utility loss.
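The alternating schedule can be sketched as below, assuming a PyTorch teacher/student setup. Hyperparameter names and the loop structure are illustrative, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def kl_distance(student, teacher, x, T=1.0):
    """d(x; w^u): KL(teacher || student) over softened output distributions."""
    with torch.no_grad():
        p_teacher = F.softmax(teacher(x) / T, dim=1)
    log_p_student = F.log_softmax(student(x) / T, dim=1)
    # F.kl_div expects log-probabilities for the input, probabilities for the target.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean")

def scrub_epoch(student, teacher, retain_loader, forget_loader,
                opt, alpha=1.0, gamma=1.0, do_max_step=True):
    """One SCRUB epoch: a max-step over the forget set (if enabled),
    then min-steps over the retain set. Final epochs drop the max-step
    to recover utility."""
    if do_max_step:
        for xf, _ in forget_loader:
            opt.zero_grad()
            # Maximize KL on forget data = minimize its negative.
            (-kl_distance(student, teacher, xf)).backward()
            opt.step()
    for xr, yr in retain_loader:
        opt.zero_grad()
        loss = alpha * kl_distance(student, teacher, xr) \
               + gamma * F.cross_entropy(student(xr), yr)
        loss.backward()
        opt.step()
```

The student would typically be initialized from the teacher's weights, so the max-steps actively push it away from the teacher only on the forget set.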
For the rewinding procedure (SCRUB+R), a validation set matching the forget-set distribution is constructed, and the saved checkpoint whose forget-set error is closest to the error on that validation set is selected. This keeps the forget error from being either suspiciously high (an MIA vulnerability) or too low (incomplete forgetting).
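The checkpoint selection itself is a simple nearest-match search, sketched below (function and argument names are ours):

```python
def select_rewind_checkpoint(forget_errors, val_error):
    """SCRUB+R sketch: forget_errors[i] is the forget-set error of the
    i-th saved checkpoint; val_error is the error on a held-out
    validation set drawn from the forget-set distribution, a proxy for
    the error a retrained-from-scratch model would attain. Return the
    index of the checkpoint whose forget error is closest to it."""
    return min(range(len(forget_errors)),
               key=lambda i: abs(forget_errors[i] - val_error))
```

For example, with checkpoint forget errors `[0.0, 0.1, 0.3, 0.5]` and a validation error of `0.28`, the third checkpoint (index 2) would be chosen.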
Key Findings
- Prior approximate unlearning methods (NTK, Fisher Forgetting) often perform poorly because they violate assumptions about model closeness to optimal solutions and forget set size
- Finetuning retains utility but fails to forget; EU-k forgets well for UP but is inconsistent for RB
- SCRUB is the only method consistently ranked as a top performer across all metrics and all application scenarios
- NegGrad+ is a strong baseline when tuned properly, outperforming prior work in several settings
- The rewinding procedure substantially improves SCRUB’s defense against LiRA MIA, making it comparable to the Retrain oracle
- Bad-T (teacher-student with an incompetent teacher) forgets effectively but damages model utility
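For context on the LiRA evaluation referenced above, the attack's core scoring step can be sketched generically: the model's confidence on an example's true label is logit-scaled, Gaussians are fit to the scaled confidences from shadow models that did and did not train on the example, and membership is scored by a likelihood ratio. This is a generic LiRA sketch, not the paper's exact adaptation to unlearned models.

```python
import math
import statistics

def logit_scale(p, eps=1e-8):
    """LiRA's scaling of the confidence on the true label."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def gaussian_logpdf(x, mu, sigma):
    sigma = max(sigma, 1e-6)
    return -0.5 * ((x - mu) / sigma) ** 2 \
        - math.log(sigma * math.sqrt(2.0 * math.pi))

def lira_score(target_conf, in_confs, out_confs):
    """Log-likelihood ratio of the target's scaled confidence under the
    'member' vs 'non-member' shadow distributions. Higher scores mean
    the example looks like a member, i.e. forgetting failed."""
    phi = logit_scale(target_conf)
    phis_in = [logit_scale(c) for c in in_confs]
    phis_out = [logit_scale(c) for c in out_confs]
    lp_in = gaussian_logpdf(phi, statistics.mean(phis_in),
                            statistics.stdev(phis_in))
    lp_out = gaussian_logpdf(phi, statistics.mean(phis_out),
                             statistics.stdev(phis_out))
    return lp_in - lp_out
```

In the unlearning setting, a successfully unlearned example should score like a non-member, matching the behavior of the Retrain oracle.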
Important References
- Certified data removal from machine learning models — Guo et al. (2019), foundational work on certified data removal that inspires the indistinguishability framework
- Making AI Forget You: Data Deletion in Machine Learning — Ginart et al. (2019), introduces the probabilistic definition of unlearning inspired by differential privacy
- Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks — Golatkar et al. (2020), NTK-based unlearning approach that SCRUB supersedes in scalability and consistency