NegGrad+ is an improved gradient ascent-based machine unlearning method proposed in Kurmanji et al. (2023). It extends the basic NegGrad baseline (which only performs gradient ascent on the forget set) by simultaneously fine-tuning on both the retain and forget sets, with a tunable parameter beta that controls the balance between retaining knowledge and inducing forgetting.

The loss function for NegGrad+ is:

L(w) = beta * (1/|D_r|) * sum_{(x, y) in D_r} l(f(x; w), y) - (1 - beta) * (1/|D_f|) * sum_{(x', y') in D_f} l(f(x'; w), y')

where beta in [0, 1] controls the trade-off. When beta = 1, NegGrad+ reduces to standard fine-tuning on the retain set; when beta = 0, it reduces to pure gradient ascent on the forget set (the original NegGrad). Intermediate values balance the two objectives, trading forget quality against model utility.

The key insight behind NegGrad+ is that pure gradient ascent (NegGrad) on the forget set tends to damage model performance on retained data, while standard fine-tuning on the retain set alone often fails to induce sufficient forgetting. By interpolating between these extremes, NegGrad+ achieves a balance that previous work did not explore.
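To make the objective concrete, here is a minimal NumPy sketch of the NegGrad+ loss for a generic classifier. The helper names (`cross_entropy`, `neggrad_plus_loss`) and the toy linear model in the usage example are illustrative assumptions, not code from the paper:

```python
import numpy as np

def cross_entropy(logits, labels):
    # mean cross-entropy l(f(x; w), y), computed with a stable log-softmax
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def neggrad_plus_loss(model, retain, forget, beta):
    # L(w) = beta * mean loss on D_r  -  (1 - beta) * mean loss on D_f
    xr, yr = retain
    xf, yf = forget
    return beta * cross_entropy(model(xr), yr) - (1 - beta) * cross_entropy(model(xf), yf)

# Usage with a toy linear model (hypothetical data, for illustration only):
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
model = lambda x: x @ W
retain = (rng.normal(size=(8, 4)), rng.integers(0, 3, 8))
forget = (rng.normal(size=(5, 4)), rng.integers(0, 3, 5))
loss = neggrad_plus_loss(model, retain, forget, beta=0.95)
```

At beta = 1 the forget term vanishes and the value equals the retain-set cross-entropy; at beta = 0 it is the negated forget-set cross-entropy, matching the two reductions described above.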

Algorithm

  1. Start from the original model f(.; w^o) trained on the full dataset D
  2. Fine-tune using the NegGrad+ loss with chosen beta, using SGD with standard hyperparameters
  3. For small-scale experiments: beta = 0.95, lr = 0.01, weight decay = 0.1, 10 epochs
  4. For large-scale experiments: beta = 0.9999, lr = 0.01, weight decay = 0.0005, 5 epochs
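The fine-tuning step above can be sketched as a single SGD update on the combined objective. This is an illustrative NumPy implementation for a linear softmax model with an analytic gradient and plain L2 weight decay added to the gradient; the paper only prescribes standard SGD, so the model and function names here are assumptions:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def ce_loss(W, X, y):
    # mean cross-entropy of a linear softmax model with weights W
    p = softmax(X @ W)
    return -np.log(p[np.arange(len(y)), y]).mean()

def ce_grad(W, X, y):
    # gradient of the mean cross-entropy w.r.t. W: X^T (softmax - onehot) / n
    p = softmax(X @ W)
    p[np.arange(len(y)), y] -= 1.0
    return X.T @ p / len(y)

def neggrad_plus_step(W, retain, forget, beta, lr, wd):
    # one SGD step on beta * L_retain - (1 - beta) * L_forget, with L2 weight decay
    g = beta * ce_grad(W, *retain) - (1 - beta) * ce_grad(W, *forget) + wd * W
    return W - lr * g
```

With beta = 1 this descends the retain loss (standard fine-tuning); with beta = 0 it ascends the forget loss (original NegGrad), as in steps 1–2 above.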

Key Properties

  • Simple and efficient: requires only standard SGD fine-tuning, no additional data structures or stored statistics
  • Scalable: computational cost is comparable to fine-tuning, with no quadratic scaling issues
  • Tunable: the beta parameter allows practitioners to control the forgetting-utility trade-off for their specific application
  • Strong baseline: outperforms prior methods in several settings, particularly for removing-biases (RB) applications, though less consistently than SCRUB
  • No formal guarantees: like SCRUB, NegGrad+ does not provide (epsilon, delta)-certified approximate unlearning certificates
  • Has been adapted to segmentation settings as NegGrad-Seg in the context of shortcut unlearning
  • Small values of beta weight the gradient-ascent term heavily, which can cause the loss to explode in practice; careful tuning is required
