Global Spurious Mutual Information (S_global) is a metric introduced in Towards Certified Shortcut Unlearning in Medical Imaging that quantifies the worst-case information leakage from spurious features (e.g., surgical markers, rulers, ink markings) into model predictions within segmentation tasks.

The standard conditional mutual information I(S; Y | Y_hat) between spurious features S and pathology Y given predictions Y_hat is inadequate for segmentation because the vast majority of pixels are background, causing the metric to be dominated by trivially independent empty regions. S_global addresses this by explicitly targeting information leakage within predicted foreground regions.

Formally, for a random mask variable Z:

S_global(Z) := max_{k in {1,…,K}} sum_{i,j} P(Z_ij = k) * I(S_ij; Y_ij | Z_ij = k)

where the maximum is taken over the non-background classes k, and each pixelwise conditional mutual information term is weighted by the probability of predicting class k at that location.
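The definition above can be estimated by a plug-in computation when S, Y, and Z are discrete. The sketch below is illustrative only and assumes integer-valued arrays of shape (N, H, W) over N samples, with class 0 as background; the function names `mutual_information` and `s_global` are hypothetical, not from the paper.

```python
import numpy as np

def mutual_information(s, y):
    """Plug-in mutual information estimate (in nats) between
    two discrete 1-D arrays of paired samples."""
    n = len(s)
    if n == 0:
        return 0.0
    mi = 0.0
    for sv in np.unique(s):
        for yv in np.unique(y):
            p_sy = np.mean((s == sv) & (y == yv))  # joint probability
            p_s = np.mean(s == sv)                 # marginal of S
            p_y = np.mean(y == yv)                 # marginal of Y
            if p_sy > 0:
                mi += p_sy * np.log(p_sy / (p_s * p_y))
    return mi

def s_global(S, Y, Z, num_classes):
    """Empirical S_global:
    max over non-background classes k of
    sum_{i,j} P(Z_ij = k) * I(S_ij; Y_ij | Z_ij = k).
    S, Y, Z: integer arrays of shape (N, H, W); class 0 is background."""
    N, H, W = Z.shape
    best = 0.0
    for k in range(1, num_classes):  # skip background class 0
        total = 0.0
        for i in range(H):
            for j in range(W):
                mask = Z[:, i, j] == k
                p_k = mask.mean()  # P(Z_ij = k) over the N samples
                if p_k > 0:
                    # CMI conditioned on the event Z_ij = k
                    total += p_k * mutual_information(S[mask, i, j],
                                                     Y[mask, i, j])
        best = max(best, total)
    return best
```

Because the sum runs only over locations actually predicted as class k (p_k > 0), all-background pixels contribute nothing, which is exactly the dilution problem the metric is designed to avoid.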

Key Details

  • Captures worst-case leakage across non-background classes, reflecting the clinical requirement that pathology detection be independent of spurious correlations
  • Minimising S_global formally corresponds to shortcut removal from positive predictions
  • Certified Reduction Guarantee (Corollary 3.7): For an (epsilon, delta)-certified unlearning operator U, |S_global(U[Y^(r1)]) - S_global(Y^(r2))| ≤ O(epsilon) + O(delta log(1/delta))
  • The bound shows the unlearned model’s shortcut reliance tracks that of the ideal fine-mask model up to additive certification error
  • Empirically validated via Conditional Mutual Information (CMI) evolution plots showing sharp CMI drop during unlearning
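To make the shape of the reduction guarantee concrete, the sketch below evaluates the right-hand side of the bound for given certification parameters. The constants c1 and c2 are hypothetical placeholders for the factors hidden by the O(.) notation; they are not taken from the paper.

```python
import math

def certified_gap_bound(eps, delta, c1=1.0, c2=1.0):
    """Upper bound on |S_global(U[Y]) - S_global(Y')| of the form
    c1*eps + c2*delta*log(1/delta), with illustrative constants c1, c2."""
    if delta == 0.0:
        # delta*log(1/delta) -> 0 as delta -> 0, so only the eps term remains
        return c1 * eps
    return c1 * eps + c2 * delta * math.log(1.0 / delta)
```

Note that the delta term vanishes as delta goes to 0, so tightening the certification parameters drives the permissible gap in shortcut reliance to zero.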
