Spatial specificity is a framework introduced by Saab et al. (2022) that characterises the degree to which spatial annotations constrain a model’s ability to exploit spurious background features. The key insight is that annotation granularity forms a hierarchy: image-level labels provide no spatial constraint, bounding boxes restrict attention to a region around the pathology, and pixel-level segmentation masks tightly constrain the model to the pathology itself.

Formally, spatial specificity is parameterised by a dilation radius r: the training mask Y^(r) is formed by dilating the ground-truth pathology mask Y by r pixels. As r decreases (finer annotation), the conditional mutual information I(S; Ŷ | Y^(r)) between spurious features S and model predictions Ŷ decreases, because fewer background pixels are included in the training signal.
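A minimal sketch of the dilation operation, assuming a Chebyshev (8-connected) structuring element; the toy 9×9 mask and the `dilate` helper are illustrative, not from the paper:

```python
import numpy as np

def dilate(mask: np.ndarray, r: int) -> np.ndarray:
    """Dilate a binary mask by r pixels (8-connected neighbourhood).

    A pixel is set in Y^(r) iff some ground-truth pixel lies within
    r rows and r columns of it.
    """
    out = mask.astype(bool).copy()
    for _ in range(r):
        padded = np.pad(out, 1)
        # OR together the 3x3 neighbourhood of every pixel
        out = np.zeros_like(out)
        for di in range(3):
            for dj in range(3):
                out |= padded[di:di + out.shape[0], dj:dj + out.shape[1]]
    return out

# toy 9x9 ground-truth mask Y with a single-pixel pathology
Y = np.zeros((9, 9), dtype=bool)
Y[4, 4] = True

# a coarser r admits more background pixels into the training signal
for r in (0, 1, 2):
    print(r, dilate(Y, r).sum())  # 1, 9, 25 pixels respectively
```

The pixel counts grow with r, which is the mechanism behind the claim above: every extra background pixel in Y^(r) is an opportunity for spurious features to enter the training signal.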

Key Details

  • The hierarchy, from coarsest to finest, is: image-level label (r = max) > bounding box (intermediate r) > segmentation mask (r = 0)
  • Under the conditional independence assumption (S independent of Y given Y^(r)), the spurious mutual information is provably monotone non-decreasing in r, so finer annotation never increases spurious leakage
  • Towards Certified Shortcut Unlearning in Medical Imaging later showed that transitioning between spatial specificity levels is isomorphic to machine unlearning of dilation artefacts (the unlearning isomorphism)
  • The benefit of finer spatial specificity is largest when the pathology is small relative to the image (low target-to-background ratio)
  • Hooper et al. (2023) extended this idea into the general segmentation-for-classification framework
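The target-to-background point in the list above can be illustrated with toy arithmetic. Assuming an idealised square pathology of side s (an illustrative geometry, not from the paper), dilating by r covers (s + 2r)² pixels, so the number of background pixels admitted per pathology pixel is far larger when the pathology is small:

```python
# Idealised square pathology of side s, dilated by radius r:
# the dilated mask covers (s + 2r)^2 pixels, of which
# (s + 2r)^2 - s^2 are background.
def background_inflation(s: int, r: int) -> float:
    """Background pixels admitted per pathology pixel at dilation radius r."""
    return ((s + 2 * r) ** 2 - s ** 2) / s ** 2

# small pathology (s=2): each pathology pixel drags in many background pixels
print(background_inflation(2, 4))   # (10^2 - 4) / 4 = 24.0
# large pathology (s=16): the same r adds proportionally little background
print(background_inflation(16, 4))  # (24^2 - 256) / 256 = 1.25
```

This is why the benefit of finer spatial specificity is largest at low target-to-background ratios: for small pathologies, even a modest r swamps the training signal with background.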

concept