Sensitivity is a fundamental quantity in differential privacy that measures the maximum change a single individual’s data can cause in the output of a query function. It determines the amount of noise needed to make a mechanism differentially private: the higher the sensitivity, the more noise is required to mask any one person’s contribution. The concept is formally defined in Dwork and Roth (2014, Definition 3.1).

For a function f : N^|X| -> R^k mapping databases to real-valued vectors, the L1 sensitivity (or global sensitivity) is:

Delta_1 f = max_{x,y : ||x-y||_1 = 1} ||f(x) - f(y)||_1

where the maximum is over all pairs of neighboring databases (databases differing in exactly one record). The L2 sensitivity is defined analogously:

Delta_2 f = max_{x,y : ||x-y||_1 = 1} ||f(x) - f(y)||_2
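
Both definitions can be checked by brute force on a small example. The sketch below enumerates all small databases over a tiny universe and computes Delta_1 and Delta_2 for a histogram query; the function names and the universe size are illustrative, not from the text above:

```python
import itertools
import numpy as np

def histogram_query(db, num_bins=3):
    """f(x): per-bin counts -- a vector-valued query."""
    counts = np.zeros(num_bins)
    for record in db:
        counts[record] += 1
    return counts

def global_sensitivity(f, universe, max_size, ord):
    """Brute-force Delta_p f: maximize ||f(x) - f(y)||_p over neighboring
    databases (y = x plus one extra record), up to a small size bound."""
    best = 0.0
    for n in range(max_size):
        for db in itertools.combinations_with_replacement(universe, n):
            fx = f(list(db))
            for extra in universe:  # each neighbor adds one record
                fy = f(list(db) + [extra])
                best = max(best, np.linalg.norm(fx - fy, ord=ord))
    return best

universe = range(3)
d1 = global_sensitivity(histogram_query, universe, 3, ord=1)
d2 = global_sensitivity(histogram_query, universe, 3, ord=2)
# Adding one record changes exactly one bin by 1, so both sensitivities are 1.
```

This matches the histogram bullet below: one person affects exactly one cell, so Delta_1 = Delta_2 = 1.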

The L1 sensitivity governs the noise calibration for the Laplace mechanism (noise scale b = Delta_1 f / epsilon), while the L2 sensitivity governs the Gaussian mechanism (noise sigma >= c * Delta_2 f / epsilon, where c = sqrt(2 ln(1.25/delta)) depends on the failure probability delta). Since ||v||_2 <= ||v||_1 for all vectors v, and the gap can be as large as a factor of sqrt(k) for k-dimensional outputs, the Gaussian mechanism can require substantially less noise than the Laplace mechanism for high-dimensional queries.
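
The gap can be illustrated with a back-of-the-envelope comparison. All parameters below (k, epsilon, delta) are illustrative choices; the sketch assumes the worst case where one person affects every one of k counting queries:

```python
import numpy as np

k = 100          # number of counting queries (illustrative)
epsilon = 1.0
delta = 1e-5

delta_1 = k              # L1 sensitivity: one person may shift every count by 1
delta_2 = np.sqrt(k)     # L2 sensitivity: at most sqrt(k) for the same vector

# Laplace mechanism: per-coordinate scale b = Delta_1 f / epsilon
b = delta_1 / epsilon
laplace_std = b * np.sqrt(2)          # std of Laplace(b) noise per coordinate

# Gaussian mechanism: sigma = sqrt(2 ln(1.25/delta)) * Delta_2 f / epsilon
c = np.sqrt(2 * np.log(1.25 / delta))
sigma = c * delta_2 / epsilon          # Gaussian std per coordinate
```

For these parameters the Gaussian per-coordinate noise (roughly 48) is about a third of the Laplace noise (roughly 141), reflecting the sqrt(k) gap between the two sensitivities.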

In the context of certified approximate unlearning, sensitivity plays a dual role. First, it quantifies how much a single training point influences the trained model parameters — the “data influence” that unlearning must remove. Second, mechanisms like gradient clipping and model clipping artificially bound this influence (enforce bounded sensitivity) so that a controlled amount of noise suffices to provide certified unlearning guarantees. In Certified Unlearning for Neural Networks, gradient clipping to radius C ensures that the per-step sensitivity is bounded by 2C, enabling the privacy analysis.
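
The 2C bound follows from the triangle inequality: each clipped gradient has norm at most C, so swapping one example's gradient changes the result by at most 2C. A minimal numerical sketch (the clipping radius and random gradients here are made up for illustration):

```python
import numpy as np

def clip(g, C):
    """Project gradient g onto the L2 ball of radius C."""
    norm = np.linalg.norm(g)
    return g * min(1.0, C / norm) if norm > 0 else g

C = 1.0
rng = np.random.default_rng(0)
g_old = rng.normal(size=10) * 5.0   # gradient of the point being replaced
g_new = rng.normal(size=10) * 5.0   # gradient of its replacement

# Per-step sensitivity of the (clipped) per-example gradient contribution:
# ||clip(g_old) - clip(g_new)|| <= ||clip(g_old)|| + ||clip(g_new)|| <= 2C.
diff = np.linalg.norm(clip(g_old, C) - clip(g_new, C))
```

Regardless of how large the raw gradients are, diff never exceeds 2C, which is exactly the bounded sensitivity the privacy analysis needs.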

Key Details

  • Counting queries have sensitivity 1 (adding or removing one person changes the count by at most 1)
  • Histogram queries over disjoint cells also have sensitivity 1 (each person affects exactly one cell)
  • For a vector of m counting queries, the L1 sensitivity can be as large as m (one person might affect every query), but the L2 sensitivity is at most sqrt(m)
  • Global sensitivity is a worst-case measure (maximum over all pairs of neighboring databases); local sensitivity fixes the actual database and maximizes only over its neighbors, but calibrating noise to it directly can itself leak information, so it requires additional machinery such as smooth sensitivity
  • In machine learning, the sensitivity of the empirical risk minimizer depends on the loss function, regularization, and learning algorithm; gradient clipping enforces bounded sensitivity by projecting gradients to a ball of radius C
  • Related to the Lipschitz constant of f: sensitivity is the Lipschitz constant of f with respect to the Hamming distance on databases and the L_p norm on outputs
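
As a concrete instance of the first bullet, a counting query (sensitivity 1) can be released with the Laplace mechanism. A minimal sketch, with the count, epsilon, and seed chosen arbitrarily:

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release true_value with Laplace noise of scale sensitivity/epsilon."""
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

rng = np.random.default_rng(42)
# Counting query: adding/removing one person changes the count by at most 1.
noisy_count = laplace_mechanism(true_value=1234, sensitivity=1.0,
                                epsilon=0.5, rng=rng)
```

With epsilon = 0.5 the noise scale is 2, so the released count is typically within a few units of the true value while masking any single person's presence.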

concept