Sensitivity is a fundamental quantity in differential privacy that measures the maximum change a single individual’s data can cause in the output of a query function. It determines the amount of noise needed to make a mechanism differentially private: the higher the sensitivity, the more noise is required to mask any one person’s contribution. The concept is formally defined in DWORK2014 (Definition 3.1).
For a function f: N^|X| → R^k mapping databases to real-valued vectors, the L1 sensitivity (or global sensitivity) is:
Delta_1 f = max_{x,y : ||x-y||_1 = 1} ||f(x) - f(y)||_1
where the maximum is over all pairs of neighboring databases (databases differing in exactly one record). The L2 sensitivity is defined analogously:
Delta_2 f = max_{x,y : ||x-y||_1 = 1} ||f(x) - f(y)||_2
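Both definitions can be checked by brute force on a toy domain. The sketch below assumes the Dwork–Roth convention that neighboring databases (||x - y||_1 = 1) differ by the presence of one record; the 3-value domain and function names are illustrative, not from any particular source.

```python
import itertools

DOMAIN = [0, 1, 2]  # hypothetical 3-value record domain

def histogram(db):
    """Map a database (sequence of records) to its vector of cell counts."""
    return [sum(1 for r in db if r == v) for v in DOMAIN]

def lp(u, v, p):
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

def global_sensitivity(f, n, p):
    """Max L_p output distance over all size-n databases and all
    neighbors obtained by removing one record -- a direct, if
    inefficient, reading of the definitions of Delta_1 f and Delta_2 f."""
    worst = 0.0
    for x in itertools.product(DOMAIN, repeat=n):
        for i in range(n):
            y = x[:i] + x[i + 1:]          # neighbor: record i removed
            worst = max(worst, lp(f(x), f(y), p))
    return worst

print(global_sensitivity(histogram, n=3, p=1))  # 1.0
print(global_sensitivity(histogram, n=3, p=2))  # 1.0
```

Both sensitivities come out to 1 here because each person lands in exactly one histogram cell, matching the counting-query and histogram facts listed under Key Details below.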
The L1 sensitivity governs the noise calibration for the Laplace mechanism (noise scale b = Delta_1 f / epsilon), while the L2 sensitivity governs the Gaussian mechanism (noise sigma >= c * Delta_2 f / epsilon, where c = sqrt(2 ln(1.25/delta)) suffices for (epsilon, delta)-differential privacy). Since ||v||_2 <= ||v||_1 for all vectors v, and the gap can be as large as a factor of sqrt(k) for k-dimensional outputs, the Gaussian mechanism can require substantially less noise than the Laplace mechanism for high-dimensional queries.
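The noise gap is easy to make concrete. The sketch below compares expected per-coordinate noise for a worst-case vector of k counting queries, using the standard Gaussian-mechanism constant c = sqrt(2 ln(1.25/delta)); the parameter values are illustrative.

```python
import math

def laplace_scale(delta_1, epsilon):
    """Per-coordinate Laplace scale b = Delta_1 f / epsilon."""
    return delta_1 / epsilon

def gaussian_sigma(delta_2, epsilon, delta):
    """Gaussian std sigma = sqrt(2 ln(1.25/delta)) * Delta_2 f / epsilon."""
    return math.sqrt(2 * math.log(1.25 / delta)) * delta_2 / epsilon

k = 10_000                                # number of counting queries
eps, dlt = 1.0, 1e-5
b = laplace_scale(k, eps)                 # worst case: Delta_1 f = k
sigma = gaussian_sigma(math.sqrt(k), eps, dlt)  # Delta_2 f <= sqrt(k)

# Expected absolute noise per coordinate:
# Laplace(b) has E|noise| = b; N(0, sigma^2) has E|noise| = sigma*sqrt(2/pi).
print(f"Laplace  E|noise| per coordinate: {b:.1f}")
print(f"Gaussian E|noise| per coordinate: {sigma * math.sqrt(2 / math.pi):.1f}")
```

For these parameters the Gaussian mechanism's per-coordinate noise is more than an order of magnitude smaller, reflecting the sqrt(k) gap between the two sensitivities.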
In the context of certified approximate unlearning, sensitivity plays a dual role. First, it quantifies how much a single training point influences the trained model parameters — the “data influence” that unlearning must remove. Second, mechanisms like gradient clipping and model clipping artificially bound this influence (enforce bounded sensitivity) so that a controlled amount of noise suffices to provide certified unlearning guarantees. In Certified Unlearning for Neural Networks, gradient clipping to radius C ensures that the per-step sensitivity is bounded by 2C, enabling the privacy analysis.
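The clipping argument can be sketched numerically: if every per-example gradient is projected onto the L2 ball of radius C, then swapping one example moves the summed gradient by at most ||clip(g_new) - clip(g_old)|| <= 2C. The names (C, grads) below are illustrative, not from any specific codebase.

```python
import numpy as np

rng = np.random.default_rng(0)
C, d, n = 1.0, 5, 100                      # clip radius, dimension, batch size

def clip(g, C):
    """Project a gradient onto the L2 ball of radius C."""
    return g * min(1.0, C / np.linalg.norm(g))

grads = rng.normal(size=(n, d)) * 10.0     # raw per-example gradients
clipped_sum = sum(clip(g, C) for g in grads)

# Replace one example's gradient with an arbitrarily large adversarial one:
# the clipped sum moves by at most 2C, by the triangle inequality.
g_new = rng.normal(size=d) * 1e6
neighbor_sum = clipped_sum - clip(grads[0], C) + clip(g_new, C)
print(np.linalg.norm(neighbor_sum - clipped_sum) <= 2 * C + 1e-9)  # True
```

This bounded per-step sensitivity is what lets a calibrated amount of noise certify that the removed point's influence is masked.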
Key Details
- Counting queries have sensitivity 1 (adding or removing one person changes the count by at most 1)
- Histogram queries over disjoint cells also have sensitivity 1 (each person affects exactly one cell)
- For a vector of m counting queries, the L1 sensitivity can be as large as m (one person might affect every query), but the L2 sensitivity is at most sqrt(m)
- Sensitivity is a worst-case measure (a maximum over all pairs of neighboring databases); local sensitivity fixes the actual database and maximizes only over its neighbors, but calibrating noise to it directly can itself leak information, so it requires additional care (e.g., smooth sensitivity)
- In machine learning, the sensitivity of the empirical risk minimizer depends on the loss function, regularization, and learning algorithm; gradient clipping enforces bounded sensitivity by projecting gradients to a ball of radius C
- Related to the Lipschitz constant of f: sensitivity is the Lipschitz constant of f with respect to the Hamming distance on databases and the L_p norm on outputs
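The gap between worst-case and local sensitivity can be seen with the classic median example over values in [0, 1]. This is a hedged sketch with illustrative data; neighbors here drop one record.

```python
def median(xs):
    s = sorted(xs)
    return s[len(s) // 2]

def local_sensitivity(xs):
    """Max change in the median from removing one record of this database."""
    return max(abs(median(xs) - median(xs[:i] + xs[i + 1:]))
               for i in range(len(xs)))

# Worst-case database: half zeros, half ones. Dropping a one swings the
# median across the full range, so the global sensitivity is 1.
print(local_sensitivity([0.0, 0.0, 0.0, 1.0, 1.0, 1.0]))   # 1.0

# Concentrated database: the same removal barely moves the median. Noise
# calibrated to this small local sensitivity would reveal that the data is
# concentrated -- the leak that smooth sensitivity is designed to avoid.
print(local_sensitivity([0.50, 0.51, 0.52, 0.53, 0.54, 0.55]))
```

Global sensitivity must cover the first (worst-case) database, so the Laplace mechanism calibrated to it adds noise comparable to the full range even when the actual data is tightly concentrated.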