Differential Privacy
May 16, 2021
Centralized DP (CDP) vs. Local DP (LDP)
DP:
Intuitively, DP can be interpreted as follows:
- Including or excluding any one individual’s record has only a small influence, bounded by epsilon, on the outcome
- Smaller epsilon (i.e. a smaller privacy budget) → more privacy
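The intuition above is usually stated formally as follows. This is the standard epsilon-DP inequality, added here as a reference sketch; the symbols M (the randomized mechanism), D and D' (two datasets differing in one individual's record), and S (any set of possible outputs) are not part of the original notes.

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S]
\quad \text{for all neighboring } D, D' \text{ and all output sets } S
```

When epsilon is close to 0, the two probabilities are nearly equal, so the output reveals almost nothing about whether the record was included.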
DP was inspired by Warner’s 1965 model:
- A survey technique for private questions
- Technique:
- Ask each person: are you a member of the Communist Party?
- Each person flips an unbiased coin, answers truthfully if it lands heads (probability 0.5), and answers “yes” or “no” at random if it lands tails
- A communist will therefore answer “yes” with probability 0.75 (0.5 from the truthful answer plus 0.5 * 0.5 from a random “yes”) and “no” with probability 0.25
- If m out of n people are communists, the expected number of “yes” answers is E(n_yes) = 0.75 * m + 0.25 * (n - m)
- An unbiased estimate of the number of communists is then c(m) = (n_yes - 0.25 * n) / 0.5
- This provides deniability: seeing a single answer, an observer cannot be certain of that person’s secret
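The survey procedure and its estimator can be sketched as a small simulation. This is a minimal illustration, not Warner's original formulation; the function names, the population size, and the true count are hypothetical values chosen for the example.

```python
import random

def randomized_response(truth, rng):
    """One respondent: truthful on heads, uniformly random answer on tails."""
    if rng.random() < 0.5:       # heads: tell the truth
        return truth
    return rng.random() < 0.5    # tails: answer yes/no at random

def estimate_members(answers):
    """Debias the yes-count: (n_yes - 0.25 * n) / 0.5."""
    n = len(answers)
    n_yes = sum(answers)
    return (n_yes - 0.25 * n) / 0.5

rng = random.Random(0)           # fixed seed, only for reproducibility
n, m = 100_000, 20_000           # hypothetical population size and true count
truths = [True] * m + [False] * (n - m)
answers = [randomized_response(t, rng) for t in truths]
estimate = estimate_members(answers)
print(f"true m = {m}, estimated m = {estimate:.0f}")
```

In this scheme a “yes” is three times more likely from a member (0.75) than from a non-member (0.25), so under the usual definition it corresponds to epsilon = ln 3, roughly 1.1.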
CDP:
- A centralized data curator collects all the raw data, then perturbs the results with a DP mechanism before releasing them to the public
- Pros: higher utility than LDP
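As one possible illustration of the centralized setting, the sketch below has the curator release a count with Laplace noise. The Laplace mechanism is an assumption here (the notes do not name a specific mechanism), and the function names and data are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def central_dp_count(records, epsilon, rng):
    """Curator-side release: true count plus Laplace noise.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    true_count = sum(records)
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)                         # fixed seed for reproducibility
records = [True] * 20_000 + [False] * 80_000   # hypothetical raw data held by the curator
noisy_count = central_dp_count(records, epsilon=1.0, rng=rng)
print(f"released count: {noisy_count:.1f}")
```

Because the noise is added once to the aggregate, its size depends on epsilon but not on the number of records, which is where the utility advantage comes from.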
LDP:
- Each user’s data is perturbed locally before it leaves the user’s device
- Only the data owner can access the private data
- The data curator only holds perturbed data
- Cons: lower utility than CDP, since every record is perturbed individually
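The local setting can be sketched by splitting the work into a device-side step and a curator-side step. This is a minimal example assuming binary data and randomized response as the local mechanism; the function names and the dataset are hypothetical.

```python
import math
import random

def local_perturb(bit, epsilon, rng):
    """Runs on the user's device: keep the bit with probability
    e^eps / (1 + e^eps), otherwise flip it."""
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_keep else not bit

def curator_estimate(reports, epsilon):
    """Runs at the curator, which only sees perturbed reports.
    Debias the yes-count: (n_yes - (1 - p) * n) / (2p - 1)."""
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    n = len(reports)
    return (sum(reports) - (1.0 - p_keep) * n) / (2.0 * p_keep - 1.0)

rng = random.Random(0)                           # fixed seed for reproducibility
epsilon = math.log(3)                            # p_keep = 0.75, as in the coin-flip survey
true_bits = [True] * 20_000 + [False] * 80_000   # hypothetical; raw bits stay on the devices
reports = [local_perturb(b, epsilon, rng) for b in true_bits]
ldp_estimate = curator_estimate(reports, epsilon)
print(f"estimated count from perturbed reports: {ldp_estimate:.0f}")
```

With these numbers the standard deviation of the local estimate is roughly 270, and it grows with the square root of the number of users. A centrally added Laplace noise at the same epsilon would have a standard deviation of about 1, which illustrates the utility gap between the two settings.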