Differential Privacy

AI/Data Science Digest
1 min read · May 16, 2021

Centralized DP (CDP) vs. Local DP (LDP)

Comparison between the two differential privacy models (Source: Yang et al., Local differential privacy and its applications)

DP:

Intuitively, DP can be interpreted as follows:

  • Including or excluding any one individual’s record has only a small influence on the outcome, and epsilon bounds that influence
  • Smaller epsilon (i.e. privacy budget) → more privacy
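
The digest does not spell out the formal definition, so here is the standard textbook statement for reference. The symbols are introduced here for illustration: M is a randomized mechanism, D and D' are any two datasets that differ in one individual's record, and S is any set of possible outputs.

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S]
```

When epsilon is close to 0, the factor e^epsilon is close to 1, so the two output distributions are nearly indistinguishable. That is why a smaller epsilon means more privacy.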

DP was inspired by Warner’s 1965 randomized response model:

  • A survey technique for sensitive questions
  • Technique:
  • Ask each person: are you a member of the Communist Party?
  • Each person flips an unbiased coin: on heads (probability 0.5) they answer truthfully; on tails they answer “yes” or “no” at random, each with probability 0.5
  • A communist therefore answers “yes” with probability 0.75 and “no” with probability 0.25; for a non-communist the probabilities are reversed
  • If m out of n people are communists, the expected number of “yes” answers is E(n_yes) = 0.75 * m + 0.25 * (n - m) = 0.5 * m + 0.25 * n
  • Solving for m gives an unbiased estimate of the number of communists: c(m) = (n_yes - 0.25 * n) / 0.5
  • This provides deniability: seeing one person’s answer does not reveal their secret with certainty
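
The coin-flip scheme and its estimator can be sketched in a few lines of Python. This is a minimal simulation, not a reference implementation. The function names (`randomized_response`, `estimate_count`, `simulate`) and the seeded `random.Random` generator are illustrative choices, and "answer randomly" is assumed to mean a second fair coin flip.

```python
import random

def randomized_response(truth: bool, rng: random.Random) -> bool:
    """One respondent's reported answer under the coin-flip scheme."""
    if rng.random() < 0.5:       # first coin is heads: answer truthfully
        return truth
    return rng.random() < 0.5    # tails: a second fair coin picks the answer

def estimate_count(n_yes: int, n: int) -> float:
    """Unbiased estimate of how many respondents truly hold the secret."""
    return (n_yes - 0.25 * n) / 0.5

def simulate(n: int, m: int, seed: int = 0) -> float:
    """Survey n people, m of whom hold the secret, and return the estimate."""
    rng = random.Random(seed)
    truths = [True] * m + [False] * (n - m)
    n_yes = sum(randomized_response(t, rng) for t in truths)
    return estimate_count(n_yes, n)
```

Calling `simulate(100_000, 30_000)` should return a value near 30,000. The exact figure depends on the seed. In DP terms, a “yes” is 0.75 / 0.25 = 3 times more likely from a communist than from a non-communist, so this scheme corresponds to epsilon = ln 3 ≈ 1.10.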

CDP:

  • A centralized data curator collects all the data, then perturbs it with a DP mechanism before releasing it to the public
  • Pros: More utility compared to LDP
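
As an illustration of the central model, here is a minimal sketch of a curator releasing a noisy count. The digest does not name a mechanism, so the Laplace mechanism is used as an assumed, commonly used choice. The names `laplace_noise` and `private_count` are hypothetical. A production system would also need to handle floating-point subtleties that this sketch ignores.

```python
import random
from typing import Sequence

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)

def private_count(records: Sequence[bool], epsilon: float, rng: random.Random) -> float:
    """Curator-side release: true count plus Laplace noise.

    Adding or removing one record changes a count by at most 1 (sensitivity 1),
    so the noise scale is 1 / epsilon.
    """
    true_count = sum(records)
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

The curator sees the raw `records` and adds noise only once, to the final answer, so the error stays on the order of 1 / epsilon regardless of how many people are in the dataset.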

LDP:

  • Perturb each user’s data locally before it leaves the user’s device
  • Only the data owner can access the private data
  • The data curator only holds perturbed data
  • Cons: lower utility compared to CDP
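
The utility gap can be seen in a small simulation that estimates the same count under both models at the same epsilon. This is a sketch under stated assumptions: the central side uses Laplace noise, the local side uses coin-flip randomized response (epsilon = ln 3), and all function names and default parameters are made up for illustration.

```python
import math
import random
import statistics

def central_estimate(m: int, epsilon: float, rng: random.Random) -> float:
    """CDP: a trusted curator adds one Laplace(1/epsilon) draw to the true count."""
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return m + noise

def local_estimate(n: int, m: int, rng: random.Random) -> float:
    """LDP: every user randomizes their own bit; the curator debiases the sum."""
    n_yes = 0
    for i in range(n):
        truth = i < m                   # the first m users hold the secret
        if rng.random() < 0.5:          # heads: answer truthfully
            answer = truth
        else:                           # tails: answer at random
            answer = rng.random() < 0.5
        n_yes += answer
    return (n_yes - 0.25 * n) / 0.5

def mean_abs_errors(n: int = 2000, m: int = 600, trials: int = 200, seed: int = 0):
    """Return (central, local) mean absolute error, both at epsilon = ln 3."""
    rng = random.Random(seed)
    epsilon = math.log(3)               # privacy budget of the coin-flip scheme
    central = statistics.mean(
        abs(central_estimate(m, epsilon, rng) - m) for _ in range(trials)
    )
    local = statistics.mean(
        abs(local_estimate(n, m, rng) - m) for _ in range(trials)
    )
    return central, local
```

Running `mean_abs_errors()` should show a much larger error for the local estimate. In the central model noise is added once, while in the local model every user's answer is noisy and the errors accumulate, growing roughly with the square root of n.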
