differential-privacy: show $\epsilon$-differential privacy


In this problem we consider a sensitive dataset $x \in \{-1, 1\}^n$. We work in the bounded setting, where neighboring $n$-dimensional datasets differ in exactly one coordinate. A mechanism is available that supports statistical queries on $x$: for a query $q \in \{-1, 1\}^n$, the mechanism computes the dot product $\langle x, q\rangle = \sum_{i=1}^n x_i q_i$ and releases $\langle x, q\rangle + z$, where $z \sim \text{Lap}(\lambda)$ is Laplace-distributed noise added to protect the privacy of individual entries of $x$.
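This mechanism can be sketched in a few lines of Python. This is a minimal illustration assuming numpy; the function name `private_dot_product` is hypothetical, not from the problem statement:

```python
import numpy as np

def private_dot_product(x, q, lam, rng=None):
    """Release <x, q> + z with z ~ Lap(lam). Hypothetical helper name."""
    rng = np.random.default_rng() if rng is None else rng
    true_answer = int(np.dot(x, q))          # exact statistical query <x, q>
    noise = rng.laplace(loc=0.0, scale=lam)  # Laplace noise with scale lambda
    return true_answer + noise
```

The released value is the exact query answer plus a single draw of Laplace noise; only the noisy sum ever leaves the mechanism.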

I must argue that this mechanism is $\epsilon$-differentially private.

My thoughts are that $\langle x, q\rangle$ changes by at most $2$ between neighboring datasets, since exactly one summand $x_i q_i \in \{-1, 1\}$ flips sign. Therefore, when considering two neighboring databases $x = (x_1, \dots, x_n)$ and $x' = (x_1, \dots, x'_n)$, we should get $$\frac{P(M(x_1, \dots, x_n) = t)}{P(M(x_1, \dots, x'_n) = t)} \leq \exp(2/\lambda) = \exp(\epsilon).$$ But I'm not sure if this is correct, especially the inequality: why should it hold? Can anyone help?


This follows directly from the definition of $\epsilon$-differential privacy (see e.g. the Wikipedia article).

Let $a_1, a_2$ denote the outcomes of the algorithm (the dot product) on two datasets $x_1, x_2$ that differ in exactly one element (neighboring datasets).

Since each element takes only the binary values $\{-1, 1\}$, the outcomes must satisfy $$ |a_1 - a_2| = 2,$$

because inside the dot product, exactly one summand $+1$ is replaced by $-1$, or vice versa.
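This sensitivity claim is easy to check directly. A quick sketch assuming numpy (the flipped index is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10
x1 = rng.choice([-1, 1], size=n)   # random dataset in {-1, 1}^n
q = rng.choice([-1, 1], size=n)    # arbitrary query in {-1, 1}^n

x2 = x1.copy()
x2[3] = -x2[3]                     # neighboring dataset: flip one coordinate

a1, a2 = np.dot(x1, q), np.dot(x2, q)
# flipping x_i changes the summand x_i * q_i from +1 to -1 (or -1 to +1),
# which shifts the dot product by exactly 2
assert abs(a1 - a2) == 2
```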

Using the parametrization from the Wikipedia article on the Laplace distribution, https://en.wikipedia.org/wiki/Laplace_distribution, we have

$$ a_i + z \sim \text{Lap}(a_i, \lambda)$$

where the pdf is given by $$ f_i(t) = \frac {1} {2\lambda} \exp\left\{- \frac {|t - a_i|} {\lambda} \right\}$$

Therefore the ratio is

$$ \frac {f_1(t)} {f_2(t)} = \frac {\displaystyle \frac {1} {2\lambda} \exp\left\{- \frac {|t - a_1|} {\lambda} \right\}} {\displaystyle \frac {1} {2\lambda} \exp\left\{- \frac {|t - a_2|} {\lambda} \right\}} = \exp\left\{\frac {|t - a_2| - |t - a_1|} {\lambda} \right\} $$

Since the function $g(x) = e^{x/\lambda}$ is increasing in $x$, we just need to find the global maximum of $|t - a_2| - |t - a_1|$ to obtain the upper bound of the above ratio.

Assume $a_2 = a_1 + 2$. Then $$\begin{align} |t - a_2| - |t - a_1| &= |t - a_1 - 2| - |t - a_1| \\ &= \begin{cases} (a_1 + 2 - t) - (a_1 - t) = 2 & \text{if } t \leq a_1 \\ (a_1 + 2 - t) - (t - a_1) = 2(a_1 - t + 1) & \text{if } a_1 < t < a_1 + 2 \\ (t - a_1 - 2) - (t - a_1) = -2 & \text{if } t \geq a_1 + 2 \end{cases} \end{align}$$

When $a_2 = a_1 - 2$, a similar result holds by symmetry: it is just the negative of the above.

In either case, the expression $|t - a_2| - |t - a_1|$ is continuous, piecewise linear, and bounded above by $2$.

Therefore we conclude that $$ \frac {f_1(t)} {f_2(t)} \leq \exp\left\{\frac {2} {\lambda}\right\},$$ i.e. the mechanism is $\epsilon$-differentially private with $\epsilon = 2/\lambda$ (equivalently, choose $\lambda = 2/\epsilon$ for a target $\epsilon$).
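The density-ratio bound can also be verified numerically. A sketch assuming numpy, with arbitrary values for $a_1$ and $\lambda$ (any choice with $|a_1 - a_2| = 2$ works):

```python
import numpy as np

def lap_pdf(t, mu, lam):
    """Density of Lap(mu, lam) at t."""
    return np.exp(-np.abs(t - mu) / lam) / (2 * lam)

lam = 1.5
a1, a2 = 4.0, 6.0                    # neighboring outcomes: |a1 - a2| = 2
t = np.linspace(-20, 20, 10001)      # grid of possible released values

ratio = lap_pdf(t, a1, lam) / lap_pdf(t, a2, lam)
# the ratio never exceeds exp(2 / lambda), matching the bound above
assert ratio.max() <= np.exp(2 / lam) + 1e-9
```

The maximum of the ratio is attained on the region $t \leq a_1$, where $|t - a_2| - |t - a_1| = 2$ exactly.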