Size of preimage in a pseudo random function

110 Views Asked by At

I am interested in statistics of pseudo random functions. In particular, in the following. Given:

  • a pseudo random function $f : S \to S$ (with $s = |S|$),
  • and a set $D \subseteq S$ (with $n = |D|$) of arbitrary elements,

what is the expected size of the preimage of $D$ through $f$? In other words, $\mathbb{E}[|\{x \in S \mid f(x) \in D\}|]$.

The reverse problem is easier (to me): given the same $f$ and a set $C \subseteq S$ (with $m = |C|$) of arbitrary elements what is the expected size of the image of $C$ through $f$?

We have $\mathbb{E}[f(C)] = s(1-(1-1/s)^m) \approx s(1-\exp(-m/s))$. See e.g. this question for argument on a very similar development.

Curious if we can use a similar reasoning for the preimage scenario.


My attempt so far:

I start by computing the probability that the preimage size is a given size: \begin{align*} \Pr(|C| = m \mid |D| = n) &= \Pr(n \text{ points all map to a different image}) \times \Pr(\text{the other } m-n \text{ points map to an existing image}) \times {m \choose n}\\ &= \left(\prod_{i=0}^{n-1} 1-\frac i s\right) (n/s)^{m-n} {m \choose n}\\ \Pr(|C| < n \mid |D| = n) &= 0 \end{align*}

Then compute the weighted average for the expected value: \begin{align*} \mathbb{E}(|C| \mid |D| = n) &= \sum_{m=0}^s m \Pr(|C| = m \mid |D| = n)\\ &= \sum_{m=n}^s m \left(\prod_{i=0}^{n-1} 1-\frac i s\right) (n/s)^{m-n} {m \choose n}\\ &= \left(\prod_{i=0}^{n-1} 1-\frac i s\right) \sum_{m=n}^s m (n/s)^{m-n} {m \choose n} \end{align*}

However the result is unweildy (in particular difficult to compute if $s$ is large), and I am not sure of its correctness.