I'm trying to analyze a network algorithm to get a latency probability distribution. One of the steps is to "calculate the probability distribution of the number of updated nodes in a single hop of a network flooding". It can be formulated like this:
Let $V$ be a set of $n$ vertices. Each vertex has $d$ uniformly random outgoing edges. So there's $(^{n-1}C_d)^n$ combinations how this graph may be connected. We split $V$ in two sets: $A, B$ of size $n_a$ and $\mathcal n_b=n-n_a$. Count edge combinations, in which exactly $k$ vertices in $B$ have at least one outgoing or incoming edge with a vertex in $A$. $(0 \le k \le n_b)$
I've written a script to numerically calculate this for small $n, d$ and $k$ and it gives me:
- For $n=4, d=1, n_a=1$ the answer is: $[0, 36, 36, 9]$ (corresponds to $k=0, 1, 2, 3$).
- For $n=5, d=1, n_a=1$ the answer is: $[0, 432, 432, 144, 16]$.
- For $n=6, d=2, n_a=3$ the answer is: $[1, 807, 63183, 936009]$.
Is it possible to derive an expression for this for given $n, d, n_a$ and $k$?
I thought I can express $k=2$ combinations inductively like "total combinations for $k \le 2$ minus combinations for $k = 1$ and $k=0$", but everything I come up with eventually breaks down when I check against other $k$.
First let $k_1$ be the number of $B$ vertices with an outgoing edge to an $A$ vertex, and $k_2$ be the number of $B$ vertices with an incoming edge from an $A$ vertex.
So $k = k_1 + k_2$; you'll want to sum over all possible $(k_1,k_2)$ pairs to get the final result.
Now let's evaluate the probability to have a particular $k_1=c$. For a vertex $v$, its probability to have an edge going to $A$ is $\lambda = 1 - (n_b/n)^d$. Since all vertices choose their edges independently, this questions boils down to the binomial distribution of parameter $n_b$ and $\lambda$, so $$\Pr[k_1 = c] = {n_b \choose c} \lambda^c (1-\lambda)^{n_b-c}$$
Now we look for edges from $A$ to $B$. There are $dn_a$ edges; we want them to go to $k_2$ distinct $B$ vertices that were not selected in the $k_1$ vertices. Since we don't care about the order on the edges, we can first assign edges going to $B$. The first one has probability $(n_b-k_1)/n$ to be linked correctly, the second one $(n_b-k_1-1)/n$, and so on until $(n_b-k_1-k_2-1)/n$. It's called the falling factorial, so let's write this probability $(n_b-k_1)_{k_2}/n^{k_2}$ to have the first edges going to $k_2$ distinct $B$ vertices. Now all other edges should either go to an $A$ vertex, or to a $B$ vertex that has already been selected. So that gives a $((n_a +k)/n)^{dn_a-k_2}$ factor.
Put everything together and you get your formula.