Assortativity (homophily) in random graphs

211 Views Asked by At

I am trying to come up with the solution to the following problem and would be grateful for any help.

Consider a random undirected graph built using configuration model with degree distribution $\{f_1,\ldots,f_n\}$ (we can think of $f_k$ as a proportion of nodes with degree $d_k$). Assume that every node has a color (or any other characteristic uncorrelated with the degree) denoted by $a_i$. Say, the set of all possible colors is $\{a_1, \ldots, a_M\}$. Assume that in the network proportion of nodes with color $a_i$ is $\lambda_i$.

I am trying to introduce assortativity (in color) in this network. Usually, this is done by introducing a matrix $\mathbf{E}$ with elements $e_{ij}$ denoting proportion of edges in the network connecting nodes of color $a_i$ with nodes of color $a_j$. Necessarily, $\sum_i e_{ij} = g_j$ and $e_{ij} = e_{ji}$.

Problem: consider a node with color $a_i$. I would like to calculate probability that his randomly chosen friend will have color $a_j$ (denote it by $p(j \vert i)$).

Could someone help me with this or at least give me a hint how to start?

What I did so far (I am not sure that it is correct at all):

Say we have $m = 0.5\sum_k d_k$ edges in total. Then, I would like to calculate $p(j\vert i)$ and hence I know that one end has color $a_i$. There are $m \sum_{l} e_{il}$ edges in total that have $a_i$ on one end. Out of them only $m e_{ij}$ have $j$ on the other end, hence we have: $$p(j \vert i) = \frac{e_{ij}}{\sum_l e_{i l} }$$

Thank you in advance.

1

There are 1 best solutions below

0
On BEST ANSWER

Your reasoning looks fine.

One possible source of confusion might stem from the fact that:

$$\sum_{i,j} e_{ij} = 2$$

and $e_{ij}=e_{ji}$.

This means that there are more than just one row of entries with an $i$ in them. But because of the symmetry of the matrix, we only need to count in say the upper diagonal half.

However the entries with an $i$ in are repeated in the lower diagonal half, and form a row.

$$ \begin{pmatrix} 1 & 2 & 1 & 2 & 3\\ 2 & 1 & 4 & 0 & 1\\ 1 & 4 & 2 & 3 & 2\\ 2 & 0 & 3 & 2 & 1\\ 3 & 1 & 2 & 1 & 3 \end{pmatrix} $$

So, for example, if $i=3$, we read $1,4,2,3,2$, either as an L-shape, or as a row - the result is the same.

If we want to know the probability of say $3\iff4$, your method records the value of $e_{34}=3$, counts row $i=3$ row; $1+4+2+3+2=12$, and gives the probability as $\frac3{12}=\frac14$.

We can also note that:

$$\lambda_i=\frac{\sum_{j} e_{ij}}{2}$$