What is the probability of passing through a node in a directed graph

1.6k Views Asked by At

Say I have a directed graph with no cycles like this one. enter image description here And say someone travels along it choosing a random edge to go down at every node. We know that the person walking starts from node 0 and is going to end up in node 8.

Is there a way to figure what the likelihood of this person going through a certain node is? For instance we know that they are going through node 3 2/3 of the time as the starting node only has 3 options and 2 of them goes through node 3.

But is there a way to know the likelihood of them going through a more inner node like 7? I thought maybe if we knew how many possible paths exists and how many of those go through node 7 we could divide them and get the answer. But now I don't think that'll work since not all paths are equally likely.

Any help would be appreciated. Thanks!

5

There are 5 best solutions below

4
On BEST ANSWER

I thought maybe if we knew how many possible paths exists and how many of those go through node 7 we could divide them and get the answer. But now I don't think that'll work since not all paths are equally likely.

Funny, I went through the exact same thought process! Yes, I first thought of counting just the number of paths as well, but you're right, not all paths are equally likely. OK, so then just compute the probability of getting to a node by computing the probability of getting to any of its predecessors, and multiplying that by the probability of following the edge from that predecessor to the node in question. The image below shows the results (green means the probability of taking the edge, while red means the probability of getting to the node):

enter image description here

Just as an example: the probability of going through node $7$ is the probability of going through either of nodes 4, 5, or 6, respectively multiplied by the probability of taking the edge from that node to node $7$. Thus:

$$P(7) = P(4)\cdot \frac{1}{3}+P(5)\cdot \frac{1}{2}+P(6)\cdot \frac{1}{2} = \frac{2}{9}\cdot \frac{1}{3}+\frac{8}{27}\cdot \frac{1}{2}+\frac{19}{27}\cdot \frac{1}{2}=\bbox[yellow]{\frac{31}{54}}$$

Also, just for a sanity check, let's make sure the probability of getting to node $8$ is $1$:

$$P(8)=P(4)\cdot \frac{1}{3}+P(6)\cdot \frac{1}{2}+P(7)\cdot 1 =\frac{2}{9}\cdot \frac{1}{3}+\frac{19}{27}\cdot \frac{1}{2}+\frac{31}{54}\cdot 1 = \frac{4+19+31}{54} = 1$$ Yay!

0
On

Let $p_k$ be the probability that the path passes through node $k$ so that $p_0=1.$ We see that $p_1=\frac13$ since the only way to get to node $1$ is directly from node $0$. What about node $3?$ Well there's a $\frac13$ chance of getting there from node $0$, but if we get to node $1$ we certainly get to node $3$ so $$p_3=\frac13\cdot p_0+1\cdot p_1=\frac23$$

Continuing in this manner, you can compute $p_k$ for every node in the graph.

0
On

I will use a 3 nodes example

Make transition matrix of the probabilities to walk from one node to another node:

$T = \begin{bmatrix} P_{0\rightarrow 0} & P_{1\rightarrow 0} & P_{2\rightarrow 0} \\ P_{0\rightarrow 1} & P_{1\rightarrow 1} & P_{2\rightarrow 1} \\ P_{0\rightarrow 2} & P_{1\rightarrow 2} & P_{2\rightarrow 2} \end{bmatrix}$

Define the probability of the person starting at node 0, 1, and 2 at the beginning as:

$\vec{x}^{(0)} = \begin{bmatrix} P_{0} \\ P_{1} \\ P_{2} \end{bmatrix}$

Let $\vec{x}^{(n + 1)} = Tx^{(n)}$

The probability that the person visited node 2 is:

$\sum_{n = 0}^{n = \infty} \vec{x}^{(n)}_{2}$

where $\vec{x}^{(n)}_{2}$ denotes the 2nd element of vector x, counting from the 0th element.

Since the person can only walk from left to right and can't stop at a node, just stop the algorithm when all elements of $x^{(n)}$ are zero.

0
On

IN GENERAL, you could solve the problem like this:

Let $\chi$ be a stochastic vector in $\mathbb{R}^V_{\geq 0}$ (all coordinates nonengative and they sum to 1), and let us suppose that you start at node $i$ with probability $\chi_i$.

Then from the digraph you have a transition matrix $A$. Now let $S$ be the subset of nodes ($S$ could be a single node of course) which you want to measure the probability that you land on some $v \in S$ at some point.

Then let $\chi_1$ be the vector $A\chi$ and let $\chi'_1$ be the vector $\chi_1$ except the $v$-th coordinate of $\chi'_1$ is set to 0 for all $v \in S$. And for each $j$, let $\chi_j$ be the vector $A\chi'_{j-1}$, and let $\chi'_j$ be the vector $\chi_j$ except the $v$-th coordinate of $\chi'_j$ is set to 0, for all $v\in S$.

Then the probability that a node lands on $v$ at some point is $\sum_{j=1}^{\infty} \sum_{v \in S} \chi_j(v)$.

0
On

The stationary vector per @R zu answer and @Bram28 values can be calculated as such,

import numpy as np
import numpy.linalg as lin

A = [[0, 1/3., 1/3., 1/3., 0,  0,   0,    0,   0,  ],
     [0,  0,    0,    1.,  0,  0,   0,    0,   0,  ],
     [0,  0,    0,    0,   0,  1,   0,    0,   0,  ],
     [0,  0,   1/3., 1/3.,1/3.,0,   0,    0,   0,  ],
     [0,  0,    0,    0,   0, 1/3., 0,   1/3., 1/3.],
     [0,  0,    0,    0,   0,  0,  1/2., 1/2., 0   ],
     [0,  0,    0,    0,   0,  0,   0,   1/2., 1/2.],
     [0,  0,    0,    0,   0,  0,   0,    0,   1.  ],
     [0,  0,    0,    0,   0,  0,   0,    0,   0  ]]
A = np.array(A)

eval,evec = lin.eig(A.T)
pi = evec[:,np.argmax(eval)] / np.sum(evec[:,np.argmax(eval)])
print (np.abs(pi))
[0.         0.         0.         0.         0.         0.
 0.33333333 0.         0.        ]
[0.         0.         0.01149425 0.01149425 0.01149425 0.04597701
 0.06896552 0.18390805 0.66666667]

Stationarity BTW calculates how many times would we pass through a node from any starting value after infinite steps. That answer is highest for 8 obviously but it is not 1, above says 0.66.