On a chessboard there is a (drunken) king. The king moves at the beginning of each minute, in a random direction: up, down, left, right, or the four diagonal directions (unless the king is on an edge or corner, in which case the available directions are limited accordingly).
I have worked out the following theorem:
At a moment in the far future, the probability that the king is on each square is proportional to the number of adjacent squares.
I am looking for a simple, direct way of understanding why this theorem is true.
(I know the theorem is true because I proved it in the following rather complicated way. Express the probability that the king is on each square, in terms of the probabilities that the king was on each adjacent square one minute ago. This gives a system of equations, which must have a unique solution, since there is only one correct value for each probability. If the theorem is true, then this yields a solution of the system of equations. And since the system must have a unique solution, the theorem must be true.)
The important concept here is detailed balance. A chain is in detailed balance if when it is governed by its invariant distribution, the probability currents between each pair of states are equal, as opposed to just the in-current to and the out-current from each single state being equal. This amounts to the equations
$$\pi_i p_{ij}=\pi_j p_{ji}$$
for all $i \neq j$. You can see by counting equations and unknowns (counting normalization you have ${n \choose 2}+1$ linear equations in $n$ unknowns) that this is a rather strong assumption. However, it comes up in a lot of systems of interest. Generally this assumption is used without checking it up-front. Instead you assume it and the $\pi$ you find serves to verify your assumption was consistent, or else you get a contradiction and move on.
In the case of the random walk on a bounded 2D grid, the result actually depends on which way you define the boundary condition. Three natural choices are:
In cases 1 and 3, the detailed balance conditions for adjacent squares read $\pi_i/8 = \pi_j/8$ and so $\pi_i=\pi_j$, so you have the uniform distribution. In case 3 this is easy to understand because all squares are indistinguishable. In case 1 this is a little bit counterintuitive, but the idea is that the loss of current from neighboring squares is perfectly balanced by the increased chance to not move from an edge square or a corner square.
In case 2, which is evidently the case you are interested in, you instead have
$$\frac{\pi_i}{d_i} = \frac{\pi_j}{d_j} \Rightarrow \frac{\pi_i}{\pi_j}=\frac{d_i}{d_j}$$
where $d_i$ is the number of squares adjacent to $i$. Now you can use this to find a measure and then normalize it: arbitrarily pick a measure $q$ where the squares with $d=8$ have measure $1$, then $q_i=\frac{d_i}{8}$ and $\pi_i=\frac{q_i}{\sum_j q_j}=\frac{d_i}{\sum_j d_j}$ as desired.