Could you give me an intuition for the statement: "The Markov chain converges to its stationary distribution"?
I know the math behind it. I'm asking for an intuition without using mathematical notation. My idea is: suppose I draw an initial realisation of the random variable $X$, say $x$, and then I collect hundreds and hundreds of points in the support of $X$ by following the transition probabilities; the empirical pdf of this sample of points is then very close to the stationary distribution. Is this correct?
Not entirely correct. Convergence to the stationary distribution means that if you run many independent copies of the chain, each starting at the same $X_0=x_0$, to obtain many samples of $X_n$, then the empirical distribution of $X_n$ will be close to the stationary distribution for large $n$, and will get closer to it (and converge) as $n$ increases.
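You can see this kind of convergence directly by propagating the distribution of $X_n$ itself. Here is a small NumPy sketch; the three-state transition matrix is made up purely for illustration, and the stationary distribution is recovered as the left eigenvector of $P$ for eigenvalue $1$:

```python
import numpy as np

# An illustrative irreducible, aperiodic chain on {0, 1, 2}
# (transition probabilities invented for this example).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Start deterministically in state 0.
mu = np.array([1.0, 0.0, 0.0])

# The distribution of X_n is mu_0 @ P^n; iterate and watch it settle.
for n in range(50):
    mu = mu @ P

# Stationary distribution: left eigenvector of P for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()

print(np.round(mu, 6))
print(np.round(pi, 6))
```

After 50 steps the distribution of $X_n$ agrees with $\pi$ to many decimal places, because for an aperiodic irreducible chain the convergence is geometric.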
A chain might have a stationary distribution and yet not converge to it. This happens for periodic chains; a simple example is a chain on the two states $\{0,1\}$ that switches state on every step. Its stationary distribution is $(1/2,1/2)$; however, the distributions of $X_n$ clearly do not converge to it unless we start in the stationary distribution.
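The failure is easy to see numerically (a NumPy sketch; the matrix below is the deterministic-flip chain just described):

```python
import numpy as np

# Periodic chain on {0, 1}: flips state deterministically each step.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

mu = np.array([1.0, 0.0])   # start in state 0
dists = []
for n in range(6):
    dists.append(mu.copy())
    mu = mu @ P
# The distribution of X_n oscillates (1,0), (0,1), (1,0), ...
# and never approaches (1/2, 1/2).

# But if we start in the stationary distribution, we stay there:
pi = np.array([0.5, 0.5])
print(np.allclose(pi @ P, pi))
```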
On the other hand, for irreducible chains, your notion of convergence (the empirical distribution of a single trajectory converging to some distribution) is equivalent to the chain having a stationary distribution. So if you want to find a stationary distribution, run the chain for a long time and take the empirical distribution of its trajectory.
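That recipe can be sketched in a few lines of NumPy (the three-state transition matrix is again invented just for illustration; the seed is fixed so the run is reproducible):

```python
import numpy as np

rng = np.random.default_rng(0)

# An illustrative irreducible three-state chain.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Simulate ONE long trajectory and count visits to each state.
n_steps = 100_000
x = 0
counts = np.zeros(3)
for _ in range(n_steps):
    counts[x] += 1
    x = rng.choice(3, p=P[x])

empirical = counts / n_steps

# Compare with the stationary distribution (left eigenvector
# of P for eigenvalue 1).
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()

print(np.round(empirical, 3))
print(np.round(pi, 3))
```

The fraction of time the trajectory spends in each state matches the stationary distribution, which is exactly the long-run-frequency intuition from the question, applied to a single run of the chain.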