Let $(\Omega, \mathcal{F}, P)$ be a probability space and $X,Y$ two random variables defined on $(\Omega, \mathcal{F}, P)$. I stumbled upon the equality $$E[E(X \vert Y)]=E(X)$$
Observing that
$$\begin{align} E[E(X \vert Y)] & =\sum_iP(Y=Y_i)E(X\vert Y_i) \\ & = P(Y=Y_1)E(X \vert Y_1)+...+ P(Y=Y_n)E(X\vert Y_n) \\ &=E(X) \end{align}$$
it "mathematically" makes sense to me, but still not intuitively. How does one intuitively recognise that the expectation of a random variable equal the expectation of an expectation of that random variable given some information?
There are many ways of explaining this intuitively. I will use the equalities you provided, stick to the discrete case, and won't go much into the machinery of probability theory.
I claim that the sum $$\sum_i P(Y=Y_i)E(X|Y_i)$$ is essentially just the expected value of $X$ - a weighted average of all values of $X$ - in disguise. It is only a matter of how the space $\Omega$ is partitioned.
$E(X)$
To see that, let's see what $E(X)$ does:
$$E(X) = \sum_i P(X = X_i) X_i$$
Think of the event $[X=X_i]$ as the set of all cases (all the $\omega$s) where $X(\omega) = X_i$. These events form a partition of $\Omega$: they are disjoint sets whose union is $\Omega$.
In $E(X)$ we are therefore looking at the value of $X$ on the set $[X=X_i]$ and weighting it with the size of the set, $P(X=X_i)$.
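To make this concrete, here is a minimal sketch on a hypothetical four-point sample space (the names `omega`, `X`, and the uniform probabilities are made up for illustration): it computes $E(X)$ by grouping the $\omega$s into the events $[X=X_i]$ and weighting each value by the size of its event, and checks that this agrees with the plain per-$\omega$ average.

```python
from fractions import Fraction

# Hypothetical finite sample space Omega with uniform probabilities
omega = ["a", "b", "c", "d"]
P = {w: Fraction(1, 4) for w in omega}

# A hypothetical random variable X on Omega (note: X("a") == X("b"))
X = {"a": 1, "b": 1, "c": 2, "d": 3}

# E(X) via the partition [X = X_i]: weight each value X_i by P(X = X_i)
E_X = sum(
    sum(P[w] for w in omega if X[w] == x) * x  # P(X = x) * x
    for x in set(X.values())
)

# E(X) computed directly, omega by omega
E_X_direct = sum(P[w] * X[w] for w in omega)

assert E_X == E_X_direct  # same weighted average, just grouped differently
```

The assertion holds because grouping the $\omega$s by the value of $X$ and then weighting by event size is just a reordering of the per-$\omega$ sum.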
$E(E(X|Y))$
Now in your case: $$\sum_i P(Y=Y_i)E(X|Y_i)$$
Here, we are looking at the value of $X$ on the set $[Y=Y_i]$. But we can't just write $X_i$, since $X$ does not have to have the same value on all $\omega \in [Y=Y_i]$. Thus we have to average over all the cases in $[Y=Y_i]$, which is precisely what $E(X|Y_i)$ is. Then we just multiply this with the size of the set, which is $P(Y=Y_i)$.
In both cases, we are just aggregating the $\omega$s in different ways. In $E(X)$, it's over the sets $[X=X_i]$, in $E(E(X|Y))$, it's over the sets $[Y=Y_i]$.