$\newcommand{\mo}[1]{\operatorname{#1}}$ I am trying to get a better understanding of the inner workings of the expectation operator. Suppose we have random vectors $y$ and $x$.
How can one prove that the condition $$\mo E(y\,|\,x)=0$$
is equivalent to the two conditions combined: $$\mo{cov}(y,x)=0 \,\land\, \bigl(\mo E(y)=0 \,\lor\, \mo E(x)=0\bigr)$$
What's the best way to also prove that both of these conditions reduce to $\mo E(y)=0$ when $x$ is deterministic?
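For the deterministic case, a sketch (writing $c$ for the constant value of $x$): if $x=c$ almost surely, then conditioning on $x$ carries no information, and the covariance condition holds vacuously,
$$\mo E(y\,|\,x)=\mo E(y), \qquad \mo{cov}(y,x)=\mo E\bigl[(y-\mo E y)(c-c)\bigr]=0,$$
so both conditions collapse to the single requirement $\mo E(y)=0$.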
To complement Robert Israel's answer.
The condition $E[Y\mid X]=0$ is much more informative than {$Cov(X,Y)=0$, $E[X]=0$, $E[Y]=0$}. The latter is just three numbers; the former is (in general) a function: the regression curve.
{$Cov(X,Y)=0$, $E[X]=0$, $E[Y]=0$} only tells you that the variables are neither positively nor negatively correlated. But you can think of many examples in which this happens and yet the regression curve is not the trivial horizontal line. For example, take a uniform distribution over some 2D shape that is symmetric about the $Y$ axis, say an isosceles triangle with horizontal base, and move it up and down until both means are zero; then draw the regression curve!
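A quick numerical check of that counterexample (a sketch; the specific triangle with vertices $(\pm 1,0)$ and $(0,1)$ is my choice, matching the description above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Uniform sample over the isosceles triangle with vertices
# (-1, 0), (1, 0), (0, 1), via rejection sampling.
n = 200_000
x = rng.uniform(-1, 1, n)
y = rng.uniform(0, 1, n)
inside = y < 1 - np.abs(x)      # keep points below the slanted sides
x, y = x[inside], y[inside]

y = y - y.mean()                # shift so E[Y] = 0 (E[X] = 0 by symmetry)

print(np.cov(x, y)[0, 1])       # ~0: X and Y are uncorrelated

# But the regression curve E[Y | X = x] is a tent, not a flat line:
mid  = y[np.abs(x) < 0.1].mean()   # conditional mean near the center
edge = y[np.abs(x) > 0.8].mean()   # conditional mean near the corners
print(mid, edge)                   # mid is positive, edge is negative
```

So all three numbers vanish, yet the conditional expectation clearly varies with $x$ (theoretically $E[Y\mid X=x]=(1-|x|)/2-1/3$ for this triangle).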
In words, $Cov(X,Y)=0$ tells you roughly this: knowing that $X$ is bigger than expected does not tell me whether $Y$ tends to be bigger or smaller than expected. On the other hand, $E(Y \mid X)=E(Y)$ tells you this: knowing the value of $X$ does not alter the expected value of $Y$, for any value of $X$. The latter is much more informative.
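Concretely, the one-way implication is a short computation with the law of iterated expectations (a sketch): if $E[Y\mid X]=0$, then
$$E[Y]=E\bigl[E[Y\mid X]\bigr]=0, \qquad Cov(X,Y)=E\bigl[X\,E[Y\mid X]\bigr]-E[X]\,E[Y]=0,$$
so the three numbers come for free; it is the converse that fails.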
What is true is this: if we know/assume that the regression curve is a straight line, $E[Y \mid X]=a X +b$, then the data {$Cov(X,Y)=0$, $E[X]=0$, $E[Y]=0$} implies that the constants $a,b$ are zero, and then, yes, $E[Y \mid X]=0$.
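The linear case reduces to two linear equations (a sketch, assuming $\mathrm{var}(X)>0$): taking expectations of $E[Y\mid X]=aX+b$ and using iterated expectations for the covariance,
$$0=E[Y]=a\,E[X]+b=b, \qquad 0=Cov(X,Y)=E\bigl[X\,E[Y\mid X]\bigr]=a\,E[X^2]=a\,\mathrm{var}(X),$$
so $a=b=0$, and hence $E[Y\mid X]=0$.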