Conditioning on information about the moments of a random variable is trivial


Say we have some random variable, $X$. Is it always trivial to condition on information about the moments of $X$?

For example, suppose we know that $\mathbb{E}(X)$ is positive. Then $\mathbb{E}\left(X \mid \mathbb{E}(X)>0\right)=\mathbb{E}(X)$, since the conditioning event is just a fixed fact about a constant.

The same is true when $X,Y$ have some joint distribution: $\mathbb{E}(X \mid \mathbb{E}(X)>\mathbb{E}(Y))=\mathbb{E}(X)$.

There are 2 best solutions below


Your formulas are not really valid from a probabilistic viewpoint, since $E(X)>0$ is not a measurable event (i.e., it has no probability). $\{X=E(X)\}$ is a measurable event, but that is not what you are asking here. Intuitively, yes, you are correct that conditioning on useless information is not helpful, but the above formulation is not the way to express that: you can't condition on things that don't have a probability. For example, you might as well have written $E(X \mid 1+1=2)$.


To start with, your approach does not make much sense because, as Eupraxis1981 says, you are not conditioning on a random event.

Informally: what you know about the realization of the random variable (i.e., something about the value of $X$) is one thing, and what you know about the probability law of $X$ (i.e., something about the density $f_X(\cdot)$) is a completely different thing; you can't mix them. When you write $E(X)$, for example, you are implicitly assuming that you don't know anything about (the value of this realization of) $X$, but that you know everything about the probability law of $X$ (say, $f_X(x)$ or $F_X(x)$). Hence, it does not make sense to write $E(X \mid A)$ where $A$ is some event that gives us information about $f_X(\cdot)$: we already knew that.

Once you understand that the above is true, only then can you take the next step towards Bayesian probability, and learn that the above is perhaps not so true :-) I mean that in that approach you can mix (and do mix) the random variable with the parameters of its pdf (probability density function). Here one regards the parameters of a pdf as random variables themselves, so $E(X \mid \mu > 0)$ might make sense. But this requires some understanding and consideration.

For example: suppose we "know" (this word turns somewhat tricky in the Bayesian setting) that a variable $X$ follows a Gaussian pdf with unit variance, but we don't know the mean $\mu$; we only "know" ("a priori") that $\mu$ can take any value in $[-5,5]$ (more formally, $p(\mu) \approx U(-5,5)$). This is to say that what follows a Gaussian distribution is not actually $X$ alone, but $X$ given (conditioned on) the value of $\mu$. But $p(x)$ itself is not Gaussian; we can compute it as $p(x) = \int p(x \mid \mu)\, p(\mu)\, d\mu = \frac{1}{10} \int p(x \mid \mu)\, d\mu$, and from this we could compute the "unconditioned" expectation $E(X)$ (which in our example, by symmetry, would be $0$).
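The symmetry argument is just the tower property applied to the uniform prior:

$$E(X) \;=\; E\big(E(X \mid \mu)\big) \;=\; E(\mu) \;=\; \frac{-5+5}{2} \;=\; 0.$$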

Now, if we are given (in addition to the above) the information $I \equiv \{\mu > 0\}$, then our new $p(\mu \mid I)$ is a truncated density, which in our case is uniform on $[0,5]$. We then have a new $p(x \mid I)$, and $E(X \mid I) > 0$.
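A minimal Monte Carlo sketch of this hierarchical setup (my own illustration of the answer's example, using NumPy): draw $\mu$ from the uniform prior, draw $X \mid \mu$ from the Gaussian, and compare the unconditional mean with the mean after conditioning on $\mu > 0$.

```python
import numpy as np

# Hierarchical model from the answer:
# prior mu ~ Uniform(-5, 5), likelihood X | mu ~ Normal(mu, 1).
rng = np.random.default_rng(0)
n = 1_000_000

mu = rng.uniform(-5.0, 5.0, size=n)
x = rng.normal(loc=mu, scale=1.0)

# Unconditional expectation: E(X) = E(mu) = 0 by symmetry.
print("E(X)   ~", x.mean())

# Condition on I = {mu > 0}: the prior truncates to Uniform(0, 5),
# so E(X | I) = E(mu | I) = 2.5 > 0, as the answer claims.
x_given_I = x[mu > 0.0]
print("E(X|I) ~", x_given_I.mean())
```

Conditioning on $I$ here means keeping only the joint draws whose $\mu$ is positive, which is exactly the truncation of $p(\mu)$ described above.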