Theorem Cramer-Wold.

Theorem (Cramer-Wold device): The distribution of a random $n$-vector $X$ is completely determined by the set of all one-dimensional distributions of linear combinations $t^TX$, where $t$ ranges over all fixed $n$-vectors.

Proof. $Y := t^TX$ has characteristic function $$\phi_Y(s) := E[e^{isY}] = E[e^{ist^TX}].$$ If we know the distribution of each $Y$, we know its CF $\phi_Y(s)$. In particular, taking $s = 1$, we know $E[e^{it^TX}]$. But this is the CF of $X = (X_1,\ldots,X_n)^T$ evaluated at $t = (t_1,\ldots,t_n)^T$. Since $t$ is arbitrary, we know the CF of $X$ at every point, and this determines the distribution of $X$.

My questions:

  1. If $s = 1$ then does the proof lose generality? Why not?
  2. My interpretation of the proof is that if the linear combination of the vector and the vector itself have the same characteristic function, then their distributions are the same. Is this interpretation correct?
  3. What does 'completely determined' mean in the theorem?

Best answer
  • I don't understand what you mean by "lose generality" in your first question. The only "generality" in the proof that seems to me to be of any importance is that it must work for any $\ n$-vector and for any $\ n\ $, which it does. Putting $\ s=1\ $ is merely one step in the proof, a step which you can legitimately take regardless of what the values of $\ n\ $ and $\ X\ $ are, so it doesn't impose any restriction on them.

  • A vector is not the same thing as a linear combination of its entries, which is always a scalar (unless the vector $\ X\ $ is $1$-dimensional and the linear combination is $\ 1\cdot X\ $). So neither the distribution nor the characteristic function of a random vector can be the same as those of a linear combination of that random vector's entries. The distribution of a random $\ n$-vector, as well as its characteristic function, will be a function of $\ n\ $ variables, whereas both the distribution and the characteristic function of any particular linear combination of the random vector's entries will be functions of just a single variable.

    The key point in the Cramer-Wold device is that you're required to know the one-dimensional distribution of not merely some particular linear combination of the random vector's entries, but of every such linear combination, and this enables you to deduce the random vector's distribution, a function of many variables, from that infinite family of functions of a single variable.
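
As a rough numerical illustration of the step with $s = 1$: for any $t$ and $s$, $\phi_{t^TX}(s) = E[e^{is\,t^TX}] = E[e^{i(st)^TX}] = \phi_X(st)$, so knowing the CF of every projection at $s = 1$ already gives the CF of $X$ at every point. The sketch below checks this identity by simulation. The two-dimensional Gaussian, the names `MEAN`, `A`, `sample_x`, `exact_cf_x` and `empirical_cf_y`, and the sample size are all assumptions chosen for the example; none of them come from the question or the answer above.

```python
import cmath
import random

# Hypothetical example: X = MEAN + A Z, with Z a standard normal 2-vector
# and A lower triangular, so X is Gaussian with covariance A A^T.
MEAN = (0.5, -1.0)
A = ((1.0, 0.0), (0.3, 1.2))


def sample_x(rng):
    """Draw one sample of the 2-vector X."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (MEAN[0] + A[0][0] * z1,
            MEAN[1] + A[1][0] * z1 + A[1][1] * z2)


def exact_cf_x(u):
    """Exact CF of X at the vector u: exp(i u^T mean - u^T Cov u / 2)."""
    c11 = A[0][0] ** 2
    c12 = A[0][0] * A[1][0]
    c22 = A[1][0] ** 2 + A[1][1] ** 2
    quad = c11 * u[0] ** 2 + 2 * c12 * u[0] * u[1] + c22 * u[1] ** 2
    return cmath.exp(1j * (u[0] * MEAN[0] + u[1] * MEAN[1]) - 0.5 * quad)


def empirical_cf_y(xs, t, s=1.0):
    """Monte Carlo estimate of the CF of the scalar Y = t^T X at s."""
    total = sum(cmath.exp(1j * s * (t[0] * x[0] + t[1] * x[1])) for x in xs)
    return total / len(xs)


rng = random.Random(0)
xs = [sample_x(rng) for _ in range(100_000)]

t = (0.7, -0.4)
s = 1.5

# General identity: phi_Y(s) should match phi_X(s t) up to sampling noise.
err_general = abs(empirical_cf_y(xs, t, s) - exact_cf_x((s * t[0], s * t[1])))

# The step used in the proof: phi_Y(1) should match phi_X(t).
err_s1 = abs(empirical_cf_y(xs, t) - exact_cf_x(t))

print(err_general, err_s1)
```

Both printed discrepancies should be small, on the order of the Monte Carlo error, and they shrink as the sample size grows. This only checks the identity for one assumed distribution and one choice of $t$; it is not a proof.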