$\chi^2$ Distribution for Rectified Gaussians


We know that a chi-squared distribution with $k$ degrees of freedom is the distribution of the sum of $k$ squared independent standard Gaussian random variables $Z_i \sim N(0, 1)$:

$$ Y = \sum_{i=1}^k Z_i^2. $$

If we have another random variable $Z_i^R$ which is a rectified Gaussian, $$ Z_i^R = \max(0, Z_i), $$

would it be logical to say that the sum of the squared independent rectified Gaussian random variables $Z_i^R$ follows a chi-squared distribution with $k/2$ degrees of freedom?

$$ Y^R = \sum_{i=1}^{k} (Z^R_{i})^2 = \sum_{i=1}^{k/2} Z_i^2 $$ $$ Y^R \sim \chi^2_{k/2} $$

Furthermore, since $\textrm{Var}[Y] = 2k$, would $\textrm{Var}[Y^R] = k$? And since $E[Y] = k$, would $E[Y^R] = k/2$?


BEST ANSWER

Addressing the mean and variance part:

  • $E\left[(Z_i^R)^2\right]=\frac12 E[Z_i^2]=\frac12$
  • $E\left[(Z_i^R)^4\right]=\frac12 E[Z_i^4]=\frac32$
  • $\textrm{Var}\left((Z_i^R)^2\right)=\frac32 - \left(\frac12\right)^2=\frac54$

so

  • $E[Y^R]=\frac12k$
  • $\textrm{Var}(Y^R)=\frac54k$

meaning that although the means match, the variance is higher than you suggested and $Y^R \not \sim \chi^2_{k/2}$.

This is no surprise: on average half of the $Z_i$ are positive, which accounts for the mean, but sometimes few or none of them are positive and sometimes most or all are, so the dispersion is greater than it would be if exactly half of them were always positive.
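These moments are easy to check numerically. Below is a minimal Monte Carlo sketch (the choices of $k$, sample size, and seed are my own, purely for illustration):

```python
import numpy as np

# Monte Carlo sketch: estimate the mean and variance of
# Y^R = sum of squared rectified standard Gaussians.
# k, n, and the seed are arbitrary illustration choices.
rng = np.random.default_rng(0)
k = 10
n = 200_000

Z = rng.standard_normal((n, k))
Y_R = np.sum(np.maximum(Z, 0.0) ** 2, axis=1)

mean_est = Y_R.mean()  # should be close to k/2 = 5
var_est = Y_R.var()    # should be close to 5k/4 = 12.5
print(mean_est, var_est)
```

The variance estimate comes out near $5k/4$, not the value $k$ that a $\chi^2_{k/2}$ distribution would give.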


The claim seems false to me: $$\begin{aligned}P(Y^R=0)&=P\left(\bigcap_{\ell \leq k}\{(Z_\ell^R)^2=0\}\right)\\ &=\prod_{\ell \leq k}P\left((Z_\ell^R)^2=0\right)\\ &=\prod_{\ell \leq k}P\left((\max(Z_\ell,0))^2=0\right)\\ &=\prod_{\ell \leq k}P(Z_\ell\leq 0)\\ &=\frac{1}{2^k}>0,\end{aligned}$$ while $P\left(\sum_{\ell\leq k/2}Z_\ell^2=0\right)=0$.
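The point mass at zero is also easy to see empirically. A quick sketch (my own choices of $k$, sample size, and seed): since `np.maximum(Z, 0.0)` is exactly `0.0` whenever $Z_\ell \le 0$, the event $\{Y^R = 0\}$ can be tested with an exact floating-point comparison.

```python
import numpy as np

# Empirical check of P(Y^R = 0) = 2^{-k}; k, n, and the seed
# are arbitrary illustration choices.
rng = np.random.default_rng(1)
k = 4
n = 400_000

Z = rng.standard_normal((n, k))
Y_R = np.sum(np.maximum(Z, 0.0) ** 2, axis=1)

# Y^R is exactly 0.0 iff every Z_i <= 0, so the comparison is safe.
p_zero = np.mean(Y_R == 0.0)  # should be close to 2**-4 = 0.0625
print(p_zero)
```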


Let us write $Z^+ = \max\{0,Z\}$, the positive part of $Z$.

We have $$ \mathbb P (\max\{0,Z\} \ge t) = \mathbb P( Z \ge t \;\text{ or }\; 0 \ge t) = \begin{cases} \mathbb P(Z \ge t) & t\ge 0 \\ 1 & t < 0. \end{cases} $$ This means that the distribution of $Z^+$ has a point mass of $1/2 = \mathbb P(Z \le 0)$ at $t = 0$.

A moment of thought gives that the distribution of $Z^+$ can be described as the mixture of a point mass at 0 and the distribution of $|Z|$, with equal weights (1/2 each).


Another way to see this: let $B_i$ be the indicator of $Z_i > 0$, that is, $B_i = 1\{Z_i > 0\}$. Then, $$ Z^+_i = \begin{cases} |Z_i| & B_i = 1 \\ 0 & B_i = 0. \end{cases} $$ Let $S = \{i \in [k] : B_i = 1\}$. Then, $$ Y^R = \sum_{i=1}^k (Z_i^+)^2 = \sum_{i \in S} Z_i^2. $$ Thus, conditioned on $S$, the distribution of $Y^R$ is $\chi^2_{|S|}$. How many cases have $|S| = m$? There are $\binom{k}{m}$, and each has probability $2^{-k}$. Thus the distribution of $Y^R$ can be described as the following mixture of chi-squares: $$ Y^R \sim \sum_{m=0}^k \frac{\binom{k}{m}}{2^k} \chi^2_m, $$ where we interpret $\chi^2_0$ as the identically zero random variable.
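This mixture representation can be sampled directly and compared against simulating the sum of squared rectified Gaussians itself. A sketch (my own choices of $k$, sample size, and seed, using NumPy's binomial and chi-square generators):

```python
import numpy as np

# Compare direct simulation of the sum of squared rectified Gaussians
# with sampling from the Binomial(k, 1/2) mixture of chi-squares.
# k, n, and the seed are arbitrary illustration choices.
rng = np.random.default_rng(2)
k = 6
n = 200_000

# Direct: sum of squared rectified standard Gaussians.
Z = rng.standard_normal((n, k))
direct = np.sum(np.maximum(Z, 0.0) ** 2, axis=1)

# Mixture: draw m ~ Binomial(k, 1/2), then chi^2_m (zero when m = 0).
m = rng.binomial(k, 0.5, size=n)
mixture = np.where(m > 0, rng.chisquare(np.maximum(m, 1), size=n), 0.0)

# Both samples should have mean k/2 = 3 and variance 5k/4 = 7.5.
print(direct.mean(), mixture.mean(), direct.var(), mixture.var())
```

The agreement of the first two moments is of course only a sanity check; the two sampling schemes agree in distribution, including the shared point mass of $2^{-k}$ at zero.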