$\text{Var}[X]$ as an expression of $\text{Var}[\text{sgn}(X)\ln(|X|)]$ for $X\sim \mathcal N(0,\sigma^2)$


Assume $X\sim \mathcal N(0,\sigma^2).$

Assume $\operatorname{Var}[Y]$ is known, where $Y=\ln(X)$ for $X>0,~Y=0$ for $X=0$ and $Y=−\ln(−X)$ for $X<0.$

Is there a way to approximate $\operatorname{Var}[X]$ with a simple formula based on $\operatorname{Var}[Y]?$

I am aware of the Delta Method (there are several threads about it on this site).

How can the Delta Method be applied in this special case? If it cannot be adapted to accommodate it, what else do you recommend, and what are the implications?


Best answer

A concise way of writing $Y$ as a function of $X$ is $Y = \mbox{sign}(X)\log(|X| + 1_{X=0})$. Since $\{X=0\}$ has probability $0$, you can ignore it and take the liberty of writing $Y = \mbox{sign}(X)\log(|X|)$.
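As a quick numerical sketch of this transform (NumPy; the value $\sigma=2$ and the sample size are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=100_000)  # X ~ N(0, sigma^2) with sigma = 2

# Y = sign(X) * log|X|; the event {X = 0} has probability 0,
# so no special handling is needed for continuous samples.
y = np.sign(x) * np.log(np.abs(x))
```

Note that for $|X| > 1$ the sign of $Y$ agrees with the sign of $X$, while for $|X| < 1$ it is flipped, since $\log|X|$ changes sign at $|X|=1$.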

Because the density of $|X|$ is bounded near $0$, we have $\displaystyle{\int_0^1} \log(x)^2 f_{|X|}(x)dx \le C \displaystyle{\int_0^1} \log(x)^2dx < \infty$ for some constant $C$; combined with $\displaystyle{\int_1^{+\infty}} \log(x)^2 f_{|X|}(x)dx < \infty$, this shows that $Y$ has a finite variance.

You can write $X = \sigma N$, with $N \sim \mathcal{N}(0,1)$ a standard normal variable and $\sigma > 0$ the standard deviation of $X$. Then almost everywhere (except on $\{X=0\}$), $$Y = \mbox{sign}(N)\log(\sigma |N|) = \log(\sigma)\mbox{sign}(N) +\mbox{sign}(N) \log(|N|).$$ Since $\mbox{sign}(N)$ and $|N|$ are independent and $\mathbb{E}(\mbox{sign}(N)) = 0$, we get $\mathbb{E}(Y) = 0$, and, using $\mbox{sign}(N)^2 = 1$, $\mbox{Var}(Y) = \mathbb{E}(Y^2) = \log(\sigma)^2 + 2\log(\sigma)\mathbb{E}(\log(|N|))+\mathbb{E}(\log(|N|)^2)$.

You can find an explicit value for $\mbox{Var}(Y)$ as a quadratic polynomial in $\log(\sigma)$ since $2 \mathbb{E}(\log(|N|))=-\gamma -\log(2)\simeq -1.27$ and $\mathbb{E}(\log(|N|)^2)=\frac{2\gamma^2+\pi^2+2\log(2)^2+\gamma\log(16)}{8} \simeq 1.637$, where $\gamma$ is the Euler-Mascheroni constant. The minimum value of the variance is $\frac{\pi^2}{8}$, attained at the point $\log(\sigma_0) = \frac{\gamma+\log(2)}{2}$.
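These constants are easy to check numerically. The sketch below (NumPy; `np.euler_gamma` is the Euler-Mascheroni constant) compares the closed forms against a seeded Monte Carlo estimate:

```python
import numpy as np

g = np.euler_gamma  # Euler-Mascheroni constant, ~0.5772

# Closed-form moments of log|N| for N ~ N(0,1), as stated above.
m1 = -(g + np.log(2)) / 2                                      # E[log|N|] ~ -0.635
m2 = (2*g**2 + np.pi**2 + 2*np.log(2)**2 + g*np.log(16)) / 8   # E[log|N|^2] ~ 1.637

# Monte Carlo check (seeded; tolerances are loose on purpose).
rng = np.random.default_rng(1)
n = np.abs(rng.standard_normal(1_000_000))
print(np.log(n).mean(), m1)
print((np.log(n) ** 2).mean(), m2)
```

One can also check that $m_2 - m_1^2 = \frac{\pi^2}{8}$, i.e. that the minimum of the quadratic is exactly $\operatorname{Var}(\log|N|)$.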

Inverting the quadratic polynomial (in $\log \sigma$) yields two solutions which are simple to write: $\log \sigma = \frac{\gamma+\log(2)}{2} \pm \sqrt{\mbox{Var}(Y) - \frac{\pi^2}{8}}$.
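A small round-trip sketch of this inversion (the helper names `var_y` and `log_sigma_roots` are mine, not from the thread): pick a known $\sigma$, compute the theoretical $\mbox{Var}(Y)$ from the quadratic, and recover $\log\sigma$ as one of the two roots.

```python
import numpy as np

g = np.euler_gamma
log_sigma0 = (g + np.log(2)) / 2   # location of the minimum
v_min = np.pi**2 / 8               # minimum attainable Var(Y)

def var_y(log_sigma):
    """Theoretical Var(Y) as a quadratic polynomial in log(sigma)."""
    m1 = -(g + np.log(2)) / 2
    m2 = (2*g**2 + np.pi**2 + 2*np.log(2)**2 + g*np.log(16)) / 8
    return log_sigma**2 + 2 * log_sigma * m1 + m2

def log_sigma_roots(v):
    """Both roots of the quadratic; valid only when v >= pi^2/8."""
    d = np.sqrt(v - v_min)
    return log_sigma0 - d, log_sigma0 + d

# Round trip with sigma = 3 (> sigma0), so the larger root is the true one.
lo, hi = log_sigma_roots(var_y(np.log(3.0)))
```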

This is where you get stuck: since you want an estimate based only on $\mbox{Var}(Y)$, this equation has one or two solutions in $\mathbb{R}$ for $\log(\sigma)$, and possibly none at all once you replace the theoretical value $\mbox{Var}(Y)$ with an estimator (which may fall below $\frac{\pi^2}{8}$).

You have no way of deciding which root to select as an estimate of $\log(\sigma)$. One solution is to choose the larger root if you have reason to believe that $\sigma \ge \sigma_0 = \exp\Big(\frac{\gamma+\log(2)}{2}\Big) \simeq 1.887$, and the smaller root otherwise. And if your observed variance falls below the lower bound $\frac{\pi^2}{8} \simeq 1.234$, you can take as your estimate $\hat{\sigma} = \sigma_0$, the point where the minimum variance is attained.


Summary for the estimator: denoting $\sigma_0 = \exp\Big(\frac{\gamma+\log(2)}{2}\Big)$ and $\hat{V}$ the estimator for $\mbox{Var}(Y)$ we estimate $\hat{\sigma} = \begin{cases} \sigma_0 & \mbox{ if $\hat{V} < \frac{\pi^2}{8}$} \\ \sigma_0\exp\big(\sqrt{\hat{V}-\frac{\pi^2}{8}}\big) & \mbox{ otherwise if you have evidence that $\sigma \ge \sigma_0$} \\ \sigma_0\exp\big(-\sqrt{\hat{V}-\frac{\pi^2}{8}}\big) & \mbox{ otherwise if you have evidence that $\sigma \le \sigma_0$} \end{cases}$
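The case analysis above translates directly into code. A hedged sketch (the function name `estimate_sigma` and the `assume_large` flag encoding the "evidence" are my own choices, not from the thread):

```python
import numpy as np

G = np.euler_gamma
SIGMA0 = np.exp((G + np.log(2)) / 2)  # ~ 1.887
V_MIN = np.pi**2 / 8                  # ~ 1.234

def estimate_sigma(y, assume_large=True):
    """Estimate sigma from samples of Y = sign(X) log|X|, following the
    three-case summary above. `assume_large` encodes outside evidence
    that sigma >= sigma0 (True) or sigma <= sigma0 (False)."""
    v_hat = np.var(y)
    if v_hat < V_MIN:
        return SIGMA0
    d = np.sqrt(v_hat - V_MIN)
    return SIGMA0 * np.exp(d if assume_large else -d)

# Sanity check on synthetic data with a known sigma > sigma0.
rng = np.random.default_rng(2)
sigma = 5.0
x = rng.normal(0.0, sigma, size=200_000)
y = np.sign(x) * np.log(np.abs(x))
print(estimate_sigma(y))  # should land near 5
```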

You can get the "evidence" from a Bayesian approach for the estimation of $\sigma$, or combine this estimator with an estimator based on the average of $Y$ or its maximum sampled value.