Calculating the mean of a folded normal distribution


I have some data that I know is distributed according to the folded normal distribution. I am looking to calculate the mean and variance of the underlying "unfolded" normal distribution.

Let's say Y is normally distributed. My data is X = |Y|.

I do not know Y (the "ground truth"). Is it possible to calculate the mean of Y using only the X data?

I am not a mathematician or a statistician (just a biologist), so I am a bit lost with all these formulas.

[Image: μY formula] I thought this equation might be the answer, but I am not sure what μY actually represents (the mean of the unfolded normal, or perhaps the arithmetic mean of the data?). It's taken straight from the Wikipedia page: https://en.wikipedia.org/wiki/Folded_normal_distribution


There are 1 best solutions below


One option would be to set up a maximum likelihood estimate of the unknown mean value.

You collect the data $x_n$ for $n=1,\ldots,N$ and define the log-likelihood function $$L(\mu,\sigma) = \sum_{n=1}^N\log f(x_n;\mu,\sigma)$$ where $f(x_n;\mu,\sigma)$ is the pdf of the folded normal distribution. You plug in the data and treat $L$ as a function of the unknown parameters of the folded normal. By finding the parameters that maximize this function, you get the maximum likelihood estimates of the mean and variance of the underlying "unfolded" Gaussian distribution.
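As a rough illustration of the likelihood approach, here is a minimal Python sketch that uses only the standard library. The function names `log_likelihood` and `fit_folded_normal` are made up for this example. The maximization is done with an expectation-maximization style fixed-point iteration, chosen only to avoid external dependencies; any general-purpose numerical optimizer applied to $L$ should serve equally well. Note that the folded normal is unchanged when $\mu$ is replaced by $-\mu$, so only $|\mu|$ can be recovered from the data.

```python
import math

def log_likelihood(xs, mu, sigma):
    """L(mu, sigma) = sum_n log f(x_n; mu, sigma), f = folded normal pdf."""
    m = abs(mu)  # the pdf depends on mu only through |mu|
    total = 0.0
    for x in xs:
        # f(x) = N(x; m, sigma^2) + N(x; -m, sigma^2) for x >= 0.
        # Factoring out the first term leaves 1 + exp(-2*x*m/sigma^2).
        total += (-math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
                  - (x - m) ** 2 / (2.0 * sigma ** 2)
                  + math.log1p(math.exp(-2.0 * x * m / sigma ** 2)))
    return total

def fit_folded_normal(xs, max_iter=1000, tol=1e-10):
    """Return (|mu|, sigma) at a stationary point of the log-likelihood.

    Assumption for this sketch: the sign of each hidden Y_n is treated as
    missing data, and the resulting fixed-point updates are iterated.
    """
    n = len(xs)
    q = sum(x * x for x in xs) / n        # sample mean of x^2
    mu = sum(xs) / n                      # start at the sample mean (mu = 0 is a fixed point)
    sigma2 = max(q - mu * mu, 1e-12)
    for _ in range(max_iter):
        # Expected value of Y_n given x_n is x_n * tanh(x_n * mu / sigma^2).
        new_mu = sum(x * math.tanh(x * mu / sigma2) for x in xs) / n
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        sigma2 = max(q - mu * mu, 1e-12)  # guard against rounding below zero
        if converged:
            break
    return mu, math.sqrt(sigma2)
```

Calling `fit_folded_normal(data)` on a list of non-negative measurements returns the two estimates, and `log_likelihood(data, mu, sigma)` can be used to compare them against other candidate parameter values. Convergence can be slow when $|\mu|$ is small relative to $\sigma$, because the data then carry little information about the mean.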

Alternatively, if this is too cumbersome, you can do a method of moments estimate of the parameters, which is what you hint at in the question. To this end, you estimate the mean and variance of the data from the samples (i.e. the sample mean $\hat m$ and sample variance $\hat s^2$); then you set these equal to the theoretical mean and variance of the folded normal: $$\begin{cases} \hat m = \sigma \sqrt{\frac{2}{\pi}}\,\mathrm e^{-\mu^2/(2\sigma^2)} +\mu \left(1 - 2\Phi(-\mu/\sigma)\right)\\ \hat s^2 = \sigma^2 +\mu^2 - \hat m^2 \end{cases}$$ where $\Phi(\cdot)$ is the standard normal cumulative distribution function. This is a system of nonlinear equations that you can solve to find the mean $\mu$ and variance $\sigma^2$ of the underlying "unfolded" normal distribution. So the $\mu_Y$ in the Wikipedia formula is the mean of the folded data, while the $\mu$ on the right-hand side is the mean of the unfolded normal.
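One possible way to solve the moment equations, sketched in Python with only the standard library: the second equation gives $\sigma^2 = \hat s^2 + \hat m^2 - \mu^2$, so substituting it into the first leaves a single equation in $\mu$, which can be solved by bisection. The function names are invented for this example, and the mean formula is written with $\operatorname{erf}(\mu/(\sigma\sqrt 2))$, which equals $1 - 2\Phi(-\mu/\sigma)$. As with any estimate based on folded data, only $|\mu|$ is identifiable.

```python
import math

def folded_normal_mean(mu, sigma):
    """Theoretical mean of |Y| for Y ~ N(mu, sigma^2)."""
    return (sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu ** 2 / (2.0 * sigma ** 2))
            + mu * math.erf(mu / (sigma * math.sqrt(2.0))))

def method_of_moments(m_hat, s2_hat, tol=1e-12):
    """Return (|mu|, sigma) matching the sample mean and variance of X = |Y|."""
    q = s2_hat + m_hat ** 2        # second equation: sigma^2 + mu^2 = q
    root_q = math.sqrt(q)
    # With q held fixed, the folded mean increases from sqrt(2/pi)*sqrt(q)
    # at mu = 0 towards sqrt(q) as mu -> sqrt(q). Below that range there is
    # no solution, so this sketch falls back to mu = 0.
    if m_hat <= math.sqrt(2.0 / math.pi) * root_q:
        return 0.0, root_q
    lo, hi = 0.0, root_q
    while hi - lo > tol * root_q:
        mid = 0.5 * (lo + hi)
        sigma = math.sqrt(q - mid ** 2)
        if folded_normal_mean(mid, sigma) < m_hat:
            lo = mid               # candidate mu too small
        else:
            hi = mid               # candidate mu too large
    mu = 0.5 * (lo + hi)
    return mu, math.sqrt(q - mu ** 2)
```

In practice `m_hat` and `s2_hat` would be the sample mean and sample variance of the measured data. The fallback branch is a design choice for this sketch: with noisy samples the observed mean can fall below what any folded normal with that second moment allows, in which case the moment equations have no exact solution.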