What are the mean and variance of the log of a random variable?


Here's the problem.

We have a random variable X that follows a Poisson law. If we take the log of this variable, what are the first two moments (mean and variance) of the law it follows?

This looks like a simple question, but I can't find anything about it. Any idea?

EDIT: To avoid the $X=0$ case, we bias the variable: it becomes $\log(X+\epsilon)$ with $X \sim \text{Poisson}(\lambda)$.

There are 3 answers below.


With a Poisson distributed random variable $X$ with parameter $\lambda$, you have $P(X =0) = e^{-\lambda} \gt 0$.

$\log(0)$ is undefined, or at best negative infinity, so it is meaningless to talk about the mean and variance of $\log(X)$.
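A quick numeric check of this point (a sketch; the rate `lam = 2.0` is an arbitrary example value): even for a moderate rate, $P(X=0) = e^{-\lambda}$ is strictly positive, so $\log(X)$ takes the value $\log(0)$ with positive probability and its moments cannot be finite.

```python
import math

# A Poisson variable hits 0 with positive probability, so log(X)
# would take the value log(0) = -inf with that same probability.
lam = 2.0
p0 = math.exp(-lam)  # P(X = 0) = e^{-lam}, always > 0 for finite lam
```

However small `lam` makes `p0`, it never reaches zero, which is exactly why $\mathbb{E}[\log(X)]$ is undefined for an unmodified Poisson variable.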


Recall that the mean $\mathbb{E}(X)$ of a random variable $X$ is a weighted average of the possible values of $X$, weighted by the probability $P(X = x)$ of each outcome $x$. Hence, the mean of a function $f(X)$ is a weighted average of the possible values of $f(X)$, weighted by the probability that $f(X)$ takes the value $f(x)$. But that probability is the same as the probability that $X$ takes the value $x$ (this is not exactly true, since $f$ might take the same value for multiple $x$, but for the purpose of a weighted average it doesn't matter: the effective weight on $f(x)$ is the sum of the probabilities of each such $x$ either way). This gives the following formulas for the mean of $f(X)$, with an integral or a sum depending on whether $X$ is continuous (with density $p$) or discrete: $$\mathbb{E}(f(X)) = \int f(x)\, p(x)\, dx$$ $$\mathbb{E}(f(X)) = \sum_x f(x) \cdot P(X = x).$$

Similarly, the variance is a weighted average of the squared difference between each value $x$ and the mean $\mathbb{E}(X)$, weighted by the probability $P(X=x)$. So once we have the mean, we can calculate the variance of $f(X)$ "by hand" using (with $p$ the density of $X$ in the continuous case): $$\text{Var}(f(X)) = \int \left(f(x) - \mathbb{E}(f(X)) \right)^2 p(x)\, dx$$ $$\text{Var}(f(X)) = \sum_x \left(f(x) - \mathbb{E}(f(X)) \right)^2 P(X = x).$$

In both cases, the sum or integral is taken over all possible values $x$ of your variable $X$.

Note that, as per Henry's answer, naively plugging $f(x) = \log(x)$ into these formulas will get you in trouble if $X \sim \text{Poisson}(\lambda)$, since the $x = 0$ term involves $\log(0) = -\infty$. For a Poisson distribution, which allows $X = 0$, taking the mean and variance of $\log(X)$ is only sensible if you first exclude $X = 0$ from your domain somehow. So this answer is conditional on $X$ remaining in the domain of $f$.
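The sums above can be evaluated directly in code. Here is a minimal sketch using the OP's workaround $\log(X+\epsilon)$; the helper names, the default $\epsilon$, and the truncation point of the infinite sum are my own choices, not part of the question:

```python
import math

def poisson_pmf(x, lam):
    # P(X = x) for X ~ Poisson(lam), computed via logs so that
    # lam**x / x! does not overflow for large x
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def log_moments(lam, eps=1e-3, n_terms=400):
    """Mean and variance of log(X + eps) for X ~ Poisson(lam),
    by direct (truncated) summation of the weighted-average formulas."""
    mean = sum(math.log(x + eps) * poisson_pmf(x, lam) for x in range(n_terms))
    var = sum((math.log(x + eps) - mean) ** 2 * poisson_pmf(x, lam)
              for x in range(n_terms))
    return mean, var

m, v = log_moments(lam=10.0)
```

For $\lambda = 10$ this lands close to the delta-method values $\log\lambda \approx 2.30$ and $1/\lambda = 0.1$ from the answer below, with the small $X=0$ term contributing a slight downward shift to the mean.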


This is also called the delta method. If $g(X)$ is a differentiable function of a random variable $X$, in practice we can use a first-order Taylor expansion at $\bar{X}$, the mean of $X$: $$g(X) \approx g(\bar{X}) + g'(\bar{X})(X-\bar{X})$$ Taking expectations, $$E[g(X)] \approx g(\bar{X}) + g'(\bar{X})\,E(X-\bar{X}) = g(\bar{X}),$$ because the second term is 0. Similarly, the variance is: $$\text{Var}[g(X)] \approx \text{Var}\left(g'(\bar{X})X\right) = \left[g'(\bar{X})\right]^2\text{Var}(X)$$ In your case $X\sim \text{Poisson}(\lambda)$ and $g(X) = \log X$; plugging in gives: $$E(\log X) \approx \log\lambda$$ $$\text{Var}(\log X) \approx \frac{1}{\lambda}$$ You still have to be careful with the $X=0$ case, as the other answers have already stated.
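A quick numerical sanity check of the delta-method result (a sketch; the rate `lam = 25.0` and the truncation point are arbitrary choices of mine, and the $X=0$ problem is dodged by conditioning on $X > 0$, which costs almost nothing since $P(X=0)=e^{-25}$ is negligible):

```python
import math

def poisson_pmf(x, lam):
    # P(X = x), via logs to avoid overflow in lam**x / x!
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

lam = 25.0

# Delta-method approximations for g(X) = log(X)
mean_approx = math.log(lam)   # g(lambda)
var_approx = 1.0 / lam        # [g'(lambda)]^2 * Var(X) = (1/lam)^2 * lam

# Reference moments by truncated summation, conditioning on X > 0
p_pos = 1.0 - math.exp(-lam)
mean_exact = sum(math.log(x) * poisson_pmf(x, lam)
                 for x in range(1, 600)) / p_pos
var_exact = sum((math.log(x) - mean_exact) ** 2 * poisson_pmf(x, lam)
                for x in range(1, 600)) / p_pos
```

The agreement improves as $\lambda$ grows, which matches the intuition behind the first-order expansion: for large $\lambda$ the distribution concentrates near its mean, where the linearization of $\log$ is accurate.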