Is there a closed-form expression for the entropy of the tanh transform of a Gaussian random variable?


My question is whether there is a closed-form (analytical) expression for the entropy of a variable $u$ defined as the $\tanh$ of a Gaussian random variable $x$. The reason I need a closed-form solution is that this is part of a neural network, and I need to be able to differentiate (find the gradient) with respect to the mean $\mu$ and standard deviation $\sigma$ of $x$.

For a Gaussian random variable $x \sim \mathcal{N}(\mu, \sigma^2)$, I know that the entropy is $h(X)=\frac{1}{2}\log(2\pi \sigma^2) + \frac{1}{2}$, which is in closed form.
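(As a quick sanity check of this closed form — an illustrative sketch not from the original post, with function names of my own — it can be compared against a direct numerical integration of $-\int f \ln f$:)

```python
import math

def gaussian_entropy_closed_form(sigma):
    # h(X) = 0.5*ln(2*pi*sigma^2) + 0.5, in nats (natural-log units)
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + 0.5

def gaussian_entropy_numeric(mu, sigma, n=20001, width=12.0):
    # Midpoint rule for -integral of f(x)*ln f(x) over mu +/- width*sigma;
    # the truncated tails carry negligible mass at 12 standard deviations.
    lo = mu - width * sigma
    step = 2.0 * width * sigma / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        f = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        if f > 0.0:  # guard against underflow in the far tails
            total -= f * math.log(f) * step
    return total
```

Note the entropy depends only on $\sigma$, not on $\mu$, which the numeric version confirms.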

I have the transform $u = \tanh(x)$, and I'd like to get the entropy of this random variable. From the Differential entropy Wikipedia page, I know I can write the entropy of $u$ as:

$$ h(U) = h(X) + \int f(x) \log\bigg|\frac{d (\tanh(x))}{dx}\bigg| dx$$

with $f(x)=\frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ the probability density function of the Gaussian distribution. I've tried to solve the integral term on the right-hand side but haven't been able to figure it out. I also tried (with my limited knowledge of) Wolfram Alpha, without success. Is there a closed-form expression, and if so, what does it look like?
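(A quick numerical sketch — illustrative only, stdlib Python, function names my own — shows how the integral term above can be evaluated by quadrature, using $\frac{d\tanh(x)}{dx} = 1-\tanh^2(x)$:)

```python
import math

def log_dtanh(x):
    # ln|d tanh(x)/dx| = ln(1 - tanh(x)^2) = 2*ln(sech x),
    # computed stably for large |x| as 2*(ln 2 - |x| - ln(1 + exp(-2|x|)))
    ax = abs(x)
    return 2.0 * (math.log(2.0) - ax - math.log1p(math.exp(-2.0 * ax)))

def tanh_entropy(mu, sigma, n=20001, width=12.0):
    # h(U) = h(X) + integral of f(x)*ln|d tanh(x)/dx| dx,
    # evaluated by the midpoint rule over mu +/- width*sigma
    h_x = 0.5 * math.log(2.0 * math.pi * sigma ** 2) + 0.5
    lo = mu - width * sigma
    step = 2.0 * width * sigma / n
    corr = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        f = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        corr += f * log_dtanh(x) * step
    return h_x + corr
```

Since $U$ lives on $[-1,1]$, its entropy is bounded above by $\ln 2$ (the uniform case), and it falls sharply as $|\mu|$ grows and $\tanh$ saturates.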

Many thanks in advance!

1 Answer
In general, for a transform of a random variable, the relation you state holds as an inequality,

$$ h(U) \leq h(X) + \int f(x) \log\bigg|\frac{d (\tanh(x))}{dx}\bigg| dx, $$

with equality when the transform is invertible and differentiable. Since $\tanh$ is strictly increasing, equality does hold here; to verify this and obtain an explicit expression for the differential entropy of the transformed random variable, we can work directly from the definition:

$$ h(U)=-\int _{\operatorname{supp}(U)}f_U(x)\log f_U(x)\,dx$$

First we need to compute the density of $U=\tanh(X)$. Note that the cdf of $U$ is:

$$F_U(x)= \begin{cases} 1 & \text{if } x>1 \\ F_X(\tanh^{-1}(x)) & \text{if } x \in [-1,1] \\ 0 & \text{if } x<-1 \end{cases} $$

Since $F_U$ is continuous ($F_X(\tanh^{-1}(-1))=F_X(-\infty)=0$ and $F_X(\tanh^{-1}(1))=F_X(\infty)=1$) and differentiable on $(-1,1)$, $U$ has a Lebesgue pdf, obtained by differentiating the cdf; the chain rule brings in the Jacobian $\frac{d}{dx}\tanh^{-1}(x)=\frac{1}{1-x^2}$:

$$ f_U(x)=\frac{f_X(\tanh^{-1}(x))}{1-x^2}\, I_{(-1,1)}(x) = \frac {1}{\sigma \sqrt{2 \pi}\,(1-x^2)}\, e^{-\frac{1}{2}\left(\frac{\tanh^{-1}(x) - \mu}{\sigma}\right)^2} I_{(-1,1)}(x)$$
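As a cross-check (an illustrative stdlib-only sketch; function names are mine): the density of $U=\tanh(X)$ must carry the Jacobian factor $\frac{d}{du}\tanh^{-1}(u)=\frac{1}{1-u^2}$, which we can verify by comparing a numerical integral of $f_U$ with the exact cdf $F_X(\tanh^{-1}(u))$ written via the error function:

```python
import math

def f_u(u, mu, sigma):
    # density of U = tanh(X), X ~ N(mu, sigma^2):
    # f_U(u) = f_X(atanh(u)) / (1 - u^2) on (-1, 1) -- note the Jacobian
    x = math.atanh(u)
    f_x = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return f_x / (1.0 - u * u)

def F_u_exact(u, mu, sigma):
    # cdf via F_U(u) = F_X(atanh(u)), with the Gaussian cdf from erf
    return 0.5 * (1.0 + math.erf((math.atanh(u) - mu) / (sigma * math.sqrt(2.0))))

def F_u_numeric(u, mu, sigma, n=200001):
    # midpoint integration of f_U from -1 to u (cross-check of the Jacobian)
    lo = -1.0 + 1e-12
    step = (u - lo) / n
    return sum(f_u(lo + (i + 0.5) * step, mu, sigma) * step for i in range(n))
```

Without the $\frac{1}{1-u^2}$ factor, the numeric cdf would fall well short of the exact one.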

Finally, using natural-log (base $e$) units, $\ln f_U(x) = -\ln(\sigma\sqrt{2\pi}) - \frac{1}{2}\left(\frac{\tanh^{-1}(x)-\mu}{\sigma}\right)^2 - \ln(1-x^2)$ on $(-1,1)$, so

$$h(U) = - \int_{-1}^{1} f_U(x) \left [-\ln(\sigma \sqrt{2\pi}) - \frac{1}{2} \left (\frac{\tanh^{-1}(x) - \mu}{\sigma} \right)^2 - \ln(1-x^2) \right]dx$$

$$h(U)=\frac{1}{2}\ln(2\pi \sigma^2) + \frac{1}{2} \text E\left[ \left (\frac{\tanh^{-1}(U) - \mu}{\sigma} \right)^2 \right] + \text E\left[ \ln(1-U^2) \right] $$

Since $\tanh^{-1}(U)=X \sim \mathcal{N}(\mu,\sigma^2)$, the first expectation equals $1$, giving

$$h(U)=\frac{1}{2}\ln(2\pi \sigma^2) + \frac{1}{2} + \text E\left[ \ln(1-\tanh^2(X)) \right],$$

which is exactly the change-of-variables formula from the question, because $\frac{d\tanh(x)}{dx}=1-\tanh^2(x)$. The remaining expectation $\text E[\ln(1-\tanh^2(X))]$ has no known closed form.
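(The two routes to $h(U)$ — the definition $-\int_{-1}^{1} f_U \ln f_U$ and the change-of-variables form $h(X)+\text{E}[\ln(1-\tanh^2(X))]$ — can be checked against each other numerically. An illustrative stdlib-only sketch, function names mine:)

```python
import math

SQRT2PI = math.sqrt(2.0 * math.pi)

def h_u_by_definition(mu, sigma, n=200001):
    # -integral over (-1, 1) of f_U*ln f_U, with f_U(u) = f_X(atanh(u)) / (1 - u^2)
    lo, hi = -1.0 + 1e-9, 1.0 - 1e-9
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * step
        x = math.atanh(u)
        f = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * SQRT2PI * (1.0 - u * u))
        if f > 0.0:  # guard against underflow near the endpoints
            total -= f * math.log(f) * step
    return total

def h_u_by_change_of_variables(mu, sigma, n=200001, width=10.0):
    # h(X) + E[ln(1 - tanh(X)^2)], integrating in x over mu +/- width*sigma;
    # ln(1 - tanh(x)^2) is computed stably as 2*(ln 2 - |x| - ln(1 + exp(-2|x|)))
    h_x = 0.5 * math.log(2.0 * math.pi * sigma ** 2) + 0.5
    lo = mu - width * sigma
    step = 2.0 * width * sigma / n
    corr = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        f = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * SQRT2PI)
        ax = abs(x)
        corr += f * 2.0 * (math.log(2.0) - ax - math.log1p(math.exp(-2.0 * ax))) * step
    return h_x + corr
```

Both quadratures agree to well below the quadrature error, confirming the equality.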

You can evaluate the remaining expectation numerically, e.g. with the integral() function in MATLAB, the integrate function in R, or scipy.integrate.quad in Python.
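Since the question also needs gradients with respect to $\mu$ and $\sigma$, another common option (used, for example, in tanh-squashed-Gaussian policies in soft actor-critic implementations) is a reparameterized Monte Carlo estimate: sample $x_i = \mu + \sigma\epsilon_i$ with $\epsilon_i \sim \mathcal{N}(0,1)$, so the sample average is itself differentiable in $\mu$ and $\sigma$. A stdlib-only sketch (function name mine):

```python
import math, random

def h_u_monte_carlo(mu, sigma, n=100000, seed=0):
    # h(U) ~= h(X) + (1/n) * sum of ln(1 - tanh(x_i)^2), with x_i = mu + sigma*eps_i.
    # Reparameterization: holding the eps_i fixed, the same sample average is
    # a differentiable function of mu and sigma.
    rng = random.Random(seed)
    h_x = 0.5 * math.log(2.0 * math.pi * sigma ** 2) + 0.5
    acc = 0.0
    for _ in range(n):
        x = mu + sigma * rng.gauss(0.0, 1.0)
        ax = abs(x)
        # stable ln(1 - tanh(x)^2) = 2*(ln 2 - |x| - ln(1 + exp(-2|x|)))
        acc += 2.0 * (math.log(2.0) - ax - math.log1p(math.exp(-2.0 * ax)))
    return h_x + acc / n
```

In an autodiff framework the same estimator, written with the framework's tanh and sampling ops, yields gradients with respect to $\mu$ and $\sigma$ directly.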