So my question is whether there is a closed-form (analytical) expression for the entropy of a variable $u$ defined as the $\tanh$ of a Gaussian random variable $x$. The reason I need a closed-form solution is that this is part of a neural network, and I need to be able to differentiate (find the gradient) with respect to the mean $\mu$ and standard deviation $\sigma$ of $x$.
For a random variable from a Gaussian distribution, $x \sim \mathcal{N}(\mu, \sigma^2)$, I know that the entropy is $h(X)=\frac{1}{2}\log(2\pi \sigma^2) + \frac{1}{2}$, which is closed form.
I have the transform $u = \tanh(x)$, and I'd like to get the entropy of this random variable. I know from the Differential Entropy Wikipedia page that I can write the entropy of $u$ as:
$$ h(U) = h(X) + \int f(x) \log\bigg|\frac{d (\tanh(x))}{dx}\bigg| dx$$
with $f(x)=\frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ as the probability density function of the Gaussian distribution. I've tried to solve the integral term on the right-hand side but haven't been able to figure it out. I also tried Wolfram Alpha (with my limited knowledge of it), without any success. Is there a closed-form expression, and if so, do you know what it looks like?
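In case it helps, I can at least evaluate the integral term numerically; a minimal sketch of what I mean, assuming NumPy/SciPy are available (the name extra_term is just mine for the integral term):

```python
import numpy as np
from scipy import integrate

def extra_term(mu, sigma):
    """Numerically evaluate E[log|d tanh(x)/dx|] = E[log(1 - tanh(x)^2)]
    for x ~ N(mu, sigma^2)."""
    def integrand(x):
        # Gaussian pdf times log of the Jacobian of tanh
        pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        return pdf * np.log1p(-np.tanh(x) ** 2)  # log(1 - tanh^2) = log(sech^2)
    val, _ = integrate.quad(integrand, mu - 10 * sigma, mu + 10 * sigma)
    return val
```

This is always non-positive (since $|d\tanh/dx| \leq 1$), so the transformed entropy is smaller than $h(X)$, but what I need is a differentiable closed form, not a number.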
Many thanks in advance!
The relation that you state actually holds with equality here, because $\tanh$ is injective (strictly increasing); for a general, non-injective transformation it would only give an upper bound:
$$ h(U) = h(X) + \int f(x) \log\left|\frac{d (\tanh(x))}{dx}\right| dx $$
To obtain the differential entropy of the transformed random variable we can use the definition:
$$ h(U)=-\int_{\operatorname{supp}(U)}f_U(x)\log f_U(x)\,dx$$
First we need to compute the density of $U=\tanh(X)$. Note that the cdf of $U$ is:
$$F_U(x)= \begin{cases} 1 & \text{if } x>1 \\ F_X(\tanh^{-1}(x)) & \text{if } x \in [-1,1] \\ 0 & \text{if } x<-1 \end{cases} $$
As it is continuous ($F_X(\tanh^{-1}(-1))=F_X(-\infty)=0$ and $F_X(\tanh^{-1}(1))=F_X(\infty)=1$), $U$ has a Lebesgue pdf, obtained by differentiating the cdf; note the Jacobian factor $\frac{d}{dx}\tanh^{-1}(x)=\frac{1}{1-x^2}$:
$$ f_U(x)=f_X(\tanh^{-1}(x))\,\frac{1}{1-x^2}\, I_{(-1,1)}(x) = \frac {1}{\sigma \sqrt{2 \pi}}\,\frac{1}{1-x^2}\, e^{-\frac{1}{2}\left(\frac{\tanh^{-1}(x) - \mu}{\sigma}\right)^2} I_{(-1,1)}(x)$$
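As a quick sanity check, this density (including the Jacobian factor $1/(1-x^2)$ from differentiating $\tanh^{-1}$) integrates to 1; a sketch assuming NumPy/SciPy, with example parameters of my own choosing:

```python
import numpy as np
from scipy import integrate

mu, sigma = 0.5, 0.8  # arbitrary example parameters

def f_U(x):
    """Density of U = tanh(X), X ~ N(mu, sigma^2); the 1/(1 - x^2)
    factor is the change-of-variables Jacobian d tanh^{-1}(x)/dx."""
    z = (np.arctanh(x) - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2 * np.pi) * (1 - x ** 2))

total, _ = integrate.quad(f_U, -1, 1)  # should be ~1
```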
Finally, using base-$e$ logarithmic units:
$$h(U) = - \int_{-1}^{1} f_U(x) \left[-\ln(\sigma \sqrt{2\pi}) - \ln(1-x^2) - \frac{1}{2} \left(\frac{\tanh^{-1}(x) - \mu}{\sigma} \right)^2 \right]dx$$
$$h(U)=\frac{1}{2}\ln(2\pi \sigma^2) + \frac{1}{2}\int_{-1}^{1} \left(\frac{\tanh^{-1}(x) - \mu}{\sigma} \right)^2 f_U(x)\,dx + \int_{-1}^{1} \ln(1-x^2)\, f_U(x)\,dx$$
Since $\tanh^{-1}(U)=X \sim \mathcal{N}(\mu,\sigma^2)$, the first integral is $\text E\left[\left(\frac{X - \mu}{\sigma}\right)^2\right]=1$, so
$$h(U)=\frac{1}{2}\ln(2\pi \sigma^2) + \frac{1}{2} + \text E\left[\ln(1-\tanh^2(X))\right] = h(X) + \text E\left[\ln(1-\tanh^2(X))\right]$$
which is exactly the expression you started from, since $\frac{d\tanh(x)}{dx}=1-\tanh^2(x)$. The remaining expectation does not appear to have a closed form.
You can evaluate this last term numerically, e.g. with the integral() function in MATLAB, the integrate() function in R, ...
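In Python the same computation might look like the following sketch (assuming NumPy/SciPy; the function names are mine). It evaluates the definition-based integral directly and cross-checks it against the equivalent form $h(X) + \text E\left[\ln(1-\tanh^2(X))\right]$:

```python
import numpy as np
from scipy import integrate

def entropy_tanh_gaussian(mu, sigma):
    """h(U) for U = tanh(X), X ~ N(mu, sigma^2), by integrating
    -f_U ln f_U over the support (-1, 1)."""
    def neg_f_ln_f(x):
        z = (np.arctanh(x) - mu) / sigma
        f = np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2 * np.pi) * (1 - x ** 2))
        return -f * np.log(f) if f > 0.0 else 0.0  # f ln f -> 0 at the edges
    val, _ = integrate.quad(neg_f_ln_f, -1.0, 1.0)
    return val

def entropy_via_correction(mu, sigma):
    """Equivalent form: h(X) + E[ln(1 - tanh(X)^2)], X ~ N(mu, sigma^2)."""
    h_x = 0.5 * np.log(2 * np.pi * sigma ** 2) + 0.5
    def integrand(x):
        pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        return pdf * np.log1p(-np.tanh(x) ** 2)
    corr, _ = integrate.quad(integrand, mu - 10 * sigma, mu + 10 * sigma)
    return h_x + corr
```

Both routes agree to numerical precision, and since $\ln(1-\tanh^2(x)) \leq 0$ the result is always below $h(X)$; for gradients with respect to $\mu$ and $\sigma$ you would still need to differentiate under the integral sign or use a Monte Carlo estimator.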