Consider two probability density functions $g(y)$ and $f(y: \theta)$, $\theta \in \Theta$.
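The KL divergence between $g$ and $f$ is defined by $$ D_{KL}(g|f) := \int g(y) \log \frac{g(y)}{f(y: \theta)} \, dy = E\left[\log g(Y)\right] - \int \log f(y: \theta)\, g(y)\, dy, $$ where $Y \sim g$, so only the second term depends on $\theta$.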
I want to check whether the following three conditions hold for some specific choices of $f$ and $g$:
(i) $ E\left[\log g(Y) \right]$ exists.
(ii) $ \left|\log f(y:\theta) \right| \le m(y) \text{ for all } \theta \in \Theta$, where $m$ is some integrable function.
(iii) $D_{KL}(g|f)$ has a unique minimum at $\, \theta^* \in \Theta.$
Here $f(y:\theta)$ and $g(y)$ are continuous on $R$, with $\{y: f(y:\theta) > 0 \} = R$ and $\{y: g(y) > 0 \} = R.$
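
To make the setup concrete, here is a minimal numerical sketch in Python. It is only an illustration under assumptions that are not part of the question: $g$ is taken to be an equal-weight mixture of $N(-1,1)$ and $N(2,1)$, $f(y:\theta)$ is the $N(\theta,1)$ density, and all function names are hypothetical. It approximates $\int g(y)\log\frac{g(y)}{f(y:\theta)}\,dy$ with a trapezoid rule on a truncated range and searches a grid of $\theta$ values for the minimizer.

```python
import math

def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def g(y):
    # Assumed "true" density: equal-weight mixture of N(-1, 1) and N(2, 1).
    return 0.5 * normal_pdf(y, -1.0, 1.0) + 0.5 * normal_pdf(y, 2.0, 1.0)

def log_f(y, theta):
    # Assumed model: log-density of N(theta, 1).
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (y - theta) ** 2

def integrate(h, lo=-15.0, hi=15.0, n=2000):
    # Composite trapezoid rule on [lo, hi]; the tails outside are negligible here.
    step = (hi - lo) / n
    total = 0.5 * (h(lo) + h(hi))
    for i in range(1, n):
        total += h(lo + i * step)
    return total * step

def kl(theta):
    # Approximates the integral of g(y) * log(g(y) / f(y: theta)) dy.
    def integrand(y):
        gy = g(y)
        return gy * (math.log(gy) - log_f(y, theta))
    return integrate(integrand)

# Grid search over theta in [-2, 3] with step 0.05.
thetas = [i / 20.0 for i in range(-40, 61)]
theta_star = min(thetas, key=kl)
print(theta_star, kl(theta_star))
```

For this particular pair the grid minimizer should coincide with the mean of $g$, which is $0.5$, and the minimum value stays strictly positive because $g$ is not itself a member of the $N(\theta,1)$ family.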
$\\$
I think that condition (i) holds if the number of peaks of $g(y)$ is finite.
Here is my proof.
Case (1): $g(y) \lt 1 \text{ for all } y \in R.$
$\implies -\infty \lt \log g(y) \lt 0 \text{ for all } y \in R$ (the lower bound uses $g(y) \gt 0$).
$\implies \left( \log g(y) \right)^{+} := \max \{ \log g(y), \;0 \} = 0\;\;$ and $ \;\; \left( \log g(y) \right)^{-} := \max \{ -\log g(y), \;0 \} = -\log g(y).$
$\implies E[ \left(\log g(Y) \right)^{+}] = 0 .$
$\implies E\left[\log g(Y) \right]$ exists as an extended real number in $[-\infty, 0]$, because its positive part has finite expectation.
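Case (2): There exist $n$ intervals $ I_1, \dots, I_n \subset R$ such that $g(y) \ge 1 \;\;\forall y\in I := I_1 \cup \dots \cup I_n \;\;$ and $\;\;g(y) \lt 1 \;\;\forall y\in R \setminus I$.
$\implies 0 \le \log g(y) \lt \infty \;\;\; \forall y \in I \;\;\;\text{and}\;\;\; -\infty \lt \log g(y) \lt 0 \;\;\;\forall y \in R \setminus I.$
$\implies E[ \left(\log g(Y) \right)^{+}] = \int_{I} \log g(y) \cdot g(y)\,dy \le \sup_{y \in I}\log g(y)\cdot\int_{R} g(y)\,dy = \sup_{y \in I}\log g(y) \lt \infty.$ The supremum is finite because $I = \{y: g(y) \ge 1\}$ is closed ($g$ is continuous) and has total length at most $\int_I g(y)\,dy \le 1$, so each $I_k$ is bounded and $g$ attains its maximum on the compact set $I$.
$\implies E\left[\log g(Y) \right]$ exists as an extended real number, again because its positive part has finite expectation.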
Is my proof correct?
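
As a sanity check of case (1), here is a small worked example. It assumes $g$ is the standard normal density, which is my choice for illustration only:
$$ g(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2} \le \frac{1}{\sqrt{2\pi}} \lt 1, \qquad E\left[\log g(Y)\right] = -\tfrac{1}{2}\log(2\pi) - \tfrac{1}{2}E\left[Y^2\right] = -\tfrac{1}{2}\log(2\pi) - \tfrac{1}{2}. $$
In this example the positive part vanishes and the negative part has finite expectation because $E[Y^2] = 1$, so the expectation is finite and not merely defined in the extended sense.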
$\\$
For condition (iii), my argument is as follows: if $f(y: \theta)$ is identifiable, then $\log f(y:\theta) $ is concave in $\theta$.
Thus $\int -\log f(y: \theta)\,g(y)\,dy$, the only part of $D_{KL}(g|f)$ that depends on $\theta$, is convex in $\theta$, and condition (iii) holds.
That is, condition (iii) holds if $f(y: \theta)$ is identifiable.
Is this correct?
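
To test this argument on a concrete family, here is a sketch assuming $f(y:\theta)$ is the $N(\theta, 1)$ density, $E[Y^2] \lt \infty$ under $g$, and $E[Y] \in \Theta$ (all three are assumptions for illustration):
$$ -\int \log f(y:\theta)\, g(y)\, dy = \tfrac{1}{2}\log(2\pi) + \tfrac{1}{2}E\left[(Y-\theta)^2\right] = \tfrac{1}{2}\log(2\pi) + \tfrac{1}{2}\mathrm{Var}(Y) + \tfrac{1}{2}\left(\theta - E[Y]\right)^2 . $$
This is strictly convex in $\theta$, with unique minimizer $\theta^* = E[Y]$. For comparison, the Cauchy location family $f(y:\theta) = \frac{1}{\pi\left(1 + (y-\theta)^2\right)}$ is also identifiable, yet
$$ \frac{\partial^2}{\partial\theta^2} \log f(y:\theta) = \frac{2\left((y-\theta)^2 - 1\right)}{\left(1 + (y-\theta)^2\right)^2} \gt 0 \quad \text{whenever } |y - \theta| \gt 1, $$
so $\log f(y:\theta)$ is not concave in $\theta$ there. Concavity therefore looks like a property of the particular family rather than a consequence of identifiability alone.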
$\\$
Also, how can I show that condition (ii) holds, for example when $f$ is a normal density?
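
One possible route, sketched under assumptions that are not stated above: $f(y:\theta)$ is the $N(\mu, \sigma^2)$ density with $\theta = (\mu, \sigma^2)$, the parameter set $\Theta$ is compact with $|\mu| \le M$ and $0 \lt a \le \sigma^2 \le b$, and "integrable" means $E[m(Y)] = \int m(y)\, g(y)\, dy \lt \infty$. Using $(y-\mu)^2 \le 2y^2 + 2\mu^2$,
$$ \left|\log f(y:\theta)\right| = \left| -\tfrac{1}{2}\log(2\pi\sigma^2) - \frac{(y-\mu)^2}{2\sigma^2} \right| \le \tfrac{1}{2}\max\left\{ \left|\log(2\pi a)\right|, \left|\log(2\pi b)\right| \right\} + \frac{y^2 + M^2}{a} =: m(y). $$
With this choice, $E[m(Y)] \lt \infty$ exactly when $E[Y^2] \lt \infty$ under $g$. Without bounds on $\Theta$ the approach fails: for any fixed $y$, $\left|\log f(y:\theta)\right| \to \infty$ as $|\mu| \to \infty$, so no dominating $m$ can exist.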