Local geometry interpretation of binary hypothesis testing


In information geometry, suppose
$$ \mathcal{N}_{\epsilon}^{\mathcal{X}}\triangleq\left\{P\in \mathcal{P}^{\mathcal{X}}:\sum_{x\in\mathcal{X}}\frac{(P(x)-P_0(x))^2}{P_0(x)}\leq \epsilon^2\right\} $$
where $\mathcal{P}^{\mathcal{X}}$ is the space of probability distributions on a finite alphabet $\mathcal{X}$ and $\epsilon > 0$ is small. Write $\varphi \leftrightarrow P$ if
$$ \varphi(x)\triangleq \frac{P(x)-P_0(x)}{\sqrt{P_0(x)}} $$

We consider a binary hypothesis testing problem with $m$ samples $x_1,\dots,x_m$ drawn i.i.d. from either distribution $P_1$ or $P_2$. From the basic theory of hypothesis testing, the test statistic is the normalized log-likelihood ratio
$$ \ell = \frac{1}{m}\sum_{i=1}^m \log\frac{P_1(x_i)}{P_2(x_i)} $$

Show that when $P_1,P_2 \in \mathcal{N}_{\epsilon}^{\mathcal{X}}$, $\ell$ can be expressed as
$$ \ell = \epsilon^2 \sum_{i=1}^m \hat{\varphi}(x_i)(\varphi_1(x_i)-\varphi_2(x_i))+o(\epsilon^2) $$
where $\varphi_1\leftrightarrow P_1$, $\varphi_2\leftrightarrow P_2$, $\hat{\varphi}\leftrightarrow \hat{P}$, and $\hat{P}$ is the empirical distribution of the data samples,
$$ \hat{P}(x)=\frac{\sum_{i=1}^m 1_{x_i=x}}{m} $$


There is 1 solution below


The statement should be corrected to
$$ \ell = E_{P_0}(\rho(X))+\sum_{x\in \mathcal{X}} \hat{\varphi}(x)(\varphi_1(x)-\varphi_2(x))+o(\epsilon^2) $$
where $\rho(x)=\log\frac{P_1(x)}{P_2(x)}$.

First, by the definition of $\varphi_1$,
$$ \frac{P_1(x)}{P_0(x)}=1+\frac{\varphi_1(x)}{\sqrt{P_0(x)}} $$
where $|\varphi_1(x)|\leq \epsilon$, because $\sum_{x\in\mathcal{X}}\varphi_1(x)^2\leq\epsilon^2$. Therefore
$$ \log\frac{P_1(x)}{P_0(x)}=\frac{\varphi_1(x)}{\sqrt{P_0(x)}}+o(\epsilon) $$
and the same holds for $P_2$ and $\varphi_2$. Consequently,
\begin{align} \rho(x)&=\log\frac{P_1(x)}{P_2(x)}\\ &=\log\frac{P_1(x)}{P_0(x)}-\log\frac{P_2(x)}{P_0(x)}\\ &=\frac{\varphi_1(x)-\varphi_2(x)}{\sqrt{P_0(x)}}+o(\epsilon) \end{align}

Also, grouping the samples by symbol shows that $\ell$ can be expressed as $E_{\hat{P}}(\rho(X))$, i.e.
$$ \ell = \sum_{x\in\mathcal{X}} \hat{P}(x)\rho(x) $$
Then
\begin{align} \ell-E_{P_0}(\rho(X))&= \sum_{x\in\mathcal{X}} (\hat{P}(x)-P_0(x))\rho(x)\\ &= \sum_{x\in\mathcal{X}} \sqrt{P_0(x)}\hat{\varphi}(x)\rho(x)\\ &= \sum_{x\in\mathcal{X}} \hat{\varphi}(x)(\varphi_1(x)-\varphi_2(x))+\sum_{x\in\mathcal{X}}\sqrt{P_0(x)}\hat{\varphi}(x)o(\epsilon) \end{align}

If $\hat{P}$ also lies in $\mathcal{N}_{\epsilon}^{\mathcal{X}}$, then $|\hat{\varphi}(x)|\leq \epsilon$, and since $\mathcal{X}$ is finite the last sum is $o(\epsilon^2)$. Hence
$$ \ell = E_{P_0}(\rho(X))+\sum_{x\in \mathcal{X}} \hat{\varphi}(x)(\varphi_1(x)-\varphi_2(x))+o(\epsilon^2) $$
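The corrected expansion can be sanity-checked numerically. The following is only a rough sketch under stated assumptions: the reference distribution `P0`, the three perturbation directions, and the helper names `perturb` and `residual` are all hypothetical choices made for illustration. In particular, $\hat{P}$ is replaced by a deterministic distribution placed inside $\mathcal{N}_{\epsilon}^{\mathcal{X}}$ rather than sampled, so that the condition $|\hat{\varphi}(x)|\leq\epsilon$ used in the last step holds by construction.

```python
import math

P0 = [0.1, 0.2, 0.3, 0.4]  # hypothetical reference distribution on 4 symbols


def perturb(p0, direction, eps):
    """Return (P, phi) with phi <-> P and sum_x phi(x)^2 == eps^2."""
    root = [math.sqrt(q) for q in p0]
    # Remove the component along sqrt(P0) so that P still sums to one.
    dot = sum(d * r for d, r in zip(direction, root))
    phi = [d - dot * r for d, r in zip(direction, root)]
    # Rescale so that P sits on the boundary of the eps-neighborhood.
    norm = math.sqrt(sum(f * f for f in phi))
    phi = [eps * f / norm for f in phi]
    p = [q + r * f for q, r, f in zip(p0, root, phi)]
    return p, phi


def residual(p0, eps):
    """ell - E_{P0}[rho] - sum_x phi_hat(x) * (phi_1(x) - phi_2(x))."""
    # Arbitrary (assumed) directions; any non-degenerate choice would do.
    p1, phi1 = perturb(p0, [1.0, -2.0, 0.5, 1.5], eps)
    p2, phi2 = perturb(p0, [-1.0, 0.5, 2.0, -0.5], eps)
    # Stand-in for the empirical distribution, placed inside the neighborhood.
    p_hat, phi_hat = perturb(p0, [0.5, 1.0, -1.5, 2.0], eps)
    rho = [math.log(a / b) for a, b in zip(p1, p2)]
    ell = sum(h * r for h, r in zip(p_hat, rho))   # ell = E_{P_hat}[rho]
    mean0 = sum(q * r for q, r in zip(p0, rho))    # E_{P0}[rho]
    inner = sum(h * (a - b) for h, a, b in zip(phi_hat, phi1, phi2))
    return ell - mean0 - inner


for eps in (0.1, 0.01, 0.001):
    # The ratio should shrink with eps if the remainder is o(eps^2).
    print(eps, residual(P0, eps) / eps**2)
```

If the expansion is right, the printed ratio should tend to zero as `eps` decreases. With real i.i.d. samples, $\hat{P}$ is random and need not lie in the neighborhood, so this check only illustrates the algebra of the expansion, not its statistical behavior.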