A question about infinities and distribution functions


Let $\mathcal{P}_i$ be the set of probability density functions to which $f_i$ belongs, $(i=0,1)$. Furthermore, assume that $$L(y)=\frac{f_1(y)}{f_0(y)}$$ is an increasing function for any chosen $f_0$ and $f_1$. Let the support of the densities be a compact subset of the reals, denoted by $\mathbb{K}$.

For a given threshold $\tau\in\mathbb{K}$, one can calculate the probability of false alarm and the probability of missed detection as follows:

$$P_F(\tau)=\int_\tau^{\infty}f_0(y)dy$$

$$P_M(\tau)=\int_{-\infty}^\tau f_1(y)dy$$

ROC:=$(P_F(\tau),P_M(\tau))$ forms a convex curve in the unit square $[0,1]^2$.

(ROC, for those who don't know, stands for Receiver Operating Characteristic).

Here is an example:

$$f_0(y)=\frac{1}{\sqrt{2\pi\sigma_0^2}}e^{\frac{-\left(y-\mu_0\right)^2}{2\sigma_0^2}}$$

$$f_1(y)=\frac{1}{\sqrt{2\pi\sigma_1^2}}e^{\frac{-\left(y-\mu_1\right)^2}{2\sigma_1^2}}$$

with $\sigma_0=\sigma_1=1$ and $\mu_0=0$ and $\mu_1=1$. Then we have the following figure for $(P_F(\tau),P_M(\tau))$ when $\tau$ is changed from $-\infty$ to $\infty$, ($\mathbb{K}=\mathbb{R}$).

[Figure: the ROC curve (blue), the butterfly (red lines), and the green sector]

As is well known, and as can be seen from the figure, the blue curve is convex.

For any chosen pair of densities $(f_0,f_1)\in \mathcal{P}_0\times \mathcal{P}_1$, the ROC curve (the blue one) $(P_F(\tau),P_M(\tau))$, $\tau\in (-\infty,\infty)$, will lie in the butterfly drawn with red lines in the figure, assuming that the point $\theta=P_F=P_M$ is common to all densities in $\mathcal{P}_0\times\mathcal{P}_1$ (in the figure, $\theta \approx 0.3$).
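The convexity claim for the Gaussian example can be checked numerically. The following is only a minimal sketch, assuming the parameters above ($\mu_0=0$, $\mu_1=1$, $\sigma_0=\sigma_1=1$); the function names, the threshold grid and the tolerance are illustrative choices, not part of the original problem.

```python
import math

def p_false_alarm(tau, mu0=0.0, sigma0=1.0):
    # P_F(tau) = integral of f_0 from tau to infinity (upper Gaussian tail)
    return 0.5 * math.erfc((tau - mu0) / (sigma0 * math.sqrt(2.0)))

def p_miss(tau, mu1=1.0, sigma1=1.0):
    # P_M(tau) = integral of f_1 from -infinity to tau (Gaussian CDF)
    return 0.5 * math.erfc(-(tau - mu1) / (sigma1 * math.sqrt(2.0)))

def roc_points(n=201, lo=-4.0, hi=5.0):
    # Sample the curve (P_F(tau), P_M(tau)) on a grid of thresholds.
    # The grid [lo, hi] is an assumed truncation of the real line.
    taus = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    return [(p_false_alarm(t), p_miss(t)) for t in taus]

def is_convex(points, tol=1e-9):
    # A sampled curve is convex if the chord slopes are nondecreasing
    # when the points are ordered by increasing P_F.
    pts = sorted(points)
    slopes = [(y1 - y0) / (x1 - x0)
              for (x0, y0), (x1, y1) in zip(pts, pts[1:])
              if x1 > x0]
    return all(s1 >= s0 - tol for s0, s1 in zip(slopes, slopes[1:]))

if __name__ == "__main__":
    print(is_convex(roc_points()))
```

The check relies on the slope of the curve being $-L(\tau)$, so an increasing likelihood ratio gives nondecreasing slopes as $P_F$ grows.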

Question:

Assume that all densities $(f_0,f_1)\in\mathcal{P}_0\times\mathcal{P}_1$ are known to have a particular $\theta$ in their ROC. In other words, let $\mathcal{P}_0\times\mathcal{P}_1$ contain only the pairs of densities that have $\theta$ in their ROC, and furthermore let one choose any pair of densities from $\mathcal{P}_0\times\mathcal{P}_1$ with equal probability.

What is the probability that a single point of the ROC that we obtain by this selection will lie in the green sector?

Once again, the green sector is the intersection of the butterfly with the area under the line that passes through $\theta$, and $f_1/f_0$ is increasing, as defined before. One can assume any $\mathbb{K}$, for example $\mathbb{K}=[0,1]$ or $\mathbb{K}=\mathbb{R}$.

Answer:

If the black line is tangent, and the blue curve is convex, then only a single point of the blue curve is contained in the green area. This is because the green area is bounded by the tangent line, and convexity guarantees that the blue curve will not meet the black line, and hence the green region, at any other point.

If you're looking for the probability that a single realization will land in this region, simply compute the area of the region. The ROC curve defines the "dividing line" of classification; however, the unit square is still your global probability space.

If you want to know the probability that a different ROC curve intersects this green region, then you can employ a few different conditions. First, assume that any other ROC curve is convex and continuous. Then, the curve defined by $$ R = \left\{ \left(P_F(\tau),P_M(\tau)\right) \right\}$$ is continuous and monotonic and maps from $[0,1]$ to $[0,1]$.

Therefore, this curve has a fixed point in $[0,1]$, namely the point where

$$P_F(\tau) = P_M(\tau).$$
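As a worked instance, this fixed point can be found in closed form for the Gaussian example in the question ($\mu_0=0$, $\mu_1=1$, $\sigma_0=\sigma_1=1$). Here $\Phi$ denotes the standard normal CDF, a symbol introduced only for this illustration:

$$P_F(\tau)=1-\Phi(\tau)=\Phi(-\tau),\qquad P_M(\tau)=\Phi(\tau-1).$$

Since $\Phi$ is strictly increasing, $P_F(\tau)=P_M(\tau)$ holds only when $-\tau=\tau-1$, that is

$$\tau=\tfrac{1}{2},\qquad \theta=\Phi\left(-\tfrac{1}{2}\right)\approx 0.3085,$$

which is consistent with the value $\theta\approx 0.3$ read off the figure.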

These fixed points lie on the line $y=x$. Obviously, any monotonic curve whose fixed point is $x' > \theta$ will not intersect the green region.

Conversely, any curve with a fixed point $x' \le \theta$ will pass through the green triangle.

Therefore, your probability is $P(x' \le \theta)$. If the fixed point $x'$ is uniformly distributed on $[0,1]$, the probability in question is exactly $\theta$, which agrees with my previous assessment.