Analytic form of an ROC curve


I'm studying the problem of combining two sensors for anomaly detection, and I want to analyze its performance using the ROC curve. I have obtained a parametric equation for the ROC curve:

$$(x,y) = (FPR,TPR) = (\Phi(c(a-t)), \Phi(c(b-t)))$$

where $a,b,c$ are non-negative real numbers, $t$ is the threshold of the anomaly detection method, and $\Phi$ is the cumulative distribution function (CDF) of a standard normal distribution.

Is it possible to analytically express the ROC curve in a function form like $y=f(x)$?

I have drawn the ROC curve (see figure). It looks like a power function (maybe $y=x^\alpha$?), but I still have no idea whether it is possible to get an analytic form of $f$.

[Figure: the plotted ROC curve]



Best answer

We have
$$x=\Phi(c(a-t))=\frac{1}{2} \left(1+\text{erf}\left(\frac{c (a-t)}{\sqrt{2}}\right)\right)$$
$$y=\Phi(c(b-t))=\frac{1}{2} \left(1+\text{erf}\left(\frac{c (b-t)}{\sqrt{2}}\right)\right)$$
So,
$$t=a-\frac{\sqrt{2}}{c}\, \text{erf}^{-1}(2 x-1)$$
and then
$$y=\frac 12 \Bigg[1+\text{erf}\left(\frac{c (b-a)}{\sqrt{2}}+\text{erf}^{-1}(2 x-1)\right) \Bigg]$$
From a formal point of view, it is done.
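Since $\Phi(v)=\frac12\left(1+\text{erf}(v/\sqrt2)\right)$, the result above is the same as $y=\Phi\left(c(b-a)+\Phi^{-1}(x)\right)$. As a quick sanity check, here is a minimal Python sketch that compares this closed form with the parametric definition. The function names `roc_point` and `roc_tpr` are made up for this illustration, and the values $(a,b,c)=(0,1,\sqrt2)$ are the ones quoted in the worked example of this answer.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def roc_point(t, a, b, c):
    """Parametric ROC point (FPR, TPR) at threshold t."""
    return STD_NORMAL.cdf(c * (a - t)), STD_NORMAL.cdf(c * (b - t))


def roc_tpr(x, a, b, c):
    """Closed form y = f(x) = Phi(c (b - a) + Phi^{-1}(x)), valid for 0 < x < 1."""
    return STD_NORMAL.cdf(c * (b - a) + STD_NORMAL.inv_cdf(x))


# Illustrative parameters (a, b, c) = (0, 1, sqrt(2)).
a, b, c = 0.0, 1.0, math.sqrt(2.0)
for t in (-1.0, 0.0, 0.5, 1.0, 2.0):
    x, y = roc_point(t, a, b, c)
    print(f"t={t:+.1f}  x={x:.6f}  y={y:.6f}  f(x)={roc_tpr(x, a, b, c):.6f}")
```

The last two columns should agree to within floating-point accuracy, because eliminating $t$ is exact.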

From a practical point of view, you now need a "simple" approximation of the inverse error function.

If you look at my answer to this question, you will find approximations of the error function which can easily be inverted. Using $P_1(x)$, the price is solving a quadratic equation; with $P_0$ (which could be sufficient) the inversion is immediate.

Otherwise, you can use the series representations given here.

A simple one is $$ \text{erf}^{-1}(z)=u+\frac{1}{3}u^3+\frac{7 }{30}u^5+\frac{127 }{630}u^7+\frac{4369 }{22680} u^9+O\left(u^{11}\right)$$ where $u=\frac{\sqrt{\pi } }{2}z$.
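To get a feel for how the truncated series behaves, here is a small sketch using only the Python standard library. The helper names are invented for the example. The standard library has no inverse error function, so the reference value is obtained from the identity $\text{erf}^{-1}(z)=\Phi^{-1}\left(\frac{1+z}{2}\right)/\sqrt2$.

```python
import math
from statistics import NormalDist

# Coefficients of u, u^3, u^5, u^7, u^9 in the series above.
SERIES_COEFFS = (1.0, 1 / 3, 7 / 30, 127 / 630, 4369 / 22680)


def erfinv_series(z):
    """Series for erf^{-1}(z) truncated after the u^9 term, with u = sqrt(pi) z / 2."""
    u = math.sqrt(math.pi) / 2.0 * z
    return sum(coef * u ** (2 * k + 1) for k, coef in enumerate(SERIES_COEFFS))


def erfinv_reference(z):
    """Reference value for -1 < z < 1 via the standard normal quantile function."""
    return NormalDist().inv_cdf((1.0 + z) / 2.0) / math.sqrt(2.0)


for z in (0.1, 0.3, 0.5, 0.7, 0.9):
    err = abs(erfinv_series(z) - erfinv_reference(z))
    print(f"z={z:.1f}  series={erfinv_series(z):.6f}  abs. error={err:.2e}")
```

The error grows quickly as $|z|$ approaches 1, which is what motivates looking for something better than the raw truncated series.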

Edit

Starting from the series, a quite good approximation could be obtained with the $[5,4]$ Padé approximant $$ \text{erf}^{-1}(z)=u\,\frac{1-\alpha\, u^2+\beta\, u^4 } { 1-\gamma\, u^2+\delta\, u^4}\qquad \text{with}\qquad u=\frac{\sqrt{\pi } }{2}z$$ whose error is about $\frac{u^{11}}{250}$.

The required coefficients are $$\alpha=\frac{4397}{4338}\qquad \beta=\frac{111547}{910980} \qquad \gamma=\frac{5843}{4338} \qquad \delta=\frac{20533}{60732}$$ The absolute error is less than $0.01$ as long as $|z| \leq 0.90$ (the approximant is odd in $z$, like $\text{erf}^{-1}$ itself).
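A minimal sketch of this approximant in Python, with the coefficients copied from above, could look as follows. The names `erfinv_pade` and `erfinv_reference` are chosen for this illustration only; the reference value again comes from the standard normal quantile function.

```python
import math
from statistics import NormalDist

# Coefficients of the [5,4] Pade approximant given above.
ALPHA = 4397 / 4338
BETA = 111547 / 910980
GAMMA = 5843 / 4338
DELTA = 20533 / 60732


def erfinv_pade(z):
    """Pade approximation of erf^{-1}(z) for -1 < z < 1, with u = sqrt(pi) z / 2."""
    u = math.sqrt(math.pi) / 2.0 * z
    u2 = u * u
    numerator = 1.0 - ALPHA * u2 + BETA * u2 * u2
    denominator = 1.0 - GAMMA * u2 + DELTA * u2 * u2
    return u * numerator / denominator


def erfinv_reference(z):
    """Reference value for -1 < z < 1 via the standard normal quantile function."""
    return NormalDist().inv_cdf((1.0 + z) / 2.0) / math.sqrt(2.0)


for z in (0.1, 0.3, 0.5, 0.7, 0.9):
    err = abs(erfinv_pade(z) - erfinv_reference(z))
    print(f"z={z:.1f}  pade={erfinv_pade(z):.6f}  abs. error={err:.2e}")
```

The approximant costs only a handful of multiplications and one division, so it is cheap enough to plug directly into the closed form for $y$.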

Worked example

Using for $(a,b,c)$ the values used to generate the plot in the post, namely $(0,1,\sqrt{2})$, together with all the elements given in this answer, the absolute error on $y$ is smaller than $0.001$ whenever $x>0.0854215$.

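To give an idea of the overall accuracy, consider the norm $$\Psi(a)=\frac 14\int_a^1\Bigg[\text{erf}\left(1-\text{erf}^{-1}(1-2 x)\right)-\text{erf}\left(\text{approximation} \right)\Bigg]^2\,dx$$ where $a$ now denotes the lower limit of integration (not the parameter $a$ of the ROC curve). It takes the following values: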

$$\left( \begin{array}{cc} a & \Psi(a) \\ 0.00 & 1.24386\times 10^{-4} \\ 0.01 & 2.50636\times 10^{-5} \\ 0.02 & 7.26261\times 10^{-6} \\ 0.03 & 2.37431\times 10^{-6} \\ 0.04 & 8.33009\times 10^{-7} \\ 0.05 & 3.06603\times 10^{-7} \\ 0.06 & 1.16904\times 10^{-7} \\ 0.07 & 4.58574\times 10^{-8} \\ 0.08 & 1.84804\times 10^{-8} \\ 0.09 & 7.70450\times 10^{-9} \\ 0.10 & 3.39594\times 10^{-9} \\ \end{array} \right)$$
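The worked example can be explored numerically with a short sketch like the one below. It assumes that "approximation" in the norm means the same argument with $\text{erf}^{-1}$ replaced by the Padé approximant, so the integrand is simply the squared error on $y$. The integral is estimated with a plain midpoint rule (my choice for the illustration, not necessarily what was used for the table), and all function names are invented here.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
ALPHA, BETA = 4397 / 4338, 111547 / 910980
GAMMA, DELTA = 5843 / 4338, 20533 / 60732


def erfinv_pade(z):
    """[5,4] Pade approximation of erf^{-1}(z) from this answer."""
    u = math.sqrt(math.pi) / 2.0 * z
    u2 = u * u
    return u * (1.0 - ALPHA * u2 + BETA * u2 * u2) / (1.0 - GAMMA * u2 + DELTA * u2 * u2)


def y_exact(x):
    """ROC curve for (a, b, c) = (0, 1, sqrt(2)), i.e. Phi(sqrt(2) + Phi^{-1}(x))."""
    return STD_NORMAL.cdf(math.sqrt(2.0) + STD_NORMAL.inv_cdf(x))


def y_approx(x):
    """Same curve with erf^{-1} replaced by the Pade approximant (assumed meaning)."""
    return 0.5 * (1.0 + math.erf(1.0 + erfinv_pade(2.0 * x - 1.0)))


def psi(lower, n=20000):
    """Midpoint-rule estimate of the integral of (y_exact - y_approx)^2 over [lower, 1]."""
    h = (1.0 - lower) / n
    total = 0.0
    for i in range(n):
        x = lower + (i + 0.5) * h  # midpoints avoid the endpoints x = 0 and x = 1
        total += (y_exact(x) - y_approx(x)) ** 2
    return total * h


for x in (0.05, 0.10, 0.25, 0.50, 0.90):
    print(f"x={x:.2f}  |error on y| = {abs(y_exact(x) - y_approx(x)):.2e}")
for lower in (0.0, 0.05, 0.10):
    print(f"Psi({lower:.2f}) ~ {psi(lower):.3e}")
```

Since the integrand is non-negative, the estimate decreases as the lower limit increases, in line with the trend of the table.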

Second answer

I doubt that a sufficiently simple analytic function $y=f(x)$ can be found.

Possibly one could approach it with series.

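I propose an empirical model in the form of a supercircle (a particular case of the superellipse: https://en.wikipedia.org/wiki/Superellipse ), where $c$ now denotes the exponent of the superellipse rather than the parameter $c$ of the question: $$\boxed{(1-x)^c+y^c=1}$$ $$y=\left(1-(1-x)^c \right)^{1/c}$$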

[Figure: the ROC curve from the question (blue) and the superellipse approximation (red)]

The blue curve is the copy of the graph from your question.

The red curve is the approximation with $c=2.5$

The approximate value of $c$ was computed as follows:

  • Scanning your graph for the coordinates of the blue pixels (2618 points).

  • Nonlinear regression to fit the function and estimate the value of $c$.

Surprisingly, the fitted value of $c$ is very close to 2.5.
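A similar fit can be sketched without scanning the image. The version below is not the procedure described above: it samples the analytic curve $y=\Phi\left(\sqrt2+\Phi^{-1}(x)\right)$, assuming the plot was generated with $(a,b,c)=(0,1,\sqrt2)$ as stated in the other answer, and then minimizes the sum of squared residuals over the superellipse exponent with a golden-section search. The function names, the sampling grid and the search interval $[1,6]$ are choices made for this illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def roc_exact(x):
    """Analytic ROC curve assumed for the plot: Phi(sqrt(2) + Phi^{-1}(x))."""
    return STD_NORMAL.cdf(math.sqrt(2.0) + STD_NORMAL.inv_cdf(x))


def superellipse(x, exponent):
    """Empirical model y = (1 - (1 - x)^c)^(1/c), with c = exponent."""
    return (1.0 - (1.0 - x) ** exponent) ** (1.0 / exponent)


def fit_exponent(xs, ys, lo=1.0, hi=6.0, tol=1e-6):
    """Least-squares fit of the exponent by golden-section search on [lo, hi]."""
    def sse(exponent):
        return sum((superellipse(x, exponent) - y) ** 2 for x, y in zip(xs, ys))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    while hi - lo > tol:
        m1 = hi - inv_phi * (hi - lo)
        m2 = lo + inv_phi * (hi - lo)
        if sse(m1) < sse(m2):
            hi = m2  # the minimum lies in [lo, m2]
        else:
            lo = m1  # the minimum lies in [m1, hi]
    return 0.5 * (lo + hi)


# Uniform grid strictly inside (0, 1), where the quantile function is defined.
xs = [i / 200 for i in range(1, 200)]
ys = [roc_exact(x) for x in xs]
c_fit = fit_exponent(xs, ys)
print(f"fitted superellipse exponent: {c_fit:.4f}")
```

The fitted exponent depends somewhat on how the sample points are distributed along the curve, so it need not coincide exactly with a value obtained from pixel coordinates.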