How can I calculate the area under the curve?

I have trouble understanding how to manually calculate the area under the curve for my predictions, given the real values.

I understand the idea behind the confusion matrix, can calculate the true positive and false positive rates, and understand the definition of the area under the curve, but I still cannot see how to calculate it by hand.

Until now, I have blindly used statistical tools to do the job for me: I provided a binary vector of my predictions and a binary vector of real values, and the tool returned the answer.

So can anyone explain how I can calculate it, for example for the following binary vectors:

my_prediction = [1, 0, 1, 1, 0]
real_values   = [0, 0, 1, 1, 1]

The AUC for these values is 0.58(3), that is, 0.58333... with the 3 repeating.

There is 1 best solution below

Your "curve" only has three points:

  • Predict "all negative", in which case you have a false positive rate of $\frac{0}{2}=0$ and a true positive rate of $\frac{0}{3}=0$

  • Your actual predictions, in which case you have a false positive rate of $\frac{1}{2}=0.5$ and a true positive rate of $\frac{2}{3}=0.666\ldots$

  • Predict "all positive", in which case you have a false positive rate of $\frac{2}{2}=1$ and a true positive rate of $\frac{3}{3}=1$
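
As a rough sketch of how these three points could be computed from the two vectors in the question, here is a small Python snippet. The helper name `rates` and the variable `roc_points` are made up for illustration and are not part of any library.

```python
my_prediction = [1, 0, 1, 1, 0]
real_values = [0, 0, 1, 1, 1]


def rates(predicted, actual):
    """Return (false positive rate, true positive rate) for two binary vectors.

    Hypothetical helper for illustration; assumes both classes occur in `actual`.
    """
    true_pos = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    false_pos = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    positives = sum(actual)               # number of real 1s (here 3)
    negatives = len(actual) - positives   # number of real 0s (here 2)
    return false_pos / negatives, true_pos / positives


n = len(real_values)
roc_points = [
    rates([0] * n, real_values),         # predict "all negative"
    rates(my_prediction, real_values),   # the actual predictions
    rates([1] * n, real_values),         # predict "all positive"
]
print(roc_points)
```

The middle point comes out as a false positive rate of 0.5 and a true positive rate of about 0.667, in line with the fractions above.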

Joining these points with straight lines and applying the trapezoidal rule to each segment gives an area under the curve of $\frac12(0.5-0)(0+0.666\ldots) + \frac12(1-0.5)(1+0.666\ldots)=0.583\ldots$
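
The same arithmetic can be checked with a few lines of Python. This is only a minimal sketch: the function name `trapezoid_area` is hypothetical, and the three (false positive rate, true positive rate) points are typed in by hand from the list above.

```python
def trapezoid_area(points):
    """Sum the trapezoid areas between consecutive (x, y) points."""
    points = sorted(points)  # order by false positive rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # width of the segment times the average of its two heights
        area += 0.5 * (x1 - x0) * (y0 + y1)
    return area


# (false positive rate, true positive rate) for the three points above
roc_points = [(0.0, 0.0), (0.5, 2 / 3), (1.0, 1.0)]
auc = trapezoid_area(roc_points)
print(round(auc, 4))  # 0.5833
```

With only hard 0/1 predictions there is just one interior point, so the loop runs over two segments, exactly the two terms in the formula above.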