Wilson interval estimation for multiple classes (confusion matrix)

Sorry to resurrect this, but after 9 years the links mentioned in the earlier question (How can the Wilson Confidence Interval be adapted for more than 2 outcomes?) have gone dead ;)

I am looking for similar help in adapting the interval to multiple categories, specifically for the application in evaluation of confusion matrices.

I wrote a script that calculates confidence intervals for normalized confusion matrices using the Wilson score interval. This works fine with 2x2 matrices. For more than two classes, I currently sidestep the problem by treating each class as having a priori odds of 50/50. This condenses the interval estimation for each class to a true/false binary problem, for which the standard binomial approach works. However, the a priori odds in a multi-class problem are 1/N, so at low sample counts the center of the interval should converge to 1/N instead of the 1/2 of the binomial Wilson interval.
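To make the binary reduction concrete, here is a minimal sketch of one possible reading of it. It assumes the matrix holds raw counts with rows as true classes and is normalized per row; the function name `binarize_rows` and the example counts are made up for illustration and are not taken from the actual script.

```python
def binarize_rows(counts):
    """Turn each cell of a count confusion matrix into a (ns, nf) pair.

    ns = samples of the row's true class that landed in this cell,
    nf = all other samples of the same true class (row total minus ns).
    """
    result = []
    for row in counts:
        n = sum(row)  # total observations of this true class
        result.append([(c, n - c) for c in row])
    return result


# Hypothetical 3-class example (rows: true class, columns: predicted class)
counts = [[8, 1, 1],
          [2, 6, 2],
          [0, 3, 7]]
pairs = binarize_rows(counts)
print(pairs[0])  # [(8, 2), (1, 9), (1, 9)]
```

Each (ns, nf) pair can then be fed to a binomial interval, which is where the implicit 50/50 prior comes in.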

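If I start from this approximation (ns = number of successes, nf = number of failures, n = ns + nf, z = standard normal quantile for the chosen confidence level)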

p = ( ns + 0.5 * z^2 ) / (n + z^2) +/- [ ( z / (n + z^2) ) * sqrt( (ns * nf / n) + z^2 / 4 ) ]
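For reference, the formula above can be written as a short function. This is only a sketch: the name `wilson_interval`, the default z = 1.96 (roughly 95% confidence) and the handling of n = 0 are my own assumptions.

```python
import math


def wilson_interval(ns, n, z=1.96):
    """Binomial Wilson interval, term by term as in the formula above."""
    nf = n - ns                      # number of failures
    denom = n + z * z
    center = (ns + 0.5 * z * z) / denom
    # ns * nf / n is undefined for n = 0; treat it as 0 (assumption)
    spread = (ns * nf / n) if n > 0 else 0.0
    half_width = (z / denom) * math.sqrt(spread + z * z / 4)
    return center - half_width, center + half_width


lo, hi = wilson_interval(5, 10)
print(round(lo, 4), round(hi, 4))  # approximately 0.2366 0.7634
```

With no observations at all (n = 0) the center is 1/2 and the half-width is 1/2, so the interval covers the whole range from 0 to 1.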

Changing the 0.5 to 1/N seems to work for the center of the interval. Experimenting with increasing n (total observations) and N (number of classes), the center nicely follows the expected outcome. But how does the width of the interval around the center scale with the number of classes, if at all? Intuitively the 1/4 may need to change to 1/(N^2), but I don't understand the derivation well enough to finish it.
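The 0.5 to 1/N substitution can be sketched as follows. This is the heuristic described above, not a derived result; `shifted_center` and `num_classes` are names I made up, and z = 1.96 is an assumed default.

```python
def shifted_center(ns, n, num_classes, z=1.96):
    """Interval center with the 0.5 in the Wilson formula replaced by 1/N."""
    return (ns + z * z / num_classes) / (n + z * z)


# With no data the center sits at the prior 1/N; with lots of data it
# approaches the observed proportion ns / n.
for n in (0, 10, 100, 10000):
    ns = (3 * n) // 10  # observed proportion of 0.3 (hypothetical data)
    print(n, round(shifted_center(ns, n, num_classes=4), 4))
```

For num_classes = 2 this reduces to the usual binomial Wilson center.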

Edit: I assume the width of the interval needs to be adjusted as well, because it has to be squeezed more or less depending on where the center lies, so that the bounds neither undershoot 0 nor overshoot 1.
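A quick numerical check of the undershoot, as a sketch under the same assumptions: the center uses 1/N in place of 0.5, while the half-width is left exactly as in the binomial formula. The name `shifted_bounds` is hypothetical and the bounds are deliberately not clipped.

```python
import math


def shifted_bounds(ns, n, num_classes, z=1.96):
    """Heuristic 1/N center combined with the UNCHANGED binomial half-width."""
    nf = n - ns
    denom = n + z * z
    center = (ns + z * z / num_classes) / denom
    spread = (ns * nf / n) if n > 0 else 0.0  # n = 0 handled as 0 (assumption)
    half_width = (z / denom) * math.sqrt(spread + z * z / 4)
    return center - half_width, center + half_width


# Four classes, ten observations, none of them in this cell
lo, hi = shifted_bounds(0, 10, num_classes=4)
print(lo < 0.0)  # True: the lower bound falls below 0
```

So shifting only the center is not enough; with N = 4 the lower bound already drops below 0 when ns = 0.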