Applying Disintegration to represent Conditional Probabilities to assist in Integration


In the context of binary classification by a classifier f, the sufficiency score of the jth feature value in an input u, where f(u) = y, is given by the formula:

$$\mathrm{Suff}_j(f,u) = \int_A I(f(x') = y)\, R(x)\, dx$$

where A is a local neighbourhood of u. For every x in A, we intervene by setting the jth feature to the original input's value, x_j = u_j, so that $$x' = intervene(x_j = u_j)$$. We are then interested in what proportion of the intervened points yield the same classification y. R(x) denotes the probability distribution over the neighbourhood, conditioned on the points where x_j ≠ u_j and f(x) ≠ y, so that we can observe the effect of the intervention on the classification.

$$R(x) = P(x \mid x_j \neq u_j,\ f(x) \neq y) = \frac{P(x,\ x_j \neq u_j,\ f(x) \neq y)}{P(x_j \neq u_j,\ f(x) \neq y)}$$
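To make the setup concrete, here is a minimal Monte Carlo sketch of how I understand the score would be estimated. It is only an illustration under my own assumptions: the neighbourhood A is taken to be a box of half-width `radius` around u, P is uniform on that box, and `sufficiency_score` and `toy_f` are hypothetical names, not from any library. Samples from R are obtained by rejection (drawing from P and keeping only the points with x_j ≠ u_j and f(x) ≠ y), so the ratio above is never evaluated pointwise.

```python
import random


def sufficiency_score(f, u, j, radius=1.0, n_samples=10_000, seed=0):
    """Monte Carlo estimate of Suff_j(f, u).

    Assumption (not from the original formula): A is the box
    [u_i - radius, u_i + radius]^d and P is uniform on it.
    """
    rng = random.Random(seed)
    y = f(u)
    kept = 0      # draws satisfying the conditioning event
    matched = 0   # of those, draws with f(x') == y after the intervention
    for _ in range(n_samples):
        # Draw x from the assumed uniform distribution on A.
        x = [ui + rng.uniform(-radius, radius) for ui in u]
        # Rejection step: keep only x with x_j != u_j and f(x) != y,
        # i.e. samples from R(x).
        if x[j] == u[j] or f(x) == y:
            continue
        kept += 1
        # Intervention: x' = intervene(x_j = u_j).
        x_prime = list(x)
        x_prime[j] = u[j]
        if f(x_prime) == y:
            matched += 1
    # If no draw satisfies the condition, the estimate is undefined.
    return matched / kept if kept else float("nan")


def toy_f(x):
    """Hypothetical classifier that depends only on feature 0."""
    return 1 if x[0] > 0 else 0


u = [0.5, 0.0]
score_feature_0 = sufficiency_score(toy_f, u, j=0)
score_feature_1 = sufficiency_score(toy_f, u, j=1)
print(score_feature_0, score_feature_1)
```

In this toy case, restoring feature 0 always restores the label and restoring feature 1 never does, since `toy_f` ignores feature 1. The sketch only estimates the integral by sampling; it does not define R(x) as a density at individual points.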

The problem is that for points where x_j = u_j and f(x) = y, the numerator and denominator above are both 0. It was suggested that I use disintegration in that case, but after studying it I still don't understand how to apply it here. Any help will be appreciated.