Test of the null hypothesis that two sets are drawn from the same underlying distribution, where the values are paired and each pair is associated with a number


I have about 60 pairs of binary values. The first elements of the pairs form one set ("set A"), and the second elements of the pairs form a second set ("set B"). Each pair is associated with a number. To illustrate (with the binary values indicated as - or +):

number ($n$)   set A   set B
1e1            -       -
1e1            -       -
4e1            -       -
1e2            -       -
2e2            -       +
1e3            +       -
2e3            +       +
5e3            +       +
1e4            +       +
3e4            +       +

Note that the binary values tend to be - when the number is small and + when the number is large, because the probability of a + increases with the number. We imagine that there are (unknown) functions $p_A(n)$ and $p_B(n)$ that, given a number $n_i$, return the probability that the corresponding value is +. Each value in set A is then sampled with probability $p_A(n_i)$ of being +, and each value in set B with probability $p_B(n_i)$. For example, suppose $p_A(1e1)=0.001$; we wouldn't be surprised to see a - for $n_i=1e1$. (Apologies if the notation is non-standard.)
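To make the setup concrete, here is a minimal sketch of the data-generating process described above. It is an illustration only: the logistic shape of `p_plus`, its `midpoint` and `slope` parameters, and the assumption that the two values in a pair are drawn independently given $n_i$ are all hypothetical choices, since the real $p_A(n)$ and $p_B(n)$ are unknown.

```python
import math
import random

def p_plus(n, midpoint=2.5, slope=3.0):
    """Hypothetical P(+ | n): a logistic curve in log10(n).

    The shape and parameters are illustrative assumptions; the real
    p_A(n) and p_B(n) are unknown.
    """
    return 1.0 / (1.0 + math.exp(-slope * (math.log10(n) - midpoint)))

def simulate_pairs(numbers, p_a, p_b, seed=0):
    """Draw one (A, B) pair of +/- values for each number n_i.

    Assumes A and B are independent given n_i (an assumption of this sketch).
    """
    rng = random.Random(seed)
    pairs = []
    for n in numbers:
        a = "+" if rng.random() < p_a(n) else "-"
        b = "+" if rng.random() < p_b(n) else "-"
        pairs.append((n, a, b))
    return pairs

# The numbers from the illustrative table.
numbers = [1e1, 1e1, 4e1, 1e2, 2e2, 1e3, 2e3, 5e3, 1e4, 3e4]

# Under the null hypothesis both sets share the same function.
pairs = simulate_pairs(numbers, p_plus, p_plus)
for n, a, b in pairs:
    print(f"{n:8.0e}  {a}  {b}")
```

Passing two different functions for `p_a` and `p_b` would simulate data under the alternative.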

I would like to test the null hypothesis that set A and set B are drawn from the same overall distribution. The interaction with $n$ is what's confusing me. I think my question is the same as asking whether $p_A(n)=p_B(n)$, but I don't know how to test that null hypothesis either. I feel there must be some reasonably well-known, probably eponymous test for this, but I couldn't find it. I'd prefer to use such a test rather than reinvent the wheel. Does anyone know of one? Do you have one or more references?
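Written out, the hypothesis in question might look as follows. The symbols $A_i$ and $B_i$ (the values in the $i$-th pair, coded 1 for + and 0 for -) and $N$ (the number of pairs, about 60) are introduced here for illustration, and treating each value as a Bernoulli draw given its number is an assumption of this sketch:

$$A_i \sim \mathrm{Bernoulli}\big(p_A(n_i)\big), \qquad B_i \sim \mathrm{Bernoulli}\big(p_B(n_i)\big), \qquad i = 1, \dots, N$$

$$H_0:\ p_A(n) = p_B(n) \ \text{for all } n \qquad \text{vs.} \qquad H_1:\ p_A(n) \neq p_B(n) \ \text{for some } n$$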

Specifically, I would like to avoid, e.g., logit fits and KS of the logit fits, because of propagation of errors in the fit parameters (though if someone here tells me, "no, that's absolutely the standard way to do it, and here's a reference!", I'm happy to reconsider).

Help is appreciated. Thanks!