Subpopulations of an island (Bayes' theorem?)


Help appreciated here.

An island with 2 regions, I and II, has 4 types of individuals: AX, AY, BX and BY, whose exact island-wide totals are known. Here A, B, X and Y are simply traits, e.g., A = male, B = female, X = old, Y = young.

Suppose we also know the marginal (non-cross-tabulated) totals of A, B, X and Y for each of the 2 regions.

So in total we have three tables:

Island:

      A      B
X     45     44  
Y     13     9  

Region I:

        A      B      Total
X       -      -      49
Y       -      -      11
Total   32     28     60

Region II:

        A      B      Total
X       -      -      40
Y       -      -      11
Total   26     25     51

The question is: can we calculate the exact cross-tabulated counts of AX, AY, BX and BY for regions I and II? If not, can we at least estimate them?

I have approached this problem as a variation of Bayes' theorem, but I am not sure it qualifies as such.

Thanks in advance, a.

Best answer:

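You can't calculate the exact quantities for {AX, AY, BX, BY}. The apparent system of 4 equations in 4 unknowns is an illusion: the row totals and the column totals both add up to the same grand total, so one equation is redundant and one degree of freedom remains.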
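To spell out the redundancy for Region I, here is a short worked version (the parameter $t$ is notation introduced for illustration, not part of the original question). The four marginal constraints are

$$\begin{aligned} AX + BX &= 49, & AY + BY &= 11, \\ AX + AY &= 32, & BX + BY &= 28. \end{aligned}$$

Adding the two row equations gives the grand total 60, and so does adding the two column equations, so any one of the four follows from the other three. Writing $t = BY$ for the free value, the remaining cells are

$$AX = 21 + t, \qquad AY = 11 - t, \qquad BX = 28 - t, \qquad 0 \le t \le 11.$$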

One way forward is to pick one cell and estimate it. (No doubt for more complex real-world problems there are better ways, especially if there is some noise associated with the row and column totals.) To reduce the chance of out-of-range numbers I would estimate BY, since it sits at the intersection of the smallest subtotal in each direction. In the Region I grid, for example, I might set BY to 5, chosen by assuming independence between the row and column totals ($11 \times 28 / 60 \approx 5.1$), giving:

$$\begin{array}{c|cc} & A & B \\ \hline X & 26 & 23 \\ Y & 6 & 5 \\ \end{array}$$
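As a minimal sketch of the independence assumption used for this grid, each expected cell is (row total × column total) / grand total. The function name `independence_estimate` is hypothetical, and rounding to whole individuals is an extra assumption that does not preserve the margins in general, although it happens to here.

```python
def independence_estimate(row_totals, col_totals):
    """Expected cell counts if the row and column traits were independent."""
    total = sum(row_totals)
    if total != sum(col_totals):
        raise ValueError("row and column totals must share a grand total")
    return [[r * c / total for c in col_totals] for r in row_totals]

# Region I: rows are (X, Y), columns are (A, B)
expected = independence_estimate([49, 11], [32, 28])

# Round to whole individuals (an assumption; check the margins afterwards)
rounded = [[round(v) for v in row] for row in expected]
print(rounded)
```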

However, I could equally (and arbitrarily) set BY to $0$ and get

$$\begin{array}{c|cc} & A & B \\ \hline X & 21 & 28 \\ Y & 11 & 0 \\ \end{array}$$
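The two grids above are just two members of a one-parameter family. As a rough sketch, the snippet below enumerates every non-negative integer table for Region I and, assuming the Island table from the question is exactly the sum of the two regional tables, derives Region II by subtraction and discards any choice that makes a cell negative. The function name `feasible_tables` and the dictionary layout are illustrative only.

```python
# Island-wide cross-tab and Region I margins, taken from the question
ISLAND = {"AX": 45, "BX": 44, "AY": 13, "BY": 9}
REGION_I_ROWS = {"X": 49, "Y": 11}
REGION_I_COLS = {"A": 32, "B": 28}

def feasible_tables(island, rows, cols):
    """Yield (region_I, region_II) pairs consistent with all known totals."""
    for by in range(min(rows["Y"], cols["B"]) + 1):
        # Fixing BY determines the other three Region I cells
        ay = rows["Y"] - by
        bx = cols["B"] - by
        ax = cols["A"] - ay
        region_1 = {"AX": ax, "BX": bx, "AY": ay, "BY": by}
        # Region II is whatever remains of the island counts (assumption)
        region_2 = {k: island[k] - region_1[k] for k in island}
        cells = list(region_1.values()) + list(region_2.values())
        if all(v >= 0 for v in cells):
            yield region_1, region_2

solutions = list(feasible_tables(ISLAND, REGION_I_ROWS, REGION_I_COLS))
for r1, r2 in solutions:
    print(r1, r2)
```

With the numbers in the question this leaves ten candidate pairs, with Region I's BY running from 0 to 9, so the island table narrows the range but still does not pin down a unique answer.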

So some sort of probabilistic approach might be appropriate; certainly if you have some information on the variation in the underlying populations or any dependencies then that could be used for estimates or sensitivity analysis.