I have a dataset of n records, each with two attributes, A1 and A2:
- A1 can be one of two categories: X or Y.
- A2 can be one of 10 categories, C1 to C10.
My hypothesis is: if A1 is X, then A2 is C1 significantly more often than any other category.
The null hypothesis is that, for records with A1 = X, the categories C1 to C10 are equally likely.
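
To make the null hypothesis concrete, here is a minimal sketch of how I picture it. The records are made up purely for illustration, and `binomial_upper_tail` is a helper name I invented. It restricts the data to A1 = X, compares the observed C1 count with the count expected under equal probabilities (1/10 each), and computes a one-sided exact binomial tail probability as one candidate framing. I am not sure this is the right test.

```python
from collections import Counter
from math import comb

# Hypothetical records as (A1, A2) pairs; the values are made up for illustration.
records = (
    [("X", "C1")] * 8
    + [("X", "C2")] * 2
    + [("X", "C3")] * 1
    + [("X", "C7")] * 1
    + [("Y", "C1")] * 3
    + [("Y", "C4")] * 5
)

categories = [f"C{i}" for i in range(1, 11)]

# Restrict to the records with A1 = X and count each A2 category.
a2_given_x = [a2 for a1, a2 in records if a1 == "X"]
observed = Counter(a2_given_x)
n_x = len(a2_given_x)

# Under the null hypothesis every category has probability 1/10,
# so each expected count is n_x / 10.
p_null = 1 / len(categories)
expected = {c: n_x * p_null for c in categories}


def binomial_upper_tail(k, n, p):
    """Return P(K >= k) for K ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# One-sided question: is C1 observed more often than 1/10 of the time?
k_c1 = observed["C1"]
p_value_c1 = binomial_upper_tail(k_c1, n_x, p_null)

print(f"n_x = {n_x}, observed C1 = {k_c1}, expected C1 = {expected['C1']:.1f}")
print(f"one-sided P(C1 count >= {k_c1} | H0) = {p_value_c1:.3g}")
```

This only looks at C1 versus "not C1" within A1 = X, so it ignores how the remaining nine categories are distributed among themselves.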
How would I test for this? I tried a chi-square test, but concluded that it may not be the right approach, since it would test whether there is a statistically significant association between A1 and A2. What I actually want to test is whether A2 = C1 occurs significantly more often than expected, given A1 = X.
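
For reference, this is roughly what my chi-square attempt computes, as a sketch with made-up records and a helper name (`chi_square_independence`) of my own. It builds the A1 by A2 contingency table and returns Pearson's statistic with its degrees of freedom, which measures association between the two attributes across the whole table rather than an excess of C1 within A1 = X.

```python
from collections import Counter

# Hypothetical records as (A1, A2) pairs; made up for illustration only.
records = (
    [("X", "C1")] * 8
    + [("X", "C2")] * 2
    + [("X", "C3")] * 1
    + [("X", "C7")] * 1
    + [("Y", "C1")] * 3
    + [("Y", "C4")] * 5
)


def chi_square_independence(pairs):
    """Return (statistic, degrees of freedom) for Pearson's chi-square
    test of independence on a list of (row, column) pairs.

    Only categories that actually occur in the data become rows or
    columns, so no expected count is zero.
    """
    n = len(pairs)
    row_totals = Counter(a1 for a1, _ in pairs)
    col_totals = Counter(a2 for _, a2 in pairs)
    cell_counts = Counter(pairs)

    statistic = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            # Expected cell count if A1 and A2 were independent.
            expected = r_total * c_total / n
            statistic += (cell_counts[(r, c)] - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


statistic, dof = chi_square_independence(records)
print(f"chi-square statistic = {statistic:.3f}, degrees of freedom = {dof}")
```

A large statistic here says only that the distribution of A2 differs between X and Y somewhere in the table; it does not single out C1, which is why I doubt it matches my hypothesis.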
What would be the right statistical test for this?