What test should I use for this problem? (assessing the significance of a change in a categorical variable between two different sized populations)?

43 Views Asked by Bumbble Comm At 23 Feb 2026 - 8:33

I have 2 high schools, School A and School B. For the first school, I have 5 classes of students; for the second, I have 3 classes (so 8 classes in total). Within each class, I have different categorical information about each student, for example whether they're male, whether they study French, etc.

The number of students in each class is different.

So the data might look like this (for example):

SCHOOL A

Class 1: 50 students, 20 males, and 5 students who study French
Class 2: 300 students, 50 males, and 8 students who study French
...
Class 5: 25 students, 17 males, and 3 students who study French

SCHOOL B

Class 1: 140 students, 80 males, 10 students who study French
Class 2: 2500 students, 600 males, 110 students who study French
Class 3: 200 students, 110 males, 9 students who study French

What test do I use to assess whether there is a significant difference in the number of males, or of students who study French, between School A and School B?

I'm confused because the different sample sizes presumably mean we should be looking at proportions, but if I look ONLY at proportions, am I still factoring in the magnitudes of the original values? (E.g. far more students are male than study French, so 6/100 students studying French vs. 3/100 will look like a small change in proportion.) Would it be a t-test on the proportions?


Bumbble Comm On 22 Jun 2020 - 10:04 BEST ANSWER

Since every variable is categorical (school, gender, topic studied), you can run a chi-squared test for independence. You want to compare proportions while factoring in the sample sizes, and this is exactly what chi-squared tests do.

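Simply compute the expected counts under the hypothesis that the variables are independent, and compute the statistic $$\chi^2=\sum \frac{(O-E)^2}{E}$$ where $O$ stands for the Observed counts and $E$ stands for the Expected counts. If you test one variable against school in a two-way table, the number of degrees of freedom is "number of schools $-1$" times "number of categories $-1$", which is $1$ for two schools and a yes/no variable. For a full three-way table with $I$ schools, $J$ topics and $K$ genders, the test of mutual independence has $IJK - I - J - K + 2$ degrees of freedom; the product "number of schools $-1$" times "number of topics $-1$" times "number of genders $-1$" is the degrees of freedom of the three-way interaction term, not of mutual independence. Note that a three-way table needs the joint counts (for example, males who study French), which the data as described do not give. Given the way the question is asked, the classes are irrelevant, so the counts can be pooled across classes within each school.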
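As a rough sketch of the calculation for a single two-way table (school by yes/no category), the following plain Python computes the expected counts, the statistic, and a p-value. The function names `chi2_independence` and `p_value_dof1` are made up for illustration. The pooled counts use only the classes explicitly listed in the question (School A classes 3 and 4 are not given, so they are left out), so the numbers are illustrative, not a real result. The p-value helper assumes exactly 1 degree of freedom, where the chi-squared tail probability equals `erfc(sqrt(x / 2))`.

```python
import math


def chi2_independence(table):
    """Pearson chi-squared statistic for a two-way contingency table.

    `table` is a list of rows of observed counts.
    Returns (statistic, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables are independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_dof1(stat):
    """Upper-tail p-value of a chi-squared statistic with exactly 1 dof."""
    return math.erfc(math.sqrt(stat / 2.0))


# Pooled counts from the classes listed in the question only
# (assumption: School A classes 3 and 4 are omitted because they are elided).
students = {"A": 50 + 300 + 25, "B": 140 + 2500 + 200}
males = {"A": 20 + 50 + 17, "B": 80 + 600 + 110}
french = {"A": 5 + 8 + 3, "B": 10 + 110 + 9}

for label, yes in (("male", males), ("studies French", french)):
    # Rows: schools A and B. Columns: yes / no for the category.
    table = [[yes[s], students[s] - yes[s]] for s in ("A", "B")]
    stat, dof = chi2_independence(table)
    print(label, "chi2 =", round(stat, 3), "dof =", dof,
          "p =", p_value_dof1(stat))
```

Each variable gets its own 2 by 2 table, so the test uses the raw counts, not just the proportions: the same difference in proportions gives a larger statistic when the schools are larger. If SciPy is available, `scipy.stats.chi2_contingency` performs the same kind of test, but it applies a continuity correction to 2 by 2 tables by default, so its statistic can differ slightly from this sketch.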
