STATS 250

$$\chi^2$$ Test of Homogeneity

This test assesses whether the distribution of one discrete (categorical) variable is the same across two or more populations (for example, varsity athletes in LSA vs. varsity athletes in Engineering).

Let's say you have two populations of interest, preschool boys and preschool girls, and you take a sample of 75 children from each. The two samples are independent. Is ice cream preference the same for boys and girls?

| Ice Cream Preference | Boys | Girls |
|----------------------|------|-------|
| Vanilla              | 25   | 26    |
| Chocolate            | 30   | 23    |
| Strawberry           | 20   | 26    |

Our null hypothesis $$H_0$$ is that the distribution of ice cream preference is the same for the two populations, boys and girls.

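Since there were 46 children who preferred strawberry overall, and the two samples are the same size, under the null hypothesis we would expect 23 of these children to be boys and 23 to be girls. In general, the expected count for a cell is (row total × column total) / (overall total); here that is 46 × 75 / 150 = 23.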

Since there were 53 children who preferred chocolate overall, we would expect 26.5 boys and 26.5 girls to prefer chocolate. Although we can't split somebody in half, this is what we would expect numerically. On the exam, when you're asked for an expected value, do not round.

For vanilla, there were 51 children in total, so we would expect 25.5 boys and 25.5 girls.
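The expected counts above can be reproduced with a short script. This is a minimal sketch in plain Python; the variable names (`observed`, `col_totals`, `expected`) are illustrative choices, not part of the course material.

```python
# Observed counts from the table: each flavor maps to (boys, girls).
observed = {
    "Vanilla": (25, 26),
    "Chocolate": (30, 23),
    "Strawberry": (20, 26),
}

# Column totals (sample sizes for boys and girls) and the overall total.
col_totals = [sum(row[j] for row in observed.values()) for j in range(2)]
grand_total = sum(col_totals)

# Expected count for a cell = row total * column total / overall total.
expected = {
    flavor: tuple(sum(row) * col_totals[j] / grand_total for j in range(2))
    for flavor, row in observed.items()
}

for flavor, counts in expected.items():
    print(flavor, counts)
```

Because both samples have 75 children, each row total is simply split in half, which matches the 25.5, 26.5, and 23 worked out by hand.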

The $$\chi^2$$ statistic is the sum, over all six cells of the table, of terms of the form:

$$\frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

$$\chi^2 = \frac{(25 - 25.5)^2}{25.5} + \cdots + \frac{(26 - 23)^2}{23} \approx 1.73$$
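To check the arithmetic, the six terms can be summed directly. This is a small sketch in plain Python, with the observed and expected counts typed in from the example; the name `chi_sq` is just an illustrative choice.

```python
# Rows are vanilla, chocolate, strawberry; columns are boys, girls.
observed = [[25, 26], [30, 23], [20, 26]]
expected = [[25.5, 25.5], [26.5, 26.5], [23.0, 23.0]]

# Sum (observed - expected)^2 / expected over every cell of the table.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(round(chi_sq, 2))  # prints 1.73
```

The chocolate and strawberry rows contribute almost all of the total; the vanilla counts are within half a child of what the null hypothesis predicts.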

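The question is: is this 1.73 surprising? What would we expect? If the null hypothesis is true, then the test statistic has (approximately) the $$\chi^2$$ distribution with df = $$(r-1)(c-1)$$, where $$r$$ is the number of rows (categories) and $$c$$ is the number of columns (populations). In our case, df = $$(3-1)(2-1) = 2$$. Note that the mean of a $$\chi^2$$ distribution equals its degrees of freedom, so under the null hypothesis we would expect a value of around 2. Our 1.73 is pretty close to 2.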

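Using Table A.5, we find that the p-value is between 0.25 and 0.5. That is larger than any commonly used significance level, so we fail to reject the null hypothesis: there is not sufficient evidence that the distribution of ice cream preference differs between boys and girls.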
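The table lookup can be cross-checked numerically. The sketch below relies on one fact that holds only for 2 degrees of freedom: the upper-tail probability of the $$\chi^2$$ distribution has the closed form $$P(X > x) = e^{-x/2}$$. For other degrees of freedom you would need a table or statistical software instead. The variable names are illustrative.

```python
import math

n_rows, n_cols = 3, 2              # 3 flavors, 2 populations
df = (n_rows - 1) * (n_cols - 1)   # degrees of freedom = (r - 1)(c - 1)

chi_sq = 1.7267                    # test statistic from the example (about 1.73)

# Closed form valid ONLY when df == 2: P(X > x) = exp(-x / 2).
assert df == 2
p_value = math.exp(-chi_sq / 2)

print(round(p_value, 2))  # prints 0.42
```

A p-value of about 0.42 falls inside the 0.25 to 0.5 range read from the table, which is consistent with failing to reject the null hypothesis.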