STATS 250

Confidence Intervals, Sample Size, and Introduction to Hypothesis Testing for a Population Proportion

The confidence interval was previously defined as:

$$\hat{p} \pm z^* \text{s.e.}(\hat{p})$$

The standard error,

$$\text{s.e.}(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

is largest when \(\hat{p} = 0.5\), where it equals \(\sqrt{0.25/n} = \frac{1}{2\sqrt{n}}\). Substituting this maximum into the interval gives:

$$\hat{p} \pm \frac{z^*}{2\sqrt{n}}$$

This is called the conservative confidence interval, because it is the widest the interval can be for a given sample size and confidence level.
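As a rough illustration, the two intervals can be compared numerically. The sketch below is only one way to do it: the function names `z_star` and `proportion_ci` are made up for this example, the sample (120 successes out of 400) is hypothetical, and \(z^*\) is taken from the standard normal distribution in Python's `statistics` module.

```python
from math import sqrt
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* leaving (1 - confidence)/2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def proportion_ci(p_hat, n, confidence=0.95, conservative=False):
    """Return (lower, upper) for p_hat +/- z* s.e.(p_hat).

    With conservative=True the standard error is replaced by its
    largest possible value, 1 / (2 sqrt(n)).
    """
    z = z_star(confidence)
    if conservative:
        margin = z / (2 * sqrt(n))
    else:
        margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 120 successes in n = 400 trials.
p_hat = 120 / 400
usual = proportion_ci(p_hat, 400)
conservative = proportion_ci(p_hat, 400, conservative=True)

print("usual:       ", usual)
print("conservative:", conservative)
```

The conservative interval is never narrower than the usual one, and the two coincide only when \(\hat{p} = 0.5\).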

Choosing a sample size

The conservative margin of error can be used to choose a sample size. If \(m\) is the margin of error you want, set:

$$m = \frac{z^*}{2\sqrt{n}}$$

This can be rearranged to:

$$n = \left(\frac{z^*}{2m}\right)^2$$

Always round \(n\) up to the next whole number. You cannot sample a fraction of an element, and this \(n\) is the minimum needed, so rounding down would give a margin of error larger than \(m\).
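The rearranged formula can be sketched in a few lines. The function name `conservative_sample_size` and the target margins are hypothetical choices for this example, and \(z^*\) is again computed from the standard normal distribution rather than read from a table.

```python
from math import ceil
from statistics import NormalDist

def conservative_sample_size(margin, confidence=0.95):
    """Smallest n whose conservative margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z* / (2m))^2, rounded up because n must be a whole number.
    return ceil((z / (2 * margin)) ** 2)

# Hypothetical target: a 3 percentage point margin at 95% confidence.
n_needed = conservative_sample_size(0.03)
print(n_needed)  # 1068
```

Because \(n\) grows with \(1/m^2\), halving the margin of error roughly quadruples the required sample size.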

Using the CI to guide decisions

  • A value not in a confidence interval can be rejected as a possible value of the population proportion.
  • A value in the confidence interval is an "acceptable" possibility for the value of a population proportion.
  • If the confidence intervals for proportions in two different populations do not overlap, it is reasonable to conclude that the two population proportions differ.
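The three rules above amount to simple interval checks. The sketch below uses made-up intervals and hypothetical helper names (`contains`, `overlaps`), with each interval stored as a `(lower, upper)` pair.

```python
def contains(interval, value):
    """True if `value` lies inside the confidence interval."""
    lower, upper = interval
    return lower <= value <= upper

def overlaps(a, b):
    """True if confidence intervals `a` and `b` share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical intervals for proportions in two different populations.
ci_a = (0.25, 0.35)
ci_b = (0.40, 0.50)

print(contains(ci_a, 0.30))  # True: 0.30 is an acceptable value
print(contains(ci_a, 0.50))  # False: 0.50 can be rejected
print(overlaps(ci_a, ci_b))  # False: reasonable to conclude they differ
```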

Hypothesis Testing

  1. Determine the null and alternative hypotheses.
  2. Verify the necessary data conditions. If they are met, summarize the data into an appropriate test statistic.
  3. Assuming the null hypothesis is true, find the p-value.
  4. Decide if your result is statistically significant, based on the p-value.
  5. Report your conclusions in your context.

The null hypothesis \(H_0\) is the hypothesis that says there is no effect. This is a statement about the population parameter, not the sample statistic.

The alternative hypothesis \(H_a\) is the hypothesis that something is happening.

As a side note, the equals sign is always in the null hypothesis.

If the alternative hypothesis specifies a change in one particular direction, the test is called a one-tailed hypothesis test. If it specifies a change from some value in either direction, use a two-tailed hypothesis test.
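To make the difference between the two kinds of test concrete, here is a minimal sketch of how the p-value changes with the direction of the alternative. It assumes the standard one-sample z statistic for a proportion, \(z = (\hat{p} - p_0)/\sqrt{p_0(1-p_0)/n}\), which these notes have not yet introduced. The function name `proportion_z_test`, the `alternative` labels, and the sample numbers are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(p_hat, p0, n, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0.

    Assumes the usual one-sample z statistic, with the standard
    error computed under the null hypothesis.
    """
    se0 = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se0
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: p > p0 (one-tailed)
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: p < p0 (one-tailed)
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":  # Ha: p != p0 (two-tailed)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Hypothetical data: p_hat = 0.56 from n = 200, testing H0: p = 0.5.
z, p_one = proportion_z_test(0.56, 0.5, 200, alternative="greater")
_, p_two = proportion_z_test(0.56, 0.5, 200, alternative="two-sided")
print(z, p_one, p_two)
```

For the same data, the two-tailed p-value is twice the one-tailed p-value in the direction of the observed difference, so a result can be significant in a one-tailed test but not in a two-tailed one.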