STATS 250

Sampling Distribution for a Sample Proportion

To estimate a population parameter, you take a sample and compute a sample statistic. Because different random samples contain different individuals, the value of the statistic changes from sample to sample. This is called sampling variability.
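Sampling variability is easy to see in a quick simulation. The sketch below is only an illustration: the population proportion `true_p = 0.4`, the sample size `n = 50`, and the helper name `sample_proportion` are all made-up assumptions, not values from these notes.

```python
import random

def sample_proportion(p, n, rng):
    """Draw n yes/no observations with success chance p; return the sample proportion."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n

rng = random.Random(250)  # fixed seed so the run is repeatable
true_p = 0.4              # hypothetical population proportion (assumption)
n = 50                    # hypothetical sample size (assumption)

# Five separate samples from the same population give five values of p-hat.
p_hats = [sample_proportion(true_p, n, rng) for _ in range(5)]
print(p_hats)
```

Each value in the printed list is a sample proportion \(\hat{p}\) from a different sample of the same population. In general the values differ from each other and from `true_p`.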

In statistical inference, you use a sample to draw a conclusion about a population. This class covers two main ways to do this:

  • Confidence interval estimation: A range of reasonable values for the population parameter.
  • Hypothesis test: Comparing what you observed in the sample with what you would expect to see if a claim about the population parameter were true.

Introduction to Confidence Intervals

A confidence interval is a range of values, centered at your sample statistic \(\hat{p}\), that you believe contains the population parameter \(p\). You can never be certain that it does, but you can attach a confidence level, such as 95%, that describes how often intervals built this way capture the actual value of \(p\).

For example, I could be 100% sure that the average age of a person is between 0 and the age of the universe, but an interval that wide tells me nothing useful. With data from a sample, I could be 99% sure that the average age is between 10 and 100. As the sample size grows, the margin of error shrinks and the interval closes in around the actual value of the population parameter.
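The trade-off between confidence level, sample size, and interval width can be sketched for a proportion. The code below assumes the usual large-sample interval \(\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}\), which these notes have not derived at this point. The function name `proportion_ci` and the inputs (\(\hat{p} = 0.4\), \(n = 100\) or \(1000\)) are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence):
    """Large-sample interval for a proportion (assumed formula: p_hat +/- z* x SE)."""
    # z* is the standard normal quantile that leaves (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Higher confidence level -> wider interval (same data).
for confidence in (0.90, 0.95, 0.99):
    low, high = proportion_ci(0.4, 100, confidence)
    print(f"{confidence:.0%} interval, n=100:  ({low:.3f}, {high:.3f})")

# Larger sample -> narrower interval (same confidence level).
low, high = proportion_ci(0.4, 1000, 0.95)
print(f"95% interval, n=1000: ({low:.3f}, {high:.3f})")
```

Raising the confidence level widens the interval, while increasing \(n\) narrows it. That is the sense in which the interval closes in on \(p\) as more data are collected.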

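If \(np \geq 10\) and \(n(1-p) \geq 10\), then the sample proportion \(\hat{p}\) is approximately normally distributed, with mean \(p\) and standard deviation \(\sqrt{p(1-p)/n}\):

$$\hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$$

Under the same conditions, the count of successes \(X = n\hat{p}\) is also approximately normal:

$$X \sim N\left(np, \sqrt{np(1-p)}\right)$$

If either condition fails, neither normal approximation is reliable, and you should use the exact binomial distribution of \(X\) instead.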
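The normal model for the sample proportion can be checked by simulation. The sketch below draws many samples, computes \(\hat{p}\) for each, and compares the mean and standard deviation of those values with \(p\) and \(\sqrt{p(1-p)/n}\). The values \(p = 0.4\) and \(n = 100\), the number of repetitions, and the helper names are assumptions chosen for illustration.

```python
import math
import random
import statistics

def normal_conditions_met(p, n):
    """Check the rule of thumb np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

def simulate_p_hats(p, n, reps, rng):
    """Return reps sample proportions, each from a fresh sample of size n."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(250)  # fixed seed so the run is repeatable
p, n = 0.4, 100           # hypothetical population proportion and sample size

p_hats = simulate_p_hats(p, n, 5000, rng)
theory_sd = math.sqrt(p * (1 - p) / n)

print("conditions met:", normal_conditions_met(p, n))
print("mean of p-hats:", round(statistics.mean(p_hats), 3), "vs p =", p)
print("sd of p-hats:  ", round(statistics.stdev(p_hats), 3), "vs formula", round(theory_sd, 3))
```

With these inputs the conditions hold, and the simulated mean and standard deviation should land close to the values given by the formula. Trying a small \(p\) with the same \(n\) (for instance \(p = 0.05\)) makes `normal_conditions_met` return `False`, which signals that the normal model should not be trusted there.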