STATS 250

Sampling Distribution for a Sample Proportion

If you want to estimate a population parameter, you take a sample. Because of sampling variability, different samples from the same population generally give different values of the sample statistic.
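As a rough illustration of sampling variability, the sketch below draws several samples from the same simulated population and prints the sample proportion from each. The population proportion `p_true`, the sample size `n`, and the helper `sample_proportion` are hypothetical choices made for this example only.

```python
import random

def sample_proportion(p, n, rng):
    """Draw n independent yes/no observations, each a success with
    probability p, and return the proportion of successes."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n

rng = random.Random(250)  # fixed seed so the run is reproducible
p_true = 0.4              # hypothetical population proportion
n = 50                    # hypothetical sample size

# Five samples from the same population, one sample proportion each.
p_hats = [sample_proportion(p_true, n, rng) for _ in range(5)]
print(p_hats)
```

Each printed value is an estimate of the same `p_true`, yet the values need not agree with one another or with `p_true` itself.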

In statistical inference, you use a sample to make a judgment about a population. This class covers two main ways to do this:

Confidence interval estimation: Finding a range of reasonable values for the population parameter.
Hypothesis test: Assessing whether what you observed in the sample is consistent with what a hypothesized value of the population parameter would lead you to expect.

Introduction to Confidence Intervals

A confidence interval is a range of values, built around your sample statistic $\hat{p}$, that you believe contains the population parameter $p$. You can never be certain that it does, but you can attach a confidence level that describes how often intervals built this way capture the actual value of $p$.

For example, I could be 100% sure that the average age of a person is between 0 and the age of the universe. With more data, I could be 99% sure that the average age is between 10 and 100. As the sample size grows, the margin of error shrinks and the interval closes in around the actual value of the population parameter.
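The trade-off between confidence level and interval width can be sketched numerically. The example below assumes the usual large-sample interval for a proportion, $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$, which these notes have not introduced yet. The function name `proportion_interval` and the values of `p_hat` and `n` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence):
    """Large-sample interval p_hat +/- z* sqrt(p_hat(1-p_hat)/n).
    Assumption: this formula is not stated in the notes above."""
    # z* leaves (1 - confidence)/2 in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

p_hat = 0.4  # hypothetical sample proportion
for n in (100, 1000):
    for confidence in (0.90, 0.95, 0.99):
        low, high = proportion_interval(p_hat, n, confidence)
        print(f"n={n:5d}  confidence={confidence:.2f}  "
              f"interval=({low:.3f}, {high:.3f})")
```

For a fixed sample size, a higher confidence level gives a wider interval. For a fixed confidence level, a larger sample gives a narrower one.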

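If $np \geq 10$ and $n(1-p) \geq 10$, then the sample proportion $\hat{p}$ is approximately normally distributed, with mean $p$ and standard deviation $\sqrt{p(1-p)/n}$:

$$\hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$$

Under the same conditions, the count of successes $X = n\hat{p}$ is approximately normally distributed as well:

$$X \sim N\left(np, \sqrt{np(1-p)}\right)$$

Otherwise, the normal approximation is unreliable, and you should work with the exact binomial distribution of the count, $X \sim \text{Bin}(n, p)$.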
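The conditions $np \geq 10$ and $n(1-p) \geq 10$ and the resulting normal model can be checked with a short sketch. It assumes the count of successes follows a binomial distribution (independent observations with a common success probability), so that an exact probability is available for comparison. All function names and the values of `n` and `p` are hypothetical.

```python
from math import comb, floor, sqrt
from statistics import NormalDist

def normal_conditions_met(n, p):
    """Check np >= 10 and n(1-p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

def p_hat_distribution(n, p):
    """Approximate normal model for the sample proportion:
    mean p, standard deviation sqrt(p(1-p)/n)."""
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

def exact_prob_p_hat_at_most(n, p, cutoff):
    """Exact P(p_hat <= cutoff), assuming the count is Binomial(n, p)."""
    # Largest count whose proportion is <= cutoff; the small tolerance
    # guards against floating-point round-off in cutoff * n.
    max_count = floor(cutoff * n + 1e-9)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(max_count + 1))

n, p = 100, 0.4  # hypothetical sample size and population proportion
print("conditions met:", normal_conditions_met(n, p))

model = p_hat_distribution(n, p)
print("mean:", model.mean, "standard deviation:", model.stdev)

# Compare the normal approximation with the exact binomial probability.
print("normal approx P(p_hat <= 0.45):", model.cdf(0.45))
print("exact binomial P(p_hat <= 0.45):", exact_prob_p_hat_at_most(n, p, 0.45))
```

When the conditions hold, the two probabilities should be close. Rerunning with a small sample, such as `n, p = 20, 0.1`, fails the check and is a case where the exact calculation is the safer choice.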