In this lesson, you’re going to learn how to find the margin of error, confidence interval, and point estimate for a population proportion with large sample sizes.

## Confidence Intervals and Proportions

Understanding confidence intervals and proportions can be useful in everyday life. For example, let’s say that one day you might want to run your own business.

In doing so, you may want to figure out the proportion or percentage of customers who are happy with the product you make. To do this, you’ll have to estimate a population proportion and find confidence intervals. A **confidence interval** is *the point estimate +/- the margin of error* and the **point estimate** is *the value of a sample statistic, which is used as an estimate of a population parameter*. This lesson will describe how to find confidence intervals for proportions.

## Important Formulas

The **population proportion** is denoted by the symbol *p* while the **sample proportion** is denoted by the symbol *p-hat*. In this lesson, we’re going to learn how to estimate the population proportion using the sample proportion.

For large samples:

- The sampling distribution of *p-hat* is approximately normal
- The mean of the sampling distribution of *p-hat*, denoted as mu_*p-hat*, equals *p*
- The standard deviation of the sampling distribution of *p-hat*, denoted as sigma_*p-hat*, is equal to the square root of ((*p* x *q*) / *n*), where *q* = 1 - *p*
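If you’d like to see these three facts in action, here’s a minimal simulation sketch in Python. The population proportion (0.6), the sample size (200), the number of samples (5,000), and the function name `simulate_sample_proportions` are all made-up assumptions for illustration; they don’t come from any real data set.

```python
import math
import random


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return each sample's p-hat."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(num_samples):
        # Each person in the sample is a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats


# Hypothetical values chosen only for this illustration.
p, n = 0.6, 200
p_hats = simulate_sample_proportions(p, n, num_samples=5000)

# Mean and standard deviation of the simulated p-hat values.
mean_p_hat = sum(p_hats) / len(p_hats)
sd_p_hat = math.sqrt(sum((x - mean_p_hat) ** 2 for x in p_hats) / len(p_hats))

# What the formula says the standard deviation should be: sqrt(p * q / n).
theoretical_sd = math.sqrt(p * (1 - p) / n)

print(f"simulated mean of p-hat: {mean_p_hat:.4f} (theory: {p})")
print(f"simulated sd of p-hat:   {sd_p_hat:.4f} (theory: {theoretical_sd:.4f})")
```

The simulated mean should land close to *p*, and the simulated standard deviation should land close to the square root of ((*p* x *q*) / *n*). A histogram of `p_hats` would also look roughly bell-shaped.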

By the way, when I say large samples, I mean that for a proportion, the sample is large enough when *np* and *nq* are both greater than 5, where *n* is the symbol for the sample size. If you don’t know what *p* and *q* are equal to, then *n* x *p-hat* and *n* x *q-hat* should both be greater than 5.
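The large-sample check is easy to express in code. Here’s a small sketch; the function name `is_large_sample` is just a hypothetical helper for this lesson, and you can pass either *p* or, when *p* is unknown, *p-hat*.

```python
def is_large_sample(n, proportion):
    """Return True when both n * proportion and n * (1 - proportion) exceed 5.

    `proportion` can be the population proportion p or, if p is unknown,
    the sample proportion p-hat.
    """
    q = 1 - proportion
    return n * proportion > 5 and n * q > 5


print(is_large_sample(1000, 0.6))  # 600 and 400 are both greater than 5 -> True
print(is_large_sample(20, 0.1))    # 20 x 0.1 = 2 is not greater than 5 -> False
```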

Since we don’t really know the values of *p* and *q* when we are estimating the population proportion, we can’t actually compute the value of the standard deviation of the sampling distribution of *p-hat*. This means we have to use the value of *s_p-hat* as an estimate of sigma_*p-hat*. To calculate *s_p-hat* you can use the following formula:

*s_p-hat* = square root of ((*p-hat* x *q-hat*) / *n*), where *q-hat* = 1 – *p-hat*

Again, *s_p-hat* is the estimator of sigma_*p-hat*, the standard deviation of *p-hat*. Similarly, *p-hat* is the point estimator of *p*. Thus, to find the confidence interval for *p* we need to add to and subtract from *p-hat* a number called the **margin of error (E)**, *the number added to or subtracted from the point estimate*.
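The formula for *s_p-hat* can be sketched as a short Python function. The name `estimated_standard_error` and the sample values below (a sample proportion of 0.5 from 100 people) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def estimated_standard_error(p_hat, n):
    """Estimate sigma_p-hat with s_p-hat = sqrt(p-hat * q-hat / n)."""
    q_hat = 1 - p_hat
    return math.sqrt(p_hat * q_hat / n)


# Hypothetical sample: p-hat = 0.5 from n = 100 people.
# sqrt(0.5 * 0.5 / 100) = sqrt(0.0025) = 0.05
print(estimated_standard_error(0.5, 100))
```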

The confidence interval for *p* is calculated with the following formula:

*p-hat* (+/-) *z* x *s_p-hat*

where the term *z* x *s_p-hat* is the margin of error. You can find the value of *z* for the appropriate confidence level in the standard normal distribution table at the bottom of this page.
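If you’d rather not look up *z* in a table, here’s one way the whole calculation might be sketched in Python. It uses `NormalDist` from Python’s standard `statistics` module (available in Python 3.8 and later) to get *z*; the function names `z_for_confidence` and `proportion_confidence_interval` are hypothetical helpers. Keep in mind that the computed *z* is not rounded the way table values are, so results can differ slightly from hand calculations.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence_level):
    """Return the positive z value that leaves (1 - level) / 2 in each tail."""
    tail_area = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail_area)


def proportion_confidence_interval(p_hat, n, confidence_level):
    """Return (lower, upper, margin_of_error) for a large-sample proportion."""
    z = z_for_confidence(confidence_level)
    s_p_hat = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
    margin = z * s_p_hat                          # margin of error E
    return p_hat - margin, p_hat + margin, margin


# Hypothetical sample: p-hat = 0.5, n = 100, 95% confidence level.
lower, upper, margin = proportion_confidence_interval(0.5, 100, 0.95)
print(f"z = {z_for_confidence(0.95):.2f}")  # about 1.96
print(f"interval = {lower:.3f} to {upper:.3f}, E = {margin:.3f}")
```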

## Example

That’s a lot of formulas! Let’s go through an example together to show you how this is really done. This is a completely made-up example, by the way.

A research company has taken a sample of 1,000 people aged 18-75. It asked them whether or not marriage was important to them.

60% of the respondents claimed that marriage was, in fact, important to them. Find the point estimate of the corresponding population proportion. Then, using a 99% confidence level, find the percentage of all people between the ages of 18-75 who will also say that marriage is important to them. Find the margin of error for this estimate.

I always like to first write out the numbers we know for sure. Our sample size, *n*, is equal to 1,000. *p-hat* in this case is 0.60 (from 60%), and this is the point estimate of *p*. *q-hat* = 1 – *p-hat* = 1 – 0.6 = 0.4

This means we have all the numbers we need to figure out *s_p-hat*. Just plug and chug into the formula I gave you before to get:

*s_p-hat* = square root of ((*p-hat* x *q-hat*) / *n*) = square root of ((0.6 x 0.4) / 1000) = square root of (0.24 / 1000) = 0.015492

Using the standard normal distribution table, and remembering that we are looking for the area in both tails of the normal distribution curve, we’d find that the *z* values we need are about -2.58 and 2.58. Meaning, *z* = 2.58. This is because each tail has an area of (1 – 0.99) / 2 = 0.0050, so the area to the left of the upper *z* value is 0.0050 + 0.99 = 0.9950. You’d find this value in the table I mentioned before very close to 2.58.

Again, all we have to do is plug and chug into the formulas I gave you before to find our confidence interval.

*p-hat* (+/-) *z* x *s_p-hat* = 0.6 (+/-) 2.58(0.015492) = 0.56 to 0.64 = 56%-64%.

This means we can state with a 99% level of confidence that between 56% and 64% of all people between the ages of 18-75 will claim that marriage is important to them. The margin of error is 2.58 x 0.015492, which is about 0.04, or 4%.
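If you want to double-check the arithmetic in this example, here’s a short Python sketch that plugs in the same numbers. The variable names are my own, and *z* = 2.58 is the rounded table value used in this lesson rather than a value computed by the code.

```python
import math

n = 1000        # sample size
p_hat = 0.60    # sample proportion (the point estimate of p)
q_hat = 1 - p_hat
z = 2.58        # rounded table value for a 99% confidence level

s_p_hat = math.sqrt(p_hat * q_hat / n)  # estimated standard error
margin_of_error = z * s_p_hat           # E = z x s_p-hat
lower = p_hat - margin_of_error
upper = p_hat + margin_of_error

print(f"s_p-hat = {s_p_hat:.6f}")                     # 0.015492
print(f"margin of error = {margin_of_error:.2f}")     # 0.04
print(f"interval = {lower:.2f} to {upper:.2f}")       # 0.56 to 0.64
```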

## Lesson Summary

Using everything you’ve learned here, you should now be able to solve similar problems on your own.

A **confidence interval** is *the point estimate +/- the margin of error*; the **point estimate** is *the value of a sample statistic, which is used as an estimate of a population parameter*; and the **margin of error (E)** is *the number added to or subtracted from the point estimate*.

The **population proportion** is denoted by the symbol *p* while the **sample proportion** is denoted by the symbol *p-hat*.

To calculate *s_p-hat* you can use the following formula:

*s_p-hat* = square root of ((*p-hat* x *q-hat*) / *n*)

The confidence interval for *p* is calculated with the following formula:

*p-hat* (+/-) *z* x *s_p-hat*

where the term *z* x *s_p-hat* is the margin of error.