6.5 Behavior of Confidence Intervals for a Proportion
Confidence intervals (CI) for p behave similarly to intervals for μ, though there are a few subtle differences.
Calculating the Sample Size
If researchers desire a specific margin of error, then they can use the margin of error formula to calculate the required sample size, n.
Recall the margin of error for a population proportion is:
$\text{MoE} = (z_{\alpha/2})\sqrt{\dfrac{p(1-p)}{n}}$
Solving for n gives you an equation for the sample size:
$n = \left(\dfrac{z_{\alpha/2}}{\text{MoE}}\right)^2 p(1-p)$
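The algebra behind this step can be written out as a short derivation. It uses MoE for the margin of error and $z_{\alpha/2}$ for the critical value: square both sides, then isolate n.

```latex
\begin{align*}
\text{MoE} &= z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \\
\text{MoE}^2 &= z_{\alpha/2}^2 \, \frac{p(1-p)}{n} \\
n &= \left(\frac{z_{\alpha/2}}{\text{MoE}}\right)^2 p(1-p)
\end{align*}
```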
Recall the objective of a CI. If we are looking to estimate p, then we do not know what it is, even though it appears in this formula. So what do we plug in for p? We have a few options:
- If you have prior information, such as a previous sample, and can calculate a point estimate $\hat{p}$, plug it in!
- You can use your best guess at p.
- You can use a “conservative” estimate of p, 0.5.
NOTE: Remember that $\hat{q} = 1 - \hat{p}$, though we do not know $\hat{p}$ yet. Since we multiply $\hat{p}$ and $\hat{q}$ together, we make them both equal to 0.5. Why? Because $\hat{p}\hat{q} = (0.5)(0.5) = 0.25$ results in the largest possible product. (Try other products: (0.6)(0.4) = 0.24; (0.3)(0.7) = 0.21; (0.2)(0.8) = 0.16; and so on.) The largest possible product gives us the largest n. This gives us a large enough sample to be CL% confident that we are within the desired margin of error of the true population proportion.
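The claim that 0.5 gives the largest product can be checked numerically. The sketch below is only an illustration (the grid of candidate values and the variable names are our own choices, not part of the text); it evaluates p(1 - p) for p = 0.00, 0.01, ..., 1.00 and reports where the product is largest.

```python
# Evaluate p * (1 - p) on a grid of candidate proportions (illustrative grid).
candidates = [i / 100 for i in range(101)]  # 0.00, 0.01, ..., 1.00
products = {p: p * (1 - p) for p in candidates}

# Find the candidate proportion with the largest product.
best_p = max(products, key=products.get)
print(best_p, products[best_p])  # 0.5 0.25
```

Because the required n is proportional to this product, any other value of p would lead to a smaller sample size than the conservative choice.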
Example
Suppose a mobile phone company wants to determine the current percentage of customers aged 50+ who use text messaging on their cell phones. How many customers aged 50+ should the company survey in order to be 90% confident that the estimated (sample) proportion is within three percentage points of the true population proportion of customers aged 50+ who use text messaging on their cell phones?
Solution
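From the problem, we know that the margin of error is 0.03 (3% = 0.03) and $z_{\alpha/2} = z_{0.05} = 1.645$ because the confidence level is 90%. Since we have no prior estimate of p, we use the conservative value p = 0.5.
$n = \left(\dfrac{z_{\alpha/2}}{\text{MoE}}\right)^2 p(1-p)$ gives $n = \left(\dfrac{1.645}{0.03}\right)^2 (0.5)(0.5) \approx 751.7$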
Round the answer to the next higher value. The sample size should be 752 cell phone customers aged 50+ in order to be 90% confident that the estimated (sample) proportion is within three percentage points of the true population proportion of all customers aged 50+ who use text messaging on their cell phones.
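The same calculation can be sketched in Python. The function name `required_sample_size` and its parameters are hypothetical, chosen for this illustration. It assumes the conservative p = 0.5 unless another value is supplied, and it takes the critical value from the standard library's `NormalDist` rather than the rounded table value 1.645, so the intermediate result differs slightly from 751.7 before rounding up.

```python
import math
from statistics import NormalDist


def required_sample_size(moe, confidence, p=0.5):
    """Smallest whole n with z * sqrt(p * (1 - p) / n) <= moe.

    Hypothetical helper: p defaults to the conservative value 0.5.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    n = (z / moe) ** 2 * p * (1 - p)
    return math.ceil(n)  # always round up to the next whole person


n = required_sample_size(moe=0.03, confidence=0.90)
print(n)  # 752
```

Rounding up rather than to the nearest integer ensures that the margin of error is no larger than the one requested.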
Your Turn!
Suppose an internet marketing company wants to determine the current percentage of customers who click on ads on their smartphones. How many customers should the company survey in order to be 90% confident that the estimated proportion is within five percentage points of the true population proportion of customers who click on ads on their smartphones?
“Plus Four” Confidence Interval for p
This is an optional alternative method for constructing a CI for p. It is closely related to the Wilson score interval: at 95% confidence, $z_{\alpha/2}^2 \approx 4$, which is where the four added observations come from.
There is a certain amount of error introduced into the process of calculating a confidence interval for a proportion. Because we do not know the true proportion for the population, we are forced to use point estimates to calculate the appropriate standard deviation of the sampling distribution. Studies have shown that the resulting estimation of the standard deviation can be flawed.
Fortunately, there is a simple adjustment that allows us to produce more accurate confidence intervals. We simply pretend that we have four additional observations. Two of these observations are successes, and two are failures. The new sample size is then n + 4, and the new count of successes is x + 2.
Computer studies have demonstrated the effectiveness of this method. It should be used when the confidence level desired is at least 90% and the sample size is at least ten.
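In symbols, the adjustment can be summarized as follows. This is a sketch in the notation of this section, where x and n are the original count of successes and the original sample size, and $\hat{p}$ denotes the adjusted point estimate.

```latex
\hat{p} = \frac{x + 2}{n + 4}, \qquad
\text{MoE} = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n + 4}}, \qquad
\text{CI} = \hat{p} \pm \text{MoE}
```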
Example
A random sample of 25 statistics students was asked: “Have you smoked a cigarette in the past week?” Six students reported smoking within the past week. Use the “plus-four” method to find a 95% confidence interval for the true proportion of statistics students who smoke.
Solution
Six students out of 25 reported smoking within the past week, so x = 6 and n = 25. Because we are using the “plus-four” method, we will use x = 6 + 2 = 8 and n = 25 + 4 = 29.
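$\hat{p} = \dfrac{x}{n} = \dfrac{8}{29} \approx 0.276$
$\hat{q} = 1 - \hat{p} = 1 - 0.276 = 0.724$
Since CL = 0.95, we know $\alpha = 1 - 0.95 = 0.05$ and $\alpha/2 = 0.025$.
$z_{0.025} = 1.96$
Margin of error $= (z_{\alpha/2})\sqrt{\dfrac{\hat{p}\hat{q}}{n}} = (1.96)\sqrt{\dfrac{(0.276)(0.724)}{29}} \approx 0.163$.
$\hat{p} - \text{MoE} = 0.276 - 0.163 = 0.113$
$\hat{p} + \text{MoE} = 0.276 + 0.163 = 0.439$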
We are 95% confident that the true proportion of all statistics students who smoke cigarettes is between 0.113 and 0.439.
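The steps of this solution can be collected into a short Python sketch. The function name `plus_four_interval` and its parameters are hypothetical, introduced only for illustration. It adds two successes and two failures, then applies the usual margin of error formula to the adjusted counts. The critical value comes from the standard library's `NormalDist`, so it is slightly more precise than the table value 1.96.

```python
import math
from statistics import NormalDist


def plus_four_interval(x, n, confidence):
    """Plus-four confidence interval for a proportion (illustrative helper)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    x_adj = x + 2  # pretend two extra successes
    n_adj = n + 4  # pretend four extra observations in total
    p_hat = x_adj / n_adj
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n_adj)
    return p_hat - moe, p_hat + moe


low, high = plus_four_interval(x=6, n=25, confidence=0.95)
print(round(low, 3), round(high, 3))  # 0.113 0.439
```

The printed bounds agree with the hand calculation to three decimal places.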
Your Turn!
Out of a random sample of 65 freshmen at State University, 31 students have declared a major. Use the “plus-four” method to find a 96% confidence interval for the true proportion of freshmen at State University who have declared a major.
Margin of error: how much a point estimate can be expected to differ from the true population value; made up of the standard error multiplied by the critical value.