6.2 Inference for the Mean in Practice
We have discussed how the standardized sample mean follows a normal distribution when the population standard deviation, σ, is known and a t-distribution when it is not. In practice, we rarely know the population standard deviation. For larger samples, we can typically get away with using z according to the CLT. In summary, we mostly opt to use t when we do not know σ and we have a small sample (n < 30).
Confidence Intervals for the Mean (σ Unknown)
The general format of a confidence interval is:
(PE – MoE, PE + MoE)
The population parameter is μ. The point estimate (PE) for μ is x̄, the sample mean.
If the population standard deviation is not known, the margin of error for a population mean is:
MoE = tα/2 (s ÷ √n)
- tα/2 is the t critical value with area to the right equal to α/2
- use df = n – 1 degrees of freedom
- s = sample standard deviation
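The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names are made up for this example, the summary statistics are hypothetical, and the critical value is typed in by hand from a t-table rather than computed.

```python
import math


def t_margin_of_error(s, n, t_crit):
    """Margin of error for a mean when sigma is unknown: t_crit * s / sqrt(n)."""
    return t_crit * s / math.sqrt(n)


def t_confidence_interval(x_bar, s, n, t_crit):
    """Return (PE - MoE, PE + MoE), using the sample mean as the point estimate."""
    moe = t_margin_of_error(s, n, t_crit)
    return x_bar - moe, x_bar + moe


# Hypothetical summary statistics (not from the text): n = 25, so df = 24.
# 2.064 is the t-table critical value for df = 24 with area 0.025 to the
# right, which corresponds to a 95% confidence level.
low, high = t_confidence_interval(x_bar=50.0, s=10.0, n=25, t_crit=2.064)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Passing the critical value in as an argument keeps the sketch free of any statistics library; statistical software would normally look it up for you.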
Example
The Federal Election Commission (FEC) collects information about campaign contributions and disbursements for candidates and political committees each election cycle. A political action committee (PAC) is a committee formed to raise money for candidates and campaigns. A Leadership PAC is a PAC formed by a federal politician (senator or representative) to raise money to help other candidates’ campaigns.[1]
The FEC has reported financial information for 556 Leadership PACs that operated during the 2011–2012 election cycle. The following table shows the total receipts during this cycle for a random selection of 30 Leadership PACs (in dollars).
| Receipt data | ||||
|---|---|---|---|---|
| $46,500.00 | $0 | $40,966.50 | $105,887.20 | $5,175.00 |
| $29,050.00 | $19,500.00 | $181,557.20 | $31,500.00 | $149,970.80 |
| $2,555,363.20 | $12,025.00 | $409,000.00 | $60,521.70 | $18,000.00 |
| $61,810.20 | $76,530.80 | $119,459.20 | $0 | $63,520.00 |
| $6,500.00 | $502,578.00 | $705,061.10 | $708,258.90 | $135,810.00 |
| $2,000.00 | $2,000.00 | $0 | $1,287,933.80 | $219,148.30 |
Figure 6.4: PAC receipt data
x̄ = $251,854.23
s = $521,130.41
Use this sample data to construct a 96% confidence interval for the mean amount of money raised by all Leadership PACs during the 2011–2012 election cycle. Use the Student’s t-distribution.
Note that we are not given the population standard deviation, only the standard deviation of the sample.
Solution
There are 30 measures in the sample, so n = 30, and df = 30 – 1 = 29.
CL = 0.96, so α = 1 – CL = 1 – 0.96 = 0.04.
α/2 = 0.02
tα/2 = t0.02 = 2.150
Margin of error = tα/2 (s ÷ √n) = 2.150 (521,130.41 ÷ √30) ≈ $204,561.66
x̄ – margin of error = 251,854.23 – 204,561.66 = $47,292.57
x̄ + margin of error = 251,854.23 + 204,561.66 = $456,415.89
We estimate with 96% confidence that the mean amount of money raised by all Leadership PACs during the 2011–2012 election cycle lies between $47,292.57 and $456,415.89.
Using technology, which carries the unrounded critical value, the 96% confidence interval is ($47,262, $456,447).
The difference between the two solutions arises from rounding the critical value to 2.150.
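Both intervals in this solution can be reproduced from the raw data. The sketch below uses only the Python standard library, so the t critical value is found numerically (Simpson's rule for the tail area, then bisection). The helper names `t_pdf`, `t_upper_tail`, `t_critical`, and `interval` are made up for this illustration; statistical software would normally supply the critical value directly.

```python
import math
import statistics

# Total receipts for the 30 sampled Leadership PACs (Figure 6.4), in dollars.
receipts = [
    46500.00, 0.00, 40966.50, 105887.20, 5175.00,
    29050.00, 19500.00, 181557.20, 31500.00, 149970.80,
    2555363.20, 12025.00, 409000.00, 60521.70, 18000.00,
    61810.20, 76530.80, 119459.20, 0.00, 63520.00,
    6500.00, 502578.00, 705061.10, 708258.90, 135810.00,
    2000.00, 2000.00, 0.00, 1287933.80, 219148.30,
]


def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, integrating the density on [0, t] with Simpson's rule."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(area_right, df):
    """Find t with the given area to its right by bisection on [0, 50]."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > area_right:
            lo = mid  # tail area still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2


n = len(receipts)
x_bar = statistics.mean(receipts)
s = statistics.stdev(receipts)      # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)


def interval(t_crit):
    """Confidence interval x_bar +/- t_crit * s / sqrt(n)."""
    moe = t_crit * se
    return x_bar - moe, x_bar + moe


t_exact = t_critical(0.02, n - 1)   # unrounded critical value for 96% confidence
table_interval = interval(2.150)    # critical value rounded as in a t-table
exact_interval = interval(t_exact)

print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.2f}")
print(f"table t = 2.150   -> ({table_interval[0]:.2f}, {table_interval[1]:.2f})")
print(f"numeric t = {t_exact:.4f} -> ({exact_interval[0]:.2f}, {exact_interval[1]:.2f})")
```

Because the standard error here is roughly $95,000, a change in the fourth decimal place of the critical value shifts each endpoint by tens of dollars, which is the size of the gap between the two intervals.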
Your Turn!
A random sample of statistics students was asked to estimate the total number of hours they spend watching television in an average week. The responses are recorded in Figure 6.5. Use this sample data to construct a 98% confidence interval for the mean number of hours statistics students will spend watching television in one week.
| TV data | ||||
|---|---|---|---|---|
| 0 | 3 | 1 | 20 | 9 |
| 5 | 10 | 1 | 10 | 4 |
| 14 | 2 | 4 | 4 | 5 |
Figure 6.5: Student TV data
Hypothesis Tests for the Mean (σ Unknown)
Remember, we will use the t-distribution when the population standard deviation is unknown and the distribution of the sample mean is approximately normal.
If we are testing a single population mean, and we decide to use t, the steps stay the same, but our test statistic will change slightly.
t = (x̄ – μ0) ÷ (s ÷ √n), where μ0 is the value of the population mean stated in the null hypothesis
You should have no problem using technology to find p-values associated with a t-test statistic. However, if you want to use your t-table, you’ll find it is somewhat limited in finding exact p-values. Despite that, you can still estimate a range of values for your p-value and then compare the range to your significance level.
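To illustrate how a t-table brackets a p-value, here is a minimal Python sketch for a right-tailed test. The table row holds the standard printed critical values for df = 9; the function names and the test statistic of 2.0 are hypothetical, chosen only for this illustration.

```python
# Upper-tail areas and the matching critical values for df = 9,
# as printed in a standard t-table.
T_TABLE_DF9 = [(0.10, 1.383), (0.05, 1.833), (0.025, 2.262), (0.01, 2.821), (0.005, 3.250)]


def bracket_right_tail_p(t_stat, row):
    """Return (lower, upper) bounds on P(T > t_stat) using one t-table row."""
    if t_stat < 0:
        raise ValueError("this sketch only handles a non-negative test statistic")
    lower, upper = 0.0, 0.5
    for area, crit in row:
        if t_stat > crit:
            upper = min(upper, area)   # beyond this critical value: p < area
        else:
            lower = max(lower, area)   # not beyond it: p >= area
    return lower, upper


def decide(bounds, alpha):
    """Compare the bracketed p-value with the significance level."""
    lower, upper = bounds
    if upper <= alpha:
        return "reject H0"
    if lower >= alpha:
        return "fail to reject H0"
    return "table is inconclusive"


# Hypothetical test statistic (not from the text).
bounds = bracket_right_tail_p(2.0, T_TABLE_DF9)
print(bounds, decide(bounds, alpha=0.05))
```

Since 2.0 falls between 1.833 and 2.262, the table only tells us that the p-value is between 0.025 and 0.05, but that is enough to compare it with α = 0.05.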
Example
Statistics students believe that the mean score on the first statistics test is 65. A statistics instructor thinks the mean score is higher than 65. He samples ten statistics students and obtains the scores below:
65, 65, 70, 67, 66, 63, 63, 68, 72, 71
Perform the hypothesis test using a 5% level of significance to test the instructor’s claim.
Solution
Set up the hypothesis test:
A 5% level of significance means that α = 0.05. This is a test of a single population mean.
H0: μ = 65
Ha: μ > 65
Since the instructor thinks the average score is higher, use a “>”. The “>” means the test is right-tailed.
Determine the distribution needed:
If you read the problem carefully, you will notice that there is no population standard deviation given. You are only given n = 10 sample data values. Assume also that the scores come from a normal distribution. This means that the distribution for the test is a Student’s t.
Use the t-distribution with df = n – 1 degrees of freedom. Here n = 10, so df = 10 – 1 = 9 and the distribution for the test is t₉.
Calculate the p-value using the Student’s t-distribution:
p-value = P(x̄ > 67) = 0.0396, where the sample mean and sample standard deviation are calculated from the data as 67 and 3.1972. The corresponding test statistic is t = (67 – 65) ÷ (3.1972 ÷ √10) ≈ 1.978.
Interpret the p-value:
If the null hypothesis is true, then there is a 0.0396 probability (3.96%) that the sample mean is 67 or more.
Compare α and the p-value:
Since α = 0.05 and p-value = 0.0396, α > p-value.
Make a decision:
Since α > p-value, reject H0.
This means you reject μ = 65. In other words, you believe the average test score is actually more than 65.
Conclusion:
At a 5% level of significance, the sample data show sufficient evidence that the mean (average) test score is more than 65, just as the statistics instructor thinks.
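The whole test can be checked with a short Python sketch. It is a standard-library illustration only: the right-tail probability is approximated by integrating the t density with Simpson's rule, and the helper names `t_pdf` and `right_tail_p` are made up for this example. Statistical software would normally report the p-value directly.

```python
import math
import statistics

scores = [65, 65, 70, 67, 66, 63, 63, 68, 72, 71]
mu_0 = 65        # mean under H0
alpha = 0.05     # significance level


def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def right_tail_p(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|] and the symmetry of t."""
    if t < 0:
        return 1 - right_tail_p(-t, df, steps)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


n = len(scores)
x_bar = statistics.mean(scores)
s = statistics.stdev(scores)                  # sample standard deviation
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))  # test statistic with df = n - 1
p_value = right_tail_p(t_stat, n - 1)         # right-tailed test: Ha is mu > 65

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"x_bar = {x_bar}, s = {s:.4f}, t = {t_stat:.4f}, p-value = {p_value:.4f}")
print(decision)
```

The printed values should agree with the solution above up to rounding.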
Your Turn!
It is believed that a stock price for a particular company will grow at a rate of $5 per week. An investor believes the stock won’t grow as quickly. Changes in the stock price are recorded for ten weeks and are as follows:
$4, $3, $2, $3, $1, $7, $2, $1, $1, $2
Perform a hypothesis test using a 5% level of significance. State the null and alternative hypotheses, find the p-value, state your conclusion, and identify the type I and type II errors.
Summary of Assumptions
When you perform inference on a single population mean μ using a Student’s t-distribution (often called a t-test), there are fundamental assumptions that need to be met in order for the test to work properly. Your data should be a simple random sample that comes from a population that is approximately normally distributed. You use the sample standard deviation to approximate the population standard deviation. (Note that if the sample size is sufficiently large, a t-test will work even if the population is not approximately normally distributed.)
When you perform a hypothesis test of a single population mean μ using a normal distribution (often called a z-test), you take a simple random sample from the population. The population you are testing is normally distributed, or your sample size is sufficiently large. You know the value of the population standard deviation, which is rarely known in reality.
Additional Resources
If you are using an offline version of this text, access the resources for this section via the QR code, or by visiting https://doi.org/10.7294/26207456.
[1] “Disclosure Data Catalog: Candidate Summary Report 2012.” U.S. Federal Election Commission. Available online at http://www.fec.gov/data/index.jsp (accessed July 2, 2013).
Confidence interval: An interval built around a point estimate for an unknown population parameter
Point estimate: The value that is calculated from a sample used to estimate an unknown population parameter
Margin of error: How much a point estimate can be expected to differ from the true population value; made up of the standard error multiplied by the critical value