5.5 Introduction to Hypothesis Tests
One job of a statistician is to make statistical inferences about populations based on samples taken from the population. Confidence intervals are one way to estimate a population parameter.
Another way to make a statistical inference is to make a decision about a parameter. For instance, a car dealership advertises that its new small truck gets 35 miles per gallon on average. A tutoring service claims that its method of tutoring helps 90% of its students get an A or a B. A company says that its female managers earn an average of $60,000 per year. A statistician may want to evaluate these claims and make a decision about them. A hypothesis test can be used to do this.
A hypothesis test involves collecting data from a sample and evaluating the data. Then the statistician makes a decision as to whether or not there is sufficient evidence to reject the null hypothesis based upon analyses of the data.
In this section, you will conduct hypothesis tests on single means when the population standard deviation is known.
Hypothesis testing consists of two contradictory hypotheses or statements, a decision based on the data, and a conclusion. To perform a hypothesis test, a statistician will perform some variation of these steps:
- Define hypotheses.
- Collect and/or use the sample data to determine the correct distribution to use.
- Calculate the test statistic.
- Make a decision.
- Write a conclusion.
Defining Your Hypotheses
The actual test begins by considering two hypotheses: the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints.
The null hypothesis (H0) is often a statement of the accepted historical value or norm. This is your starting point that you must assume from the beginning in order to show an effect exists.
The alternative hypothesis (Ha) is a claim about the population that contradicts H0. It is what we conclude when we reject H0.
Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.
After you have weighed the sample evidence, you make a decision. There are two options: “reject H0” if the sample information favors the alternative hypothesis, or “do not reject H0” (also phrased “decline to reject H0”) if the sample information is insufficient to reject the null hypothesis.
The following table shows mathematical symbols used in H0 and Ha:
| H0 | Ha |
|---|---|
| Equal (=) | Not equal (≠) or greater than (>) or less than (<) |
| Equal (=) | Less than (<) |
| Equal (=) | More than (>) |
Figure 5.12: Null and alternative hypotheses
NOTE: H0 always has a symbol with an equal in it. Ha never has a symbol with an equal in it. The choice of symbol in the alternative hypothesis depends on the wording of the hypothesis test. Despite this, many researchers may use =, ≤, or ≥ in the null hypothesis. This practice is acceptable because our only decision is to reject or not reject the null hypothesis.
Example
We want to test whether the mean GPA of students in American colleges is 2.0 (out of 4.0). The null hypothesis is: H0: μ = 2.0. What is the alternative hypothesis?
Solution
Ha: μ ≠ 2.0
Your Turn!
A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25%. State the null and alternative hypotheses.
Using the Sample to Test the Null Hypothesis
Once you have defined your hypotheses, the next step in the process is to collect sample data. In a classroom context, the data or summary statistics will usually be given to you.
Then you will have to determine the correct distribution to perform the hypothesis test, given the assumptions you are able to make about the situation. Right now, we are demonstrating these ideas in a test for a mean when the population standard deviation is known using the z distribution. We will see other scenarios in the future.
Calculating a Test Statistic
Next you will start evaluating the data. This begins with calculating your test statistic, which is a measure of the distance between what you observed and what you are assuming to be true. In this context, your test statistic, z0, quantifies the number of standard deviations between the sample mean, x̄, and the population mean, μ. Calculating the test statistic is analogous to the previously discussed process of standardizing observations with z-scores:

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

where μ0 is the value assumed to be true in the null hypothesis, σ is the known population standard deviation, and n is the sample size.
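The calculation described above can be sketched in Python. This is a minimal illustration, assuming the usual one-sample z statistic (sample mean minus hypothesized mean, divided by σ/√n). The function name `z_statistic` and the numbers in the usage lines are hypothetical and are not taken from the text.

```python
import math

def z_statistic(sample_mean, mu_0, sigma, n):
    """Return how many standard errors the sample mean lies from mu_0.

    sigma is the known population standard deviation and n is the sample size.
    """
    standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
    return (sample_mean - mu_0) / standard_error

# Hypothetical numbers, chosen only for illustration
z = z_statistic(sample_mean=52.0, mu_0=50.0, sigma=6.0, n=36)
print(round(z, 2))  # 2.0
```

A positive value means the sample mean lies above the hypothesized mean. A negative value means it lies below.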
Making a Decision
Once you have your test statistic, there are two methods to use it to make your decision:
- Critical value method (discussed further in later chapters)
- p-value method (our current focus)
p-Value Method
To find a p-value, we use the test statistic to calculate the probability of getting a result at least as extreme as the one observed. Formally, the p-value is the probability that, if the null hypothesis is true, the results from another randomly selected sample will be as extreme as or more extreme than the results obtained from the given sample.
A large p-value calculated from the data indicates that we should not reject the null hypothesis. The smaller the p-value, the more unlikely the outcome and the stronger the evidence is against the null hypothesis. We would reject the null hypothesis if the evidence is strongly against it.
Draw a graph that shows the p-value. The hypothesis test is easier to perform if you use a graph because you see the problem more clearly.
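One way to turn a z test statistic into a p-value is sketched below. It assumes the test statistic follows a standard normal distribution when the null hypothesis is true. The function names and the labels `"greater"`, `"less"`, and `"two-sided"` are illustrative choices, not notation from this text.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def p_value(z, alternative):
    """p-value for a z test statistic under the given alternative hypothesis."""
    if alternative == "greater":    # Ha uses >, so shade the right tail
        return upper_tail(z)
    if alternative == "less":       # Ha uses <, so shade the left tail
        return upper_tail(-z)
    if alternative == "two-sided":  # Ha uses ≠, so shade both tails
        return 2.0 * upper_tail(abs(z))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

print(p_value(0.0, "greater"))  # 0.5
```

The tail that is shaded in the graph is the same tail the function adds up, so the symbol in Ha determines which branch applies.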
Example
Suppose a baker claims that his bread height is more than 15 cm on average. Several of his customers do not believe him. To persuade his customers that he is right, the baker decides to do a hypothesis test. He bakes ten loaves of bread. The mean height of the sample loaves is 17 cm. The baker knows from baking hundreds of loaves of bread that the standard deviation for the height is 0.5 cm and the distribution of heights is normal.
The null hypothesis could be H0: μ ≤ 15.
The alternate hypothesis is Ha: μ > 15.
The words “is more than” call for the use of the > symbol, so “μ > 15” goes into the alternate hypothesis. The null hypothesis must contradict the alternate hypothesis.
Since σ is known (σ = 0.5 cm) and the heights are normally distributed, the distribution of the sample mean under the null hypothesis is normal with mean μ = 15 and standard deviation σ/√n = 0.5/√10 ≈ 0.16.
Suppose the null hypothesis is true (the mean height of the loaves is no more than 15 cm). Then, is the mean height (17 cm) calculated from the sample unexpectedly large? The hypothesis test works by asking how unlikely the sample mean would be if the null hypothesis were true. The graph shows how far out the sample mean is on the normal curve. The p-value is the probability that, if we were to take other samples, any other sample mean would fall at least as far out as 17 cm.
This means that the p-value is the probability that a sample mean is 17 cm or greater when the population mean is, in fact, 15 cm. We can calculate this probability using the normal distribution for means.
The p-value is P(x̄ > 17), which is approximately zero.
A p-value of approximately zero tells us that it is highly unlikely that the loaves rise no more than 15 cm on average. That is, almost 0% of samples of ten loaves would have a mean height of at least 17 cm purely by CHANCE had the population mean height really been 15 cm. Because the outcome of 17 cm is so unlikely (meaning it is NOT happening by chance alone), we conclude that the evidence is strongly against the null hypothesis that the mean height is at most 15 cm. There is sufficient evidence that the true mean height for the population of the baker’s loaves of bread is greater than 15 cm.
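The numbers in the baker example can be checked with a short Python sketch. It uses `statistics.NormalDist` from the standard library to model the distribution of the sample mean when μ = 15. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_0 = 15.0         # boundary value of the null hypothesis (cm)
sigma = 0.5         # known population standard deviation (cm)
n = 10              # number of loaves baked
sample_mean = 17.0  # observed mean height (cm)

# Standard deviation of the sample mean
standard_error = sigma / math.sqrt(n)

# Distribution of the sample mean if the population mean really is 15 cm
sampling_dist = NormalDist(mu=mu_0, sigma=standard_error)

z = (sample_mean - mu_0) / standard_error
p_value = 1.0 - sampling_dist.cdf(sample_mean)  # right tail, since Ha: mu > 15

print(f"standard error = {standard_error:.2f}")  # standard error = 0.16
print(f"z = {z:.2f}")                            # z = 12.65
print(f"p-value = {p_value:.4f}")                # p-value = 0.0000
```

A sample mean more than 12 standard errors above 15 cm is far out in the right tail, which is why the p-value rounds to zero.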
Your Turn!
A normal distribution has a standard deviation of one. We want to verify a claim that the mean is greater than 12. A sample of 36 is taken with a sample mean of 12.5.
Find the p-value.
Solution
H0: μ ≤ 12
Ha: μ > 12
The p-value is 0.0013.
Draw a graph that shows the p-value.
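As a check on the solution above, the p-value can be worked out by hand. This assumes the usual one-sample z statistic, with the boundary value μ0 = 12 taken from H0:

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{12.5 - 12}{1/\sqrt{36}} = \frac{0.5}{1/6} = 3$$

$$p\text{-value} = P(Z > 3) \approx 0.0013$$

The right tail is used because Ha is μ > 12.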
Decision and Conclusion
A systematic way to decide whether to reject or not reject the null hypothesis is to compare the p-value and a preset or preconceived α (also called a significance level). A preset α is the probability of a type I error (rejecting the null hypothesis when the null hypothesis is true). It may or may not be given to you at the beginning of the problem. If there is no given preconceived α, then use α = 0.05.
When you make a decision to reject or not reject H0, do as follows:
- If α > p-value, reject H0. The results of the sample data are statistically significant. You can say there is sufficient evidence to conclude that H0 is an incorrect belief and that the alternative hypothesis, Ha, may be correct.
- If α ≤ p-value, fail to reject H0. The results of the sample data are not significant. There is not sufficient evidence to conclude that the alternative hypothesis, Ha, may be correct.
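The decision rule above can be written as a small Python function. This is a sketch only. The name `decide` is hypothetical, the default α = 0.05 follows the convention stated earlier in this section, and the p-values in the usage lines are made up for illustration.

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value method: reject H0 only when alpha > p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if alpha > p_value:
        return "reject H0"
    return "do not reject H0"

# Illustrative p-values
print(decide(0.0013))            # reject H0
print(decide(0.20))              # do not reject H0
print(decide(0.03, alpha=0.01))  # do not reject H0
```

The last line shows that the same p-value can lead to different decisions under different significance levels, which is why α should be chosen before looking at the data.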
After you make your decision, write a thoughtful conclusion in the context of the scenario incorporating the hypotheses.
NOTE: When you “do not reject H0,” it does not mean that you should believe that H0 is true. It simply means that the sample data have failed to provide sufficient evidence to cast serious doubt on the truthfulness of H0.
Example
When using the p-value to evaluate a hypothesis test, the following rhymes can come in handy:
If the p-value is low, the null must go.
If the p-value is high, the null must fly.
This memory aid links a p-value less than the established alpha (“the p-value is low”) to rejecting the null hypothesis and, likewise, links a p-value higher than the established alpha (“the p-value is high”) to not rejecting the null hypothesis.
Fill in the blanks:
- Reject the null hypothesis when ______.
- The results of the sample data ______.
- Do not reject the null hypothesis when ______.
- The results of the sample data ______.
Solution
- Reject the null hypothesis when the p-value is less than the established alpha value.
- The results of the sample data support the alternative hypothesis.
- Do not reject the null hypothesis when the p-value is greater than the established alpha value.
- The results of the sample data do not support the alternative hypothesis.
Your Turn!
It’s a Boy Genetics Labs claims that its procedures improve the chances of a boy being born. The results for a test of a single population proportion are as follows:
- H0: p = 0.50, Ha: p > 0.50
- α = 0.01
- p-value = 0.025
Interpret the results and state a conclusion in simple, non-technical terms.
Additional Resources
If you are using an offline version of this text, access the resources for this section via the QR code, or by visiting https://doi.org/10.7294/26207456.
Figure References
Figure 5.11: Alora Griffiths (2019). dalmatian puppy near man in blue shorts kneeling. Unsplash license. https://unsplash.com/photos/7aRQZtLsvqw
Figure 5.13: Kindred Grey (2020). Bread height probability. CC BY-SA 4.0.
Figure Descriptions
Figure 5.11: Dalmatian puppy near man sitting on the floor.
Figure 5.13: Normal distribution curve on average bread heights with values 15, as the population mean, and 17, as the point to determine the p-value, on the x-axis.
Hypothesis test: A decision-making procedure for determining whether sample evidence supports a hypothesis
Null hypothesis: The claim that is assumed to be true and is tested in a hypothesis test
Alternative hypothesis: A working hypothesis that is contradictory to the null hypothesis
Test statistic: A measure of the difference between observations and the hypothesized (or claimed) value
p-value: The probability that an event will occur, assuming the null hypothesis is true
Significance level: Probability that a true null hypothesis will be rejected, also known as type I error and denoted by α
Statistical significance: Finding sufficient evidence that the observed effect is not just due to variability, often from rejecting the null hypothesis