3.2 Visualizing Bivariate Quantitative Data

Bivariate Quantitative Data

When we are looking at bivariate data, we first need to decide if changing one variable seems to lead to a change in the other. A response variable (also called y, dependent variable, and predicted variable) measures or records an outcome of a study. An explanatory variable (also called x, independent variable, and predictor variable) explains changes in the response variable.

In the rest of this chapter, we will be studying “simple linear regression.” Note that this does not imply that these ideas are “simple,” but just that we are working with one independent variable (x) and a linear relationship. This involves data that fit a line in two dimensions.

When considering the relationship between two quantitative variables:

  1. Start with a graph (scatter plot).
  2. Look for an overall pattern and deviations from the pattern.
  3. Use numerical descriptions of the data and overall pattern (correlation, coefficient of determination).
  4. Consider a mathematical model (regression).

Scatter Plots

Before we discuss linear regression and correlation, we need to examine a way to display the relationship between the variables x and y. The most common and easiest way is a scatter plot, which plots each (x, y) pair as a single point. A scatter plot reveals a lot about the relationship between the variables. When you look at a scatter plot, you want to notice the overall pattern and any deviations from that pattern. You can judge the strength of the relationship by seeing how closely the points follow the overall pattern. When looking at a scatter plot, you always want to note:

  • Shape
  • Trend
  • Strength

The following scatter plot examples illustrate these concepts.

Figure 3.8: Scatter plot configurations. Figure description available at the end of the section.
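To make the idea of a scatter plot concrete, here is a minimal sketch that places each (x, y) pair on a character grid using only the Python standard library. The function name `text_scatter`, its parameters, and the data are hypothetical and chosen only for illustration; in practice you would more likely use a plotting library or a spreadsheet.

```python
def text_scatter(xs, ys, width=40, height=12):
    """Return a rough text scatter plot of the (x, y) pairs as one string."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid dividing by zero if every x (or every y) value is the same.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1

    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column (x) and a row (y) on the grid.
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        # The first grid line is printed at the top, so flip the row
        # index to make larger y values appear higher on the plot.
        grid[height - 1 - row][col] = "*"

    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


# Hypothetical data, invented for this sketch (not from the text).
hours_studied = [1, 2, 3, 4, 5, 6, 7, 8]
exam_score = [52, 55, 61, 64, 70, 71, 78, 83]

plot = text_scatter(hours_studied, exam_score)
print(plot)
```

With data like these, the points should rise from the lower left to the upper right, which is the kind of overall pattern to look for before describing shape, trend, and strength.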

Shape

Although we may see other shapes in a scatter plot, we are currently only interested in applying these ideas when we see a linear pattern. Linear patterns are quite common. The linear relationship is strong if the points are close to a straight line, except in the case of a horizontal line, which indicates no relationship. If we think that the points show a linear relationship, we draw a line on the scatter plot. Later, we will learn to calculate this line through a process called linear regression. However, we only calculate a regression line if one of the variables helps to explain or predict the other variable.

Trend

If we do see a linear pattern, what sort of relationship is there? A positive trend is seen when y tends to increase as x increases. On the other hand, a negative (inverse) trend is seen when y tends to decrease as x increases. In other words:

  • Positive trend: high values of one variable occur with high values of the other variable, and low values of one variable occur with low values of the other variable.
  • Negative trend: high values of one variable occur with low values of the other variable.
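The "high with high, low with low" idea can be checked numerically. The sketch below is one possible illustration, not a method given in this section: it compares each value to its variable's mean and adds up the products of those deviations. When high x values pair with high y values (and low with low), the products are mostly positive; when high pairs with low, they are mostly negative. The function name `trend_direction` and the data are hypothetical.

```python
from statistics import mean


def trend_direction(xs, ys):
    """Classify the trend as 'positive', 'negative', or 'none'."""
    x_bar, y_bar = mean(xs), mean(ys)
    # A point above both means (or below both) contributes a positive
    # product; a point above one mean and below the other contributes
    # a negative product.
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "none"


# Hypothetical data, invented for this sketch.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 7, 9]
falling = [9, 7, 6, 4, 1]

print(trend_direction(xs, rising))
print(trend_direction(xs, falling))
```

This only reports direction. It says nothing about whether the pattern is linear or how strong it is, so it does not replace looking at the scatter plot.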

Strength

At this point, we can think about the strength of a relationship by asking how tightly the points on a scatter plot fit the linear pattern. In a stronger relationship, the points cluster closely around the line, while in a weaker one, the points are more spread out. The strength of a relationship is not always apparent in a scatter plot, but we will see it measured numerically later.
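As a preview of the numerical measure mentioned above, the following sketch computes the Pearson correlation coefficient r, which is covered later. Values of r near 1 or -1 indicate points that fit a line tightly, and values near 0 indicate a weak linear pattern. The function name `pearson_r` and both data sets are hypothetical, and the formula used is the standard definition of r rather than anything derived in this section.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of the paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for this sketch.
xs = [1, 2, 3, 4, 5, 6]
tight = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]  # points close to a line
loose = [5, 1, 9, 4, 12, 6]               # points widely scattered

r_tight = pearson_r(xs, tight)
r_loose = pearson_r(xs, loose)
print(round(r_tight, 3), round(r_loose, 3))
```

Both data sets trend upward, but the tightly clustered one should give an r much closer to 1 than the scattered one.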

Example

Does the scatter plot appear linear? Strong or weak? Positive or negative?

Figure 3.9: Scatter plot 1. Figure description available at the end of the section.

Solution

The data appear to be linear with a strong, positive correlation.

Does the scatter plot appear linear? Strong or weak? Positive or negative?

Figure 3.10: Scatter plot 2. Figure description available at the end of the section.

Solution

The data appear to be linear with a weak, negative correlation.

Does the scatter plot appear linear? Strong or weak? Positive or negative?

Figure 3.11: Scatter plot 3. Figure description available at the end of the section.

Solution

The data appear to have no correlation.

Your Turn!

Amelia plays basketball for her high school. She wants to improve to play at the college level. She notices that the number of points she scores in a game seems to go up in response to the number of hours she practices her jump shot each week. She records the following data:

Hours practicing jump shot (x)    Points scored in a game (y)
5                                 15
7                                 22
9                                 28
10                                31
11                                33
12                                36

Figure 3.12: Amelia’s points

Construct a scatter plot, and state whether Amelia’s hypothesis appears to be true.

Figure References

Figure 3.8: Kindred Grey (2020). Scatter plot configurations. CC BY-SA 4.0. Adaptation of Figures 12.6, 12.7, and 12.8 from OpenStax Introductory Statistics (2013) (CC BY 4.0). Retrieved from https://openstax.org/books/introductory-statistics/pages/12-2-scatter-plots

Figure 3.9: Kindred Grey (2020). Scatter plot 1. CC BY-SA 4.0. Adaptation of Figure 12.26 from OpenStax Introductory Statistics (2013) (CC BY 4.0). Retrieved from https://openstax.org/books/introductory-statistics/pages/12-practice

Figure 3.10: Kindred Grey (2020). Scatter plot 2. CC BY-SA 4.0. Adaptation of Figure 12.27 from OpenStax Introductory Statistics (2013) (CC BY 4.0). Retrieved from https://openstax.org/books/introductory-statistics/pages/12-practice

Figure 3.11: Kindred Grey (2020). Scatter plot 3. CC BY-SA 4.0. Adaptation of Figure 12.28 from OpenStax Introductory Statistics (2013) (CC BY 4.0). Retrieved from https://openstax.org/books/introductory-statistics/pages/12-practice

Figure Descriptions

Figure 3.8: Six scatterplots showing different patterns. First: positive linear pattern (strong)—shows dots in an almost perfect line from bottom left of graph to top right. Second: linear pattern with one deviation—shows the same pattern as first scatterplot with one outlier in the top left corner. Third: negative linear pattern (strong)—shows dots in an almost perfect line from top left to bottom right of graph. Fourth: negative linear pattern (weak)—shows dots from top left to bottom right of graph nowhere near a perfect line, but not completely random. Fifth: exponential growth pattern—shows a few dots on the x axis from left to right in a horizontal line and then gradually the dots move upwards towards the top right corner creating an upwards curve. Sixth: no pattern—random dots all over the graph.

Figure 3.9: Scatterplot with several points plotted in the first quadrant. The points form a clear pattern, moving upward to the right. The points do not line up exactly, but the overall pattern can be modeled with a line.

Figure 3.10: Scatterplot with several points plotted in the first quadrant. The points move downward to the right. The overall pattern can be modeled with a line, but the points are widely scattered.

Figure 3.11: Scatter plot with several points plotted all over the first quadrant. There is no pattern.

License

Significant Statistics: An Introduction to Statistics Copyright © 2024 by John Morgan Russell is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.
