# Lesson 11

Comparing and Contrasting Data Distributions

• Let’s investigate variability using data displays and summary statistics.

### 11.1: Math Talk: Mean

Evaluate the mean of each data set mentally.

27, 30, 33

61, 71, 81, 91, 101

0, 100, 100, 100, 100

0, 5, 6, 7, 12

### 11.2: Describing Data Distributions

1. Your teacher will give you a set of cards. Take turns with your partner to match a data display with a written statement.
    1. For each match that you find, explain to your partner how you know it’s a match.
    2. For each match that your partner finds, listen carefully to their explanation. If you disagree, discuss your thinking and work to reach an agreement.
2. After matching, determine whether the mean or the median is more appropriate for describing the center of each data set, based on the shape of the distribution. Discuss your reasoning with your partner. If the appropriate measure of center is not given, calculate it (if possible) or estimate it. Be prepared to explain your reasoning.
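One way to see why the shape of a distribution matters when choosing a measure of center is to compare the mean and median of a roughly symmetric data set with those of a skewed one. The sketch below uses two hypothetical data sets invented for illustration (they are not taken from the cards), and the names `symmetric` and `skewed` are assumptions.

```python
from statistics import mean, median

# Hypothetical data sets, invented for illustration only.
symmetric = [4, 5, 5, 6, 6, 6, 7, 7, 8]
# Same values, except the largest one is replaced by an extreme value.
skewed = [4, 5, 5, 6, 6, 6, 7, 7, 35]

for name, data in [("symmetric", symmetric), ("skewed", skewed)]:
    # Print both measures of center side by side for comparison.
    print(name, "mean:", mean(data), "median:", median(data))
```

In this example the single extreme value moves the mean from 6 to 9, while the median stays at 6. That is the usual reason the median is preferred for skewed distributions or distributions with extreme values.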

### 11.3: Visual Variability and Statistics

Each box plot summarizes the number of miles driven each day for 30 days in each month. The box plots represent, in order, the months of August, September, October, November, and December.

1. The five box plots have the same median. Explain why the median is more appropriate for describing the center of the data set than the mean for these distributions.
2. Arrange the box plots in order of least variability to greatest variability. Check with another group to see if they agree.
3. The five dot plots have the same mean. Explain why the mean is more appropriate for describing the center of the data set than the median.
4. Arrange the dot plots in order of least variability to greatest variability. Check with another group to see if they agree.

1. These two box plots have the same median and the same IQR. How could we compare the variability of the two distributions?

2. These two dot plots have the same mean and the same MAD. How could we compare the variability of the two distributions?

### Summary

The mean absolute deviation, or MAD, is a measure of variability that is calculated by finding the mean of the distances between each data point and the mean of the data set. Here are two dot plots, each with a mean of 15 centimeters, displaying the lengths of sea scallop shells in centimeters.

Notice that both dot plots show a symmetric distribution, so the mean and the MAD are appropriate choices for describing center and variability. The data in the first dot plot appear to be more spread apart than the data in the second dot plot, so you can say that the first data set appears to have greater variability than the second data set. This is confirmed by the MAD. The MAD of the first data set is 1.18 centimeters and the MAD of the second data set is approximately 0.94 centimeters. This means that the values in the first data set are, on average, about 1.18 centimeters away from the mean and the values in the second data set are, on average, about 0.94 centimeters away from the mean. The greater the MAD of the data, the greater the variability of the data.
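The MAD calculation described above can be sketched in a few lines of Python. The function name `mean_absolute_deviation` and the two data sets are hypothetical. The values were chosen so that both sets have a mean of 15, but they are not the scallop measurements shown in the dot plots.

```python
from statistics import mean

def mean_absolute_deviation(data):
    """Mean of the distances between each value and the mean of the data."""
    center = mean(data)
    # Distance of each value from the mean (always non-negative).
    distances = [abs(x - center) for x in data]
    return mean(distances)

# Hypothetical shell lengths in centimeters; both sets have a mean of 15.
wide = [12, 13, 14, 15, 15, 16, 17, 18]
narrow = [14, 14, 15, 15, 15, 15, 16, 16]

print("wide:", mean(wide), mean_absolute_deviation(wide))
print("narrow:", mean(narrow), mean_absolute_deviation(narrow))
```

For these made-up values the MAD is 1.5 for `wide` and 0.5 for `narrow`, which matches the visual impression that the first set is more spread out around the same mean.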

The interquartile range, or IQR, is a measure of variability that is calculated by subtracting the value for the first quartile, Q1, from the value for the third quartile, Q3. These two box plots represent the distributions of the lengths in centimeters of a different group of sea scallop shells, each with a median of 15 centimeters.
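As a rough illustration of the subtraction Q3 − Q1, here is a minimal Python sketch. It assumes one common classroom convention for quartiles: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the middle value left out of both halves when the number of values is odd. Other conventions exist and can give slightly different results. The function names and the data are hypothetical and are not the values behind the box plots.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: for an odd number of values, the middle value is excluded
    from both halves. Requires at least two values.
    """
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]    # values below the median position
    upper = values[-half:]   # values above the median position
    return median(lower), median(upper)

def interquartile_range(data):
    """IQR = Q3 - Q1."""
    q1, q3 = quartiles(data)
    return q3 - q1

# Hypothetical shell lengths in centimeters with a median of 15.
lengths = [12, 13, 14, 14, 15, 15, 16, 17, 18]
print(quartiles(lengths), interquartile_range(lengths))
```

With these made-up lengths, Q1 is 13.5 and Q3 is 16.5, so the IQR is 3 centimeters: the middle half of the data spans 3 centimeters.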