Lesson 4

The Shape of Distributions

4.1: Which One Doesn’t Belong: Distribution Shape (10 minutes)

Warm-up

The mathematical purpose of this warm-up is to collect informal terminology students may use to describe shapes of distributions, as well as any ways to describe distributions they may remember from work in earlier grades. This warm-up prompts students to compare four distributions. It gives students a reason to begin using language precisely (MP6) and gives you the opportunity to hear how they use terminology and talk about characteristics of the items in comparison to one another. Listen for students who use the statistically correct vocabulary as well as those who use informal language to describe the shapes.

Launch

Arrange students in groups of 2–4. Display the distributions for all to see. Give students 1 minute of quiet think time and then time to share their thinking with their small group. Ask each student to share with their group their reasoning for why a particular item does not belong, and have the group work together to find at least one reason each item doesn't belong.

Student Facing

Which one doesn’t belong?

A.

Dot plot from negative 21 to negative 9 by 1’s. Beginning at negative 21, number of dots above each increment is 0, 1, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 0.

B.

Dot plot from negative 21 to negative 9 by 1's. Beginning at negative 21, number of dots above each increment is 0, 6, 4, 3, 2, 1, 1, 2, 3, 4, 6, 0, 0.
 

C.

Dot plot from negative 21 to negative 8 by 1’s. Beginning at negative 21, number of dots above each increment is 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0.

D.

Dot plot from negative 21 to negative 9 by 1’s. Beginning at negative 21, number of dots above each increment is 0, 1, 1, 1, 1, 2, 3, 4, 6, 7, 4, 3, 0.

Student Response

Student responses to this activity are available at one of our IM Certified Partners

Activity Synthesis

Select the identified groups to share their reason why a particular item does not belong so that those with informal language speak first and those with more precise terminology follow up. Ensure that each group shares one reason why a particular item does not belong. Record and display the responses for all to see. After each response, ask the class if they agree or disagree. Since there is no single correct answer to the question of which one does not belong, attend to students’ explanations and ensure the reasons given are correct. During the discussion, recast any informal language that is used to describe the shape of each distribution. Introduce and define the terms symmetric, skewed, uniform, bimodal, and bell-shaped. It is important to note that the bell-shaped distribution is also symmetric.

4.2: Matching Distributions (15 minutes)

Activity

The mathematical purpose of this activity is to give students a chance to practice finding data displays that represent the distribution of the same data set and using precise vocabulary for describing the shape of the distributions while taking turns matching cards. Students trade roles explaining their thinking and listening, providing opportunities to explain their reasoning and critique the reasoning of others (MP3).

Launch

Arrange students in groups of 2. Display the images of the dot plot and histogram. Ask students what they notice and wonder.

A dot plot.
A histogram.

 

If it does not come up, help students notice that these two data displays show the same data in different formats. The distribution they show can also be described as skewed to the right.

Give each group a set of cut-up cards. Explain that a match is two different displays that represent the distribution of the same set of data. Ask students to take turns: the first partner identifies a match, explains why they think it is a match, then describes the distribution while the other student listens and works to understand. Then they switch roles.

Student Facing

Take turns with your partner matching 2 different data displays that represent the distribution of the same set of data.

  1. For each set that you find, explain to your partner how you know it’s a match.
  2. For each set that your partner finds, listen carefully to their explanation. If you disagree, discuss your thinking and work to reach an agreement.
  3. When finished with all ten matches, describe the shape of each distribution.

Student Response

Student responses to this activity are available at one of our IM Certified Partners

Anticipated Misconceptions

For students having trouble with the uniform distribution histograms, remind them that the lower bound for each interval is included and the upper bound is not. Ask them why this might change the last bar in each of these histograms. Some students may not know where to start to match data displays. You can tell them to look at the lowest and highest values as a starting point to finding similarities between two representations. 
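The effect of including the lower bound but not the upper bound can be checked with a short sketch. The data set here (two of each whole number from 7 through 18) and the helper name `histogram_counts` are assumptions chosen for illustration; they are not necessarily the data on the cards.

```python
def histogram_counts(values, start, width, n_bins):
    """Count values in intervals that include the lower bound
    and exclude the upper bound: [lower, upper)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Illustrative data (an assumption): two of each whole number from 7 through 18.
values = [v for v in range(7, 19) for _ in range(2)]

# Intervals 7-9, 9-11, ..., 17-19: every bar has the same height.
bins_from_7 = histogram_counts(values, start=7, width=2, n_bins=6)

# Intervals 6-8, 8-10, ..., 18-20: the first and last bars are shorter,
# because 7 is alone in the first interval and 18 is alone in the last.
bins_from_6 = histogram_counts(values, start=6, width=2, n_bins=7)

print(bins_from_7)
print(bins_from_6)
```

The same uniform data can look less rectangular in a histogram when the interval boundaries split the lowest or highest values away from their neighbors.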

Activity Synthesis

Once all groups have completed the matching, discuss the following:

  • “Which matches were tricky? Explain why.” (The uniform distributions may be difficult.)
  • “What vocabulary was useful to describe the shape of the distribution?” (symmetric, skewed, uniform, bimodal, bell-shaped)
  • “Were there any matches that could be described by more than one of these vocabulary terms?” (Yes, symmetric or skewed can also be used with some of the other terms for some of the distributions.)

If necessary, ask students to revoice less formal descriptions of the shape of the distribution using formal language including:

  • Symmetric distribution
  • Skewed distribution
  • Uniform distribution
  • Bimodal distribution
  • Bell-shaped distribution

Speaking, Listening: MLR 7 Compare and Connect. Ask students to prepare a visual display of their sorted cards. As students investigate each other’s work, ask students to share what worked well in a particular approach. Listen for and amplify any comments about the use of the words symmetric, skewed, bimodal, bell-shaped, and uniform to compare the two different displays. Then encourage students to make connections between the various ways to describe the different distributions. Revoice language students use to explain that the two different displays are equivalent. This will foster students’ meta-awareness and support constructive conversations as they compare and connect data displays that represent the same distribution.
Design Principle(s): Cultivate conversation; Maximize meta-awareness
Action and Expression: Internalize Executive Functions. Provide students with a graphic organizer with the phrase “shapes of distributions” at the center, connecting to it all the related concepts mentioned during the discussion. 
Supports accessibility for: Language; Organization

4.3: Where Did The Distribution Come From? (10 minutes)

Activity

The mathematical purpose of this activity is to remind students of the importance of context to statistics. Although some analysis can be done outside of a context, it is often useful to think about the real situations in which the data was collected to engage student intuition and understanding.

Launch

Keep students in the same groups. Assign each pair of students one of the completed matches from the card sort activity. Tell students there are many possible answers for each representation. After 2 minutes of quiet work time, ask students to compare their responses to their partner’s and decide if they are both reasonable. You may need to demonstrate this activity before beginning if you think students may have trouble getting started. After each group finishes with their assigned distribution, assign the group another distribution to consider.

Conversing, Reading: MLR 2 Collect and Display. While students are working, circulate and listen to student talk when they share their educated guesses of the survey question that produced the data for their assigned matches. Write down common or important phrases you hear students say about each representation, such as “line of symmetry,” “bell-shaped,” “skewed,” “bimodal,” “uniform,” etc. Write the students’ words on a visual display of the data displays. This will help students read and use mathematical language during the whole-class discussion.
Design Principle(s): Support sense-making
Action and Expression: Develop Expression and Communication. To help get students started, display sentence frames such as “A possible survey question that produced this data is _____ because _____.”
Supports accessibility for: Language; Organization

Student Facing

Your teacher will assign you some of the matched distributions. Using the information provided in the data displays, make an educated guess about the survey question that produced this data. Be prepared to share your reasoning.

Student Response

Student responses to this activity are available at one of our IM Certified Partners

Student Facing

Are you ready for more?

This distribution shows the length in inches of fish caught and released from a nearby lake.

Histogram. Length of fish in inches. Height of each interval bar as follows: 5, 10, 15, 25, 20, 15, 5, 2, 0, 1, 5, 10, 20, 28, 15, 10, 5, 2, 2.
  1. Describe the shape of the distribution.

  2. Make an educated guess about what could cause the distribution to have this shape.

 

Student Response

Student responses to this activity are available at one of our IM Certified Partners

Activity Synthesis

Ask each group to share their response for at least one of the distributions they were assigned. After each group shares, ask the class if their context is reasonable. Here are some questions for discussion:

  • “How did you use the shape of the data to come up with your question?” (Since the data was bell-shaped, I tried to think of situations where most of the data would be similar with a few points a little away from these values.)
  • “Would you always expect your question to result in a [symmetric, skewed, bell-shaped, etc.] distribution?” (Not necessarily, but for most cases it would.)

Reveal the actual survey question that produced the distribution. Actual questions by row:

  1. How many points did Kiran score in each of his 22 games this season?
  2. What were typical low temperatures in a Siberian town during January?
  3. On a scale of 1–8, how was the service at the restaurant?
  4. How many questions did people get correct on the vocabulary test the first week of school?
  5. How many questions did people get correct on the vocabulary test the second week of school?
  6. How many feet below the surface were each of the core samples taken?
  7. How many trees are in my backyard at various temperatures?
  8. What was the sum when you spun a spinner labeled 0 to 5 twice?
  9. What was the weight of the crystal you grew in chemistry class?
  10. How many questions did students get correct on a 10-item matching test?

Ask students to share what they have learned about the distribution now that they can think of the data in a real situation.

Lesson Synthesis

In this lesson students describe the shape of distributions using formal language and invent contexts for distributions with different shapes. Here are some questions for discussion.

  • “What does a symmetric data set look like?” (It will have a line of symmetry in the middle and the left side will look like a reflection of the right side.)
  • “What does it mean to say that the shape of a distribution is uniform?” (There will be an equal number of each data value and the shape will look rectangular.)
  • “Have you heard of a bell curve before? How does this relate to a bell-shaped distribution?” (Yes. I have heard of it in science class where a bell curve was used to compare data in an experiment.)
  • “What is an example of a context where you would expect to find a bimodal distribution?” (You might find it if you measured the weight of a herd of cows in the springtime. The adult cows would be one peak and the calves would be the other peak.)
  • “Can a skewed distribution also be symmetric? Why or why not?” (No, because skewed means that one side of the peak of the data has more data values further away from the peak than the other side. There is no line of symmetry.)

4.4: Cool-down - Distribution Types (5 minutes)

Cool-Down

Cool-downs for this lesson are available at one of our IM Certified Partners

Student Lesson Summary

Student Facing

We can describe the shape of distributions as symmetric, skewed, bell-shaped, bimodal, or uniform. Here is a dot plot, histogram, and box plot representing the distribution of the same data set. This data set has a symmetric distribution.

Dot plot from 6 to 18 by 1’s. Beginning at 6, number of dots above each increment is 0, 1, 1, 2, 3, 5, 8, 5, 3, 2, 1, 1, 0.
Histogram from 6 to 18 by 2’s. Beginning at 6 up to but not including 8, height of bar at each interval is 1, 3, 8, 13, 5, 2.
Box plot from 6 to 18 by 1’s. Whisker from 7 to 11. Box from 11 to 13 with vertical line at 12. Whisker from 13 to 17.

In a symmetric distribution, the mean is equal to the median and there is a vertical line of symmetry in the center of the data display. The histogram and the box plot both group data together. Since histograms and box plots do not display each data value individually, they do not provide information about the shape of the distribution to the same level of detail that a dot plot does. This distribution, in particular, can also be called bell-shaped. A bell-shaped distribution has a dot plot that takes the form of a bell with most of the data clustered near the center and fewer points farther from the center. This makes the measure of center a very good description of the data as a whole. Bell-shaped distributions are always symmetric or close to it.
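As a quick check of the claim that the mean equals the median in a symmetric distribution, here is a minimal Python sketch. It rebuilds the data from the dot counts listed for the dot plot above; the variable names are illustrative assumptions.

```python
from statistics import mean, median

# Dot counts read from the dot plot description (value: number of dots).
dot_counts = {7: 1, 8: 1, 9: 2, 10: 3, 11: 5, 12: 8,
              13: 5, 14: 3, 15: 2, 16: 1, 17: 1}

# Expand the counts into the individual data values.
values = [v for v, count in dot_counts.items() for _ in range(count)]

symmetric_mean = mean(values)
symmetric_median = median(values)

# Line of symmetry at 12: the count d units to the left of 12
# matches the count d units to the right of 12.
center = 12
is_mirror = all(
    dot_counts.get(center - d, 0) == dot_counts.get(center + d, 0)
    for d in range(1, 6)
)

print(symmetric_mean, symmetric_median, is_mirror)
```

Both measures of center land on the line of symmetry at 12 for this data set.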

Here is a dot plot, histogram, and box plot representing a skewed distribution.

Dot plot from 8 to 18 by 1’s. Beginning at 8, number of dots above each increment is 6, 5, 4, 3, 2, 2, 1, 1, 1, 0, 0.
Histogram from 8 to 18 by 2’s. Beginning at 8 up to but not including 10, height of bar at each interval is 11, 7, 4, 2, 1.
Box plot from 8 to 18 by 1’s. Whisker from 8 to 8.5. Box from 8.5 to 12 with vertical line at 10. Whisker from 12 to 16.

In a skewed distribution, one side of the distribution has more values farther from the bulk of the data than the other side. This results in the mean and median not being equal. In this skewed distribution, the data is skewed to the right because most of the data is near the 8 to 10 interval, but there are many points to the right. The large data values to the right cause the mean to shift in that direction while the median remains with the bulk of the data, so the mean is greater than the median for distributions that are skewed to the right. In a data set that is skewed to the left, a similar effect happens but to the other side, so the mean is less than the median. Again, the dot plot provides a greater level of detail about the shape of the distribution than either the histogram or the box plot.
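The pull of the tail on the mean can be seen by computing both measures for the skewed data set shown above. This is a minimal sketch that rebuilds the data from the dot counts; the mirrored data set is an assumption introduced only to illustrate skew to the left.

```python
from statistics import mean, median

# Dot counts read from the skewed dot plot description (value: number of dots).
dot_counts = {8: 6, 9: 5, 10: 4, 11: 3, 12: 2, 13: 2, 14: 1, 15: 1, 16: 1}
right_skewed = [v for v, count in dot_counts.items() for _ in range(count)]

right_mean = mean(right_skewed)      # pulled toward the long right tail
right_median = median(right_skewed)  # stays with the bulk of the data

# Hypothetical left-skewed data: reflect every value across 12,
# so the long tail now points to the left.
left_skewed = [24 - v for v in right_skewed]
left_mean = mean(left_skewed)
left_median = median(left_skewed)

print(right_mean, right_median)  # mean is greater than median
print(left_mean, left_median)    # mean is less than median
```

For the right-skewed data the mean is 10.44 and the median is 10; reflecting the data reverses the relationship.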

A uniform distribution has the data values evenly distributed throughout the range of the data. This causes the distribution to look like a rectangle.

Dot plot from 7 to 19 by 1’s. Beginning at 7, number of dots above each increment is 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0.
Histogram from 7 to 19 by 2’s. Beginning at 7 up to but not including 9, height of bar at each interval is 4.
Box plot from 7 to 19 by 1’s. Whisker from 7 to 9.5. Box from 9.5 to 15.5 with vertical line at 12.5. Whisker from 15.5 to 18.

In a uniform distribution the mean is equal to the median since a uniform distribution is also a symmetric distribution. The box plot does not provide enough information to describe the shape of the distribution as uniform, though the even length of each quarter does suggest that the distribution may be approximately symmetric.
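To see why a box plot alone cannot establish that a distribution is uniform, the sketch below compares the uniform data above with a second, clumped data set that produces the same box plot. The clumped data set and the helper `five_number_summary` are assumptions made for illustration, and quartiles are computed as the medians of the lower and upper halves of the data.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), taking Q1 and Q3 as the
    medians of the lower and upper halves of the sorted data."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Uniform data from the dot plot: two of each whole number from 7 through 18.
uniform = [v for v in range(7, 19) for _ in range(2)]

# Hypothetical clumped data with piles at 9, 10, 15, and 16.
clumped = ([7] + [9] * 5 + [10] * 5 + [12]
           + [13] + [15] * 5 + [16] * 5 + [18])

uniform_summary = five_number_summary(uniform)
clumped_summary = five_number_summary(clumped)

print(uniform_summary)
print(clumped_summary)
```

Both data sets give the summary 7, 9.5, 12.5, 15.5, 18, so their box plots are identical even though only one of the dot plots looks rectangular.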

A bimodal distribution has two very common data values seen in a dot plot or histogram as distinct peaks.

Dot plot from 6 to 20 by 1’s. Beginning at 6, number of dots above each increment is 0, 5, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 4, 1, 0.
Histogram from 6 to 20 by 2’s. Beginning at 6 up to but not including 8, height of bar at each interval is 5, 4, 2, 2, 2, 4, 5.
Box plot from 6 to 20 by 1’s. Whisker from 7 to 8. Box from 8 to 17 with vertical line at 12.5. Whisker from 17 to 19.

Sometimes, a bimodal distribution has very little of the data near the middle of the distribution because most of it is clustered around the two peaks. In these cases the center of the distribution does not describe the data very well. Bimodal distributions are not always symmetric. For example, the peaks may not be equally spaced from the middle of the distribution or other data values may disrupt the symmetry.
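A short sketch makes it concrete how poorly the center can describe bimodal data. It rebuilds the data from the dot counts in the bimodal dot plot above; the variable names and the choice to count values within 1 unit of the median are illustrative assumptions.

```python
from collections import Counter
from statistics import mean, median

# Dot counts read from the bimodal dot plot description (value: number of dots).
dot_counts = {7: 5, 8: 2, 9: 2, 10: 1, 11: 1, 12: 1, 13: 1,
              14: 1, 15: 1, 16: 2, 17: 2, 18: 4, 19: 1}
values = [v for v, count in dot_counts.items() for _ in range(count)]

bimodal_mean = mean(values)
bimodal_median = median(values)

# The two most common data values are the peaks of the distribution.
peaks = Counter(values).most_common(2)

# Count how many data values lie within 1 unit of the median.
near_center = sum(1 for v in values if abs(v - bimodal_median) <= 1)

print(bimodal_mean, bimodal_median)
print(peaks)
print(near_center, "of", len(values), "values are within 1 of the median")
```

The mean and median both fall near 12.5, but only 2 of the 24 values are within 1 unit of that center, while the peaks sit at 7 and 18.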