Lesson 9

What Makes a Good Sample?

9.1: Number Talk: Division by Powers of 10 (5 minutes)

Warm-up

The purpose of this number talk is to gather strategies and understandings students have for dividing by powers of 10. These understandings help students develop fluency and will be helpful later in this lesson, when students need to find the mean of various samples.

While four problems are given, it may not be possible to share every strategy. Consider gathering only two or three different strategies per problem, saving most of the time for the final question.

Launch

Reveal one problem at a time. Give students 30 seconds of quiet think time for each problem and ask them to give a signal when they have an answer and a strategy. Keep all previous problems displayed throughout the talk. Follow with a whole-class discussion.

Representation: Internalize Comprehension. To support working memory, provide students with sticky notes or mini whiteboards.
Supports accessibility for: Memory; Organization

Student Facing

Find the value of each quotient mentally.

\(34,\!000\div10\)

\(340\div100\)

\(34\div10\)

\(3.4\div100\)

Student Response

For access, consult one of our IM Certified Partners.

Activity Synthesis

Ask students to share their strategies for each problem. Record and display their responses for all to see. To involve more students in the conversation, consider asking:

  • “Who can restate ___’s reasoning in a different way?”

  • “Did anyone have the same strategy but would explain it differently?”

  • “Did anyone solve the problem in a different way?”

  • “Does anyone want to add on to _____’s strategy?”

  • “Do you agree or disagree? Why?”

Speaking: MLR8 Discussion Supports. Display sentence frames to support students when they explain their strategy. For example, "First, I _____ because . . ." or "I noticed _____ so I . . . ." Some students may benefit from the opportunity to rehearse what they will say with a partner before they share with the whole class.
Design Principle(s): Optimize output (for explanation)
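
One strategy students may share is that dividing by a power of 10 moves every digit to the right, one place for each zero in the divisor. The following Python sketch is not part of the original lesson. It checks that strategy against ordinary division for the four warm-up problems. The helper name `shift_digits_right` is an assumption chosen here for illustration.

```python
from decimal import Decimal

def shift_digits_right(value: str, places: int) -> Decimal:
    """Divide by 10**places by moving every digit `places` places to the right.

    Hypothetical helper for illustration; Decimal.scaleb adjusts the
    exponent without changing the digits themselves.
    """
    return Decimal(value).scaleb(-places)

# The four warm-up problems as (dividend, number of zeros in the divisor).
problems = [("34000", 1), ("340", 2), ("34", 1), ("3.4", 2)]

for dividend, places in problems:
    shifted = shift_digits_right(dividend, places)
    divided = Decimal(dividend) / Decimal(10) ** places
    # Both approaches should agree on the value of the quotient.
    print(f"{dividend} / 10^{places}: shifted={shifted}, divided={divided}")
```

Using `Decimal` rather than floating-point numbers keeps the digits exact, so the shifted value and the divided value can be compared directly.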

9.2: Selling Paintings (15 minutes)

Activity

In this activity, students begin to see numerical evidence that different samples can produce different results and thus different estimates for population characteristics (MP2). Students look at a small population and several different samples from this population. Although this population is small enough that it is not necessary to use a sample, it is helpful to get an idea of how data from a sample compares to the population data.

Launch

Arrange students in groups of 2. In each group, one student should be assigned to work with mean as their measure of center and the other should work with median as their measure of center.

Tell students that the data sets in this unit are often small enough that sampling is not necessary. Small data sets are easier to work with, and they allow us to compare information from the sample to the same information from the population.

Representation: Internalize Comprehension. Activate or supply background knowledge of calculating measures of center. Allow students to use calculators to ensure inclusive participation in the activity.
Supports accessibility for: Memory; Conceptual processing
Speaking: MLR8 Discussion Supports. Use this routine to support whole-class discussion. For each response to the discussion questions, ask students to restate and/or revoice what they heard using precise mathematical language. Ask the original speaker whether their peer was accurately able to restate their thinking. Call students’ attention to any words or phrases that helped clarify the original statement. This will provide more students with an opportunity to produce language as they interpret the reasoning of others.
Design Principle(s): Support sense-making

Student Facing

Your teacher will assign you to work with either means or medians.

  1. A young artist has sold 10 paintings. Calculate the measure of center you were assigned for each of these samples:

    1. The first two paintings she sold were for $50 and $350.

    2. At a gallery show, she sold three paintings for $250, $400, and $1,200.

    3. Her oil paintings have sold for $410, $400, and $375.

  2. Here are the selling prices for all 10 of her paintings:

    $50

    $200

    $250

    $275

    $280

    $350

    $375

    $400

    $410

    $1,200

    Calculate the measure of center you were assigned for all of the selling prices.

  3. Compare your answers with your partner. Were the measures of center for any of the samples close to the same measure of center for the population?

Student Response

For access, consult one of our IM Certified Partners.

Activity Synthesis

The purpose of this discussion is to show that different samples can result in different estimates for a population characteristic, and to remind students of reasons we might choose one measure of center over another.

Some questions for discussion:

  • “What is the population for this situation?” (All of the paintings sold.) 
  • “What are the samples used in the calculations?” (The first two paintings sold, those sold at a gallery show, and the oil paintings.)
  • “Why did the different samples have different means?” (Because they used different paintings.)
  • “Why were the means for the first two paintings sold and those sold at the gallery show so far off from the mean of all the paintings?” (Because they contained the cheapest one and the most expensive one, respectively, with only a few other values to balance each one out.)
  • “Based on the numbers in the population, does it make more sense to use median or mean?” (Median, since the $1,200 painting is much greater than the rest of the values, so that one painting affects the mean much more than it affects the median.)
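
As a check on the calculations students make in this activity, the following Python sketch computes both measures of center for each sample and for the population. It is not part of the original lesson. The prices come from the task statement, while the sample labels and the helper name `summarize` are assumptions made for illustration.

```python
from statistics import mean, median

# Selling prices, in dollars, taken from the task statement.
population = [50, 200, 250, 275, 280, 350, 375, 400, 410, 1200]
samples = {
    "first two sold": [50, 350],
    "gallery show": [250, 400, 1200],
    "oil paintings": [410, 400, 375],
}

def summarize(prices):
    """Return (mean, median) for a list of prices, rounded to cents."""
    return round(mean(prices), 2), round(median(prices), 2)

for name, prices in samples.items():
    sample_mean, sample_median = summarize(prices)
    print(f"{name}: mean ${sample_mean}, median ${sample_median}")

population_mean, population_median = summarize(population)
print(f"population: mean ${population_mean}, median ${population_median}")
```

With these prices, the population has a mean of $379 and a median of $315. The oil paintings sample has a mean of $395, close to the population mean, while the first two paintings (mean $200) and the gallery show (mean about $617) are far from it.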

9.3: Sampling the Fish Market (15 minutes)

Activity

In this activity, students begin to see that some samples represent the population better than others. Students compare the dot plot of a population of data with the dot plots of several samples and discuss some aspects that would make some samples better than others (MP7). In the discussion, the phrase representative sample is defined. 

Launch

Arrange students in groups of 2. Allow students 3 minutes of quiet work time followed by a partner discussion and whole-class discussion.

Student Facing

The price per pound of catfish at a fish market was recorded for 100 weeks.

  1. Here are dot plots showing the population and three different samples from that population. What do you notice? What do you wonder?
  2. If the goal is to have the sample represent the population, which of the samples would work best? Which wouldn't work so well? Explain your reasoning.

Population

A dot plot labeled dollars per pound of catfish. The numbers 1.6 through 2.8 in increments of 0.2 are indicated.

Sample 1

A dot plot labeled dollars per pound of catfish. The numbers 1.6 through 2.8 in increments of 0.2 are indicated.

Sample 2

A dot plot labeled dollars per pound of catfish. The numbers 1.6 through 2.8 in increments of 0.2 are indicated.

Sample 3

A dot plot labeled dollars per pound of catfish. The numbers 1.6 through 2.8 in increments of 0.2 are indicated.

Student Response

For access, consult one of our IM Certified Partners.

Student Facing

Are you ready for more?

When doing a statistical study, it is important to keep the goal of the study in mind. Representative samples give us the best information about the distribution of the population as a whole, but sometimes a representative sample won’t work for the goal of a study!

For example, suppose you want to study how discrimination affects people in your town. Surveying a representative sample of people in your town would give information about how the population generally feels, but might miss some smaller groups. Describe a way you might choose a sample of people to address this question.

Student Response

For access, consult one of our IM Certified Partners.

Activity Synthesis

Ask several groups to share things they noticed and wondered about the dot plots. Record responses for all to see. If possible, display the dot plots to refer to while students share.

Consider asking these discussion questions:

  • “What are some aspects that make for a good sample? Bad?” (A sample is “good” if it has a similar distribution to the population data. A sample is “bad” if the data does not have a similar distribution to the population data. For example, Sample 2 is bad because it is not centered in the same place.)
  • “If you were to find a measure of center to represent a typical value for the population, would you use mean or median?” (Median since the data is not approximately symmetric.)
  • “The population in this example has a mean of $2.06 and a median of $1.95. Sample 1 has a mean of $2.09 and a median of $2.00. Sample 2 has a mean of $1.79 and a median of $1.80. Sample 3 has a mean of $2.36 and a median of $2.45. Based on this information, which seems to represent the population the best?” (Sample 1.)
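
The comparison in the last question can be made explicit by measuring how far each sample's mean and median fall from the population's. The Python sketch below is not part of the original lesson. It uses only the values stated in the question. The ranking rule (choose the sample whose larger gap is smallest) is one simple assumption among several reasonable ones, and it looks only at center, not at shape or spread.

```python
# Means and medians, in dollars per pound, as stated in the discussion question.
population = {"mean": 2.06, "median": 1.95}
samples = {
    "Sample 1": {"mean": 2.09, "median": 2.00},
    "Sample 2": {"mean": 1.79, "median": 1.80},
    "Sample 3": {"mean": 2.36, "median": 2.45},
}

def center_gaps(sample):
    """Return how far a sample's mean and median are from the population's."""
    return {
        measure: round(abs(sample[measure] - population[measure]), 2)
        for measure in ("mean", "median")
    }

gaps = {name: center_gaps(stats) for name, stats in samples.items()}

# Pick the sample whose larger gap is smallest (an assumed ranking rule).
closest = min(gaps, key=lambda name: max(gaps[name].values()))
print(gaps)
print("Closest in center:", closest)
```

Sample 1 is within $0.05 of the population on both measures, while Samples 2 and 3 miss by $0.15 or more on each, which agrees with the expected response.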

Define representative sample. A representative sample is a sample that has a distribution that closely resembles the population distribution in center, shape, and spread.

Explain that a sample with the same mean as the population is not necessarily representative, since it may miss other important aspects of the population. 

  • Example 1: If the population for a question is all of the humans in the world and you use one person from each country as your sample, it may not actually be representative of the population. Larger countries, such as China, are under-represented, since many people live there but only 1 is included in our sample. Similarly, a smaller country like Cuba might be over-represented, since fewer people live there but it is represented in the sample exactly the same as all of the larger countries.
  • Example 2: The average height of men in the world is approximately 70 inches. You might find two men, one who is 95 inches (7 feet 11 inches) tall and one who is 45 inches (3 feet 9 inches) tall. Their mean height may be the same as the world’s, but these two certainly do not represent the heights of most men.
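
Example 2 can be checked numerically. The Python sketch below is not part of the original lesson. It computes the mean and the mean absolute deviation (MAD) of the two heights given in the example. The helper name `mean_absolute_deviation` is an assumption chosen for illustration.

```python
from statistics import mean

def mean_absolute_deviation(values):
    """Average distance of each value from the mean of the values."""
    center = mean(values)
    return mean(abs(value - center) for value in values)

# Heights, in inches, of the two men described in Example 2.
two_men = [95, 45]

print("mean height:", mean(two_men))
print("MAD:", mean_absolute_deviation(two_men))
```

The mean is 70 inches, matching the approximate world average quoted in the example, but the MAD is 25 inches: each man is 25 inches away from that mean. A sample can match the population's center and still fail to resemble it in spread.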

Explain that a representative sample is the ideal type of sample we would like to collect, but if we do not know the data for the population, it will be hard to know if a sample we collect is representative or not. If we do know the population data, then a sample is probably unnecessary. In future lessons, we will explore methods of collecting samples that are more likely to produce representative samples (although they are still not guaranteed).

Representing, Speaking, Listening: MLR2 Collect and Display. Create a table with column headings “good sample” and “bad sample”. As students share aspects that make for a good or bad sample, write down the words and phrases students use in the appropriate column. Listen for and amplify words that compare features of the samples such as “similar or different center,” “shape,” or “spread.” Use the words and phrases that describe a good sample to define “representative sample.” This will help students use and connect mathematical language that makes a sample representative of the population.
Design Principle(s): Support sense-making; Maximize meta-awareness

9.4: Auditing Sales (10 minutes)

Optional activity

This activity is additional practice for students to understand the relationship between a sample and population. It may take additional time, and so is included as an optional activity.

In this activity, students attempt to recreate the population data using three given samples (MP2). It is important for students to recognize that this is difficult to do and that some samples are more representative than others. Without knowing the population data, though, it can be difficult to know which samples will be representative. Methods for selecting samples in an unbiased way are explored in future lessons.

Launch

Keep students in groups of 2.

Remind students of the activity from a previous lesson where students selected papers (labeled A through O) from the bag and guessed at the sample space. That was an example of trying to interpret information about the population given a sample of information.

Read the first sentence of the task statement: “An online shopping company tracks how many items they sell in different categories during each month for a year.” Then ask students, “How many dots should be represented in the population data for one year?” (12, one for each month of the year.)

Allow students 5 minutes of partner work time followed by a whole-class discussion.

Student Facing

An online shopping company tracks how many items they sell in different categories during each month for a year. Three different auditors each take samples from that data. Use the samples to draw dot plots of what the population data might look like for the furniture and electronics categories.

Auditor 1’s sample

A dot plot for “monthly sales of furniture online in hundreds.” The numbers 66 through 74 are indicated. 

Auditor 2’s sample

A dot plot for “monthly sales of furniture online in hundreds.” The numbers 66 through 74 are indicated. The data titled "Auditor two's sample" are as follows: 70 hundred, 3 dots.

Auditor 3’s sample

A dot plot for “monthly sales of furniture online in hundreds.” The numbers 66 through 74 are indicated. 

Population

A blank number line for “monthly sales of furniture online in hundreds.” The numbers 66 through 74 are indicated.

 

Auditor 1’s sample

A dot plot for “monthly sales of electronics online in thousands.” The numbers 38 through 43 are indicated.

Auditor 2’s sample

A dot plot for “monthly sales of electronics online in thousands.” The numbers 38 through 43 are indicated.

Auditor 3’s sample

A dot plot for “monthly sales of electronics online in thousands.” The numbers 38 through 43 are indicated.

Population

A blank number line for “monthly sales of electronics online in thousands.” The numbers 38 through 43 are indicated.

Student Response

For access, consult one of our IM Certified Partners.

Anticipated Misconceptions

Students may think that the auditors’ samples should be added together to create one larger sample, without considering that the auditors may have chosen the same data point in their separate samples.

For example, each auditor having a data point at 41,000 may mean that there is only one data point there and each auditor included it in the sample, or it may mean that there are actually three data points there and each auditor included a different point from the population.

Activity Synthesis

The purpose of the discussion is for students to understand that getting an understanding of the population data from a sample can be very difficult, especially when it is not known whether samples are representative of the population or not.

Display the population dot plots for all to see.

For furniture sales, the samples came from data represented in this dot plot.

Dot plot. Monthly sales of furniture online in hundreds. 

For electronics sales, the samples came from data represented in this dot plot.

Dot plot. Monthly sales of electronics online in thousands. 

Ask students:

  • “How close was your estimate to the actual dot plot? Consider the shape, center, and spread of the data in your answer.”
  • “Were any samples better at mimicking the population than others?”
  • “What could the auditors have done to make their samples more representative of the population data without knowing what the population would be?” (They could include more information in their samples. They should also think about how the samples were selected. For example, if the auditors only came on months when there were large sales happening, they may be missing important data.)

We will explore how to be careful about selecting appropriate samples in future lessons.

Representing, Conversing: MLR7 Compare and Connect. Invite students to prepare a visual display that shows their dot plots for what the population data might look like for the furniture and electronics categories. Students should consider how to display their work so another student can interpret what is shown. Some students may wish to add notes or details to their drawings to help communicate their thinking. Invite students to investigate each other’s work and to compare representations. Listen for and amplify the language students use to describe the data plot for the population, and how they explain why it is difficult to create an accurate population dot plot, given three small samples. This will foster students’ meta-awareness and support constructive conversations as they relate sample and population data.
Design Principle(s): Optimize output (for comparison); Cultivate conversation

Lesson Synthesis

Consider asking these discussion questions:

  • “What does it mean for a sample to be representative of the population?” (The sample has a center, shape, and spread similar to those of the population data.)
  • “Why might it be important to get a representative sample rather than a more convenient sample?” (If we are going to answer questions about the entire population, it is useful if the sample looks similar to the population data. If not, we may miss some important information.)
  • “Usually, a sample is used because we can’t get data for the entire population. How do we know if the sample is representative of the population if we don’t know the population?” (It is OK for students to struggle with this answer at this point. In the next lesson, we’ll explore ways to make our best attempt at getting a representative sample.)

9.5: Cool-down - Reviews for School Lunches (5 minutes)

Cool-Down

For access, consult one of our IM Certified Partners.

Student Lesson Summary

Student Facing

A sample that is representative of a population has a distribution that closely resembles the distribution of the population in shape, center, and spread.

For example, consider the distribution of plant heights, in cm, for a population of plants shown in this dot plot. The mean for this population is 4.9 cm, and the MAD is 2.6 cm.

A dot plot for “height in centimeters.” The numbers 1 through 11 are indicated. 

A representative sample of this population should have a larger peak on the left and a smaller one on the right, like this one. The mean for this sample is 4.9 cm, and the MAD is 2.3 cm.

A dot plot for “height in centimeters.” The numbers 1 through 11 are indicated. 

Here is the distribution for another sample from the same population. This sample has a mean of 5.7 cm and a MAD of 1.5 cm. These are both very different from the population, and the distribution has a very different shape, so it is not a representative sample.

A dot plot for “height in centimeters.” The numbers 1 through 11 are indicated.