Lesson 9
Variability in Samples
9.1: Selecting Samples (10 minutes)
Warm-up
The mathematical goal of this activity is for students to calculate sample means and sample proportions, which they will use to estimate the population mean and population proportion in a later activity. Students should recognize that different samples can produce similar, but not identical, means and proportions. The data collected in this warm-up is used in the next activity to begin building an understanding of margin of error.
Launch
Give each student one copy of the table from the blackline master. Demonstrate that students should select a value for their sample by rolling the number cube once for the row of the table and once for the column of the table. Tell students that a class histogram will be created using their data. Collect the sample means and proportions from the class to create histograms using the class data.
Student Facing
Coins are usually stamped with the year and location of the mint where they were made. D represents the mint in Denver, Colorado, and a blank or P represents the mint in Philadelphia, Pennsylvania.
Diego has a jar containing 36 coins. Select a coin by using the applet to generate a pair of numbers: one number for the row in the table and a second number for the column. For example, rolling a 3 and then a 5 would mean choosing the coin in the third row and fifth column, which is the coin marked 2000 P. Repeat this process to collect a sample of 5 coins.
| | coin 1 | coin 2 | coin 3 | coin 4 | coin 5 | sample mean year | sample proportion minted in Denver |
|---|---|---|---|---|---|---|---|
| sample 1 | | | | | | | |
| sample 2 | | | | | | | |
| sample 3 | | | | | | | |
- Find the mean date for the sample of 5 coins.
- Find the proportion of the sample of 5 coins that were minted in Denver.
- Repeat the process to find 2 more samples of 5 coins, then compute the mean date and proportion that were minted in Denver.
Student Response
For access, consult one of our IM Certified Partners.
Launch
Give each student one standard number cube and one copy of the table from the blackline master. Demonstrate that students should select a value for their sample by rolling the number cube once for the row of the table and once for the column of the table. Tell students that a class histogram will be created using their data. Collect the sample means and proportions from the class to create histograms using the class data.
Student Facing
Coins are usually stamped with the year and location of the mint where they were made. D represents the mint in Denver, Colorado, and a blank or P represents the mint in Philadelphia, Pennsylvania.
Diego has a jar containing 36 coins. Select a sample of 5 coins by rolling your number cube once to represent the row and rolling again to find the column. For example, rolling a 3 and then a 5 would represent selecting the coin marked 2000 P. Repeat this process to collect a sample of 5 coins.
| | coin 1 | coin 2 | coin 3 | coin 4 | coin 5 | sample mean year | sample proportion minted in Denver |
|---|---|---|---|---|---|---|---|
| sample 1 | | | | | | | |
| sample 2 | | | | | | | |
| sample 3 | | | | | | | |
- Find the mean year for the sample of 5 coins.
- Find the proportion of the sample of 5 coins that were minted in Denver.
- Repeat the process to find 2 more samples of 5 coins, then compute the mean year and the proportion minted in Denver for each sample.
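The selection procedure above can be sketched in a few lines of Python, which may help when checking student work or demonstrating the process. This is only an illustration: the real coin table is on the blackline master and is not reproduced here, so the `TABLE` below is hypothetical apart from the entry in row 3, column 5 (2000 P), and the function names are invented for this sketch.

```python
import random

# HYPOTHETICAL 6x6 table of (year, mint) pairs. Only row 3, column 5
# ("2000 P") comes from the lesson text; every other entry is invented.
TABLE = [
    [(2015, "D"), (2008, "P"), (2012, "P"), (2019, "D"), (2004, "P"), (2011, "D")],
    [(2010, "P"), (2016, "D"), (2003, "P"), (2013, "P"), (2018, "D"), (2007, "P")],
    [(2009, "D"), (2014, "P"), (2017, "P"), (2006, "D"), (2000, "P"), (2012, "D")],
    [(2005, "P"), (2011, "P"), (2018, "D"), (2010, "D"), (2015, "P"), (2002, "P")],
    [(2013, "D"), (2001, "P"), (2016, "P"), (2009, "P"), (2014, "D"), (2017, "D")],
    [(2008, "P"), (2019, "D"), (2007, "D"), (2012, "P"), (2010, "P"), (2016, "D")],
]


def roll(rng):
    """Simulate one roll of a standard number cube (1 through 6)."""
    return rng.randint(1, 6)


def select_coin(rng):
    """Roll once for the row and once for the column, then look up the coin."""
    row, col = roll(rng), roll(rng)
    return TABLE[row - 1][col - 1]


def sample_statistics(coins):
    """Return (mean year, proportion minted in Denver) for a list of coins."""
    mean_year = sum(year for year, _ in coins) / len(coins)
    proportion_denver = sum(1 for _, mint in coins if mint == "D") / len(coins)
    return mean_year, proportion_denver


rng = random.Random(1)  # fixed seed so the run is repeatable
for n in range(1, 4):   # three samples of 5 coins, as in the task
    coins = [select_coin(rng) for _ in range(5)]
    mean_year, proportion_denver = sample_statistics(coins)
    print(f"sample {n}: mean year {mean_year:.1f}, "
          f"proportion Denver {proportion_denver:.1f}")
```

Because each coin is chosen by a fresh pair of rolls, the same coin can appear more than once in a sample, just as it can with a physical number cube.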
Activity Synthesis
The goal of this discussion is to make sure students understand the difference between a sample mean and a sample proportion and to collect the sample means and proportions from the class to create histograms using the class data.
Collect the sample means and proportions from the class and create a histogram for each to be used in the following activity.
Here are some questions for discussion:
- “What was the lowest sample mean? The highest?” (Sample response: 2003.6 and 2013.8)
- “What was the lowest sample proportion? The highest?” (Sample response: 0.4 and 0.8)
- “When making a histogram of the sample means for the year and another histogram for the sample proportions minted in Denver, what interval size makes sense for each histogram?” (It makes sense to have an interval of 0.5 or 1 for the years and 0.2 for the sample proportions.)
- “What information does the sample mean tell you?” (It tells you the mean of the years the 5 randomly selected coins were minted.)
- “What information does the sample proportion tell you?” (It tells you the proportion of the 5 randomly selected coins that were minted in Denver.)
9.2: Examining Sample Statistics (10 minutes)
Activity
The mathematical purpose of this activity is for students to gain a conceptual understanding of margin of error using class data from a previous activity. Students should use the data collected in the warm-up to estimate a mean and proportion for the population based on samples and express a margin of error. In the next activity, students will be given a rule of thumb for writing the margin of error.
Launch
Display the histograms of sample means and proportions using the sample means and proportions from the class warm-up for all to see.
Supports accessibility for: Language; Social-emotional skills
Student Facing
Use the data from the warm-up to answer the questions.
- Not all the samples have the same mean and proportion. Why not?
- Examine the histogram for the mean year of coins from the samples. What do you notice?
- Based on the mean years from the samples, estimate the mean year for all the coins in Diego’s jar. Explain your reasoning.
- Examine the histogram of the proportion of coins in each sample that were minted in Denver. What do you notice?
- Based on the sample proportions found by the class, estimate the proportion of coins minted in Denver for all the coins in Diego’s jar. Explain your reasoning.
Activity Synthesis
The purpose of this discussion is for students to begin to develop a conceptual understanding of margin of error.
Here are some questions for discussion:
- “Use the histogram of sample means to answer these questions. If the mean year was 2011, would that surprise you? Would a mean year of 2017 be surprising? Would a mean year of 1999 be surprising? Explain your reasoning.” (No, yes, yes. 2011 is not surprising since it is close to the middle, but the others would be since there were no sample means near those values.)
- “The actual mean year is 2010. Does that make sense based on the histogram of sample means?” (Yes, that makes sense because the year 2010 is in the middle of the distribution shown in the histogram.)
- “The actual proportion of these 36 coins that were minted in Denver is about 0.42. Does that make sense based on the histogram of sample proportions?” (Yes, 0.4 was the most frequent value shown in the histogram.)
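To see why the class histograms tend to center on the population values, the sampling can be repeated many times in a short simulation. The sketch below is an illustration under stated assumptions: the jar is hypothetical (it is built only so that its mean year is 2010 and 15 of its 36 coins are from Denver, matching the two population values quoted above), and the count of 90 samples assumes a class of 30 students taking 3 samples each.

```python
import random
import statistics
from collections import Counter

# HYPOTHETICAL jar of 36 coins; the real table is not reproduced here.
# Years are chosen so the population mean is exactly 2010, and 15 coins
# are marked "D" so the Denver proportion is 15/36, about 0.42.
offsets = [d for d in range(-9, 10) if d != 0]   # 18 offsets that sum to 0
years = [2010 + d for d in offsets * 2]          # 36 years with mean 2010
mints = ["D"] * 15 + ["P"] * 21
rng = random.Random(9)                           # fixed seed: repeatable run
rng.shuffle(mints)
jar = list(zip(years, mints))


def draw_sample(rng, size=5):
    """Pick coins with replacement, as repeated number-cube rolls would."""
    return [rng.choice(jar) for _ in range(size)]


sample_means = []
sample_props = []
for _ in range(90):  # assumed: 30 students x 3 samples each
    sample = draw_sample(rng)
    sample_means.append(statistics.mean([year for year, mint in sample]))
    sample_props.append(sum(1 for year, mint in sample if mint == "D") / len(sample))

print("population mean year:", statistics.mean(years))
print("lowest and highest sample mean:", min(sample_means), max(sample_means))
print("mean of the sample means:", round(statistics.mean(sample_means), 2))

# Text histogram of the sample proportions (possible values are 0, 0.2, ..., 1).
for value, count in sorted(Counter(sample_props).items()):
    print(f"{value:.1f} {'#' * count}")
```

Individual sample means vary noticeably, but their average stays close to the population mean, which is the pattern the discussion questions ask students to notice.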
Design Principle(s): Support sense-making
9.3: Variability of Sample Estimates (10 minutes)
Activity
The mathematical purpose of this activity is to use the standard deviation to estimate margin of error. Students are introduced to a way to quantify a margin of error based on 2 standard deviations.
Launch
Supports accessibility for: Conceptual processing; Visual-spatial processing
Student Facing
A political campaign sends volunteers out into the various parts of the state to get a sense of how well their candidate will do in an upcoming election. Thirty volunteers each get a random sample of 10 people in the state and find the proportion of people who are expecting to vote for their candidate. The sample proportions are summarized in the histogram.
The mean of these sample proportions is 0.55, and the standard deviation is 0.15.
- Recall that, for normally distributed data, about 95% of the data is within 2 standard deviations of the mean. What percentage of sample proportions are within 2 standard deviations of the mean for these data? Does this match what we expect from approximately normally distributed data? Explain or show your reasoning.
- Estimates for population characteristics are usually given along with a margin of error. The margin of error is the maximum expected difference between the estimate of the population characteristic and the actual population characteristic. Each of the sample proportions is a good estimate of the population proportion, so we should give a margin of error that contains about 95% of the sample proportions. That way, we can be reasonably sure that the actual population proportion lies between the mean minus the margin of error and the mean plus the margin of error. What margin of error should be given along with the estimate of 0.55 for the population proportion? Explain or show your reasoning.
Student Facing
Are you ready for more?
The margin of error we constructed here was constructed at a 95% confidence level. Since 95% of the time our sample proportion is within 2 standard deviations of the true proportion for the population, by using a margin of error of 2 standard deviations we will capture the true proportion in our interval 95% of the time.
- What would happen to the margin of error if we were okay with only capturing the true proportion 90% of the time?
- What would happen to the margin of error if we needed to capture the true proportion 99% of the time?
- Why might someone choose a different confidence level?
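For teachers who want to check the direction of the change numerically, the sketch below computes how many standard deviations are needed to capture a given central share of a normal distribution. It assumes the sample proportions are modeled by a normal distribution, as the lesson does, and reuses the standard deviation of 0.15 from this activity. The function name `multiplier` is invented for this sketch.

```python
from statistics import NormalDist


def multiplier(confidence):
    """Standard deviations needed to capture the central `confidence`
    share of a normal distribution."""
    tail = (1 - confidence) / 2           # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)


sd = 0.15  # standard deviation of the sample proportions in this activity
for confidence in (0.90, 0.95, 0.99):
    z = multiplier(confidence)
    print(f"{confidence:.0%}: {z:.2f} standard deviations, "
          f"margin of error {z * sd:.2f}")
```

The multiplier for 95% comes out slightly below 2, so the lesson's rule of 2 standard deviations is a rounded version of it. A lower confidence level gives a smaller margin of error, and a higher one gives a larger margin of error.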
Anticipated Misconceptions
Some students may notice the isolated bar and claim that the data is not normally distributed. These students should be encouraged to continue wondering about extreme data as shown in the histogram, but for the purposes of this lesson, tell students that the data is approximately normal and some information about this distribution can be estimated by modeling it with a normal distribution.
Activity Synthesis
The purpose of this discussion is for students to estimate the margin of error using the mean and standard deviation of sample proportions. Here are some questions for discussion.
- “When I say, ‘The mean is 0.55 with a margin of error of 0.3’, what does that mean in this context?” (It means that about 95% of the sample proportions are between 0.25 and 0.85.)
- “Is a sample proportion of 0.20 a plausible estimate of population proportion?” (No, because it does not fall between 0.25 and 0.85, so it would be considered unusual.)
- “Imagine a similar scenario in which the mean of the sample proportions is 0.85 and the standard deviation is 0.06. What is the margin of error? What interval of values would be a good estimate for the population proportion? Explain your reasoning.” (The margin of error is 0.12 since that is twice the standard deviation. This means that the population proportion is reasonably within the interval from 0.73 to 0.97 since those are the values that are within the margin of error of the mean.)
Tell students, “We use the mean of the sample proportion plus or minus the margin of error to represent an interval of plausible estimates of the population proportion.” Consider telling students, “Although we used many samples here to get an idea of what happens when you can take many samples (quickly and efficiently), this is not always possible (or easy to do). In the next lesson, we’ll explore how to make estimates based on a single sample.”
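The arithmetic behind the two scenarios in this discussion can be sketched as follows. The code applies the activity's rule of thumb (margin of error equals 2 standard deviations). The function names are invented for illustration.

```python
def margin_of_error(std_dev):
    """Rule of thumb from this activity: 2 standard deviations."""
    return 2 * std_dev


def plausible_interval(mean, std_dev):
    """Return (low, high): the mean minus and plus the margin of error."""
    moe = margin_of_error(std_dev)
    return mean - moe, mean + moe


# The two scenarios from the discussion: (mean, standard deviation).
for mean, sd in [(0.55, 0.15), (0.85, 0.06)]:
    low, high = plausible_interval(mean, sd)
    print(f"mean {mean}, margin of error {margin_of_error(sd):.2f}: "
          f"{low:.2f} to {high:.2f}")

# Is a sample proportion of 0.20 inside the first interval?
low, high = plausible_interval(0.55, 0.15)
print("0.20 is plausible:", low <= 0.20 <= high)
```

The first interval runs from 0.25 to 0.85 and the second from 0.73 to 0.97, matching the sample responses above. A proportion of 0.20 falls outside the first interval.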
Lesson Synthesis
Here are some questions for discussion:
- “Why do we see variability between different random samples from the same population?” (If there is variation in the population, a sample will usually include some of this variation. With a random sample, we would expect differences in the sample to be reflective of variation in the population.)
- “Think about flipping a coin 8 times. How many heads would you expect to get? Explain your reasoning.” (I think I would get between 3 and 5 or between 2 and 6. It seems unlikely to get 0 or 1 heads or 7 or 8 heads.)
- “Why does it make sense to report the number of heads with more than one value?” (It makes sense because the number of heads when flipping a coin 8 times is not predictable, but we can say that we expect it to fall in a certain interval most of the time or almost all of the time.)
- “If everyone in the class flipped a coin eight times, how could we use the results to find the margin of error?” (We could find the mean and standard deviation of the number of heads flipped. The margin of error would be 2 times the standard deviation.)
- “Why do we double the standard deviation to find the margin of error?” (For normally distributed data, we expect 95% of the data to be within 2 standard deviations of the mean. So, it is likely that this range of estimates will include the actual population characteristic.)
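The coin-flipping questions above can be checked against the exact probabilities for 8 flips. This sketch assumes a fair coin and uses theoretical probabilities rather than class data, so a class's own mean and standard deviation will differ somewhat. The variable and function names are invented for illustration.

```python
from math import comb, sqrt

FLIPS = 8

# Probability of each possible number of heads, assuming a fair coin.
prob = {k: comb(FLIPS, k) / 2**FLIPS for k in range(FLIPS + 1)}

mean = sum(k * p for k, p in prob.items())                     # expected heads
sd = sqrt(sum((k - mean) ** 2 * p for k, p in prob.items()))   # standard deviation
margin = 2 * sd                                                # lesson's rule of thumb


def chance_between(low, high):
    """Probability that the number of heads is from low to high, inclusive."""
    return sum(p for k, p in prob.items() if low <= k <= high)


print(f"mean {mean}, standard deviation {sd:.2f}, margin of error {margin:.2f}")
print(f"P(3 to 5 heads) = {chance_between(3, 5):.2f}")
print(f"P(2 to 6 heads) = {chance_between(2, 6):.2f}")
```

The mean is 4 heads with a margin of error of about 2.8, so the interval covers 2 to 6 heads. That range occurs about 93% of the time, close to the 95% the normal model suggests.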
9.4: Cool-down - What’s Wrong With That? (5 minutes)
Cool-Down
For access, consult one of our IM Certified Partners.
Student Lesson Summary
Student Facing
In many cases, it is difficult to collect data from an entire population, so using data from a small subset of the larger group is needed. The trade-off is that the incomplete information from such samples can only provide estimates of characteristics for the population.
For example, a researcher may wish to know how many fish of each type are present in a lake. It would be hard to collect all the fish in the lake to get complete information, so a small group of fish might be caught to estimate the populations in the lake as a whole. Depending on how the fish are caught, the types of fish caught may or may not be reflective of the entire lake.
To understand how varied the lake's fish are, the researcher may want to take several samples of fish from the lake. After taking many samples of fish, the researcher may find that the number of bass in a sample ranges from 2 to 20 and the number of catfish in the sample ranges from 5 to 7. Since the samples include a lot of different possibilities for the bass, the researcher might indicate that they have low confidence in an estimate for the bass population. On the other hand, the number of catfish in each sample is fairly consistent, so the researcher may be able to provide more confidence in an estimate for the catfish population in the lake.
To give a sense of the variability and confidence in estimates, a margin of error is usually given along with the estimate. A margin of error is the maximum expected difference between an estimate for a population characteristic and the actual value of the population characteristic. Estimates from samples tend to be approximately normally distributed, so it is reasonable to expect that about 95% of the estimates are within 2 standard deviations of the mean of the estimates. In this unit, we will use 2 standard deviations as the margin of error.
From the fish example, the researcher may use the sample to estimate that there are 800 bass in the lake with a margin of error of 300 bass. This means that the researcher is fairly confident that the number of bass in the lake is somewhere between 500 and 1,100. The researcher may estimate that there are 650 catfish in the lake with a margin of error of 50. This means that the researcher is fairly confident that the number of catfish in the lake is between 600 and 700. Since the number of catfish in the samples was fairly consistent, there is a much smaller margin of error for the catfish population than there was for the bass population.