Lesson 5

Fitting Lines

5.1: Selecting the Best Line (5 minutes)

Warm-up

The mathematical purpose of this activity is for students to be able to visually assess the best line that fits data among a set of choices. Students are given a scatter plot and 2 lines that may fit the data. Students must select the line that better fits the data. The given lines address many common errors in student thinking about best fit lines including: going through the most points, dividing the data in half, and connecting the points on both ends of the scatter plot.

Listen for students using the terms slope and \(y\)-intercept.

Launch

Provide students access to the images. Give students 2 minutes of quiet time to work on the questions.

Student Facing

Which of the lines is the best fit for the data in each scatter plot? Explain your reasoning.

1.

Scatter plot with two lines.

2.

Scatter plot with two lines of best fit.

3.

Scatter plot with two lines of best fit.

4.

Scatter plot with 2 lines of best fit.

Student Response

For access, consult one of our IM Certified Partners.

Activity Synthesis

The purpose of this discussion is for students to distinguish among a bad fit, a good fit, and the best fit. In each scatter plot, the solid line represents the line of best fit, except in the last two graphs, where the dashed line is the best fit.

Ask a student who uses the term slope while working on the questions, “Can you explain the relationship between the two lines in the question about runs and wins using the concept of slope?” (The slope of the dashed line is positive and the slope of the solid line is negative.)

Ask a student who uses the term \(y\)-intercept, “Can you explain the significance of the \(y\)-intercept in the question about average survey scores and amount spent on dinner?” (The solid line will have a \(y\)-intercept less than the \(y\)-intercept for the dashed line. Because the two lines have approximately the same slope, they appear parallel in the scatter plot.)

If time permits, discuss questions such as:

  • “Is the dashed line in the question about runs and wins a bad fit, good fit, or best fit?” (The line is a bad fit because it does not show the correct relationship between the variables. It shows that the value of \(y\) increases as the value of \(x\) increases, rather than the value of \(y\) decreasing as the value of \(x\) increases.)
  • “Is the dashed line in the question about oil and gas a bad fit, good fit, or best fit?” (It is the best fit because it is close to going through the middle of the data and follows the same trend as the data.)
  • “What factors helped you select the linear model that fits the data best?” (The line should go through the middle of the data, follow the trend of the data, and have a similar number of points on each side of the line.)

5.2: Card Sort: Data Patterns (15 minutes)

Activity

The mathematical purpose of this activity is for students to:

  • distinguish between linear and nonlinear relationships in bivariate, numerical data
  • informally assess the fit of a linear model
  • compare the slope and the vertical intercepts of different linear models
  • describe the relationship between two variables.

Students are given cards that each show a scatter plot and a linear model. They sort the cards by how well the lines fit the data, and then in increasing order by slope and by vertical intercept.

A sorting task gives students opportunities to analyze representations, statements, and structures closely and make connections (MP2, MP7).

Launch

Arrange students in groups of 2. Give students a chance to familiarize themselves with what is on the cards. For example, you might ask them to sort the cards into categories of their choosing, and explain their categories to their partner.

Conversing: MLR8 Discussion Supports. In pairs, ask students to take turns sorting the cards and explaining their reasoning to their partner. Display the following sentence frames for all to see: “ _____ should be before _____ because . . . .”, and “I noticed _____ , so I . . . .” Encourage students to challenge each other when they disagree. This will help students clarify their reasoning about linear models.
Design Principle(s): Support sense-making; Maximize meta-awareness
Action and Expression: Internalize Executive Functions. Provide students with a template for organizing their observations, or invite them to fold a blank piece of paper in thirds and label the columns with the three headers “\(x\),” “\(y\),” and “linear model fits?” to collect their answers. Explain that in the first column they will always write “increasing,” since they read each graph from left to right. In the next column they describe the behavior of the \(y\) values, and in the last column they record their conclusion about whether a linear model fits well.
Supports accessibility for: Language; Organization

Student Facing

Your teacher will give you a set of cards that show scatter plots.

  1. Arrange all the cards in three different ways. Ensure that you and your partner agree on the arrangement before moving on to the next one. Sort all the cards in order from:
    1. best to worst for representing with a linear model
    2. least to greatest slope of a linear model that fits the data well
    3. least to greatest vertical intercept of a linear model that fits the data well
  2. For each card, write a sentence that describes how \(y\) changes as \(x\) increases and whether the linear model is a good fit for the data or not.

Student Response

For access, consult one of our IM Certified Partners.

Activity Synthesis

The purpose of this discussion is for students to discuss the goodness of fit for linear models.

Here are some questions for discussion.

  • “How are scatter plots A and F the same? How are they different?” (The linear models have the same slope and fit their data equally well. They differ in their vertical intercepts.)
  • “How do you know if a linear model is a good fit?” (You need to look at the scatter plot and the line of best fit and make a decision about whether or not the data follows a linear trend.)
  • “Why is the goodness of fit for the linear model in scatter plot B better than the fit for the linear model in scatter plot A?” (The data in B falls on or very close to the linear model. The data in A is scattered around the line of best fit and has roughly the same number of values below that line of best fit as it does above the line of best fit.)
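
For teachers who want a number to back up the visual comparison of cards A and B, one option is to add up the squared vertical distances between each point and a fitted line. The Python sketch below is not part of the card sort itself. It uses the data tables for cards A and B that are listed in the next activity, and the helper names `fit_line` and `sum_squared_error` are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sum_squared_error(xs, ys, slope, intercept):
    """Add up the squared vertical distances from each point to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Data for card A (table A in the next activity)
card_a_x = [1, 2.2, 3.3, 3.3, 3.6, 3.8, 3.9, 4, 4.4, 4.5, 4.7,
            4.8, 4.9, 5, 5.1, 5.2, 5.5, 5.5, 6, 6.6, 7, 7.7]
card_a_y = [2, 4, 5, 4.5, 6, 6.5, 5.7, 7, 6.5, 7, 7,
            6, 8.7, 7, 7.7, 6.7, 8, 8.5, 9.5, 8.6, 9, 10.313]

# Data for card B (table B in the next activity)
card_b_x = [1, 2.2, 3.3, 3.4, 3.6, 3.8, 3.9, 4, 4.4, 4.5, 4.7,
            4.8, 4.9, 5, 5.1, 5.2, 5.5, 5.7, 6, 6.6, 7, 7.7]
card_b_y = [11.86, 11.332, 10.848, 10.741, 10.716, 10.628, 10.584, 10.54,
            10.364, 10.32, 10.232, 10.188, 10.144, 10.1, 10.056, 10.5,
            9.88, 9.753, 9.66, 9.396, 9.22, 8.912]

sse_a = sum_squared_error(card_a_x, card_a_y, *fit_line(card_a_x, card_a_y))
sse_b = sum_squared_error(card_b_x, card_b_y, *fit_line(card_b_x, card_b_y))

# A smaller total means the points sit closer to the line
print("card A:", round(sse_a, 3))
print("card B:", round(sse_b, 3))
```

A smaller total for card B than for card A agrees with the visual judgment that the points on card B sit closer to their line.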

5.3: Fitting Lines with Technology (15 minutes)

Activity

It is recommended to use the digital version of this activity. The mathematical purpose of this activity is for students to use technology to compute a line of best fit for data given in a table, and to understand the meaning of the slope and \(y\)-intercept. If the digital version of the activity is not available, the students should be guided through using available technology to find the least-squares regression line as the line of best fit.

Launch

Assign each group one of the data tables that correspond to the graphs in the previous activity, for use in question 5. Show students how to use the copy and paste features of their digital devices, and instruct them to copy the values from the data table and paste them into a blank line in the applet provided. The technology creates a scatter plot of the data in the table, and a movable line is already graphed for students to adjust.

After groups have had a chance to estimate the best fit lines, pause the class. Show students how to use technology to find the least-squares regression line for the data and display the line with the scatter plot. Desmos can graph a line of best fit if you type in an equation using a tilde (~) instead of an equals sign (=). You need to name the variables, \(x\) and \(y\), the same way they are named in the table, including the subscripts. The parameters can be named with any letter other than \(x\), \(y\), and \(e\). For example, type \(y_1 \sim ax_1 + b\), and Desmos will compute the values for the parameters \(a\) and \(b\).

If you will be using graphing technology other than Desmos for this activity, you may need to prepare alternate instructions.
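
As one possible starting point for such alternate instructions, the sketch below computes a least-squares line in plain Python for the ice cream table in this activity. It uses the standard closed-form least-squares formulas. The variable names are illustrative assumptions, and the code is a minimal example rather than a replacement for the applet.

```python
# Ice cream data from this activity's table:
# x = pounds of ice cream sold, y = average outside temperature (degrees Celsius)
xs = [20, 18, 21, 17, 21.5, 19.5, 21, 18]
ys = [6, 4.5, 6.5, 3.5, 7.5, 6.5, 7, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope: how x and y vary together, divided by how much x varies
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx

# The least-squares line always passes through (mean_x, mean_y)
intercept = mean_y - slope * mean_x

print("slope:", round(slope, 2))
print("y-intercept:", round(intercept, 2))

# Question 4 asks about x = 10, which lies well outside the data (17 to 21.5)
prediction_at_10 = slope * 10 + intercept
print("predicted y at x = 10:", round(prediction_at_10, 2))
```

With the table's values, this should give a slope of about 0.78 and a \(y\)-intercept of about -9.44, which can be compared with what Desmos or another tool reports.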

Engagement: Develop Effort and Persistence. Provide prompts, reminders, or checklists that focus on increasing the length of on-task orientation in the face of distractions. For example, provide two copies of the steps: graph the table, find the best fit line, find the slope, and find the \(y\)-intercept. Include phrases to activate knowledge from prior activities such as, “As \(x\) increases . . .”
Supports accessibility for: Organization; Conceptual processing; Attention

Student Facing

Three ice cream cones

The weight of ice cream sold at a small store in pounds (\(x\)) and the average temperature outside in degrees Celsius (\(y\)) are recorded in the table.

\(x\) \(y\)
20 6
18 4.5
21 6.5
17 3.5
21.5 7.5
19.5 6.5
21 7
18 5

  1. For this data, create a scatter plot and a line that fits the data well.
  2. Use technology to compute the best fit line. Round any numbers to 2 decimal places.
  3. What are the values for the slope and \(y\)-intercept for the best fit line? What do these values mean in this situation?
  4. Use the best fit line to predict the \(y\) value when \(x\) is 10. Is this a good estimate for the data? Explain your reasoning.
  5. Your teacher will assign a table of data from the previous activity. Following your teacher’s directions, use technology and the table of data to create a scatter plot that also shows the line of best fit, and then interpret the slope and \(y\)-intercept.

Tables for last question:

A. (card A in the previous activity)

\(x\) \(y\)
1 2
2.2 4
3.3 5
3.3 4.5
3.6 6
3.8 6.5
3.9 5.7
4 7
4.4 6.5
4.5 7
4.7 7
4.8 6
4.9 8.7
5 7
5.1 7.7
5.2 6.7
5.5 8
5.5 8.5
6 9.5
6.6 8.6
7 9
7.7 10.313

B. (card B in the previous activity)

\(x\) \(y\)
1 11.86
2.2 11.332
3.3 10.848
3.4 10.741
3.6 10.716
3.8 10.628
3.9 10.584
4 10.54
4.4 10.364
4.5 10.32
4.7 10.232
4.8 10.188
4.9 10.144
5 10.1
5.1 10.056
5.2 10.5
5.5 9.88
5.7 9.753
6 9.66
6.6 9.396
7 9.22
7.7 8.912

C. (card C in the previous activity)

\(x\) \(y\)
1 6.11
2.2 7.142
3.3 8.088
3.5 8.19
3.6 8.346
3.8 2.92
3.9 8.604
4 8.69
4.4 9.034
4.5 9.12
4.7 9.292
4.8 13.6
4.9 9.464
5 9.55
5.1 9.636
5.2 9.722
5.5 9.98
5.8 10.32
6 10.41
6.6 10.926
7 11.27
7.7 11.872

D. (card E in the previous activity)

\(x\) \(y\)
1 13.9
2.2 11.5
3.3 9.3
3.5 9.2
3.6 8.7
3.8 8.3
3.9 8.1
4 7.9
4.4 7.1
4.5 6.9
4.7 6.5
4.8 6.3
4.9 6.1
5 5.9
5.1 5.7
5.2 5.5
5.5 4.9
5.8 4.3
6 3.9
6.6 1.3
7 1.9
7.7 0.5

E. (card F in the previous activity)

\(x\) \(y\)
1 6.5
2.2 8.5
3.3 9.5
3.3 9
3.6 10.5
3.8 11
3.9 10.2
4 11.5
4.4 11
4.5 11.5
4.7 11.5
4.8 10.5
4.9 13.2
5 11.5
5.1 12.2
5.2 11.2
5.5 12.5
5.5 13
6 14
6.6 13.1
7 13.5
7.7 14.813

Student Response

For access, consult one of our IM Certified Partners.

Launch

Provide data tables for the graphs for the cards from the previous activity that were fit well with a linear model. Assign one table to each group. For students using the paper task, show them how to use technology to create a scatter plot of the data in a table. After groups have had a chance to estimate the best fit lines, pause the class. Show students how to use technology to find the least-squares regression line for data and display the line with the scatter plot.

Display the tables for students to use for the last question:

A. (card A in the previous activity)
\(x\) \(y\)
1 2
2.2 4
3.3 5
3.3 4.5
3.6 6
3.8 6.5
3.9 5.7
4 7
4.4 6.5
4.5 7
4.7 7
4.8 6
4.9 8.7
5 7
5.1 7.7
5.2 6.7
5.5 8
5.5 8.5
6 9.5
6.6 8.6
7 9
7.7 10.313
B. (card B in the previous activity)
\(x\) \(y\)
1 11.86
2.2 11.332
3.3 10.848
3.4 10.741
3.6 10.716
3.8 10.628
3.9 10.584
4 10.54
4.4 10.364
4.5 10.32
4.7 10.232
4.8 10.188
4.9 10.144
5 10.1
5.1 10.056
5.2 10.5
5.5 9.88
5.7 9.753
6 9.66
6.6 9.396
7 9.22
7.7 8.912
C. (card C in the previous activity)
\(x\) \(y\)
1 6.11
2.2 7.142
3.3 8.088
3.5 8.19
3.6 8.346
3.8 2.92
3.9 8.604
4 8.69
4.4 9.034
4.5 9.12
4.7 9.292
4.8 13.6
4.9 9.464
5 9.55
5.1 9.636
5.2 9.722
5.5 9.98
5.8 10.32
6 10.41
6.6 10.926
7 11.27
7.7 11.872
D. (card E in the previous activity)
\(x\) \(y\)
1 13.9
2.2 11.5
3.3 9.3
3.5 9.2
3.6 8.7
3.8 8.3
3.9 8.1
4 7.9
4.4 7.1
4.5 6.9
4.7 6.5
4.8 6.3
4.9 6.1
5 5.9
5.1 5.7
5.2 5.5
5.5 4.9
5.8 4.3
6 3.9
6.6 1.3
7 1.9
7.7 0.5
E. (card F in the previous activity)
\(x\) \(y\)
1 6.5
2.2 8.5
3.3 9.5
3.3 9
3.6 10.5
3.8 11
3.9 10.2
4 11.5
4.4 11
4.5 11.5
4.7 11.5
4.8 10.5
4.9 13.2
5 11.5
5.1 12.2
5.2 11.2
5.5 12.5
5.5 13
6 14
6.6 13.1
7 13.5
7.7 14.813


Student Facing

The weight of ice cream sold at a small store in pounds (\(x\)) and the average temperature outside in degrees Celsius (\(y\)) are recorded in the table.

Three ice cream cones
\(x\) 20 18 21 17 21.5 19.5 21 18
\(y\) 6 4.5 6.5 3.5 7.5 6.5 7 5
  1. For this data, create a scatter plot and sketch a line that fits the data well.
  2. Use technology to compute the best fit line. Round any numbers to 2 decimal places.
  3. What are the values for the slope and \(y\)-intercept for the best fit line? What do these values mean in this situation?
  4. Use the best fit line to predict the \(y\) value when \(x\) is 10. Is this a good estimate for the data? Explain your reasoning.
  5. Your teacher will give you a data table for one of the other scatter plots from the previous activity. Use technology and this table of data to create a scatter plot that also shows the line of best fit, then interpret the slope and \(y\)-intercept.

Student Response

For access, consult one of our IM Certified Partners.

Student Facing

Are you ready for more?

Priya uses several different ride services to get around her city. The table shows the distance, in miles, she traveled during her last 10 trips and the price of each trip, in dollars.

distance (miles) price ($)
3.1 12.5
4.2 14.75
5 16
3.5 13.25
2.5 12
1 9
0.8 8.75
1.6 9.75
4.3 12
3.3 14

  1. Priya creates a scatter plot of the data using the distance, \(x\), and the price, \(y\). She determines that a linear model is appropriate to use with the data. Use technology to find the equation of a line of best fit.

  2. Interpret the slope and the \(y\)-intercept of the equation of the line of best fit in this situation.

  3. Use the line of best fit to estimate the cost of a 3.6-mile trip. Will this estimate be close to the actual value? Explain your reasoning.

  4. On her next trip, Priya tries a new ride service and travels 3.6 miles, but pays only \$4.00 because she receives a discount. Include this trip in the table and calculate the equation of the line of best fit for the 11 trips. Did the slope of the equation of the line of best fit increase, decrease, or stay the same? Why? Explain your reasoning.

  5. Priya uses the new ride service for her 12th trip. She travels 4.1 miles and is charged \$24.75. How do you think the slope of the equation of the line of best fit will change when this 12th trip is added to the table?

Student Response

For access, consult one of our IM Certified Partners.
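
For teachers who want to check numerically how the two extra trips affect the slope, the sketch below refits a least-squares line after each trip is added. It uses the trip data from the table above. The helper `fit_line` and the variable names are illustrative assumptions, not part of the original task.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Priya's first 10 trips: distance in miles, price in dollars
distances = [3.1, 4.2, 5, 3.5, 2.5, 1, 0.8, 1.6, 4.3, 3.3]
prices = [12.5, 14.75, 16, 13.25, 12, 9, 8.75, 9.75, 12, 14]

slope_10, intercept_10 = fit_line(distances, prices)

# 11th trip: 3.6 miles for $4.00 (discounted)
slope_11, intercept_11 = fit_line(distances + [3.6], prices + [4.00])

# 12th trip: 4.1 miles for $24.75
slope_12, intercept_12 = fit_line(distances + [3.6, 4.1], prices + [4.00, 24.75])

for label, slope in [("10 trips", slope_10), ("11 trips", slope_11), ("12 trips", slope_12)]:
    print(label, "slope:", round(slope, 2))
```

Both added trips are longer than the average trip. The discounted trip falls far below the existing trend, so it pulls the slope down. The expensive trip lies far above the trend, so it pushes the slope back up.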

Anticipated Misconceptions

Students may struggle with interpreting slope and \(y\)-intercept. Remind students of how each relates to a situation. To help students interpret slope, ask them: “What does the \(x\) variable represent? What does the \(y\) variable represent? How is slope connected to the \(x\) and \(y\) variables? What happens to \(x\) as \(y\) increases (or decreases)?” To help students interpret the \(y\)-intercept, ask them: “What does the point \((20,6)\) mean in the scatter plot? What are the coordinates of the \(y\)-intercept? What does each of the coordinates mean in the situation described? What is the \(y\) value when \(x\) is 0? Which variable has a value of 0? Which variable is represented with \(y\)?”

Activity Synthesis

The purpose of this discussion is for students to make connections between the scatter plot and the equation of the line of best fit. Display each scatter plot, the line of best fit, and the equation of the line of best fit.

A. \(y = 1.1979x + 1.3196\)

Scatter plot.

B. \(y = -0.4337x + 12.288\)

Scatter plot.

C. \(y = 0.9749x + 4.6529\)

Discrete graph with one line.

D. \(y = -2.063x + 16.144\)

Discrete graph with one line.

E. \(y = 1.1979x + 5.8196\)

Discrete graph with one line.

Here are some questions for discussion:

  • “How does using technology help model the data in the scatter plot?” (It allows different people to come up with the same equation for the line of best fit. If the line is just drawn by hand, there can be different linear equations that seem to fit the data well, but there is only one “best” fit line.)
  • “What does the \(y\)-intercept represent in each scatter plot? When is it reasonable to use this interpretation?” (It represents the value of \(y\) estimated by the linear model when \(x = 0\). This interpretation can be reasonable when \(x = 0\) is within or near the range of the data. Farther from the data, the linear trend may not continue. There are also some situations in which a value of 0 for \(x\) does not make sense.)
  • “Why is the slope the same in scatter plot A and scatter plot F?” (It is the same because the data in scatter plot F is the same data as in scatter plot A, except that the values for \(y\) have all been increased by 4.5 units.)

Tell students they should be careful when predicting values outside the range of the data, in particular, for the \(y\)-intercept. Even when the data is fit well by a linear model, the behavior of the variables farther away may not be linear. It is important to remember that all predictions using the best fit line are estimates and the reasonableness of the predictions should be considered.
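
The answer to the discussion question about scatter plots A and F can be justified with the standard least-squares formulas. The derivation below is a sketch for teacher reference. The symbols \(a\) (slope) and \(b\) (\(y\)-intercept) follow the \(y_1 \sim ax_1 + b\) form from the launch, while \(\bar{x}\), \(\bar{y}\) (the means of the data) and \(c\) (the amount added to every \(y\) value) are notation assumed here.

\[
a = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad b = \bar{y} - a\bar{x}
\]

If every \(y\) value is increased by \(c\), so that \(y_i' = y_i + c\), then the mean also increases by \(c\), and the differences \(y_i' - \bar{y}'\) are unchanged:

\[
\bar{y}' = \bar{y} + c, \qquad y_i' - \bar{y}' = y_i - \bar{y} \quad\Longrightarrow\quad a' = a, \qquad b' = \bar{y}' - a\bar{x} = b + c
\]

With \(c = 4.5\), the intercept of equation A becomes \(1.3196 + 4.5 = 5.8196\), which matches equation E, while the slope stays at \(1.1979\).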

Representing, Conversing: MLR7 Compare and Connect. As students share the connections they noticed between the scatter plot and the equation of the line of best fit, call students’ attention to the different ways the slope and vertical intercept are represented. Wherever possible, amplify student words and actions that describe the connections between a specific feature of one mathematical representation and a specific feature of another representation.
Design Principle(s): Maximize meta-awareness; Support sense-making

Lesson Synthesis

Here are some questions for discussion.

  • “What is a line of best fit?” (The best linear model for the data.)
  • “How do you know that you have a line of best fit?” (You can use technology to generate the line of best fit, but then you need to graph it on the scatter plot to verify that it fits the data well. It needs to follow the trend of the data, it should go roughly through the middle of the data, and it should have roughly the same number of data points on either side of it.)
  • “Which line fits the data better, the solid line or the dashed line? Do you think it is the line of best fit? Explain your reasoning.” (The dashed line fits the data better, because its slope and vertical intercept more closely resemble the trend of the data than the slope and vertical intercept of the solid line. It is not the line of best fit because it does not go through the middle of the data. It should be a little lower on the graph.)
    Discrete graph with 2 lines.

5.4: Cool-down - Fresh Air (5 minutes)

Cool-Down

For access, consult one of our IM Certified Partners.

Student Lesson Summary

Student Facing

Some data appear to have a linear relationship, so finding an equation for a line that fits the data can help you understand the relationship between the variables.

A scatterplot. Horizontal, from 0 to 12, by 1's, labeled precipitation in centimeters. Vertical, 0 to 15, by 1’s, labeled crop yield, pounds per square meter. 19 dots trending upward and to the right.
 

Other data may follow non-linear trends or not have an apparent trend at all.

A scatterplot. Horizontal, from 0 to 12, by 1's, labeled precipitation in centimeters. Vertical, 0 to 140, by 10’s, labeled number of watermelons. 18 dots in a cluster near top right corner of graph.

When modeling data with a linear function seems useful, it is important to find a linear function that is close to the data. The slope and \(y\)-intercept of the line should be chosen so that the line follows the shape of the data in the scatter plot as closely as possible.

Technology can be used to quickly find a line of best fit for the data and provide the equation of the line that we can use to analyze the situation.