9.1: Used Car Relationships
Describe the strength and sign of the relationship you expect for each pair of variables. Explain your reasoning.
- Used car price and original sale price of the car.
- Used car price and number of cup holders in the car.
- Used car price and number of oil changes the car has had.
- Used car price and number of miles the car has been driven.
9.2: Cause or Effect?
Each of the scatter plots shows a strong relationship. Write a sentence or two describing how you think the variables are related.
- During the month of April, Elena keeps track of the number of inches of rain recorded for the day and the percentage of people who come to school with rain jackets.
- A school book club has a list of 100 books for its members to read. They keep track of the number of pages in the books the members read from the list and the amount of time it took to read the book.
- Number of tickets left for holiday parties at a venue and noise level at the party.
- The height and score on a test of vocabulary for several children ages 6 to 13.
9.3: Find Your Cause
Describe a pair of variables with each condition. Explain your reasoning.
- Two variables with a causal relationship.
- The variables are strongly related, but a third factor might be the cause for the changes in the variables.
- The variables are only weakly related.
Look through news articles or advertisements for claims of causation or correlation. Find 2 or 3 claims and read or watch the articles or advertisements. Answer these questions for each claim.
- What is the claim?
- What evidence is provided for the claim?
- Does there appear to be evidence for causation or correlation? Explain your thinking.
Choose the claim with the least or no evidence. Describe an experiment or other way that you could collect data to show correlation or causation.
Humans are wired to look for connections and then use those connections to learn about the world around them. One way to notice connections is by looking for a pair of variables with a relationship. In order to learn about how the variables are related, we want to control one of the variables and see if there are changes in the other variable. For example, if we notice that people who tend to eat many calories also have a higher chance of having a heart attack, we might wonder if lowering our calorie intake would improve our health.
One common mistake people make when using statistics is to think that all relationships between variables are causal. A scatter plot can only show that two variables are related. To determine whether a change in one variable actually causes a change in the other, that is, whether the relationship is causal, the context must be better understood and other explanations ruled out.
For example, we might expect to see a strong, positive relationship between the number of snowboard rentals and sales of hot chocolate during the months of September through January. This does not mean that an increase in snowboard rentals causes people to purchase more hot chocolate. Nor does it mean that increased sales of hot chocolate cause people to rent snowboards more. More likely there is a third variable, such as colder weather, that might be causing both variables to increase at the same time.
On the other hand, sometimes there is a causal relationship. A strong, positive relationship between hot chocolate sales and small marshmallow sales may be causal: people buying hot chocolate may want to add small marshmallows to the drink, so an increase in hot chocolate sales may actually be causing the increase in marshmallow sales.
Finding relationships with the help of the correlation coefficient is a very good way to notice that there is a connection between variables. To determine whether the relationship is causal, the next step is usually to carefully design an experiment that isolates and precisely controls only one of the variables to determine how it affects the other variable.
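The snowboard and hot chocolate example can be explored with a small simulation. The sketch below uses made-up numbers: the model in which temperature drives both rentals and hot chocolate sales, the coefficients, and the names `pearson_r` and `simulate_days` are all assumptions chosen for illustration, not real data. It compares the correlation coefficient when temperature varies from day to day with the correlation coefficient when temperature is held fixed.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_days(n_days, rng, fixed_temp=None):
    """Return (rentals, cocoa_sales) for n_days of made-up data.

    Temperature drives both quantities; neither one affects the other.
    """
    rentals, cocoa = [], []
    for _ in range(n_days):
        # Either hold temperature fixed or let it vary from day to day.
        temp = fixed_temp if fixed_temp is not None else rng.uniform(-10, 20)
        # Hypothetical model: colder days raise both counts,
        # and each count gets its own independent random noise.
        rentals.append(100 - 3 * temp + rng.gauss(0, 10))
        cocoa.append(200 - 5 * temp + rng.gauss(0, 15))
    return rentals, cocoa

rng = random.Random(0)  # seeded so the run is repeatable
r_varying = pearson_r(*simulate_days(500, rng))
r_fixed = pearson_r(*simulate_days(500, rng, fixed_temp=0))

print(f"temperature varies:     r = {r_varying:.2f}")
print(f"temperature held fixed: r = {r_fixed:.2f}")
```

In this made-up model, the two variables are strongly related only while the third variable is free to change. Once temperature is held fixed, the relationship largely disappears, which is consistent with neither variable causing the other.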
- causal relationship
A causal relationship is one in which a change in one of the variables causes a change in the other variable.
- correlation coefficient
A number between -1 and 1 that describes the strength and direction of a linear association between two numerical variables. The sign of the correlation coefficient is the same as the sign of the slope of the best fit line. The closer the correlation coefficient is to 0, the weaker the linear relationship. When the correlation coefficient is closer to 1 or -1, the linear model fits the data better.
The first figure shows a correlation coefficient which is close to 1, the second a correlation coefficient which is positive but closer to 0, and the third a correlation coefficient which is close to -1.
- negative relationship
A relationship between two numerical variables is negative if an increase in the data for one variable tends to be paired with a decrease in the data for the other variable.
- positive relationship
A relationship between two numerical variables is positive if an increase in the data for one variable tends to be paired with an increase in the data for the other variable.
- strong relationship
A relationship between two numerical variables is strong if the data is tightly clustered around the best fit line.
- weak relationship
A relationship between two numerical variables is weak if the data is loosely spread around the best fit line.
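The glossary terms above can be checked numerically. The sketch below assumes that the best fit line is the least-squares line and uses three small made-up data sets; the function name `correlation_and_slope` and all data values are illustrative only. It computes the correlation coefficient and the slope of the best fit line for each data set.

```python
def mean(values):
    return sum(values) / len(values)

def correlation_and_slope(xs, ys):
    """Return (r, slope), where slope belongs to the least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / (sxx * syy) ** 0.5   # correlation coefficient
    slope = sxy / sxx              # slope of the least-squares line
    return r, slope

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5, 6]
tight_up = [2.1, 3.9, 6.2, 8.0, 9.9, 12.1]    # tightly clustered, rising
loose_up = [5, 1, 7, 3, 9, 4]                 # loosely spread, rising overall
tight_down = [12.0, 10.1, 7.9, 6.1, 4.0, 2.1] # tightly clustered, falling

for label, y in [("strong positive", tight_up),
                 ("weak positive", loose_up),
                 ("strong negative", tight_down)]:
    r, slope = correlation_and_slope(x, y)
    print(f"{label}: r = {r:.2f}, slope = {slope:.2f}")
```

In each case the sign of r matches the sign of the slope, and the tightly clustered data sets give values of r much closer to 1 or -1 than the loosely spread one.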