Does wine consumption cause a decrease in heart disease?
These questions reflect a desire to understand the
relationship between two variables.
What we need:
1. A plot/graph to view the relationship
2. Characteristics to describe
3. Measures of the characteristics
4. Method to make inferences about the relationship
Explanatory variable X
(independent variable)
Correlation & Regression
Do heavier people burn more energy?
[Figure: scatterplot against Mass (kg)]
[Figure: scatterplot of hrt_death rate vs. wine consumption (Alcohol)]
• Patterns:
• Form (clusters, scatter, linear..)
• Direction (positive, negative)
• Strength (how closely points follow form)
• Deviations:
• Outliers
Form: Linear is probably the most common form.
Strength: We can measure the strength of a linear relationship
…because our eyes can deceive us!!!
Correlation
…measures the direction and strength of a linear relationship
• Quantitative variables
• Linear relationships
• r has no units
• r can be between –1 and 1
• Positive r = positive association
• Negative r = negative association
• r = 0 means no linear association
• r is influenced by outliers
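The properties listed above can be checked numerically. The following is a minimal sketch, assuming the usual Pearson definition of r; the function name `pearson_r` and the small data sets are made up for illustration and are not from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data (not from the slides)
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]      # perfectly increasing line
y_down = [10, 8, 6, 4, 2]    # perfectly decreasing line

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)

# Changing the units of x (for example kg to g) leaves r unchanged
r_rescaled = pearson_r([v * 1000 for v in x], y_up)

# A single outlier can pull r far away from the rest of the pattern
r_with_outlier = pearson_r(x + [6], y_up + [0])
```

Comparing `r_up` with `r_with_outlier` shows how one point that breaks the pattern weakens the measured linear association, which is why the scatterplot should always be inspected alongside r.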
Do heavier people burn more energy?
Lean body mass vs. metabolic rate
[Figure: scatterplot of Rate (cal) vs. Mass (kg)]
[Figure: the same scatterplot of Rate (cal) vs. Mass (kg), with males marked + and females marked o]
[Figures: scatterplots of heart death rate vs. wine consumption (alcohol), including one with wine consumption on a 1 to 4 scale]
$b = r \, \dfrac{s_y}{s_x}$
b is the slope (rate of change in y when x increases)
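The slope formula can be verified against a direct least-squares calculation. This is a sketch under stated assumptions: s_x and s_y are taken to be sample standard deviations, the helper names are hypothetical, and the data are invented for illustration.

```python
import statistics

def slope_from_r(xs, ys):
    """Slope computed as b = r * s_y / s_x."""
    n = len(xs)
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    s_x = statistics.stdev(xs)   # sample standard deviation of x
    s_y = statistics.stdev(ys)   # sample standard deviation of y
    # Correlation as the average product of standardized deviations
    r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    return r * s_y / s_x

def slope_least_squares(xs, ys):
    """Slope computed directly as S_xy / S_xx."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical illustration data (not from the slides)
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b_from_r = slope_from_r(xs, ys)
b_direct = slope_least_squares(xs, ys)
```

Both routes give the same slope, which illustrates that the regression slope is the correlation rescaled from standard-deviation units back into the units of y per unit of x.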
[Figures: scatterplots of death rate vs. wine consumption (alcohol)]
4. Individual points that are extreme in the x direction
Do we have any influential points here?
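One way to judge whether a point is influential is to fit the line with and without it and compare the slopes. The sketch below assumes an ordinary least-squares fit; the function name and the data, including the extreme point, are invented for illustration and are not the wine data from the slides.

```python
def least_squares_slope(xs, ys):
    """Least-squares slope S_xy / S_xx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data with a clear negative trend
xs = [1, 2, 3, 4, 5]
ys = [5, 4, 3, 2, 1]

# One hypothetical extra point that is extreme in the x direction
extreme_x, extreme_y = 15, 10

slope_without = least_squares_slope(xs, ys)
slope_with = least_squares_slope(xs + [extreme_x], ys + [extreme_y])
```

In this made-up example the single x-extreme point is enough to change the sign of the fitted slope, so it would count as influential.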
[Figures: residual plots labeled "Ideal residual pattern" and "Increasing variation"; death rate vs. wine consumption alongside residuals vs. alcohol]
Regression Plot
C6 = 280.215 - 33.7666 C5
S = 40.0879   R-Sq = 42.0%   R-Sq(adj) = 37.5%
[Figure: regression plot of C6 vs. C5]
Residuals Versus C5
(response is C6)
[Figure: residuals vs. C5]
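The fitted equation above can be used to compute predicted values and residuals. The following sketch takes the intercept, slope, and R-Sq from the output shown; the observation passed to `residual` is hypothetical, and reading C5 as wine consumption and C6 as heart death rate is an assumption based on the axis ranges, not something the output states.

```python
import math

# Coefficients and R-Sq taken from the regression output above
INTERCEPT = 280.215
SLOPE = -33.7666
R_SQ = 0.420

def predict(c5):
    """Predicted C6 from the fitted line C6 = 280.215 - 33.7666 C5."""
    return INTERCEPT + SLOPE * c5

def residual(c5, observed_c6):
    """Residual = observed value minus predicted value."""
    return observed_c6 - predict(c5)

def r_from_r_squared(r_sq, slope):
    """For a one-predictor line, r is the square root of R-Sq with the sign of the slope."""
    return math.copysign(math.sqrt(r_sq), slope)

predicted_at_2 = predict(2.0)

# Hypothetical observation (not from the slides): C5 = 2.0, C6 = 230.0
example_residual = residual(2.0, 230.0)

r = r_from_r_squared(R_SQ, SLOPE)
```

A positive residual means the observed value lies above the fitted line. Plotting the residuals against C5, as in the residual plot, is how patterns such as increasing variation become visible.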