Chapter 9: Graphing Bivariate Numerical Data
Terminology
Independent variable
The variable you have control over, what you can choose and
manipulate. It is usually what you think will affect the
dependent variable.
Dependent variable
The variable you measure in the experiment; it responds to the
independent variable.
Scatterplots
[Figure: scatterplot with an axis labeled "HS GPA".]
Example
[Figure: example scatterplot; the vertical axis, y, runs from 1 to 10.]
4 Steps to Describe a Scatterplot
[Figure, repeated on four consecutive slides: scatterplot with the horizontal axis "Participation Rate (proportion of state's 2008 graduating seniors who took the SAT)", 0.0 to 0.9, and a vertical axis running from 450 to 575.]
Terminology
Correlation
Measures the strength and direction of the linear relationship
between two measurement (quantitative) variables.
Formulas
The correlation coefficient r for n paired observations:
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \, \sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
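As a rough illustration of the formula for r, the sketch below computes it directly from the sums of deviations. The function name `pearson_r` is hypothetical, and the five data points are the percent-of-drug and reaction-time values that appear in this deck's regression example; treat it as one possible implementation rather than a reference one.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient r for paired samples xs and ys (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    # Numerator: sum of products of deviations from the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator pieces: sums of squared deviations
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Data from the drug and reaction-time example in this deck
percent_drug = [1, 2, 3, 4, 5]
reaction_time = [1, 1, 2, 2, 4]
r = pearson_r(percent_drug, reaction_time)
print(round(r, 4))  # 0.9037
```

The result agrees with the value r = 0.9037 quoted in the regression example.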
Properties of r
1. $-1 \le r \le 1$
Visualizing Correlation Coefficients using Scatterplots
[Figure: six scatterplots with r = .1, .3, .5, .7, .9, and 1.]
[Figure: six scatterplots with r = -1, -.8, -.6, -.4, -.2, and 0.]
Strength of a correlation by the value of r:
Strong      -1 to -0.8, or 0.8 to 1
Moderate    -0.8 to -0.5, or 0.5 to 0.8
Weak        -0.5 to 0.5
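The strength scale above can be sketched as a small lookup. The function name `describe_strength` is hypothetical, and the handling of the boundary values is an assumption: the slide does not say whether exactly 0.5 or 0.8 belongs to the lower or the upper band, so this sketch assigns them to the upper band.

```python
def describe_strength(r):
    """Label a correlation coefficient using the 0.5 / 0.8 cutoffs from the scale."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # strength depends only on the magnitude; the sign gives direction
    if size >= 0.8:      # assumption: 0.8 itself counts as strong
        return "strong"
    if size >= 0.5:      # assumption: 0.5 itself counts as moderate
        return "moderate"
    return "weak"


print(describe_strength(-0.94))  # strong
print(describe_strength(0.7))    # moderate
print(describe_strength(0.3))    # weak
```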
Observation   Mare Weight   Foal Weight
1             556           129
2             638           119
3             588           132
4             550           123.5
5             580           112
6             642           113.5
7             568           95
8             642           104
9             556           104
10            616           93.5
11            549           108.5
12            504           95
13            515           117.5
14            551           128
15            594           127.5
Correlation
[Figure: scatterplot of Foal Weight (vertical axis, 90 to 130) against Mare Weight (horizontal axis, 500 to 650).]
r = .485
moderate positive relationship
r = -.94
Week 3: Chapter 11
Simple Linear Regression
Terminology
Independent variable
An independent variable is the variable you have
control over, what you can choose and manipulate.
It is usually what you think will affect the
dependent variable.
Dependent variable
A dependent variable is what you measure in the
experiment and what is affected during the experiment.
The dependent variable responds to the independent
variable.
Regression Analysis
Used to describe the relationship between a dependent
variable and one or more independent variables.
Linear Regression
Used to construct a simple formula that predicts the value of
one variable from the value of another variable,
OR
used to test whether, and how, a given variable is related
to another variable or variables.
Deterministic model
The relationship between x and y is exact, and there is no
allowance for error.
Deterministic relationship
y = 1.5x
where y is the reaction time (in seconds) and x is the percentage of drug in the blood.
Probabilistic model
The relationship between x and y includes a deterministic
component and a random error component.
Accounts for unexplained variation caused by unknown
phenomena or other variables.
Probabilistic relationship
y = 1.5x + random error
where y is the reaction time (in seconds) and x is the percentage of drug in the blood.
Probabilistic Relationship
y = Deterministic component + Random error
Assuming the random error averages zero, the mean value of y is
E(y) = Deterministic component
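To make the distinction concrete, here is a minimal simulation sketch of the two models for y = 1.5x. The normal error distribution, its standard deviation of 0.5, the seed, and the function names are all assumptions chosen for illustration; the slides only say that the error is random.

```python
import random


def deterministic_y(x):
    """Deterministic model: y is exactly 1.5x, with no allowance for error."""
    return 1.5 * x


def probabilistic_y(x, rng, error_sd=0.5):
    """Probabilistic model: deterministic component plus a random error.

    Assumption: the error is normal with mean 0 and standard deviation error_sd.
    """
    return deterministic_y(x) + rng.gauss(0.0, error_sd)


rng = random.Random(0)  # fixed seed so the run is repeatable
x = 2.0
draws = [probabilistic_y(x, rng) for _ in range(10_000)]
mean_y = sum(draws) / len(draws)

print(deterministic_y(x))  # 3.0
print(mean_y)              # close to 3.0, since the errors average out
```

Individual draws scatter around 3.0, but their average stays close to the deterministic component, which is what E(y) = Deterministic component expresses.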
Formulas
Line of Means
$E(y) = \beta_0 + \beta_1 x$
where $\beta_0$ is the y-intercept and $\beta_1$ is the slope.
Step 2
Use sample data to estimate the unknown parameters
($\beta_0$ and $\beta_1$) in the model
[Figure: scatterplot of reaction time, in seconds (vertical axis, 0 to 4.5), against percent of drug in bloodstream (horizontal axis).]
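Formulas
Data: observed responses $y_1, y_2, \ldots, y_n$
Model: $E(y) = \beta_0 + \beta_1 x$
Estimates: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
Deviation: $y_i - \hat{y}_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)$
SSE: $\sum_{i=1}^{n} [y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2$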
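The sum of squared errors can be sketched in a few lines. The function name `sse` is hypothetical. The data and the line with intercept -0.1 and slope 0.7 come from this deck's drug and reaction-time example; the second line, with intercept -1 and slope 1, is an arbitrary comparison line chosen only for illustration.

```python
def sse(b0, b1, xs, ys):
    """Sum of squared deviations of the observed ys from the line b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Data from the drug and reaction-time example in this deck
xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]

sse_fitted = sse(-0.1, 0.7, xs, ys)  # the example's fitted line
sse_other = sse(-1.0, 1.0, xs, ys)   # arbitrary comparison line (assumption)

print(round(sse_fitted, 2))  # 1.1
print(sse_other)             # 2.0
```

The fitted line has the smaller SSE, which is the sense in which the estimates are chosen: least squares picks the intercept and slope that make SSE as small as possible.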
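Formulas
Slope:
$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
OR
$\hat{\beta}_1 = r \frac{s_y}{s_x}$
y-intercept:
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$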
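The slope and y-intercept formulas can be checked numerically. The sketch below is one possible implementation with hypothetical function names (`fit_line`, `slope_from_r`); it uses the drug and reaction-time data and the summary statistics quoted in this deck's example.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope from sums of deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar  # the fitted line passes through (x_bar, y_bar)
    return intercept, slope


def slope_from_r(r, s_x, s_y):
    """Alternative slope formula: r times the ratio of standard deviations."""
    return r * s_y / s_x


xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]
b0, b1 = fit_line(xs, ys)
print(round(b0, 4), round(b1, 4))                      # -0.1 0.7
print(round(slope_from_r(0.9037, 1.5811, 1.2247), 2))  # 0.7
```

Both slope formulas give 0.7 on this data, up to rounding in the quoted summary statistics.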
Example
Percent of Drug, x   Reaction Time, y (seconds)
1                    1
2                    1
3                    2
4                    2
5                    4
Where:
$\bar{x} = 3$, $s_x = 1.5811$
$\bar{y} = 2$, $s_y = 1.2247$
$r = 0.9037$
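Answer a)
Slope: $\hat{\beta}_1 = r \frac{s_y}{s_x} = 0.9037 \times \frac{1.2247}{1.5811} = 0.7$
y-intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 2 - 0.7(3) = -0.1$
Fitted line: $\hat{y} = -0.1 + 0.7x$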
Answer b)
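Answer c)
$$SSE = \sum_{i=1}^{5} [y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2 = \sum_{i=1}^{5} [y_i - (-0.1 + 0.7 x_i)]^2$$
$$= [1 - (-0.1 + 0.7(1))]^2 + [1 - (-0.1 + 0.7(2))]^2 + [2 - (-0.1 + 0.7(3))]^2 + [2 - (-0.1 + 0.7(4))]^2 + [4 - (-0.1 + 0.7(5))]^2 = 1.10$$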
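Answer d)
[Figure: the fitted line $\hat{y} = -0.1 + 0.7x$ drawn over the data, with reaction time, in seconds, on the vertical axis (0 to 4.5) and percent of drug in bloodstream on the horizontal axis. Annotations mark the slope, $\hat{\beta}_1 = 0.7$, and the y-intercept, $\hat{\beta}_0 = -0.1$.]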
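Coefficient of Determination, $r^2$
$$r^2 = \frac{SS_{yy} - SSE}{SS_{yy}} = 1 - \frac{SSE}{SS_{yy}}, \quad \text{where } SS_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$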
High $r^2$
x provides important information about y
Predictions based on the model are more accurate
Low $r^2$
Knowing the value of x does not substantially improve
predictions of y
There may be no relationship between x and y, or it may be
more subtle than a linear relationship
Answer e)
$r^2 = (0.9037)^2 = 0.817$
Interpretation:
About 82% of the sample variation in reaction time (y)
can be explained by the fitted linear relationship
between reaction time (in seconds) and the percent of
drug received.
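As a cross-check on the 82% figure, the sketch below computes the coefficient of determination as one minus the ratio of SSE to the total variation in y, and compares it with the square of r. The function name `r_squared` is hypothetical; the data, the fitted line (intercept -0.1, slope 0.7), and r = 0.9037 are taken from the example.

```python
def r_squared(xs, ys, b0, b1):
    """Proportion of the sample variation in y explained by the line b0 + b1 * x."""
    y_bar = sum(ys) / len(ys)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)                     # total variation in y
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
    return 1 - sse / ss_yy


xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]
r2 = r_squared(xs, ys, -0.1, 0.7)

print(round(r2, 3))           # 0.817
print(round(0.9037 ** 2, 3))  # 0.817
```

Here the total variation is 6 and SSE is 1.10, so about 18% of the variation in reaction time is left unexplained by the fitted line.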