To comment on the relationship (with respect to the direction, form and strength) of the bivariate data using the scatter diagram
Lecture example 2
Comment on the correlation between the two variables based on the scatter
diagram below.
[Scatter diagrams; the surviving panel labels are "No clear correlation" and "Non-linear correlation".]
To calculate the product moment correlation coefficient using the formula or TI-84+
Lecture example 3
In a physical education class, the number of push-ups (x) and sit-ups (y) done by
a sample of ten randomly chosen students were recorded and summarized as
shown.
Student        1   2   3   4   5   6   7   8   9   10
Push-ups (x)   27  22  15  35  30  52  35  55  40  40
Sit-ups (y)    30  26  25  42  38  40  32  54  50  43
$$r = \frac{14257 - \dfrac{(351)(380)}{10}}{\sqrt{\left(13717 - \dfrac{(351)^2}{10}\right)\left(15298 - \dfrac{(380)^2}{10}\right)}} = 0.839$$
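The calculation above can be checked without a graphing calculator. The following is a minimal Python sketch, assuming the ten data pairs of Lecture example 3; the function name `pearson_r` and the variable names are illustrative and not part of the original notes.

```python
from math import sqrt

# Data from Lecture example 3.
push_ups = [27, 22, 15, 35, 30, 52, 35, 55, 40, 40]  # x
sit_ups = [30, 26, 25, 42, 38, 40, 32, 54, 50, 43]   # y


def pearson_r(xs, ys):
    """Product moment correlation coefficient from the summary sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    s_xy = sum_xy - sum_x * sum_y / n   # numerator of the formula
    s_xx = sum_x2 - sum_x ** 2 / n      # first bracket under the root
    s_yy = sum_y2 - sum_y ** 2 / n      # second bracket under the root
    return s_xy / sqrt(s_xx * s_yy)


r = pearson_r(push_ups, sit_ups)
print(round(r, 3))
```

The intermediate sums it computes (351, 380, 14257, 13717, 15298) are the same figures that appear in the formula.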
To find the equation of the regression line of y on x or the equation of the regression line of x on y using the given data list.
Lecture example 5
[Note: The regression line of y on x and the regression line of x on y are different unless r = ±1.]
Find the equation of the regression line of y on x and the regression line of x on y.
Student          1   2   3   4   5   6   7   8   9   10
Push-ups ( x )   27  22  15  35  30  52  35  55  40  40
Sit-ups ( y )    30  26  25  42  38  40  32  54  50  43
Solution
From TI-84+, the equation of the regression line of y on x is $y = 0.658x + 14.9$.
The equation of the regression line of x on y is $x = 1.07y - 5.60$.
[Graphs: regression line of y on x; regression line of x on y.]
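As a cross-check on the calculator output, both regression lines can be computed directly from the data list. This is a sketch only: the helper `regression_line` is a hypothetical name, and it uses the standard least squares slope $S_{xy}/S_{xx}$ with the line passing through the point of means.

```python
push_ups = [27, 22, 15, 35, 30, 52, 35, 55, 40, 40]  # x
sit_ups = [30, 26, 25, 42, 38, 40, 32, 54, 50, 43]   # y


def regression_line(explanatory, response):
    """Return (slope, intercept) of the least squares line of response on explanatory."""
    n = len(explanatory)
    mean_e = sum(explanatory) / n
    mean_r = sum(response) / n
    s_er = sum((e - mean_e) * (v - mean_r) for e, v in zip(explanatory, response))
    s_ee = sum((e - mean_e) ** 2 for e in explanatory)
    slope = s_er / s_ee
    intercept = mean_r - slope * mean_e  # the line passes through the means
    return slope, intercept


b, a = regression_line(push_ups, sit_ups)  # y on x: y = b*x + a
d, c = regression_line(sit_ups, push_ups)  # x on y: x = d*y + c

print(f"y on x: slope {b:.3f}, intercept {a:.1f}")
print(f"x on y: slope {d:.2f}, intercept {c:.2f}")
```

Swapping the roles of the two lists is all that distinguishes the two lines; the slopes are not reciprocals of each other unless r = ±1.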
To determine the appropriate regression line to use to predict a value; justify the choice of regression line used and comment on the reliability of the prediction
The choice of the regression line used depends on the context of the situation:
(a) If there is a clear indication that x is the independent variable, we always use the regression line of y on x for estimation, even when x is to be estimated for a given value of y.
(b) For cases where there is no clear independent variable, if we want to estimate y for a given value of x, we use the regression line of y on x. If we want to estimate x for a given value of y, we use the regression line of x on y.
Estimates using regression lines are only reliable if both of the following conditions are met:
(a) The value of |r| for the data is close to 1 (and the scatter diagram also suggests that there is a strong linear correlation).
(b) The estimation is done within the given range of values of the data (interpolation rather than extrapolation).
Lecture example 6
An electric fire was switched on in a cold room and the temperature of the
room was noted at 5-minute intervals.
Time, x (in minutes) from switching on fire   0    5    10   15   20   25    30    35    40
Temperature, y (in °C)                        0.4  1.5  3.4  5.5  7.7  9.7   11.7  13.5  15.4
Explain why the regression line of y on x rather than the regression line of x
on y should be used to predict the time that has passed after switching on the
fire if the temperature is 9 °C.
Solution
From the question, x is the controlled variable: y is measured at regular time
intervals of 5 minutes, so y is dependent on x. Thus,
the regression line of y on x should be used.
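To illustrate how the y on x line is used in reverse here, the sketch below fits it and then solves for x when y = 9. It assumes the nine readings were taken at 0, 5, ..., 40 minutes (the first two times are inferred from the 5-minute interval and the nine temperature readings); `regression_line` is a hypothetical helper name.

```python
# Assumption: readings at 0, 5, ..., 40 minutes (0 and 5 are inferred).
times = [0, 5, 10, 15, 20, 25, 30, 35, 40]                # x, controlled
temps = [0.4, 1.5, 3.4, 5.5, 7.7, 9.7, 11.7, 13.5, 15.4]  # y, measured


def regression_line(explanatory, response):
    """Return (slope, intercept) of the least squares line of response on explanatory."""
    n = len(explanatory)
    mean_e = sum(explanatory) / n
    mean_r = sum(response) / n
    s_er = sum((e - mean_e) * (v - mean_r) for e, v in zip(explanatory, response))
    s_ee = sum((e - mean_e) ** 2 for e in explanatory)
    slope = s_er / s_ee
    return slope, mean_r - slope * mean_e


# Fit y on x because x is the controlled variable ...
slope, intercept = regression_line(times, temps)

# ... then rearrange y = slope*x + intercept to estimate x for y = 9.
target_temp = 9.0
estimated_time = (target_temp - intercept) / slope
print(round(estimated_time, 1))
```

Since 9 °C lies inside the recorded temperature range, the estimate is an interpolation, which is one of the two reliability conditions stated earlier.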
To interpret the value of the slope and y-intercept in the context of the question
To find the equation of the regression line of y on x or the equation of the regression line of x on y using the data statistics, b and d values
Tutorial Example 4
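The following summarizes the data from 10 sets of lengths (x) and breadths (y) in mm:
$\sum x = 1782,\ \sum y = 1483,\ \ldots$

$$b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}} = \frac{311.4}{533.6} = 0.58358$$
$$y - \bar{y} = b\,(x - \bar{x}) \;\Rightarrow\; y = 0.584x + 44.3$$

Method 1
$$d = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum y^2 - \dfrac{\left(\sum y\right)^2}{n}} = \frac{311.4}{328.1} = 0.94910$$
$$x - \bar{x} = d\,(y - \bar{y}) \;\Rightarrow\; x = 0.949y + 37.4$$

Method 2
$$r^2 = bd$$
$$d = \frac{r^2}{b} = \frac{0.744231^2}{0.58358} = 0.94910$$
$$x - \bar{x} = d\,(y - \bar{y}) \;\Rightarrow\; x = 0.949y + 37.4$$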
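The arithmetic in this example can be reproduced from the summary statistics alone. The sketch below takes the quantities 311.4, 533.6 and 328.1 exactly as they appear in the working (the numerator and the two denominators of the formulas for b and d); the variable names `s_xy`, `s_xx` and `s_yy` are labels chosen here, not notation from the notes.

```python
from math import sqrt

n = 10
sum_x, sum_y = 1782, 1483

# Values taken from the worked solution:
s_xy = 311.4  # sum(xy) - sum(x)*sum(y)/n
s_xx = 533.6  # sum(x^2) - (sum(x))^2/n
s_yy = 328.1  # sum(y^2) - (sum(y))^2/n

mean_x, mean_y = sum_x / n, sum_y / n

b = s_xy / s_xx  # slope of the regression line of y on x
d = s_xy / s_yy  # slope of the regression line of x on y (Method 1)

# Both lines pass through (mean_x, mean_y).
y_intercept = mean_y - b * mean_x
x_intercept = mean_x - d * mean_y

# r^2 = b*d; r takes the common sign of b and d (positive here).
r = sqrt(b * d)
d_from_r = r ** 2 / b  # Method 2 recovers the same d

print(round(b, 5), round(d, 5), round(r, 6))
print(round(y_intercept, 1), round(x_intercept, 1))
```

Method 2 is useful when r and b are already known and the sum of squares of y is not.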
To find missing data using regression lines
Tutorial Example 8
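[Note: $\bar{x}$ and $\bar{y}$ lie on both regression lines]
FM2003/II/11OR(modified)
A random sample of eight pairs of values of x and y is used to obtain the
following equations of the regression lines of y on x and x on y respectively.
$$y = -\frac{7}{10}x + \frac{151}{10}, \qquad x = -\frac{7}{6}y + 20$$
x   10  11  12  11  17  14  19  $x_8$
y   …                           $y_8$

$$y = -\frac{7}{10}x + \frac{151}{10} \quad \text{---(1)} \qquad \text{and} \qquad x = -\frac{7}{6}y + 20 \quad \text{---(2)}$$
Solving (1) & (2) gives $\bar{x} = 13$, $\bar{y} = 6$.
$$\bar{x} = \frac{\sum x}{n} = \frac{94 + x_8}{8} = 13 \;\Rightarrow\; x_8 = 10$$
$$\bar{y} = \frac{\sum y}{n} = \frac{40 + y_8}{8} = 6 \;\Rightarrow\; y_8 = 8$$
The eighth pair of values is (10, 8).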
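The method relies on the point of means lying on both regression lines. A short Python sketch of the same steps follows, assuming the two line equations have negative slopes (consistent with the stated solution of 13 and 6) and using the sums 94 and 40 from the working; exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

# Regression lines (signs chosen to agree with the stated means 13 and 6):
#   (1) y = m1*x + c1    (2) x = m2*y + c2
m1, c1 = F(-7, 10), F(151, 10)
m2, c2 = F(-7, 6), F(20)

# Substitute (2) into (1): y = m1*(m2*y + c2) + c1, then solve for y.
mean_y = (m1 * c2 + c1) / (1 - m1 * m2)
mean_x = m2 * mean_y + c2

n = 8
known_x = [10, 11, 12, 11, 17, 14, 19]  # seven recorded x-values, sum 94
known_sum_y = 40                        # sum of the seven recorded y-values

# The total of all n values is n * mean, so the missing value is the difference.
x8 = n * mean_x - sum(known_x)
y8 = n * mean_y - known_sum_y
print(mean_x, mean_y, x8, y8)
```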
To perform transformation of data in order to obtain regression line
Relationship              Transformation          Linear Relationship
$y = ax^b$                Take ln of both sides   $\ln y = \ln a + b \ln x$
$y = ae^{bx}$             Take ln of both sides   $\ln y = \ln a + bx$
$y = \sqrt{ax + b}$       Square both sides       $y^2 = ax + b$
$y = \dfrac{1}{ax + b}$   Take reciprocal         $\dfrac{1}{y} = ax + b$, i.e., $\dfrac{1}{y}$ and x have a linear relationship.
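To make the first row of the table concrete, the sketch below linearises a power relationship and recovers its constants from the regression line of ln y on ln x. The data are hypothetical, generated exactly from $y = 2x^{1.5}$ purely for illustration, and `regression_line` is an assumed helper name.

```python
import math

# Hypothetical data generated from y = 2 * x**1.5 (illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x ** 1.5 for x in xs]


def regression_line(explanatory, response):
    """Return (slope, intercept) of the least squares line of response on explanatory."""
    n = len(explanatory)
    mean_e = sum(explanatory) / n
    mean_r = sum(response) / n
    s_er = sum((e - mean_e) * (v - mean_r) for e, v in zip(explanatory, response))
    s_ee = sum((e - mean_e) ** 2 for e in explanatory)
    slope = s_er / s_ee
    return slope, mean_r - slope * mean_e


# Transformation: ln y = ln a + b ln x, so regress ln y on ln x.
ln_x = [math.log(x) for x in xs]
ln_y = [math.log(y) for y in ys]
b_hat, ln_a_hat = regression_line(ln_x, ln_y)

# Undo the transformation of the intercept to recover a.
a_hat = math.exp(ln_a_hat)
print(round(a_hat, 6), round(b_hat, 6))
```

The same pattern applies to the other rows: transform the data as the table indicates, fit a straight line, then convert the fitted slope and intercept back to the constants of the original model.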
To decide on a model using (A) scatter diagrams and graphs of various models (i.e. checking concavity and whether the graph is increasing or decreasing), (B) r values (i.e. observing which model has |r| closer to 1).
(A)
Tutorial Example 7
The data shows the result of an experiment to investigate the relationship
between two variables x and t, where x is dependent on t.
x 22.5 25.0 28.0 30.5 38.0 40.5 42.5 48.0 54.5 55.0 70.0
t
6.3
4.0
(i) [Scatter diagram of the data.]
(ii) $x = a + bt^2$, $a > 0$, $b < 0$: Graph is concave downwards.
From the scatter diagram, the shape of the graph is concave upwards,
therefore the model $x = at^b$, $a > 0$, $b < 0$ is more appropriate to fit the
data points.
(B)
Lecture example 8
The following data were collected during an experiment which investigated the
average lifespan of plants, t days, as the pH, y, of the soil in which the plant was
grown varied.
y   4.5   5.2   6.1   6.5   7.0   7.3   8.5   9.5
t   1.14  1.20  1.26  1.29  1.33  1.35  1.42  1.48
State, with a reason, which of the following models is more appropriate to fit
the data points:
(a) $t = a + by$
(b) $t = ay^b$
Solution
$t = ay^b \;\Rightarrow\; \ln t = \ln a + b \ln y$
If $t = ay^b$ is a suitable model, ln y and ln t should have a strong linear
correlation.
If $t = a + by$ is a suitable model, y and t should have a strong linear correlation.
Using TI-84+, the value of r between ln y and ln t is 0.99953 and the value of
r between y and t is 0.9976. Since the value of r between ln y and ln t is closer
to 1, the model $t = ay^b$ should be more suitable than the model
$t = a + by$.
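The two r values quoted in the solution can be reproduced with a few lines of Python. This is a sketch assuming the eight (y, t) pairs of Lecture example 8; `pearson_r` is an illustrative helper, not a calculator function.

```python
import math

ph = [4.5, 5.2, 6.1, 6.5, 7.0, 7.3, 8.5, 9.5]                # y
lifespan = [1.14, 1.20, 1.26, 1.29, 1.33, 1.35, 1.42, 1.48]  # t


def pearson_r(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Model (a) t = a + b*y: correlate y with t directly.
r_linear = pearson_r(ph, lifespan)

# Model (b) t = a*y^b: correlate ln y with ln t.
r_power = pearson_r([math.log(v) for v in ph], [math.log(v) for v in lifespan])

better = "t = a*y^b" if abs(r_power) > abs(r_linear) else "t = a + b*y"
print(round(r_linear, 4), round(r_power, 5), better)
```

Both values are very close to 1 here, so the comparison rests on a small difference; checking the scatter diagram of each transformed data set is a sensible complement.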
To estimate the value of unknowns within the model through the regression line obtained after transformation
Solution
$x = at^b \;\Rightarrow\; \ln x = \ln a + b \ln t$
Generate the new transformed data ln x and ln t using the GC, and obtain the required
regression line.
To compare the sum of squares of residuals between the least squares regression line and other lines.
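For the comparison of sums of squares of residuals, a small numerical illustration may help. The sketch below reuses the push-up and sit-up data of Lecture example 3 and compares the least squares line of y on x with an arbitrary alternative line, $y = 0.5x + 20$, which is a hypothetical choice made here only for contrast.

```python
push_ups = [27, 22, 15, 35, 30, 52, 35, 55, 40, 40]  # x
sit_ups = [30, 26, 25, 42, 38, 40, 32, 54, 50, 43]   # y


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances from the points to y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Least squares line of y on x.
n = len(push_ups)
mean_x, mean_y = sum(push_ups) / n, sum(sit_ups) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(push_ups, sit_ups))
s_xx = sum((x - mean_x) ** 2 for x in push_ups)
ls_slope = s_xy / s_xx
ls_intercept = mean_y - ls_slope * mean_x

ssr_least_squares = sum_squared_residuals(push_ups, sit_ups, ls_slope, ls_intercept)

# Hypothetical alternative line, chosen only for comparison.
ssr_other = sum_squared_residuals(push_ups, sit_ups, 0.5, 20.0)

print(round(ssr_least_squares, 1), round(ssr_other, 2))
```

Whatever alternative line is tried, its sum of squares of residuals cannot be smaller than that of the least squares regression line of y on x, since that line is defined as the minimiser of this quantity.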