
REGRESSION ANALYSIS

SUBJECT: BASIC AND INFERENTIAL STATISTICS


REPORTER: SHIELA ROBETH B. VINARAO
TOPIC: REGRESSION ANALYSIS
PROFESSOR: DR. GLORIA T. MIANO

THE SIMPLE LINEAR REGRESSION ANALYSIS

DEFINITION

The simple linear regression analysis is used when there is a significant relationship between the x and y variables.

This is used in predicting the value of a dependent variable y given the value of the independent variable x.

EXAMPLE

Suppose the advertising cost and sales are correlated; then we can predict future sales in terms of advertising cost.

Another type of problem that uses regression analysis arises when values of a variable corresponding to years are given: it is possible to predict the value of that variable several years hence or several

FORMULA
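
A minimal sketch of the least-squares formulas, assuming the usual textbook form. The symbols a (intercept) and b (slope) follow the later slides; n is the number of (x, y) pairs, and the exact layout of the slide's own formula is an assumption.

```latex
\hat{y} = a + bx
\qquad
b = \frac{n\sum xy - \sum x \sum y}{n\sum x^{2} - \left(\sum x\right)^{2}}
\qquad
a = \bar{y} - b\,\bar{x}
```

Here $\hat{y}$ is the predicted value of the dependent variable for a given value x of the independent variable.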

Example

Consider the following data:

[Scatter plot of the data, y-axis versus x-axis, with a straight line drawn through the points]

The straight line indicates that the two variables are to some extent LINEARLY RELATED.

The variable we are basing our predictions on is called the predictor variable and is referred to as x. When there is only one predictor variable, the prediction method is called SIMPLE LINEAR REGRESSION.


x    y    xy    x²
1    6    6     1
2    4    8     4
3    3    9     9
4    5    20    16
5    4    20    25
6    2    12    36

$\sum xy = 75$, $\bar{x} = 3.5$, $\bar{y} = 4$
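
The table values can be turned into a regression line with a few lines of Python. This is a sketch under the assumption that the usual least-squares formulas for the slope b and intercept a apply; the variable names are illustrative and not from the slides.

```python
# x and y columns read from the example table.
x = [1, 2, 3, 4, 5, 6]
y = [6, 4, 3, 5, 4, 2]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))  # matches the 75 in the table
sum_x2 = sum(xi ** 2 for xi in x)

x_bar = sum_x / n  # 3.5, as in the table
y_bar = sum_y / n  # 4.0, as in the table

# Least-squares slope and intercept (textbook formulas, assumed here).
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = y_bar - b * x_bar

print(f"sum_xy = {sum_xy}, x_bar = {x_bar}, y_bar = {y_bar}")
print(f"y = {a:.2f} + ({b:.4f})x")
```

The slope comes out negative, so in this small data set y tends to fall as x rises.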

Example

A study is conducted on the relationship between the number of absences and the grades of the students in English. Determine the relationship using the following data.

Number of Absences    Grades in English
1                     90
2                     85
2                     80
3                     75
3                     80
8                     65
6                     70
1                     95
4                     80
5                     80
5                     75
1                     92
2                     89
...                   80
...                   65

Scatter Diagram

[Scatter diagram: Grades in English (y), scaled 0 to 100, against Number of absences (x), scaled 0 to 10]

Solving by the Stepwise Method

Problem : Is there a significant relationship between the number of absences and the grades of 15 students in English class?

Hypotheses :

Ho: There is no significant relationship between the number of absences and the grades of 15 students in English class.

H1: There is a significant relationship between the number of absences and the grades of 15 students in English class.

Level of significance :

df = n - 2
   = 15 - 2
   = 13

STATISTICS

Pearson Product Moment Coefficient of Correlation

$\sum x^2 = 281$, $\sum y^2 = 97335$, $\sum xy = 3950$
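
The correlation coefficient can be checked with a short script. This is a sketch, not the slide's own computation. It assumes the standard Pearson formula, and it assumes that the last two absence counts, which are not legible in the data table, are 1 and 9. Those two values were chosen because they make the sums of x² and xy equal to the 281 and 3950 shown above.

```python
import math

# Absences (x) and grades (y) for the 15 students.
# ASSUMPTION: the last two absence counts (1 and 9) are inferred, not read
# from the table; they reproduce the sums 281 and 3950.
absences = [1, 2, 2, 3, 3, 8, 6, 1, 4, 5, 5, 1, 2, 1, 9]
grades = [90, 85, 80, 75, 80, 65, 70, 95, 80, 80, 75, 92, 89, 80, 65]

n = len(absences)
sum_x = sum(absences)
sum_y = sum(grades)
sum_x2 = sum(x ** 2 for x in absences)
sum_y2 = sum(y ** 2 for y in grades)
sum_xy = sum(x * y for x, y in zip(absences, grades))

# Pearson product moment coefficient of correlation (standard formula).
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

df = n - 2  # degrees of freedom used to look up the critical value

print(f"sum_x2 = {sum_x2}, sum_y2 = {sum_y2}, sum_xy = {sum_xy}")
print(f"r = {r:.2f}, df = {df}")
```

Under these assumptions r is negative and large in magnitude, which agrees with the direction of the relationship described in the conclusion.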

Decision Rule :

If the computed r value is greater than or beyond the critical value, reject Ho.

Conclusion :

The computed r value is beyond the critical value at the chosen level of significance with 13 degrees of freedom, so the null hypothesis is rejected.

This means that there is a significant relationship between the number of absences and the grades of students in English. Since the value of r is negative, it implies that students who had more absences had lower grades.

Suppose we want to predict the grade of a student who has incurred 7 absences. To get the value of y, the simple linear regression analysis will be used.

69 is the predicted grade of the student with 7 absences.
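
The prediction can be reproduced as follows. This is a sketch assuming the textbook least-squares formulas. As a further assumption, the last two absence counts (1 and 9) are inferred rather than read from the data table, and `predict_grade` is a hypothetical helper name.

```python
# Absences (x) and grades (y) for the 15 students.
# ASSUMPTION: the last two absence counts (1 and 9) are inferred values.
absences = [1, 2, 2, 3, 3, 8, 6, 1, 4, 5, 5, 1, 2, 1, 9]
grades = [90, 85, 80, 75, 80, 65, 70, 95, 80, 80, 75, 92, 89, 80, 65]

n = len(absences)
sum_x = sum(absences)
sum_y = sum(grades)
sum_xy = sum(x * y for x, y in zip(absences, grades))
sum_x2 = sum(x ** 2 for x in absences)

# Slope b and intercept a of the regression line y = a + bx.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * (sum_x / n)


def predict_grade(num_absences):
    """Return the grade predicted by the fitted line for a given number of absences."""
    return a + b * num_absences


predicted = predict_grade(7)
print(f"y = {a:.2f} + ({b:.2f})x")
print(f"Predicted grade for 7 absences: {predicted:.0f}")
```

With these inputs the predicted grade rounds to 69, the value reported on the slide.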

Remarks:

It is important to remember that the values of a and b are only estimates of the corresponding parameters $\alpha$ and $\beta$.

To justify the assumption of linearity, a test for linearity of regression should be performed.

If there are two or more independent variables, the regression equation becomes $y = a + b_1x_1 + b_2x_2 + \dots + b_kx_k$.

Significance Test in Simple Linear Regression

The significance of the slope of the regression line is tested to determine whether the regression model is usable.

If the slope is not equal to zero, then we can use the regression model to predict the dependent variable for any value of the independent variable.

If the slope is equal to zero, we do not use the model to make predictions.

Reading the scatter plot amounts to determining whether or not the slope of the line of best fit is significantly different from that of a horizontal line.

A horizontal line means there is no association between the two variables, that is, $\beta = 0$.

In testing for significance in simple linear regression, the null hypothesis is H0: $\beta = 0$ and the alternative hypothesis is H1: $\beta \neq 0$.

[Figure: Slope of Linear Regression (y-axis versus x-axis)]

A horizontal line means there is no association between two variables.

FORMULA 1

The t-test is conducted for testing the significance of r to determine if the relationship is not a zero correlation.

FORMULA 2

The t-test is conducted for testing the significance of r to determine if the relationship is not a zero correlation.

Where:

$s_b$ is the estimated standard deviation of $b$
$s$ is the standard deviation of the $y$ values about the regression line.
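
The two test statistics are commonly written as shown here. This is a sketch assuming the usual textbook forms, with n the number of pairs, r the correlation coefficient, b the slope, and $\hat{y}$ the value predicted by the regression line. The exact notation on the slides is an assumption.

```latex
% Formula 1: t-test for the significance of r, with df = n - 2
t = r\sqrt{\frac{n-2}{1-r^{2}}}

% Formula 2: t-test for the significance of the slope b, with df = n - 2
t = \frac{b}{s_b},
\qquad
s_b = \frac{s}{\sqrt{\sum (x-\bar{x})^{2}}},
\qquad
s = \sqrt{\frac{\sum (y-\hat{y})^{2}}{n-2}}
```

In both cases the computed t is compared with the critical t value at the chosen level of significance.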

Example

Given are two sets of data on the number of customers (in hundreds) and sales (in thousands of pesos) for a given period of time from ten eateries. Find the equation of the regression line which can predict the amount of sales from the number of customers. Can we conclude that we can use the model to make such a prediction?

Eatery    Customers (x)    Sales (y)
1         2                58
2         6                105
3         8                88
4         8                118
5         12               117
6         16               137
7         20               157
8         20               169
9         22               149
10        26               202

x     y      xy      x²     y²
2     58     116     4      3364
6     105    630     36     11025
8     88     704     64     7744
8     118    944     64     13924
12    117    1404    144    13689
16    137    2192    256    18769
20    157    3140    400    24649
20    169    3380    400    28561
22    149    3278    484    22201
26    202    5252    676    40804

Given:

x     y      xy      x²     y²       ŷ      (x − x̄)²    (y − ŷ)²
2     58     116     4      3364     70     144         144
6     105    630     36     11025    90     64          225
8     88     704     64     7744     100    36          144
8     118    944     64     13924    100    36          324
12    117    1404    144    13689    120    4           9
16    137    2192    256    18769    140    4           9
20    157    3140    400    24649    160    36          9
20    169    3380    400    28561    160    36          81
22    149    3278    484    22201    170    64          441
26    202    5252    676    40804    190    144         144

Decision:

Since 8.62 is greater than 3.355, we reject H0 and accept H1. Thus, the obtained relationship is significant, or nonzero, at the .005 level.

We can conclude that we can use the model to predict sales from the number of customers.
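
The decision can be traced step by step in Python. This is a sketch assuming the slope t-test in its usual textbook form, t = b / s_b. The single-digit x values (2, 6, 8, 8) are recovered from the xy and x² columns of the table, and the critical value 3.355 is the one quoted in the decision.

```python
import math

# Customers in hundreds (x) and sales in thousands of pesos (y) for ten eateries.
customers = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
sales = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(customers)
x_bar = sum(customers) / n
y_bar = sum(sales) / n

# Sums of squared deviations and cross-products.
sxx = sum((x - x_bar) ** 2 for x in customers)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(customers, sales))

# Regression line y = a + bx.
b = sxy / sxx
a = y_bar - b * x_bar

# Standard deviation of the y values about the regression line.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(customers, sales))
s = math.sqrt(sse / (n - 2))

# Estimated standard deviation of b, then the t statistic.
s_b = s / math.sqrt(sxx)
t = b / s_b

critical_t = 3.355  # critical value quoted in the decision
reject_h0 = t > critical_t

print(f"y = {a:.0f} + {b:.0f}x")
print(f"t = {t:.2f}, reject H0: {reject_h0}")
```

The fitted line is y = 60 + 5x, and the computed t of about 8.62 exceeds 3.355, which matches the decision to reject H0.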

COMPUTATION USING MICROSOFT EXCEL

Pearson r
Syntax: +pearson(array1,array2) or +correl(array1,array2)

Slope b
Syntax: +slope(known_ys,known_xs)

Intercept a
Syntax: +intercept(known_ys,known_xs)

PEARSON r

SLOPE b

INTERCEPT a

THANK YOU