
Chapter 12 (Simple Regression)

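(a)
The sample correlation coefficient (measures the strength of the linear relationship):
\[
r = \frac{s_{xy}}{s_x s_y}
  = \frac{\sum (x-\bar{x})(y-\bar{y})/n}{\sqrt{\dfrac{\sum (x-\bar{x})^2}{n}}\;\sqrt{\dfrac{\sum (y-\bar{y})^2}{n}}}
  = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2 \,\sum (y-\bar{y})^2}}
\]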
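(Population) regression equation: \( y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \)
(See 12.4, page 411)
Estimated regression equation: \( \hat{y} = b_0 + b_1 x \)
where the regression coefficient is
\[
b_1 = \frac{s_{xy}}{s_x^2} = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (x-\bar{x})^2}
\]
and the intercept is \( b_0 = \bar{y} - b_1 \bar{x} \).

b) Test of the correlation coefficient: \( H_0: \rho = 0 \). Test statistic:
\[
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} \sim t(n-2)
\]

c) Test of the regression coefficient: \( H_0: \beta_1 = 0 \). Test statistic:
\[
t = \frac{b_1}{se(b_1)}
\]
where
\[
se(b_1) = \frac{s_e}{\sqrt{\sum (x_i-\bar{x})^2}}, \qquad
s_e = \sqrt{\frac{\sum (y_i-\hat{y}_i)^2}{n-2}}
\]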
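d) 12.4 The Explanatory Power of a Linear Regression Equation (see page 418)

The coefficient of determination:
\[
R^2 = 1 - \frac{SSE}{SST}, \qquad SST = \sum (y_i-\bar{y})^2, \qquad SSE = \sum (y_i-\hat{y}_i)^2
\]
SST = SSR + SSE, where \( SSR = \sum (\hat{y}_i-\bar{y})^2 \).
(See page 421)

e) Rule of thumb: r is judged significant if \( |r| > \dfrac{2}{\sqrt{n}} \).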
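The rule of thumb in e) is easy to check numerically. The following is a minimal Python sketch, assuming n = 8 observations and r = 0.774781 (the values that appear in Exercise 12.21 of this handout); the helper name `exceeds_rule_of_thumb` is made up for illustration.

```python
import math

def exceeds_rule_of_thumb(r, n):
    """Return True when |r| > 2 / sqrt(n) (hypothetical helper name)."""
    return abs(r) > 2 / math.sqrt(n)

n = 8           # sample size (assumed, taken from Exercise 12.21)
r = 0.774781    # sample correlation (assumed, taken from Exercise 12.21)
threshold = 2 / math.sqrt(n)

print(round(threshold, 4))           # 0.7071
print(exceeds_rule_of_thumb(r, n))   # True
```

Because 0.7748 exceeds 0.7071, the rule of thumb flags the correlation as significant for this sample size.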
12.21 (p. 417, modified) A corporation administers an aptitude test to all new sales representatives.
Management is interested in the extent to which this test is able to predict their eventual success.
The accompanying table records average weekly sales (in thousands of dollars) and aptitude test
scores for a random sample of eight representatives.
Weekly sales (y):  10  12  28  24  18  16  15  12
Test score (x):    55  60  85  75  80  85  65  60

a) Plot the data. Calculate the sample correlation coefficient, the regression coefficient, and the intercept.
Also, write down the estimated regression model. Estimate the weekly sales for a test score of
100. Also, determine SST, SSR, SSE, and \( R^2 \).
b) At the 5% level of significance, test the null hypothesis \( H_0: \rho = 0 \) against \( H_a: \rho > 0 \)
(that is, test the null hypothesis that the population correlation is 0, i.e., there is no relation between
weekly sales and test score, against an appropriate alternative hypothesis).
c) Test the following hypotheses at the 5% level of significance:
\( H_0: \beta_1 = 0 \) against \( H_1: \beta_1 > 0 \).
d) Find and interpret the coefficient of determination.
e) Find the 90% confidence interval for the regression coefficient.
f) Assess the significance of the population correlation coefficient.


Solution:
(a)
[Figure: Scatter diagram of weekly sales and test scores. Vertical axis: Weekly Sales (y), 0 to 30; horizontal axis: Scores, 50 to 90.]

   x       y      x-x̄       y-ȳ     (x-x̄)^2    (y-ȳ)^2   (x-x̄)(y-ȳ)
  55      10    -15.625    -6.875    244.1406    47.2656    107.4219
  60      12    -10.625    -4.875    112.8906    23.7656     51.7969
  85      28     14.375    11.125    206.6406   123.7656    159.9219
  75      24      4.375     7.125     19.1406    50.7656     31.1719
  80      18      9.375     1.125     87.8906     1.2656     10.5469
  85      16     14.375    -0.875    206.6406     0.7656    -12.5781
  65      15     -5.625    -1.875     31.6406     3.5156     10.5469
  60      12    -10.625    -4.875    112.8906    23.7656     51.7969
Mean:  70.625  16.875         Sum:  1021.875    274.875    410.6250

\[
r = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2 \,\sum (y-\bar{y})^2}}
  = \frac{410.625}{\sqrt{1021.875 \times 274.875}} = 0.774781
\]
\[
b_1 = \frac{s_{xy}}{s_x^2} = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (x-\bar{x})^2}
    = \frac{410.625}{1021.875} = 0.4018349 \approx 0.4018
\]
and
\[
b_0 = \bar{y} - b_1 \bar{x} = 16.875 - 0.4018349 \times 70.625 = -11.5046
\]
Estimated regression equation: \( \hat{y} = -11.5046 + 0.4018x \)
When x = 100, \( \hat{y} = -11.5046 + 0.4018(100) = 28.6754 \) (thousands of dollars), i.e. about $28,675.
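The hand calculation in part (a) can be reproduced in a few lines. This is a minimal Python sketch using only the standard library; the variable names (`sxx`, `syy`, `sxy`, `y_hat_100`) are chosen for illustration and are not part of the handout.

```python
import math

# Data from Exercise 12.21
x = [55, 60, 85, 75, 80, 85, 65, 60]   # aptitude test scores
y = [10, 12, 28, 24, 18, 16, 15, 12]   # weekly sales, thousands of dollars

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squared deviations and cross-products (the column totals of the table)
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)   # sample correlation coefficient
b1 = sxy / sxx                   # regression coefficient (slope)
b0 = y_bar - b1 * x_bar          # intercept
y_hat_100 = b0 + b1 * 100        # predicted weekly sales for a test score of 100

print(sxx, syy, sxy)                 # 1021.875 274.875 410.625
print(round(r, 6))                   # 0.774781
print(round(b1, 4), round(b0, 4))    # 0.4018 -11.5046
print(round(y_hat_100, 4))           # 28.6789
```

With unrounded coefficients the prediction is about 28.68 thousand dollars; the value 28.6754 arises when the coefficients are first rounded to four decimals.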

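Fitted values and residuals from \( \hat{y} = -11.5046 + 0.4018x \):

    ŷ         e = y - ŷ      e^2
  10.5963     -0.5963      0.3556
  12.6055     -0.6055      0.3666
  22.6514      5.3486     28.6078
  18.6330      5.3670     28.8044
  20.6422     -2.6422      6.9812
  22.6514     -6.6514     44.2408
  14.6147      0.3853      0.1485
  12.6055     -0.6055      0.3666
  Sum:         0.0000    109.8716

Hence \( SSE = \sum e_i^2 = 109.8716 \), \( SST = \sum (y_i-\bar{y})^2 = 274.875 \), and \( SSR = SST - SSE = 165.0034 \).

(b) \( H_0: \rho = 0 \) vs. \( H_a: \rho > 0 \)
Under \( H_0 \), the test statistic is
\[
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} \sim t(n-2)
\]
Let \( \alpha = 0.05 \).
Decision rule: reject \( H_0 \) at the 5% level of significance if
\( t_{cal} > t_{0.05}(6) = 1.943 \) (one-sided alternative).
Calculation:
\[
t_{cal} = \frac{0.774781\sqrt{6}}{\sqrt{1-0.774781^2}} = 3.002
\]
We reject \( H_0 \) at the 5% level of significance, since
\( t_{cal} = 3.002 > t_{crit} = 1.943 \); the same decision holds against the more conservative two-sided value \( t_{0.025}(6) = 2.4469 \).
That is, there is a positive relation between weekly sales and test score.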
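As a check on part (b), the test statistic can be recomputed directly from r and n. This is a sketch only: the critical values are typed in from a standard t-table rather than computed (an assumption of this example), and both the one-sided 5% value and the two-sided 5% value are shown.

```python
import math

r = 0.774781   # sample correlation from part (a)
n = 8          # number of representatives

# t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 = 6 degrees of freedom
t_cal = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Critical values for 6 degrees of freedom, copied from a t-table (assumed)
t_crit_one_sided = 1.943    # upper 5% point, t_{0.05}(6)
t_crit_two_sided = 2.4469   # upper 2.5% point, t_{0.025}(6)

print(round(t_cal, 3))             # 3.002
print(t_cal > t_crit_one_sided)    # True
print(t_cal > t_crit_two_sided)    # True
```

Either critical value leads to rejecting the null hypothesis of zero population correlation.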
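(c) \( H_0: \beta_1 = 0 \) vs. \( H_1: \beta_1 > 0 \)
\[
t = \frac{b_1}{se(b_1)}
  = \frac{0.40183}{\sqrt{109.8716/6}\,\big/\sqrt{1021.875}}
  = \frac{0.40183}{0.133865} = 3.002
\]
In simple regression this equals the t statistic for the correlation coefficient in part (b).
\( t_{0.05}(6) = 1.943 \) (and \( t_{0.025}(6) = 2.4469 \)).
Since 3.002 exceeds both values, we reject \( H_0 \) at the 5% level of significance.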
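The standard error of the slope and the resulting t statistic in part (c) can be verified as follows. This is a minimal sketch that takes SSE, the sum of squared x-deviations, and the unrounded slope 0.4018349 (as reported in the Excel summary output) as given inputs.

```python
import math

b1 = 0.4018349    # slope, unrounded value from the Excel summary output
sse = 109.8716    # sum of squared residuals
sxx = 1021.875    # sum of (x - x_bar)^2
n = 8

s_e = math.sqrt(sse / (n - 2))   # standard error of the estimate
se_b1 = s_e / math.sqrt(sxx)     # standard error of the slope
t_cal = b1 / se_b1               # t statistic with n - 2 = 6 degrees of freedom

print(round(s_e, 4))     # 4.2792
print(round(se_b1, 6))   # 0.133865
print(round(t_cal, 3))   # 3.002
```

These agree with the Standard Error, the slope's Standard Error, and the t Stat entries of the Excel summary output.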
(d)
\[
R^2 = 1 - \frac{SSE}{SST} = 1 - \frac{109.8716}{274.875} = 1 - 0.39971 = 0.600285 \approx 60\%
\]
About 60% of the total variation in weekly sales can be explained by the test score.
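The decomposition SST = SSR + SSE behind the coefficient of determination can be confirmed from the raw data. This is a minimal sketch; the variable names are illustrative, and SSR is computed here from its own definition rather than as SST minus SSE, so that the identity is actually tested.

```python
# Data from Exercise 12.21
x = [55, 60, 85, 75, 80, 85, 65, 60]   # test scores
y = [10, 12, 28, 24, 18, 16, 15, 12]   # weekly sales, thousands of dollars

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]     # fitted values

sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # regression sum of squares
r_squared = 1 - sse / sst

print(round(sst, 4), round(ssr, 4), round(sse, 4))   # 274.875 165.0034 109.8716
print(abs(sst - (ssr + sse)) < 1e-9)                 # True
print(round(r_squared, 4))                           # 0.6003
```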

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.7747809
R Square             0.6002854
Adjusted R Square    0.5336663
Standard Error       4.2792437
Observations         8

ANOVA
              df    SS          MS          F          Significance F
Regression    1     165.0034    165.0034    9.010709   0.023953
Residual      6     109.8716    18.31193
Total         7     274.875

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept      -11.50459      9.57453          -1.20158   0.274797   -34.9326    11.92346191
X Variable 1   0.4018349      0.133865         3.001784   0.023953   0.074278    0.729391779
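The confidence limits for the slope in the output above follow from b1 ± t × se(b1). The sketch below reproduces the 95% limits and, under the same formula, gives a 90% interval as asked in part (e). The critical values are typed in from a t-table with 6 degrees of freedom (1.943 is an assumption of this example; 2.4469 appears in the solution), and the function name `slope_interval` is made up for illustration.

```python
b1 = 0.4018349     # slope from the summary output
se_b1 = 0.133865   # standard error of the slope from the summary output

t_025 = 2.4469     # t_{0.025}(6), used for a 95% interval
t_05 = 1.943       # t_{0.05}(6), used for a 90% interval (from a t-table)

def slope_interval(b1, se_b1, t_crit):
    """Return (lower, upper) limits b1 -/+ t_crit * se(b1)."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin

ci95 = slope_interval(b1, se_b1, t_025)
ci90 = slope_interval(b1, se_b1, t_05)

print([round(v, 4) for v in ci95])   # [0.0743, 0.7294]
print([round(v, 4) for v in ci90])   # [0.1417, 0.6619]
```

The 95% limits match the Lower 95% and Upper 95% columns to four decimals; neither interval contains zero.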

RESIDUAL OUTPUT

Observation   Predicted Y   Residuals
1             10.59633      -0.59633
2             12.605505     -0.6055
3             22.651376      5.348624
4             18.633028      5.366972
5             20.642202     -2.6422
6             22.651376     -6.65138
7             14.614679      0.385321
8             12.605505     -0.6055

Using Calculator:
1) Reg mode: mode mode 2 1
2) Data entry (x, y):
55,10 M+   60,12 M+   and so on for all eight (x, y) pairs.
3) Results:
b0 (A): Shift 2 (press the right arrow of the gray Replay button twice) 1 = -11.5046
b1: 2 = 0.4018
r: 3 = 0.774781
\( \hat{y} = -11.5046 + 0.4018x \)
Calculator:
Find \( \bar{x} \) and s for the data set:
2, 5, 8
1) SD mode: press mode twice, then press 1
2) Data entry: 2 M+   5 M+   8 M+
3) Results:
\( \bar{x} \): Shift 2 1 = 5
s: Shift 2 3 = 3
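The calculator results for this small data set can be cross-checked in Python. A minimal sketch using the standard `statistics` module, where `stdev` is the sample standard deviation (divisor n - 1):

```python
import statistics

data = [2, 5, 8]

x_bar = statistics.mean(data)   # arithmetic mean
s = statistics.stdev(data)      # sample standard deviation, divisor n - 1

print(x_bar)   # 5
print(s)       # 3.0
```

Here the squared deviations are 9, 0, and 9, so the sample variance is 18 / 2 = 9 and s = 3.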
Assignment:
Refer to the data on employee absence rate in Exercise 12.25. Use the data file Employee Absence.
a) Find the predicted values \( \hat{y}_i \) and the residuals \( e_i \) for the least squares regression of change in
mean employee absence rate due to own illness on change in unemployment rate.
b) Find the sums of squares SST, SSR, and SSE and verify that
SST = SSR + SSE.
c) Using the results in part b), find and interpret the coefficient of determination.
