Statistics For Managers Using Microsoft Excel: The Simple Linear Regression Model and Correlation

Statistics for Managers
Using Microsoft Excel
Chapter 13
The Simple Linear Regression
Model and Correlation
© 1999 Prentice-Hall, Inc. Chap. 13 - 1
Chapter Topics
• Types of Regression Models
• Determining the Simple Linear Regression
Equation
• Measures of Variation in Regression and
Correlation
• Assumptions of Regression and Correlation
• Estimation of Predicted Values
• Correlation - Measuring the Strength of the
Association

Introduction to Regression
Models

Linear Regression Gauss-Markoff Assumptions
1. Normality
  • Y values are normally distributed for each X
  • Probability distribution of error is normal
2. Homoscedasticity (constant variance)
3. Independence of errors: $E(e_i e_j) = 0$ for $i \neq j$
4. Linearity: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
5. Variables are measured without error (NONSTOCHASTIC)
Linear Regression Gauss-Markoff Assumptions
If these assumptions hold, the formulas that we use to estimate the coefficients in a regression yield BLUE (Best Linear Unbiased Estimators).
Best = “Most Efficient” = smallest variance
Unbiased = Expected value of estimator = true population value
Normality & Constant Variance Assumptions
[Figure: error distributions f(e) at X1 and X2, plotted against X and Y]

Variation of Errors Around the Regression Line
• y values are normally distributed around the regression line.
• For each x value, the “spread” or variance around the regression line is the same.
• When is this realistic?
[Figure: error distributions f(e) at X1 and X2 around the regression line, plotted against X and Y]
Regression Models
• Answer ‘What is the relationship between the variables?’
• Equation used
  • 1 numerical dependent (response) variable: what is to be predicted, Y
  • 1 or more numerical or categorical independent (explanatory) variables: X

Specifying the Model

Model Specification Is Based on Theory
• Economic, Psychological & business theory
• Mathematical theory
• Previous research
• ‘Common sense’
We ASSUME causality flows from X to Y
Thinking Challenge:
Which Is More Logical?
[Figure: four candidate plots relating Sales and Advertising]

Types of Regression Models
• 1 Explanatory Variable: Simple (Linear or Non-Linear)
• 2+ Explanatory Variables: Multiple (Linear or Non-Linear)

Types of Regression Models
[Figure: four scatter plots. Panels: Positive Linear Relationship; Relationship NOT Linear; Negative Linear Relationship; No Relationship]

Linear Regression Model


Linear Equations
$Y = bX + a$
b = Slope = (Change in Y) / (Change in X)
a = Y-intercept
[Figure: straight line plotted against X and Y]
The Scatter Diagram
Plot of all $(X_i, Y_i)$ pairs
[Figure: scatter plot with X and Y axes each running from 0 to 60]

Simple Linear Regression Model
• Relationship Between Variables Is a Linear Function
• The Straight Line that Best Fits the Data
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
• $Y_i$ = Dependent (Response) Variable
• $\beta_0$ = Y intercept (Constant term)
• $\beta_1$ = Slope
• $X_i$ = Independent (Explanatory) Variable
• $\varepsilon_i$ = Random Error
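Population Linear Regression Model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ (Observed Value)
$\varepsilon_i$ = Random Error
$\mu_{Y|X} = \beta_0 + \beta_1 X_i$ (E(Y))
[Figure: observed values scattered around the population regression line, plotted against X and Y]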
Sample Linear Regression Model
$\hat{Y}_i = b_0 + b_1 X_i$
$\hat{Y}_i$ = Predicted Value of Y for observation i
$X_i$ = Value of X for observation i
$b_0$ = Sample Y-intercept used as estimate of the population $\beta_0$
$b_1$ = Sample Slope used as estimate of the population $\beta_1$
Estimating Parameters:
Least Squares Method

Thinking Challenge
How would you draw a line through the
points? How do you determine which line
‘fits best’?
Regression Applet

Least Squares
• ‘Best fit’ means the differences between actual Y values & predicted Y values are a minimum
• But positive differences offset negative ones, so the differences are squared before they are summed
What should we expect?
• If Y and X are not related, then E(Y|X) = E(Y): we should predict the same Y for every value of X.
[Figure: horizontal line at the mean of Y; Y = constant + (0)X = E(Y)]

What should we expect?
• If Y and X are related, then E(Y|X) ≠ E(Y): we should predict a different Y for every value of X.
• Therefore, the slope will not be zero: B ≠ 0
[Figure: sloped line with the mean of X and the mean of Y marked]

What should we
expect?
• At the mean of X, we will predict the
mean of Y. When X deviates from its
mean, we expect Y to also deviate from its
mean
• Therefore, we can also think about X
“explaining” deviation of Y from its mean
value.

Simple Linear Regression Equation: Example
You wish to examine the relationship between the square footage of produce stores and their annual sales. Sample data for 7 stores were obtained. Find the equation of the straight line that fits the data best.

Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

Scatter Diagram Example
[Figure: scatter plot of Annual Sales ($000) against Square Feet, with the means marked at $\bar{X} = 2350$ and $\bar{Y} = 5130$]
Excel Output

Equation for the Best Straight Line
$\hat{Y}_i = b_0 + b_1 X_i = 1636.415 + 1.487 X_i$
From Excel Printout:

               Coefficients
Intercept      1636.414726
X Variable 1   1.486633657

If X = 0, then $\hat{Y}$ = 1636.415. Realistic?
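The coefficients in the printout can be reproduced by hand. The following is a minimal sketch, assuming the standard closed-form least-squares formulas (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X); the function and variable names are illustrative and not part of the original slides.

```python
# Produce-store data from the example: square feet and annual sales ($000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]


def least_squares(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares of X about their means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx             # sample slope
    b0 = y_bar - b1 * x_bar    # sample intercept
    return b0, b1


b0, b1 = least_squares(square_feet, annual_sales)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

The fitted line always passes through the point of means, which is why the intercept is obtained from the two sample means once the slope is known.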
Graph of the Best Straight Line
[Figure: the fitted line $\hat{Y}_i = 1636.415 + 1.487 X_i$ drawn on the plot of Annual Sales ($000) against Square Feet, with the means 2350 and 5130 marked]

Interpreting the Results
$\hat{Y}_i = 1636.415 + 1.487 X_i$
• The slope of 1.487 means that for each increase of one unit in X, Y is estimated to increase by 1.487 units.
• For each increase of 1 square foot in the size of the store, the model predicts that expected annual sales increase by 1.487 thousand dollars, that is, by $1,487.

Does X explain a significant
portion of the variation in
Y?

Explaining Variation in
Y
• If X and Y have no relationship, we
should predict the mean of Y for every
X value.
• We would like to measure whether
knowing the value of X helps us explain
why Y differs from its mean value.

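Measures of Variation: The Sum of Squares
$SST = \sum (Y_i - \bar{Y})^2$
$SSE = \sum (Y_i - \hat{Y}_i)^2$
$SSR = \sum (\hat{Y}_i - \bar{Y})^2$
[Figure: at a given $X_i$, the distances between $Y_i$, the fitted value $\hat{Y}_i = b_0 + b_1 X_i$, and the mean $\bar{Y}$]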
The Sum of Squares
SST = Total Sum of Squares
• measures the variation of the $Y_i$ values around their mean $\bar{Y}$
SSR = Regression Sum of Squares
• explained variation attributable to the relationship between X and Y
SSE = Error Sum of Squares
• variation attributable to factors other than the relationship between X and Y

The Sum of Squares
SST = Total Sum of Squares
• This is the identical measure that we used in ANOVA
SSR = Regression Sum of Squares
• We called this Sum of Squares Among in ANOVA
SSE = Error Sum of Squares
• We called this Sum of Squares Within in ANOVA

Measures of Variation
The Sum of Squares: Example
Excel Output for Produce Stores

             df   SS                  MS         F             Significance F
Regression   1    30380456.12 (SSR)   30380456   81.17909015   0.000281201
Residual     5    1871199.595 (SSE)   374239.9
Total        6    32251655.71 (SST)

Interpreting Anova
Results
• The F-test tests the null hypothesis that
the regression does not explain a
significant proportion of the variation in
Y
• The degrees of freedom for the F-test of a
simple regression are 1 and n-2
• In this example, F=81.2 with 1 and 5
degrees of freedom.
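To make the ANOVA decomposition concrete, the sketch below recomputes SST, SSR, SSE and the F statistic from the seven produce-store observations. It assumes the usual definitions (F = MSR / MSE with 1 and n - 2 degrees of freedom); variable names are illustrative only.

```python
# Produce-store data: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# The three sums of squares.
sst = sum((yi - y_bar) ** 2 for yi in y)               # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained by X
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained

# Mean squares and the F statistic with 1 and n - 2 degrees of freedom.
msr = ssr / 1
mse = sse / (n - 2)
f_stat = msr / mse

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"F(1, {n - 2}) = {f_stat:.3f}")
```

Up to floating-point rounding, SSR and SSE add up to SST, which is the identity the ANOVA table relies on.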
The Coefficient of Determination
$r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
Measures the proportion of variation that is explained by the independent variable X in the regression model

Coefficients of Determination ($r^2$) and Correlation (r)
[Figure: four scatter plots, each with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$. Panels: $r^2 = 1$, $r = +1$; $r^2 = 1$, $r = -1$; $r^2 = .8$, $r = +0.9$; $r^2 = 0$, $r = 0$]
R and F connection
$F = \frac{SSR/(2-1)}{SSE/(n-2)} = \frac{r^2 \cdot SST/(2-1)}{(1-r^2) \cdot SST/(n-2)} = \frac{r^2}{1-r^2} \cdot \frac{n-2}{1}$
The F-test can be written in terms of $r^2$.
The F-test is the test that $r^2 = 0$.

Standard Error of Estimate
$S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}}$
The standard deviation of the variation of observations around the regression line

Example
Excel Output for Produce Stores

Regression Statistics
Multiple R          0.9705572
R Square            0.94198129   ($r^2$ = .94)
Adjusted R Square   0.93037754
Standard Error      611.751517   ($S_{YX}$)
Observations        7

94% of the variation in annual sales can be explained by the variability in the size of the store as measured by square footage.
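The regression statistics above follow directly from the sums of squares. The sketch below is one way to check them, assuming the definitions given earlier ($r^2$ = SSR / SST and $S_{YX}$ as the square root of SSE / (n - 2)); it also rebuilds F from $r^2$ alone. The names are illustrative.

```python
import math

# Produce-store data: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ssr = sst - sse

r_squared = ssr / sst                             # coefficient of determination
r = math.copysign(math.sqrt(r_squared), b1)       # r takes the sign of the slope
s_yx = math.sqrt(sse / (n - 2))                   # standard error of the estimate
f_from_r2 = r_squared / (1 - r_squared) * (n - 2)  # F written in terms of r^2

print(f"r^2 = {r_squared:.4f}, r = {r:.4f}, S_YX = {s_yx:.2f}")
print(f"F from r^2 = {f_from_r2:.3f}")
```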
Inferences about the Slope: t Test
• t Test for a Population Slope: Is There a Linear Relationship Between X & Y?
• Null and Alternative Hypotheses
  H0: $\beta_1 = 0$ (No Linear Relationship)
  H1: $\beta_1 \neq 0$ (Linear Relationship)
• Test Statistic:
  $t = \frac{b_1 - \beta_1}{S_{b_1}}$, where $S_{b_1} = \frac{S_{YX}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$ and df = n - 2
Example: Produce Stores
Data for 7 Stores:

Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

Regression Model Obtained: $\hat{Y}_i = 1636.415 + 1.487 X_i$
The slope of this model is 1.487.
Is there a linear relationship between the square footage of a store and its annual sales?
Inferences about the Slope: t Test Example
H0: $\beta_1 = 0$
H1: $\beta_1 \neq 0$
$\alpha$ = .05
df = 7 - 2 = 5
Critical Value(s): ±2.5706 (reject H0 in either tail, .025 in each)

Test Statistic, From Excel Printout:
               t Stat      P-value
Intercept      3.6244333   0.0151488
X Variable 1   9.009944    0.0002812

Decision: Reject H0
Conclusion: There is evidence of a relationship.
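The t statistic in the printout can be recomputed from the data. This is a minimal sketch, assuming the formulas for $S_{YX}$ and $S_{b_1}$ stated earlier; the critical value 2.5706 is taken from the slide rather than computed, and the variable names are illustrative.

```python
import math

# Produce-store data: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate and standard error of the slope.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = math.sqrt(sse / (n - 2))
s_b1 = s_yx / math.sqrt(sxx)

# Test H0: beta_1 = 0 against a two-sided alternative.
t_stat = (b1 - 0) / s_b1
t_critical = 2.5706  # two-tailed, alpha = .05, df = 5 (value from the slide)
reject_h0 = abs(t_stat) > t_critical

print(f"S_b1 = {s_b1:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```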
Connection of F and t in simple regression
Excel Printout for Produce Stores

ANOVA
             df   SS            F          Significance F
Regression   1    30380456.12   81.17909   0.0002812
Residual     5    1871199.595
Total        6    32251655.71

               Coefficients   Standard Error   P-value     Lower 95%
Intercept      1636.41473     451.4953308      0.0151488   475.810926
X Variable 1   1.48663366     0.164999212      0.0002812   1.06249037

Note: The Significance F and the P-value for X Variable 1 are identical in simple regression!
The t test for $\beta_1 = 0$ is identical to the F test for $r^2 = 0$ in simple regression. The t statistic is the square root of the F statistic (t = 1.4866/0.1650 = 9.01): $F_{1,n-2} = t_{n-2}^2$.

Inferences about the Slope: Confidence Interval Example
Confidence Interval Estimate of the Slope: $b_1 \pm t_{n-2} S_{b_1}$
Excel Printout for Produce Stores

               Lower 95%    Upper 95%
Intercept      475.810926   2797.01853
X Variable 1   1.06249037   1.91077694

At the 95% level of confidence, the confidence interval for the slope is (1.062, 1.911), which does not include 0.
Conclusion: There is a significant linear relationship between annual sales and the size of the store.
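The interval for the slope can be checked with a few lines of arithmetic. The sketch below assumes the formula $b_1 \pm t_{n-2} S_{b_1}$ and uses the tabulated critical value 2.5706 for 5 degrees of freedom; names are illustrative.

```python
import math

# Produce-store data: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)  # standard error of the slope

t_critical = 2.5706                # 95% two-sided, df = 5 (from a t table)
margin = t_critical * s_b1
lower, upper = b1 - margin, b1 + margin

print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
print("Interval excludes 0:", not (lower <= 0 <= upper))
```

An interval that excludes 0 leads to the same decision as the two-sided t test at the 5% level.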
Slope estimates make line pivot around mean point
• Different estimates of B tilt the line around the mean point
• If B is different this will give small differences in the forecast for Y near the mean, but big differences away from the mean
[Figure: SALES against Square Footage, showing the regression line Y = 1636 + 1.49X together with the lines for the lower 95% estimate of B (1.06) and the upper 95% estimate of B (1.91)]

Estimation of Predicted Values
Confidence Interval Estimate for $\mu_{Y|X}$, the Mean of Y given a particular $X_i$
$\hat{Y}_i \pm t_{n-2} \cdot S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$
• $t_{n-2}$: t value from table with df = n - 2
• $S_{YX}$: standard error of the estimate
• Size of interval varies according to distance away from the mean, $\bar{X}$
Estimation of Predicted Values
Confidence Interval Estimate for Individual Response $Y_i$ at a Particular $X_i$
$\hat{Y}_i \pm t_{n-2} \cdot S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$
Addition of this 1 increases the width of the interval relative to that for the mean of Y

Confidence Bands
• Error associated with a forecast has two components:
  • Error at the mean (standard error of estimate)
  • Error in estimating B
• Therefore, the confidence intervals around forecasts will be larger as we move away from the mean of X
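Interval Estimates for Different Values of X
[Figure: the fitted line $\hat{Y}_i = b_0 + b_1 X_i$ with two bands around it, the confidence interval for the mean of Y and the confidence interval for an individual $Y_i$, with $\bar{X}$ and a given X marked on the X axis]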
Example: Produce Stores
Data for 7 Stores:

Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

Predict the annual sales for a store with 2000 square feet.
Regression Model Obtained: $\hat{Y}_i = 1636.415 + 1.487 X_i$
Estimation of Predicted Values: Example
Confidence Interval Estimate for $\mu_{Y|X}$
Find the 95% confidence interval for the average annual sales for stores of 2,000 square feet

Predicted Sales $\hat{Y}_i = 1636.415 + 1.487 X_i$ = 4610.42 ($000)
$\bar{X}$ = 2350.29   $S_{YX}$ = 611.75   $t_{n-2} = t_5$ = 2.5706
$\hat{Y}_i \pm t_{n-2} \cdot S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$ = 4610.42 ± 612.66
(Confidence interval for mean Y)

Estimation of Predicted Values: Example
Confidence Interval Estimate for Individual Y
Find the 95% confidence interval for annual sales of one particular store of 2,000 square feet

Predicted Sales $\hat{Y}_i = 1636.415 + 1.487 X_i$ = 4610.42 ($000)
$\bar{X}$ = 2350.29   $S_{YX}$ = 611.75   $t_{n-2} = t_5$ = 2.5706
$\hat{Y}_i \pm t_{n-2} \cdot S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$ = 4610.42 ± 1687.70
(Confidence interval for individual Y)
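Both interval formulas can be evaluated directly from the seven observations. The following sketch assumes the two formulas given for the mean response and for an individual response, and takes the critical value 2.5706 from the t table. Because it uses unrounded coefficients, its point prediction can differ slightly from a hand calculation with 1636.415 and 1.487. The helper name `interval_half_widths` is hypothetical.

```python
import math

# Produce-store data: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = math.sqrt(sse / (n - 2))
t_critical = 2.5706  # 95% two-sided, df = n - 2 = 5 (from a t table)


def interval_half_widths(x_new):
    """Return (half-width for the mean of Y, half-width for an individual Y)."""
    h = 1 / n + (x_new - x_bar) ** 2 / sxx
    half_mean = t_critical * s_yx * math.sqrt(h)
    half_individual = t_critical * s_yx * math.sqrt(1 + h)
    return half_mean, half_individual


x_new = 2000
y_pred = b0 + b1 * x_new
half_mean, half_individual = interval_half_widths(x_new)
print(f"Predicted sales at X = {x_new}: {y_pred:.2f} ($000)")
print(f"Mean of Y:    +/- {half_mean:.2f}")
print(f"Individual Y: +/- {half_individual:.2f}")
```

The individual interval is always the wider of the two because of the extra 1 under the square root, and both widen as `x_new` moves away from the mean of X.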
Example SPSS
Example EXCEL

Modern Portfolio Theory (MPT)
• A method to compare the riskiness of stocks (or mutual funds)
• Morningstar - service that rates stocks and computes risk measures
• Comparison:
  • Mutual fund excess return over treasury bill and
  • S&P 500 (the “benchmark” index) - 90 day treasury bill

Modern Portfolio Theory (MPT)
• Regression statistics cited:
  • R2 - low indicates stock fund does not move very well with an index. High indicates strong movement
  • Beta - this is not the standardized beta that we’ve talked about. However, it is normalized so that 1.1 means 10% better on the up-side (and 10% worse on the down-side) - it is just the coefficient in the regression

Example - Fidelity Fund Evaluator

Capital Asset Pricing
Model (CAPM)
 > 0 means excess return on market

 > 0 means more risk than market

CAPM estimates for
Beta by Industry

Simple Correlation
(Pearson Correlation)
(Product Moment Correlation)

Correlation: Measuring the Strength of Association
• Answer ‘How Strong Is the Linear Relationship Between 2 Variables?’
• Coefficient of Correlation Used
  • Population correlation coefficient denoted ρ (‘Rho’)
  • Values range from -1 to +1
  • Measures degree of association
• Is the Square Root of the Coefficient of Determination
Test of Coefficient of Correlation
• Tests If There Is a Linear Relationship Between 2 Numerical Variables
• Same Conclusion as Testing Population Slope $\beta_1$ and testing $r^2 = 0$
• Hypotheses
  • H0: ρ = 0 (No Correlation)
  • H1: ρ ≠ 0 (Correlation)

Test of Coefficient of Correlation
Test statistic for significance of r is either:
$F_{1,n-2} = \frac{r^2 (n-2)}{1-r^2}$ or $t_{n-2} = \frac{r \sqrt{n-2}}{\sqrt{1-r^2}}$
SAMPLE SIZE IS REALLY IMPORTANT FOR SIGNIFICANCE!
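The role of sample size is easy to see numerically. The sketch below evaluates the t statistic for the same correlation at several sample sizes; the correlation of 0.3 and the list of sample sizes are arbitrary illustrative choices, not values from the slides, and the function name is hypothetical.

```python
import math


def t_for_correlation(r, n):
    """t statistic with n - 2 degrees of freedom for testing rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def f_for_correlation(r, n):
    """Equivalent F statistic with 1 and n - 2 degrees of freedom."""
    return r ** 2 * (n - 2) / (1 - r ** 2)


# Same modest correlation, increasingly large samples (illustrative values).
for n in (7, 13, 30, 100, 208):
    t = t_for_correlation(0.3, n)
    print(f"n = {n:3d}: t = {t:.2f}, F = {f_for_correlation(0.3, n):.2f}")
```

The statistic grows with the square root of n - 2, so a correlation that is far from significant in a small sample can be clearly significant in a large one.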
Guessing Correlations

Rating Jobs Using r
N=13 jobs * = p<.05 (One-tail)

Rating Job dimensions
N=208 employers * = p<.01 (2-tail)

Connection to results
N=208 * = p<.05 (one-tail)

SPSS PRINTOUT EXAMPLE

Application of Correlation
Test Reliability and Validity

Reliability
• Reliability: Does the indicator render the same result over repeated trials?
  • Very large measurement error can threaten reliability. This will lead to low correlations or R2’s.
  • There are several measures of reliability: test-retest, split-half and coefficient Alpha are the most commonly used.

Reliability
• Coefficient Alpha measures the internal reliability of a summed scale. Are the scale items positively associated with each other?
  • You may need to reverse code items that are negatively correlated with each other.
  • Items might need to be excluded from a scale if a substantial improvement in alpha will result.
• A reliable scale might not be valid.

Reliability
• Example: How should results on questions be related to one another?
  • Persons who score high overall should get the question correct more often than those who score low overall (+ correlation of question results)
  • Negative correlation makes test questions “unreliable”

Satisfaction Questions
SATADVN 66a. I am satisfied with advanced opportunities
Measurement Level: Ordinal
1 Not satisfied at all
6 Very satisfied
SATBENFT 66b. I am satisfied with benefits
SATCLTRE 66c. I am satisfied with company culture
SATRECGN 66d. I am satisfied with individual recognition
SATPAY 66e. I am satisfied with pay
SATCONDT 66f. I am satisfied with working conditions
SATPEOPL 66g. I am satisfied with the "people"/coworkers
SATWORK 66h. I am satisfied with the "work" I do
SATSTRTG 66i. I am satisfied with overall corporate strategy
SATMNGR 66i. I am satisfied with my manager
SATCOMP 66k. I am satisfied with total compensation (pay and benefit
SATEVAL 66l. I am satisfied with the performance evaluation process
SATTRAN 66m. I am satisfied with the training programs

Correlation Matrix
SATADVN SATBENFT SATCLTRE SATRECGN SATPAY
SATADVN 1.0000
SATBENFT .2220 1.0000
SATCLTRE .5136 .3559 1.0000
SATRECGN .5403 .3044 .5579 1.0000
SATPAY .4381 .3015 .3817 .4849 1.0000
SATCONDT .4202 .3172 .5136 .4755 .3879
SATPEOPL .2336 .2715 .3782 .3390 .2488
SATWORK .5353 .1683 .4167 .3766 .3145
SATSTRTG .4905 .3233 .6131 .4636 .3582
SATMNGR .3710 .2009 .3665 .4831 .2907
SATCOMP .4170 .4621 .3861 .4707 .7996
SATEVAL .4727 .2714 .4830 .5549 .4410
SATTRAN .4323 .2470 .3959 .4263 .3304

Correlation Matrix
SATCONDT SATPEOPL SATWORK SATSTRTG SATMNGR
SATCONDT 1.0000
SATPEOPL .3756 1.0000
SATWORK .3898 .2645 1.0000
SATSTRTG .4869 .2998 .4717 1.0000
SATMNGR .3544 .2711 .3068 .3246 1.0000
SATCOMP .3913 .2627 .3003 .3751 .3197
SATEVAL .4240 .2656 .3497 .4585 .3909
SATTRAN .3543 .2795 .3507 .4058 .2986
SATCOMP SATEVAL SATTRAN
SATCOMP 1.0000
SATEVAL .4739 1.0000
SATTRAN .3320 .4713 1.0000

Scale Results
Scale Scale Corrected
Mean Variance Item- Squared Alpha
if Item if Item Total Multiple if Item
Deleted Deleted Correlation Correlation Deleted
SATADVN 50.2318 100.8919 .6523 .4893 .8791

SATBENFT 48.4199 112.5294 .4201 .2963 .8896
SATCLTRE 49.3321 104.6498 .6795 .5300 .8781
SATRECGN 49.6284 103.3635 .7040 .5287 .8767
SATPAY 49.6863 104.1664 .6047 .6704 .8815
SATCONDT 49.1083 105.9671 .6149 .4005 .8811
SATPEOPL 48.6147 112.8664 .4281 .2261 .8892
SATWORK 49.3662 105.8659 .5359 .3675 .8852
SATSTRTG 49.1749 106.6603 .6434 .4830 .8802
SATMNGR 49.0763 106.0940 .4974 .2858 .8876
SATCOMP 49.2123 104.9356 .6338 .7048 .8801
SATEVAL 49.8835 103.4273 .6452 .4514 .8794
SATTRAN 49.8659 106.3384 .5440 .3223 .8846
Alpha for scale =.8906
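A scale alpha of this kind can be computed from raw item scores. The sketch below assumes the standard formula for coefficient (Cronbach's) alpha, k/(k - 1) times one minus the ratio of summed item variances to the variance of the total score. The slides do not state the formula, and the small data set is made up purely for illustration; it is not the survey data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for a summed scale.

    `items` is a list of items, each a list of scores with one entry per
    respondent (all items must cover the same respondents in the same order).
    """
    k = len(items)
    item_variances = sum(variance(scores) for scores in items)
    totals = [sum(respondent) for respondent in zip(*items)]
    return k / (k - 1) * (1 - item_variances / variance(totals))


# Hypothetical scores for 6 respondents on three 1-6 satisfaction items.
sat_pay = [5, 4, 4, 2, 1, 3]
sat_benefits = [6, 5, 5, 2, 1, 4]
sat_work = [6, 5, 4, 2, 2, 3]

alpha = cronbach_alpha([sat_pay, sat_benefits, sat_work])
print(f"alpha = {alpha:.3f}")
```

Alpha approaches 1 when the items move together across respondents, which is the sense in which it measures internal consistency.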

Feelings about job
FEELJOB 1.How do you feel about your job
1 I like my job a great deal
2 I am somewhat satisfied with my job
3 I don't strongly like or dislike my job
4 I am somewhat dissatisfied with my job
5 I don't like my job at all
JOBINTER 8. I feel my job is interesting and challenging
1 Strongly disagree
2 Disagree
3 Somewhat disagree
4 Somewhat agree
5 Agree
6 Strongly agree
JOBGROW 11. My job is helping me grow personally and professionally
SLVEPROB 12. I feel empowered to solve problems
QSTNVP 13a. I feel I can question a company policy or practice

Correlation Matrix
FEELJOB JOBINTER JOBGROW SLVEPROB
FEELJOB 1.0000
JOBINTER -.7033 1.0000
JOBGROW -.6182 .7152 1.0000
SLVEPROB -.5211 .6084 .6762 1.0000
QSTNVP -.2964 .3465 .3901 .427

Scale Results
Mean Variance Item- Squared Alpha
if Item if Item Total Multiple if Item
Deleted Deleted Correlation Correlation Deleted
FEELJOB 16.2317 20.6597 -.6654 .5245 .8140

JOBINTER 14.2311 8.5268 .5217 .6349 .1858
JOBGROW 14.3501 7.7780 .6354 .6205 .0780
SLVEPROB 14.2142 8.3096 .6493 .5169 .1025
QSTNVP 14.4856 9.0389 .4250 .2035 .2661

Revised Scale
RECODE feeljob (1=5) (5=1) (2=4) (4=2).

FEELJOB 16.2317 20.6597 .6654 .5245 .8140

JOBINTER 15.9382 17.9115 .7471 .6349 .7871
JOBGROW 16.0572 17.7167 .7632 .6205 .7822
SLVEPROB 15.9213 19.2072 .7041 .5169 .8008
QSTNVP 16.1927 21.0990 .4285 .2035 .8761
Alpha = .8456
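The effect of the RECODE step can be illustrated on a toy scale. The sketch below assumes the standard coefficient alpha formula and reverses a 1-5 item with `6 - score`, which is the same mapping as (1=5) (5=1) (2=4) (4=2). All scores are invented for illustration and are not the survey data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha; `items` holds one list of scores per item."""
    k = len(items)
    item_variances = sum(variance(scores) for scores in items)
    totals = [sum(respondent) for respondent in zip(*items)]
    return k / (k - 1) * (1 - item_variances / variance(totals))


# Hypothetical data: feel_job is keyed 1 = likes job ... 5 = dislikes job,
# while the other two items are keyed so that high = favorable.
feel_job = [1, 2, 2, 4, 5, 3]
job_interesting = [6, 5, 5, 2, 1, 4]
job_growth = [6, 5, 4, 2, 2, 3]

alpha_before = cronbach_alpha([feel_job, job_interesting, job_growth])

# Reverse code the negatively keyed 1-5 item: 1<->5, 2<->4, 3 stays 3.
feel_job_reversed = [6 - score for score in feel_job]
alpha_after = cronbach_alpha([feel_job_reversed, job_interesting, job_growth])

print(f"alpha before recode: {alpha_before:.3f}")
print(f"alpha after recode:  {alpha_after:.3f}")
```

With the item left in its original direction it works against the other items and pulls alpha down (here even below zero); after reversal all items point the same way and alpha rises.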

Poor Correlation for one item

ETHCUST 35.3256 40.0477 .5121 .3338 .7016

FAIRSHFT 35.4128 39.6869 .4975 .2935 .7028
BLNCINTR 35.4387 38.5496 .6158 .5572 .6847
LEADVP 35.2014 40.2039 .5143 .3792 .7017
INTRASS 35.7308 36.3271 .6960 .5877 .6664
MNGRFAIR 35.1823 39.1670 .4901 .3676 .7033
ASKFBACK 35.4536 41.0221 .3511 .2960 .7279
RATEPROD 35.5887 47.6683 .1390 .0389 .7487
DAYSMISS 33.2262 42.5076 .1217 .0579 .7935
Scale Alpha = .7397

Validity
• Validity: Does the indicator measure what it is supposed to measure?
  • Validity may be judgmental. Statistical procedures are only one method of validation.
  • A valid scale must be reliable

Validity
• There are numerous ways of establishing validity:
  • Content validity - Does the scale include the obvious measures of the phenomenon?
  • Face validity - Is it valid on the surface?
  • Construct Validity - Is the scale consistent with accepted theoretical constructs?
• Replication across multiple settings can strengthen construct validity.

Validity
• Predictive validity/Concurrent validity: Depends on whether measurement is done with criterion and scale measured at the same time or at different times.
  • Convergent - Does a scale correlate strongly with theoretically related measures?
  • Discriminant - Does a scale correlate weakly with theoretically unrelated measures?

Statistical Validity Test
Coefficients

Model        B         Std. Error   Beta   t        Sig.
(Constant)   3.782     .059                64.536   .000
ENJYCUST     2.3E-02   .014         .037   1.651    .099

a. Dependent Variable: RATEPROD 60. How would your manager rate your productivity?
ENJYCUST 70. I enjoy working with our customers.

Chapter Summary
• Described Types of Regression Models
• Determined the Simple Linear Regression
Equation
• Provided Measures of Variation in Regression and
Correlation
• Stated Assumptions of Regression and Correlation
• Provided Estimation of Predicted Values
• Discussed Correlation - Measuring the Strength of
the Association
