PREPARED FOR :
PREPARED BY :
GS48233
MULTIPLE REGRESSION
HATCO management has long been interested in more accurately predicting the level
of business obtained from its customers in the attempt to provide a better basis for
production control and marketing efforts.
In doing multiple regression, the objective the researchers focus on is to predict the
level of product usage obtained from customers. To this end, researchers at HATCO
proposed that a multiple regression analysis be carried out, with product usage as the
dependent variable and the seven perception measures as independent variables.
The relationship among the seven independent variables and product usage was
assumed to be statistical, not functional, because it involved perceptions of
performance and may have had levels of measurement error.
a) Procedure for Standard Multiple Regression
1) From the menu at the top of the screen click on: Analyze, then click on Regression,
then on Linear.
2) Click on your dependent variable (e.g. Product Usage Level) and move it into the
Dependent box.
3) Click on your independent variables (Perception of HATCO - Delivery Speed, Price
Level, Price Flexibility, Manufacturers Image, Service, Salesforce Image and Product
Quality) and move them into the Independent box.
4) For Method, make sure Enter is selected (this will give you standard multiple
regression).
5) Click on the Statistics button.
a. Tick the boxes marked Estimates, Model fit and Descriptives.
b. Click on Continue.
6) Click on OK.
Descriptive Statistics
Correlation
The Descriptives option also gives you a correlation matrix, showing you the Pearson r
values between the variables (in the top part of this table).
Correlation between sets of data is a measure of how well they are related. The most common
measure of correlation in statistics is the Pearson correlation. The full name is the Pearson
Product Moment Correlation, or PPMC.
Pearson's correlation coefficient is a test statistic that measures the statistical
relationship, or association, between two continuous variables. It is widely regarded as the
best method of measuring the association between variables of interest because it is based
on the method of covariance. It gives information about the magnitude of the association, or
correlation, as well as the direction of the relationship.
The Pearson's r for the correlation between Service (IV) and Usage Level (DV) in our
example is 0.701.
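Pearson's r can be computed directly from two columns of ratings. The sketch below uses plain Python and small made-up data, not the actual HATCO file:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Illustrative (hypothetical) ratings, not the HATCO data:
service = [4.1, 5.0, 3.2, 6.0, 4.8]
usage = [32.0, 43.0, 21.0, 55.0, 40.0]
r = pearson_r(service, usage)  # always falls between -1 and +1
```

A value of +1 means a perfect positive linear relationship, -1 a perfect negative one, and 0 no linear relationship, which is exactly how the SPSS correlation matrix entries are read below.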
This means that there is a strong relationship between the two variables: changes in one
variable are strongly correlated with changes in the second variable. In our example,
Pearson's r is 0.701. This number is very close to 1, so we can conclude that there is a
strong relationship between Usage Level and Service. This means that the perception of
HATCO's service is strongly associated with the usage level. However, we cannot draw
any other conclusions about this relationship from this number alone.
An r close to 0, by contrast, means that there is a weak relationship between two
variables: changes in one variable are barely correlated with changes in the second
variable. If Pearson's r were 0.01, we would conclude that our variables were not
strongly correlated. In this example, Price Level is not correlated with changes in
Usage Level.
A positive correlation means that as one variable increases in value, the second variable
also increases in value; similarly, as one variable decreases in value, the second variable
also decreases in value. In our example, the Pearson's r value of 0.701 was positive. We
know this value is positive because SPSS did not put a negative sign in front of it, so
positive is the default. Since our example's Pearson's r is positive, we can conclude that
when Service (our first variable) increases, Usage Level (our second variable) also
increases.
A negative correlation means that as one variable increases in value, the second variable
decreases in value. In our example, the Pearson's r between Product Quality and Usage
Level is -0.192, so we could conclude that when Product Quality (our first variable)
increases, Usage Level (our second variable) tends to decrease.
Multicollinearity. The correlations between the variables in your model are provided
in the table labelled Correlations. Check that your independent variables show at
least some relationship with your dependent variable (above .3 preferably). In this
case 3 of the 7 scales (Delivery Speed, Price Flexibility and Service) correlate
substantially with Usage Level (.676, .559 and .701 respectively). Meanwhile, 4 of the
7 scales (Price Level, Manufacturer Image, Salesforce Image and Product Quality)
show relationships with the dependent variable below .3 (.082, .224, .256 and -.192
respectively).
Also check that the correlation between each of your independent variables is not too
high. Tabachnick and Fidell (2001, p. 84) suggest that you think carefully before
including two variables with a bivariate correlation of, say, .7 or more in the same
analysis. If you find yourself in this situation you may need to consider omitting
one of the variables or forming a composite variable from the scores of the two
highly correlated variables. In the example presented here the correlation between
the Salesforce Image variable and the Manufacturer Image variable is .788, which is
more than .7; strictly this suggests omitting one of the two or combining them,
although for illustration all variables are retained in the analyses that follow.
Model Summary
ANOVAa

Model 1          Sum of Squares    df    Mean Square    F         Sig.
  Regression     6198.677           7    885.525        45.252    .000b
  Residual       1800.323          92     19.569
  Total          7999.000          99

a. Dependent Variable: Usage Level
b. Predictors: (Constant), Product Quality, Service, Salesforce Image, Price Flexibility, Price Level, Manufacturer Image, Delivery Speed
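The F statistic in the ANOVA table is just the ratio of the two mean squares, each of which is a sum of squares divided by its degrees of freedom. A quick check using the table's own numbers:

```python
# Recomputing the F statistic from the ANOVA table's sums of squares
ss_reg, df_reg = 6198.677, 7    # Regression row
ss_res, df_res = 1800.323, 92   # Residual row

ms_reg = ss_reg / df_reg        # mean square = SS / df  -> 885.525
ms_res = ss_res / df_res        # -> 19.569
f_stat = ms_reg / ms_res        # -> about 45.25, matching the SPSS output
```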
Coefficientsa
(B, Std. Error: unstandardized coefficients; Beta: standardized coefficients)

Model 1               B          Std. Error    Beta     t         Sig.
  (Constant)          -10.187    4.977                  -2.047    .044
  Delivery Speed        -.058    2.013         -.008     -.029    .977
  Price Level           -.697    2.090         -.093     -.333    .740
  Price Flexibility     3.368     .411          .520     8.191    .000
  Manufacturer Image    -.042     .667         -.005     -.063    .950
  Service               8.369    3.918          .699     2.136    .035
  Salesforce Image      1.281     .947          .110     1.352    .180
  Product Quality        .567     .355          .100     1.595    .114

a. Dependent Variable: Usage Level
Variables Entered/Removeda

Model   Variables Entered    Variables Removed    Method
1       Service              .                    Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
2       Price Flexibility    .                    Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
3       Salesforce Image     .                    Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).

a. Dependent Variable: Usage Level
Model Summary
ANOVAa

Model            Sum of Squares    df    Mean Square    F          Sig.
1  Total         7999.000          99
2  Regression    6036.513           2    3018.256       149.184    .000c
   Residual      1962.487          97      20.232
   Total         7999.000          99
3  Regression    6145.700           3    2048.567       106.115    .000d
   Total         7999.000          99
Coefficientsa
Variables Entered/Removeda

Model   Variables Entered                          Variables Removed     Method
1       Product Quality, Service, Salesforce
        Image, Price Flexibility, Price Level,
        Manufacturer Image, Delivery Speedb        .                     Enter
2       .                                          Delivery Speed        Backward (criterion: Probability of F-to-remove >= .100).
3       .                                          Manufacturer Image    Backward (criterion: Probability of F-to-remove >= .100).
4       .                                          Price Level           Backward (criterion: Probability of F-to-remove >= .100).
5       .                                          Product Quality       Backward (criterion: Probability of F-to-remove >= .100).
Model Summary
ANOVAa

Model            Sum of Squares    df    Mean Square    F          Sig.
1  Total         7999.000          99
2  Regression    6198.661           6    1033.110        53.367    .000c
   Residual      1800.339          93      19.358
   Total         7999.000          99
3  Regression    6198.591           5    1239.718        64.726    .000d
   Residual      1800.409          94      19.153
   Total         7999.000          99
4  Regression    6176.787           4    1544.197        80.506    .000e
   Residual      1822.213          95      19.181
   Total         7999.000          99
5  Regression    6145.700           3    2048.567       106.115    .000f
   Total         7999.000          99
Coefficientsa
(B, Std. Error: unstandardized coefficients; Beta: standardized coefficients)

Model                 B         Std. Error    Beta    t         Sig.
   Product Quality    .403       .317         .071    1.273     .206
5  (Constant)        -6.520     3.247                 -2.008    .047
Excluded Variablesa
b) Interpretation of Output from Standard Multiple Regression
As with the output from most of the SPSS procedures, there are lots of rather confusing
numbers generated as output from regression.
Coefficientsa
(columns: unstandardized B and Std. Error; standardized Beta; t; Sig.; 95% confidence interval for B; zero-order, partial and part correlations; collinearity Tolerance and VIF)

                       B       Std. Error   Beta      t        Sig.    CI Lower   CI Upper   Zero-order   Partial   Part    Tolerance   VIF
Delivery Speed         .240    .180         .370     1.333     .186     -.118       .598       .651         .138     .062    .028      35.747
Price Level            .176    .187         .246      .942     .349     -.195       .547       .028         .098     .044    .032      31.597
Price Flexibility      .290    .037         .470     7.882     .000      .217       .363       .525         .635     .366    .608       1.645
Manufacturer Image     .429    .060         .567     7.183     .000      .310       .547       .476         .599     .334    .347       2.879
Service                .132    .351         .116      .376     .708     -.565       .828       .631         .039     .017    .023      43.834
Salesforce Image      -.196    .085        -.177    -2.315     .023     -.364      -.028       .341        -.235    -.108    .371       2.697
Product Quality       -.046    .032        -.085    -1.446     .152     -.109       .017      -.283        -.149    -.067    .623       1.606
The Tolerance value for each variable is calculated using the formula 1 - R squared,
where R squared comes from regressing that independent variable on all the other
independent variables. If this value is very small (less than .10), it indicates that the
multiple correlation with the other variables is high, suggesting the possibility of
multi-collinearity. The other value given is the VIF (Variance Inflation Factor), which
is just the inverse of the Tolerance value (1 divided by Tolerance). VIF values above
10 would be a concern here, indicating multi-collinearity.
I have quoted commonly used cut-off points for determining the presence of multi-
collinearity (tolerance value of less than .10, or a VIF value of above 10). These
values, however, still allow for quite high correlations between independent variables
(above .9), so you should take them only as a warning sign, and check the correlation
matrix.
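The two diagnostics are simple transformations of one another, so either can be recomputed from the other. A minimal sketch, using the Tolerance value SPSS reports for Delivery Speed:

```python
def tolerance_and_vif(r_squared):
    # Tolerance = 1 - R^2, where R^2 comes from regressing this predictor
    # on all the other predictors; VIF is simply the reciprocal of Tolerance.
    tol = 1 - r_squared
    return tol, 1 / tol

# Delivery Speed has Tolerance .028 in the table, so its R^2 on the other
# predictors is about .972:
tol, vif = tolerance_and_vif(0.972)   # tol = .028, vif = about 35.7
```

The VIF of roughly 35.7 agrees (up to rounding of the Tolerance value) with the 35.747 shown in the Coefficients table.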
In this example the Tolerance values for the Delivery Speed (.028), Price Level
(.032) and Service (.023) variables are less than .10; therefore, these variables
violate the multi-collinearity assumption. Meanwhile, the Tolerance values for the
Price Flexibility (.608), Manufacturer Image (.347), Salesforce Image (.371) and
Product Quality (.623) variables are not less than .10; therefore, these variables
do not violate the multi-collinearity assumption.
This is also supported by the VIF values: the Delivery Speed (35.747), Price Level
(31.597) and Service (43.834) variables show results above 10. Meanwhile, the
Price Flexibility (1.645), Manufacturer Image (2.879), Salesforce Image (2.697)
and Product Quality (1.606) variables show good results, below the cut-off of 10.
If you exceed these values in your own results, you should seriously consider
removing one of the highly inter-correlated independent variables from the model.
Therefore, you should consider removing the Delivery Speed, Price Level and
Service variables from the model.
One of the ways these assumptions can be checked is by inspecting the residuals
scatterplot and the Normal Probability Plot of the regression standardised residuals
that were requested as part of the analysis. These are presented at the end of the output. In
the Normal Probability Plot you are hoping that your points will lie in a reasonably straight
diagonal line from bottom left to top right. This would suggest no major deviations from
normality. In the Scatterplot of the standardised residuals (the second plot displayed) you are
hoping that the residuals will be roughly rectangularly distributed, with most of the scores
concentrated in the centre (along the 0 point). What you don't want to see is a clear or
systematic pattern to your residuals (e.g. curvilinear, or higher on one side than the other).
The presence of outliers can also be detected from the scatterplot. Tabachnick and Fidell
(2001) define outliers as cases that have a standardised residual (as displayed in the
scatterplot) of more than 3.3 or less than -3.3. With large samples, it is not uncommon to find
a number of outlying residuals. If you find only a few, it may not be necessary to take any
action. The resulting scatterplot is shown below:
The other information in the output concerning unusual cases is in the table titled
Casewise Diagnostics. This presents information about cases that have standardised
residual values above 3.0 or below -3.0. In a normally distributed sample we would
expect only about 1 per cent of cases to fall outside this range.
To check whether this strange case is having any undue influence on the results for
our model as a whole, we can check the value for Cook's Distance given towards the
bottom of the Residuals Statistics table. According to Tabachnick and Fidell (2001,
p. 69), cases with values larger than 1 are a potential problem. In our example the
maximum value for Cook's Distance is .100, suggesting no major problems. In your
own data, if you obtain a maximum value above 1 you will need to go back to your
data file and sort cases by the new variable that SPSS created at the end of your file.
Check each of the cases with values above 1; you may need to consider removing the
offending case or cases.
Residuals Statisticsa
Minimum Maximum Mean Std. Deviation N
Predicted Value 3.129 6.495 4.771 .7658 100
Std. Predicted Value -2.145 2.252 .000 1.000 100
Standard Error of Predicted Value    .070    .240    .108    .028    100
Adjusted Predicted Value 3.097 6.446 4.771 .7668 100
Residual -.9393 .7193 .0000 .3815 100
Std. Residual -2.374 1.818 .000 .964 100
Stud. Residual -2.519 1.884 .000 .999 100
Deleted Residual -1.0577 .7726 .0003 .4103 100
Stud. Deleted Residual -2.596 1.911 -.003 1.011 100
Mahal. Distance 2.109 35.390 6.930 5.043 100
Cook's Distance .000 .100 .009 .015 100
Centered Leverage Value .021 .357 .070 .051 100
a. Dependent Variable: Satisfaction Level
Look in the Model Summary box and check the value given under the heading R
Square. This tells you how much of the variance in the dependent variable
(Satisfaction Level) is explained by the model (which includes the variables of
Perception of HATCO - Delivery Speed, Price Level, Price Flexibility, Manufacturer
Image, Service, Salesforce Image and Product Quality).
Model Summaryb
The Adjusted R square statistic corrects this value to provide a better estimate of the
true population value. If you have a small sample you may wish to consider reporting
this value, rather than the normal R Square value. To assess the statistical significance
of the result it is necessary to look in the table labelled ANOVA. This tests the null
hypothesis that multiple R in the population equals 0. The model in this example
reaches statistical significance (Sig = .000, this really means p<.0005).
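Both values can be recomputed from the ANOVA sums of squares. A sketch using the standard-regression ANOVA table from earlier (SS regression 6198.677, SS total 7999.000, n = 100, seven predictors):

```python
def r_square(ss_regression, ss_total):
    # Proportion of variance in the DV explained by the model
    return ss_regression / ss_total

def adjusted_r_square(r2, n, k):
    # n = sample size, k = number of independent variables
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

r2 = r_square(6198.677, 7999.000)     # about .775
adj = adjusted_r_square(r2, 100, 7)   # about .758, always a little lower
```

The adjustment penalises models with many predictors relative to the sample size, which is why it is the better value to report with small samples.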
ANOVAa
The next thing we want to know is which of the variables included in the model contributed
to the prediction of the dependent variable. We find this information in the output box
labelled Coefficients. Look in the column labelled Beta under Standardised Coefficients.
To compare the different variables it is important that you look at the standardised
coefficients, not the unstandardised ones. Standardised means that these values for each of
the different variables have been converted to the same scale so that you can compare them.
If you were interested in constructing a regression equation, you would use the
unstandardized coefficient values listed as B.
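Using the B values from the first (standard) regression on Usage Level, the prediction equation can be written out directly. The ratings passed in below are hypothetical; treating predictors that are left out as 0 is only to keep the illustration short (in practice all seven ratings would be supplied):

```python
# Unstandardized B weights from the Coefficients table (DV = Usage Level)
b = {
    "constant": -10.187,
    "delivery_speed": -0.058,
    "price_level": -0.697,
    "price_flexibility": 3.368,
    "manufacturer_image": -0.042,
    "service": 8.369,
    "salesforce_image": 1.281,
    "product_quality": 0.567,
}

def predict_usage(perceptions):
    """perceptions: dict mapping predictor names to a customer's ratings."""
    return b["constant"] + sum(b[name] * value
                               for name, value in perceptions.items())

# Hypothetical customer rating two of the predictors:
estimate = predict_usage({"service": 4.0, "price_flexibility": 5.0})
```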
Coefficientsa
(columns: unstandardized B and Std. Error; standardized Beta; t; Sig.; 95% confidence interval for B; zero-order, partial and part correlations; collinearity Tolerance and VIF)

                       B       Std. Error   Beta      t        Sig.    CI Lower   CI Upper   Zero-order   Partial   Part    Tolerance   VIF
Delivery Speed         .240    .180         .370     1.333     .186     -.118       .598       .651         .138     .062    .028      35.747
Price Level            .176    .187         .246      .942     .349     -.195       .547       .028         .098     .044    .032      31.597
Price Flexibility      .290    .037         .470     7.882     .000      .217       .363       .525         .635     .366    .608       1.645
Manufacturer Image     .429    .060         .567     7.183     .000      .310       .547       .476         .599     .334    .347       2.879
Service                .132    .351         .116      .376     .708     -.565       .828       .631         .039     .017    .023      43.834
Salesforce Image      -.196    .085        -.177    -2.315     .023     -.364      -.028       .341        -.235    -.108    .371       2.697
Product Quality       -.046    .032        -.085    -1.446     .152     -.109       .017      -.283        -.149    -.067    .623       1.606
The Beta value for the Product Quality variable was comparatively low (-.085),
indicating that it made less of a contribution. For each of these variables, check the
value in the column marked Sig. This tells you whether the variable is making a
statistically significant unique contribution to the equation. This is very dependent
on which variables are included in the equation, and how much overlap there is
among the independent variables. If the Sig. value is less than .05 (.01, .001, etc.),
the variable is making a significant unique contribution to the prediction of the
dependent variable. If it is greater than .05, you can conclude that the variable is
not making a significant unique contribution to the prediction of your dependent
variable. This may be due to overlap with other independent variables in the model.
In this case, the Price Flexibility (.000), Manufacturer Image (.000) and Salesforce
Image (.023) variables made a unique, and statistically significant, contribution
to the prediction of Satisfaction Level scores. Meanwhile, Delivery Speed (.186),
Price Level (.349), Service (.708) and Product Quality (.152) did not make a
significant unique contribution to the prediction of the dependent variable. This
may be due to overlap with other independent variables in the model.
The other potentially useful piece of information in the Coefficients table is the Part
correlation coefficients. Just to confuse matters, you will also see these coefficients
referred to as semi-partial correlation coefficients (see Tabachnick and Fidell, 2001, p.
140). If you square this value (whatever it is called) you get an indication of the
contribution of that variable to the total R squared. In other words, it tells you how
much of the total variance in the dependent variable is uniquely explained by that
variable and how much R squared would drop if it wasn't included in your model. In
this example the Product Quality scale has a part correlation coefficient of -.067.
If we square this (multiply it by itself) we get .0045, indicating that Product Quality
uniquely explains only about 0.4 per cent of the variance in Satisfaction Level scores.
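The arithmetic is a one-liner, and worth doing explicitly because squaring small correlations gives much smaller variance shares than intuition suggests:

```python
# Squaring a part (semi-partial) correlation gives the share of total variance
# in the dependent variable uniquely explained by that predictor.
part_product_quality = -0.067          # Part column of the Coefficients table
unique_variance = part_product_quality ** 2   # about .0045, i.e. 0.4 per cent
```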
1) Hierarchical multiple regression
In hierarchical regression (also called sequential) the independent variables are entered into
the equation in the order specified by the researcher based on theoretical grounds. Variables
or sets of variables are entered in steps (or blocks), with each independent variable being
assessed in terms of what it adds to the prediction of the dependent variable, after the
previous variables have been controlled for. For example, if you wanted to know how well
optimism predicts life satisfaction, after the effect of age is controlled for, you would enter
age in Block 1 and then Optimism in Block 2. Once all sets of variables are entered, the
overall model is assessed in terms of its ability to predict the dependent measure. The relative
contribution of each block of variables is also assessed.
1) From the menu at the top of the screen click on: Analyze, and then click on
Regression, then on Linear.
2) Choose your continuous dependent variable (e.g. Satisfaction Level) and move it into
the Dependent box.
3) Move the variables you wish to control for into the Independent box (e.g. Usage
Level). This will be the first block of variables to be entered in the analysis (Block 1
of 1).
4) Click on the button marked Next. This will give you a second independent variables
box to enter your second block of variables into (you should see Block 2 of 2).
5) Choose your next block of independent variables (e.g. Perception of HATCO).
6) In the Method box make sure that this is set to the default (Enter). This will give you
standard multiple regressions for each block of variables entered.
7) Click on the Statistics button. Tick the boxes marked Estimates, Model fit, R squared
change, Descriptives, Part and partial correlations and Collinearity diagnostics. Click
on Continue.
8) Click on the Options button. In the Missing Values section click on Exclude cases
pairwise.
9) Click on the Save button. Click on Mahalanobis and Cook's. Click on Continue and
then OK.
Model Summaryc

Model   R       R Square   Adjusted    Std. Error of    R Square    F Change    df1    df2    Sig. F
                           R Square    the Estimate     Change                                Change
1       .711a   .505       .500        .6049            .505        100.016     1      98     .000
2       .895b   .801       .784        .3979            .296         19.363     7      91     .000

a. Predictors: (Constant), Usage Level
b. Predictors: (Constant), Usage Level, Price Level, Salesforce Image, Product Quality, Price Flexibility, Manufacturer Image, Delivery Speed, Service
c. Dependent Variable: Satisfaction Level
The output generated from this analysis is similar to the previous output, but with some extra
pieces of information. In the Model Summary box there are two models listed. Model 1
refers to the first block of variables that were entered (Usage Level), while Model 2 includes
all the variables that were entered in both blocks (Perception of HATCO : X1-X7 Variable).
Check the R Square values in the first Model Summary box. After the variables in Block 1
(Usage Level) have been entered, the overall model explains 50.5 per cent of the variance
(.505 × 100). After the Block 2 variables (Perception of HATCO) have also been included, the
model as a whole explains 80.1 per cent (.801 × 100). It is important to note that this second
R square value includes all the variables from both blocks, not just those included in the
second step. To find out how much of this overall variance is explained by our variables of
interest (Perception of HATCO) after the effects of Usage Level are removed, you need to
look in the column labelled R Square change.
In the output presented above you will see, on the line marked Model 2, that the R square
change value is .296. This means that the X1-X7 variables explain an additional 29.6 per
cent (.296 × 100) of the variance in Satisfaction Level, even when the effects of Usage
Level are statistically controlled for. This is a statistically significant contribution, as indicated
by the Sig. F change value for this line (.000). The ANOVA table indicates that the model as a
whole (which includes both blocks of variables) is significant [F (8, 91) = 45.84, p < .0005].
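The F change statistic reported in the Model Summary can be reproduced from the two R square values. A sketch using the numbers above (exact agreement depends on unrounded R squares, so the result is approximate):

```python
def f_change(r2_full, r2_reduced, n, k_full, k_added):
    """F statistic for the R-square change when k_added predictors join a model
    that ends up with k_full predictors in total (sample size n)."""
    df1 = k_added
    df2 = n - k_full - 1
    return ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)

# Block 1 (Usage Level): R^2 = .505; Blocks 1+2 (8 predictors): R^2 = .801
f_chg = f_change(0.801, 0.505, 100, 8, 7)   # about 19.3; SPSS reports 19.363
                                            # from the unrounded R-square values
```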
ANOVAa
Model Sum of Squares df Mean Square F Sig.
1 Regression 36.602 1 36.602 100.016 .000b
Residual 35.864 98 .366
Total 72.466 99
2 Regression 58.060 8 7.257 45.843 .000c
Residual 14.406 91 .158
Total 72.466 99
a. Dependent Variable: Satisfaction Level
b. Predictors: (Constant), Usage Level
c. Predictors: (Constant), Usage Level, Price Level, Salesforce Image, Product Quality, Price
Flexibility, Manufacturer Image, Delivery Speed, Service
2) Stepwise multiple regression
In stepwise regression the researcher provides SPSS with a list of independent variables and
then allows the program to select which variables it will enter, and in which order they go
into the equation, based on a set of statistical criteria. There are three different versions of this
approach: forward selection, backward deletion and stepwise regression. There are a number
of problems with these approaches, and some controversy in the literature concerning their
use (and abuse). Before using these approaches I would strongly recommend that you read up
on the issues involved (see p. 138 in Tabachnick & Fidell, 2001). It is important that you
understand what is involved, how to choose the appropriate variables and how to interpret the
output that you receive.