
Regression I:

Simple Regression

Class 21
Schedule for Remainder of Term
Nov 21: Simple Regression
Nov 28: Multiple Regression (Stats Take-Home Exercise assigned)
Nov 30: Moderated Multiple Regression
Dec. 5: Moderated Multiple Regression Wrap-Up, Review (Quiz 3)
Dec. 7: Research Questions
Dec. 19: Final Exam, Room 302, 1:30-4:30 or Regular Time? (Stats Take-Home Exercise due)
Caveat on Regression Sequence
Regression is a complex, rich topic – simple and multiple regression could be a course in itself.

We can cover only a useful introduction in 3 classes.

Will cover:
Simple Regression: Does income affect purchasing?
Multiple Regression: Do income and caffeine affect purchasing?
Moderated Multiple Regression: Does caffeine moderate the effect
of income on purchasing?

If Time Permits: Diagnostic stats, outliers, influential cases, cross-validation, regression plots, checking assumptions
ANOVA VS. REGRESSION

ANOVA: Do the means of Group A, Group B, and Group C differ? Categorical data only.

[Bar chart: aggression (0-35) among tennis fans, football fans, and hockey fans]

Regression: Does Variable X influence Outcome Y? Continuous data and categorical data.

[Bar chart: aggression (0-12) by frustration level: low, medium, high, very high, extreme]
Regression vs. ANOVA as
Vehicles for Analyzing Data

ANOVA: Sturdy, straightforward, robust to violations, easy to understand inner workings, but limited range of tasks.

Regression: Amazingly versatile, agile, super powerful, loaded with nuanced bells & whistles, but very sensitive to violations of assumptions. A bit more art.
Functions of Regression
1. Establishing relations between variables
Do frustration and aggression co-vary?
2. Establishing causality between variables
Does frustration (at Time 1) predict aggression (at Time 2)?
3. Testing how multiple predictor variables relate to, or predict, an outcome variable.
Do frustration, social class, and family stress predict aggression?
[additive effects]
4. Testing for unique effects of Variable A controlling for other variables.
Does frustration predict aggression, beyond social class and family stress?
5. Testing for moderating effects between predictors on outcomes.
Does frustration predict aggression, but mainly for people with more family
stress? [interactive effect]
6. Forecasting/trend analyses
If incomes continue to decline in the future by X amount, aggression will
increase by Y amount.
The Palace Heist: A True-Regression Mystery

Sterling silver from the royal palace is missing. Why?

Facts gathered during investigation:
A. General public given daily tours of palace
B. Reginald, the ADD butler, misplaces things
C. Prince Guido, the playboy heir, has gambling debts

Possible Explanations?
A. Public is stealing the silver
B. Reginald is misplacing the silver
C. Guido is pawning the silver
The Palace Heist: A True-Regression Mystery
Possible explanations:
A. Public is stealing silver
B. Reginald’s ADD leads to misplaced silver
C. Guido is pawning silver
Is it just one of these explanations, or a combination of them?
E.g., Public theft, alone, OR public theft plus Guido’s gambling?
If it is multiple causes, are they equally important or is one
more important than another?
E.g., Crowd size has a significant effect on lost silver, but is
less important than Guido’s debts.
Moderation: Do circumstances interact?
E.g., Does more silver get lost when Reginald’s ADD is severe,
but only when crowds are large?
Regression Can Test Each of These Possibilities,
And Can Do So Simultaneously

DATASET ACTIVATE DataSet1.
REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA COLLIN TOL CHANGE
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT missing.silver
  /METHOD=ENTER crowds.size       ← Variable 1
  /METHOD=ENTER reginald.ADD      ← Variable 2
  /METHOD=ENTER guido.debts      ← Variable 3
  /METHOD=ENTER crowds.reginald   ← Variable 1 X Variable 2 (interaction)
Why Do Bullies Harass Other Students?
Investigation shows that bullies often:
A. Are reprimanded by teachers for unruly behavior
B. Have a lot of family stress

Possible explanations for bullies’ behavior?
A. Frustrated by teachers’ reprimands—take it out on others.
B. Family stress leads to frustration—take it out on others.

Questions based on these possible explanations are:
Is it reprimands alone, stress alone, or reprimands + stress?
Are reprimands important, after considering stress (and vice versa)?
Do reprimands matter only if there is family stress?
Simple Regression
Features: Outcome, Intercept, Predictor, Error

Y = b0 + b1X + Error (residual)

NOTE: “Residual” in regression = error. In ANOVA and planned contrasts, “residual” = interaction. Sorry!

Do bullies aggress more after being reprimanded?

Y = DV = Aggression

b0 = Intercept = average of DV before other variables are considered.
How much is aggression due just to being a bully, regardless of other influences?

b1 = slope of IV = influence of IV on outcome.
How much do reprimands cause bullies to aggress?
Elements of Regression Equation

Y = b0 + b1X + Ɛ

Y = DV (aggression)

b0 = intercept; the average value of the DV BEFORE accounting for the IV; b0 = mean DV WHEN IV = 0

b1 = slope; the effect of the IV on the DV (effect of reprimands on aggression)

Coefficients = parameters; things that account for Y. b0 and b1 are coefficients.

Ɛ = error; changes in the DV that are not due to the coefficients.

[Scatterplot: aggression (Y) by reprimands (X) with fitted line]
Translating Regression Equation
Into Expected Outcomes

Y = 2 + 1.0X + Ɛ means that bullies will aggress 2 times a day plus (1 × number of reprimands).

How many times will a bully aggress if he/she is reprimanded 3 times?
Y = 2 + 1.0(3) = 5

[Plot: aggression by reprimands with the fitted line Y = 2 + 1.0X]

Regression allows one to predict how an individual will behave (e.g., aggress) due to certain causes (e.g., reprimands).
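To make the arithmetic concrete, here is a minimal Python sketch of using a fitted equation to predict; the intercept 2 and slope 1.0 come from the slide, and the function name is just for illustration:

def predict(reprimands, b0=2.0, b1=1.0):
    # Predicted aggressive acts per day: Y = b0 + b1 * X
    return b0 + b1 * reprimands

print(predict(3))   # 5.0, matching Y = 2 + 1.0(3) = 5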
Quick Review of New Terms
Y = b0 + b1X + Ɛ

Y is the: Outcome, aka DV

b0 is the: Intercept; average score when IV = 0

b1 is the: Slope; the coefficient of the predictor (IV)

Ɛ is the: Error, aka changes in DV not explained by IV

The coefficients include: Intercept and slope(s), b0 and b1

Does b0 = mean of the sample? NO! b0 is the expected score ONLY when IV = 0

If Y = 5.0 + 2.5X + Ɛ, what is Y when X = 2? 5.0 + (2.5 × 2) = 10
Regression In English*
The effect that days-per-week meditating has on SAT scores:
Y = 1080 + 25X in English?
Students’ SAT is 1080 without meditation, and increases by 25 points for each additional day of weekly meditation.

The effect of anxiety (in points) on threat-detection reaction time (in ms):
Y = 878 – 15X in English?
Reaction time is 878 ms when the anxiety score = 0, and decreases by 15 ms for each 1-point increase on the anxiety measure.

The effect of parents’ hours of out-loud reading on toddlers’ weekly word acquisition:
Y = 35 + 8X in English?
Toddlers speak 35 words when parents never read out loud, and acquire 8 words per week for every hour of out-loud reading.
* Fabricated outcomes
Positive, Negative, and Null Regression Slopes

[Plots: a positive slope, a negative slope, and a null slope (e.g., Y = 3 + 0X, a flat line)]
Regression Tests “Models”
Model: A predicted TYPE of relationship between one
or more IVs (predictors) and a DV (outcome).
Relationships can take various shapes:
Linear: calories consumed and weight gained.
Curvilinear: stress and performance.
J-shaped: insult and response intensity.
Catastrophic or exponential: age in months and language ability.
Regression Tests How Well the Model
“Fits” (Explains) the Obtained Data
Predicted Model: As reprimands increase, bullying will increase.
This is what kind of model? Linear

[Scatterplot: aggression by reprimands with a fitted straight line]

Linear regression asks: Do the data describe a straight, sloped line? Do they confirm a linear model?
Locating a "Best Fitting" Regression Line
[Scatterplot: aggression (Y, 1-9) by reprimands (X, 1-12); individual responses scattered around a fitted line]

Line represents the "best fitting slope."
Disparate points represent residuals = deviations from the slope.
"Model fit" is based on the method of least squares.
Method of Least Squares
Regression attempts to find the “best fitting” line to describe the data.

This is the line for which, on average, the deviations (residuals) between actual responses (data points) and predicted responses (the regression slope) are smallest.

Least squares refers to the “least squared differences” between data points and the slope.

The method of least squares is the calculation done to determine the best fitting line, using residuals.
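A minimal Python sketch of the least-squares calculation, using made-up reprimand/aggression numbers (not the course dataset):

import numpy as np

reprimands = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)   # X
aggression = np.array([2, 3, 3, 5, 4, 6, 7, 7], dtype=float)   # Y

# Closed-form least-squares estimates: b1 = cov(X, Y) / var(X)
b1 = np.cov(reprimands, aggression, ddof=1)[0, 1] / np.var(reprimands, ddof=1)
b0 = aggression.mean() - b1 * reprimands.mean()

residuals = aggression - (b0 + b1 * reprimands)
print(b0, b1, (residuals ** 2).sum())   # this line minimizes the squared sum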
Error = Difference Between Predicted Points (Ŷ) and Actual Points (Y)

For Subject #88: the deviation, i.e., error ε88 = actual response (Y88) – predicted response (Ŷ88).

[Scatterplot: aggression (Y) by reprimands (X), marking Subject 88’s actual response Y88 and predicted response Ŷ88 on the regression line]
Regression Compares Slope to Mean

10
*

9
8 * * *
*
*
4 5 6 7
Aggression

* *
* *
*
Null Hyp: Mean score of aggression is best predictor,
reprimands unimportant (b1 = 0)
3

Alt. Hyp: Reprimands explain aggression


2

above and beyond the mean, (b1 > 0)


1

1 2 3 4 5 6 7 8 9 10 11 12

Reprimands
[Plot: the observed slope, the flat null slope, and random slopes originating at random means]

Is the observed slope random or meaningful? That’s the Regression question.
Total Sum of Squares (SST)

[Plot: each score’s deviation from the DV mean (the null slope), with the model slope overlaid]

Total Sum of Squares (SST) = the deviation of each score from the DV mean (null slope, i.e., zero slope); square these deviations, then sum them.
Residual Sum of Squares (SSR)

[Plot: each score’s residual around the regression line]

Residual Sum of Squares (SSR) = each residual from the regression line, squared; then sum all these squared residuals.
The Regression Question
Does the model (e.g., the regression line) do a better job
describing obtained data than the mean (i.e., b1 = 0)?

In other words: Are the residuals, on average, smaller around the model than around the mean?

Regression compares residuals around the mean to residuals around the model (e.g., the line).

If the model residuals are smaller, the model “wins”; if the model residuals are not smaller, the model loses.
Elements of Regression

Total Sum of Squares (SST) = the deviation of each score from the DV mean; square these deviations, then sum them.

Residual Sum of Squares (SSR) = the deviation of each score from the regression line, squared; then sum all these squared residuals.

Model Sum of Squares (SSM) = SST – SSR = the amount that the regression slope explains the outcome above and beyond the simple mean.

R² = SSM / SST = the proportion of variance explained by the predictor(s). Measures how much of the DV is predicted by the IV (or IVs).

R² = (SST – SSR) / SST

NOTE: What happens to R² when SSR is smaller? R² gets bigger.
NOTE: Max R² = 1; if SSR = 0, then R² = (SST – 0) / SST = SST / SST = 1, i.e., 100% of variance.
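Continuing the earlier sketch (same made-up data, same b0 and b1), these quantities can be computed directly:

y_hat = b0 + b1 * reprimands   # model-predicted scores

sst = ((aggression - aggression.mean()) ** 2).sum()   # around the mean (null slope)
ssr = ((aggression - y_hat) ** 2).sum()               # around the regression line
ssm = sst - ssr                                       # what the model adds

r_squared = ssm / sst
print(sst, ssr, ssm, r_squared)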
Assessing Overall Model:
The Regression F Test
In ANOVA, F = Treatment / Error = MSB / MSW.

In Regression, F = Model / Residuals = MSM / MSR, aka the slope line / random error around the slope line.

MSM = SSM / df(model); MSR = SSR / df(residual)

df(model) = number of predictors (betas, not counting the intercept)
df(residual) = number of observations (i.e., subjects) – number of estimates (i.e., all betas plus the intercept). If N = 20, then df = 20 – 2 = 18.

F in Regression measures whether the overall model does better than chance at predicting the outcome.
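A minimal sketch of that F test in Python, reusing ssm and ssr from the sketch above (1 predictor, made-up data):

from scipy import stats

n, k = len(reprimands), 1          # observations, predictors
msm = ssm / k                      # model mean square, df = k
msr = ssr / (n - k - 1)            # residual mean square, df = n - k - 1
f = msm / msr
print(f, stats.f.sf(f, k, n - k - 1))   # F and its upper-tail p value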
F Statistic in Regression

SSM df = number of predictors (reprimands) = 1
SSR df = subjects – coefficients = 20 – (intercept + reprimands) = 18

[Annotated SPSS ANOVA output: the “Regression” row = the model; columns show SSM, SSR, MSM, and MSR]
Assessing Individual Predictors
Is the predictor slope significant, i.e., does the IV predict the outcome?

b1 = slope of the sole predictor in simple regression.
If b1 = 0, then a change in the predictor has zero influence on the outcome.
If b1 > 0, then it has some influence. How much greater than 0 must b1 be in order to have a significant influence?

The t statistic tests the significance of the b1 slope:

t = (b observed – b expected) / SEb

where b expected = 0 (the null-effect b), so t = b observed / SEb.

t df = n – 1 – number of predictors (here, 1) = n – 2. Note: predictors = betas, not b0.
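The same test in Python via scipy’s linregress, reusing the made-up data from the earlier sketch:

from scipy import stats

res = stats.linregress(reprimands, aggression)
t = res.slope / res.stderr          # t = b / SE(b), df = n - 2
print(res.slope, res.stderr, t, res.pvalue)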
t Statistic in Regression

[Annotated SPSS Coefficients output: B, Std. Error, Beta, t, and sig. of t for the predictor]

B = slope; Std. Error = standard error of the slope; t = B / Std. Error of B.
Beta = standardized B. Shows how many SDs the outcome changes per each SD change in the predictor.
Beta allows comparison of predictor strength between predictors.
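A quick illustration of the b-to-beta conversion (reusing res, reprimands, and aggression from the sketches above; this is the standard formula, not SPSS output):

beta = res.slope * (reprimands.std(ddof=1) / aggression.std(ddof=1))
print(beta)   # SDs of outcome change per 1 SD change in the predictor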
Interpreting Simple Regression

Overall F Test: Our model of reprimands having an effect on aggression is confirmed.

t Test: Reprimands lead to more aggression. In fact, for every 1 reprimand there is a .61 aggressive act, or roughly 1 aggressive act for every 2 reprimands.
Key Indices of Regression
R = degree to which the entire model correlates with the outcome
R² = proportion of variance the model explains
F = how well the model exceeds the mean in predicting the outcome
b = the influence of an individual predictor on the outcome
beta = b transformed into standardized units
t of b = significance of b (b / std. error of b)

Multiple Regression (MR)
Y = b0 + b1X1 + b2X2 + b3X3 + … + bkXk + ε

Multiple regression (MR) can incorporate any number of predictors in the model.

With 2 predictors there is a “regression plane”; beyond that it becomes increasingly difficult to visualize the result.

MR operates on the same principles as simple regression.

Multiple R = the correlation between observed Y and Y as predicted by the total model (i.e., all predictors at once).
Two Variables Produce a "Regression Plane"

[3-D plot: aggression as a plane over reprimands and family stress]
Multiple Regression Example
Is aggression predicted by teacher reprimands and
family stresses?

Y = b0 + b1X1 + b2X2 + ε

Y = __ Aggression
b0 = __ Intercept (being a bully, by itself)
b1 = __ reprimands
b2 = __ family stress
ε = __ error
Elements of Multiple Regression
Total Sum of Squares (SST) = the deviation of each score from the DV mean; square these deviations, then sum them.

Residual Sum of Squares (SSR) = each residual from the total model (not a single line), squared; then sum all these squared residuals.

Model Sum of Squares (SSM) = SST – SSR = the amount that the total model explains the result above and beyond the simple mean.

R² = SSM / SST = the proportion of variance explained by the total model.

Adjusted R² = R², but adjusted for the number of predictors.

NOTE: The main difference between these values in multiple regression and simple regression is the use of the total model rather than a single slope. The math is much more complicated, but conceptually the same.
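A minimal sketch of the commonly used adjusted-R² formula (n = sample size, k = number of predictors; the .72/100/2 inputs are illustrative, not course data):

def adjusted_r2(r2, n, k):
    # Penalizes R² for the number of predictors in the model
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r2(0.72, 100, 2))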
Methods of Regression
Hierarchical: 1. Predictors selected based on theory or past work
2. Predictors entered into analysis in order of predicted
importance, or by known influence.
3. New predictors are entered last, so that their
unique contribution can be determined.
Forced Entry: All predictors forced into model simultaneously. No
starting hypothesis re. relative importance of predictors.
Stepwise: The program automatically searches for the strongest predictor, then the second strongest, etc. Predictor 1 is best at explaining the outcome, accounting for, say, 40% of the variance; Predictor 2 is best at explaining the remaining 60%; etc. Controversial method.

In general, Hierarchical is most common and most accepted. Avoid the “kitchen sink”: limit the number of predictors to as few as possible, and to those that make theoretical sense.
Sample Size in Regression

Simple rule: The more the better!

Field's Rule of Thumb: 15 cases per predictor.

Green’s Rule of Thumb:
Overall Model: 50 + 8k (k = number of predictors)
Specific IV: 104 + k
Unsure which? Use the one requiring the larger n. (See the sketch below.)
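A tiny sketch of Green’s rules in code (the function name is just for illustration):

def green_n(k):
    overall = 50 + 8 * k            # n needed to test the overall model
    specific = 104 + k              # n needed to test a specific IV
    return max(overall, specific)   # unsure which? use the larger

print(green_n(2))   # 106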
Multiple Regression in SPSS
REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA CHANGE
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT aggression
  /METHOD=ENTER family.stress
  /METHOD=ENTER reprimands.

“OUTS” refers to variables excluded from, e.g., Model 1.
“NOORIGIN” means “do show the constant in the output report.”
“CRITERIA” relates to Stepwise Regression only; it refers to which IVs are kept in at Step 1, Step 2, etc.
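For readers working outside SPSS, a rough Python/statsmodels analogue of the same hierarchical entry (fabricated data and hypothetical column names, for illustration only):

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)                      # fabricated data
df = pd.DataFrame({"family_stress": rng.normal(size=100),
                   "reprimands": rng.normal(size=100)})
df["aggression"] = (0.6 * df["family_stress"]
                    + 0.3 * df["reprimands"] + rng.normal(size=100))

# Step 1: family stress only; Step 2: add reprimands
step1 = sm.OLS(df["aggression"], sm.add_constant(df[["family_stress"]])).fit()
step2 = sm.OLS(df["aggression"],
               sm.add_constant(df[["family_stress", "reprimands"]])).fit()

# R-squared change = unique contribution of the predictor entered last
print(step1.rsquared, step2.rsquared, step2.rsquared - step1.rsquared)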
SPSS Regression Output: Descriptives
SPSS Regression Output: Model Effects
[Annotated SPSS Model Summary output]

R = power of the regression (with one predictor, same as the correlation)
R² = amount of variance explained
Adj. R² = corrects for multiple predictors
R² change = impact of each added model
Sig. F Change = does the new model explain a significant amount of added variance?
SPSS Regression Output: Predictor Effects
Requirements and Assumptions
(these apply to Simple and Multiple Regression)

Variable Types: Predictors must be quantitative or categorical (2 values only, i.e., dichotomous); outcomes must be interval.
Non-Zero Variance: Predictors have variation in value.
No Perfect Multicollinearity: No perfect 1:1 (linear) relationship between 2 or more predictors.
Predictors Uncorrelated with External Variables: No hidden “third variable” confounds.
Homoscedasticity: Variance at each level of the predictor is constant.
Requirements and Assumptions
(continued)

Independent Errors: Residuals for Sub. 1 do not determine residuals for Sub. 2.

Normally Distributed Errors: Residuals are random and sum to zero (or close to zero).

Independence: All outcome values are independent from one another, i.e., each response comes from a subject who is uninfluenced by other subjects.

Linearity: The changes in the outcome due to each predictor are described best by a straight line.
Regression Assumes Errors Are Normally, Independently, and Identically Distributed at Every Level of the Predictor (X)

[Figure: identical normal error distributions at X1, X2, X3]
Homoscedasticity and Heteroscedasticity

[Figure: residual plots contrasting homoscedastic and heteroscedastic patterns]
Assessing Homoscedasticity
Select: Plots
Enter: ZRESID for Y and ZPRED for X
Ideal Outcome: Equal distribution across chart
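The same check can be done by hand outside SPSS; a minimal sketch, reusing the statsmodels model step2 from the earlier sketch:

import matplotlib.pyplot as plt

z_pred = (step2.fittedvalues - step2.fittedvalues.mean()) / step2.fittedvalues.std()
z_resid = step2.resid / step2.resid.std()

plt.scatter(z_pred, z_resid)        # ZRESID (Y) against ZPRED (X)
plt.axhline(0)
plt.xlabel("ZPRED"); plt.ylabel("ZRESID")
plt.show()                          # an even, shapeless cloud is the ideal outcome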
Extreme Cases
Cases that deviate greatly from the expected outcome (> ±2.5) can warp regression.

[Scatterplot: outlying cases far from the regression line]

First, identify outliers using the Casewise Diagnostics option.

Then, correct outliers per the outlier-correction options, which are:
1. Check for data entry error
2. Transform data
3. Recode as next highest/lowest plus/minus 1
4. Delete
Casewise Diagnostics Print-out in SPSS

[SPSS Casewise Diagnostics print-out, with a possible problem case flagged]
Casewise Diagnostics for Problem Cases Only
In "Statistics" Option, select Casewise Diagnostics

Select "outliers outside:" and type in how many Std. Dev. you
regard as critical. Default = 3

More than 3 DV
What If Assumption(s) are Violated?
What is the problem with violating assumptions?

The obtained model can’t be generalized from the test sample to the wider population.

Overall, not much can be done if assumptions are substantially violated (i.e., extreme heteroscedasticity, extreme auto-correlation, severe non-linearity).

Some options:
1. Heteroscedasticity: Transform the raw data (square root, etc.)
2. Non-linearity: Attempt logistic regression
A Word About Regression Assumptions
and Diagnostics

Are these conditions complicated to understand? Yes.
Are they laborious to check and correct? Yes.
Do most researchers understand, monitor, and address these conditions? No.

Even journal reviewers are often unschooled, or don’t take the time, to check diagnostics. Journal space discourages authors from discussing diagnostics. Some have called for more attention to this inattention, but there has not been much action.

Should we do diagnostics? GIGO, and fundamental ethics.
Reporting Hierarchical Multiple Regression
Table 1:
Effects of Family Stress and Teacher Reprimands on Bullying

               B       SE B     β
Step 1
  Constant    -0.54    0.42
  Fam. Stress  0.74    0.11    .85*
Step 2
  Constant     0.71    0.34
  Fam. Stress  0.57    0.10    .67*
  Reprimands   0.33    0.10    .38*

Note: R² = .72 for Step 1, ΔR² = .11 for Step 2 (p = .004); * p < .01.

Das könnte Ihnen auch gefallen