Behavioral Sciences
Lesson 7
Correlation
Roger N. Morrissette, PhD
I. Correlations (Video Lesson 7 I) (YouTube version)
A correlation is a statistical test that demonstrates the relationship between two variables. Even
though you may be able to show a significant relationship between two variables, a correlation
does not show a causal relationship between the two variables. For example, although
depression and self-esteem are two variables that are significantly correlated with each other, we
cannot say that low self-esteem causes depression. Likewise, we cannot say that depression
causes low self-esteem. The two variables may be significantly correlated, but no causal
relationship is assumed. Correlations are best represented graphically by a scatterplot and best
calculated by using the Pearson Product Moment Correlation formula.
Let's test the hypothesis that depression scores are negatively correlated with self-esteem
scores. We design our surveys and sample 8 subjects. Their data are presented below. Data for a
correlation are always presented in two columns like the data set shown below. Depression
scores are our X data and Self-Esteem scores are our Y data:
Depression (X)    Self-Esteem (Y)
      10               104
      12               100
      19                98
       4               150
      25                75
      15               105
      21                82
       7               133
II. Scatterplots (Video Lesson 7 II) (YouTube version)
A scatterplot is a graphical representation of the two sets of data you are comparing. The X axis
plots your first or "X" data, and the Y axis plots your second or "Y" set of data. The scatterplot
can tell you two important things about the relationship between your two variables. First, it can
show you if you have a weak or strong relationship between your variables. Second, it can
tell you if your variables are negatively or positively related.
A. Scatterplots can show the strength of the relationship between two variables
   1. Weak relationships will have a wide scattering of the plotted points
   2. Strong relationships will have a minimal scattering of the plotted points
B. Scatterplots can show the direction or type of the relationship between two variables
   1. Positive Correlation
      both factors vary in the same direction
      as one factor increases, the other increases
   2. Negative Correlation
      both factors vary in opposite directions
      as one factor increases, the other decreases
   3. Zero or Neutral Correlation
      the two factors show no relationship to one another
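To get a rough feel for strength and direction without any plotting software, the lesson's eight data pairs can be drawn as a text-only scatterplot. The sketch below is only an illustration: the function name `text_scatterplot` and the grid size are assumptions made here, and it assumes the scores in each list are not all identical.

```python
# Depression (X) and self-esteem (Y) scores for the 8 subjects in this lesson.
depression = [10, 12, 19, 4, 25, 15, 21, 7]
self_esteem = [104, 100, 98, 150, 75, 105, 82, 133]


def text_scatterplot(xs, ys, width=30, height=10):
    """Return a list of text rows with '*' marking each (x, y) pair."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each score to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        # Row 0 is printed first, so flip the Y axis to put high scores on top.
        grid[height - 1 - row][col] = "*"
    # Left edge stands in for the Y axis, bottom edge for the X axis.
    return ["|" + "".join(line) for line in grid] + ["+" + "-" * width]


plot_rows = text_scatterplot(depression, self_esteem)
print("\n".join(plot_rows))
```

The points run from the upper left to the lower right with little scatter, which is the pattern described above for a strong negative correlation.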
III. The Pearson Product Moment Correlation (Correlation Coefficient) (Video Lesson 7 III) (YouTube
version) (Correlation Calculation YouTube version) (mp4 version)
The correlation coefficient is a statistic that quantifies the strength and direction of the
relationship between two variables. It has a range between -1.00 and +1.00. You cannot get a
correlation of 1.5. A value of -1.00 would be a perfect (very strong) negative correlation, a value
of +1.00 would be a perfect (very strong) positive correlation, and a value of 0.00 would be a
(very weak) zero or neutral correlation. To calculate the correlation coefficient we use the
Pearson Product Moment Correlation (r):
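r = [n(ΣXY) - (ΣX)(ΣY)] / square root{[n(ΣX²) - (ΣX)²] x [n(ΣY²) - (ΣY)²]}
The formula reads: r equals the following. In the numerator: n (the number of pairs) multiplied
by the sum of the X times Y products, then subtract the sum of X times the sum of Y. In the
denominator: take the square root of the product of two terms. The first term is n times the sum
of X squared minus the sum of X, then squared. The second term is n times the sum of Y squared
minus the sum of Y, then squared.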
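The verbal formula above can be written as a short function. This is a minimal sketch rather than a reference implementation; the name `pearson_r` and its variable names are chosen here for illustration, and it assumes neither list consists of identical scores (which would make the denominator zero).

```python
import math


def pearson_r(xs, ys):
    """Pearson r from the computational (raw-score) formula."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of scores")
    n = len(xs)                                   # number of pairs
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of the X times Y products
    sum_x2 = sum(x * x for x in xs)               # sum of X squared
    sum_y2 = sum(y * y for y in ys)               # sum of Y squared
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Perfectly aligned toy data reaches the two ends of the range.
print(pearson_r([1, 2, 3], [2, 4, 6]))   # perfect positive relationship
print(pearson_r([1, 2, 3], [6, 4, 2]))   # perfect negative relationship
```

Because the numerator can never be larger in size than the denominator, the result always falls between -1.00 and +1.00.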
Depression (X)    Self-Esteem (Y)
      10               104
      12               100
      19                98
       4               150
      25                75
      15               105
      21                82
       7               133
To calculate the correlation coefficient (r) for the data above we first need to expand the
columns just as we did when we calculated standard deviation. If you look at the formula above
you will see that we need an X squared column, a Y squared column, and an X times Y
column. This first step is shown below:
   X      Y      X²       Y²       XY
  10    104     100    10816     1040
  12    100     144    10000     1200
  19     98     361     9604     1862
   4    150      16    22500      600
  25     75     625     5625     1875
  15    105     225    11025     1575
  21     82     441     6724     1722
   7    133      49    17689      931
The next step is to calculate the sums of our columns:
          X      Y      X²       Y²       XY
         10    104     100    10816     1040
         12    100     144    10000     1200
         19     98     361     9604     1862
          4    150      16    22500      600
         25     75     625     5625     1875
         15    105     225    11025     1575
         21     82     441     6724     1722
          7    133      49    17689      931
Sum     113    847    1961    93983    10805
n = 8
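The expanded columns and their sums are easy to double-check by machine. The following sketch rebuilds the X², Y², and XY columns from the raw scores and totals each one; the variable names are illustrative choices, not part of the lesson.

```python
x = [10, 12, 19, 4, 25, 15, 21, 7]            # depression scores (X)
y = [104, 100, 98, 150, 75, 105, 82, 133]     # self-esteem scores (Y)

# Expand the data into the three extra columns the formula needs.
x_squared = [a * a for a in x]
y_squared = [b * b for b in y]
xy = [a * b for a, b in zip(x, y)]

# Total every column; n is the number of pairs, not the number of scores.
sums = {
    "n": len(x),
    "sum_x": sum(x),
    "sum_y": sum(y),
    "sum_x2": sum(x_squared),
    "sum_y2": sum(y_squared),
    "sum_xy": sum(xy),
}

for name, value in sums.items():
    print(f"{name:>7} = {value}")
```

The printed totals should match the Sum row of the table: 113, 847, 1961, 93983, and 10805, with n = 8.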
Now we have all the information we need to solve our equation:
r = [(8 x 10805) - (113 x 847)] / square root{[(8 x 1961) - (113)²] x [(8 x 93983) - (847)²]}
r = [(86440) - (95711)] / square root{[(15688) - (12769)] x [(751864) - (717409)]}
r = -9271 / square root[(2919) x (34455)]
r = -9271 / square root(100574145)
r = -9271 / 10028.666
r = -0.9244
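The same substitution can be followed one stage at a time in code, which makes it easier to spot an arithmetic slip. This sketch starts from the column sums computed above; the intermediate names `numerator`, `x_term`, and `y_term` are labels introduced here for the three pieces of the formula.

```python
import math

# Column sums from the worked example.
n, sum_x, sum_y = 8, 113, 847
sum_x2, sum_y2, sum_xy = 1961, 93983, 10805

numerator = n * sum_xy - sum_x * sum_y      # 86440 - 95711
x_term = n * sum_x2 - sum_x ** 2            # 15688 - 12769
y_term = n * sum_y2 - sum_y ** 2            # 751864 - 717409
r = numerator / math.sqrt(x_term * y_term)

print("numerator:", numerator)
print("x term:", x_term)
print("y term:", y_term)
print(f"r = {r:.4f}")
```

Note that the negative sign of r comes entirely from the numerator; the square root in the denominator is always positive.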
Our correlation coefficient is negative and very close to -1.00, which tells us that we have a
strong negative relationship between our two variables. If we look at the scatterplot of our data,
we can see that the scatterplot is consistent with our correlation coefficient.
IV. Determining Significance (Video Lesson 7 IV) (YouTube version)
Now that we have calculated our correlation coefficient we need to determine how significant it is.
There are two ways to evaluate a correlation: the first is to calculate the
Coefficient of Determination and the second is to use the R Table.
A. The Coefficient of Determination
The coefficient of determination indicates how much of the variance of one factor can be
explained by the variability of a factor with which it is correlated. To calculate the coefficient of
determination we simply square the r value.
Coefficient of Determination = r²
For our r of -0.9244, the coefficient of determination would be r² = 0.8545. This means that
about 85% of the variance of our depression scores can be explained by the variance of our
self-esteem scores. This is a pretty high value and suggests a very strong relationship between
the two variables. Although the coefficient of determination is a good indicator of the strength of
the relationship between two variables, it does not establish significance. We will need to use the
R Table to confirm whether our correlation is statistically significant.
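As a quick check of the squaring step, the calculation can be sketched in a few lines. The function name `coefficient_of_determination` is an illustrative choice, and the range check simply reflects the fact that r cannot fall outside -1.00 to +1.00.

```python
def coefficient_of_determination(r):
    """Square r to get the proportion of shared variance."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and +1.00")
    return r ** 2


r = -0.9244                                   # from the worked example
r_squared = coefficient_of_determination(r)

print(f"r squared = {r_squared:.4f}")
print(f"shared variance: {r_squared:.0%}")
```

Squaring removes the sign, so a correlation of -0.9244 and one of +0.9244 explain the same proportion of variance.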
B. The R Table
The R Table is located in its entirety in Appendix A in the back of the textbook, starting on page
435. A shortened version is also reproduced in this lecture. The table gives the critical r values
based on the degrees of freedom of your sample, the level of significance of the statistical test,
and whether your hypothesis is one- or two-tailed. These three factors plus the critical r values
are represented in the R Table and will be explained one at a time.
1. Degrees of Freedom.
The term degrees of freedom refers to the number of scores within a data set that are free to vary. In
any sample with a fixed mean, the sum of the deviation scores is equal to zero. Suppose your sample
has an n equal to 10: the first 9 scores are free to vary, but the 10th score must be the specific value
that makes the sum of the deviation scores equal to zero. Therefore, in a single sample the degrees of
freedom would be equal to n - 1. The degrees of freedom for a correlation are slightly different because
n equals the number of pairs, not simply the sample size. Therefore, the degrees of freedom for a
correlation is n - 2. So to calculate the degrees of freedom you simply take the number of pairs and
subtract two. For our data set of depression and self-esteem scores the degrees of freedom are
calculated the following way:
df = n - 2
df = 8 - 2
df = 6
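The idea that the last score is "not free to vary" can be seen directly with the depression scores from this lesson. The sketch below is an illustration under the assumption that the mean is treated as fixed; the helper name `correlation_df` is hypothetical.

```python
scores = [10, 12, 19, 4, 25, 15, 21, 7]   # depression scores from this lesson
n = len(scores)
mean = sum(scores) / n

# Deviation scores around the mean always add up to zero.
deviations = [s - mean for s in scores]
print("sum of deviations:", sum(deviations))

# With the mean fixed and the first n - 1 scores known,
# the final score is forced to one specific value.
forced_last = mean * n - sum(scores[:-1])
print("forced last score:", forced_last)


def correlation_df(n_pairs):
    """Degrees of freedom for a correlation: number of pairs minus two."""
    return n_pairs - 2


print("df for the correlation:", correlation_df(n))
```

The forced value equals the actual last score in the list, which is why only n - 1 scores are free to vary in a single sample.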
The R Table shows the degrees of freedom values in the far left column as shown below:
        Levels of Significance for a One-Tailed Test
          .05      .025     .01      .005
        Levels of Significance for a Two-Tailed Test
 df       .10      .05      .02      .01
  1      .988     .997     .9995    .9999
  2      .900     .950     .980     .990
  3      .805     .878     .934     .959
  4      .729     .811     .882     .917
  5      .669     .754     .833     .874
  6      .622     .707     .789     .834
  7      .582     .666     .750     .798
  8      .549     .632     .716     .765
  9      .521     .602     .685     .735
 10      .497     .576     .658     .708
The table continues
2. One- or Two-tailed hypotheses.
The number of tails of a hypothesis reflects whether the hypothesis predicts a direction. This concept
will be discussed in greater detail in chapter 11. For now, you should know that if a correlation
hypothesis is simply predicting an effect without predicting either a negative or positive direction of
that effect, it is considered a Two-Tailed hypothesis. If the hypothesis is predicting either a negative
or positive direction then it is a One-Tailed hypothesis. Since our hypothesis as stated predicts a
negative correlation it is a One-Tailed Test. The two kinds of hypothesis test correspond to the two
rows of column headings in the R Table: the first row gives the levels of significance for a One-Tailed
Test and the second row gives the levels of significance for a Two-Tailed Test.
3. Levels of Significance.
The levels of significance or "p values" will also be discussed in greater detail in chapters 11, 12, and
13. For now you should simply know that a level of significance of .05 is equivalent to p = .05, which
means that a correlation this strong would be expected to occur by chance no more than 5% of the
time if your two variables were actually unrelated. This is often described loosely as 95% confidence
(1.00 - 0.05 = 0.95 or 95%). The .05 value is considered standard in science. Levels of significance
that are smaller set a stricter standard and values that are larger set a more lenient one. This value
must be given to you in the problem. For our example let's use p = .05. In the R Table the levels of
significance are the column headings, so for a One-Tailed Test at p = .05 we use the first column of
critical values.
4. Critical R Values.
Critical values are threshold values for significance. The absolute value of your calculated r must
exceed the critical r value in the R Table to be considered significant. The critical r values are the
entries in the body of the R Table, one for each combination of degrees of freedom and level of
significance.
Now let's put it all together. The row of the R Table that matches the criteria of our example (df = 6)
is shown below with its column headings, so that we can determine if our calculated r value of
-0.9244 is significant:
        Levels of Significance for a One-Tailed Test
          .05      .025     .01      .005
        Levels of Significance for a Two-Tailed Test
 df       .10      .05      .02      .01
  6      .622     .707     .789     .834
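According to the R Table, for a One-Tailed Test at p = .05 with 6 degrees of freedom, the critical
value we must exceed to consider our calculated r value to be significant is .622.
Our calculated r = -0.9244 is in the predicted (negative) direction, and its absolute value, 0.9244,
exceeds .622.
We conclude that our correlation is significant.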
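The table lookup and the final comparison can be sketched as a small program. The critical values below are copied from the shortened R Table in this lesson (df 1 through 10 only); the names `critical_r` and `is_significant` are illustrative, and the comparison uses the absolute value of r, so for a one-tailed test you would still need to confirm separately that the sign of r matches the predicted direction.

```python
# Column order follows the R Table: each position is one column of critical values.
ONE_TAILED_LEVELS = (0.05, 0.025, 0.01, 0.005)
TWO_TAILED_LEVELS = (0.10, 0.05, 0.02, 0.01)

# Critical r values keyed by degrees of freedom (shortened table only).
CRITICAL_R = {
    1: (0.988, 0.997, 0.9995, 0.9999),
    2: (0.900, 0.950, 0.980, 0.990),
    3: (0.805, 0.878, 0.934, 0.959),
    4: (0.729, 0.811, 0.882, 0.917),
    5: (0.669, 0.754, 0.833, 0.874),
    6: (0.622, 0.707, 0.789, 0.834),
    7: (0.582, 0.666, 0.750, 0.798),
    8: (0.549, 0.632, 0.716, 0.765),
    9: (0.521, 0.602, 0.685, 0.735),
    10: (0.497, 0.576, 0.658, 0.708),
}


def critical_r(df, level=0.05, tails=1):
    """Look up the critical r for the given df, level of significance, and tails."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    levels = ONE_TAILED_LEVELS if tails == 1 else TWO_TAILED_LEVELS
    column = levels.index(level)   # raises ValueError if the level is not tabled
    return CRITICAL_R[df][column]  # raises KeyError if df is beyond this short table


def is_significant(r, n_pairs, level=0.05, tails=1):
    """Compare |r| with the critical value for df = n_pairs - 2."""
    return abs(r) > critical_r(n_pairs - 2, level, tails)


print(critical_r(6, 0.05, tails=1))
print(is_significant(-0.9244, n_pairs=8))
```

For the lesson's example the lookup returns the df = 6, one-tailed .05 entry, and the comparison agrees with the conclusion reached by hand.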