
Statistics for the Social Sciences

Psychology 340
Spring 2005
Introductions
Outline (for week)
Variables: IV, DV, scales of measurement
Distributions and characteristics of
Using graphs
Locating scores: z-scores and other transformations
Descriptive statistics decision tree
Basic Concepts
Variable
A condition or characteristic that can have
different values
Value
A possible number or category that a score can
have
Score
A particular person's value on a variable
Variables
Identify the things that we're studying
Variables
Characteristics or conditions that change or have different values for
different individuals (or situations)
Independent (explanatory) variables
The variable that does the explaining
In an experiment it is the variable that is manipulated by the researcher
Dependent (response) variable
The variable that is observed for changes in order to assess the effect of
the manipulation
Typically it is the variable measured in an experiment
Measuring and Manipulating
Variables
Two levels of variables
Conceptual level of the variables
What the theory is about (absence, fondness)




Operational level of the variables
What is actually manipulated/measured in the
research program
Duration of time apart
Rated fondness
Operational definition
Specifies the relationship between
the conceptual and operational levels
1. It describes a set of operations or procedures for
measuring a conceptual variable
2. It defines the variable in terms of the resulting
measurements
Experimental Unit
What is the level at which the research is focused?
Individuals
Between individuals
Within individuals
Across groups
Couples
Families
Cities
Ethnic groups
Absence makes the heart grow fonder; what level(s) could
we focus on?
Measurement
Properties of our measurement?
Units of measurement - whether the
measurement has a minimum sized unit or not
Scales of measurement - the correspondence
between the numbers and the properties
that we're measuring
Levels of Measurement
Numeric (quantitative) variable
Equal-interval variables
e.g., GPA
Rank-order (ordinal) variables
e.g., position finished in a race
Nominal (categorical) variables
e.g., gender
Units of Measurement
Continuous variables
Variables can take any number and can be infinitely broken down
into smaller and smaller units
E.g., for lunch I can have 2, 3, or 2.5 cookies
Discrete variables
Broken into a finite number of discrete categories
that can't be broken down
E.g., in my family I can have 1 kid or 2 kids, but not 2.5
Scales of measurement
Categorical variables
Nominal scale
Nominal Scale: Consists of a set of categories that
have different names.
Measurements on a nominal scale label and categorize
observations, but do not make any quantitative
distinctions between observations.
Example:
Eye color:
blue, green, brown, hazel
Categorical variables
Nominal scale
Ordinal scale
Ordinal Scale: Consists of a set of categories that
are organized in an ordered sequence.
Measurements on an ordinal scale rank observations in
terms of size or magnitude.
Example:
T-shirt size: Small, Med, Lrg, XL, XXL
Categorical variables
Nominal scale
Ordinal scale
Quantitative variables
Interval scale
Interval Scale: Consists of ordered categories
where all of the categories are intervals of exactly
the same size.
With an interval scale, equal differences between
numbers on the scale reflect equal differences in
magnitude.
Ratios of magnitudes are not meaningful.
Example:
Fahrenheit temperature scale
40°F is not twice as hot as 20°F
Categorical variables
Nominal scale
Ordinal scale
Quantitative variables
Interval scale
Ratio scale
Ratio scale: An interval scale with the additional feature of
an absolute zero point.
With a ratio scale, ratios of numbers DO reflect ratios of
magnitude.
It is easy to get ratio and interval scales confused
Consider the following example: Measuring your height with
playing cards


[Figure: measuring height with playing cards]
Ratio scale: 8 cards high; 0 cards high means no height
Interval scale: 5 cards high; 0 cards high means as tall as the table
Distributions
A picture of the distribution is usually helpful
Gives a good sense of the properties of the distribution
Many different ways to display distribution
Table
Frequency distribution table
Stem and leaf plot
Graphs
Frequency distribution
A frequency distribution table has one row for each value of X (here, 12 down to 2) and the following columns:
X: the values of the variable
f: the number of tokens at each value
p: the proportion of tokens at each value, p = f/N, where N = total
%: the percentage of tokens at each value
c%: the cumulative percentage
Frequency Tables
Provide a listing of individuals having each of the
different values for a particular variable.
e.g., stress ratings of 151 students:
4,7,7,7,8,8,7,8,9,4,7,3,6,9,10,5,7,10,6,8,7,8,7,8,7,4,5,10,10,0,9,8,3,7,9
,7,9,5,8,5,0,4,6,6,7,5,3,2,8,5,10,9,10,6,4,8,8,8,4,8,7,3,8,8,8,8,7,9,7,5,6
,3,4,8,7,5,7,3,3,6,5,7,5,7,8,8,7,10,5,4,3,7,6,3,9,7,8,5,7,9,9,3,1,8,6,6,4,
8,5,10,4,8,10,5,5,4,9,4,7,7,7,6,6,4,4,4,9,7,10,4,7,5,10,7,9,2,7,5,9,10,3,
7,2,5,9,8,10,10,6,8,3

Steps for Making a
Frequency Table
Make a list down the page of each possible
value, from highest to lowest
Go one by one through the scores, making
a mark for each next to its value on the list
Make a table showing how many times
each value on your list is used
Figure the percentage of scores for each
value
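The steps above can be sketched in Python. This is only an illustration: `frequency_table` is a hypothetical helper name, and the input is just the first ten stress ratings from the list above, not all 151.

```python
from collections import Counter

def frequency_table(scores):
    """Return rows of (value, frequency, percent), from highest to lowest value."""
    counts = Counter(scores)              # one tally mark per score, next to its value
    n = len(scores)
    rows = []
    for value in range(max(scores), min(scores) - 1, -1):
        f = counts.get(value, 0)          # values that never occur get a frequency of 0
        rows.append((value, f, round(100 * f / n, 1)))
    return rows

# First ten stress ratings only, for illustration
ratings = [4, 7, 7, 7, 8, 8, 7, 8, 9, 4]
for value, f, pct in frequency_table(ratings):
    print(value, f, pct)
```

Listing every possible value, including those with a frequency of 0, keeps the table in the same shape as a hand-built one.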
A Frequency Table
Stress Rating   Frequency   Percent
10 14 9.3
9 15 9.9
8 26 17.2
7 31 20.5
6 13 8.6
5 18 11.9
4 16 10.6
3 12 7.9
2 3 2.0
1 1 0.7
0 2 1.3

Grouped Frequency Table
A frequency table that uses intervals
Stress Rating Interval   Frequency   Percent
10-11 14 9
8-9 41 27
6-7 44 29
4-5 34 23
2-3 15 10
0-1 3 2

Frequency Graphs
Histogram
Frequency polygon
Shapes of Frequency Distributions
Unimodal, bimodal, and rectangular
Symmetrical and skewed distributions
Normal and kurtotic distributions
Controversies and Limitations
Failure to use equal interval sizes
Exaggeration of proportions
Descriptive statistics
In addition to pictures of the distribution, numerical
summaries are also typically presented.
Numeric Descriptive Statistics
Shape: skew and kurtosis
Measures of Center: mean, median, mode
Measures of Variability (Spread): next time

Center
It is often very useful to be able to summarize or describe
the distribution with a single numerical value.
Select a value that is the most representative of the entire
distribution, that is of all of the individuals.
This is what we mean by central tendency.
There are three main measures of center
Mean (M)
Median (Mdn)
Mode
Note: "average" may refer to any of these three measures
The Mean
The most commonly used measure of center
The arithmetic average
Computing the mean
The formula for the population mean is (a parameter):
$\mu = \frac{\sum X}{N}$
The formula for the sample mean is (a statistic):
$\bar{X} = \frac{\sum X}{n}$
Add up all of the X's, then divide by the total number in the population (N) or in the sample (n)
A weighted mean
Suppose that you combine two groups together.
How do you compute the new group mean?
Group 1: $\bar{X}_1 = 110$
Group 2: $\bar{X}_2 = 140$
If the two groups are the same size, the new group mean $\bar{X}_N$ is the simple average of the two means:
$\bar{X}_N = \frac{110 + 140}{2} = 125$
If the groups differ in size, weight each mean by its group size. With seven scores of 110 in Group 1 ($n_1 = 7$) and three scores of 140 in Group 2 ($n_2 = 3$):
$\bar{X}_N = \frac{\bar{X}_1 n_1 + \bar{X}_2 n_2}{n_1 + n_2} = \frac{(110)(7) + (140)(3)}{7 + 3} = \frac{1190}{10} = 119$
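The weighted-mean formula can be checked with a short Python sketch. The function name `weighted_mean` is made up for this illustration; the numbers are the group means and sizes from the slide.

```python
def weighted_mean(means, sizes):
    """Combine group means, weighting each mean by its group size."""
    total = sum(m * n for m, n in zip(means, sizes))   # sum of (mean * n) over groups
    return total / sum(sizes)                          # divide by the combined n

combined = weighted_mean([110, 140], [7, 3])
print(combined)  # 119.0
```

With equal group sizes the same function reduces to the simple average of the two means.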
Characteristics of a mean
Change/add/delete a given score, then the mean will change.
Suppose that one of the girl scouts discovered that she had really made $23 instead of $30. Now the total is 119 - 7 = 112, so the mean is 112/7 = $16 (instead of $17).
Add/subtract a constant to each score, then the mean will change by adding (subtracting) that constant.
Suppose that you want to factor out a $2 camping fee for each girl scout. Subtract 2 from each amount. Now the total is $105, so the mean is 105/7 = $15. But notice you could have just subtracted $2 from the previous mean of $17 and arrived at the same answer.
Multiply (or divide) each score by a constant, then the mean will change by being multiplied (or divided) by that constant.
Suppose that the troop sponsor agreed to match the money made by each girl scout. That is, they agree to give each girl scout an additional amount of money equal to however much she makes on the sale. So now the total is $238, and the mean for each girl is 238/7 = $34.
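These three characteristics can be verified numerically. The sketch below assumes the seven amounts are the ones used in the median example ($12, $25, $30, $6, $18, $15, $13), which total $119; the variable names are illustrative.

```python
from statistics import mean

sales = [12, 25, 30, 6, 18, 15, 13]        # assumed amounts; total is 119
print(mean(sales))                          # 17

# Change one score: $30 was really $23
corrected = [23 if x == 30 else x for x in sales]
print(mean(corrected))                      # 16

# Subtract a constant ($2 camping fee) from every score
minus_fee = [x - 2 for x in sales]
print(mean(minus_fee))                      # 15, same as 17 - 2

# Multiply every score by a constant (sponsor matches each amount)
matched = [2 * x for x in sales]
print(mean(matched))                        # 34, same as 17 * 2
```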
The median
The median is the score that divides a distribution exactly in half. Exactly 50% of the individuals in a distribution have scores at or below the median.
Case 1: Odd number of scores in the distribution
$12 $25 $30 $6 $18 $15 $13
Step 1: put the scores in order: $6 $12 $13 $15 $18 $25 $30
Step 2: find the middle score: $15. That's the median.
Case 2: Even number of scores in the distribution
$12 $25 $30 $18 $15 $13 $6 $18
Step 1: put the scores in order: $6 $12 $13 $15 $18 $18 $25 $30
Step 2: find the middle two scores: $15 and $18
Step 3: find the arithmetic average of the two middle scores: $\frac{15 + 18}{2} = 16.5$. That's the median.
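Both cases can be written as one small function. This is a minimal sketch of the procedure described above, not a replacement for a library routine; the function name `median` is chosen here for illustration.

```python
def median(scores):
    ordered = sorted(scores)                        # Step 1: put the scores in order
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:                       # odd count: take the middle score
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2    # even count: average the middle two

print(median([12, 25, 30, 6, 18, 15, 13]))          # 15
print(median([12, 25, 30, 18, 15, 13, 6, 18]))      # 16.5
```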
The mode
The mode is the score or category that has the greatest frequency.
So look at your frequency table or graph and pick the value that has the highest frequency.
[Figure: three frequency graphs over the scores 1 to 9. In the first, the mode is 5. In the second, the modes are 2 and 8. In the third, one peak is taller than the other, giving a major mode and a minor mode.]
Note: if one were bigger than the other it would be called the major mode and the other would be the minor mode
Which center when?
Depends on a number of factors, like scale of
measurement and shape.
The mean is the most preferred measure and it is closely
related to measures of variability
However, there are times when the mean isn't the
appropriate measure.
Use the median if:
there are a few extreme scores in the distribution (skewed
distributions with long tails)
there are undetermined values - if for some reason you don't
know the value of one (or more) of your items (e.g., the person
died before answering your question)
your distributions are open-ended - by this we mean that
there is no upper or lower limit on the possible values of your
variable (e.g., your top answer on your questionnaire is "5 or
more")
If your data are on an ordinal scale (rankings), then use the
median.

Which center when?
What is the impact of shape on center?
Symmetric distribution: mean = median = mode
Positively skewed distribution: mean > median > mode
Negatively skewed distribution: mean < median < mode
Bimodal distribution: mean ≈ median, 2 modes
Measures of Central Tendency: The Mean
Sum of all the scores divided by the number of scores
$\mu = \frac{\sum X}{N}$
Mean of 7,8,8,7,3,1,6,9,3,8
$\sum X$ = 7+8+8+7+3+1+6+9+3+8 = 60
N = 10
Mean = 60/10 = 6
Measures of Central Tendency: The Mode
Most common single number in a
distribution
Mode of 7,8,8,7,3,1,6,9,3,8 = 8
Measure of central tendency for nominal
variables
Measures of Central Tendency: The Median
The middle score when all scores are
arranged from lowest to highest
Median of 7,8,8,7,3,1,6,9,3,8
1 3 3 6 7 7 8 8 8 9

Median is the average (mean) of the 5th and 6th scores (7 and 7), so the median is 7
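The three worked examples can be reproduced with Python's standard `statistics` module, using the same ten scores. This is just a cross-check of the slide arithmetic.

```python
import statistics

scores = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]

print(statistics.mean(scores))     # 6
print(statistics.median(scores))   # 7.0 (average of the 5th and 6th ordered scores)
print(statistics.mode(scores))     # 8 (occurs three times)
```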
Measures of Spread
The Variance
The average of each score's squared
difference from the mean
Steps for computing the variance:
1. Subtract the mean from each score
2. Square each of these deviation scores
3. Add up the squared deviation scores
4. Divide the sum of squared deviation
scores by the number of scores
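The four steps translate directly into code. The sketch below applies them to the ten scores from the mean example (mean = 6); `population_variance` is a hypothetical helper name.

```python
def population_variance(scores):
    m = sum(scores) / len(scores)
    deviations = [x - m for x in scores]      # 1. subtract the mean from each score
    squared = [d ** 2 for d in deviations]    # 2. square each deviation score
    ss = sum(squared)                         # 3. add up the squared deviations
    return ss / len(scores)                   # 4. divide by the number of scores

print(population_variance([7, 8, 8, 7, 3, 1, 6, 9, 3, 8]))   # 6.6
```

Here the squared deviations sum to 66, and 66/10 = 6.6.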
Project #1
What:
Read and summarize a journal article
See PIP packet for more details and lab section
webpages
Descriptive statistics
In addition to pictures of the distribution,
numerical summaries are also typically presented.
Numeric Descriptive Statistics
Shape: skew and kurtosis
Measures of Center: mean, median, mode
Measures of Variability (Spread): range, Inter-quartile range,
standard deviation (& variance)

Variability of a distribution
Variability provides a quantitative measure of the degree to
which scores in a distribution are spread out or clustered
together.
In other words, variability refers to the degree of difference among
the scores in the distribution.
High variability means that
the scores differ by a lot
Low variability means that the scores
are all similar
Range
The simplest measure of variability is the range,
which we've already mentioned in our earlier
discussions.
Range = Maximum value - minimum value
There is a big drawback:
the statistic is based solely on the two most extreme values in
the distribution
Interquartile range
An alternative measure of variability is the inter-quartile
range.
The median (50th percentile) is the point at which exactly half the
distribution falls on one side and the other half on the other side.
Considering the same logic
What does the 25th percentile represent?
The 75th percentile?
The inter-quartile range is the distance between the first
quartile and the third quartile. So this corresponds to the
middle 50% of the scores of our distribution.
[Figure: a distribution divided into four sections of 25% each by the 25th percentile, the median, and the 75th percentile; the IQR spans from the 25th to the 75th percentile]
The inter-quartile range focuses on the middle half of all of
the scores in the distribution.
Thus it is more representative of the distribution as a whole
compared to the range
Extreme scores (i.e., outliers) will not influence the measure
(sometimes referred to as being robust).
However, this still means that not all of the scores in the distribution are
represented in the measure (only the middle 50%).
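The contrast between the range and the inter-quartile range can be seen with a small Python sketch. The scores are made up for illustration, and the quartiles come from `statistics.quantiles` with its default method; other textbooks and software use slightly different quartile conventions, so exact IQR values may differ.

```python
import statistics

def value_range(scores):
    return max(scores) - min(scores)          # based only on the two extreme values

def iqr(scores):
    q1, _median, q3 = statistics.quantiles(scores, n=4)   # three cut points: Q1, Q2, Q3
    return q3 - q1                            # distance between first and third quartile

scores = [2, 4, 4, 5, 6, 7, 7, 8, 9]          # hypothetical scores
with_outlier = scores[:-1] + [90]             # replace the top score with an extreme one

print(value_range(scores), value_range(with_outlier))   # 7 88
print(iqr(scores), iqr(with_outlier))                   # the IQR is the same in both cases
```

One extreme score moves the range from 7 to 88 while leaving the IQR untouched, which is what "robust" means here.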
Standard deviation
The standard deviation is the most popular and
most important measure of variability.
In essence, the standard deviation measures how far off
all of the individuals in the distribution are from a
standard, where that standard is the mean of the
distribution. Essentially, the average of the deviations.

Computing standard deviation (population)
Step 1: To get a measure of the deviation we need to
subtract the population mean from every individual in our
distribution.
Our population: 2, 4, 6, 8
$\mu = \frac{\sum X}{N} = \frac{2 + 4 + 6 + 8}{4} = \frac{20}{4} = 5.0$
$X - \mu$ = deviation scores
2 - 5 = -3
4 - 5 = -1
6 - 5 = +1
8 - 5 = +3
Notice that if you add up all of the deviations they must equal 0.
Step 2: So what we have to do is get rid of the negative
signs. We do this by squaring the deviations and then
adding them together to get the sum of the squared
deviations (SS).
$SS = \sum (X - \mu)^2 = (-3)^2 + (-1)^2 + (+1)^2 + (+3)^2 = 9 + 1 + 1 + 9 = 20$
Step 3: Now we have the sum of squares (SS), but we want the population variance, not just the SS, because the SS
depends on the number of individuals in the population.
The variance is simply the average (mean) of the squared deviations, so we divide the SS by the number of individuals in the
population.
variance = $\sigma^2 = SS/N$
Step 4: However, the population variance isn't exactly what we want;
we want the standard deviation from the mean of the population. To
get this we need to take the square root of the population variance.
standard deviation = $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
To review:
Step 1: compute deviation scores
Step 2: compute the SS
either by using definitional formula or the computational
formula
Step 3: determine the variance
take the average of the squared deviations
divide the SS by the N
Step 4: determine the standard deviation
take the square root of the variance
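The four review steps can be traced in Python for the example population 2, 4, 6, 8. This is a minimal sketch with illustrative variable names.

```python
import math

population = [2, 4, 6, 8]
mu = sum(population) / len(population)            # population mean: 5.0

deviations = [x - mu for x in population]         # Step 1: [-3.0, -1.0, 1.0, 3.0]
ss = sum(d ** 2 for d in deviations)              # Step 2: sum of squares = 20.0
variance = ss / len(population)                   # Step 3: SS / N = 5.0
sigma = math.sqrt(variance)                       # Step 4: square root of the variance

print(sum(deviations), ss, variance, round(sigma, 2))   # 0.0 20.0 5.0 2.24
```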
Computing standard deviation (sample)
The basic procedure is the same.
Step 1: compute deviation scores
Step 2: compute the SS
Step 3: determine the variance (this step is different)
Step 4: determine the standard deviation
Step 1: Compute the deviation scores
subtract the sample mean from every individual in our distribution.
Our sample: 2, 4, 6, 8
$\bar{X} = \frac{\sum X}{n} = \frac{2 + 4 + 6 + 8}{4} = \frac{20}{4} = 5.0$
$X - \bar{X}$ = deviation scores
2 - 5 = -3
4 - 5 = -1
6 - 5 = +1
8 - 5 = +3
Step 2: Determine the sum of the squared deviations (SS).
$SS = \sum (X - \bar{X})^2 = (-3)^2 + (-1)^2 + (+1)^2 + (+3)^2 = 9 + 1 + 1 + 9 = 20$
Apart from notational differences the procedure is
the same as before
Step 3: Determine the variance
Recall: Population variance = $\sigma^2 = SS/N$
The variability of the samples is
typically smaller than the
population's variability
To correct for this we divide by (n - 1) instead of just n
Sample variance = $s^2 = \frac{SS}{n - 1}$
Step 4: Determine the standard deviation
standard deviation = $s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
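The effect of dividing by n - 1 instead of n can be compared directly. The sketch below treats 2, 4, 6, 8 as a sample and cross-checks the hand computation against `statistics.variance` and `statistics.stdev`, which use the n - 1 divisor.

```python
import statistics

sample = [2, 4, 6, 8]
n = len(sample)
x_bar = statistics.mean(sample)                   # sample mean: 5

ss = sum((x - x_bar) ** 2 for x in sample)        # SS = 20, same as for the population
population_style = ss / n                         # dividing by n gives 5.0
sample_variance = ss / (n - 1)                    # dividing by n - 1 gives about 6.67

print(population_style, round(sample_variance, 2))            # 5.0 6.67
print(round(statistics.stdev(sample), 2))                     # 2.58
```

The SS is identical in both cases; only the divisor changes, so the sample variance comes out larger than SS/n.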
Characteristics of a standard deviation
Change/add/delete a given score, then the standard
deviation will change.
Doing so may change the mean and (if adding or deleting a score) the
number of scores (n or N)
Add/subtract a constant to each score, then the
standard deviation will NOT change.
All of the scores change by the same constant, but so does the mean.
It is as if you just pick up the distribution and move it over, but the
spread (variability) stays the same
[Figure: the distribution shifted along the axis, with its mean moving from $\bar{X}_{old}$ to $\bar{X}_{new}$]
Looking at a numerical example.
Original sample: 2, 4, 6, 8
Original mean: 5
2 - 5 = -3
4 - 5 = -1
6 - 5 = +1
8 - 5 = +3
Original SS: 20
New sample: 1, 3, 5, 7
New mean: 4
1 - 4 = -3
3 - 4 = -1
5 - 4 = +1
7 - 4 = +3
New SS: 20
Multiply (or divide) each score by a constant, then
the standard deviation will change by being
multiplied (or divided) by that constant.
Original sample: 21, 23 (mean = 22)
21 - 22 = -1
23 - 22 = +1
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{(-1)^2 + (+1)^2}{2 - 1}} = \sqrt{2} = 1.41$
Each score multiplied by 2: 42, 46 (mean = 44)
42 - 44 = -2
46 - 44 = +2
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{(-2)^2 + (+2)^2}{2 - 1}} = \sqrt{8} = 2.83$
This is twice the old standard deviation ($s_{old} = 1.41$).
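Both characteristics can be checked numerically with `statistics.stdev` (the sample formula with n - 1). The samples below are the ones from the numerical examples; the variable names are illustrative.

```python
import statistics

# Adding or subtracting a constant: the spread does not change
original = [2, 4, 6, 8]
shifted = [x - 1 for x in original]               # 1, 3, 5, 7
print(statistics.stdev(original) == statistics.stdev(shifted))   # True

# Multiplying by a constant: the standard deviation is multiplied too
pair = [21, 23]
doubled = [2 * x for x in pair]                   # 42, 46
print(round(statistics.stdev(pair), 2))           # 1.41
print(round(statistics.stdev(doubled), 2))        # 2.83
```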
When to use which
Extreme scores: range is most affected, IQR is least
affected
Sample size: range tends to increase as n increases, IQR &
s do not
The range does not have stable values when you repeatedly
sample from the same population, but the IQR & S are
stable and tend not to fluctuate.
With open-ended distributions, one cannot even compute
the range or S, so the IQR is the only option
Measures of Spread
The Variance
Formula for the variance:
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{SS}{N}$
Measures of Spread
The Standard Deviation
Most common way of describing the spread
of a group of scores
Steps for computing the standard deviation:
1. Figure the variance
2. Take the square root

Measures of Spread
The Standard Deviation
Formula for the standard deviation:
$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}} = \sqrt{\frac{SS}{N}}$
Z Scores
Number of standard deviations a score is
above or below the mean
Formula to change a raw score to a Z score:
$Z = \frac{X - \mu}{\sigma}$
Formula to change a Z score to a raw score:
$X = (Z)(\sigma) + \mu$
Distribution of Z scores
Mean = 0
Standard deviation = 1
Locating a score
Where is our raw score within the distribution?
The natural choice of reference is the mean (since it is usually easy
to find).
So we'll subtract the mean from the score (find the deviation score): $X - \mu$
The direction will be given to us by the negative or
positive sign on the deviation score
The distance is the value of the deviation score
$\mu = 100$ (reference point)
$X_1 = 162$: $X_1 - 100 = +62$ (direction: above the mean)
$X_2 = 57$: $X_2 - 100 = -43$ (direction: below the mean)
Transforming a score
The distance is the value of the deviation score
However, this distance is measured with the units of
measurement of the score.
Convert the score to a standard (neutral) score. In this case a
z-score.
$z = \frac{X - \mu}{\sigma}$
where X is the raw score, $\mu$ is the population mean, and $\sigma$ is the population standard deviation
Transforming scores
$\mu = 100$, $\sigma = 50$
$X_1 = 162$: $z = \frac{X_1 - \mu}{\sigma} = \frac{162 - 100}{50} = +1.24$
$X_2 = 57$: $z = \frac{X_2 - \mu}{\sigma} = \frac{57 - 100}{50} = -0.86$
A z-score specifies the precise location
of each X value within a distribution.
Direction: The sign of the z-score (+
or -) signifies whether the score is
above the mean or below the mean.
Distance: The numerical value of the
z-score specifies the distance from the
mean by counting the number of
standard deviations between X and $\mu$.
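The transformation is a one-line function. The sketch below uses the values from this example (mean 100, standard deviation 50); `z_score` is a name chosen for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

print(z_score(162, 100, 50))   # 1.24
print(z_score(57, 100, 50))    # -0.86
```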
Transforming a distribution
We can transform all of the scores in a distribution
We can transform any & all observations to z-scores if we know
the distribution's mean and standard deviation.
We call this transformed distribution a standardized distribution.
Standardized distributions are used to make dissimilar distributions
comparable.
e.g., your height and weight
One of the most common standardized distributions is the Z-distribution.
Properties of the z-score distribution
Transformation: $z = \frac{X - \mu}{\sigma}$, with $\mu = 100$ and $\sigma = 50$
$X_{mean} = 100$: $z_{mean} = \frac{100 - 100}{50} = 0$
$X_{+1std} = 150$: $z_{+1std} = \frac{150 - 100}{50} = +1$
$X_{-1std} = 50$: $z_{-1std} = \frac{50 - 100}{50} = -1$
The transformed distribution has $\mu = 0$ and $\sigma = 1$.
Shape - the shape of the z-score distribution will be exactly the same as
the original distribution of raw scores. Every score stays in the exact
same position relative to every other score in the distribution.
Mean - when raw scores are transformed into z-scores, the mean will
always = 0.
The standard deviation - when any distribution of raw scores is
transformed into z-scores the standard deviation will always = 1.
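The mean and standard deviation properties can be checked by standardizing a whole set of scores. This sketch treats 2, 4, 6, 8 as a complete population (an assumption made for illustration) and uses `statistics.pstdev`, the population formula.

```python
import statistics

def standardize(scores):
    """Convert every raw score to a z-score using the population mean and SD."""
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)
    return [(x - mu) / sigma for x in scores]

z = standardize([2, 4, 6, 8])
print([round(v, 2) for v in z])                   # [-1.34, -0.45, 0.45, 1.34]
print(round(statistics.mean(z), 6), round(statistics.pstdev(z), 6))
```

The z-scores keep the same order and relative spacing as the raw scores, while their mean comes out at 0 and their standard deviation at 1 (up to floating-point rounding).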




From z to raw score
We can also transform a z-score back into a raw score if we know the
mean and standard deviation information of the original distribution.
$Z = \frac{X - \mu}{\sigma} \rightarrow (Z)(\sigma) = X - \mu \rightarrow X = (Z)(\sigma) + \mu$
Transformation: $X = Z\sigma + \mu$
With $\mu = 100$ and $\sigma = 50$: Z = -0.60, X = (-0.60)(50) + 100, X = 70
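The reverse transformation is equally short. The sketch below uses the slide's values and also shows a round trip from raw score to z-score and back; both function names are illustrative.

```python
def raw_score(z, mu, sigma):
    """Convert a z-score back to a raw score: X = Z * sigma + mu."""
    return z * sigma + mu

def z_score(x, mu, sigma):
    return (x - mu) / sigma

x = raw_score(-0.60, 100, 50)
print(round(x, 2))                                # 70.0
print(round(z_score(x, 100, 50), 2))              # -0.6, back where we started
```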

Controversies and Limitations
The Tyranny of the Mean

Knowledge about the individual case is
lost when taking averages
Qualitative research methods
e.g., case studies, ethnography
The Mean and Standard Deviation in
Research Articles
Commonly reported in research articles

