
Measures of Central Tendency

Levin and Fox, Elementary Statistics in Social Research, Chapter 3

Measures of central tendency:

Measures of central tendency are numbers that describe what is average or typical in a distribution. We will focus on three measures of central tendency:
- The Mode
- The Median
- The Mean (average)
Our choice of an appropriate measure of central tendency depends on three factors: (a) the level of measurement, (b) the shape of the distribution, and (c) the purpose of the research.

The Mode
The Mode: The mode is the most frequent, most typical, or most common value or category in a distribution. Example: There are more Protestants in the US than people of any other religion. The mode is always a category or score, not a frequency. The mode is not necessarily the category with the majority (that is, 50% or more) of cases. It is simply the category in which the largest number of cases falls.

The Mode
The Mode:
- Most frequent or most common value or category.
- A category or score (not a frequency).
- Not necessarily the majority.
- Used to describe nominal variables!

Let's Practice!
Look at the figure below and identify the mode.

[Figure: pie chart of respondents' self-rated health]

A Review of Mode
The pie chart shows the answers of 1998 GSS respondents to the question, "Would you say your own health, in general, is excellent, good, fair, or poor?" Note that the highest percentage (49%) of respondents is associated with the answer "good." The answer "good" is the mode. Remember: The mode is used to describe nominal variables!

A Review of Mode
Another Mode Example: Our question is the following: What is the most common foreign language spoken in the United States today, as determined by the mode? To answer this question, let's look at a list of the ten most commonly spoken foreign languages in the United States and the number of people who speak each foreign language:

Ten Most Common Foreign Languages Spoken in the United States, 1990

Language      Number of Speakers
Spanish       17,339,000
French         1,702,000
German         1,547,000
Italian        1,309,000
Chinese        1,249,000
Tagalog          843,000
Polish           723,000
Korean           626,000
Vietnamese       507,000
Portuguese       430,000

Source: U.S. Bureau of the Census, Statistical Abstract of the United States, 2000, Table 51.

A Review of Mode
Is the mode 17,339,000? No! Recall: the mode is the category or score, not the frequency. Thus, the mode is Spanish.
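The distinction between the category and its frequency can be sketched in a few lines of Python. This is only an illustration: the dictionary repeats the speaker counts from the table above, and the helper name `mode_category` is made up for this example.

```python
# Number of speakers per language, copied from the 1990 table above.
speakers = {
    "Spanish": 17_339_000,
    "French": 1_702_000,
    "German": 1_547_000,
    "Italian": 1_309_000,
    "Chinese": 1_249_000,
    "Tagalog": 843_000,
    "Polish": 723_000,
    "Korean": 626_000,
    "Vietnamese": 507_000,
    "Portuguese": 430_000,
}


def mode_category(frequencies):
    """Return the category with the largest frequency, not the frequency itself."""
    return max(frequencies, key=frequencies.get)


print(mode_category(speakers))            # Spanish
print(speakers[mode_category(speakers)])  # 17339000 (the frequency, which is NOT the mode)
```

The function returns the label "Spanish"; the count 17,339,000 is only the evidence for choosing it.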

The Mode
Some additional points to consider about modes: Some distributions have two modes, where two response categories share the highest frequency. Such distributions are said to be bimodal. NOTE: When the two highest-frequency scores or categories are quite close, but not identical, in frequency, the distribution is still essentially bimodal. In these instances, report both the true mode and the other high-frequency category.

[Figure: Example of a Bimodal Frequency Distribution]
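As a rough sketch of how bimodality might be checked, the snippet below returns every category tied for the highest frequency, with an optional tolerance for near-ties. The frequencies, the function name `modes`, and the `tolerance` parameter are all hypothetical and chosen only for illustration.

```python
def modes(frequencies, tolerance=0):
    """Return every category whose frequency is within `tolerance` of the highest."""
    top = max(frequencies.values())
    return [cat for cat, freq in frequencies.items() if top - freq <= tolerance]


# Hypothetical frequency distributions (not taken from the slides).
exact_tie = {"A": 40, "B": 12, "C": 40, "D": 8}
near_tie = {"A": 40, "B": 12, "C": 39, "D": 8}

print(modes(exact_tie))              # ['A', 'C']  -> bimodal
print(modes(near_tie))               # ['A']       -> the true mode only
print(modes(near_tie, tolerance=1))  # ['A', 'C']  -> essentially bimodal
```

How close is "quite close" remains a judgment call, which is why the tolerance is left as an explicit argument.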

The Median
The Median: The median is the score that divides the distribution into two equal parts so that half of the cases are above it and half are below it. The median can be calculated for both ordinal and interval levels of measurement, but not for nominal data. It must be emphasized that the median is the exact middle of a distribution. So, now let's look at ways we can find the median in sorted data:

The Mode and Median


The Mode:
- Most frequent or most common value or category.
- A category or score (not a frequency).
- Not necessarily the majority.
- Used to describe nominal variables!

The Median:
- Divides the distribution into two equal parts (the exact middle: 50% above and 50% below).
- The median can be calculated for both ordinal and interval levels of measurement, but not for nominal data.
- Need to sort the data to calculate it.

In some cases, we can find the median by simple inspection.

Let's look at the responses (A) to the question: "Think about the economy, how would you rate economic conditions in the country today?" First, we sort the responses (B) in order from lowest to highest (or highest to lowest). Since we have an odd number of cases, let's find the middle case.

A (as collected)          B (sorted)
Jim      Poor             Jim      Poor
Sue      Good             Jorge    Poor
Bob      Only Fair        Bob      Only Fair
Jorge    Poor             Sue      Good
Karen    Excellent        Karen    Excellent
Total (N)   5             Total (N)   5

Calculating the median:


Jim      Poor
Jorge    Poor
Bob      Only Fair
Sue      Good
Karen    Excellent

We can find the median through visual inspection and through calculation. We can also find the middle case when N is odd by adding 1 to N and dividing by 2: (N + 1) ÷ 2. Since N is 5, you calculate (5 + 1) ÷ 2 = 3. The middle case is thus the third case (Bob), and the median response is Only Fair.
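The (N + 1) ÷ 2 rule for ordinal data can be sketched as follows. The numeric ranks assigned to the response categories are an assumption made for sorting purposes only, and `ordinal_median` is a hypothetical helper name.

```python
# Assumed rank order of the response categories, lowest to highest.
RANK = {"Poor": 0, "Only Fair": 1, "Good": 2, "Excellent": 3}

# Responses as collected (list A on the slide).
responses = {
    "Jim": "Poor",
    "Sue": "Good",
    "Bob": "Only Fair",
    "Jorge": "Poor",
    "Karen": "Excellent",
}


def ordinal_median(responses, rank):
    """Sort the cases by category rank and return (position, (case, response)) for odd N."""
    ordered = sorted(responses.items(), key=lambda item: rank[item[1]])
    n = len(ordered)
    if n % 2 == 0:
        raise ValueError("this sketch only handles an odd number of cases")
    position = (n + 1) // 2  # 1-based position of the middle case
    return position, ordered[position - 1]


print(ordinal_median(responses, RANK))  # (3, ('Bob', 'Only Fair'))
```

The ranks are used only to order the cases; the median itself is reported as the category "Only Fair", not as a number.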

Calculating the median:


Another example: The following is a list of the number of hate crimes reported in the nine largest U.S. states for 1997.

State             Number
California        1831
Florida             93
Virginia           105
New Jersey         694
New York           853
Ohio               265
Pennsylvania       168
Texas              333
North Carolina      42
TOTAL             N = 9

Calculating the median:


Finding the Median Number of Hate Crimes
1. Order the cases from lowest to highest.
2. In this situation, we need the 5th case: (9 + 1) ÷ 2 = 5, which is 265 (interval data).
Remember: (N + 1) ÷ 2.

State             Number
North Carolina      42
Florida             93
Virginia           105
Pennsylvania       168
Ohio               265
Texas              333
New Jersey         694
New York           853
California        1831
N = 9

Finding the Median Number of Hate Crimes out of Eight States


1. Order the cases from lowest to highest.
2. The median is always that point above which 50% of cases fall and below which 50% of cases fall.
3. For an even number of cases, there will be two middle cases.
4. In this instance, the median falls halfway between both cases: (168 + 265) ÷ 2 = 216.5.
5. However, the circumstances being explained should determine if you use the two middle cases or the point halfway between both cases for your explanation.

State             Number
North Carolina      42
Florida             93
Virginia           105
Pennsylvania       168
Ohio               265
Texas              333
New Jersey         694
New York           853
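Both cases, odd and even N, can be handled by one small function. The sketch below is a minimal illustration using the hate crime counts from the nine-state and eight-state lists; Python's built-in `statistics.median` does the same thing.

```python
def median(scores):
    """Return the middle score (odd N) or the midpoint of the two middle scores (even N)."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the (N + 1) / 2-th case
    return (ordered[mid - 1] + ordered[mid]) / 2  # halfway between the two middle cases


nine_states = [1831, 93, 105, 694, 853, 265, 168, 333, 42]
eight_states = [42, 93, 105, 168, 265, 333, 694, 853]

print(median(nine_states))   # 265
print(median(eight_states))  # 216.5
```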

The median in frequency distributions:


So now, let's find the median in frequency distributions. Often the data are arranged in frequency distributions, and the procedure is a bit more involved: we have to find the category associated with the observation located in the middle of the distribution. To do this, we construct a cumulative percentage distribution. So, let's take a look at a frequency distribution.

Table: Political Views of GSS Respondents, 1988


Political Views           Frequency (f)   Cf      Percentage   C%
Extremely Liberal           32              32      2.4          2.4
Liberal                    175             207     12.9         15.3
Slightly Liberal           189             396     13.9         29.2
Moderate                   502             898     37.0         66.2
Slightly Conservative      211            1109     15.6         81.8
Conservative               203            1312     15.0         96.8
Extremely Conservative      44            1356      3.2        100.0
Total                     1356                    100.0

Cumulative Percentage Distribution:

We construct a cumulative percentage distribution to help locate the middle of the distribution. The observation located in the middle of the distribution is the one that has a cumulative percentage value equal to 50%.

Notice that 29.2% of the observations are accumulated below the category of moderate and that 66.2% are accumulated up to and including the category moderate. The 50% mark therefore falls within the category moderate.
The median is the value of the category associated with this middle observation, so the median for this distribution is moderate.

Table: Political Views of GSS Respondents, 1988


Political Views           Frequency (f)   Cf      Percentage   C%
Extremely Liberal           32              32      2.4          2.4
Liberal                    175             207     12.9         15.3
Slightly Liberal           189             396     13.9         29.2
Moderate                   502             898     37.0         66.2    <- 50% falls between 29.2 and 66.2
Slightly Conservative      211            1109     15.6         81.8
Conservative               203            1312     15.0         96.8
Extremely Conservative      44            1356      3.2        100.0
Total                     1356                    100.0
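The cumulative percentage procedure can be sketched in Python as below. The frequencies are those of the 1988 GSS table; the function name `median_category` is hypothetical, and the code assumes the categories are already listed in their ordinal order.

```python
# (category, frequency) pairs in ordinal order, from the 1988 GSS table.
political_views = [
    ("Extremely Liberal", 32),
    ("Liberal", 175),
    ("Slightly Liberal", 189),
    ("Moderate", 502),
    ("Slightly Conservative", 211),
    ("Conservative", 203),
    ("Extremely Conservative", 44),
]


def median_category(distribution):
    """Return the first category whose cumulative percentage reaches 50%."""
    total = sum(freq for _, freq in distribution)
    if total == 0:
        raise ValueError("distribution has no cases")
    cumulative = 0
    for category, freq in distribution:
        cumulative += freq                      # cumulative frequency (Cf)
        cumulative_pct = 100 * cumulative / total  # cumulative percentage (C%)
        if cumulative_pct >= 50:
            return category, round(cumulative_pct, 1)


print(median_category(political_views))  # ('Moderate', 66.2)
```

The loop stops at Moderate because the cumulative percentage jumps from 29.2% to 66.2% there, passing the 50% mark.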

The Mean
The Mean:
The mean is what most people call the average. To find the mean of any distribution, simply add up all the scores and divide by the total number of scores.

Here is the formula for calculating the mean:

$$\bar{X} = \frac{\sum X}{N}$$

where
$\bar{X}$ = mean (read as "X bar")
$\sum$ = sum (expressed as the Greek letter sigma)
$X$ = raw score in a set of scores
$N$ = total number of scores in a set

Finding the Mean


Communicable Diseases -> Tuberculosis (as of 22 March 2007) -> Case detection rate (MDG indicator 24) -> DOTS all new case detection rate (%) -> Total (Periodicity: Year, Applied Time Period: from 2005 to 2005)

Country                                    2005
Bangladesh                                  37
Bhutan                                      44
Democratic People's Republic of Korea      103
India                                       58
Indonesia                                   47
Maldives                                    76
Myanmar                                    119
Nepal                                       64
Sri Lanka                                   71
Thailand                                    61
Timor-Leste                                 71

World Health Organization, 2008. All rights reserved

Finding the Mean


Finding the Mean: To find the mean case detection rate for 2005 across the countries in this region:
- Add up the rates for all of the countries in the region, and
- Divide the sum by the total number of countries.

$$\bar{X} = \frac{\sum X}{N}$$

Thus, the mean rate is (751 ÷ 11) = 68.273.
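The arithmetic can be checked with a short Python sketch. The rates are the eleven 2005 values from the table; the helper name `mean` is just for illustration (the standard library offers `statistics.mean` as well).

```python
# 2005 DOTS new case detection rates (%), from the table above.
rates = {
    "Bangladesh": 37,
    "Bhutan": 44,
    "Democratic People's Republic of Korea": 103,
    "India": 58,
    "Indonesia": 47,
    "Maldives": 76,
    "Myanmar": 119,
    "Nepal": 64,
    "Sri Lanka": 71,
    "Thailand": 61,
    "Timor-Leste": 71,
}


def mean(scores):
    """Sum of the scores divided by the number of scores."""
    scores = list(scores)
    return sum(scores) / len(scores)


print(sum(rates.values()))             # 751
print(len(rates))                      # 11
print(round(mean(rates.values()), 3))  # 68.273
```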

Using a formula to calculate the mean:


The Usefulness of Formulas: The mean introduces the usefulness of a formula, which may be defined as a shorthand way to explain what operations we need to follow to obtain a certain result. Again, the formula that defines the mean is:

$$\bar{X} = \frac{\sum X}{N}$$

where
$\bar{X}$ = mean (read as "X bar")
$\sum$ = sum (expressed as the Greek letter sigma)
$X$ = raw score in a set of scores
$N$ = total number of scores in a set

Deviation:
Deviation: The deviation indicates the distance and direction of any raw score from the mean. To find the deviation of a particular score, we simply subtract the mean from the score:

$$\text{Deviation} = X - \bar{X}$$

where
$X$ = any raw score in the distribution
$\bar{X}$ = mean of the distribution
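A brief sketch of deviations in Python follows, reusing the eleven 2005 detection rates from the earlier slide as the raw scores. The function name `deviations` is hypothetical. The sign of each deviation gives the direction (below or above the mean), and the deviations sum to zero apart from floating-point rounding.

```python
# Raw scores: the eleven 2005 detection rates (%) from the earlier slide.
scores = [37, 44, 103, 58, 47, 76, 119, 64, 71, 61, 71]


def deviations(scores):
    """Return X - mean for every raw score X."""
    mean = sum(scores) / len(scores)
    return [x - mean for x in scores]


devs = deviations(scores)
print(round(devs[0], 3))  # -31.273 (37 lies below the mean)
print(round(devs[6], 3))  # 50.727 (119 lies above the mean)
print(abs(sum(devs)) < 1e-9)  # True: deviations from the mean cancel out
```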

The Weighted Mean


When groups differ in size, you can't just sum their means and divide by the number of groups. Instead, you must weight each group mean by its size:

$$\bar{X}_w = \frac{\sum N_{group}\,\bar{X}_{group}}{N_{total}}$$

where
$\bar{X}_{group}$ = mean of a particular group
$N_{group}$ = number in a particular group
$N_{total}$ = number in all groups combined
$\bar{X}_w$ = weighted mean
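To see why the weighting matters, here is a minimal sketch with two hypothetical groups (the sizes and means are invented for illustration and do not come from the slides). The function name `weighted_mean` is likewise an assumption.

```python
def weighted_mean(groups):
    """groups: list of (n_group, mean_group) pairs. Returns sum(N_group * mean_group) / N_total."""
    n_total = sum(n for n, _ in groups)
    return sum(n * group_mean for n, group_mean in groups) / n_total


# Hypothetical data: a small group with a high mean, a large group with a lower mean.
groups = [(10, 80.0), (30, 60.0)]

simple_average_of_means = sum(m for _, m in groups) / len(groups)

print(simple_average_of_means)  # 70.0 (ignores group size)
print(weighted_mean(groups))    # 65.0 (pulled toward the larger group)
```

The weighted result equals the mean you would get by pooling all 40 individual scores, which is the point of weighting by group size.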

Time to practice!
Reasons Why Homeowners Get a Home Equity Line of Credit:
- Consolidate debts: 26
- Invest in other real estate: 3
- Home improvements/repairs: 45
- Other purposes: 9
- Purchase auto: 9
- Pay for education or medical: 4

So what do you do? And then?

We want to know the mode (Mo), the median (Mdn), and the mean ($\bar{X}$).
First, let's arrange the scores from highest to lowest.

Home improvements/repairs        45
Consolidate debts                26
Other purposes                    9
Purchase auto                     9
Pay for education or medical      4
Invest in other real estate       3
Total                            96

What's the most frequent score (Mo)? 9, because it occurs twice: both Other purposes and Purchase auto have the score of 9.
What is the middlemost score (Mdn)? 9, because (N + 1) ÷ 2 = (6 + 1) ÷ 2 = 3.5, so the median falls halfway between the 3rd and 4th scores, which are both 9.
What is the mean ($\bar{X}$)? 16, because the sum of the scores is 96 and we divide this by 6 to get 16.

Home improvements/repairs        45
Consolidate debts                26
Other purposes                    9
Purchase auto                     9
Pay for education or medical      4
Invest in other real estate       3
Total (N = 6)                    96
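The three answers can be verified with Python's standard `statistics` module. This is only a check of the arithmetic, treating the six numbers as raw scores in the way the exercise does.

```python
import statistics

# The six scores from the practice exercise.
scores = [45, 26, 9, 9, 4, 3]

print(statistics.multimode(scores))  # [9]  -> the mode (Mo)
print(statistics.median(scores))     # 9.0  -> halfway between the 3rd and 4th scores
print(statistics.mean(scores))       # 16   -> 96 / 6
```

`statistics.multimode` requires Python 3.8 or later; it returns a list so that bimodal data would show both modes.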

So what does this tell us?


In a skewed distribution, the mode is the peak of the curve, and the mean is found closest to the tail, where the relatively few extreme cases will be found. The median is found between the mode and the mean. In a normal (symmetrical, unimodal) distribution, all three are aligned.
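A small numeric sketch can make the ordering concrete. The scores below are hypothetical, chosen to form a distribution with a long right tail; they are not from the slides.

```python
import statistics

# Hypothetical right-skewed scores: most cases are low, a few are extreme.
skewed = [1, 2, 2, 2, 3, 3, 4, 10, 20]

mo = statistics.mode(skewed)
mdn = statistics.median(skewed)
mean = statistics.mean(skewed)

print(mo)              # 2  (the peak)
print(mdn)             # 3  (between the mode and the mean)
print(round(mean, 2))  # 5.22 (pulled toward the tail by 10 and 20)
```

The two extreme scores move the mean well above the median, while the mode stays at the peak.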

Did you know?


The shape or form of a distribution can influence the researcher's choice of a measure of central tendency. Why is that? Well, let's see...

Chapter Three: Review


Review: The Mode


The Mode: The mode is the category with the largest frequency (or percentage) in the distribution. The mode is always a category or score, not a frequency. The mode is not necessarily the category with the majority (that is, 50% or more) of cases. It is simply the category in which the largest number (or proportion) of cases falls.


Review: The Median


The Median: The median is the score that divides the distribution into two equal parts so that half of the cases are above it and half are below it. The median can be calculated for both ordinal and interval levels of measurement, but not for nominal data. It must be emphasized that the median is the exact middle of a distribution.


Review: The median:


Jim      Poor
Jorge    Poor
Bob      Only Fair
Sue      Good
Karen    Excellent

Calculating the median: We can find the median through visual inspection and through calculation. We can also find the middle case when N is odd by adding 1 to N and dividing by 2: (N + 1) ÷ 2. Since N is 5, you calculate (5 + 1) ÷ 2 = 3. The middle case is thus the third case (Bob), and the median response is Only Fair.

Review: The Mean


The Mean:
The mean is what most people call the average. To find the mean of any distribution, simply add up all the scores and divide by the total number of scores.

Here is the formula for calculating the mean:

$$\bar{X} = \frac{\sum X}{N}$$

where
$\bar{X}$ = mean (read as "X bar")
$\sum$ = sum (expressed as the Greek letter sigma)
$X$ = raw score in a set of scores
$N$ = total number of scores in a set

Review: Measures of Central Tendency


Reasons Why Homeowners Get a Home Equity Line of Credit:
- Consolidate debts: 26
- Invest in other real estate: 3
- Home improvements/repairs: 45
- Other purposes: 9
- Purchase auto: 9
- Pay for education or medical: 4

Review: Measures of Central Tendency

We want to know the mode (Mo), the median (Mdn), and the mean ($\bar{X}$).
First, let's arrange the scores from highest to lowest.

Home improvements/repairs        45
Consolidate debts                26
Other purposes                    9
Purchase auto                     9
Pay for education or medical      4
Invest in other real estate       3
Total                            96

What's the most frequent score (Mo)? 9, because it occurs twice: both Other purposes and Purchase auto have the score of 9.
What is the middlemost score (Mdn)? 9, because the two middle scores are both 9: 9 + 9 = 18, and if we divide 18 by 2, we get 9.
What is the mean ($\bar{X}$)? 16, because the sum of the scores is 96 and we divide this by 6 to get 16.

Home improvements/repairs        45
Consolidate debts                26
Other purposes                    9
Purchase auto                     9
Pay for education or medical      4
Invest in other real estate       3
Total (N = 6)                    96

Review: Shape of the Distribution


Choosing a Measure of Central Tendency: The shape or form of a distribution can influence the researcher's choice of a measure of central tendency.

Review: Shape of the Distribution


In a skewed distribution, the mode is the peak of the curve, and the mean is found closest to the tail, where the relatively few extreme cases will be found. The median is found between the mode and the mean. In a normal (symmetrical, unimodal) distribution, all three are aligned.
