Introduction To Statistics53004300

Introduction to Statistics
Measures of Central Tendency
Two Types of Statistics

Descriptive statistics of a POPULATION
Relevant notation (Greek): $\mu$ = mean, $N$ = population size, $\Sigma$ = sum
Inferential statistics of SAMPLES from a population.

Assumptions are made that the sample reflects the population in an unbiased form.
Roman notation: $\bar{X}$ = mean, $n$ = sample size, $\Sigma$ = sum
Be careful, though, because you may want to use inferential statistics even when you are dealing with a whole population.
Because of measurement error or missing data, treating a population as complete may give us inefficient estimates.
It depends on the type of data and project. Example of Democratic Peace.
Also, be careful about the phrase "descriptive statistics." It is often used generically to mean measures of central tendency and dispersion, even in inferential work.
Another name is summary statistics, which are univariate:
Mean, Median, Mode, Range, Standard Deviation, Variance, Min, Max, etc.
Measures of Central Tendency

These measures describe the typical or central value of a set of scores or values in the data.
Mean Median Mode
What do you Mean?

The mean of some data is the average score or value, such as the average age of an MPA student or average weight of professors that like to eat donuts.
Inferential mean of a sample: $\bar{X} = \dfrac{\sum X}{n}$
Mean of a population: $\mu = \dfrac{\sum X}{N}$
Problem of being mean

The main problem associated with the mean value of some data is that it is sensitive to outliers.
For example, the average weight of political science professors would be pulled upward if one member of the department weighed 600 pounds.
Donut-Eating Professors
Professor        Weight   Weight (with outliers)
Schmuggles       165      165
Bopsey           213      213
Pallitto         189      410
Homer            187      610
Schnickerson     165      165
Levin            148      148
Honkey-Doorey    251      251
Zingers          308      308
Boehmer          151      151
Queenie          132      132
Googles-Boop     199      199
Calzone          227      227
Mean             194.6    248.3
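The effect of the two extreme values can be checked with a short Python sketch. It assumes the weights in the table are in pounds and uses only the standard library; the variable names are illustrative and not part of the slides.

```python
from statistics import mean

# Weights (pounds) from the Donut-Eating Professors table.
weights = [165, 213, 189, 187, 165, 148, 251, 308, 151, 132, 199, 227]

# Same professors, but Pallitto and Homer are replaced by the
# extreme values 410 and 610 from the second column of the table.
weights_outliers = [165, 213, 410, 610, 165, 148, 251, 308, 151, 132, 199, 227]

mean_original = mean(weights)
mean_outliers = mean(weights_outliers)

# Two decimals are printed so the rounding step stays visible.
print(f"Mean without outliers: {mean_original:.2f}")
print(f"Mean with outliers:    {mean_outliers:.2f}")
print(f"Shift caused by two values: {mean_outliers - mean_original:.2f}")
```

Changing only two of the twelve values moves the mean by more than 50 pounds, which is the sensitivity to outliers described above.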
The Median (not the cement in the middle of the road)

Because the mean can be sensitive to extreme values, the median is sometimes more useful and more representative of a typical case. The median is simply the middle value among the ranked scores of a variable. (There is no standard formula for its computation.)
What is the Median?

Rank order the values and choose the middle value. If the number of values is even, average the two in the middle.

Professor        Weight   Ranked weight
Schmuggles       165      132
Bopsey           213      148
Pallitto         189      151
Homer            187      165
Schnickerson     165      165
Levin            148      187
Honkey-Doorey    251      189
Zingers          308      199
Boehmer          151      213
Queenie          132      227
Googles-Boop     199      251
Calzone          227      308
Mean             194.6

With 12 values, the median is the average of the 6th and 7th ranked values: (187 + 189) / 2 = 188.
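The rank-and-pick-the-middle rule can be sketched in Python as follows. The manual steps mirror the slide; `statistics.median` from the standard library is used only as a cross-check. The second list, with the 410 and 610 pound outliers, is included to show how little the median moves compared with the mean.

```python
from statistics import median

# Weights (pounds) of the twelve professors.
weights = [165, 213, 189, 187, 165, 148, 251, 308, 151, 132, 199, 227]
weights_outliers = [165, 213, 410, 610, 165, 148, 251, 308, 151, 132, 199, 227]

# Step 1: rank order the values.
ranked = sorted(weights)

# Step 2: pick the middle value, or average the two middle values
# when the number of observations is even.
n = len(ranked)
mid = n // 2
if n % 2 == 0:
    manual_median = (ranked[mid - 1] + ranked[mid]) / 2
else:
    manual_median = ranked[mid]

print(manual_median)             # 188.0
print(median(weights))           # 188.0
print(median(weights_outliers))  # 206.0
```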
Percentiles
If we know the median, then we can go up or down and rank the data as being above or below certain thresholds. You may be familiar with this from standardized tests: at the 90th percentile, your score was higher than 90% of the rest of the sample.
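A minimal sketch of that idea in Python: the helper below (a hypothetical `percentile_rank`, not a standard library function) reports the percentage of values strictly below a given score. Several percentile definitions are in common use, so treat this as one simple convention rather than the only one.

```python
# Weights (pounds) of the twelve professors.
weights = [165, 213, 189, 187, 165, 148, 251, 308, 151, 132, 199, 227]


def percentile_rank(values, score):
    """Return the percentage of values that fall strictly below score."""
    below = sum(1 for v in values if v < score)
    return 100 * below / len(values)


# Ten of the twelve professors weigh less than 251 pounds.
rank_251 = percentile_rank(weights, 251)
# Half of the professors weigh less than the median of 188 pounds.
rank_median = percentile_rank(weights, 188)

print(round(rank_251, 1))  # 83.3
print(rank_median)         # 50.0
```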
The Mode (hold the pie and the ala)

(What does ala taste like anyway??)
The most frequent response or value for a variable. Multiple modes are possible: bimodal or multimodal.
Figuring the Mode

Professor        Weight
Schmuggles       165
Bopsey           213
Pallitto         189
Homer            187
Schnickerson     165
Levin            148
Honkey-Doorey    251
Zingers          308
Boehmer          151
Queenie          132
Googles-Boop     199
Calzone          227

What is the mode?
Answer: 165
The mode is important descriptive information that may help inform your research and diagnose problems like lack of variability.
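Finding the mode amounts to counting how often each value occurs. The sketch below uses `collections.Counter` and `statistics.multimode` (available in Python 3.8 and later); `multimode` returns a list, so it also covers the bimodal and multimodal cases. The second list is made-up data used only to show a tie.

```python
from collections import Counter
from statistics import multimode

# Weights (pounds) of the twelve professors.
weights = [165, 213, 189, 187, 165, 148, 251, 308, 151, 132, 199, 227]

# Count how many times each weight occurs.
counts = Counter(weights)

# multimode returns every value that shares the highest count.
modes = multimode(weights)

print(modes)        # [165]
print(counts[165])  # 2

# Made-up example with two modes (bimodal).
bimodal = multimode([1, 1, 2, 2, 3])
print(bimodal)      # [1, 2]
```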
Measures of Dispersion (not something you cast)
Measures of dispersion tell us about variability in the data. Also univariate. Basic question: how much do values differ for a variable from the min to max, and distance among scores in between. We use:
Range Standard Deviation Variance
Remember that we said in order to glean information from data, i.e. to make an inference, we need to see variability in our variables.
Measures of dispersion give us information about how much our variables vary from the mean, because if they don't vary, it is difficult to infer anything from the data. Dispersion is also known as the spread or range of variability.
The Range (no Buffalo roaming!!)

$r = h - l$
where $h$ is the high (maximum) value and $l$ is the low (minimum) value
In other words, the range gives us the distance between the minimum and maximum values of a variable.
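Applied to the professors' weights from the earlier slides, the range is a one-line calculation. This is a small Python sketch with illustrative variable names.

```python
# Weights (pounds) of the twelve professors.
weights = [165, 213, 189, 187, 165, 148, 251, 308, 151, 132, 199, 227]

high = max(weights)  # heaviest professor
low = min(weights)   # lightest professor
value_range = high - low

print(high, low, value_range)  # 308 132 176
```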
Understanding this statistic is important in understanding your data, especially for management and diagnostic purposes.
The Standard Deviation

A standardized measure of distance from the mean. Very useful, and something you will read about when researchers make predictions or other statements about the data.
Formula for Standard Deviation
$S = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$
where
$\sqrt{\ }$ = square root
$\Sigma$ = sum (sigma)
$X$ = score for each point in data
$\bar{X}$ = mean of scores for the variable
$n$ = sample size (number of observations or cases)
Professor        X       X - mean   (X - mean) squared
Schmuggles       165     -29.6      875.2
Bopsey           213     18.4       339.2
Pallitto         189     -5.6       31.2
Homer            187     -7.6       57.5
Schnickerson     165     -29.6      875.2
Levin            148     -46.6      2170.0
Honkey-Doorey    251     56.4       3182.8
Zingers          308     113.4      12863.3
Boehmer          151     -43.6      1899.5
Queenie          132     -62.6      3916.7
Googles-Boop     199     4.4        19.5
Calzone          227     32.4       1050.8

Mean of X                                       194.6
Sum of squared deviations / (n - 1)             2480.1
Square root of the above (standard deviation)   49.8
We can see that the standard deviation equals 49.8 pounds. The weight of Zingers is still likely skewing this calculation (indirectly through the mean).
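The same calculation can be followed step by step in Python. The sketch below uses the sample formula with n - 1 in the denominator and cross-checks the result against `statistics.stdev`, which uses the same denominator.

```python
from math import sqrt
from statistics import stdev

# Weights (pounds) of the twelve professors.
weights = [165, 213, 189, 187, 165, 148, 251, 308, 151, 132, 199, 227]

n = len(weights)
mean_weight = sum(weights) / n

# Deviation of each score from the mean, then squared.
deviations = [x - mean_weight for x in weights]
squared_deviations = [d ** 2 for d in deviations]

# Sum of squared deviations divided by (n - 1), then the square root.
sample_variance = sum(squared_deviations) / (n - 1)
standard_deviation = sqrt(sample_variance)

print(round(mean_weight, 1))         # 194.6
print(round(sample_variance, 1))     # 2480.1
print(round(standard_deviation, 1))  # 49.8
print(round(stdev(weights), 1))      # 49.8
```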
Example of S in use
Boehmer-Sobek paper: a one standard deviation increase in the value of an X variable changes the probability of Y occurring by some amount.
Table 2: Development and Relative Risk of Territorial Claim

Variable                     Probability*   % Change
Baseline                     0.0401
Development                  0.0024         -94.3
Pop density                  0.0332         -17.3
Pop growth                   0.0469         16.8
Capability                   0.0813         102.5
Openness                     0.0393         -2
Capability and pop growth    0.0942         134.8

* % change in probability after a 1 SD change in the given X variable, holding the others at their means
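The % Change column can be read as the relative change from the baseline probability. The Python sketch below recomputes it from the probabilities shown in Table 2, assuming the usual formula 100 * (new - baseline) / baseline; the slides do not state the formula, so this is an assumption. Because the table's probabilities are rounded to four decimals, the recomputed values can differ from the published column by a few tenths of a percentage point.

```python
# Probabilities from Table 2 (baseline first, then after a 1 SD change in each variable).
baseline = 0.0401
probabilities = {
    "development": 0.0024,
    "pop density": 0.0332,
    "pop growth": 0.0469,
    "capability": 0.0813,
    "openness": 0.0393,
    "capability and pop growth": 0.0942,
}


def percent_change(new, base):
    """Relative change from base to new, expressed in percent."""
    return 100 * (new - base) / base


changes = {name: percent_change(p, baseline) for name, p in probabilities.items()}

for name, change in changes.items():
    print(f"{name:<28}{change:8.1f}")
```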
Let's go to computers!
Type the data into the Excel sheet.
Variance
$S^2 = \dfrac{\sum (X - \bar{X})^2}{n - 1}$
Note that this is the same equation as the standard deviation except that no square root is taken. The variance is not often reported directly in research but instead is a building block for other statistical methods.
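A short Python sketch of the relationship between variance and standard deviation, using the professors' weights from the earlier slides. `statistics.variance` divides by n - 1 (sample), while `statistics.pvariance` divides by N (population); the population version is shown only for comparison and is not computed in the slides.

```python
from math import sqrt
from statistics import pvariance, variance

# Weights (pounds) of the twelve professors.
weights = [165, 213, 189, 187, 165, 148, 251, 308, 151, 132, 199, 227]

sample_var = variance(weights)       # denominator n - 1
population_var = pvariance(weights)  # denominator N

print(round(sample_var, 1))        # 2480.1
print(round(sqrt(sample_var), 1))  # 49.8, the standard deviation
print(round(population_var, 1))    # 2273.4
```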
Organizing and Graphing Data
Goal of Graphing?
1. Presentation of Descriptive Statistics
2. Presentation of Evidence
3. Some people understand subject matter better with visual aids
4. Provide a sense of the underlying data generating process (scatterplots)
What is the Distribution?

Gives us a picture of the variability and central tendency.
Can also show the amount of skewness and kurtosis.
Graphing Data: Types
Creating Frequencies
We create frequencies by sorting data by value or category and then summing the cases that fall into those values. How often do certain scores occur? This is a basic descriptive data question.
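As a rough illustration of that sorting and summing, the Python sketch below builds a frequency table for the professors' weights with `collections.Counter`. The variable names are illustrative.

```python
from collections import Counter

# Weights (pounds) of the twelve professors.
weights = [165, 213, 189, 187, 165, 148, 251, 308, 151, 132, 199, 227]

# Sum the cases that fall on each distinct value.
frequencies = Counter(weights)

# Print the table sorted by value: how often does each score occur?
print("Weight  Frequency")
for value in sorted(frequencies):
    print(f"{value:>6}  {frequencies[value]:>9}")
```

Only 165 occurs more than once here, which is why the professors are grouped into weight classes before graphing.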
Ranking of Donut-eating Profs. (most to least)

Professor        Weight
Zingers          308
Honkey-Doorey    251
Calzone          227
Bopsey           213
Googles-Boop     199
Pallitto         189
Homer            187
Schnickerson     165
Schmuggles       165
Boehmer          151
Levin            148
Queenie          132
Here we have placed the professors into weight classes and depicted the counts with a histogram in columns.
[Figure: column histogram, "Weight Class Intervals of Donut-Munching Professors"; x-axis: weight classes 130-150, 151-185, 186-210, 211-240, 241-270, 271-310, 311+; y-axis: Number]
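The class counts behind such a histogram can be reproduced with a few lines of Python. The sketch below uses the class boundaries from the chart's axis labels and prints a simple text histogram; the counts are computed from the weights listed earlier, not read off the original chart.

```python
# Weights (pounds) of the twelve professors.
weights = [165, 213, 189, 187, 165, 148, 251, 308, 151, 132, 199, 227]

# Weight classes as (low, high); None means no upper bound (311+).
classes = [(130, 150), (151, 185), (186, 210), (211, 240),
           (241, 270), (271, 310), (311, None)]


def class_label(low, high):
    """Format a class the way it appears on the chart axis."""
    return f"{low}+" if high is None else f"{low}-{high}"


class_counts = {class_label(low, high): 0 for low, high in classes}

# Place each professor into the first class whose bounds contain the weight.
for w in weights:
    for low, high in classes:
        if w >= low and (high is None or w <= high):
            class_counts[class_label(low, high)] += 1
            break

# Text histogram: one '#' per professor in the class.
for label, count in class_counts.items():
    print(f"{label:>8} | {'#' * count} {count}")
```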
Here is the same histogram depicted as a bar graph.

[Figure: horizontal bar graph, "Weight Class Intervals of Donut-Munching Professors"; y-axis: weight classes 130-150 through 311+; x-axis: Number]
Pie Charts:
[Figure: pie chart, "Proportions of Donut-Eating Professors by Weight Class"; slices: 130-150, 151-185, 186-210, 211-240, 241-270, 271-310, 311+]
Actually, why not use a donut graph. Duh!

[Figure: donut chart, "Proportions of Donut-Eating Professors by Weight Class"; slices: 130-150, 151-185, 186-210, 211-240, 241-270, 271-310, 311+]
See Excel for other options!!!!
Line Graphs: A Time Series
[Figure: line graph of Approval and Economic approval; x-axis: Month, 1981 to 2001; y-axis: Approval, 0 to 100]
Scatter Plot (Two variable)

[Figure: scatter plot, "Presidential Approval and Unemployment"; x-axis: Unemployment, 0 to 12; y-axis: Approval, 0 to 100]
