We are now familiar with descriptive statistics, but the main use of statistical methods is not description but prediction
o i.e. we collect samples mostly to predict
The key instrument of extrapolation from sample to population is the analysis of probability distributions:
o by assuming that our variables have a certain distribution (normal, uniform, etc.), we can use samples to infer population properties
The most widely used statistical distribution is the normal distribution (the Bell curve)
o also the most infamous due to certain misuses
o http://crab.rutgers.edu/~goertzel/normalcurve.htm
The first reason for the popularity of the normal curve is descriptive; i.e. we use it to model the distribution of traits that look bell-shaped
What traits are bell-shaped? Typically, traits that are optimised or established by biological or social processes, and thus have a tendency to occur at an expected value
o classic example: biological traits under natural selection
o a reason Darwin applied the principle of optimisation to natural processes is that optimisation was a current concept in Victorian society (especially in Economics)
Such a distribution should have three properties:
o the mean value should be the most likely or most frequently observed
o the further from the mean, the less likely a value should be
o the sum of all cases (or probabilities) should be 100% (that's the whole sample)
> x <- seq(-3, 3, 0.05)
> y <- exp(x)
> plot(x, y) # what does it look like?
Try others:
o > y <- exp(-x)
o > y <- exp(x^2)
o > y <- exp(-x^2) # that works!
The normal distribution is just a modified version of our exponential
The curve N(0,1) = (1/√(2π)) e^(−x²/2)
[figure: the standard normal curve, z axis from −3 to +3 standard deviations]
It says, for example, that:
o the probability of being well above average (more than +3 standard deviations above the mean) is only 0.1%
o the probability of being one standard deviation below average or lower (below −1 sd) is 0.1 + 2.1 + 13.6 = 15.8% (i.e. everything below −1)
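The formula above can be checked in R; a minimal sketch, comparing the hand-written density with R's built-in normal density function dnorm:

```r
# The N(0,1) density, written out from the formula above
x <- seq(-3, 3, 0.05)
y <- exp(-x^2 / 2) / sqrt(2 * pi)
plot(x, y, type = "l")   # the bell curve

# R's built-in density dnorm() gives the same values
all.equal(y, dnorm(x))   # TRUE
```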
However, real traits (body height, income, schooling years, number of social media accounts) may have a normal distribution (bell shape), but rarely with mean=0 and standard deviation=1
That is not a problem: we can standardise variables, i.e. transform them so that everything you measure has mean=0 and sd=1
How is this done? With z-scores
1) We take all residuals (each case minus the mean)
2) We divide each residual by the standard deviation
o if sd = 10 and the mean is 180 cm, someone measuring 170 cm deviates −10 cm / 10 cm = −1 standard deviation below the mean
In summary, standardisation, or calculation of z-scores, is simply converting any measurement into standard deviation units:
z = (case − mean) / sd
[figure: the standard normal curve on the z scale, from −3 to +3]
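The conversion is a one-line calculation in R; a minimal sketch using the example values from the slides (mean 180 cm, sd 10 cm):

```r
# z-score: convert a measurement into standard deviation units
height <- 170   # the measured case
mu     <- 180   # population mean
sigma  <- 10    # population standard deviation

z <- (height - mu) / sigma
z   # -1: one standard deviation below the mean
```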
So: if in a population
o mean height = 180 cm
o standard deviation = 10 cm
This means that the probability of being shorter than 170 cm in this population is
o 0.1 + 2.1 + 13.6 = 15.8%
The reason for standardising is clear: it is the theoretical step that allows the application of the normal distribution to many quantifiable aspects of reality
We are interested in intervals of the normal curve, not points
Why? What does it mean to ask what is the probability of being a millionaire in the UK (or their frequency)?
o it does not mean the probability of having exactly 1 million (that's a single point on the curve)
o it means everyone having over 1 million (and that's an interval of the curve)
Cumulative probability is the probability of an interval of values
[figures: a lower interval and an upper interval shaded under the curve]
It is easy to estimate the cumulative probability of being smaller than a value in RStudio
o you provide the individual (test) value, the mean, and the sd, and R calculates the z-score and the probability of the interval defined by that value
The command pnorm(test value, mean, sd) calculates cumulative probability from left to right, i.e. from −∞ to a value x (that's the blue area)
Example: if your height is 170 cm, the average is 180 cm, and sd = 10 cm, then the probability of being shorter than 170 cm is
o > pnorm(170, 180, 10)
o [1] 0.1586553
[figure: a lower interval (shaded area below the cut-off value)]
pnorm can estimate upper intervals too (i.e. the probability of being over a given value)
Example:
o what is the probability of being at least (i.e. taller than) 190 cm in the same population?
1) The probability of being smaller than 190 cm (the WHITE area) is
> pnorm(190, 180, 10)
[1] 0.8413447
i.e. 0.841 = 84.1%
2) Thus the probability of being over 190 cm is the rest of the curve:
> 1 - pnorm(190, 180, 10)
[1] 0.1586553
i.e.: the probability of being taller than 190 cm is 1 (100%) minus the probability of being smaller than 190 cm
[figure: an upper interval (shaded area above the cut-off value)]
Important: we can combine the two to calculate the probability of extreme values (i.e. too large or too small)
So what is the probability of being shorter than 170 cm OR taller than 190 cm, with N(180, 10)?
(check why)
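In R, the two tails are simply added together; a minimal sketch with N(180, 10):

```r
# probability of being shorter than 170 cm OR taller than 190 cm, with N(180, 10)
lower <- pnorm(170, 180, 10)       # left tail: below 170 cm
upper <- 1 - pnorm(190, 180, 10)   # right tail: above 190 cm
lower + upper                      # about 0.317, i.e. 31.7%
# note that lower and upper come out equal: 170 and 190 sit symmetrically
# around the mean of 180, one sd on each side
```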
What about probability of not being extreme, i.e. of being between 170 cm and 190 cm? (This means less than 10 cm off average of 180 cm)
o > pnorm(190, 180, 10) - pnorm(170, 180, 10)
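Running the subtraction in R gives the familiar result that about 68% of a normal population lies within one standard deviation of the mean:

```r
# probability of being between 170 cm and 190 cm, with N(180, 10)
pnorm(190, 180, 10) - pnorm(170, 180, 10)   # about 0.683, i.e. 68.3%
```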
Take the estimates of years at school by country (from the HDR2011 database); this is the variable schoolingyears:
How can we estimate the proportion of countries with children having:
a) less than 3 years of schooling?
b) less than 5 years of schooling?
c) at least 7 years of schooling?
Hints:
- you need to use the function pnorm
- to use pnorm you need the test value, the mean, and the standard deviation of the variable schoolingyears
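A sketch of the approach, assuming the HDR2011 data have been loaded into a data frame called hdr with a column schoolingyears (the data frame name and the example values below are hypothetical placeholders; substitute your own objects):

```r
# hypothetical data frame standing in for the HDR2011 database
hdr <- data.frame(schoolingyears = c(3.5, 4.1, 7.2, 8.0, 11.3, 12.6, 5.9, 9.4))

# mean and sd of the variable, needed as arguments to pnorm
m <- mean(hdr$schoolingyears)
s <- sd(hdr$schoolingyears)

pnorm(3, m, s)       # a) proportion with less than 3 years
pnorm(5, m, s)       # b) proportion with less than 5 years
1 - pnorm(7, m, s)   # c) proportion with at least 7 years
```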