
Probability and probability distributions

Ian Jolliffe University of Aberdeen

CLIPS module 3.4

What is meant by the probability of an event?

Both event and probability are intuitively understood by most people, but we need to establish certain rules.
"Rain tomorrow", "3 or fewer cyclones next year", "the analysed surface pressure at a grid-point has an error of more than 2 hPa", and "crop yield will exceed a given threshold" are all examples of events whose probabilities might be of interest.

What is meant by the probability of an event?

Probabilities lie between 0 and 1. Zero probability implies that something is impossible. A probability of 1 means something is certain. What does an intermediate probability imply, for example if we say that the probability of rain tomorrow is 0.25?

Notation and terminology

Let A denote an event. The probability of that event is usually written P(A) or Pr(A). The complement of an event A, written Ac, is everything not in that event. The complement of rain tomorrow is no rain tomorrow; the complement of 3 or fewer cyclones is 4 or more cyclones. P(Ac) = 1 - P(A).

What is meant by the probability of an event?

A probability of 0.25 (also expressed as 1/4, or as 25%) implies that we think it is 3 times as likely not to rain as it is to rain. This is because

P(no rain) = 1 - P(rain) = 0.75, and 0.75/0.25 = 3.
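The complement rule and the "3 times as likely" calculation above can be sketched in a few lines of Python:

```python
# Complement rule: P(no rain) = 1 - P(rain); the ratio of the two
# probabilities gives the "3 times as likely" statement.
p_rain = 0.25
p_no_rain = 1 - p_rain             # 0.75
odds_against = p_no_rain / p_rain  # 3.0
```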

A probability can often be thought of as a long-term proportion of times an event will occur.

Probability - long-term proportions or subjective

In our rain/no rain example we might know that, for our station of interest, it rains on 25% of days at this time of year. Hence P(rain) = 0.25. However, some events are unique - for example, it is of interest to ask what is the probability that a particular tropical storm will make landfall on a particular stretch of coastline. There are no long-term data on which to base the probability, so subjectivity comes in.

Unions and intersections

The union of two events A and B consists of everything included in A or B or both. Let

A = {rain tomorrow} B = {rain the day after tomorrow} C = {3 or fewer cyclones} D = {4 or 5 cyclones}

Then
A∪B = {rain in the next 2 days} C∪D = {5 or fewer cyclones} P(C∪D) = P(C) + P(D), because C and D are mutually exclusive (they don't overlap).

Unions and intersections

P(A∪B) ≠ P(A) + P(B), because A and B do overlap. Instead,

P(A∪B) = P(A) + P(B) - P(A∩B).

A∩B is the intersection of A and B; it includes everything that is in both A and B, and is counted twice if we add P(A) and P(B). In our example

A∩B = {rain tomorrow and the day after tomorrow}. C∩D is empty - it is impossible for C and D to occur simultaneously, so P(C∩D) = 0.
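The inclusion-exclusion rule for unions can be illustrated numerically; the probability values below are assumed purely for illustration and are not from the module:

```python
# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B).
p_a, p_b, p_a_and_b = 0.3, 0.4, 0.12
p_union = p_a + p_b - p_a_and_b        # 0.58

# For mutually exclusive events (like C and D) the intersection term is 0,
# so the probabilities simply add:
p_c, p_d = 0.2, 0.15
p_union_cd = p_c + p_d                 # 0.35
```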

Conditional probability and independence

If we know that one event has occurred it may change our view of the probability of another event. Let

A = {rain today}, B = {rain tomorrow}, C = {rain in 90 days time}

It is likely that knowledge that A has occurred will change your view of the probability that B will occur, but not of the probability that C will occur. We write P(B|A) ≠ P(B), P(C|A) = P(C). P(B|A) denotes the conditional probability of B, given A. We say that A and C are independent, but A and B are not. Note that for independent events P(A∩C) = P(A)P(C).
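The definition P(B|A) = P(A∩B)/P(A) and the independence check can be sketched with assumed numbers; the joint probability 0.15 below is hypothetical, chosen only to show dependence:

```python
# Conditional probability: P(B | A) = P(A and B) / P(A).
p_a = 0.25          # P(rain today)
p_b = 0.25          # P(rain tomorrow)
p_a_and_b = 0.15    # P(rain on both days) -- assumed for illustration
p_b_given_a = p_a_and_b / p_a   # 0.6, well above P(B) = 0.25

# A and B are independent only if P(A and B) equals P(A)P(B):
independent = abs(p_a_and_b - p_a * p_b) < 1e-12   # False here
```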

Conditional probability - tornado forecasting

Consider the classic data set on the next Slide, consisting of forecasts and observations of tornadoes (Finley, 1884). Let

F = {Tornado forecast} T = {Tornado observed}

Use the frequencies in the table to estimate probabilities - it's a large sample, so the estimates should not be too bad.

Forecasts of tornadoes

                      Tornado forecast   No tornado forecast   Total
Tornado observed             28                  23              51
No tornado observed          72                2680            2752
Total                       100                2703            2803

Conditional probability - tornado forecasting

P(T) = 51/2803 = 0.0182 P(T|F) = 28/100 = 0.2800 P(T|Fc) = 23/2703 = 0.0085

Knowledge of the forecast changes P(T); F and T are not independent. P(T|F) and P(F|T) are often confused, but they are different quantities and can take very different values.

P(F|T) = 28/51 = 0.5490

Conditional probability - tornado forecasting

P(T∩F) = 28/2803 = P(T)P(F|T) = P(F)P(T|F) ≠ P(F)P(T). The two formulae for the probability of an intersection always hold. If A, B are independent, then P(A|B) = P(A) and P(B|A) = P(B), so P(A∩B) = P(A)P(B). Rearranging the two intersection formulae gives

P(B|A) = P(B)P(A|B)/P(A)

This is Bayes' Theorem, though in the usual statement of the theorem P(A) is expanded in a more complicated-looking fashion.
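All of the tornado probabilities, and the Bayes' Theorem check, can be reproduced directly from the Finley counts; this is a minimal sketch:

```python
# Finley (1884) counts from the table above.
n_tf  = 28     # tornado observed and forecast
n_tnf = 23     # tornado observed, no forecast
n_f   = 100    # total forecasts issued
n_all = 2803   # all cases

p_t         = (n_tf + n_tnf) / n_all   # P(T)   = 51/2803, about 0.0182
p_t_given_f = n_tf / n_f               # P(T|F) = 0.28
p_f_given_t = n_tf / (n_tf + n_tnf)    # P(F|T) = 28/51, about 0.549

# Bayes' Theorem recovers P(F|T) from the other direction:
p_f = n_f / n_all
bayes = p_f * p_t_given_f / p_t        # equals P(F|T)
```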

Random variables

Often we take measurements which have different values on different occasions. Furthermore, the values are subject to random or stochastic variation - they are not completely predictable, and so are not deterministic. They are random variables. Examples are crop yield, maximum temperature, number of cyclones in a season, rain/no rain.

Continuous and discrete random variables

A continuous random variable is one which can (in theory) take any value in some range, for example crop yield, maximum temperature. A discrete variable has a countable set of values. They may be

counts, such as numbers of cyclones
categories, such as much above average, above average, near average, below average, much below average
binary variables, such as rain/no rain

Probability distributions

If we measure a random variable many times, we can build up a distribution of the values it can take. Imagine an underlying distribution of values which we would get if it was possible to take more and more measurements under the same conditions. This gives the probability distribution for the variable.

Discrete probability distributions

A discrete probability distribution associates a probability with each value of a discrete random variable.
A discrete probability distribution associates a probability with each value of a discrete random variable.
Example 1. The random variable has two values, Rain/No Rain. P(Rain) = 0.2, P(No Rain) = 0.8 gives a probability distribution. Example 2. Let X = number of wet days in a 10-day period. P(X=0) = 0.1074, P(X=1) = 0.2684, P(X=2) = 0.3020, ..., P(X=6) = 0.0055, ... (see Slide 24 for more on this example). Note that P(Rain) + P(No Rain) = 1, and P(X=0) + P(X=1) + P(X=2) + ... + P(X=6) + ... + P(X=10) = 1.
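Example 2's probabilities come from a binomial distribution with 10 trials and success probability 0.2 (as noted later under "Binomial distributions - examples"); a short sketch (assuming Python 3.8+ for math.comb) reproduces the quoted values:

```python
from math import comb   # Python 3.8+

def binom_pmf(k, n, p):
    """P(X = k) successes in n independent trials, success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Wet days in a 10-day period with p = 0.2 (Example 2):
probs = [binom_pmf(k, 10, 0.2) for k in range(11)]
# probs[0] is about 0.1074, probs[1] about 0.2684,
# probs[2] about 0.3020, probs[6] about 0.0055
total = sum(probs)   # the probabilities sum to 1
```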

Continuous probability distributions

Because continuous random variables can take all values in a range, it is not possible to assign probabilities to individual values. Instead we have a continuous curve, called a probability density function, which allows us to calculate the probability of a value falling within any interval. This probability is calculated as the area under the curve between the values of interest. The total area under the curve must equal 1.
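The "area under the curve" idea can be sketched numerically. A simple trapezoidal sum over a density (here a normal density, anticipating the maximum-temperature example that follows; the function names are illustrative) recovers a total area of 1:

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def area_under(f, lo, hi, steps=10_000):
    """Trapezoidal approximation to the area under f between lo and hi."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, steps))
    return total * h

# The total area under the whole density must be 1 (integrating within
# 8 standard deviations of the mean captures essentially all of it):
mu, sigma = 27, 3
total_area = area_under(lambda x: normal_pdf(x, mu, sigma),
                        mu - 8 * sigma, mu + 8 * sigma)
```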

Example: probability distribution for maximum temperature

The next Slide shows an idealized probability density for maximum daily temperature at a station in a particular month. The total area under the curve is 1. The area under the curve to the left of 20 is the probability of a maximum temperature less than 20°C; the area between 25 and 30 is the probability of a maximum temperature between 25°C and 30°C; the area to the right of 32 is the probability of the maximum temperature exceeding 32°C.

Example: theoretical probability density for maximum temperature


[Figure: idealized bell-shaped probability density f(t) for maximum temperature t.]

Families of probability distributions

The number of different probability distributions is unlimited. However, certain families of distributions give good approximations to the distributions of many random variables.

Important families of discrete distributions include: binomial, multinomial, Poisson, hypergeometric, negative binomial.
Important families of continuous distributions include: normal (Gaussian), exponential, gamma, lognormal, Weibull, extreme value.

Families of discrete distributions

We consider only two, binomial and Poisson. There are many more. Do not use a particular distribution unless you are satisfied that the assumptions which underlie it are (at least approximately) satisfied.

Binomial distributions
1. The data arise from a sequence of n independent trials.
2. At each trial there are only two possible outcomes, conventionally called success and failure.
3. The probability of success, p, is the same in each trial.
4. The random variable of interest is the number of successes, X, in the n trials.

The assumptions of independence and constant p in 1 and 3 are important. If they are invalid, so is the binomial distribution.

Binomial distributions - examples

Example 2 on Slide 17 is an example of a binomial distribution with 10 trials and probability of success 0.2. In practice it is unlikely that the binomial distribution would be appropriate for the number of wet days in a period of 10 consecutive days, because rain on consecutive days is not independent. It might be appropriate for the number of frost-free Januarys, or the number of crop failures, in a 10-year period, if we can assume no inter-annual dependence and no trend in p, the frost-free or crop-failure probability.

Poisson distributions

Poisson distributions are often used to describe the number of occurrences of a rare event. For example:

the number of tropical cyclones in a season
the number of occasions in a season when river levels exceed a certain value

The main assumptions are that events occur

at random (the occurrence of an event doesn't change the probability of it happening again)
at a constant rate

Poisson distributions also arise as approximations to binomials when n is large and p is small.

Poisson distributions - an example

Suppose that we can assume that the number of cyclones, X, in a particular area in a season has a Poisson distribution with a mean (average) of 3. Then P(X=0) = 0.05, P(X=1) = 0.15, P(X=2) = 0.22, P(X=3) = 0.22, P(X=4) = 0.17, P(X=5) = 0.10, ... Note:

There is no upper limit to X, unlike the binomial where the upper limit is n. Assuming a constant rate of occurrence, the number of cyclones in 2 seasons would also have a Poisson distribution, but with mean 6.
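The Poisson probabilities quoted above, and the two-season variant with mean 6, follow directly from the formula P(X=k) = m^k e^(-m) / k!; a minimal sketch:

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson random variable with the given mean."""
    return mean**k * exp(-mean) / factorial(k)

# Cyclone counts with mean 3, as in the example above:
probs = [poisson_pmf(k, 3) for k in range(6)]
# probs is approximately [0.0498, 0.1494, 0.2240, 0.2240, 0.1680, 0.1008]

# Two seasons at the same constant rate: Poisson with mean 6.
p_zero_two_seasons = poisson_pmf(0, 6)
```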

Normal (Gaussian) distributions

Normal (also known as Gaussian) distributions are by far the most commonly used family of continuous distributions. They are bell-shaped (see Slide 20) and are indexed by two parameters:

The mean, μ - the distribution is symmetric about this value
The standard deviation, σ - this determines the spread of the distribution. Roughly 2/3 of the distribution lies within 1 standard deviation of the mean, and 95% within 2 standard deviations.

Normal distributions - examples

The example on Slide 20 is a normal distribution with mean 27 and standard deviation 3. The probability of a temperature below 20°C is 0.01; the probability of a temperature between 25°C and 30°C is 0.59; the probability of exceeding 32°C is 0.05. Normal distributions are relevant when variables have roughly symmetric bell-shaped distributions. They tend to be more appropriate for variables which are sums or averages over several days or months than for individual measurements.
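The three quoted probabilities can be checked from the normal cumulative distribution function, which the Python standard library exposes via the error function:

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal(mu, sigma) variable, via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 27, 3   # the Slide 20 temperature example
p_below_20 = normal_cdf(20, mu, sigma)                              # about 0.01
p_25_to_30 = normal_cdf(30, mu, sigma) - normal_cdf(25, mu, sigma)  # about 0.59
p_above_32 = 1 - normal_cdf(32, mu, sigma)                          # about 0.05
```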

Deviations from normality - skewness

Some variables deviate from normality because their distributions are symmetric but too flat or too long-tailed. A more common type of deviation is skewness, where one tail of the distribution is much longer than the other. Positive skewness, as illustrated in the next Slide, is most common; it occurs for windspeeds and for rainfall amounts. Negatively-skewed distributions, with longer tails to the left, sometimes occur, for example for surface pressure.

A positively-skewed Weibull distribution

[Figure: positively-skewed Weibull probability density f(x) for x from 0 to 10, with a longer right tail.]

Families of skewed distributions

There are several families of skewed distributions, including Weibull, gamma and lognormal. Each family has 2 or more parameters which can be varied to fit a variety of shapes. One particular family (strictly, 3 families) consists of the so-called extreme value distributions. As the name suggests, these can be used to model extremes over a period - for example, maximum windspeed, minimum temperature, greatest 24-hr rainfall, highest flood.
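Positive skewness of a Weibull distribution can be demonstrated by sampling it; the shape and scale parameters below (2 and 1) are assumed purely for illustration:

```python
import random
random.seed(42)   # reproducible draws

# Draw from a Weibull distribution with scale 1 and shape 2
# (random.weibullvariate takes scale then shape).
sample = [random.weibullvariate(1.0, 2.0) for _ in range(100_000)]

# Sample skewness: third central moment over variance^(3/2).
n = len(sample)
mean = sum(sample) / n
var = sum((x - mean) ** 2 for x in sample) / n
skew = sum((x - mean) ** 3 for x in sample) / n / var ** 1.5
# skew comes out positive: the right tail is longer than the left.
```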

Other probability distributions

We have sketched a few of the main probability distributions, but there are many others. Examples which don't fit standard patterns include

Proportion of sky covered by cloud - may have large probability values near 0 and 1, with lower probabilities in between: U-shaped rather than bell-shaped.
Daily rainfall - neither (purely) discrete nor continuous. Positive values are continuous, but there is also a non-zero (discrete) probability of taking the value zero.
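The mixed discrete/continuous character of daily rainfall can be sketched with a toy model; the parameter values (dry-day probability and wet-day mean) are assumed for illustration only:

```python
import random
random.seed(1)   # reproducible draws

# Mixed model: zero with probability p_dry (the discrete atom),
# otherwise a continuous exponential amount.
p_dry = 0.75     # assumed probability of a dry day
mean_wet = 5.0   # assumed mean rainfall (mm) on wet days

def daily_rain():
    if random.random() < p_dry:
        return 0.0                           # discrete probability mass at zero
    return random.expovariate(1 / mean_wet)  # continuous positive part

days = [daily_rain() for _ in range(100_000)]
frac_dry = sum(1 for d in days if d == 0.0) / len(days)   # about 0.75
```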
