
Probability and probability distributions

Ian Jolliffe University of Aberdeen

CLIPS module 3.4

What is meant by the probability of an event?

Both event and probability are intuitively understood by most people, but we need to establish certain rules.
"Rain tomorrow", "3 or fewer cyclones next year", "the analysed surface pressure at a grid-point has an error of more than 2 hPa", and "crop yield will exceed a given threshold" are all examples of events whose probabilities might be of interest.

What is meant by the probability of an event?

Probabilities lie between 0 and 1. Zero probability implies that something is impossible. A probability of 1 means something is certain. What does an intermediate probability imply, for example if we say that the probability of rain tomorrow is 0.25?

Notation and terminology

Let A denote an event. The probability of that event is usually written P(A) or Pr(A). The complement of an event A, written Ac, is everything not in that event. The complement of rain tomorrow is no rain tomorrow; the complement of 3 or fewer cyclones is 4 or more cyclones. P(Ac) = 1 - P(A).

What is meant by the probability of an event?

A probability of 0.25 (also expressed as 1/4, or as 25%) implies that we think it is 3 times as likely not to rain as it is to rain. This is because

P(no rain) = 1 - P(rain) = 0.75, and 0.75/0.25 = 3.
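The complement rule and the "3 times as likely" calculation above can be sketched in a few lines of Python:

```python
# Complement rule: P(no rain) = 1 - P(rain); the ratio of the two
# probabilities gives the "3 times as likely" statement.
p_rain = 0.25
p_no_rain = 1 - p_rain             # 0.75
odds_against = p_no_rain / p_rain  # 3.0
```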

A probability can often be thought of as a long-term proportion of times an event will occur.

Probability - long-term proportions or subjective

In our rain/no rain example we might know that, for our station of interest, it rains on 25% of days at this time of year. Hence P(rain) = 0.25. However, some events are unique - for example, it is of interest to ask what is the probability that a particular tropical storm will make landfall on a particular stretch of coastline. There are no long-term data on which to base the probability, so subjectivity comes in.

Unions and intersections

The union of two events A and B consists of everything included in A or B or both. Let

A = {rain tomorrow} B = {rain the day after tomorrow} C = {3 or fewer cyclones} D = {4 or 5 cyclones}

Then
A∪B = {rain in the next 2 days} C∪D = {5 or fewer cyclones} P(C∪D) = P(C) + P(D), because C and D are mutually exclusive (they don't overlap).

Unions and intersections

P(A∪B) ≠ P(A) + P(B), because A and B do overlap. Instead,

P(A∪B) = P(A) + P(B) - P(A∩B).

A∩B is the intersection of A and B; it includes everything that is in both A and B, and is counted twice if we add P(A) and P(B). In our example

A∩B = {rain tomorrow and the day after tomorrow}. C∩D is empty - it is impossible for C and D to occur simultaneously, so P(C∩D) = 0.
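The inclusion-exclusion rule for unions can be illustrated numerically; the probability values below are assumed purely for illustration and are not from the module:

```python
# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B).
p_a, p_b, p_a_and_b = 0.3, 0.4, 0.12
p_union = p_a + p_b - p_a_and_b        # 0.58

# For mutually exclusive events (like C and D) the intersection term is 0,
# so the probabilities simply add:
p_c, p_d = 0.2, 0.15
p_union_cd = p_c + p_d                 # 0.35
```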

Conditional probability and independence

If we know that one event has occurred it may change our view of the probability of another event. Let

A = {rain today}, B = {rain tomorrow}, C = {rain in 90 days time}

It is likely that knowledge that A has occurred will change your view of the probability that B will occur, but not of the probability that C will occur. We write P(B|A) ≠ P(B), P(C|A) = P(C). P(B|A) denotes the conditional probability of B, given A. We say that A and C are independent, but A and B are not. Note that for independent events P(A∩C) = P(A)P(C).
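The definition P(B|A) = P(A∩B)/P(A) and the independence check can be sketched with assumed numbers; the joint probability 0.15 below is hypothetical, chosen only to show dependence:

```python
# Conditional probability: P(B | A) = P(A and B) / P(A).
p_a = 0.25          # P(rain today)
p_b = 0.25          # P(rain tomorrow)
p_a_and_b = 0.15    # P(rain on both days) -- assumed for illustration
p_b_given_a = p_a_and_b / p_a   # 0.6, well above P(B) = 0.25

# A and B are independent only if P(A and B) equals P(A)P(B):
independent = abs(p_a_and_b - p_a * p_b) < 1e-12   # False here
```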

Conditional probability - tornado forecasting

Consider the classic data set on the next Slide, consisting of forecasts and observations of tornadoes (Finley, 1884). Let

F = {Tornado forecast} T = {Tornado observed}

Use the frequencies in the table to estimate probabilities - it's a large sample, so the estimates should not be too bad.

Forecasts of tornadoes

                      Tornado forecast   No tornado forecast   Total
Tornado observed             28                  23              51
No tornado observed          72                2680            2752
Total                       100                2703            2803

Conditional probability - tornado forecasting

P(T) = 51/2803 = 0.0182 P(T|F) = 28/100 = 0.2800 P(T|Fc) = 23/2703 = 0.0085

Knowledge of the forecast changes P(T); F and T are not independent. P(T|F) and P(F|T) are often confused, but they are different quantities and can take very different values.

P(F|T) = 28/51 = 0.5490

Conditional probability - tornado forecasting

P(T∩F) = 28/2803 = P(T)P(F|T) = P(F)P(T|F) ≠ P(F)P(T). The two formulae for the probability of an intersection always hold. If A, B are independent, then P(A|B) = P(A) and P(B|A) = P(B), so P(A∩B) = P(A)P(B). Rearranging the two intersection formulae gives

P(B|A) = P(B)P(A|B)/P(A)

This is Bayes' Theorem, though in the usual statement of the theorem P(A) is expanded in a more complicated-looking fashion.
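All of the tornado probabilities, and the Bayes' Theorem check, can be reproduced directly from the Finley counts; this is a minimal sketch:

```python
# Finley (1884) counts from the table above.
n_tf  = 28     # tornado observed and forecast
n_tnf = 23     # tornado observed, no forecast
n_f   = 100    # total forecasts issued
n_all = 2803   # all cases

p_t         = (n_tf + n_tnf) / n_all   # P(T)   = 51/2803, about 0.0182
p_t_given_f = n_tf / n_f               # P(T|F) = 0.28
p_f_given_t = n_tf / (n_tf + n_tnf)    # P(F|T) = 28/51, about 0.549

# Bayes' Theorem recovers P(F|T) from the other direction:
p_f = n_f / n_all
bayes = p_f * p_t_given_f / p_t        # equals P(F|T)
```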

Random variables

Often we take measurements which have different values on different occasions. Furthermore, the values are subject to random or stochastic variation - they are not completely predictable, and so are not deterministic. They are random variables. Examples are crop yield, maximum temperature, number of cyclones in a season, rain/no rain.

Continuous and discrete random variables

A continuous random variable is one which can (in theory) take any value in some range, for example crop yield, maximum temperature. A discrete variable has a countable set of values. They may be

counts, such as numbers of cyclones
categories, such as much above average, above average, near average, below average, much below average
binary variables, such as rain/no rain

Probability distributions

If we measure a random variable many times, we can build up a distribution of the values it can take. Imagine an underlying distribution of values which we would get if it was possible to take more and more measurements under the same conditions. This gives the probability distribution for the variable.

Discrete probability distributions

A discrete probability distribution associates a probability with each value of a discrete random variable.
A discrete probability distribution associates a probability with each value of a discrete random variable.
Example 1. The random variable has two values, Rain/No Rain. P(Rain) = 0.2, P(No Rain) = 0.8 gives a probability distribution. Example 2. Let X = number of wet days in a 10-day period. P(X=0) = 0.1074, P(X=1) = 0.2684, P(X=2) = 0.3020, ..., P(X=6) = 0.0055, ... (see Slide 24 for more on this example). Note that P(Rain) + P(No Rain) = 1, and P(X=0) + P(X=1) + P(X=2) + ... + P(X=6) + ... + P(X=10) = 1.
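Example 2's probabilities come from a binomial distribution with 10 trials and success probability 0.2 (as noted later under "Binomial distributions - examples"); a short sketch (assuming Python 3.8+ for math.comb) reproduces the quoted values:

```python
from math import comb   # Python 3.8+

def binom_pmf(k, n, p):
    """P(X = k) successes in n independent trials, success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Wet days in a 10-day period with p = 0.2 (Example 2):
probs = [binom_pmf(k, 10, 0.2) for k in range(11)]
# probs[0] is about 0.1074, probs[1] about 0.2684,
# probs[2] about 0.3020, probs[6] about 0.0055
total = sum(probs)   # the probabilities sum to 1
```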

Continuous probability distributions

Because continuous random variables can take all values in a range, it is not possible to assign probabilities to individual values. Instead we have a continuous curve, called a probability density function, which allows us to calculate the probability of a value falling within any interval. This probability is calculated as the area under the curve between the values of interest. The total area under the curve must equal 1.
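The "area under the curve" idea can be sketched numerically. A simple trapezoidal sum over a density (here a normal density, anticipating the maximum-temperature example that follows; the function names are illustrative) recovers a total area of 1:

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def area_under(f, lo, hi, steps=10_000):
    """Trapezoidal approximation to the area under f between lo and hi."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, steps))
    return total * h

# The total area under the whole density must be 1 (integrating within
# 8 standard deviations of the mean captures essentially all of it):
mu, sigma = 27, 3
total_area = area_under(lambda x: normal_pdf(x, mu, sigma),
                        mu - 8 * sigma, mu + 8 * sigma)
```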

Example: probability distribution for maximum temperature

The next Slide shows an idealized probability density for maximum daily temperature at a station in a particular month. The total area under the curve is 1. The area under the curve to the left of 20 is the probability of a maximum temperature less than 20°C; the area between 25 and 30 is the probability of a maximum temperature between 25°C and 30°C; the area to the right of 32 is the probability of the maximum temperature exceeding 32°C.

Example: theoretical probability density for maximum temperature


[Figure: idealized bell-shaped probability density f(t) for maximum temperature t.]

Families of probability distributions

The number of different probability distributions is unlimited. However, certain families of distributions give good approximations to the distributions of many random variables.

Important families of discrete distributions include: binomial, multinomial, Poisson, hypergeometric, negative binomial.
Important families of continuous distributions include: normal (Gaussian), exponential, gamma, lognormal, Weibull, extreme value.

Families of discrete distributions

We consider only two, binomial and Poisson. There are many more. Do not use a particular distribution unless you are satisfied that the assumptions which underlie it are (at least approximately) satisfied.

Binomial distributions
1. The data arise from a sequence of n independent trials.
2. At each trial there are only two possible outcomes, conventionally called success and failure.
3. The probability of success, p, is the same in each trial.
4. The random variable of interest is the number of successes, X, in the n trials.

The assumptions of independence and constant p in 1 and 3 are important. If they are invalid, so is the binomial distribution.

Binomial distributions - examples

Example 2 on Slide 17 is an example of a binomial distribution with 10 trials and probability of success 0.2. In practice it is unlikely that the binomial distribution would be appropriate for the number of wet days in a period of 10 consecutive days, because rain on consecutive days is not independent. It might be appropriate for the number of frost-free Januarys, or the number of crop failures, in a 10-year period, if we can assume no inter-annual dependence and no trend in p, the frost-free or crop-failure probability.

Poisson distributions

Poisson distributions are often used to describe the number of occurrences of a rare event. For example:

the number of tropical cyclones in a season
the number of occasions in a season when river levels exceed a certain value

The main assumptions are that events occur

at random (the occurrence of an event doesn't change the probability of it happening again)
at a constant rate

Poisson distributions also arise as approximations to binomials when n is large and p is small.

Poisson distributions - an example

Suppose that we can assume that the number of cyclones, X, in a particular area in a season has a Poisson distribution with a mean (average) of 3. Then P(X=0) = 0.05, P(X=1) = 0.15, P(X=2) = 0.22, P(X=3) = 0.22, P(X=4) = 0.17, P(X=5) = 0.10, ... Note:

There is no upper limit to X, unlike the binomial where the upper limit is n. Assuming a constant rate of occurrence, the number of cyclones in 2 seasons would also have a Poisson distribution, but with mean 6.
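The Poisson probabilities quoted above, and the two-season variant with mean 6, follow directly from the formula P(X=k) = m^k e^(-m) / k!; a minimal sketch:

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson random variable with the given mean."""
    return mean**k * exp(-mean) / factorial(k)

# Cyclone counts with mean 3, as in the example above:
probs = [poisson_pmf(k, 3) for k in range(6)]
# probs is approximately [0.0498, 0.1494, 0.2240, 0.2240, 0.1680, 0.1008]

# Two seasons at the same constant rate: Poisson with mean 6.
p_zero_two_seasons = poisson_pmf(0, 6)
```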

Normal (Gaussian) distributions

Normal (also known as Gaussian) distributions are by far the most commonly used family of continuous distributions. They are bell-shaped (see Slide 20) and are indexed by two parameters:

The mean, μ - the distribution is symmetric about this value
The standard deviation, σ - this determines the spread of the distribution. Roughly 2/3 of the distribution lies within 1 standard deviation of the mean, and 95% within 2 standard deviations.

Normal distributions - examples

The example on Slide 20 is a normal distribution with mean 27 and standard deviation 3. The probability of a temperature below 20°C is 0.01; the probability of a temperature between 25°C and 30°C is 0.59; the probability of exceeding 32°C is 0.05. Normal distributions are relevant when variables have roughly symmetric bell-shaped distributions. They tend to be more appropriate for variables which are sums or averages over several days or months than for individual measurements.
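The three quoted probabilities can be checked from the normal cumulative distribution function, which the Python standard library exposes via the error function:

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal(mu, sigma) variable, via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 27, 3   # the Slide 20 temperature example
p_below_20 = normal_cdf(20, mu, sigma)                              # about 0.01
p_25_to_30 = normal_cdf(30, mu, sigma) - normal_cdf(25, mu, sigma)  # about 0.59
p_above_32 = 1 - normal_cdf(32, mu, sigma)                          # about 0.05
```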

Deviations from normality - skewness

Some variables deviate from normality because their distributions are symmetric but too flat or too long-tailed. A more common type of deviation is skewness, where one tail of the distribution is much longer than the other. Positive skewness, as illustrated in the next Slide, is most common; it occurs for windspeeds and for rainfall amounts. Negatively-skewed distributions, with longer tails to the left, sometimes occur, for example for surface pressure.

A positively-skewed Weibull distribution

[Figure: positively-skewed Weibull probability density f(x) for x from 0 to 10, with a longer right tail.]

Families of skewed distributions

There are several families of skewed distributions, including Weibull, gamma and lognormal. Each family has 2 or more parameters which can be varied to fit a variety of shapes. One particular family (strictly, 3 families) consists of the so-called extreme value distributions. As the name suggests, these can be used to model extremes over a period - for example, maximum windspeed, minimum temperature, greatest 24-hr rainfall, highest flood.
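Positive skewness of a Weibull distribution can be demonstrated by sampling it; the shape and scale parameters below (2 and 1) are assumed purely for illustration:

```python
import random
random.seed(42)   # reproducible draws

# Draw from a Weibull distribution with scale 1 and shape 2
# (random.weibullvariate takes scale then shape).
sample = [random.weibullvariate(1.0, 2.0) for _ in range(100_000)]

# Sample skewness: third central moment over variance^(3/2).
n = len(sample)
mean = sum(sample) / n
var = sum((x - mean) ** 2 for x in sample) / n
skew = sum((x - mean) ** 3 for x in sample) / n / var ** 1.5
# skew comes out positive: the right tail is longer than the left.
```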

Other probability distributions

We have sketched a few of the main probability distributions, but there are many others. Examples which don't fit standard patterns include

Proportion of sky covered by cloud - may have large probability values near 0 and 1, with lower probabilities in between: U-shaped rather than bell-shaped.
Daily rainfall - neither (purely) discrete nor continuous. Positive values are continuous, but there is also a non-zero (discrete) probability of taking the value zero.
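The mixed discrete/continuous character of daily rainfall can be sketched with a toy model; the parameter values (dry-day probability and wet-day mean) are assumed for illustration only:

```python
import random
random.seed(1)   # reproducible draws

# Mixed model: zero with probability p_dry (the discrete atom),
# otherwise a continuous exponential amount.
p_dry = 0.75     # assumed probability of a dry day
mean_wet = 5.0   # assumed mean rainfall (mm) on wet days

def daily_rain():
    if random.random() < p_dry:
        return 0.0                           # discrete probability mass at zero
    return random.expovariate(1 / mean_wet)  # continuous positive part

days = [daily_rain() for _ in range(100_000)]
frac_dry = sum(1 for d in days if d == 0.0) / len(days)   # about 0.75
```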
