
Stanford University STAT 116 Theory of Probability

Yongwhan Lim Autumn 2008-2009

1 Combinatorial Analysis

1.1 The Basic Principle of Counting

Theorem 1 (The Generalized Basic Principle of Counting). If $r$ experiments that are to be performed are such that the first one may result in any of $n_1$ possible outcomes, and if for each of these $n_1$ possible outcomes there are $n_2$ possible outcomes of the second experiment, and if for each of the possible outcomes of the first two experiments there are $n_3$ possible outcomes of the third experiment, and so on, then there is a total of $n_1 n_2 \cdots n_r$ possible outcomes of the $r$ experiments.

1.2 Combinations
Theorem 2. $\frac{n!}{(n-r)!\,r!}$ represents the number of possible combinations of $n$ objects taken $r$ at a time; we denote this number by $\binom{n}{r}$, so that $\binom{n}{r} = \frac{(n)_r}{r!}$ for $r \le n$.

Proposition 1 (Pascal's Triangle). $\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}$ for $1 \le r \le n$.

Theorem 3 (Binomial Theorem). $(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$.
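As a quick numerical sanity check of Theorem 3, the following Python sketch (the values of $x$, $y$, $n$ are arbitrary, not from the notes) compares $(x+y)^n$ with the binomial-coefficient sum.

```python
from math import comb

# Arbitrary illustrative values; any x, y and nonnegative integer n would do.
x, y, n = 2.0, 3.0, 7

lhs = (x + y) ** n
rhs = sum(comb(n, k) * x**k * y**(n - k) for k in range(n + 1))

print(lhs, rhs)   # both 78125.0 since (2 + 3)^7 = 5^7
```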

1.3 Permutation & Multinomial Coefficients

Theorem 4 (Multinomial Coefficients). If $\sum_{i=1}^{r} n_i = n$ then $\binom{n}{n_1, \dots, n_r}$ denotes $\frac{n!}{\prod_{i=1}^{r} n_i!}$, which is the number of possible divisions of $n$ distinct objects into $r$ distinct groups of respective sizes $n_1, \dots, n_r$ (equivalently, the number of permutations of $n$ objects of which $n_1$ are alike, $n_2$ are alike, $\dots$, $n_r$ are alike).

Theorem 5 (Multinomial Theorem). $\left( \sum_{i=1}^{r} x_i \right)^n = \sum_{n_1 + \cdots + n_r = n} \binom{n}{n_1, \dots, n_r} \prod_{i=1}^{r} x_i^{n_i}$.
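A brute-force Python sketch of Theorems 4 and 5: it enumerates all compositions $n_1 + \cdots + n_r = n$ for small, arbitrarily chosen values and checks that the multinomial expansion reproduces $(x_1 + \cdots + x_r)^n$. The helper name `multinomial` is ours, not from the notes.

```python
from math import factorial, prod
from itertools import product

def multinomial(n, parts):
    """n! / (n_1! * ... * n_r!), the multinomial coefficient of Theorem 4."""
    out = factorial(n)
    for p in parts:
        out //= factorial(p)
    return out

# Arbitrary illustrative values.
xs = [1.5, 2.0, 0.5]   # x_1, ..., x_r
n = 4

lhs = sum(xs) ** n
rhs = sum(multinomial(n, ns) * prod(x**k for x, k in zip(xs, ns))
          for ns in product(range(n + 1), repeat=len(xs))
          if sum(ns) == n)

print(lhs, rhs)   # both 256.0 since (1.5 + 2.0 + 0.5)^4 = 4^4
```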

2 Axioms of Probability

Notation. The sample space is the set of all possible outcomes, usually denoted by $\Omega$ (or $S$). An event is a statement about the outcome, i.e., a subset $A \subseteq \Omega$. The probability of event $A$ is usually denoted by $P(A)$.

2.1 Axioms of Probability

Axiom 1. $0 \le P(E) \le 1$.

Axiom 2. $P(S) = 1$.

Axiom 3. If $\{E_i\}$ is a sequence of mutually exclusive events then $P\left( \bigcup_{i=1}^{\infty} E_i \right) = \sum_{i=1}^{\infty} P(E_i)$.

2.2 Some Simple Propositions

Proposition 2. $P(E^c) = 1 - P(E)$.

Proposition 3. If $E \subseteq F$ then $P(E) \le P(F)$.

Proposition 4. $P(E \cup F) = P(E) + P(F) - P(EF)$.

Proposition 5 (Inclusion-Exclusion). $P\left( \bigcup_{i=1}^{n} E_i \right) = \sum_{r=1}^{n} (-1)^{r+1} \sum_{i_1 < \cdots < i_r} P(E_{i_1} \cdots E_{i_r})$.
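A brute-force Python sketch of Proposition 5 on a small equally likely sample space; the three events below are arbitrary illustrative subsets of $\{0, \dots, 19\}$.

```python
from itertools import combinations

omega = set(range(20))    # small sample space with equally likely outcomes
events = [set(range(0, 12)), set(range(8, 16)), {1, 5, 9, 15, 19}]

def P(A):
    return len(A) / len(omega)

# Left-hand side: probability of the union, computed directly.
lhs = P(set().union(*events))

# Right-hand side: the inclusion-exclusion sum over nonempty index sets.
n = len(events)
rhs = sum((-1) ** (r + 1) * P(set.intersection(*(events[i] for i in idx)))
          for r in range(1, n + 1)
          for idx in combinations(range(n), r))

print(lhs, rhs)   # both 0.85 for these particular events
```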

2.3 Sample Spaces Having Equally Likely Outcomes

Proposition 6. $P(A) = |A| / |\Omega|$.

2.3.1 Sampling with replacement

If $n$ representatives are chosen from a sample of size $s$ with replacement, there are a total of $s^n$ possible outcomes.

2.3.2 Sampling without replacement

If $n$ representatives are chosen from a sample of size $s$ without replacement, there are a total of $(s)_n = s(s-1)\cdots(s-n+1)$ possible outcomes.

Theorem 6 (Stirling's Formula). $k! \approx k^k e^{-k} \sqrt{2\pi k}$ for large $k$, meaning $\lim_{k \to \infty} \frac{k!}{k^k e^{-k} \sqrt{2\pi k}} = 1$.
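A quick numeric look at Theorem 6 in Python: the ratio $k! / (k^k e^{-k} \sqrt{2\pi k})$ approaches 1 as $k$ grows (the particular values of $k$ are arbitrary).

```python
from math import factorial, sqrt, pi, e

for k in (1, 5, 10, 50, 100):
    stirling = k**k * e**(-k) * sqrt(2 * pi * k)
    print(k, factorial(k) / stirling)
# ratios decrease toward 1: about 1.084 at k = 1, about 1.0008 at k = 100
```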

2.4 Probability as a continuous set function

Proposition 7. If $\{E_n, n \ge 1\}$ is either an increasing or a decreasing sequence of events then $\lim_{n \to \infty} P(E_n) = P(\lim_{n \to \infty} E_n)$.

Corollary 1 (Continuity from below/above). If $A_1 \subseteq A_2 \subseteq \cdots$ and $\bigcup_{i=1}^{\infty} A_i = A$ then $\lim_{i \to \infty} P(A_i) = P(A)$. If $B_1 \supseteq B_2 \supseteq \cdots$ and $\bigcap_{i=1}^{\infty} B_i = B$ then $\lim_{i \to \infty} P(B_i) = P(B)$.

Proposition 8 (Boole's Inequality). $P\left( \bigcup_{i=1}^{\infty} A_i \right) \le \sum_{i=1}^{\infty} P(A_i)$.

Proposition 9 (Bonferroni's Inequality). $P\left( \bigcap_{i=1}^{n} A_i \right) \ge \sum_{i=1}^{n} P(A_i) - (n - 1)$.

2.5 Probability as a Measure of Belief

$P(A)$ can be interpreted either as a long-run relative frequency or as a measure of one's degree of belief. Relative frequency is measured by conducting a large number of trials; subjective probability is simply a measure of belief. Statistical study focuses on the former notion.

3 Conditional Probability and Independence

3.1 Conditional Probabilities

Definition 1. $P(E \mid F) = P(EF) / P(F)$ (defined when $P(F) > 0$).

Proposition 10 (Multiplication Rule). $P(E_1 E_2 \cdots E_n) = P(E_1)\, P(E_2 \mid E_1) \cdots P(E_n \mid E_1 \cdots E_{n-1})$.

3.2 Bayes' Formula

Notation. $F_1, \dots, F_r$ is a partition of the sample space: that is, $F_i F_j = \emptyset$ if $i \ne j$ and $\bigcup_i F_i = \Omega$.

Proposition 11. $P(E) = \sum_i P(E \mid F_i)\, P(F_i)$.

Proposition 12. $P(E) = P(EF) + P(EF^c) = P(E \mid F)P(F) + P(E \mid F^c)P(F^c)$; so $P(E) = P(E \mid F)P(F) + P(E \mid F^c)[1 - P(F)]$.

Definition 2. The odds of an event $A$ is $\frac{P(A)}{P(A^c)} = \frac{P(A)}{1 - P(A)}$; we have $\frac{P(H \mid E)}{P(H^c \mid E)} = \frac{P(H)\, P(E \mid H)}{P(H^c)\, P(E \mid H^c)}$.

Proposition 13 (Bayes' Formula). $P(F_j \mid E) = \frac{P(EF_j)}{P(E)} = \frac{P(E \mid F_j)\, P(F_j)}{\sum_i P(E \mid F_i)\, P(F_i)}$.
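A worked instance of Propositions 11 and 13 in Python with made-up numbers for a hypothetical diagnostic test: $F_1$ is "has the condition" (prevalence 1%), $F_2 = F_1^c$, and $E$ is "the test is positive".

```python
# Hypothetical numbers, chosen only to illustrate Bayes' formula.
P_F = [0.01, 0.99]            # P(F_1), P(F_2): a partition of the sample space
P_E_given_F = [0.95, 0.02]    # P(E | F_1), P(E | F_2): true- and false-positive rates

# Proposition 11: P(E) = sum_i P(E | F_i) P(F_i)
P_E = sum(pe * pf for pe, pf in zip(P_E_given_F, P_F))

# Proposition 13: P(F_1 | E) = P(E | F_1) P(F_1) / P(E)
P_F1_given_E = P_E_given_F[0] * P_F[0] / P_E

print(P_E, P_F1_given_E)   # about 0.0293 and 0.324
```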

3.3 Independent Events

Definition 3. Two events $E$ and $F$ are independent if $P(EF) = P(E)P(F)$.

Proposition 14. If $E$ and $F$ are independent then $E$ and $F^c$ are independent.

Definition 4. Three events $E$, $F$, and $G$ are independent if $P(EFG) = P(E)P(F)P(G)$, $P(EF) = P(E)P(F)$, $P(EG) = P(E)P(G)$, and $P(FG) = P(F)P(G)$.

Definition 5. The events $E_1, \dots, E_n$ are said to be independent if for any subset $E_{i_1}, \dots, E_{i_r}$ of them, $P(E_{i_1} \cdots E_{i_r}) = P(E_{i_1}) \cdots P(E_{i_r})$.

3.4 $P(\cdot \mid F)$ is a Probability

Conditional probabilities satisfy all of the properties of ordinary probabilities. In fact, $P(E \mid F)$ satisfies the three axioms of a probability:

Proposition 15. (i) $0 \le P(E \mid F) \le 1$; (ii) $P(S \mid F) = 1$; (iii) $P\left( \bigcup_i E_i \mid F \right) = \sum_i P(E_i \mid F)$ for mutually exclusive events $\{E_i\}$. Moreover, $P(E^c \mid F) = 1 - P(E \mid F)$.

4 Random Variables

4.1 Discrete Random Variables

Definition 6. The probability mass function $p(a)$ of $X$ is $p(a) = P\{X = a\}$.

Proposition 16. $p(a)$ is positive for at most a countable number of values of $a$; that is, if $X$ assumes the values $\{x_i\}$ then $p(x_i) \ge 0$, $p(x) = 0$ if $x \notin \{x_i\}$, and $\sum_i p(x_i) = 1$.

Definition 7. The cumulative distribution function $F(a)$ is $F(a) = \sum_{x \le a} p(x)$.

4.2 Expected Value

Definition 8. The expectation or expected value of $X$ is $E[X] = \sum_{x : p(x) > 0} x\, p(x)$; in the finite case, $E[X] = \sum_{i=1}^{n} x_i\, p(x_i)$.

4.3 Expectation of a function of a random variable

Proposition 17. If $X$ is a discrete random variable that takes on one of the values $x_i$, $i \ge 1$, with respective probabilities $p(x_i)$, then for any real-valued function $g$, $E[g(X)] = \sum_i g(x_i)\, p(x_i)$.

Corollary 2. $E[aX + b] = aE[X] + b$.

Definition 9. The quantity $E[X^n]$, where $n \ge 1$, is called the $n$th moment of $X$; it is just $E[X^n] = \sum_{x : p(x) > 0} x^n p(x)$.

4.4 Variance

Definition 10. Define the variance and standard deviation to be, respectively, $\mathrm{Var}(X) = E[(X - \mu)^2]$, where $\mu = E[X]$, and $SD(X) = \sqrt{\mathrm{Var}(X)}$.

Proposition 18. $\mathrm{Var}(X) = E[X^2] - (E[X])^2$; $\mathrm{Var}(aX + b) = a^2\, \mathrm{Var}(X)$; $SD(aX + b) = |a|\, SD(X)$.
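A small Python sketch (with an arbitrary pmf, not one from the notes) that checks Proposition 18's identities using the $E[g(X)]$ formula from Proposition 17.

```python
# Arbitrary illustrative pmf on a few support points.
support = [0, 1, 2, 5]
probs   = [0.1, 0.4, 0.3, 0.2]

def E(g):
    """E[g(X)] = sum_i g(x_i) p(x_i)  (Proposition 17)."""
    return sum(g(x) * p for x, p in zip(support, probs))

mean = E(lambda x: x)
var  = E(lambda x: (x - mean) ** 2)
print(var, E(lambda x: x**2) - mean**2)   # both 2.6: Var(X) = E[X^2] - (E[X])^2

a, b = 3.0, -2.0
var_lin = E(lambda x: (a * x + b - (a * mean + b)) ** 2)
print(var_lin, a**2 * var)                # both 23.4: Var(aX + b) = a^2 Var(X)
```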

4.5 The Bernoulli and Binomial Random Variables

The random variable $X$ whose probability mass function is given by $p(i) = \binom{n}{i} p^i (1-p)^{n-i}$ for $i = 0, \dots, n$ is said to be a binomial random variable with parameters $n$ and $p$. Its mean and variance are $E[X] = np$ and $\mathrm{Var}(X) = np(1-p)$.

Proposition 19. If $X$ is a binomial random variable with parameters $(n, p)$, where $0 < p < 1$, then as $k$ goes from $0$ to $n$, $P\{X = k\}$ first increases monotonically and then decreases monotonically, reaching its largest value when $k$ is the largest integer less than or equal to $(n+1)p$.

Definition 11 (Indicator). The indicator random variable of event $A$, denoted by $I_A$, is $1$ if $A$ occurs and $0$ if $A$ does not occur.
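A hedged Python check of Proposition 19 and of the stated binomial mean and variance, for one arbitrary choice of parameters.

```python
from math import comb, floor

n, p = 10, 0.3   # arbitrary illustrative parameters

pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
mode = max(range(n + 1), key=lambda k: pmf[k])
print(mode, floor((n + 1) * p))        # both 3, as Proposition 19 predicts

mean = sum(k * q for k, q in enumerate(pmf))
var  = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
print(mean, n * p)                     # both approximately 3.0
print(var, n * p * (1 - p))            # both approximately 2.1
```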

4.6 The Poisson Random Variable

The random variable $X$ whose probability mass function is given by $p(i) = e^{-\lambda} \frac{\lambda^i}{i!}$ for $i = 0, 1, 2, \dots$ is said to be a Poisson random variable with parameter $\lambda$. Its mean and variance are $E[X] = \mathrm{Var}(X) = \lambda$. If a large number of independent trials are performed, each having a small probability of being successful, then the number of successful trials that result will have a distribution that is approximately that of a Poisson random variable.
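A Python sketch of the approximation just described: for many trials $n$ with small success probability $p$ (values chosen arbitrarily), Binomial$(n, p)$ probabilities are close to Poisson$(\lambda = np)$ probabilities.

```python
from math import comb, exp, factorial

n, p = 1000, 0.003     # many trials, each with a small success probability
lam = n * p            # the matching Poisson parameter, lambda = np = 3.0

for i in range(8):
    binom   = comb(n, i) * p**i * (1 - p)**(n - i)
    poisson = exp(-lam) * lam**i / factorial(i)
    print(i, round(binom, 5), round(poisson, 5))
# the two columns agree to roughly three decimal places
```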

4.7 Other Discrete Probability Distributions

4.7.1 The Geometric Random Variable

The random variable $X$ whose probability mass function is given by $p(i) = p(1-p)^{i-1}$ for $i = 1, 2, \dots$ is said to be a geometric random variable with parameter $p$. It represents the trial number of the first success when each trial is independently a success with probability $p$. Its mean and variance are $E[X] = \frac{1}{p}$ and $\mathrm{Var}(X) = \frac{1-p}{p^2}$.

4.7.2 The Negative Binomial Random Variable

The random variable $X$ whose probability mass function is given by $p(i) = \binom{i-1}{r-1} p^r (1-p)^{i-r}$ for $i \ge r$ is said to be a negative binomial random variable with parameters $r$ and $p$. It represents the trial number of the $r$th success when each trial is independently a success with probability $p$. Its mean and variance are $E[X] = \frac{r}{p}$ and $\mathrm{Var}(X) = \frac{r(1-p)}{p^2}$.

4.7.3 The Hypergeometric Random Variable

The random variable $X$ whose probability mass function is given by $p(i) = \frac{\binom{m}{i}\binom{N-m}{n-i}}{\binom{N}{n}}$ for $i = 0, \dots, m$ is said to be a hypergeometric random variable with parameters $n$, $N$, and $m$. It represents the number of white balls selected when $n$ balls are randomly chosen from an urn that contains $N$ balls, of which $m$ are white. Writing $p = \frac{m}{N}$, its mean and variance are $E[X] = np$ and $\mathrm{Var}(X) = \frac{N-n}{N-1}\, np(1-p)$.
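A brute-force Python check of the hypergeometric mean and variance formulas, computed directly from the pmf above for one arbitrary urn.

```python
from math import comb

N, m, n = 20, 7, 5    # arbitrary urn: 20 balls, 7 white, 5 drawn
p = m / N

pmf = {i: comb(m, i) * comb(N - m, n - i) / comb(N, n)
       for i in range(0, min(m, n) + 1)}
mean = sum(i * q for i, q in pmf.items())
var  = sum((i - mean) ** 2 * q for i, q in pmf.items())

print(mean, n * p)                                   # both 1.75
print(var, (N - n) / (N - 1) * n * p * (1 - p))      # both about 0.898
```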

4.8 Properties of the cumulative distribution function

Some properties of the cumulative distribution function $F$ are the following: (i) $F$ is a nondecreasing function; (ii) $\lim_{b \to \infty} F(b) = 1$; (iii) $\lim_{b \to -\infty} F(b) = 0$; (iv) $F$ is right continuous; that is, for any $b$ and any decreasing sequence $b_n$, $n \ge 1$, that converges to $b$, $\lim_{n \to \infty} F(b_n) = F(b)$.

Proposition 20. $P\{X < b\} = \lim_{n \to \infty} F\left(b - \frac{1}{n}\right)$; so $P\{X = b\} = F(b) - \lim_{n \to \infty} F\left(b - \frac{1}{n}\right)$.

5 Continuous Random Variables

Notation. The cumulative distribution function of a random variable $X$ is $F(x) = P(X \le x)$. A density function is a real-valued function $f$ which is non-negative and integrates to 1: that is, $\int_{-\infty}^{\infty} f(u)\,du = 1$. A distribution function $F$ (or a random variable $X$) has probability density function $f$ if $f$ is a density function such that $F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$. $X$ is a continuous random variable if there is a nonnegative function $f$, the probability density function of $X$, such that for any set $B$, $P\{X \in B\} = \int_B f(x)\,dx$.

Proposition 21. For a continuous random variable $X$, $P(X = x) = 0$ for any point $x$.

5.1 Expectation and Variance of Continuous Random Variables

Definition 12. $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$.

Proposition 22. If $X$ is a continuous random variable with probability density function $f(x)$ then for any real-valued function $g$, $E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx$.

Corollary 3. If $a$ and $b$ are constants then $E[aX + b] = aE[X] + b$.

Definition 13. The variance of $X$ is defined by $\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2$.

5.2 The Uniform Random Variable

A random variable $X$ is said to be uniform over the interval $(a, b)$ if its probability density function is given by $f(x) = \frac{1}{b-a}$ for $a \le x \le b$ and $0$ otherwise, and its cumulative distribution function is given by $F(x) = 0$ if $x \le a$, $\frac{x-a}{b-a}$ if $a < x < b$, and $1$ if $x \ge b$. Its expected value and variance are $E[X] = \frac{a+b}{2}$ and $\mathrm{Var}(X) = \frac{(b-a)^2}{12}$.

5.3 Normal Random Variable

A random variable $X$ is said to be normal with parameters $\mu$ and $\sigma^2$ if its probability density function is given by $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2 / 2\sigma^2}$ for $-\infty < x < \infty$. Its expected value and variance are $E[X] = \mu$ and $\mathrm{Var}(X) = \sigma^2$. If $X$ is normal with mean $\mu$ and variance $\sigma^2$ then $Z = \frac{X - \mu}{\sigma}$ is normal with mean $0$ and variance $1$.

Proposition 23. If $X$ is $N(\mu, \sigma^2)$ and $Y = aX + b$ with $a \ne 0$ then $Y$ is $N(a\mu + b, a^2\sigma^2)$.

5.3.1 The Normal Approximation to the Binomial Distribution

Theorem 7 (The DeMoivre-Laplace limit theorem). If $S_n$ denotes the number of successes that occur when $n$ independent trials, each resulting in a success with probability $p$, are performed then, for any $a < b$, $P\left\{a \le \frac{S_n - np}{\sqrt{np(1-p)}} \le b\right\} \to \Phi(b) - \Phi(a)$ as $n \to \infty$ (the normal approximation will be quite good for $n$ satisfying $np(1-p) \ge 10$).
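A numeric illustration of Theorem 7 in Python for one arbitrary choice of $n$, $p$, $a$, $b$, using $\Phi(x) = \frac{1}{2}\left[1 + \operatorname{erf}(x/\sqrt{2})\right]$ for the standard normal CDF.

```python
from math import comb, erf, sqrt

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

n, p = 400, 0.25          # np(1 - p) = 75 >= 10, so the approximation should be good
a, b = -1.0, 1.5
mu, sd = n * p, sqrt(n * p * (1 - p))

# Exact probability that a <= (S_n - np) / sqrt(np(1 - p)) <= b.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(n + 1)
            if a <= (k - mu) / sd <= b)

print(exact, Phi(b) - Phi(a))   # about 0.76 vs 0.77: close, as the theorem predicts
```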
5.3.2 Lognormal Random Variable

If $X$ is $N(\mu, \sigma^2)$ then $Y = e^X$ has a lognormal distribution, whose density is given by $f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma y} e^{-(\ln y - \mu)^2 / 2\sigma^2}$ for $y > 0$. Its expected value and variance are $E[Y] = e^{\mu + \sigma^2/2}$ and $\mathrm{Var}(Y) = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}$.

5.4 Exponential Random Variable

A random variable $X$ is said to be exponential with parameter $\lambda$ if its probability density function is given by $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ and $0$ otherwise, and its cumulative distribution function is given by $F(x) = 0$ if $x < 0$ and $1 - e^{-\lambda x}$ if $x \ge 0$. Its expected value and variance are $E[X] = \frac{1}{\lambda}$ and $\mathrm{Var}(X) = \frac{1}{\lambda^2}$.

Proposition 24. The exponential random variable is memoryless; that is, $P\{X > s + t \mid X > t\} = P\{X > s\}$.
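A one-line justification of Proposition 24 (for $s, t \ge 0$), using the tail probability $P\{X > x\} = e^{-\lambda x}$:
$$P\{X > s + t \mid X > t\} = \frac{P\{X > s + t\}}{P\{X > t\}} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda t}} = e^{-\lambda s} = P\{X > s\}.$$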

5.5 The Distribution of a Function of a Random Variable

Theorem 8. Let $X$ be a continuous random variable having probability density function $f_X$. Suppose that $g(x)$ is a strictly monotone, differentiable function of $x$. Then the random variable $Y$ defined by $Y = g(X)$ has a probability density function given by $f_Y(y) = f_X[g^{-1}(y)] \left| \frac{d}{dy} g^{-1}(y) \right|$ if $y = g(x)$ for some $x$, and $0$ if $y \ne g(x)$ for all $x$, where $g^{-1}(y)$ is defined to equal that value of $x$ such that $g(x) = y$.

Example 1. If $X$ is a continuous random variable with density $f_X$ and $Y = X^2$ then $Y$ has density $f_Y(y) = \frac{1}{2\sqrt{y}}\left[f_X(\sqrt{y}) + f_X(-\sqrt{y})\right]$ for $y > 0$.

Theorem 9. If $X$ is a continuous random variable with density $f_X$ and $Y = mX + b$, where $m \ne 0$, then $f_Y(y) = \frac{1}{|m|} f_X\left(\frac{y-b}{m}\right)$.

6 Other Topics

6.1 Expectations of Sums of Random Variables

Proposition 25. If $E[X_i]$ is finite for all $i = 1, \dots, n$ then $E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$. In particular, if a random variable $X$ can be written as a sum of indicator random variables $X_i$ then $E[X] = \sum_{i=1}^{n} E[X_i]$.

6.2 The Poisson Process

For $t \ge 0$, $N(t)$ denotes the number of events that occur in the time interval $[0, t]$. $\{N(t), t \ge 0\}$ is said to be a Poisson process with rate $\lambda$ if the following assumptions are met: (i) $N(0) = 0$; (ii) the numbers of events that occur in disjoint time intervals are independent; (iii) the distribution of the number of events that occur in a given interval depends only on the length of that interval and not on its location; (iv) $P\{N(h) = 1\} = \lambda h + o(h)$; (v) $P\{N(h) \ge 2\} = o(h)$.

Theorem 10. Let $\{N(t), t \ge 0\}$ be a Poisson process with rate $\lambda$. Then, for any $t \ge 0$, the random variable $N(t)$ has a Poisson distribution with mean $\lambda t$: $P\{N(t) = n\} = e^{-\lambda t} \frac{(\lambda t)^n}{n!}$ for $n = 0, 1, 2, \dots$

Proposition 26. For any $0 \le s \le t$, $N(t) - N(s)$ counts the number of events in the interval $(s, t]$ and has a Poisson distribution with mean $\lambda(t - s)$: $P\{N(t) - N(s) = n\} = e^{-\lambda(t-s)} \frac{(\lambda(t-s))^n}{n!}$ for $n = 0, 1, 2, \dots$
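A simulation sketch of Theorem 10 in Python (the rate, horizon, and replication count are arbitrary): the process is built from independent Exponential($\lambda$) interarrival times, and the empirical mean and variance of $N(t)$ are compared with $\lambda t$.

```python
import random

random.seed(0)
lam, t, trials = 2.0, 3.0, 20000   # arbitrary rate, time horizon, and number of runs

def count_events(lam, t):
    """N(t): number of arrivals in [0, t] when interarrival times are Exponential(lam)."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += random.expovariate(lam)
        if elapsed > t:
            return count
        count += 1

samples = [count_events(lam, t) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(mean, var, lam * t)   # empirical mean and variance both close to lambda * t = 6.0
```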
