
Stanford University STAT 116 Theory of Probability

Yongwhan Lim Autumn 2008-2009

1 Combinatorial Analysis

1.1 The Basic Principle of Counting

Theorem 1 (The Generalized Basic Principle of Counting). If $r$ experiments that are to be performed are such that the first one may result in any of $n_1$ possible outcomes, and if for each of these $n_1$ possible outcomes there are $n_2$ possible outcomes of the second experiment, and if for each of the possible outcomes of the first two experiments there are $n_3$ possible outcomes of the third experiment, and so on, then there is a total of $n_1 n_2 \cdots n_r$ possible outcomes of the $r$ experiments.

1.2 Combinations
Theorem 2. $\frac{n!}{(n-r)!\,r!}$ represents the number of possible combinations of $n$ objects taken $r$ at a time; we denote this number by $\binom{n}{r}$, so that $\binom{n}{r} = \frac{(n)_r}{r!}$ for $r \le n$.

Proposition 1 (Pascal's Triangle). $\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}$ for $1 \le r \le n$.

Theorem 3 (Binomial Theorem). $(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$.
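As a quick numerical sanity check of Theorem 3, the following Python sketch (the values of $x$, $y$, $n$ are arbitrary, not from the notes) compares $(x+y)^n$ with the binomial-coefficient sum.

```python
from math import comb

# Arbitrary illustrative values; any x, y and nonnegative integer n would do.
x, y, n = 2.0, 3.0, 7

lhs = (x + y) ** n
rhs = sum(comb(n, k) * x**k * y**(n - k) for k in range(n + 1))

print(lhs, rhs)   # both 78125.0 since (2 + 3)^7 = 5^7
```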

1.3 Permutation & Multinomial Coefficients

Theorem 4 (Multinomial Coefficients). If $\sum_{i=1}^{r} n_i = n$ then $\binom{n}{n_1, \dots, n_r}$ denotes $\frac{n!}{\prod_{i=1}^{r} n_i!}$, which is the number of possible divisions of $n$ distinct objects into $r$ distinct groups of respective sizes $n_1, \dots, n_r$ (equivalently, the number of permutations of $n$ objects of which $n_1$ are alike, $n_2$ are alike, $\dots$, $n_r$ are alike).

Theorem 5 (Multinomial Theorem). $\left( \sum_{i=1}^{r} x_i \right)^n = \sum_{n_1 + \cdots + n_r = n} \binom{n}{n_1, \dots, n_r} \prod_{i=1}^{r} x_i^{n_i}$.
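A brute-force Python sketch of Theorems 4 and 5: it enumerates all compositions $n_1 + \cdots + n_r = n$ for small, arbitrarily chosen values and checks that the multinomial expansion reproduces $(x_1 + \cdots + x_r)^n$. The helper name `multinomial` is ours, not from the notes.

```python
from math import factorial, prod
from itertools import product

def multinomial(n, parts):
    """n! / (n_1! * ... * n_r!), the multinomial coefficient of Theorem 4."""
    out = factorial(n)
    for p in parts:
        out //= factorial(p)
    return out

# Arbitrary illustrative values.
xs = [1.5, 2.0, 0.5]   # x_1, ..., x_r
n = 4

lhs = sum(xs) ** n
rhs = sum(multinomial(n, ns) * prod(x**k for x, k in zip(xs, ns))
          for ns in product(range(n + 1), repeat=len(xs))
          if sum(ns) == n)

print(lhs, rhs)   # both 256.0 since (1.5 + 2.0 + 0.5)^4 = 4^4
```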

2 Axioms of Probability

Notation. The sample space is the set of all possible outcomes, usually denoted by $\Omega$ (or $S$). An event is a statement about the outcome, i.e., a subset $A \subseteq \Omega$. The probability of event $A$ is usually denoted by $P(A)$.

2.1 Axioms of Probability

Axiom 1. $0 \le P(E) \le 1$.

Axiom 2. $P(S) = 1$.

Axiom 3. If $\{E_i\}$ is a sequence of mutually exclusive events then $P\left( \bigcup_{i=1}^{\infty} E_i \right) = \sum_{i=1}^{\infty} P(E_i)$.

2.2 Some Simple Propositions

Proposition 2. $P(E^c) = 1 - P(E)$.

Proposition 3. If $E \subseteq F$ then $P(E) \le P(F)$.

Proposition 4. $P(E \cup F) = P(E) + P(F) - P(EF)$.

Proposition 5 (Inclusion-Exclusion). $P\left( \bigcup_{i=1}^{n} E_i \right) = \sum_{r=1}^{n} (-1)^{r+1} \sum_{i_1 < \cdots < i_r} P(E_{i_1} \cdots E_{i_r})$.
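A brute-force Python sketch of Proposition 5 on a small equally likely sample space; the three events below are arbitrary illustrative subsets of $\{0, \dots, 19\}$.

```python
from itertools import combinations

omega = set(range(20))    # small sample space with equally likely outcomes
events = [set(range(0, 12)), set(range(8, 16)), {1, 5, 9, 15, 19}]

def P(A):
    return len(A) / len(omega)

# Left-hand side: probability of the union, computed directly.
lhs = P(set().union(*events))

# Right-hand side: the inclusion-exclusion sum over nonempty index sets.
n = len(events)
rhs = sum((-1) ** (r + 1) * P(set.intersection(*(events[i] for i in idx)))
          for r in range(1, n + 1)
          for idx in combinations(range(n), r))

print(lhs, rhs)   # both 0.85 for these particular events
```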

2.3 Sample Spaces Having Equally Likely Outcomes

Proposition 6. $P(A) = |A| / |\Omega|$.

2.3.1 Sampling with replacement

If $n$ representatives are chosen from a sample of size $s$ with replacement, there are a total of $s^n$ possible outcomes.

2.3.2 Sampling without replacement

If $n$ representatives are chosen from a sample of size $s$ without replacement, there are a total of $(s)_n = s(s-1)\cdots(s-n+1)$ possible outcomes.

Theorem 6 (Stirling's Formula). $k! \approx k^k e^{-k} \sqrt{2\pi k}$ for large $k$, meaning $\lim_{k \to \infty} \frac{k!}{k^k e^{-k} \sqrt{2\pi k}} = 1$.
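A quick numeric look at Theorem 6 in Python: the ratio $k! / (k^k e^{-k} \sqrt{2\pi k})$ approaches 1 as $k$ grows (the particular values of $k$ are arbitrary).

```python
from math import factorial, sqrt, pi, e

for k in (1, 5, 10, 50, 100):
    stirling = k**k * e**(-k) * sqrt(2 * pi * k)
    print(k, factorial(k) / stirling)
# ratios decrease toward 1: about 1.084 at k = 1, about 1.0008 at k = 100
```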

2.4 Probability as a continuous set function

Proposition 7. If $\{E_n, n \ge 1\}$ is either an increasing or a decreasing sequence of events then $\lim_{n \to \infty} P(E_n) = P(\lim_{n \to \infty} E_n)$.

Corollary 1 (Continuity from below/above). If $A_1 \subseteq A_2 \subseteq \cdots$ and $\bigcup_{i=1}^{\infty} A_i = A$ then $\lim_{i \to \infty} P(A_i) = P(A)$. If $B_1 \supseteq B_2 \supseteq \cdots$ and $\bigcap_{i=1}^{\infty} B_i = B$ then $\lim_{i \to \infty} P(B_i) = P(B)$.

Proposition 8 (Boole's Inequality). $P\left( \bigcup_{i=1}^{\infty} A_i \right) \le \sum_{i=1}^{\infty} P(A_i)$.

Proposition 9 (Bonferroni's Inequality). $P\left( \bigcap_{i=1}^{n} A_i \right) \ge \sum_{i=1}^{n} P(A_i) - (n - 1)$.

2.5 Probability as a Measure of Belief

$P(A)$ can be interpreted either as a long-run relative frequency or as a measure of one's degree of belief. Relative frequency is measured by conducting a large number of trials; subjective probability is simply a measure of belief. Statistical study focuses on the former notion.

3 Conditional Probability and Independence

3.1 Conditional Probabilities

Definition 1. $P(E \mid F) = P(EF) / P(F)$ (defined when $P(F) > 0$).

Proposition 10 (Multiplication Rule). $P(E_1 E_2 \cdots E_n) = P(E_1)\, P(E_2 \mid E_1) \cdots P(E_n \mid E_1 \cdots E_{n-1})$.

3.2 Bayes' Formula

Notation. $F_1, \dots, F_r$ is a partition of the sample space: that is, $F_i F_j = \emptyset$ if $i \ne j$ and $\bigcup_i F_i = \Omega$.

Proposition 11. $P(E) = \sum_i P(E \mid F_i)\, P(F_i)$.

Proposition 12. $P(E) = P(EF) + P(EF^c) = P(E \mid F)P(F) + P(E \mid F^c)P(F^c)$; so $P(E) = P(E \mid F)P(F) + P(E \mid F^c)[1 - P(F)]$.

Definition 2. The odds of an event $A$ is $\frac{P(A)}{P(A^c)} = \frac{P(A)}{1 - P(A)}$; we have $\frac{P(H \mid E)}{P(H^c \mid E)} = \frac{P(H)\, P(E \mid H)}{P(H^c)\, P(E \mid H^c)}$.

Proposition 13 (Bayes' Formula). $P(F_j \mid E) = \frac{P(EF_j)}{P(E)} = \frac{P(E \mid F_j)\, P(F_j)}{\sum_i P(E \mid F_i)\, P(F_i)}$.
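A worked instance of Propositions 11 and 13 in Python with made-up numbers for a hypothetical diagnostic test: $F_1$ is "has the condition" (prevalence 1%), $F_2 = F_1^c$, and $E$ is "the test is positive".

```python
# Hypothetical numbers, chosen only to illustrate Bayes' formula.
P_F = [0.01, 0.99]            # P(F_1), P(F_2): a partition of the sample space
P_E_given_F = [0.95, 0.02]    # P(E | F_1), P(E | F_2): true- and false-positive rates

# Proposition 11: P(E) = sum_i P(E | F_i) P(F_i)
P_E = sum(pe * pf for pe, pf in zip(P_E_given_F, P_F))

# Proposition 13: P(F_1 | E) = P(E | F_1) P(F_1) / P(E)
P_F1_given_E = P_E_given_F[0] * P_F[0] / P_E

print(P_E, P_F1_given_E)   # about 0.0293 and 0.324
```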

3.3 Independent Events

Definition 3. Two events $E$ and $F$ are independent if $P(EF) = P(E)P(F)$.

Proposition 14. If $E$ and $F$ are independent then $E$ and $F^c$ are independent.

Definition 4. Three events $E$, $F$, and $G$ are independent if $P(EFG) = P(E)P(F)P(G)$, $P(EF) = P(E)P(F)$, $P(EG) = P(E)P(G)$, and $P(FG) = P(F)P(G)$.

Definition 5. The events $E_1, \dots, E_n$ are said to be independent if for any subset $E_{i_1}, \dots, E_{i_r}$ of them, $P(E_{i_1} \cdots E_{i_r}) = P(E_{i_1}) \cdots P(E_{i_r})$.

3.4 $P(\cdot \mid F)$ is a Probability

Conditional probabilities satisfy all of the properties of ordinary probabilities. In fact, $P(E \mid F)$ satisfies the three axioms of a probability:

Proposition 15. (i) $0 \le P(E \mid F) \le 1$; (ii) $P(S \mid F) = 1$; (iii) $P\left( \bigcup_i E_i \mid F \right) = \sum_i P(E_i \mid F)$ for mutually exclusive events $\{E_i\}$. Moreover, $P(E^c \mid F) = 1 - P(E \mid F)$.

4 Random Variables

4.1 Discrete Random Variables

Definition 6. The probability mass function $p(a)$ of $X$ is $p(a) = P\{X = a\}$.

Proposition 16. $p(a)$ is positive for at most a countable number of values of $a$; that is, if $X$ assumes the values $\{x_i\}$ then $p(x_i) \ge 0$, $p(x) = 0$ if $x \notin \{x_i\}$, and $\sum_i p(x_i) = 1$.

Definition 7. The cumulative distribution function $F(a)$ is $F(a) = \sum_{x \le a} p(x)$.

4.2 Expected Value

Definition 8. The expectation or expected value of $X$ is $E[X] = \sum_{x : p(x) > 0} x\, p(x)$; in the finite case, $E[X] = \sum_{i=1}^{n} x_i\, p(x_i)$.

4.3 Expectation of a function of a random variable

Proposition 17. If $X$ is a discrete random variable that takes on one of the values $x_i$, $i \ge 1$, with respective probabilities $p(x_i)$, then for any real-valued function $g$, $E[g(X)] = \sum_i g(x_i)\, p(x_i)$.

Corollary 2. $E[aX + b] = aE[X] + b$.

Definition 9. The quantity $E[X^n]$, where $n \ge 1$, is called the $n$th moment of $X$; it is just $E[X^n] = \sum_{x : p(x) > 0} x^n p(x)$.

4.4 Variance

Definition 10. Define the variance and standard deviation to be, respectively, $\mathrm{Var}(X) = E[(X - \mu)^2]$, where $\mu = E[X]$, and $SD(X) = \sqrt{\mathrm{Var}(X)}$.

Proposition 18. $\mathrm{Var}(X) = E[X^2] - (E[X])^2$; $\mathrm{Var}(aX + b) = a^2\, \mathrm{Var}(X)$; $SD(aX + b) = |a|\, SD(X)$.
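A small Python sketch (with an arbitrary pmf, not one from the notes) that checks Proposition 18's identities using the $E[g(X)]$ formula from Proposition 17.

```python
# Arbitrary illustrative pmf on a few support points.
support = [0, 1, 2, 5]
probs   = [0.1, 0.4, 0.3, 0.2]

def E(g):
    """E[g(X)] = sum_i g(x_i) p(x_i)  (Proposition 17)."""
    return sum(g(x) * p for x, p in zip(support, probs))

mean = E(lambda x: x)
var  = E(lambda x: (x - mean) ** 2)
print(var, E(lambda x: x**2) - mean**2)   # both 2.6: Var(X) = E[X^2] - (E[X])^2

a, b = 3.0, -2.0
var_lin = E(lambda x: (a * x + b - (a * mean + b)) ** 2)
print(var_lin, a**2 * var)                # both 23.4: Var(aX + b) = a^2 Var(X)
```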

4.5 The Bernoulli and Binomial Random Variables

The random variable $X$ whose probability mass function is given by $p(i) = \binom{n}{i} p^i (1-p)^{n-i}$ for $i = 0, \dots, n$ is said to be a binomial random variable with parameters $n$ and $p$. Its mean and variance are $E[X] = np$ and $\mathrm{Var}(X) = np(1-p)$.

Proposition 19. If $X$ is a binomial random variable with parameters $(n, p)$, where $0 < p < 1$, then as $k$ goes from $0$ to $n$, $P\{X = k\}$ first increases monotonically and then decreases monotonically, reaching its largest value when $k$ is the largest integer less than or equal to $(n+1)p$.

Definition 11 (Indicator). The indicator random variable of event $A$, denoted by $I_A$, is $1$ if $A$ occurs and $0$ if $A$ does not occur.
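A hedged Python check of Proposition 19 and of the stated binomial mean and variance, for one arbitrary choice of parameters.

```python
from math import comb, floor

n, p = 10, 0.3   # arbitrary illustrative parameters

pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
mode = max(range(n + 1), key=lambda k: pmf[k])
print(mode, floor((n + 1) * p))        # both 3, as Proposition 19 predicts

mean = sum(k * q for k, q in enumerate(pmf))
var  = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
print(mean, n * p)                     # both approximately 3.0
print(var, n * p * (1 - p))            # both approximately 2.1
```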

4.6 The Poisson Random Variable

The random variable $X$ whose probability mass function is given by $p(i) = e^{-\lambda} \frac{\lambda^i}{i!}$ for $i = 0, 1, 2, \dots$ is said to be a Poisson random variable with parameter $\lambda$. Its mean and variance are $E[X] = \mathrm{Var}(X) = \lambda$. If a large number of independent trials are performed, each having a small probability of being successful, then the number of successful trials that result will have a distribution that is approximately that of a Poisson random variable.
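A Python sketch of the approximation just described: for many trials $n$ with small success probability $p$ (values chosen arbitrarily), Binomial$(n, p)$ probabilities are close to Poisson$(\lambda = np)$ probabilities.

```python
from math import comb, exp, factorial

n, p = 1000, 0.003     # many trials, each with a small success probability
lam = n * p            # the matching Poisson parameter, lambda = np = 3.0

for i in range(8):
    binom   = comb(n, i) * p**i * (1 - p)**(n - i)
    poisson = exp(-lam) * lam**i / factorial(i)
    print(i, round(binom, 5), round(poisson, 5))
# the two columns agree to roughly three decimal places
```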

4.7 Other Discrete Probability Distributions

4.7.1 The Geometric Random Variable

The random variable $X$ whose probability mass function is given by $p(i) = p(1-p)^{i-1}$ for $i = 1, 2, \dots$ is said to be a geometric random variable with parameter $p$. It represents the trial number of the first success when each trial is independently a success with probability $p$. Its mean and variance are $E[X] = \frac{1}{p}$ and $\mathrm{Var}(X) = \frac{1-p}{p^2}$.

4.7.2 The Negative Binomial Random Variable

The random variable $X$ whose probability mass function is given by $p(i) = \binom{i-1}{r-1} p^r (1-p)^{i-r}$ for $i \ge r$ is said to be a negative binomial random variable with parameters $r$ and $p$. It represents the trial number of the $r$th success when each trial is independently a success with probability $p$. Its mean and variance are $E[X] = \frac{r}{p}$ and $\mathrm{Var}(X) = \frac{r(1-p)}{p^2}$.

4.7.3 The Hypergeometric Random Variable

The random variable $X$ whose probability mass function is given by $p(i) = \frac{\binom{m}{i}\binom{N-m}{n-i}}{\binom{N}{n}}$ for $i = 0, \dots, m$ is said to be a hypergeometric random variable with parameters $n$, $N$, and $m$. It represents the number of white balls selected when $n$ balls are randomly chosen from an urn that contains $N$ balls, of which $m$ are white. Writing $p = \frac{m}{N}$, its mean and variance are $E[X] = np$ and $\mathrm{Var}(X) = \frac{N-n}{N-1}\, np(1-p)$.
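A brute-force Python check of the hypergeometric mean and variance formulas, computed directly from the pmf above for one arbitrary urn.

```python
from math import comb

N, m, n = 20, 7, 5    # arbitrary urn: 20 balls, 7 white, 5 drawn
p = m / N

pmf = {i: comb(m, i) * comb(N - m, n - i) / comb(N, n)
       for i in range(0, min(m, n) + 1)}
mean = sum(i * q for i, q in pmf.items())
var  = sum((i - mean) ** 2 * q for i, q in pmf.items())

print(mean, n * p)                                   # both 1.75
print(var, (N - n) / (N - 1) * n * p * (1 - p))      # both about 0.898
```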

4.8 Properties of the cumulative distribution function

Some properties of the cumulative distribution function $F$ are the following: (i) $F$ is a nondecreasing function; (ii) $\lim_{b \to \infty} F(b) = 1$; (iii) $\lim_{b \to -\infty} F(b) = 0$; (iv) $F$ is right continuous; that is, for any $b$ and any decreasing sequence $b_n$, $n \ge 1$, that converges to $b$, $\lim_{n \to \infty} F(b_n) = F(b)$.

Proposition 20. $P\{X < b\} = \lim_{n \to \infty} F\left(b - \frac{1}{n}\right)$; so $P\{X = b\} = F(b) - \lim_{n \to \infty} F\left(b - \frac{1}{n}\right)$.

5 Continuous Random Variables

Notation. The cumulative distribution function of a random variable $X$ is $F(x) = P(X \le x)$. A density function is a real-valued function $f$ which is non-negative and integrates to 1: that is, $\int_{-\infty}^{\infty} f(u)\,du = 1$. A distribution function $F$ (or a random variable $X$) has probability density function $f$ if $f$ is a density function such that $F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$. $X$ is a continuous random variable if there is a nonnegative function $f$, the probability density function of $X$, such that for any set $B$, $P\{X \in B\} = \int_B f(x)\,dx$.

Proposition 21. For a continuous random variable $X$, $P(X = x) = 0$ for any point $x$.

5.1 Expectation and Variance of Continuous Random Variables

Definition 12. $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$.

Proposition 22. If $X$ is a continuous random variable with probability density function $f(x)$ then for any real-valued function $g$, $E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx$.

Corollary 3. If $a$ and $b$ are constants then $E[aX + b] = aE[X] + b$.

Definition 13. The variance of $X$ is defined by $\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2$.

5.2 The Uniform Random Variable

A random variable $X$ is said to be uniform over the interval $(a, b)$ if its probability density function is given by $f(x) = \frac{1}{b-a}$ for $a \le x \le b$ and $0$ otherwise, and its cumulative distribution function is given by $F(x) = 0$ if $x \le a$, $\frac{x-a}{b-a}$ if $a < x < b$, and $1$ if $x \ge b$. Its expected value and variance are $E[X] = \frac{a+b}{2}$ and $\mathrm{Var}(X) = \frac{(b-a)^2}{12}$.

5.3 Normal Random Variable

A random variable $X$ is said to be normal with parameters $\mu$ and $\sigma^2$ if its probability density function is given by $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2 / 2\sigma^2}$ for $-\infty < x < \infty$. Its expected value and variance are $E[X] = \mu$ and $\mathrm{Var}(X) = \sigma^2$. If $X$ is normal with mean $\mu$ and variance $\sigma^2$ then $Z = \frac{X - \mu}{\sigma}$ is normal with mean $0$ and variance $1$.

Proposition 23. If $X$ is $N(\mu, \sigma^2)$ and $Y = aX + b$ with $a \ne 0$ then $Y$ is $N(a\mu + b, a^2\sigma^2)$.

5.3.1 The Normal Approximation to the Binomial Distribution

Theorem 7 (The DeMoivre-Laplace limit theorem). If $S_n$ denotes the number of successes that occur when $n$ independent trials, each resulting in a success with probability $p$, are performed then, for any $a < b$, $P\left\{a \le \frac{S_n - np}{\sqrt{np(1-p)}} \le b\right\} \to \Phi(b) - \Phi(a)$ as $n \to \infty$ (the normal approximation will be quite good for $n$ satisfying $np(1-p) \ge 10$).
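A numeric illustration of Theorem 7 in Python for one arbitrary choice of $n$, $p$, $a$, $b$, using $\Phi(x) = \frac{1}{2}\left[1 + \operatorname{erf}(x/\sqrt{2})\right]$ for the standard normal CDF.

```python
from math import comb, erf, sqrt

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

n, p = 400, 0.25          # np(1 - p) = 75 >= 10, so the approximation should be good
a, b = -1.0, 1.5
mu, sd = n * p, sqrt(n * p * (1 - p))

# Exact probability that a <= (S_n - np) / sqrt(np(1 - p)) <= b.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(n + 1)
            if a <= (k - mu) / sd <= b)

print(exact, Phi(b) - Phi(a))   # about 0.76 vs 0.77: close, as the theorem predicts
```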
5.3.2 Lognormal Random Variable

If $X$ is $N(\mu, \sigma^2)$ then $Y = e^X$ has a lognormal distribution, whose density is given by $f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma y} e^{-(\ln y - \mu)^2 / 2\sigma^2}$ for $y > 0$. Its expected value and variance are $E[Y] = e^{\mu + \sigma^2/2}$ and $\mathrm{Var}(Y) = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}$.

5.4 Exponential Random Variable

A random variable $X$ is said to be exponential with parameter $\lambda$ if its probability density function is given by $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ and $0$ otherwise, and its cumulative distribution function is given by $F(x) = 0$ if $x < 0$ and $1 - e^{-\lambda x}$ if $x \ge 0$. Its expected value and variance are $E[X] = \frac{1}{\lambda}$ and $\mathrm{Var}(X) = \frac{1}{\lambda^2}$.

Proposition 24. The exponential random variable is memoryless; that is, $P\{X > s + t \mid X > t\} = P\{X > s\}$.
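A one-line justification of Proposition 24 (for $s, t \ge 0$), using the tail probability $P\{X > x\} = e^{-\lambda x}$:
$$P\{X > s + t \mid X > t\} = \frac{P\{X > s + t\}}{P\{X > t\}} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda t}} = e^{-\lambda s} = P\{X > s\}.$$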

5.5 The Distribution of a Function of a Random Variable

Theorem 8. Let $X$ be a continuous random variable having probability density function $f_X$. Suppose that $g(x)$ is a strictly monotone, differentiable function of $x$. Then the random variable $Y$ defined by $Y = g(X)$ has a probability density function given by $f_Y(y) = f_X[g^{-1}(y)] \left| \frac{d}{dy} g^{-1}(y) \right|$ if $y = g(x)$ for some $x$, and $0$ if $y \ne g(x)$ for all $x$, where $g^{-1}(y)$ is defined to equal that value of $x$ such that $g(x) = y$.

Example 1. If $X$ is a continuous random variable with density $f_X$ and $Y = X^2$ then $Y$ has density $f_Y(y) = \frac{1}{2\sqrt{y}}\left[f_X(\sqrt{y}) + f_X(-\sqrt{y})\right]$ for $y > 0$.

Theorem 9. If $X$ is a continuous random variable with density $f_X$ and $Y = mX + b$, where $m \ne 0$, then $f_Y(y) = \frac{1}{|m|} f_X\left(\frac{y-b}{m}\right)$.

6 Other Topics

6.1 Expectations of Sums of Random Variables

Proposition 25. If $E[X_i]$ is finite for all $i = 1, \dots, n$ then $E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$. In particular, if a random variable $X$ can be written as a sum of indicator random variables $X_i$ then $E[X] = \sum_{i=1}^{n} E[X_i]$.

6.2 The Poisson Process

For $t \ge 0$, $N(t)$ denotes the number of events that occur in the time interval $[0, t]$. $\{N(t), t \ge 0\}$ is said to be a Poisson process with rate $\lambda$ if the following assumptions are met: (i) $N(0) = 0$; (ii) the numbers of events that occur in disjoint time intervals are independent; (iii) the distribution of the number of events that occur in a given interval depends only on the length of that interval and not on its location; (iv) $P\{N(h) = 1\} = \lambda h + o(h)$; (v) $P\{N(h) \ge 2\} = o(h)$.

Theorem 10. Let $\{N(t), t \ge 0\}$ be a Poisson process with rate $\lambda$. Then, for any $t \ge 0$, the random variable $N(t)$ has a Poisson distribution with mean $\lambda t$: $P\{N(t) = n\} = e^{-\lambda t} \frac{(\lambda t)^n}{n!}$ for $n = 0, 1, 2, \dots$

Proposition 26. For any $0 \le s \le t$, $N(t) - N(s)$ counts the number of events in the interval $(s, t]$ and has a Poisson distribution with mean $\lambda(t - s)$: $P\{N(t) - N(s) = n\} = e^{-\lambda(t-s)} \frac{(\lambda(t-s))^n}{n!}$ for $n = 0, 1, 2, \dots$
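A simulation sketch of Theorem 10 in Python (the rate, horizon, and replication count are arbitrary): the process is built from independent Exponential($\lambda$) interarrival times, and the empirical mean and variance of $N(t)$ are compared with $\lambda t$.

```python
import random

random.seed(0)
lam, t, trials = 2.0, 3.0, 20000   # arbitrary rate, time horizon, and number of runs

def count_events(lam, t):
    """N(t): number of arrivals in [0, t] when interarrival times are Exponential(lam)."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += random.expovariate(lam)
        if elapsed > t:
            return count
        count += 1

samples = [count_events(lam, t) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(mean, var, lam * t)   # empirical mean and variance both close to lambda * t = 6.0
```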
