
BST 401 Probability Theory

Xing Qiu Ha Youn Lee

Department of Biostatistics and Computational Biology


University of Rochester

September 21, 2009

Qiu, Lee BST 401


Outline

1 Lebesgue-Stieltjes Measure and Distribution Functions

2 Measurable functions



Lebesgue measure review, the construction

Let F0 be the field generated by the collection of all intervals, and let µ0 be the usual length measure on intervals.
Extend µ0 to µ1 : G → R, where G consists of F0 together with the limiting sets of F0; on these limiting sets µ1 is defined by exchanging the order of limit and measure.
Extend µ1 to µ∗, an outer measure defined on 2^Ω. Unfortunately, µ∗ in general does not satisfy countable additivity.
Restrict µ∗ to the collection of measurable sets, denoted by F∗, which is a σ-algebra.
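For simple sets the outer measure is easy to compute by hand: when A is a finite union of intervals, the infimum over interval covers in the definition of µ∗ is attained by merging overlapping intervals and summing their lengths. A minimal Python sketch of that special case (the function name is my own, not from the course):

```python
def outer_measure(intervals):
    """Total length of a finite union of intervals, given as (a, b) pairs.
    For such sets the infimum over countable interval covers in the
    definition of the outer measure is attained by merging overlaps."""
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            # Overlapping or touching: extend the previous interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return sum(b - a for a, b in merged)

print(outer_measure([(0, 1), (0.5, 2), (3, 4)]))  # → 3.0
```

The general outer measure requires an infimum over all countable covers; this sketch only illustrates why that infimum is the "natural length" on sets where we already know the answer.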




Lebesgue measure review, the main results

The Carathéodory extension theorem: there exists one and only one way to extend a σ-finite measure µ0 on an algebra F0 to F, the σ-algebra generated by F0.
The measure approximation theorem: for any A ∈ F and any given ε > 0, there exists a set B ∈ F0 such that µ(A∆B) < ε.
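The approximation theorem is easiest to see with counting measure on a finite Ω, where every set is measurable and µ(A∆B) simply counts the points on which A and B disagree. A toy Python check (illustrative only):

```python
def sym_diff_measure(A, B):
    """Counting measure of the symmetric difference A Δ B:
    the number of elements on which the two sets disagree."""
    return len(set(A) ^ set(B))

# B approximates A up to an error of µ(AΔB) = 2.
print(sym_diff_measure({1, 2, 3, 4}, {2, 3, 4, 5}))  # → 2
```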




Generalizations

Slight generalization of Lebesgue measure: µ((a, b]) = F(b) − F(a), where F(·) is a non-decreasing, continuous function.
Further generalization: F(·) only needs to be right-continuous, so jumps are allowed.
Definition of right-continuity: F(xn) ↓ F(x) when xn ↓ x.
Such an F is called a Stieltjes measure function. If µ is a probability measure, F is called the distribution function of µ. We will use these two terms interchangeably.
Theorem (1.5), pg. 440: given such an F, there is a measure µ s.t. µ((a, b]) = F(b) − F(a).
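The relation µ((a, b]) = F(b) − F(a) is easy to experiment with numerically. The sketch below uses a hypothetical F with a jump at 0 (a point mass of 1/2 at 0 plus a uniform part on (0, 1]); the names are illustrative, not from the text:

```python
def F(x):
    """A non-decreasing, right-continuous Stieltjes measure function:
    point mass 1/2 at 0, plus uniform mass 1/2 spread over (0, 1]."""
    if x < 0:
        return 0.0
    return 0.5 + 0.5 * min(x, 1.0)

def mu_interval(a, b):
    """µ((a, b]) = F(b) - F(a), per Theorem (1.5)."""
    return F(b) - F(a)

print(mu_interval(-1, 0))  # interval captures the jump at 0 → 0.5
print(mu_interval(0, 1))   # the continuous part → 0.5
```

Because F is right-continuous with a jump at 0, the half-open interval (−1, 0] picks up the point mass, exactly the behavior the jump is meant to model.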




Definitions

The converse of Theorem (1.5) is almost true: for most measures we can define a distribution function. The only exceptions are measures with µ((a, b]) = ∞ for some finite interval.

Definition
A Lebesgue-Stieltjes measure on R is a measure µ : B → R such that µ(I) < ∞ for each bounded interval I.

Alternatively, we may define an L-S measure by an F(·) which satisfies the non-decreasing and right-continuity conditions.


Comments and examples

Page 1.4.5.



Discrete measure

Page 26.
Let µ be an L-S measure that is concentrated on a countable set S = {x1, x2, . . .}.
Its distribution function is a step function.
µ can be extended to 2^Ω.
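A discrete L-S measure and its step distribution function can be sketched directly; the point masses below are invented for illustration:

```python
# Hypothetical point masses on a countable (here finite) set S.
masses = {0.0: 0.2, 1.0: 0.5, 2.5: 0.3}

def mu(A):
    """Measure of an arbitrary subset A of R: since µ is concentrated
    on S, it extends to all of 2^Ω by summing the point masses in A."""
    return sum(m for x, m in masses.items() if x in A)

def F(x):
    """Distribution function F(x) = µ((-∞, x]): a step function that
    jumps at each point of S (right-continuous by the <= below)."""
    return sum(m for xi, m in masses.items() if xi <= x)

print(mu({1.0, 2.5}))  # → 0.8
print(F(1.0))          # jump at 1.0 included → 0.7
```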




Restriction

In the discrete measure example, we can restrict µ to 2^S instead of B; the restriction has essentially the same properties.
Remark: why don't we say we restrict µ to S?
Another restriction example: µ is concentrated on some interval [a, b].
Construct B[a, b]; then we can restrict µ to this σ-field without losing its mathematical properties.
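Computationally, restriction is trivial: the restricted measure is the same set function, evaluated only on a smaller σ-field. This also answers the remark above: we restrict to the collection of sets 2^S, not to the point set S, since a measure takes sets as arguments. A toy sketch with made-up weights:

```python
from itertools import combinations

S = ("a", "b", "c")
weights = {"a": 0.5, "b": 0.3, "c": 0.2}

def mu(A):
    """A discrete measure, defined here on any subset of S."""
    return sum(weights[x] for x in A)

def powerset(S):
    """2^S: the σ-field we restrict µ to."""
    return [frozenset(c) for r in range(len(S) + 1)
            for c in combinations(S, r)]

# The restriction of µ to 2^S: same values, smaller domain.
restricted = {A: mu(A) for A in powerset(S)}
print(restricted[frozenset({"a", "b"})])  # → 0.8
```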




L-S measure on Rn

Sketch of construction:
The analogue of intervals: rectangles.
Open rectangles, closed rectangles, semi-closed rectangles: each is determined by two vertices, so we use the same notation (a, b].
The smallest σ-field containing all rectangles: B(Rn).
An L-S measure on Rn is a measure µ : B(Rn) → R such that µ(I) < ∞ for each bounded rectangle I.
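A semi-closed rectangle (a, b] in Rn is just a product of semi-closed intervals determined by the two vertices a and b, so membership can be checked coordinate-wise. A small sketch:

```python
def in_rectangle(x, a, b):
    """Is x in the semi-closed rectangle
    (a, b] = (a1, b1] x ... x (an, bn]?"""
    return all(ai < xi <= bi for xi, ai, bi in zip(x, a, b))

print(in_rectangle((0.5, 1.0), (0, 0), (1, 1)))  # → True
print(in_rectangle((0.0, 0.5), (0, 0), (1, 1)))  # → False (left edge excluded)
```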




L-S Measure on Rn (II)

Its distribution function is defined to be F(x) = µ((−∞, x]).
These distribution functions are also:
Non-decreasing: for a ≤ b, that is, a1 ≤ b1, a2 ≤ b2, . . ., we have F(a) ≤ F(b).
Right-continuous: F is right-continuous in each variable.
Conversely, just as in the 1-dim case, for any non-decreasing, right-continuous function there exists a unique measure on B(Rn) corresponding to it.




Measure of a finite rectangle

This is a main difference from the 1-dim case: µ((a, b]) ≠ F(b) − F(a).
Draw a two-dimensional example to illustrate this point.
Pages 442-443 describe an elaborate way of measuring a rectangle by means of the distribution function, but that formula is not often used. The reason is that later we will see that we can define a density function for most common, useful distributions, and then there is an easy way to calculate the measure of a rectangle (or an arbitrary region, for that matter) using integrals.
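In two dimensions the correct rectangle formula is the inclusion-exclusion identity µ((a, b]) = F(b1, b2) − F(a1, b2) − F(b1, a2) + F(a1, a2); this is my statement of the 2-dim case of the formula the slides refer to. A quick numerical check against the uniform distribution on (0, 1]², where F(x, y) = xy on the unit square:

```python
def F(x, y):
    """2-dim distribution function of the uniform measure on (0,1]^2."""
    clamp = lambda t: max(0.0, min(t, 1.0))
    return clamp(x) * clamp(y)

def mu_rect(a, b):
    """µ((a, b]) by 2-dim inclusion-exclusion on F.
    Note this is NOT simply F(b) - F(a)."""
    (a1, a2), (b1, b2) = a, b
    return F(b1, b2) - F(a1, b2) - F(b1, a2) + F(a1, a2)

# Area of (0.25, 0.75] x (0.25, 0.75] under the uniform measure:
print(mu_rect((0.25, 0.25), (0.75, 0.75)))  # → 0.25
# The naive 1-dim formula F(b) - F(a) gives a different (wrong) value:
print(F(0.75, 0.75) - F(0.25, 0.25))        # → 0.5
```

The two corner terms with mixed coordinates are exactly what F(b) − F(a) omits, which is why the 1-dim formula fails in higher dimensions.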




An example of measurable function

A probability example to show the motivation. Let (Ω, F, P) be a probability space. To be more specific, let's say Ω = {HEAD, TAIL-a, TAIL-b},
F = {∅, Ω, {HEAD}, {TAIL-a, TAIL-b}},
P({HEAD}) = 1/2, P({TAIL-a, TAIL-b}) = 1/2.
P is a probability measure. Interpretation: the probability of seeing HEAD and the probability of seeing TAIL (of either type) are both 1/2.
Now define a function h on Ω as follows: h(HEAD) = −1, h(TAIL-a) = h(TAIL-b) = 1. (Such a function is sometimes called a coding function.)
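The coin space above is small enough to compute with directly. The sketch below represents F as a list of frozensets and evaluates P(h = c) as P(h⁻¹({c})), looking the preimage up in F:

```python
OMEGA = {"HEAD", "TAIL-a", "TAIL-b"}
F = [frozenset(), frozenset(OMEGA),
     frozenset({"HEAD"}), frozenset({"TAIL-a", "TAIL-b"})]
P = {frozenset(): 0.0, frozenset(OMEGA): 1.0,
     frozenset({"HEAD"}): 0.5,
     frozenset({"TAIL-a", "TAIL-b"}): 0.5}

h = {"HEAD": -1, "TAIL-a": 1, "TAIL-b": 1}  # the coding function

def prob_h_equals(c):
    """P(h = c) = P(h^{-1}({c})); well defined because the
    preimage is a member of F."""
    preimage = frozenset(w for w in OMEGA if h[w] == c)
    assert preimage in F, "preimage not measurable"
    return P[preimage]

print(prob_h_equals(-1))  # → 0.5
print(prob_h_equals(1))   # → 0.5
```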




An example of measurable function (II)

This function codes HEAD, TAIL-a, TAIL-b into numbers, which are much easier to process than plain English descriptions!
One nice consequence: we can talk about P(h = −1) and P(h = 1) instead of P({HEAD}) and P({TAIL-a, TAIL-b}).
h not only maps arbitrary events into tangible numbers, but also maps a measure defined on an arbitrary space to a measure defined on numbers.




Another example (II)

A more complex example: measuring the height of trees. Ω: trees (descriptive). P: a probability on events concerning the height of trees. h : Ω → R maps every tree to a real number (in centimeters, inches, etc.).
In this example, we may want to estimate probabilities of the form P(a < h(ω) ≤ b), i.e., the probability of a certain range of heights.
Notation: for any function h : Ω → R and a set A ⊂ R, define h−1(A) = {ω ∈ Ω | h(ω) ∈ A}, the preimage of A. Draw a diagram to show this set.
In this notation, P(a < h(ω) ≤ b) = P(h−1((a, b])).
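The preimage notation is mechanical to compute; the tree names and heights below are invented for illustration:

```python
heights = {"oak": 180.0, "pine": 250.0, "birch": 140.0}  # h : Ω → R

def preimage(h, pred):
    """h^{-1}(A) = {ω ∈ Ω : h(ω) ∈ A}, with A given as a predicate."""
    return {w for w, value in h.items() if pred(value)}

# h^{-1}((150, 260]): trees whose height lies in (150, 260].
print(sorted(preimage(heights, lambda v: 150 < v <= 260)))  # → ['oak', 'pine']
```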




Non-measurable function

Now let us define another function g on Ω: g(HEAD) = −1, g(TAIL-a) = 0, g(TAIL-b) = 1.
This function is bad because P(g = 0) and P(g = 1) are not well defined: the preimages {TAIL-a} and {TAIL-b} are not in F.
Gambling interpretation: TAIL-a has a probability that is not measurable for us players. All we can observe are: HEAD, lose one dollar; TAIL, most of the time (TAIL-b) we win one dollar, but sometimes the result is canceled by the casino (TAIL-a).
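The failure can be checked mechanically: g⁻¹({0}) = {TAIL-a} is not a member of F. A self-contained sketch using the coin space from the measurable-function example:

```python
OMEGA = {"HEAD", "TAIL-a", "TAIL-b"}
F = [frozenset(), frozenset(OMEGA),
     frozenset({"HEAD"}), frozenset({"TAIL-a", "TAIL-b"})]

g = {"HEAD": -1, "TAIL-a": 0, "TAIL-b": 1}

def is_measurable(f):
    """On this finite space, f is measurable iff every preimage
    f^{-1}({c}) lies in F (preimages of bigger sets are then
    finite unions of these, which a field also contains)."""
    return all(frozenset(w for w in OMEGA if f[w] == c) in F
               for c in set(f.values()))

print(is_measurable(g))                                       # → False
print(is_measurable({"HEAD": -1, "TAIL-a": 1, "TAIL-b": 1}))  # → True
```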




Random variables and Borel-measurable functions

Start with an arbitrary probability space (Ω, F, P).
A function h : Ω → R is called a random variable if h−1((a, b]) is always measurable, i.e., P(a < h(ω) ≤ b) is always well defined.
From the Carathéodory extension theorem, we can extend from intervals to Borel sets: if h is a random variable, then for any Borel set B, h−1(B) is measurable.
R can be extended to Rn, i.e., to functions taking vector values. These functions are called n-dimensional random vectors.
Furthermore, if P is replaced by an arbitrary measure, h is called an n-dimensional Borel-measurable function.
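For a finite probability space the defining condition can be exercised directly: compute h⁻¹((a, b]) and evaluate P on it, which also yields the induced (pushforward) distribution of the random variable on R. All names below are illustrative:

```python
# A finite probability space: P is given on the atoms, so F = 2^Ω
# and every subset of Ω is measurable.
P_atoms = {"w1": 0.2, "w2": 0.3, "w3": 0.5}
h = {"w1": -1.0, "w2": 0.5, "w3": 2.0}  # a random variable on Ω

def prob_in_interval(a, b):
    """P(a < h ≤ b) = P(h^{-1}((a, b]))."""
    return sum(p for w, p in P_atoms.items() if a < h[w] <= b)

print(prob_in_interval(0, 3))   # P(0 < h ≤ 3)  → 0.8
print(prob_in_interval(-2, 0))  # P(-2 < h ≤ 0) → 0.2
```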

