Mathematical Expectation

Dr. Sunu Wibirama

Basic Probability and Statistics


Department of Electrical Engineering and Information Technology
Faculty of Engineering, Universitas Gadjah Mada
Outline
Mean of a random variable
Variance and covariance
Overview of Correlation Coefficient
Random variable
Probability theory: the discipline concerned with
the study of uncertain (or random) phenomena.
Probability is the mathematical language adopted
for quantifying uncertainty.
Such phenomena, although not predictable in a
deterministic fashion, may present some regularities
and consequently be described mathematically by
idealized probabilistic models.
The theory of probability makes it possible to infer
from these models the patterns of future behavior.
A random variable is useful for expressing random
quantities determined by the outcome of an experiment.
Experiment with coins
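Two coins are tossed 16 times. If X is the number of heads
that occur per toss, then X can take the values 0, 1, and 2.
Suppose the experiment yields no heads, one head, and
two heads a total of 4, 7, and 5 times, respectively. The
average number of heads per toss of the two coins is:

$$(0)\left(\frac{4}{16}\right) + (1)\left(\frac{7}{16}\right) + (2)\left(\frac{5}{16}\right) = \frac{17}{16} \approx 1.06$$

This average weights each value of X by its relative frequency.
In the long run the relative frequencies approach the probabilities,
and the resulting average is called the mean of the random variable X,
the mean of the probability distribution of X, or the mathematical
expectation / expected value of the random variable X.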
Random Variables
Definition
Let X be a random variable with probability distribution
f(x). The mean or expected value of X is

$$\mu = E(X) = \sum_x x f(x) \quad \text{if X is discrete}$$

$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx \quad \text{if X is continuous}$$
Comparison
General formula of the mean:

$$\bar{x} = \sum_{i=1}^{n} \frac{x_i}{n} = \sum_{i=1}^{n} x_i \cdot \frac{1}{n}$$

Discrete expected value:

$$\mu = E(X) = \sum_x x f(x)$$

Continuous expected value:

$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$

Example
Refer to the two-coin tossing experiment and the
probability distribution of the random variable x
(f(0) = 1/4, f(1) = 1/2, f(2) = 1/4 for fair coins).
Demonstrate that the formula for E(x) gives the mean
of the probability distribution for the discrete random
variable x.

Solution
If we were to repeat the two-coin tossing experiment a
large number of times, say 400,000 times, we would
expect to observe:
x = 0 heads approximately 100,000 times,
x = 1 head approximately 200,000 times,
x = 2 heads approximately 100,000 times.
Calculating the mean of these 400,000 values
of x, we obtain

$$\begin{aligned}
\mu \approx \frac{\sum x}{N} &= \frac{100{,}000(0) + 200{,}000(1) + 100{,}000(2)}{400{,}000} \\
&= \frac{1}{4}(0) + \frac{1}{2}(1) + \frac{1}{4}(2) \\
&= \sum_x x f(x) = 1
\end{aligned}$$

Thus, the mean of x is 1: the average number of heads per toss of
two coins is 1.
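The same calculation can be sketched in a few lines of Python. This is only an illustration: the helper name `expected_value` and the dictionary representation of a distribution are assumptions made for this sketch, not part of the lecture. Exact fractions are used so that no rounding enters the result.

```python
from fractions import Fraction

def expected_value(pmf):
    """Return E(X) = sum of x * f(x) for a discrete distribution {x: f(x)}."""
    # Guard against tables that are not valid probability distributions.
    assert sum(pmf.values()) == 1, "probabilities must sum to 1"
    return sum(x * p for x, p in pmf.items())

# Distribution of the number of heads when two fair coins are tossed.
two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
mean_heads = expected_value(two_coins)

# Relative frequencies observed in the 16-toss experiment (4, 7 and 5 times).
observed = {0: Fraction(4, 16), 1: Fraction(7, 16), 2: Fraction(5, 16)}
empirical_mean = expected_value(observed)

print(mean_heads)             # 1
print(float(empirical_mean))  # 1.0625
```

The 16-toss average (about 1.06) differs from the theoretical mean of 1 only because 16 tosses is a small sample.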
Challenge
A lot containing 7 components is sampled by a
quality inspector; the lot contains 4 good
components and 3 defective components. A
sample of 3 is taken by the inspector.
Find the expected value of the number of good
components in this sample.

Solution
Let X represent the number of good components in the
sample. The probability distribution of X is
$$f(x) = \frac{\binom{4}{x}\binom{3}{3-x}}{\binom{7}{3}}, \quad x = 0, 1, 2, 3.$$
A few simple calculations yield f(0) = 1/35, f(1) = 12/35,
f(2) = 18/35, f(3) = 4/35. Therefore

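$$\mu = E(X) = (0)\left(\frac{1}{35}\right) + (1)\left(\frac{12}{35}\right) + (2)\left(\frac{18}{35}\right) + (3)\left(\frac{4}{35}\right) = \frac{12}{7} \approx 1.7$$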
Thus, if a sample of size 3 is selected at random over and
over again from a lot of 4 good components and 3
defective components, it would contain, on average, 1.7
good components.
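As a cross-check, the distribution and its mean can be recomputed directly from the counting formula. The sketch below assumes Python 3.8 or later (for `math.comb`); the constant and variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

GOOD, DEFECTIVE, SAMPLE = 4, 3, 3  # lot composition and sample size from the example

def f(x):
    """P(X = x): probability that the sample contains exactly x good components."""
    return Fraction(comb(GOOD, x) * comb(DEFECTIVE, SAMPLE - x),
                    comb(GOOD + DEFECTIVE, SAMPLE))

pmf = {x: f(x) for x in range(SAMPLE + 1)}
mean_good = sum(x * p for x, p in pmf.items())

for x, p in pmf.items():
    print(x, p)
print(mean_good, round(float(mean_good), 2))
```

The exact mean is 12/7, which the slide rounds to 1.7.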
Challenge
Let X be the random variable that denotes the
life in hours of a certain electronic device. The
probability density function is
$$f(x) = \begin{cases} \dfrac{20{,}000}{x^3}, & x > 100 \\ 0, & \text{elsewhere} \end{cases}$$

Find the expected life of this type of device.


Solution
$$\mu = E(X) = \int_{100}^{\infty} x \, \frac{20{,}000}{x^3}\,dx = \int_{100}^{\infty} \frac{20{,}000}{x^2}\,dx = 200$$

Therefore, we can expect this type of device to last, on
average, 200 hours.
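The integral can also be approximated numerically. The sketch below is one possible approach, not the method used in the lecture: it assumes the substitution x = 1/t, which maps the infinite range x > 100 onto the finite range 0 < t < 1/100, and then applies a plain midpoint rule. The function names are hypothetical.

```python
def midpoint_integral(h, a, b, n=100_000):
    """Approximate the integral of h over [a, b] with the midpoint rule."""
    width = (b - a) / n
    return width * sum(h(a + (i + 0.5) * width) for i in range(n))

def density(x):
    """Lifetime density from the example: 20,000 / x^3 for x > 100, else 0."""
    return 20_000 / x**3 if x > 100 else 0.0

def transformed_integrand(t):
    """Integrand x * f(x) rewritten with x = 1/t, so that dx = -dt / t^2."""
    x = 1.0 / t
    return x * density(x) / t**2

# x running from 100 to infinity corresponds to t running from 1/100 down to 0.
expected_life = midpoint_integral(transformed_integrand, 0.0, 1.0 / 100)
print(round(expected_life, 6))  # close to 200
```

The midpoint rule never evaluates the integrand at t = 0, so the division by t is safe.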
Mean of g(X)
Definition
Let X be a random variable with probability distribution
f(x). The mean or expected value of the random
variable g(X) is

$$\mu_{g(X)} = E[g(X)] = \sum_x g(x) f(x) \quad \text{if X is discrete}$$

$$\mu_{g(X)} = E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx \quad \text{if X is continuous}$$
Example
Suppose that the number of cars X that pass through a car
wash between 4:00 P.M. and 5:00 P.M. on any sunny
Friday has the following probability distribution:

x        |  4     5     6     7     8     9
P(X = x) | 1/12  1/12  1/4   1/4   1/6   1/6

Let g(X) = 2X - 1 represent the amount of money, in
dollars, paid to the attendant by the manager. Find the
attendant's expected earnings for this particular time
period.
Solution
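$$\begin{aligned}
E[g(X)] &= E(2X - 1) = \sum_{x=4}^{9} (2x - 1) f(x) \\
&= (7)\left(\frac{1}{12}\right) + (9)\left(\frac{1}{12}\right) + (11)\left(\frac{1}{4}\right) + (13)\left(\frac{1}{4}\right) + (15)\left(\frac{1}{6}\right) + (17)\left(\frac{1}{6}\right) \\
&= \frac{38}{3} \approx 12.67
\end{aligned}$$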
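A short sketch of the same computation, assuming the distribution is stored as a dictionary and the pay rule as an ordinary function (both names are illustrative):

```python
from fractions import Fraction

# Probability distribution of the number of cars X from the example.
pmf = {4: Fraction(1, 12), 5: Fraction(1, 12), 6: Fraction(1, 4),
       7: Fraction(1, 4), 8: Fraction(1, 6), 9: Fraction(1, 6)}

def g(x):
    """Attendant's pay in dollars when x cars pass through."""
    return 2 * x - 1

def expected_of(func, pmf):
    """E[func(X)] = sum of func(x) * f(x) over all x."""
    return sum(func(x) * p for x, p in pmf.items())

expected_pay = expected_of(g, pmf)
print(expected_pay, round(float(expected_pay), 2))  # 38/3 12.67
```

Note that g is applied to each value of x before weighting by f(x); the distribution itself is unchanged.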
Challenge
Let X be a random variable with density function
$$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$

Find the expected value of g(X) = 4X + 3.


Solution
$$\begin{aligned}
E(4X + 3) &= \int_{-1}^{2} \frac{(4x + 3)x^2}{3}\,dx \\
&= \frac{1}{3}\int_{-1}^{2} (4x^3 + 3x^2)\,dx \\
&= 8
\end{aligned}$$
Variance
The mean denotes where the probability distribution is centered.
The mean alone does not give an adequate description of the shape
of the distribution.
We need to characterize the variability in the distribution.

Reminder: variance of a population

$$\sigma^2 = \sum_{i=1}^{N} \frac{(x_i - \mu)^2}{N}$$
Variance
Definition
Let X be a random variable with probability distribution f(x)
and mean $\mu$. The variance of X is

$$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 f(x) \quad \text{if X is discrete}$$

$$\sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx \quad \text{if X is continuous}$$

The standard deviation of X is the positive square root of the
variance of X:

$$\sigma = \sqrt{\sigma^2}$$
Example
Refer to the two-coin tossing experiment and the
probability distribution for the random variable x. Find
the variance and standard deviation of random
variable x.

Solution
If we were to repeat the two-coin tossing experiment a
large number of times, say 400,000 times, we would
expect to observe:
x = 0 heads approximately 100,000 times,
x = 1 head approximately 200,000 times,
x = 2 heads approximately 100,000 times.
Calculating the mean of these 400,000 values of x,
we obtain

$$\begin{aligned}
\mu \approx \frac{\sum x}{N} &= \frac{100{,}000(0) + 200{,}000(1) + 100{,}000(2)}{400{,}000} \\
&= \frac{1}{4}(0) + \frac{1}{2}(1) + \frac{1}{4}(2) \\
&= \sum_x x f(x) = 1
\end{aligned}$$

Thus, the mean of x is 1: the average number of heads per toss is 1.
Example (contd)
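Variance of random variable x:

$$\begin{aligned}
\sigma^2 = E[(x - \mu)^2] &= \sum_{x=0}^{2} (x - \mu)^2 p(x) \\
&= (0 - 1)^2\left(\frac{1}{4}\right) + (1 - 1)^2\left(\frac{1}{2}\right) + (2 - 1)^2\left(\frac{1}{4}\right) \\
&= \frac{1}{2}
\end{aligned}$$

Standard deviation:

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{2}} \approx 0.707$$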
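The variance and standard deviation of the two-coin distribution can be checked with a small sketch. The helper names `mean` and `variance` are assumptions of this illustration.

```python
from fractions import Fraction
from math import sqrt

# Number of heads in a toss of two fair coins.
two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def mean(pmf):
    """mu = sum of x * p(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """sigma^2 = sum of (x - mu)^2 * p(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

var_heads = variance(two_coins)
std_heads = sqrt(var_heads)  # positive square root of the variance

print(var_heads, round(std_heads, 3))  # 1/2 0.707
```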
Challenge
Let the random variable X represent the number of automobiles
that are used for official business purposes on any given workday.
The probability distribution for company A is

x    |  1    2    3
f(x) | 0.3  0.4  0.3

and for company B is

x    |  0    1    2    3    4
f(x) | 0.2  0.1  0.3  0.3  0.1

Show that the variance of the probability distribution for company
B is greater than that for company A.
Answer
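For company A, we find that

$$\mu_A = E(X) = (1)(0.3) + (2)(0.4) + (3)(0.3) = 2.0$$

$$\begin{aligned}
\sigma_A^2 &= \sum_{x=1}^{3} (x - 2)^2 f(x) \\
&= (1 - 2)^2(0.3) + (2 - 2)^2(0.4) + (3 - 2)^2(0.3) \\
&= 0.6
\end{aligned}$$

For company B, we have

$$\mu_B = E(X) = (0)(0.2) + (1)(0.1) + (2)(0.3) + (3)(0.3) + (4)(0.1) = 2.0$$

$$\begin{aligned}
\sigma_B^2 &= \sum_{x=0}^{4} (x - 2)^2 f(x) \\
&= (0 - 2)^2(0.2) + (1 - 2)^2(0.1) + (2 - 2)^2(0.3) + (3 - 2)^2(0.3) + (4 - 2)^2(0.1) \\
&= 1.6
\end{aligned}$$

Since 1.6 > 0.6, the variance for company B is greater than that for company A,
even though both distributions have the same mean.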
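The comparison between the two companies can be reproduced with a short sketch. Floating-point probabilities are used here, so the results are rounded for display; the helper names are illustrative.

```python
def mean(pmf):
    """mu = sum of x * f(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """sigma^2 = sum of (x - mu)^2 * f(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

company_a = {1: 0.3, 2: 0.4, 3: 0.3}
company_b = {0: 0.2, 1: 0.1, 2: 0.3, 3: 0.3, 4: 0.1}

var_a = variance(company_a)
var_b = variance(company_b)

# Both means are 2.0, but company B is more spread out around the mean.
print(round(mean(company_a), 4), round(var_a, 4))
print(round(mean(company_b), 4), round(var_b, 4))
```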
Challenge
Prove that the variance of a random variable X is

$$\sigma^2 = E(X^2) - \mu^2$$
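One way to approach this challenge, sketched for the discrete case only (the continuous case follows the same steps with integrals in place of sums), is to expand the square in the definition of the variance and use the facts that $\sum_x x f(x) = \mu$ and $\sum_x f(x) = 1$:

```latex
\begin{align*}
\sigma^2 &= \sum_x (x - \mu)^2 f(x) \\
         &= \sum_x \left(x^2 - 2\mu x + \mu^2\right) f(x) \\
         &= \sum_x x^2 f(x) - 2\mu \sum_x x f(x) + \mu^2 \sum_x f(x) \\
         &= E(X^2) - 2\mu \cdot \mu + \mu^2 \cdot 1 \\
         &= E(X^2) - \mu^2
\end{align*}
```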
Challenge
Let the random variable X represent the number of
defective parts for a machine when 3 parts are sampled
from a production line and tested. The following is the
probability distribution of X,

x    |  0     1     2     3
f(x) | 0.51  0.38  0.10  0.01

Calculate $\sigma^2$.
Solution

$$\mu = (0)(0.51) + (1)(0.38) + (2)(0.10) + (3)(0.01) = 0.61$$

$$E(X^2) = (0)(0.51) + (1)(0.38) + (4)(0.10) + (9)(0.01) = 0.87$$

Therefore

$$\sigma^2 = 0.87 - (0.61)^2 = 0.4979$$
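A quick sketch confirms that the shortcut formula and the definition of the variance agree for this distribution. Exact fractions are assumed here (0.51 is stored as 51/100, and so on) so that the two results can be compared for exact equality.

```python
from fractions import Fraction

# Distribution of the number of defective parts among 3 sampled parts.
pmf = {0: Fraction(51, 100), 1: Fraction(38, 100),
       2: Fraction(10, 100), 3: Fraction(1, 100)}

mu = sum(x * p for x, p in pmf.items())       # E(X)
e_x2 = sum(x**2 * p for x, p in pmf.items())  # E(X^2)

var_shortcut = e_x2 - mu**2                                      # E(X^2) - mu^2
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())  # E[(X - mu)^2]

print(float(mu), float(e_x2), float(var_shortcut))
print(var_shortcut == var_definition)
```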
Challenge
The weekly demand for Pepsi, in thousands of liters,
from a local chain of efficiency stores, is a continuous
random variable X having the probability density

$$f(x) = \begin{cases} 2(x - 1), & 1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$

Find the mean and variance of X.


Solution
$$\mu = E(X) = 2\int_{1}^{2} x(x - 1)\,dx = \frac{5}{3}$$

$$E(X^2) = 2\int_{1}^{2} x^2(x - 1)\,dx = \frac{17}{6}$$

Therefore

$$\sigma^2 = \frac{17}{6} - \left(\frac{5}{3}\right)^2 = \frac{1}{18}$$
Covariance
Covariance: a measure of the nature of the
association between two dependent random
variables X and Y.
If the covariance is positive (+), X tends to increase as Y increases.
If the covariance is negative (-), X tends to decrease as Y increases.
If X and Y are statistically independent,
then the covariance equals 0.
Covariance of X and Y:

$$\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] \quad \text{or} \quad \sigma_{XY} = E(XY) - \mu_X \mu_Y$$



Example
Given the following joint probability distribution, find the
covariance of X and Y:

                      x
f(x,y)      |   0      1      2    | Row totals
y = 0       |  3/28   9/28   3/28  |  15/28
y = 1       |  6/28   6/28   0     |  12/28
y = 2       |  1/28   0      0     |  1/28
Col. totals | 10/28  15/28   3/28  |  1

First: find the means of X and Y.

Example (continued)
First: find the means of X and Y, using the marginal distributions
g(x) (column totals) and h(y) (row totals) of the joint table.

$$\begin{aligned}
\mu_X = E(X) &= \sum_{x=0}^{2}\sum_{y=0}^{2} x f(x, y) = \sum_{x=0}^{2} x\, g(x) \\
&= 0(10/28) + 1(15/28) + 2(3/28) = 3/4
\end{aligned}$$

$$\begin{aligned}
\mu_Y = E(Y) &= \sum_{x=0}^{2}\sum_{y=0}^{2} y f(x, y) = \sum_{y=0}^{2} y\, h(y) \\
&= 0(15/28) + 1(12/28) + 2(1/28) = 14/28 = 1/2
\end{aligned}$$


Example (continued)
Now find the covariance of X and Y.

$$\begin{aligned}
E(XY) &= \sum_{x=0}^{2}\sum_{y=0}^{2} x y f(x, y) \\
&= (0)(0)(3/28) + (0)(1)(6/28) + (0)(2)(1/28) \\
&\quad + (1)(0)(9/28) + (1)(1)(6/28) + (2)(0)(3/28) \\
&= 6/28 = 3/14
\end{aligned}$$

$$\sigma_{XY} = E(XY) - \mu_X \mu_Y = 3/14 - (3/4)(1/2) = -9/56$$
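The covariance computation for this joint table can be sketched as follows. The dictionary layout keyed by (x, y) and the helper `expectation` are assumptions of this illustration; cells that are blank in the table are treated as probability zero and simply omitted.

```python
from fractions import Fraction

# Joint distribution f(x, y) from the example; blank table cells are zero.
joint = {
    (0, 0): Fraction(3, 28), (1, 0): Fraction(9, 28), (2, 0): Fraction(3, 28),
    (0, 1): Fraction(6, 28), (1, 1): Fraction(6, 28),
    (0, 2): Fraction(1, 28),
}

def expectation(func, joint):
    """E[func(X, Y)] = sum of func(x, y) * f(x, y) over all (x, y)."""
    return sum(func(x, y) * p for (x, y), p in joint.items())

mu_x = expectation(lambda x, y: x, joint)
mu_y = expectation(lambda x, y: y, joint)
e_xy = expectation(lambda x, y: x * y, joint)
cov_xy = e_xy - mu_x * mu_y

print(mu_x, mu_y, e_xy, cov_xy)  # 3/4 1/2 3/14 -9/56
```

The negative sign indicates that larger values of X tend to occur with smaller values of Y in this table.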
Correlation Coefficient
Note that the value of the covariance does not indicate the strength of the
relationship, since it depends on the scale of measurement. The correlation
coefficient is a scale-free version.
Let X and Y be random variables with standard deviations $\sigma_X$
and $\sigma_Y$, and covariance $\sigma_{XY}$. The correlation coefficient of X
and Y is:

$$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}, \qquad -1 \le \rho_{XY} \le 1$$

If $\sigma_{XY} = 0$, then $\rho_{XY} = 0$.
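As an illustration, the correlation coefficient can be computed for the joint distribution used in the covariance example. The variances of X and Y are not worked out on the slides; the sketch below derives them from the same table with the shortcut formula, and the helper name `expectation` is an assumption.

```python
from fractions import Fraction
from math import sqrt

# Joint distribution f(x, y) from the covariance example; blank cells are zero.
joint = {
    (0, 0): Fraction(3, 28), (1, 0): Fraction(9, 28), (2, 0): Fraction(3, 28),
    (0, 1): Fraction(6, 28), (1, 1): Fraction(6, 28),
    (0, 2): Fraction(1, 28),
}

def expectation(func):
    """E[func(X, Y)] over the joint distribution."""
    return sum(func(x, y) * p for (x, y), p in joint.items())

mu_x = expectation(lambda x, y: x)
mu_y = expectation(lambda x, y: y)
var_x = expectation(lambda x, y: x * x) - mu_x**2   # sigma_X^2
var_y = expectation(lambda x, y: y * y) - mu_y**2   # sigma_Y^2
cov_xy = expectation(lambda x, y: x * y) - mu_x * mu_y

# rho = sigma_XY / (sigma_X * sigma_Y)
rho_xy = float(cov_xy) / sqrt(var_x * var_y)
print(var_x, var_y, round(rho_xy, 4))
```

The result is about -0.447, a moderate negative association that does not depend on the units of X or Y.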
Real Implementation
The correlation coefficient can be used in template-matching
algorithms for pattern recognition or object tracking.
Find the position of a small image patch in
sequential frames. The best match is the position with the
maximum value of the correlation coefficient.
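A toy sketch of this idea in pure Python is given below. It is not the implementation used in the lecture: the functions `correlation` and `match_template`, the tiny 5x6 "frame", and the choice to score constant windows as 0 are all assumptions made for illustration. Real systems would typically rely on an optimized image-processing library instead.

```python
from math import sqrt

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a flat (constant) window counts as "no match"
    return cov / sqrt(var_a * var_b)

def match_template(image, template):
    """Return ((row, col), score) of the window that correlates best with the template."""
    th, tw = len(template), len(template[0])
    flat_template = [v for row in template for v in row]
    best_pos, best_score = None, -2.0  # any real score is at least -1
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            window = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            score = correlation(window, flat_template)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score

# Toy frame: a 2x2 patch placed at row 2, column 3 on a flat background.
frame = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 0, 3, 4, 0],
    [0, 0, 0, 0, 0, 0],
]
patch = [[1, 2], [3, 4]]

position, score = match_template(frame, patch)
print(position, score)
```

Because the correlation coefficient is scale-free, the same position would still score highest if the patch in the frame were uniformly brighter or had higher contrast than the template.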
Means and Variances of Linear Combinations of
Random Variables

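1. $E(aX + b) = aE(X) + b$, where a and b are constants

2. $E[g(X) \pm h(X)] = E[g(X)] \pm E[h(X)]$

3. $E[g(X, Y) \pm h(X, Y)] = E[g(X, Y)] \pm E[h(X, Y)]$

4. $E(XY) = E(X)E(Y)$, if X and Y are independent

5. $\sigma^2_{aX+b} = a^2\sigma^2_X = a^2\sigma^2$

6. $\sigma^2_{aX+bY} = a^2\sigma^2_X + b^2\sigma^2_Y + 2ab\,\sigma_{XY}$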
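Properties 1 and 6 can be checked numerically on the joint distribution from the covariance example. The sketch below is only a spot check for one arbitrary choice of constants (a = 2, b = 3), not a proof; the helper names are assumptions.

```python
from fractions import Fraction

# Joint distribution f(x, y) from the covariance example; blank cells are zero.
joint = {
    (0, 0): Fraction(3, 28), (1, 0): Fraction(9, 28), (2, 0): Fraction(3, 28),
    (0, 1): Fraction(6, 28), (1, 1): Fraction(6, 28),
    (0, 2): Fraction(1, 28),
}

def expectation(func):
    """E[func(X, Y)] over the joint distribution."""
    return sum(func(x, y) * p for (x, y), p in joint.items())

def variance(func):
    """Var[func(X, Y)] computed from the definition E[(Z - E(Z))^2]."""
    mu = expectation(func)
    return expectation(lambda x, y: (func(x, y) - mu) ** 2)

a, b = 2, 3  # arbitrary constants chosen for illustration

mu_x = expectation(lambda x, y: x)
mu_y = expectation(lambda x, y: y)
var_x = variance(lambda x, y: x)
var_y = variance(lambda x, y: y)
cov_xy = expectation(lambda x, y: x * y) - mu_x * mu_y

# Property 1: E(aX + b) = a E(X) + b
lhs_mean = expectation(lambda x, y: a * x + b)
rhs_mean = a * mu_x + b

# Property 6: Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y)
direct = variance(lambda x, y: a * x + b * y)
by_formula = a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy

print(lhs_mean == rhs_mean, direct, by_formula)
```

Here the covariance is negative, so the variance of 2X + 3Y is smaller than the sum of the two scaled variances.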
Chebyshev Theorem
Note: The late P.L. Chebyshev (1821-1894) was a Russian mathematician

For any set of data with mean $\bar{x}$ and standard
deviation s, at least 75% of the values lie in the
interval $\bar{x} \pm 2s$, and at least 89% of them lie in
the interval $\bar{x} \pm 3s$.
Variability in pictures
[Figure: distribution of discrete data. (a) A distribution with a greater standard deviation. (b) A distribution with a smaller standard deviation.]

[Figure: distribution of continuous data.] The area under the curve is 1,
so the area between any two numbers is the probability
of the random variable assuming a value between these numbers.
Chebyshev's Theorem for a Random Variable
The probability that any random variable X
will assume a value within k standard
deviations of the mean is at least $1 - 1/k^2$:

$$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}$$

For k = 2, X lies in the interval $\mu \pm 2\sigma$ with probability at least 3/4.
For k = 3, X lies in the interval $\mu \pm 3\sigma$ with probability at least 8/9.
Example
A random variable X has a mean $\mu = 8$, a variance $\sigma^2 = 9$,
and an unknown probability distribution. Find
(a) $P(-4 < X < 20)$
(b) $P(|X - 8| \ge 6)$

Solution

$$P(-4 < X < 20) = P[8 - (4)(3) < X < 8 + (4)(3)] \ge \frac{15}{16}$$

$$\begin{aligned}
P(|X - 8| \ge 6) &= 1 - P(|X - 8| < 6) \\
&= 1 - P(-6 < X - 8 < 6) \\
&= 1 - P[8 - (2)(3) < X < 8 + (2)(3)] \le \frac{1}{4}
\end{aligned}$$

The theorem gives bounds only; it does not tell you the exact probability distribution.
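The bounds in this example follow mechanically from the value of k, as the sketch below shows. The function name `chebyshev_lower_bound` is a hypothetical helper introduced for illustration.

```python
from fractions import Fraction

def chebyshev_lower_bound(k):
    """Lower bound 1 - 1/k^2 on P(mu - k*sigma < X < mu + k*sigma), for k > 0."""
    return 1 - 1 / Fraction(k) ** 2

mu, sigma = 8, 3  # mean and standard deviation from the example (variance 9)

# (a) The interval (-4, 20) is mu +/- k*sigma with k = (20 - 8) / 3 = 4.
k_a = Fraction(20 - mu, sigma)
part_a_lower = chebyshev_lower_bound(k_a)

# (b) |X - 8| >= 6 is the complement of mu +/- k*sigma with k = 6 / 3 = 2.
k_b = Fraction(6, sigma)
part_b_upper = 1 - chebyshev_lower_bound(k_b)

print(chebyshev_lower_bound(2), chebyshev_lower_bound(3))  # 3/4 8/9
print(part_a_lower, part_b_upper)                          # 15/16 1/4
```

These are guarantees that hold for every distribution with this mean and variance; the actual probabilities for a specific distribution may be much closer to 1 (or to 0 for the tail in part b).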
NEXT: DISCRETE PROBABILITY DISTRIBUTIONS
Covariance
The covariance of two random variables X and Y with
means $\mu_X$ and $\mu_Y$, respectively, is given by

$$\sigma_{XY} = E(XY) - \mu_X \mu_Y$$

Note
Nature of association between two random variables
The sign of covariance indicates whether the relationship
between two dependent random variables is positive or
negative
When X and Y are statistically independent, it can be
shown that the covariance is zero
Correlation coefficient
Definition
Let X and Y be random variables with covariance $\sigma_{XY}$
and standard deviations $\sigma_X$ and $\sigma_Y$, respectively. The
correlation coefficient of X and Y is

$$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$

and satisfies

$$-1 \le \rho_{XY} \le 1$$
