Jonathan Marchini
Continuous data
In previous lectures we have considered discrete
datasets and discrete probability distributions. In practice many datasets that we collect from experiments
consist of continuous measurements.
So we need to study probability models for continuous
data.
[Figure: histograms of four example continuous datasets, each with frequency on the vertical axis; the recoverable horizontal axis labels are petal length and serum level.]
[Figure: a discrete probability distribution, P(X) plotted against X, and a continuous density curve, density plotted against X.]
[Figure: four Normal density curves, density plotted against X from 50 to 150, with parameters (μ = 100, σ = 10), (μ = 130, σ = 10), (μ = 100, σ = 5) and (μ = 100, σ = 15).]
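[Figure: standard Normal density with the area P(Z < 0) shaded.]
By symmetry, P(Z < 0) = 0.5.
[Figure: standard Normal density with the area P(Z < 1) shaded.]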
Calculating this area is not easy and so we use probability tables. Probability tables are tables of probabilities that have been calculated on a computer. All we
have to do is identify the right probability in the table
and copy it down!
Only one special Normal distribution, N(0, 1), has
been tabulated.
The N(0, 1) distribution is called
the standard Normal distribution.
Table of P(Z < z) for the standard Normal distribution. Rows give z to one decimal place; columns give the second decimal place.

z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0   0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
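The tabulated probabilities can be regenerated on a computer. The sketch below is one possible way to do this in Python; it assumes the standard identity P(Z < z) = (1 + erf(z / sqrt(2))) / 2, and the function names `phi` and `table_row` are illustrative choices, not part of the lecture.

```python
import math

def phi(z):
    """Standard Normal CDF, P(Z < z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def table_row(row_z):
    """One table row: P(Z < z) for z = row_z + 0.00, ..., row_z + 0.09."""
    return [round(phi(row_z + k / 100), 4) for k in range(10)]

print(phi(0.0))         # 0.5, the symmetry result P(Z < 0) = 0.5
print(table_row(0.9))   # the row of the table labelled 0.9
```

Each printed row can be compared entry by entry with the corresponding row of the table.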
Example 1
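If $Z \sim N(0, 1)$, what is $P(Z > 0.92)$?
[Figure: standard Normal density with 0.92 marked.]
From the table, $P(Z < 0.92) = 0.8212$, so
$P(Z > 0.92) = 1 - P(Z < 0.92) = 1 - 0.8212 = 0.1788$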
Example 2
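If $Z \sim N(0, 1)$, what is $P(Z > -0.5)$?
[Figure: standard Normal densities with -0.5 and 0.5 marked.]
By symmetry
$P(Z > -0.5) = P(Z < 0.5) = 0.6915$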
Example 3
If $Z \sim N(0, 1)$, what is $P(Z < -0.76)$?
[Figure: standard Normal densities with -0.76 and 0.76 marked.]
By symmetry
$P(Z < -0.76) = P(Z > 0.76) = 1 - P(Z < 0.76)$
$= 1 - 0.7764$
$= 0.2236$
Example 4
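If $Z \sim N(0, 1)$, what is $P(-0.64 < Z < 0.43)$?
[Figure: the area between -0.64 and 0.43 under the standard Normal density, shown as the area below 0.43 minus the area below -0.64.]
$P(-0.64 < Z < 0.43) = P(Z < 0.43) - P(Z < -0.64)$
$= 0.6664 - (1 - 0.7389)$
$= 0.4053$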
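The table-based answers to Examples 1, 3 and 4 can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; it takes the arguments in Examples 3 and 4 to be the negative values -0.76 and -0.64, as the symmetry argument implies. The helper names `upper_tail` and `between` are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # the standard Normal distribution N(0, 1)

def upper_tail(z):
    """P(Z > z) = 1 - P(Z < z)."""
    return 1.0 - Z.cdf(z)

def between(a, b):
    """P(a < Z < b) = P(Z < b) - P(Z < a)."""
    return Z.cdf(b) - Z.cdf(a)

p_example_1 = upper_tail(0.92)        # table answer: 0.1788
p_example_3 = Z.cdf(-0.76)            # table answer: 0.2236
p_example_4 = between(-0.64, 0.43)    # table answer: 0.4053

for name, p in [("Example 1", p_example_1),
                ("Example 3", p_example_3),
                ("Example 4", p_example_4)]:
    print(name, round(p, 4))
```

Small differences in the fourth decimal place are expected, because the table entries are themselves rounded.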
Example 5
Consider $P(Z < 0.567)$.
From the tables we know that $P(Z < 0.56) = 0.7123$ and $P(Z < 0.57) = 0.7157$.
To calculate $P(Z < 0.567)$ we interpolate linearly between these two values:
$P(Z < 0.567) \approx 0.3 \times 0.7123 + 0.7 \times 0.7157 = 0.71468$
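The interpolation in Example 5 can be written as a small function. This is a minimal sketch; the function name `interpolate` and its argument names are illustrative. The weight on the upper table entry is the fraction of the way that z lies between the two tabulated points, which is 0.7 here.

```python
def interpolate(z, z_lo, p_lo, z_hi, p_hi):
    """Linear interpolation between two adjacent table entries."""
    weight_hi = (z - z_lo) / (z_hi - z_lo)   # 0.7 for z = 0.567
    return (1.0 - weight_hi) * p_lo + weight_hi * p_hi

p = interpolate(0.567, 0.56, 0.7123, 0.57, 0.7157)
print(round(p, 5))
```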
Standardization
All of the probabilities above were calculated for the
standard Normal distribution N(0, 1). If we want
to calculate probabilities from different Normal
distributions we convert the probability to one
involving the standard Normal distribution.
This process is called standardization.
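[Figure: standardizing a Normal distribution in two steps. Subtracting the mean 3 gives N(0, 4), and dividing by the standard deviation 2 gives N(0, 1); the point 6.2 is mapped to 3.2 and then to 1.6.]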
If $X \sim N(\mu, \sigma^2)$ and $Z = \frac{X - \mu}{\sigma}$
then
$Z \sim N(0, 1)$
Example 6
Suppose we know that the birth weight of babies is
Normally distributed with mean 3500g and standard
deviation 500g. What is the probability that a baby is
born that weighs less than 3100g?
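That is, $X \sim N(3500, 500^2)$ and we want to calculate $P(X < 3100)$.
We can calculate the probability through the process of standardization.
[Figure: the density of $X$ with the area below 3100 shaded, and the density of $Z \sim N(0, 1)$ with the area below $(3100 - 3500)/500 = -0.8$ shaded.]
$P(X < 3100) = P\left(\frac{X - 3500}{500} < \frac{3100 - 3500}{500}\right)$
$= P(Z < -0.8)$ where $Z \sim N(0, 1)$
$= 1 - P(Z < 0.8)$
$= 1 - 0.7881$
$= 0.2119$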
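The standardization step in Example 6 can be sketched in Python as follows. The function name `standardize` is an illustrative choice; `statistics.NormalDist` is used only to evaluate the Normal probabilities in place of the table.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Convert a value of X ~ N(mu, sigma^2) to the standard Normal scale."""
    return (x - mu) / sigma

z = standardize(3100, mu=3500, sigma=500)         # (3100 - 3500) / 500 = -0.8
p_standard = NormalDist().cdf(z)                  # P(Z < -0.8)
p_direct = NormalDist(3500, 500).cdf(3100)        # P(X < 3100) without standardizing

print(z, round(p_standard, 4), round(p_direct, 4))
```

Both routes give the same probability, which is the point of standardization: one table for N(0, 1) is enough for every Normal distribution.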
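$X \sim N(80, 10^2)$
$Y \sim N(78, 13^2)$
In general, if $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ are independent, then
$X - Y \sim N(\mu_1 - \mu_2, \sigma_1^2 + \sigma_2^2)$
In this example,
$D = X - Y \sim N(80 - 78, 10^2 + 13^2) = N(2, 269)$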
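[Figure: the density of $D$ with the area $P(D < 0)$ shaded, and the density of $Z \sim N(0, 1)$ with the area $P(Z < -0.122)$ shaded, where $Z = (D - 2)/\sqrt{269}$, $\sqrt{269} \approx 16.40$ and $(0 - 2)/16.40 = -0.122$.]
$P(D < 0) = P\left(\frac{D - 2}{\sqrt{269}} < \frac{0 - 2}{\sqrt{269}}\right)$
$= P(Z < -0.122)$ where $Z \sim N(0, 1)$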
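The calculation for the difference D = X - Y can be sketched numerically. The code below assumes that X and Y are independent, so that the variances add, and evaluates the resulting probability with `statistics.NormalDist` rather than by interpolating in the table. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_x, sigma_x = 80, 10   # X ~ N(80, 10^2)
mu_y, sigma_y = 78, 13   # Y ~ N(78, 13^2)

# Difference of independent Normals: means subtract, variances add.
mu_d = mu_x - mu_y                       # 2
var_d = sigma_x**2 + sigma_y**2          # 269
sigma_d = math.sqrt(var_d)               # about 16.40

z = (0 - mu_d) / sigma_d                 # about -0.122
p_d_negative = NormalDist(mu_d, sigma_d).cdf(0)   # P(D < 0)

print(round(sigma_d, 2), round(z, 3), round(p_d_negative, 4))
```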
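Similarly, if $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ are independent, then
$X + Y \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$
and, for a constant $a$,
$aX \sim N(a\mu_1, a^2\sigma_1^2)$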
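[Figure: the density of $X \sim N(45, 400)$ with the upper-tail area $P(X > x) = 0.2$ shaded, and the density of $Z \sim N(0, 1)$ with the corresponding point $(x - 45)/20 = 0.84$ marked.]
$P(X > x) = 0.2$
$P(X < x) = 0.8$
$P\left(Z < \frac{x - 45}{20}\right) = 0.8$
From the tables, $P(Z < 0.84) \approx 0.8$, so
$\frac{x - 45}{20} \approx 0.84$
$x \approx 45 + 20 \times 0.84 = 61.8$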
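Finding x from a probability is the table lookup run in reverse. A minimal sketch, using the `inv_cdf` method of `statistics.NormalDist` in place of scanning the table for the entry closest to 0.8:

```python
from statistics import NormalDist

mu, sigma = 45, 20          # X ~ N(45, 400), so the standard deviation is 20

z = NormalDist().inv_cdf(0.8)    # the z with P(Z < z) = 0.8, about 0.8416
x = mu + sigma * z               # undo the standardization

x_table = mu + sigma * 0.84      # the same step with the table value z = 0.84

print(round(z, 4), round(x, 2), round(x_table, 1))
```

The computed x differs slightly from 61.8 because the table only resolves z to two decimal places.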
[Figure: the probabilities P(X = x) for X ~ Bin(300, 0.5), plotted for x from 100 to 200, next to a Normal density curve over the same range.]
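In general
If $X \sim \mathrm{Bin}(n, p)$ then
$\mu = np$
$\sigma^2 = npq$
where $q = 1 - p$
and $X$ is approximately $N(\mu, \sigma^2)$, provided $n$ is large and $p$ is not too far away from $1/2$.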
Example
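Suppose $X \sim \mathrm{Bin}(12, 0.5)$. What is $P(4 \le X \le 7)$?
For this distribution we have
$\mu = np = 6$
$\sigma^2 = npq = 3$
[Figure: bar chart of the Bin(12, 0.5) probabilities; with the continuity correction, the bars for X = 4, ..., 7 cover the interval from 3.5 to 7.5.]
$P(4 \le X \le 7) \approx P(3.5 < X < 7.5) = P\left(\frac{3.5 - 6}{\sqrt{3}} < \frac{X - 6}{\sqrt{3}} < \frac{7.5 - 6}{\sqrt{3}}\right)$
$= P(-1.44 < Z < 0.87) \approx 0.73$ where $Z \sim N(0, 1)$
The exact answer is 0.733 so in this case the approximation is very good.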
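The comparison between the exact Binomial probability and its Normal approximation can be reproduced with a short script. This is a sketch under the assumptions of the example: X ~ Bin(12, 0.5), and a continuity correction that replaces 4 ≤ X ≤ 7 by the interval from 3.5 to 7.5. The variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 12, 0.5
q = 1 - p

# Exact Binomial probability P(4 <= X <= 7).
exact = sum(math.comb(n, k) * p**k * q**(n - k) for k in range(4, 8))

# Normal approximation with mean np, variance npq and continuity correction.
mu = n * p
sigma = math.sqrt(n * p * q)
approx_dist = NormalDist(mu, sigma)
approx = approx_dist.cdf(7.5) - approx_dist.cdf(3.5)

print(round(exact, 3), round(approx, 3))
```

Changing `n` and `p` shows how the quality of the approximation depends on the sample size and on how far p is from 1/2.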
Example
A radioactive source emits particles at an average rate
of 25 particles per second. What is the probability that
in 1 second the count is less than 27 particles?
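X = number of particles emitted in 1 s, $X \sim \mathrm{Po}(25)$
Approximately, $X \sim N(25, 25)$. Using the continuity correction, a count of less than 27 corresponds to $X < 26.5$.
[Figure: the density of $X \sim N(25, 25)$ with the area below 26.5 shaded, and the density of $Z \sim N(0, 1)$ with the area below $(26.5 - 25)/5 = 0.3$ shaded.]
$P(X < 26.5) = P\left(\frac{X - 25}{5} < \frac{26.5 - 25}{5}\right)$
$= P(Z < 0.3)$ where $Z \sim N(0, 1)$
$= 0.6179$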
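The Normal approximation in this example can be compared with the exact Poisson probability. The sketch below sums the Poisson probabilities for counts 0 to 26 directly; the helper name `poisson_cdf` is illustrative and is not taken from any library.

```python
import math
from statistics import NormalDist

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam), summing the probabilities term by term."""
    term = math.exp(-lam)      # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i        # P(X = i) from P(X = i - 1)
        total += term
    return total

lam = 25
exact = poisson_cdf(26, lam)                       # P(X < 27) = P(X <= 26)
approx = NormalDist(lam, math.sqrt(lam)).cdf(26.5) # N(25, 25) with continuity correction

print(round(exact, 4), round(approx, 4))
```

The two values agree to roughly one or two decimal places, which gives a sense of how much accuracy the approximation costs at this rate.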