
CE-564

Probability and Statistics


Fall 2018

Lecture-1
COURSE OVERVIEW
Major Knowledge Areas
 PROBABILITY
 Probability: Concepts of Probability and their relevance to
statistical analysis,
 Probability distributions relevant to transportation data analysis
 INFERENTIAL STATISTICS
 Data Collection: Survey planning and design
 Statistical distributions, confidence intervals, hypothesis testing
 CAUSAL STATISTICS
 ANOVA
 Regression analysis
Course philosophy; Basic theme and Concepts
 The elements in probability allow us
to draw conclusions about
characteristics of hypothetical data
taken from the population, based on
known features of the population.
 This type of reasoning is deductive in nature.
Course philosophy; Basic theme and
Concepts
• For a statistical problem, the sample along with inferential statistics allows us to draw conclusions about the population, with inferential statistics making clear use of elements of probability.
• This reasoning is inductive in nature.
Week  Topics to be covered
1   Course philosophy; basic theme and concepts
2   Probability: concepts of probability and their relevance to statistical analysis
3   Probability distributions relevant to transportation data analysis
4   Probability distribution types
5   Probability distribution types: problems
6   Data collection: survey planning and design
7   Data collection: statistical concepts; problems
8   Traffic survey practice, inventory surveys, transport usage surveys, travel time and congestion surveys (Test One)
9   Matrix surveys, questionnaires and interviews, sources and use of secondary data; project description
10  Statistics: summary measures
11  Statistical distributions, confidence intervals, hypothesis testing
12  Contingency tables, correlation and linear regression
13  ANOVA: basic concepts
14  ANOVA: applications
15  Multivariate analysis
16  Presentations; course conclusion
The background
FIGURE 1 — The history of road fatalities
COMMENTARY: There was a steady increase in the per capita road fatality
rate, with the exception of the Great Depression and the Second World War, until 1970. Since 1970,
the toll has trended downwards, although it has recently stalled.
Course philosophy; Basic theme and
Concepts
 The engineering approach
Probability
When we know the underlying model that governs an experiment, we use probability to determine the chance that different outcomes will occur.
• For example, if we flip a fair coin 3 times, what is the probability of obtaining 3 heads?
• By definition, probability values are between 0 and 1.
• What does it mean if Outcome A of an experiment has a probability of 1/3 of occurring?
• If the experiment is repeated a large number of times, Outcome A will occur 1/3 of the time.
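The frequency interpretation above can be checked by simulation. A minimal sketch (the function name, trial count, and seed are illustrative, not from the slides):

```python
import random

def estimate_three_heads(trials=100_000, seed=42):
    """Estimate P(3 heads in 3 fair coin flips) by repeated simulation."""
    rng = random.Random(seed)
    # Count trials in which all three flips come up heads.
    hits = sum(
        all(rng.random() < 0.5 for _ in range(3))
        for _ in range(trials)
    )
    return hits / trials

# The exact answer is (1/2)^3 = 0.125; the estimate converges to it.
print(estimate_three_heads())
```

With 100,000 repetitions the estimate lands within about 0.003 of the exact value 1/8.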
Statistics
Data analysis, random variables, stochastic processes, probability, statistical modelling, inference, time series, reliability, multivariate methods, SPC, …
- everywhere in modern engineering
- workplaces, applications, research
Maths
• Mathematical thinking is the lifeblood of engineering
 - engineering needs the most technical maths, faster than any other discipline, and
 - engineering needs the most generic maths skills, faster than any other discipline
• Maths is like language
 - specific and generic skills become part of the person
 - people forget how they acquired such skills
 - transferability needs more than the specifics required
• Maths fitness is like physical fitness
 - underpins the development of field-specific skills
 - necessary but not sufficient for excellence in specific fields
Probability and Statistics
In probability, we use our knowledge of the underlying model to determine the probability that different outcomes will occur.
• How does statistics compare to probability?
 - In statistics, we don't know the underlying model governing an experiment.
 - All we get to see is a sample of some outcomes of the experiment.
 - We use that sample to try to make inferences about the underlying model governing the experiment.
 - So a thorough understanding of probability is essential to understanding statistics.
Probability and Statistics
 Collect, organize, and display data
 Use appropriate statistical methods to analyze data
 Develop inferences and predictions that are based on data
 Understand and apply basic concepts of probability
Probability and Statistics-Example
• Suppose for a manufacturing process we have an upper limit of 5% defective items produced for the process to be "in control".
• We take a sample of 100 items produced and find 10 defective items. Is the manufacturing process in control?
• One way to look at this is to ask: "if the process has 5% defective items, what is the probability that there will be 10 or more defective items in a sample of size 100?"
• The probability of this outcome is called a P-value.
• In this case the P-value is only 0.0282. What does this mean?
• That this outcome would occur by chance only 2.82% of the time.
• So what is the definition of a P-value?
• The P-value is the probability of getting the measured outcome, or one more extreme, if the assumed underlying model were true.
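The quoted P-value can be reproduced exactly from the binomial distribution. A sketch using only the standard library (`binom_tail` is an illustrative helper name, not from the slides):

```python
from math import comb

def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# P-value from the slide: 10 or more defectives in a sample of 100
# when the true defective rate is 5%.
p_value = binom_tail(100, 0.05, 10)
print(round(p_value, 4))  # ≈ 0.0282, matching the slide
```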
Statistics
• Science of data collection, summarization, presentation and analysis for better decision making.
 - How do we collect data?
 - How do we summarize it?
 - How do we present it?
 - How do we analyze it and make conclusions and correct decisions?
Role of Statistics
• Many aspects of engineering deal with data, e.g. product and process design
• Identify sources of variability
• Essential for decision making
Data Collection
• Observational study
 - observe the system
 - historical data
 - the objective is to build a system model, usually called an empirical model
• Design of experiments
 - plays a key role in engineering design
Data Collection
• Sources of data collection:
 - observation of something you cannot control (observe a system, or use historical data)
 - conducting an experiment
 - surveying opinions of people on a social problem
Statistics
• Divided into:
 - Descriptive Statistics
 - Inferential Statistics
Forms of Data Description
 - Point summary
 - Tabular format
 - Graphical format
 - Diagrams
The application
 FIGURE 2 — Rural risk
The projections
 FIGURE 3 — Newer vehicles are safer
Vehicle safety standards and vehicle design will
be improved to further increase the
protection provided to occupants and minimise
the hazard to non-occupants struck by
a vehicle. This will include designing vehicles so
that they cause less damage to other
vehicles and road users in a crash.
Statistical analysis
 FIGURE 4 — Recent trends in fatalities among
vulnerable road users
Demand Modelling on rail Lines, Stations and Trains

Determining passenger demand

FIGURE — Passengers getting on and getting off per time period for the same station on the same line (Madrid, C1, path 0) and the same day type (L), with polynomial curve fits of the boarding and alighting profiles. Fit quality varies widely across panels: for example, an order-6 polynomial fit of the getting-on profile gives R² = 0.7561 while an order-4 fit of the getting-off profile gives only R² = 0.2073; other panels show R² values of 0.6332, 0.8267, 0.1521 and 0.2785.

Other curve fits of the same profile:
• Polynomial O(5): r² = 0.83
• Chebyshev O(5): r² = 0.85
• Chebyshev O(20): r² = 0.87

Reference: R. Kelly (2007). "The generation of profiles by formulae". Traffic Engineering & Control, Vol. 48, No. 8, pp. 368-371.
Lect-2
How do we define Probability?

• Definition of probability by example:
 Suppose we have N trials and a specified event occurs r times.
 example: the trial could be rolling a die and the event could be rolling a 6.
 Define the probability P of an event E occurring as:
  P(E) = r/N as N → ∞
 examples:
  six-sided die: P(6) = 1/6
  for an honest die: P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
  coin toss: P(heads) = P(tails) = 0.5
  P(heads) should approach 0.5 the more times you toss the coin.
  For a single coin toss we can never get P(heads) = 0.5!
• By definition, probability P is a non-negative real number bounded by 0 ≤ P ≤ 1:
 if P = 0, the event never occurs; if P = 1, the event always occurs.
• Example: let A = {1, 2, 3} and B = {1, 3, 5} be subsets of a sample space S; then P(A) ≥ 0, P(B) ≥ 0,
 intersection: A∩B = {1, 3}    union: A∪B = {1, 2, 3, 5}
• Events are independent if: P(A∩B) = P(A)P(B)
 Coin tosses are independent events: the result of the next toss does not depend on the previous toss.
• Events are mutually exclusive (disjoint) if: P(A∩B) = 0, in which case P(A∪B) = P(A) + P(B)
 In tossing a coin we either get a head or a tail.
• The sum (or integral) of all mutually exclusive probabilities must equal 1.
27 P416 Lecture 1 R.Kass/Sp07
Example- Highway construction
Highway construction in a remote area is dependent on the availability
of construction workers in the area and on the weather
conditions. Suppose that event A represents “availability of
construction workers” and event B represents “favorable weather
conditions”, and that these events are independent of each other.
Previous data from the area indicates that P(A) = 0.8 and P(B) =
0.75. Suppose that we need to determine the probability of “availability
of construction workers or favorable weather conditions”, i.e.,
P(A∪B). We will use the formula
P(A∪B) = P(A) + P(B) - P(A∩B).
Highway construction-Solution
P(A∪B) = P(A) + P(B) - P(A∩B), and since events A and B are
independent, we can also write
P(A∩B) = P(A)P(B) = 0.8 × 0.75 = 0.60. Thus,
P(A∪B) = P(A) + P(B) - P(A)P(B)
= 0.8 + 0.75 - 0.8 × 0.75 = 0.95.
Notice that the probability P(A∩B) = P("availability of construction
workers AND favorable weather conditions") is only 60%, i.e.,
construction will be possible only 60% of the time.
Example Highway Construction
The result P(A∪B) = 0.95 simply indicates that at least one of the two conditions
for highway construction (availability of construction workers OR favorable
weather conditions, but not necessarily both) is satisfied 95% of the time.
From the point of view of predicting the ability to carry out construction
activities, the probability P(A∩B) = 0.60 is the more important one.
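The union calculation for independent events can be packaged in a few lines. A minimal sketch (the function and variable names are illustrative):

```python
def union_prob(p_a, p_b):
    """P(A ∪ B) = P(A) + P(B) - P(A)P(B) for independent events A, B."""
    return p_a + p_b - p_a * p_b

# Slide values: P(A) = 0.8 (workers available), P(B) = 0.75 (good weather)
p_workers, p_weather = 0.8, 0.75
print(round(union_prob(p_workers, p_weather), 2))  # P(A ∪ B) = 0.95
print(round(p_workers * p_weather, 2))             # P(A ∩ B) = 0.6
```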
Example- Wave heights in a lake
In the process of re-designing a harbor in a lake, data is collected
on wind velocity in the area as well as water temperature to
check what effect these two variables have on wave height in
the harbor. Of interest for the designer are the conditions A
= “strong wind velocity” (registered when wind velocity is
larger than 15 mph) and
B = "warm waters" (registered when water temperature is
larger than 70°F). Records indicate that P(A) = 0.350, P(B)
= 0.150, and P(A∩B) = 0.052. Are the events A and B,
i.e., “strong wind velocity” and “warm waters”,
independent?

Wave Height-Solution

Solution. To check for independence, we need to check whether
P(A∩B) = P(A)P(B). We know that P(A∩B) = 0.052, and we find that
P(A)P(B) = 0.350 × 0.150 = 0.0525.
Since P(A∩B) ≈ P(A)P(B), we may conclude that the events A and B
are indeed (approximately) independent.
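The independence check is just a comparison of P(A∩B) against P(A)·P(B). A sketch with an illustrative helper name and an assumed absolute tolerance of 0.01:

```python
def approx_independent(p_a, p_b, p_ab, tol=0.01):
    """Return True when P(A ∩ B) ≈ P(A) · P(B) within `tol`."""
    return abs(p_ab - p_a * p_b) < tol

# Wave-height example: P(A) = 0.350, P(B) = 0.150, P(A ∩ B) = 0.052
print(approx_independent(0.350, 0.150, 0.052))  # True: 0.052 ≈ 0.0525
```

In practice the tolerance (or a formal hypothesis test) would depend on how much data the estimates come from; the fixed `tol` here is only for illustration.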
Conditional probability
Conditional probability is the probability associated with an
event, say A, given the occurrence of a related event, say B.

For example, when throwing a fair die you may be


interested in determining what is the probability of getting
a 3 given that the number selected is odd. In this example
of conditional probability, the event of interest is A
=“getting a 3”, and the condition is B = “the number is
odd”.



The notation for conditional probability

The notation for conditional probability is P(A|B), interpreted as "the
probability of A given B" or "the probability of the occurrence of A
given that B has occurred."
The corresponding definition is
 P(A|B) = P(A∩B) / P(B)
For the events A and B defined above, we have P(A∩B) = P("3 and odd") = 1/6
and P(B) = P("odd number") = 3/6; thus,
 P(A|B) = P("3 given odd") = (1/6)/(3/6) = 1/3.
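The die example can be computed exactly with rational arithmetic. A minimal sketch (`cond_prob` is an illustrative name):

```python
from fractions import Fraction

def cond_prob(p_a_and_b, p_b):
    """P(A | B) = P(A ∩ B) / P(B)."""
    return p_a_and_b / p_b

# Die example: A = "roll a 3", B = "roll an odd number"
p = cond_prob(Fraction(1, 6), Fraction(3, 6))
print(p)  # 1/3
```

Using `Fraction` avoids floating-point rounding and reproduces the slide's 1/3 exactly.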
Example 1. Highway closing under snow
conditions

Suppose we are interested in determining the


probability that a high-elevation highway will remain
open under snow conditions. Our records indicate
that for the 3-month winter period (n = 90 days),
snow in amounts significant enough to affect the traffic
conditions in the highway of interest is recorded
during 30 days in a typical year. Also, records show
that during 15 days in the winter period snow conditions
produce closure of the highway

Solution to Example 1
Let events A = "highway closure" and
B = "significant snow observed in the winter months".

The data indicates that P(B) = P("significant snow observed in the winter months") = 30/90
= 1/3, and P(A∩B) = P("highway closure and significant snow observed") = 15/90 = 1/6.
The conditional probability, P(A|B), is therefore P("highway closure given snowy conditions")
= P(A|B) = P(A∩B)/P(B) = (1/6)/(1/3) = 1/2.
This is interpreted as saying that, for that particular highway, if significant
snowfall is recorded, highway closure will occur about half of the time.
Theorems on conditional probability
The following are two important theorems related to
conditional probability:
(a) For any three events A1, A2, and A3 the following relationship
holds true:
P(A1∩A2 ∩ A3) =P(A1)P(A2|A1)P(A3|A1 ∩ A2).
(b) If an event A must result in one of the mutually exclusive events
A1, A2, …, An, then
P(A) = P(A1)P(A|A1) + P(A2)P(A|A2) + … + P(An)P(A | An)

Example 2. Defective computer chips.
Suppose you are in the process of fixing a computer
by replacing three identical computer chips and you have a
container with 20 computer chips from which to select the
replacements. The chips are selected at random.
5 of the computer chips in the container are defective. What is
the probability that you would select three defective chips for
your computer repair?

Solution to example 2
Let A1, A2, A3 be the events that you select a defective
computer chip in the 1st,2nd , and 3rd picks out of the
container. Thus, you are interested in calculating
P(A1∩A2∩A3) = P(A1)P(A2|A1)P(A3|A1 ∩ A2)

Where P(A1) is the probability that a defective chip is


selected in the first trial.
P(A2|A1) is the probability that the second chip is defective
given that the first was defective.
Finally, P(A3|A1 ∩ A2) is the probability that you get a defective chip
in the third pick given that first two chips were defective

Solution to example 2
1. Since there are 5 defective chips out of 20 chips,
P(A1) = 5/20 = ¼ = 0.25

2.Now, if the first chip was defective then there


remain 4 defective chips out of 19 chips in the
container, thus, P(A2|A1) = 4/19.
3. If chips 1 and 2 were defective,
there remain 3 defective chips out of
18 in the container, thus,
P(A3|A1∩A2) = 3/18 = 1/6.
Solution to example 2
P(A1∩A2 ∩ A3) =
P(A1)P(A2|A1)P(A3|A1 ∩ A2)
= (1/4)(4/19)(1/6) =
(1x4x1)/(4x19x6) = 1/114 =
0.00877

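The chain-rule product P(A1)P(A2|A1)P(A3|A1∩A2) generalizes to any number of draws without replacement. A sketch with an illustrative function name, using exact rational arithmetic:

```python
from fractions import Fraction

def prob_all_defective(total, defective, picks):
    """Probability that `picks` successive random draws without replacement
    are all defective, via the chain rule P(A1)·P(A2|A1)·P(A3|A1∩A2)·…"""
    p = Fraction(1)
    for i in range(picks):
        # After i defective draws, (defective - i) bad chips remain
        # out of (total - i) chips in the container.
        p *= Fraction(defective - i, total - i)
    return p

p = prob_all_defective(total=20, defective=5, picks=3)
print(p, float(p))  # 1/114 ≈ 0.00877, as on the slide
```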
Condition 2-Conditional probability
If an event A must result in one of the mutually exclusive events A1, A2, …, An, then
P(A) = P(A1)P(A|A1) + P(A2)P(A|A2) + … + P(An)P(A | An)
The event A and its relation to the mutually exclusive events A1, A2, …, An, is illustrated in
the following Venn diagram:

Notice the equation for P(A) is equivalent to


P(A) = P(A∩A1) + P(A∩A2) + … + P(A∩An).

Example 3. Irrigation methods
While conducting a study on the effects of different irrigation methods on a given crop, you
define the following events:
· A1 = sprinkler irrigation
· A2 = steady furrow irrigation
· A3 = surge furrow irrigation
· A4 = drip irrigation

Based on your evaluation of 50 experimental plots, you find that


20 plots use sprinkler irrigation,
15 plots use steady furrow irrigation,
8 plots use surge furrow irrigation, and
7 plots use drip irrigation to irrigate the same type of crop. Thus,

P(A1) = 20/50 = 2/5, P(A2) = 15/50 = 3/10,


P(A3) = 8/50, and P(A4) = 7/50.

Solution to Example 3
You also find that the crop is successful if using sprinkler irrigation 85% of the time, if
using steady furrow irrigation 90% of the time, if using surge furrow irrigation 70%
of the time, and if using drip irrigation 60% of the time. Thus, if event A represents “a
successful
crop”, then we have that

P(A|A1) = 0.85, P(A|A2) = 0.90, P(A|A3) = 0.70, and P(A|A4) = 0.60.

Thus, the probability of a successful crop (event A) in the experimental plots


that use four different irrigation systems is
P(A) = P(A1)P(A|A1) + P(A2)P(A|A2) + P(A3)P(A|A3) + P(A4)P(A|A4) =
(2/5)x0.85 + (3/10)x0.90 + (8/50)x0.70 + (7/50)x0.60 = 0.806

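The law-of-total-probability sum above is a simple dot product of priors and conditionals. A sketch (function and variable names are illustrative):

```python
def total_probability(priors, conditionals):
    """Law of total probability: P(A) = Σ P(Ai) · P(A | Ai)."""
    return sum(p * c for p, c in zip(priors, conditionals))

priors = [20/50, 15/50, 8/50, 7/50]        # P(A1) … P(A4), plot proportions
success_given = [0.85, 0.90, 0.70, 0.60]   # P(A|A1) … P(A|A4), success rates
print(round(total_probability(priors, success_given), 3))  # 0.806
```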
Example 4. Highway traveling
To reach Grenoble (France) from Turin (Italy) one can follow either of two routes. The first connects Turin
and Grenoble, whereas the second passes through Chambery (France), i.e., the second route is
Turin-Chambery-Grenoble. During extreme weather conditions in winter, travel between Turin
and Grenoble is not always possible because some parts of the highway may not be open to traffic.
Define the following events:
· A = the highway from Turin to Grenoble is open
· B = the highway from Turin to Chambery is open
· C = the highway from Chambery to Grenoble is open

Example 4
In anticipation of driving from Turin to Grenoble, a traveler listens to the next day’s weather
forecast. If snow is forecast for the next day over the southern Alps, one can assume (on the
basis of past records) that
P(A) = 0.6, P(B) = 0.7, P(C) = 0.4, P(C|A) = 0.5, and P(A|B∩C) = 0.4.
(a) What is the probability that the traveler will be able to reach Grenoble from Turin?
(b) What is the probability the traveler will be able to drive from Turin to Grenoble by way of
Chambery?
(c) Which route should be taken in order to maximize the chance of reaching Grenoble?

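One way to sketch a solution: the traveler reaches Grenoble if A occurs or both B and C occur, i.e. P(A ∪ (B∩C)). The slides do not state that B and C are independent, so treating P(B∩C) = P(B)P(C) below is an explicit assumption of this sketch, and the given P(C|A) goes unused:

```python
def reach_grenoble(p_a, p_b, p_c, p_a_given_bc):
    """P(reach) = P(A ∪ (B ∩ C)), ASSUMING B and C are independent."""
    p_bc = p_b * p_c                  # P(B ∩ C), by the independence assumption
    p_a_and_bc = p_a_given_bc * p_bc  # P(A ∩ B ∩ C) = P(A|B∩C)·P(B∩C)
    return p_a + p_bc - p_a_and_bc, p_bc

p_reach, p_via_chambery = reach_grenoble(0.6, 0.7, 0.4, 0.4)
print(round(p_reach, 3))         # (a) 0.768
print(round(p_via_chambery, 2))  # (b) 0.28
# (c) the direct route, since P(A) = 0.6 > P(B ∩ C) = 0.28
```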
• Probability can be a discrete or a continuous variable.
Discrete probability: P can take on certain values only.
 examples:
  tossing a six-sided die: P(x_i) = P_i, where x_i = 1, 2, 3, 4, 5, 6 and P_i = 1/6 for all x_i
  tossing a coin: only 2 choices, heads or tails
 NOTATION: for both of the above discrete examples (and in general), x_i is called a random variable.
 When we sum over all mutually exclusive possibilities: Σ_i P(x_i) = 1
Continuous probability: P can be any number between 0 and 1.
 Define a "probability density function", pdf, f(x):
  f(x)dx = dP(a ≤ x ≤ a + dx), with x a continuous variable
 The probability for x to be in the range a ≤ x ≤ b is:
  P(a ≤ x ≤ b) = ∫_a^b f(x)dx    probability = "area under the curve"
 Just like the discrete case, the sum of all probabilities must equal 1:
  ∫_{-∞}^{+∞} f(x)dx = 1
 We say that f(x) is normalized to one.
 The probability for x to be exactly some number is zero, since ∫_{x=a}^{x=a} f(x)dx = 0.
Note: in the above example the pdf depends on only one variable, x. In general, the pdf can depend on many variables, i.e. f = f(x, y, z, …); in these cases the probability is calculated using multi-dimensional integration.
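"Probability = area under the curve" can be demonstrated numerically. A minimal sketch using midpoint-rule integration and the exponential pdf f(x) = e^(-x) as an example (the helper name and truncation of the upper limit at 50 are illustrative choices):

```python
import math

def integrate(f, lo, hi, n=100_000):
    """Midpoint-rule numerical integration: area under f on [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

pdf = lambda x: math.exp(-x)   # exponential pdf on [0, ∞)

print(round(integrate(pdf, 0, 50), 4))  # ≈ 1.0: f(x) is normalized to one
print(round(integrate(pdf, 1, 2), 4))   # P(1 ≤ x ≤ 2) = e^-1 - e^-2 ≈ 0.2325
```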

• Examples of some common P(x)'s and f(x)'s:
 Discrete P(x): binomial, Poisson
 Continuous f(x): uniform (i.e. constant), Gaussian, exponential, chi-square
• How do we describe a probability distribution?
 mean, mode, median, and variance
 For a normalized continuous distribution (∫ f(x)dx = 1), these quantities are defined by:
  Mean (average):              μ = ∫ x f(x)dx
  Mode (most probable value):  ∂f(x)/∂x = 0 at x = a
  Median (50% point):          0.5 = ∫_{-∞}^{a} f(x)dx
  Variance (width of distribution):  σ² = ∫ f(x)(x - μ)² dx
 For a discrete distribution, the mean and variance are defined by:
  μ = (1/n) Σ_i x_i    σ² = (1/n) Σ_i (x_i - μ)²


 
Some Continuous Probability Distributions
Remember: probability is the area under these curves!
For many pdfs the integral cannot be done in closed form; use a table to calculate the probability.
• For a Gaussian pdf, the mean, mode, and median are all at the same x.
• For many pdfs, the mean, mode, and median are in different places.
FIGURE — Chi-square distribution and Student t distribution. The Student t with v = 1 is the Cauchy (Breit-Wigner) distribution; as v grows it approaches a Gaussian.
• Calculation of mean and variance:
example: a discrete data set consisting of three numbers: {1, 2, 3}
the average (μ) is just:
 μ = Σ_i x_i / n = (1 + 2 + 3)/3 = 2
Complication: suppose some measurements are more precise than others.
Let each measurement x_i have a weight w_i associated with it; then:
 μ = Σ x_i w_i / Σ w_i    "weighted average"
The variance (σ²), or average squared deviation from the mean, is just:
 σ² = (1/n) Σ (x_i - μ)²
σ is called the standard deviation. The variance describes the width of the pdf!
Rewrite the above expression by expanding the summation:
 σ² = (1/n) [Σ x_i² + n μ² - 2μ Σ x_i]
    = (1/n) Σ x_i² + μ² - 2μ²
    = (1/n) Σ x_i² - μ²
This is sometimes written as <x²> - <x>², with <> denoting the average of whatever is in the brackets.
Note: the n in the denominator would be n - 1 if we determined the average (μ) from the data itself.
Using the definition of σ² from above, we have for our example of {1, 2, 3}:
 σ² = (1/n) Σ x_i² - μ² = 4.67 - 2² = 0.67
The case where the measurements have different weights is more complicated:
 σ² = Σ w_i (x_i - μ)² / Σ w_i = Σ w_i x_i² / Σ w_i - μ²
Here μ is the weighted mean.
If we calculated μ from the data, σ² gets multiplied by a factor n/(n - 1).
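The σ² = <x²> - <x>² identity for the {1, 2, 3} example can be verified directly. A minimal sketch (`mean_var` is an illustrative name):

```python
def mean_var(xs):
    """Mean and variance via sigma^2 = <x^2> - <x>^2 (n in the denominator)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum(x * x for x in xs) / n - mu * mu   # (1/n) Σ x_i²  -  μ²
    return mu, var

mu, var = mean_var([1, 2, 3])
print(mu, round(var, 2))  # 2.0 0.67, matching the worked example
```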


Example: a continuous probability distribution,
 f(x) = c sin²x for 0 ≤ x ≤ 2π, c = constant
This "pdf" has two modes! It has the same mean and median, but these differ from the mode(s).


For continuous probability distributions, the mean, mode, and median are
calculated using either integrals or derivatives:
 f(x) = sin²x is not a true pdf since it is not normalized!
 f(x) = (1/π) sin²x is a normalized pdf over 0 ≤ x ≤ 2π (c = 1/π).
 Note: ∫_0^{2π} sin²x dx = π
 mean:   μ = ∫_0^{2π} x sin²x dx / ∫_0^{2π} sin²x dx = π
 mode:   ∂(sin²x)/∂x = 0  →  modes at x = π/2 and x = 3π/2
 median: ∫_0^{a} sin²x dx / ∫_0^{2π} sin²x dx = 1/2  →  a = π
example: the Gaussian distribution function, a continuous probability distribution.
In this class you should feel free to use a table of integrals and/or derivatives.
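The normalization, mean, and median of the sin² pdf can be checked numerically rather than with a table of integrals. A sketch (the midpoint-rule helper is an illustrative choice):

```python
import math

def integrate(f, lo, hi, n=200_000):
    """Midpoint-rule numerical integration on [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

pdf = lambda x: math.sin(x) ** 2 / math.pi   # normalized on [0, 2π]

norm = integrate(pdf, 0, 2 * math.pi)              # should be 1
mean = integrate(lambda x: x * pdf(x), 0, 2 * math.pi)  # should be π
half = integrate(pdf, 0, math.pi)                  # mass below the median π
print(round(norm, 4), round(mean, 4), round(half, 4))  # 1.0 3.1416 0.5
```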


Accuracy and Precision
Accuracy: The accuracy of an experiment refers to how close the experimental measurement
is to the true value of the quantity being measured.
Precision: This refers to how well the experimental result has been determined, without
regard to the true value of the quantity being measured.
Just because an experiment is precise does not mean it is accurate!!
example: measurements of the neutron lifetime over the years:

FIGURE — This figure shows various measurements of the neutron lifetime over the years; the size of each error bar reflects the precision of the experiment.

There has been a steady increase in the precision of the neutron lifetime measurements, but are any of them accurate?


Measurement Errors (or uncertainties)
Use results from probability and statistics as a way of calculating how “good” a measurement is.
most common quality indicator:
relative precision = [uncertainty of measurement]/measurement
example: we measure a table to be 10 inches with uncertainty of 1 inch.
relative precision = 1/10 = 0.1 or 10% (% relative precision)
Uncertainty in measurement is usually the square root of the variance:
 σ = standard deviation
σ is usually calculated using the technique of "propagation of errors".
Statistical and Systematic Errors
Results from experiments are often presented as:
N ± XX ± YY
N: value of quantity measured (or determined) by experiment.
XX: statistical error, usually assumed to be from a Gaussian distribution.
With the assumption of Gaussian statistics we can say (calculate) something about how
well our experiment agrees with other experiments and/or theories.
Expect ~ 68% chance that the true value is between N - XX and N + XX.
YY: systematic error. Hard to estimate, distribution of errors usually not known.
examples: mass of proton = 0.9382769 ± 0.0000027 GeV (only statistical error given)
mass of W boson = 80.8 ± 1.5 ± 2.4 GeV (both statistical and systematic error given)
What's the difference between statistical and systematic errors?
N ± XX ± YY
Statistical errors are "random" in the sense that if we repeat the measurement enough times:
 XX → 0 as the number of measurements increases
Systematic errors, YY, do not → 0 with repetition of the measurements.
examples of sources of systematic errors:
 voltmeter not calibrated properly
 a ruler is not the length we think it is (a meter stick might really be < a meter!)
Because of systematic errors, an experimental result can be precise, but not accurate!
How do we combine systematic and statistical errors to get one estimate of precision?
BIG PROBLEM! two choices:
 σ_tot = XX + YY            add them linearly
 σ_tot = (XX² + YY²)^(1/2)  add them in quadrature
Some other ways of quoting experimental results:
 lower limit: "the mass of particle X is > 100 GeV"
 upper limit: "the mass of particle X is < 100 GeV"
 asymmetric errors: mass of particle X = 100 (+4, -3) GeV
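The two combination rules can be written in a few lines. A sketch applying them to the W boson numbers quoted earlier (the function name and `method` flag are illustrative):

```python
import math

def combine_errors(stat, syst, method="quadrature"):
    """Combine statistical and systematic uncertainties."""
    if method == "linear":
        return stat + syst          # σ_tot = XX + YY
    return math.hypot(stat, syst)   # σ_tot = sqrt(XX² + YY²)

# W boson example from the slide: 80.8 ± 1.5 (stat) ± 2.4 (syst) GeV
print(round(combine_errors(1.5, 2.4), 2))            # 2.83 (quadrature)
print(round(combine_errors(1.5, 2.4, "linear"), 2))  # 3.9  (linear)
```

Adding in quadrature is the smaller, more common choice; it implicitly assumes the two error sources are independent.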


Probability, Set Theory and Stuff
The relationships and results from set theory are essential to the understanding of probability.
Below are some definitions and examples that illustrate the connection between set theory,
probability and statistics.
We define an experiment as a process that generates “observations” and a sample space (S)
as the set of all possible outcomes from the experiment:
simple event: only one possible outcome
compound event: more than one outcome
As an example of simple and compound events consider particles (e.g. protons, neutrons) made
of u (“up”), d (“down”), and s (“strange”) quarks. The u quark has electric charge (Q) =2/3|e|
(e=charge of electron) while the d and s quarks have charge =-1/3|e|.
Let the experiment be the ways we combine 3 quarks to make a Q=0, 1, or 2 state.
Event A: Q=0 {ssu, ddu, sdu} note: a neutron is a ddu state
Event B: Q=1{suu, duu} note: a proton is a duu state
Event C: Q=2 {uuu}
For this example events A and B are compound while event C is simple.
The following definitions from set theory are used all the time in the discussion of probability.
Let A and B be events in a sample space S.
Union: The union of A & B (A∪B) is the event consisting of all outcomes in A or B.
Intersection: The intersection of A & B (A∩B) is the event consisting of all outcomes in A and B.
Complement: The complement of A (A′) is the set of outcomes in S not contained in A.
Mutually exclusive: If A & B have no outcomes in common they are mutually exclusive.
56 P416 Lecture 1 R.Kass/Sp07
Probability, Set Theory and Stuff
Example: Everyone likes pizza.
Assume the probability of having pizza for lunch is 40%, the probability of having pizza for
dinner is 70%, and the probability of having pizza for lunch and dinner is 30%. Also, this
person always skips breakfast. We can recast this example using:
P(A) = probability of having pizza for lunch = 40%
P(B) = probability of having pizza for dinner = 70%
P(A∩B) = 30% (pizza for lunch and dinner)
1) What is the probability that pizza is eaten at least once a day?
The key words are "at least once"; this means we want the union of A & B:
 P(A∪B) = P(A) + P(B) - P(A∩B) = 0.4 + 0.7 - 0.3 = 0.8
2) What is the probability that pizza is not eaten on a given day?
Not eating pizza (Z′) is the complement of eating pizza (Z), so P(Z) + P(Z′) = 1:
 P(Z′) = 1 - P(Z) = 1 - 0.8 = 0.2
3) What is the probability that pizza is eaten only once a day?
This can be visualized with a Venn diagram: we need to exclude the overlap (intersection) region:
 P(A∪B) - P(A∩B) = 0.8 - 0.3 = 0.5
In the Venn diagram, the non-overlapping blue area is pizza for lunch with no pizza for dinner, and the non-overlapping red area is pizza for dinner with no pizza for lunch.
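All three pizza answers follow from inclusion-exclusion and the complement rule. A minimal sketch (names are illustrative):

```python
def union(p_a, p_b, p_ab):
    """Inclusion-exclusion: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)."""
    return p_a + p_b - p_ab

p_lunch, p_dinner, p_both = 0.4, 0.7, 0.3

at_least_once = union(p_lunch, p_dinner, p_both)
no_pizza = 1 - at_least_once            # complement of eating pizza
exactly_once = at_least_once - p_both   # union minus the overlap
print(round(at_least_once, 1), round(no_pizza, 1), round(exactly_once, 1))
# 0.8 0.2 0.5
```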


Conditional Probability
Frequently we must calculate a probability assuming something else has occurred.
This is called conditional probability.
Here’s an example of conditional probability:
Suppose a day of the week is chosen at random. The probability the day is Thursday is 1/7.
P(Thursday)=1/7
Suppose we also know the day is a weekday. Now the probability is conditional, =1/5.
P(Thursday|weekday)=1/5
the notation is: probability of it being Thursday given that it is a weekday
Formally, we define the conditional probability of A given that B has occurred as:
 P(A|B) = P(A∩B)/P(B)
We can use this definition to calculate the intersection of A and B:
 P(A∩B) = P(A|B)P(B)
For the case where the Ai's are both mutually exclusive and exhaustive we have:
 P(B) = P(B|A1)P(A1) + P(B|A2)P(A2) + … + P(B|An)P(An) = Σ_i P(B|Ai)P(Ai)
For our example let B = the day is a Thursday, A1 = weekday, A2 = weekend; then:
 P(Thursday) = P(Thursday|weekday)P(weekday) + P(Thursday|weekend)P(weekend)
 P(Thursday) = (1/5)(5/7) + (0)(2/7) = 1/7
59 P416 Lecture 1 R.Kass/Sp07
