Uploaded by Sahil Kothari

Lecture on probabilistic reasoning

Probabilistic Reasoning

1. Define a model of the problem.
2. Derive posterior distributions and estimators.
3. Estimate parameters from data.
4. Evaluate model accuracy.

Axioms of probability

Axioms (Kolmogorov):

$0 \le P(A) \le 1$
$P(\text{true}) = 1$
$P(\text{false}) = 0$
$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$

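Corollaries:

The probabilities of the values $d_1, \dots, d_n$ of a single random variable D must sum to 1:

$\sum_{i=1}^{n} P(D = d_i) = 1$

The joint probability of a set of variables must also sum to 1.

If A and B are mutually exclusive: $P(A \text{ or } B) = P(A) + P(B)$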

Rules of probability

conditional probability

$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}, \quad P(B) > 0$

discrete probability distribution
joint probability distribution
marginal probability distribution
Bayes rule
independence

Basic concepts

Making rational decisions when faced with uncertainty:

Probability: the precise representation of knowledge and uncertainty.
Probability theory: how to optimally update your knowledge based on new information.
Decision theory (probability theory + utility theory): how to use this information to achieve maximum expected utility.

Basic probability theory:
random variables
probability distributions (discrete) and probability densities (continuous)
rules of probability
expectation and the computation of 1st and 2nd moments
joint and multivariate probability distributions and densities
covariance and principal components

Recipe for making a joint distribution of M variables:
1. Make a truth table listing all combinations of values of your variables (if there are M Boolean variables then the table will have 2^M rows).
2. For each combination of values, say how probable it is.
3. If you subscribe to the axioms of probability, those numbers must sum to 1.

A  B  C  Prob
0  0  0  0.30
0  0  1  0.05
0  1  0  0.10
0  1  1  0.05
1  0  0  0.05
1  0  1  0.10
1  1  0  0.25
1  1  1  0.10

All the nice-looking slides like this one are from Andrew Moore, CMU.

Copyright 2001, Andrew W. Moore Probabilistic Analytics: Slide 41
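The recipe can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the probabilities are copied from the table for A, B, C, while the names `probs`, `rows` and `joint` are our own.

```python
from itertools import product

# Probabilities from the slide's table, in truth-table order (A, B, C).
probs = [0.30, 0.05, 0.10, 0.05, 0.05, 0.10, 0.25, 0.10]

# Step 1: enumerate all 2^M combinations of M = 3 Boolean variables.
rows = list(product([0, 1], repeat=3))

# Step 2: attach a probability to each combination.
joint = dict(zip(rows, probs))

# Step 3: the axioms require the entries to sum to 1 (up to rounding).
total = sum(joint.values())
assert len(joint) == 2 ** 3
assert abs(total - 1.0) < 1e-9

for (a, b, c), p in joint.items():
    print(a, b, c, f"{p:.2f}")
```

Storing the table as a dictionary keyed by value tuples keeps each row addressable, which is convenient when summing over rows later.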

Using the Joint

Once you have the JD you can ask for the probability of any logical expression E involving your attributes:

$P(E) = \sum_{\text{rows matching } E} P(\text{row})$

Two examples from the slides: P(Poor and Male) = 0.4654 and P(Poor) = 0.7604.

Inference with the Joint

$P(E_1 \mid E_2) = \frac{P(E_1 \text{ and } E_2)}{P(E_2)} = \frac{\sum_{\text{rows matching } E_1 \text{ and } E_2} P(\text{row})}{\sum_{\text{rows matching } E_2} P(\text{row})}$
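As a hedged illustration of both formulas, the sketch below answers queries against the small A, B, C table from the joint-distribution recipe (the Poor/Male data set is not reproduced here). The helper names `prob` and `cond_prob` are our own, not from the slides.

```python
from itertools import product

# Joint distribution over Boolean (A, B, C), values from the recipe's table.
probs = [0.30, 0.05, 0.10, 0.05, 0.05, 0.10, 0.25, 0.10]
joint = dict(zip(product([0, 1], repeat=3), probs))


def prob(event):
    """P(E): sum P(row) over the rows where event(a, b, c) is true."""
    return sum(p for row, p in joint.items() if event(*row))


def cond_prob(e1, e2):
    """P(E1 | E2) = P(E1 and E2) / P(E2); requires P(E2) > 0."""
    p_e2 = prob(e2)
    if p_e2 == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda a, b, c: e1(a, b, c) and e2(a, b, c)) / p_e2


p_a = prob(lambda a, b, c: a == 1)                     # rows with A = 1
p_a_given_b = cond_prob(lambda a, b, c: a == 1,
                        lambda a, b, c: b == 1)        # P(A = 1 | B = 1)
print(round(p_a, 4), round(p_a_given_b, 4))
```

With this table, P(A = 1) sums four rows to 0.50, and P(A = 1 | B = 1) is 0.35 / 0.50 = 0.70.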

probability density function (pdf)
joint probability density
marginal probability
calculating probabilities using the pdf
Bayes rule

Let X be a continuous random variable. If p(x) is a Probability Density Function for X then

$P(a < X \le b) = \int_{x=a}^{b} p(x)\,dx$

For example, with the age density from the slides:

$P(30 < \text{age} \le 50) = \int_{\text{age}=30}^{50} p(\text{age})\,d\text{age} = 0.36$
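To make the integral concrete, here is a minimal numerical sketch. The age density from the slides is not available, so the code assumes a hypothetical exponential density with an arbitrary rate; the function `integrate` and the constant `RATE` are our own illustrative choices, and the result is not expected to reproduce the slide's 0.36.

```python
import math


def integrate(f, a, b, n=10_000):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


RATE = 1 / 36.0  # hypothetical rate parameter, chosen only for illustration


def p(x):
    """Assumed density: exponential with rate RATE on x >= 0."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


# P(30 < X <= 50) as the area under p between 30 and 50.
prob_30_50 = integrate(p, 30.0, 50.0)

# For this density the integral has a closed form, used here as a check.
exact = math.exp(-RATE * 30.0) - math.exp(-RATE * 50.0)
print(round(prob_30_50, 6), round(exact, 6))
```

The same `integrate` helper works for any density you can evaluate pointwise; only `p` needs to change.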

Talking to your stomach: what's the gut-feel meaning of p(x)?

It does not mean a probability! First of all, it's not necessarily a value between 0 and 1 (a density can exceed 1). It's just a value, and an arbitrary one at that. The likelihood p(a) can only be compared relative to other values p(b). It indicates the relative probability of the integrated density over a small delta:

If $\frac{p(a)}{p(b)} = \alpha$ then $\lim_{h \to 0} \frac{P(a - h < X < a + h)}{P(b - h < X < b + h)} = \alpha$

Expectations

E[X] = the expected value of random variable X
= the average value we'd see if we took a very large number of random samples of X

$E[X] = \int_{x=-\infty}^{\infty} x\, p(x)\,dx$

E[age] = 35.897

= the first moment of the shape formed by the axes and the blue curve
= the best value to choose if you must guess an unknown person's age and you'll be fined the square of your error

Expectation of a function

$\mu = E[f(X)]$ = the expected value of f(x) where x is drawn from X's distribution
= the average value we'd see if we took a very large number of random samples of f(X)

$\mu = \int_{x=-\infty}^{\infty} f(x)\, p(x)\,dx$

Note that in general $E[f(X)] \ne f(E[X])$.
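A small discrete analogue shows why the expectation of a function generally differs from the function of the expectation. The example below is our own (a fair six-sided die with f(x) = x^2), not from the slides, and uses exact fractions so no rounding is involved.

```python
from fractions import Fraction

# Hypothetical example: X is a fair six-sided die, so P(X = x) = 1/6.
values = range(1, 7)
p = Fraction(1, 6)


def expect(f):
    """E[f(X)] for the discrete distribution above: sum of f(x) * P(x)."""
    return sum(f(x) * p for x in values)


mean = expect(lambda x: x)          # E[X]
mean_sq = expect(lambda x: x * x)   # E[X^2], i.e. E[f(X)] with f(x) = x^2
sq_of_mean = mean ** 2              # f(E[X])

print(mean, mean_sq, sq_of_mean)
```

Here E[X] = 7/2, E[X^2] = 91/6, and (E[X])^2 = 49/4, so the two quantities differ.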

Variance

$\sigma^2 = \mathrm{Var}[X]$ = the expected squared difference between x and E[X]

$\sigma^2 = \int_{x=-\infty}^{\infty} (x - \mu)^2\, p(x)\,dx$, where $\mu = E[X]$

Var[age] = 498.02

= the amount you'd expect to lose if you must guess an unknown person's age and you'll be fined the square of your error, assuming you play optimally

Standard Deviation

$\sigma$ = Standard Deviation = typical deviation of X from its mean

$\sigma = \sqrt{\mathrm{Var}[X]}$

For the age example, $\sigma = 22.32$.
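The "fined the square of your error" reading can be checked on a toy case. The sketch below assumes a fair six-sided die (our own example, not the age data) and shows that guessing the mean gives an expected squared loss equal to the variance, while another guess does worse.

```python
import math
from fractions import Fraction

# Hypothetical example: X is a fair six-sided die.
values = range(1, 7)
p = Fraction(1, 6)

mean = sum(x * p for x in values)                     # E[X]
variance = sum((x - mean) ** 2 * p for x in values)   # Var[X]
std_dev = math.sqrt(variance)                         # sigma = sqrt(Var[X])


def expected_squared_loss(guess):
    """E[(X - guess)^2]: the average fine when always answering `guess`."""
    return sum((x - guess) ** 2 * p for x in values)


loss_at_mean = expected_squared_loss(mean)
loss_at_three = expected_squared_loss(3)
print(variance, loss_at_mean, loss_at_three, round(std_dev, 4))
```

For the die, Var[X] = 35/12, the loss at the mean is also 35/12, and the loss when guessing 3 is 19/6, which is larger.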

Copyright 2001, Andrew W. Moore Probability Densities: Slide 21

Test report for a rare disease is positive, and the test is 90% accurate. What's the probability that you have the disease? What if the test is repeated? This is the simplest example of reasoning by combining sources of information.

We want P(D|T) = the probability of having the disease given a positive test. Use Bayes rule to relate it to what we know, P(T|D):

$P(D \mid T) = \frac{P(T \mid D)\, P(D)}{P(T)}$

Here P(T|D) is the likelihood, P(D) is the prior, and P(T) is the normalizing constant. The prior is P(D) = 0.001.

What about P(T)? What's the interpretation of that?

P(T) is the marginal probability of P(T,D) = P(T|D) P(D). So, compute it with a summation:

$P(T) = \sum_{\text{all values of } D} P(T \mid D)\, P(D) = P(T \mid D)\, P(D) + P(T \mid \bar{D})\, P(\bar{D})$

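What are these?

$P(T \mid D) = 0.9$ and $P(T \mid \bar{D}) = \,?$ What about $P(\bar{D})$?

With a 90% accurate test, $P(T \mid \bar{D}) = 0.1$, and $P(\bar{D}) = 1 - P(D) = 0.999$. So

$P(D \mid T) = \frac{P(T \mid D)\, P(D)}{P(T \mid D)\, P(D) + P(T \mid \bar{D})\, P(\bar{D})} = \frac{0.9 \times 0.001}{0.9 \times 0.001 + 0.1 \times 0.999} = 0.0089$

Suppose we have a test to determine if you won the lottery. It's 90% accurate. What is P($ = true | T = true) then?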

What if the test were the same, but the disease wasn't so rare?
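The effect of the prior can be explored with a short sketch. It assumes that "90% accurate" means P(T|D) = 0.9 and P(T|not D) = 0.1, as in the numbers used on the slides; the function name `posterior` and the less-rare prior of 0.1 are our own illustrative choices.

```python
def posterior(prior, p_pos_given_d=0.9, p_pos_given_not_d=0.1):
    """P(D | T) by Bayes rule for a single positive test.

    Assumes "90% accurate" means P(T|D) = 0.9 and P(T|not D) = 0.1.
    """
    # Normalizing constant: P(T) = P(T|D) P(D) + P(T|not D) P(not D).
    p_pos = p_pos_given_d * prior + p_pos_given_not_d * (1.0 - prior)
    return p_pos_given_d * prior / p_pos


rare = posterior(0.001)    # the rare-disease prior from the slides
common = posterior(0.1)    # hypothetical prior for a less rare disease
print(round(rare, 4), round(common, 4))
```

With the slide's prior of 0.001 the posterior is about 0.0089; with the hypothetical prior of 0.1 the same positive test raises it to 0.5.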

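We can relax, P(D|T) = 0.0089, right? Just to be sure, the doctor recommends repeating the test. How do we represent this?

$P(D \mid T_1, T_2)$

Again, we apply Bayes rule:

$P(D \mid T_1, T_2) = \frac{P(T_1, T_2 \mid D)\, P(D)}{P(T_1, T_2)}$

Assume the two test results are conditionally independent given the disease state, so $P(T_1, T_2 \mid D) = P(T_1 \mid D)\, P(T_2 \mid D)$. This also implies:

$P(T_1, T_2) = P(T_1 \mid D)\, P(T_2 \mid D)\, P(D) + P(T_1 \mid \bar{D})\, P(T_2 \mid \bar{D})\, P(\bar{D})$

Plugging these in, we have

$P(D \mid T_1, T_2) = \frac{0.9 \times 0.9 \times 0.001}{0.9 \times 0.9 \times 0.001 + 0.1 \times 0.1 \times 0.999} = 0.075$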
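The two-test posterior can be reproduced numerically. The sketch below assumes, as the calculation requires, that the two test results are conditionally independent given the disease state; the variable names are our own.

```python
# Values used on the slides.
p_d = 0.001          # prior P(D)
p_t_given_d = 0.9    # P(T | D)
p_t_given_nd = 0.1   # P(T | not D), the false-positive rate

# Assumption: T1 and T2 are conditionally independent given D (and given not D),
# so the likelihood of two positives is the product of the single-test terms.
numerator = p_t_given_d * p_t_given_d * p_d
evidence = numerator + p_t_given_nd * p_t_given_nd * (1.0 - p_d)
posterior_two_tests = numerator / evidence

# Chance of one and of two false positives for a healthy person.
one_false_positive = p_t_given_nd
two_false_positives = p_t_given_nd ** 2

print(round(posterior_two_tests, 3), one_false_positive, round(two_false_positives, 4))
```

The posterior after two positives is 0.075, and two false positives (0.01) remain ten times as likely as the prior 0.001.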

What's the chance of 1 false positive from the test? What's the chance of 2 false positives?

The chance of 2 false positives (0.1 × 0.1 = 0.01) is still 10x the prior probability of having the disease (0.001).

If we rearrange slightly:

$P(D \mid T_1, T_2) = \frac{P(T_2 \mid D)\, P(D \mid T_1)}{P(T_2 \mid T_1)}$

The factor $P(D \mid T_1)$ is the posterior for the first test, which we just computed. We can just plug in the value of the old posterior. It plays exactly the same role as our old prior.

This is how Bayesian reasoning combines old information with new information to update our belief states.
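The claim that the old posterior plays the role of the new prior can be checked directly. In this minimal sketch (function name `update` is ours; test results are assumed conditionally independent given the disease state), updating twice in sequence is compared with processing both positive tests at once.

```python
def update(prior, p_t_given_d=0.9, p_t_given_nd=0.1):
    """One Bayes update for a positive test: returns P(D | T) given P(D)."""
    evidence = p_t_given_d * prior + p_t_given_nd * (1.0 - prior)
    return p_t_given_d * prior / evidence


prior = 0.001

# Sequential: the posterior after test 1 becomes the prior for test 2.
after_one = update(prior)
after_two = update(after_one)

# Batch: both positive tests in one application of Bayes rule.
batch = (0.9 * 0.9 * prior) / (0.9 * 0.9 * prior + 0.1 * 0.1 * (1.0 - prior))

print(round(after_one, 4), round(after_two, 4), round(batch, 4))
```

Both routes give the same value, 0.075, so each further test can be folded in by calling `update` on the current belief.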

reasoning in networks containing many variables, for which the graphical notations of chapter(2) will play a central role.

Example 1.2 (Hamburgers). Consider the following fictitious scientific information: Doctors find that people with Kreuzfeld-Jacob disease (KJ) almost invariably ate hamburgers, thus p(Hamburger Eater|KJ) = 0.9. The probability of an individual having KJ is currently rather low, about one in 100,000.

1. Assuming eating lots of hamburgers is rather widespread, say p(Hamburger Eater) = 0.5, what is the probability that a hamburger eater will have Kreuzfeld-Jacob disease? This may be computed as

$p(KJ \mid \text{Hamburger Eater}) = \frac{p(\text{Hamburger Eater}, KJ)}{p(\text{Hamburger Eater})} = \frac{p(\text{Hamburger Eater} \mid KJ)\, p(KJ)}{p(\text{Hamburger Eater})}$ (1.2.1)

$= \frac{\frac{9}{10} \times \frac{1}{100000}}{\frac{1}{2}} = 1.8 \times 10^{-5}$ (1.2.2)

2. If the fraction of people eating hamburgers was rather small, p(Hamburger Eater) = 0.001, what is the probability that a regular hamburger eater will have Kreuzfeld-Jacob disease? Repeating the above calculation, this is given by

$\frac{\frac{9}{10} \times \frac{1}{100000}}{\frac{1}{1000}} \approx 1/100$ (1.2.3)

This is much higher than in scenario (1) since here we can be more sure that eating hamburgers is related to the illness.
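Both scenarios are the same one-line Bayes computation with a different denominator. The sketch below uses exact fractions; the function name `p_kj_given_eater` is our own.

```python
from fractions import Fraction

p_eater_given_kj = Fraction(9, 10)   # p(Hamburger Eater | KJ)
p_kj = Fraction(1, 100000)           # p(KJ)


def p_kj_given_eater(p_eater):
    """Bayes rule: p(KJ | Hamburger Eater) for a given p(Hamburger Eater)."""
    return p_eater_given_kj * p_kj / p_eater


scenario_1 = p_kj_given_eater(Fraction(1, 2))      # hamburgers widespread
scenario_2 = p_kj_given_eater(Fraction(1, 1000))   # hamburgers rare

print(float(scenario_1), float(scenario_2))
```

Scenario 1 gives 9/500000 = 1.8 × 10^-5 and scenario 2 gives 9/1000, which is roughly 1/100.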

Example 1.3 (Inspector Clouseau). Inspector Clouseau arrives at the scene of a crime. The victim lies dead in the room alongside the possible murder weapon, a knife. The Butler (B) and Maid (M) are the inspector's main suspects and the inspector has a prior belief of 0.6 that the Butler is the murderer, and a prior belief of 0.2 that the Maid is the murderer. These beliefs are independent in the sense that p(B, M) = p(B)p(M). (It is possible that both the Butler and the Maid murdered the victim or neither.) The inspector's prior criminal knowledge can be formulated mathematically as follows:

dom(B) = dom(M) = {murderer, not murderer}, dom(K) = {knife used, knife not used} (1.2.4)

p(B = murderer) = 0.6, p(M = murderer) = 0.2 (1.2.5)

[Table of p(K | B, M) for the four combinations of B and M, equation (1.2.6): values not recovered.]

In addition p(K, B, M) = p(K|B, M)p(B)p(M). Assuming that the knife is the murder weapon, what is the probability that the Butler is the murderer? (Remember that it might be that neither is the murderer.) Using b for the two states of B and m for the two states of M,

$p(B \mid K) = \sum_m p(B, m \mid K) = \sum_m \frac{p(B, m, K)}{p(K)} = \frac{\sum_m p(K \mid B, m)\, p(B, m)}{\sum_{m,b} p(K \mid b, m)\, p(b, m)} = \frac{p(B) \sum_m p(K \mid B, m)\, p(m)}{\sum_b p(b) \sum_m p(K \mid b, m)\, p(m)}$ (1.2.7)

DRAFT February 27, 2012
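Equation (1.2.7) can be evaluated mechanically once the conditional table p(K | B, M) is known. That table is not available in this copy, so the numbers in `p_knife` below are placeholder values assumed purely for illustration; only the priors 0.6 and 0.2 come from the text. All names are our own.

```python
# Priors from the text; True means "murderer".
p_b = {True: 0.6, False: 0.4}
p_m = {True: 0.2, False: 0.8}

# ASSUMED placeholder values for p(K = knife used | B, M), keyed by (b, m).
# These are NOT taken from the document, which does not show the table.
p_knife = {
    (False, False): 0.3,
    (False, True): 0.2,
    (True, False): 0.6,
    (True, True): 0.1,
}


def weight(b):
    """Numerator of (1.2.7): p(b) * sum_m p(K | b, m) p(m)."""
    return p_b[b] * sum(p_knife[(b, m)] * p_m[m] for m in (True, False))


def posterior_butler(b):
    """p(B = b | K = knife used), normalizing over both states of B."""
    return weight(b) / sum(weight(state) for state in (True, False))


p_butler = posterior_butler(True)
print(round(p_butler, 3))
```

The structure mirrors the right-hand side of (1.2.7): the Maid is summed out inside `weight`, and the denominator sums the same quantity over both states of the Butler.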

Example 1.5 (Aristotle: Resolution). We can represent the statement 'All apples are fruit' by p(F = tr|A = tr) = 1. Similarly, 'All fruits grow on trees' may be represented by p(T = tr|F = tr) = 1. Additionally we assume that whether or not something grows on a tree depends only on whether or not it is a fruit, p(T|A, F) = p(T|F). From this we can compute

$p(T = \text{tr} \mid A = \text{tr}) = \sum_F p(T = \text{tr} \mid F, A = \text{tr})\, p(F \mid A = \text{tr}) = \sum_F p(T = \text{tr} \mid F)\, p(F \mid A = \text{tr})$

$= p(T = \text{tr} \mid F = \text{fa})\, \underbrace{p(F = \text{fa} \mid A = \text{tr})}_{=0} + \underbrace{p(T = \text{tr} \mid F = \text{tr})}_{=1}\, \underbrace{p(F = \text{tr} \mid A = \text{tr})}_{=1} = 1$ (1.2.16)

In other words we have deduced that 'All apples grow on trees' is a true statement, based on the information presented. (This kind of reasoning is called resolution and is a form of transitivity: from the statements $A \Rightarrow F$ and $F \Rightarrow T$ we can infer $A \Rightarrow T$.)
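A quick numerical check of (1.2.16): the value of p(T = tr | F = fa) is never specified, and the sketch below (our own, with hypothetical trial values for that unknown) shows the result is 1 whatever it is, because it is multiplied by p(F = fa | A = tr) = 0.

```python
def p_tree_given_apple(p_tree_given_not_fruit):
    """p(T = tr | A = tr) by summing over the two states of F."""
    p_fruit_given_apple = 1.0   # 'All apples are fruit'
    p_tree_given_fruit = 1.0    # 'All fruits grow on trees'
    return (p_tree_given_not_fruit * (1.0 - p_fruit_given_apple)
            + p_tree_given_fruit * p_fruit_given_apple)


# Hypothetical trial values for the unspecified p(T = tr | F = fa).
results = [p_tree_given_apple(q) for q in (0.0, 0.3, 1.0)]
print(results)
```

Every entry of `results` equals 1.0, matching the deduction that all apples grow on trees.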

Example 1.6 (Aristotle: Inverse Modus Ponens). According to Logic, from the statement 'If A is true then B is true', one may deduce that 'if B is false then A is false'. To see how this fits in with a probabilistic reasoning system we can first express the statement 'If A is true then B is true' as p(B = tr|A = tr) = 1. Then we may infer

Next time
