2 views

Uploaded by Will Corleone

- 1703.07770.pdf
- STATEK
- 2011
- IRJET-Traffic Data Analysis using Decision Tree and Naïve Bayes Classifier
- Naïve Bayes Model for Analysis of Voting Rate Failure in Election System
- Pair Ratio
- Data Preparation: Missing Values (Excel)
- Final Command
- Santamaria y Rasilla 2013 -Are Bordes Mousterian facies discrete groups in the Iberian Peninsula-.pdf
- Spps Solution
- DOMAIN INDEPENDENT MODEL FOR DATA PREDICTION AND VISUALIZATION
- Predictive Analysis of Electrical Power Failures in Sri Lanka using Big Data Technologies
- Cutting Process Control
- Chapter III Results and Discussion
- coastline change predict using weibul distribution GISdevelopment.doc
- SATELITE3.pdf
- Statistics Calculator Word
- Using JAGS in R With the Rjags Package
- dendogrm
- 4.Gaussain Mixture

You are on page 1of 3

Na Bayes Classier ve

Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu

We are given a set of documents x1 , . . . , xn , with the associated class labels y1 , . . . , yn . We want to learn a model that will predict the label y for any future document x. This task is known as classication. Naive Bayes is one classication method.

Let each document be represented by x = (c1 , . . . , cv ) the word count vector, otherwise known as bag of word representation. We assume within each class y, the probability of a document follows the multinomial distribution with parameter y :

v

p(x|y)

w=1

cw yw .

(1)

The log likelihood is log p(x|y) = x log y + const. (2) Note dierent classes have dierent y s. Also note that the multinomial distribution assume conditional independence of feature dimensions 1, . . . , v given the class y. We know this is not true in reality, and more sophisticated models would assume otherwise. For this reason, such assumption on independence of features is known as the na Bayes assumption. ve If we know p(x|y) and p(y) for all classes, classication is done via the Bayes rule: y = = = = arg max p(y|x)

y

y y

y

The process of computing the conditional distribution p(y|x) of the unknown variable (y) given observed variables (x) is called inference. Making classication predictions given p(x|y), p(y), and x is called inference. Where do we get p(x|y) and p(y)? These are the parameters of the model, and we learn them from the training set. Given a training set {(x1 , y1 ), . . . , (xn , yn )}, training or parameter learning involves nding the best parameters = {, 1 , . . . , C }. Our complete model is p(y = j) = j , and p(x|y = j) = Mult(x; j ) V xw w=1 jw . For simplicity we use the MLE here, but MAP is common too. We maximize the joint (log)

i=1 n

n

log

i=1 n

=

i=1

(11)

C j=1

s.t.

v w=1 jw

(12) (13)

= 1, j = 1 . . . C.

n i=1 [yi

= j]

n

i:yi =j xiw . V i:yi =j u=1 xiu

(14) (15)

These MLEs are intuitive: they are the class frequency in the training set, and the word frequency within each class. Note that the concepts of inference and parameter learning described above are fairly general. The only special thing is the na Bayes assumption (i.e., unigram language model for p(x|y)) which assumes ve conditional independence of features. This makes it a Na Bayes classier. ve

1.1

Consider binary classication where y = 0 or 1. Our classication rule with arg max can equivalently be expressed with log odds ratio f (x) p(y = 1|x) p(y = 0|x) = log p(y = 1|x) log p(y = 0|x) = (log 1 log 0 ) x + (log p(y = 1) log p(y = 0)). = log (16) (17) (18)

The decision rule is to classify x with y = 1 if f (x) > 0, and y = 0 otherwise. Note for given parameters, this is a linear function in x. That is to say, the Naive Bayes classier induces a linear decision boundary in feature space X . The boundary takes the form of a hyperplane, dened by f (x) = 0.

1.2

A generative model is a probabilistic model which describe the full generation process of the data, i.e. the joint probability p(x, y). Our Naive Bayes model consists of p(y) and p(x|y), and does just that: One can generate data (x, y) by rst sample y p(y), and then sample word counts from the multinomial p(x|y). There is another family of models known as discriminative models, which do not model p(x, y). Instead, they directly model the conditional p(y|x), which is directly related to classication. We will see our rst discriminative model when we discuss logistic regression.

Na Bayes Classier ve

1.3

A Bayes Network is a directed graph that represent a family of probability distributions. This is covered in detail in [cB] Chapter 8.1, 8.2. Outline: nodes: each node is a random variable. We have one y node, and v xw nodes. directed edges: No directed cycles allowed, i.e. must be a DAG. For naive Bayes, from y to xw . meaning: the joint probability on all nodes s1:K is factorized in a particularly form

K

p(s) =

i=1

v i=1

where pa(si ) are the parents of si . For naive Bayes, p(x1:v , y) = p(y) observed nodes: nodes with known values, e.g. x1:v . Shaded.

plate: a lazy way to duplicate the node (and associated edges) multiple times. Our x1:v can be condensed into a plate.

- 1703.07770.pdfUploaded byrajagudipati
- STATEKUploaded byAnnisa Fadhila Ramadhani
- 2011Uploaded byperplexe
- IRJET-Traffic Data Analysis using Decision Tree and Naïve Bayes ClassifierUploaded byIRJET Journal
- Naïve Bayes Model for Analysis of Voting Rate Failure in Election SystemUploaded byAnonymous lPvvgiQjR
- Pair RatioUploaded bymaddy_i5
- Data Preparation: Missing Values (Excel)Uploaded bySpider Financial
- Final CommandUploaded byAbid Hussain
- Santamaria y Rasilla 2013 -Are Bordes Mousterian facies discrete groups in the Iberian Peninsula-.pdfUploaded bySantukis
- Spps SolutionUploaded byAli Mcc
- DOMAIN INDEPENDENT MODEL FOR DATA PREDICTION AND VISUALIZATIONUploaded byijsret
- Predictive Analysis of Electrical Power Failures in Sri Lanka using Big Data TechnologiesUploaded byIJRIT Journal
- Cutting Process ControlUploaded byInternational Journal of Innovation Engineering and Science Research
- Chapter III Results and DiscussionUploaded byShairuz Caesar Briones Dugay
- coastline change predict using weibul distribution GISdevelopment.docUploaded byMeddyDanial
- SATELITE3.pdfUploaded bycesarinigillas
- Statistics Calculator WordUploaded bytamirat
- Using JAGS in R With the Rjags PackageUploaded byMurali Dharan
- dendogrmUploaded byFanani Dwik
- 4.Gaussain MixtureUploaded bysulthan_81
- R 233 Ussir Le DELF B1Uploaded byVireak Dy
- Dhs h2 Math p2 SolutionUploaded byjimmytanlimlong
- Ass3_Q1_VAR.m3Uploaded byJeremey Julian Ponrajah
- intro-class.pdfUploaded bymphil.ramesh
- Homework 06 AnswersUploaded byChristopher Williams
- Downloaded From CracksMind.comUploaded byPOLOPO POLUYH
- Solutions to HW2Uploaded bySwathi Shetty
- MSCUploaded byAriel Carandang
- PQT - NOV - DEC 2010Uploaded bylmsrt
- Test.docxUploaded byAizelle M. Taratara

- Bec v Reading Part 1Uploaded byJoah Fernandez
- The Parliament of IndiaUploaded byntneeraj1512
- Business_Analytics_Schedule_April_2019.xls.pdfUploaded byBryan Arias
- Project Risk ManagementUploaded byRasha Abduldaiem Elmalik
- Tasawwuf - ImpactUploaded byZayd Iskandar Dzolkarnain Al-Hadrami
- Customer Selection MatrixUploaded byr.jeyashankar9550
- student responseUploaded byapi-239756775
- 7.2 Algebra IUploaded byguanajuato_christopher
- 1 Sequential TheoryUploaded by9897856218
- SAEUploaded byJubin Varghese
- What is a Psychosocial Intervention - Mapping the Field in Sri LankaUploaded byagalappatti
- Reiki Principles - Just For Today I Will Not WorryUploaded byJane
- GRADISTATUploaded byMuhammad Addi Fanhas
- AADE 22.pdfUploaded bysaeed65
- Pa4101--f2f Hhh Syllabus (3)Uploaded byDavid Lee
- nullUploaded byapi-14486402
- Alhambra versión imprimibleUploaded bytintoreprofe
- Malaria IPUploaded byButterflyCalm
- Simulation and Stochastic ModelsUploaded bytirumalayasaswini
- Website Advertising Order Form Free Exciting Games [Full Banner: 468x60]Uploaded byBib Lee
- Business Plan - Schramko & Associates, LLC 2009Uploaded byTrayvone Maurice
- Rural MarketingUploaded bySanchit Valecha
- Stimulare senzorialaUploaded bydiana
- Crim 2 Unjust VexationUploaded byJumel Capurcos
- NI 43-101 Resources Technical Report El Toro Gold Project - Corporación del Centro SAC - PerúUploaded byCDCGOLD
- venus tabletUploaded byDimitris Leivaditis
- American Civil Liberties Union and Larry Schack v. The Florida Bar and the Florida Judicial Qualifications Commission, 999 F.2d 1486, 11th Cir. (1993)Uploaded byScribd Government Docs
- Techno TricksUploaded byShiva Kumaar
- GMAT Diagnostic Test GMAT Club v2.8Uploaded bygmatclub2
- Final Script !!Uploaded byaine21