Intuitively speaking, what is the difference between Bayesian Estimation and Maximum Likelihood Estimation?

7 Answers

Albert Wu, Math Student, Machine Learning and Poker Enthusiast
Updated Oct 28, 2015

In maximum likelihood estimation, we find a point estimate for the parameters that maximizes the likelihood. Basically, we have data D and parameters θ, and we need to find the θ* that maximizes P(D|θ). Then we can use this particular set of parameters θ* to make predictions about future events.

In Bayesian estimation, we come up with a distribution over possible parameters using Bayes' rule: P(θ|D) = P(D|θ)P(θ)/P(D), where P(θ) is known as the prior. Then, to make predictions about future events, we need to integrate over this distribution of possible θ.

Let me give an example to make this concrete. Let's say we have a coin that comes up heads with some probability θ. We see two heads come up. Our likelihood then becomes P(D|θ) = θ², which is clearly maximized when θ = 1. So our MLE says the coin always comes up heads, and we predict future flips will all come up heads. We see why MLE can be a bit silly: often it overfits the data and does not generalize well, but it is good for a first estimate.

Now let us think about the Bayesian approach. Let's say we initially know (our prior) that our coin has a one-half chance of having θ = 1/2 and a one-half chance of having θ = 1. Let's say we observe the same two heads from coin flips. Now let us calculate:

P(θ = 1/2 | D) ∝ P(D|θ)P(θ) = (1/2)² · (1/2) = 1/8
P(θ = 1 | D) ∝ P(D|θ)P(θ) = 1² · (1/2) = 1/2

Normalizing, we see that we have a 1/5 chance of having a fair coin and a 4/5 chance of having a coin that always comes up heads. So we estimate that a new coin flip would come up heads (1/5)(1/2) + (4/5)(1) = 9/10 of the time. Based on different prior beliefs we would come up with different answers, but we see that in some sense the Bayesian estimate is more "reasonable" than just using MLE.

One disadvantage is that the Bayesian estimate is often very difficult to compute because, in the general case (a continuous distribution over parameters instead of a discrete one), we need to perform integration, which is computationally time-consuming.

Another popular estimation method (not mentioned in the question) is called maximum a posteriori (MAP) estimation, where we maximize the posterior, i.e., P(θ|D) ∝ P(D|θ)P(θ). This is also a point estimate, but by allowing a prior it prevents us from saying some of the silly things that MLE does, e.g., that if we see two heads, we conclude the coin can only come up heads.
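To make the arithmetic above concrete, here is a minimal Python sketch of the coin example. The two-point parameter grid, prior, and data are exactly the numbers from the answer; the variable names are my own.

```python
import numpy as np

thetas = np.array([0.5, 1.0])   # candidate heads-probabilities
prior  = np.array([0.5, 0.5])   # P(theta): half fair, half always-heads

# Data: two heads, so the likelihood is P(D|theta) = theta**2
likelihood = thetas ** 2

# MLE: pick the theta that maximizes the likelihood alone
theta_mle = thetas[np.argmax(likelihood)]
print(theta_mle)                      # 1.0 -> predicts heads forever

# MAP: pick the theta that maximizes likelihood * prior
# (coincides with the MLE here, since the prior is uniform over the grid)
theta_map = thetas[np.argmax(likelihood * prior)]
print(theta_map)                      # 1.0

# Bayesian: keep the whole posterior and average over it
posterior = likelihood * prior
posterior /= posterior.sum()          # [1/5, 4/5] after normalizing
p_heads = np.sum(posterior * thetas)  # (1/5)(1/2) + (4/5)(1)
print(p_heads)                        # 0.9
```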
Veenstra, Does research in theory and practice of ML

Good answers here. I'll take a slightly different approach. The MLE (and its Bayesian counterpart, the MAP estimator) is the mode of the likelihood (or posterior) surface. Hopefully it is the global maximum, although in practice, with a complicated data set, this is not generally true. (When trying to find these maxima, the log of the surface is usually used.) And in some ways it makes sense to use this as a solution, right? After all, it is the point of highest likelihood.

The problem is: what if the surface is multi-modal? How do, say, predictions from different modes compare? What if there are other interesting parts of the likelihood surface not taken into account by estimates of modes? More importantly (although I can't seem to find a reference to it right now), modes tend to be atypical points: they don't represent the distribution well on the whole.

Enter the true Bayesian estimate, and on the frequentist side, the MELE (mean likelihood estimator). The latter is not well known; my PhD supervisor did some work with one of his students on it (see, e.g., Mean Likelihood Estimation). The MELE is an integration problem from the frequentist viewpoint (so in a sense it is Bayesian with a uniform prior, but it is interpreted differently). These methods take into account information from the whole surface of the likelihood/posterior.

Edit: the danger with these types of inference (and prediction from them) is that you really have to specify your likelihood (and prior, if you're a Bayesian) in some manner such that the data are reasonably fit by them. I know that's hard to do sometimes, but it's necessary.

Michael Hochster, Head of Research at Pandora

Roughly speaking, the posterior density of a parameter is the prior times the likelihood. Maximum likelihood estimation ignores the prior, so it is like being a Bayesian but using some sort of flat prior.

Alex Robertson, works at Counsyl
Written Nov 3, 2012

The previous answers here are all very good, but technical. I'd like to give an intuitive example.

Imagine you are a doctor. You have a patient who shows an odd set of symptoms. You look in your doctor book and decide the disease could be either a common cold or lupus. Your doctor book tells you that if a patient has lupus, then the probability that he will show these symptoms is 90%. It also states that if the patient has a common cold, then the probability that he will show these symptoms is only 10%.

Well, there are two approaches to take. If you used maximum likelihood estimation, you would declare, "The patient has lupus. Lupus is the disease which maximizes the likelihood of presenting these symptoms."

However, a cannier doctor would remember that lupus is very rare. This means that the prior probability of anyone having lupus is very low (around 8 per 100,000 people) compared to the common cold (which is common). Using a Bayesian estimate, you should decide that the patient is more likely to have a common cold than lupus.
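Here is a toy Python version of the doctor example. The two likelihoods (90% and 10%) and the lupus base rate (8 per 100,000) come from the answer itself; the base rate for the common cold is an illustrative assumption.

```python
p_symptoms_given = {"lupus": 0.90, "cold": 0.10}     # P(symptoms | disease)
prior            = {"lupus": 0.00008, "cold": 0.10}  # P(disease); cold rate assumed

# MLE-style diagnosis: pick the disease that maximizes the likelihood alone
mle_diagnosis = max(p_symptoms_given, key=p_symptoms_given.get)
print(mle_diagnosis)    # 'lupus' -- it best "explains" the symptoms

# Bayesian diagnosis: weight each likelihood by the disease's base rate
posterior = {d: p_symptoms_given[d] * prior[d] for d in prior}
bayes_diagnosis = max(posterior, key=posterior.get)
print(bayes_diagnosis)  # 'cold' -- 0.10 * 0.10 beats 0.90 * 0.00008
```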
Fayi Femi-Balogun, Computer Scientist; MSc student at UCL, Machine Learning and NLP Enthusiast

Nice answers, but here is my attempt at a quick, intuitive explanation.

Maximum Likelihood Estimation: what is the set of parameters that best explains the data? Meaning, given some data and the names of two parameters (lambda and gamma), what are the values of lambda and gamma that best explain the data? To make a new prediction, we simply evaluate the pdf using the best parameters found.

Bayesian Estimation: we have some knowledge about the problem (a prior). We also admit that there may be many values of the parameters that explain the data, and so we look for multiple parameters, e.g., 5 lambdas and 5 gammas that do this. This gives us multiple models and, as a result, multiple predictions, one for each pair of parameters (but the same prior). To predict for a new example, we compute a weighted sum of these predictions (see the sketch after the last answer below).

Hadi Zare, learner

In maximum likelihood estimation (MLE), we are just looking for the best match between our assumption about the data and the observed data — the likelihood function — and we seek to maximize this match to get the best solution. This method suffers from overfitting, and sometimes gives no answer (a zero estimate) for rare events (parameters).

Bayesian estimation assumes that there exists previous knowledge about the parameter to estimate, and encodes this previous information in a probability distribution. The estimation then differs from MLE only in the specification of a prior. In its simplest form:

MLE: maximize the match between our assumption about the data and the observed data (the likelihood function).

Bayes: maximize the match between our assumption about the data and the observed data (the likelihood function), PLUS our (or a domain expert's) previous knowledge about the parameter (the prior distribution).
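Here is a sketch of the "weighted sum of predictions" recipe from Fayi Femi-Balogun's answer, using a single rate parameter for brevity. The Poisson model, the grid of 5 candidate lambdas, and the data are all illustrative assumptions; the answer only specifies the general recipe.

```python
import numpy as np
from scipy.stats import poisson

data    = np.array([3, 4, 2, 5, 4])            # observed counts (made up)
lambdas = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # 5 candidate parameter values
prior   = np.full(5, 0.2)                      # uniform prior over them

# Posterior weight of each lambda: prior times likelihood of the data
likelihood = np.array([poisson.pmf(data, lam).prod() for lam in lambdas])
posterior  = prior * likelihood
posterior /= posterior.sum()

# Each lambda is one "model", and each gives its own prediction, e.g. the
# probability that the next count equals 4. The Bayesian prediction is the
# posterior-weighted sum of those per-model predictions.
per_model  = poisson.pmf(4, lambdas)
prediction = np.sum(posterior * per_model)
print(prediction)
```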
