Présentation MAT003

Uploaded by adrai_marc

Copyright: Attribution Non-Commercial (BY-NC)

MAT003 Project

Natural Language Processing and Statistics

A study of the Naive Bayes Classifier

Adra Marc

MAT003 Project - NLP and Statistics

Introduction
Large amount of data on the internet
Information extraction: data mining
Classification: a subdivision of NLP problems

What is a Classifier?
[Figure: Text → Class]

A function that outputs a class for each text given as input
Each text is represented as a vector of features
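
To make "text represented as a vector of features" concrete, here is a minimal Python sketch using word counts as features. The function names (`tokenize`, `build_vocabulary`, `to_feature_vector`) and the toy corpus are assumptions for illustration; they do not come from the slides, and other feature choices are possible.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(texts):
    """Map each distinct word of the corpus to a column index (sorted for stability)."""
    words = sorted({word for text in texts for word in tokenize(text)})
    return {word: index for index, word in enumerate(words)}


def to_feature_vector(text, vocabulary):
    """Return a list of word counts, one entry per vocabulary word."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for word, n in counts.items():
        if word in vocabulary:  # words outside the vocabulary are ignored
            vector[vocabulary[word]] = n
    return vector


# Hypothetical toy corpus, for illustration only.
corpus = ["cheap pills cheap offer", "meeting agenda for monday"]
vocab = build_vocabulary(corpus)
print(to_feature_vector("cheap cheap monday", vocab))
```

A classifier then only sees these vectors, never the raw text.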

Naive Bayes Classifier

Supervised learning
Uses prior information: the class probabilities are computed from a pre-classified sample
Relies on independence between features, which is not verified in reality
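
The independence assumption can be written out explicitly. The notation below is an assumption, not taken from the slides: $c$ is a class and $f_1, \dots, f_n$ are the features of a text. Bayes' rule gives the posterior up to a constant, and independence of the features given the class turns the joint likelihood into a product:

```latex
P(c \mid f_1, \dots, f_n) \;\propto\; P(c)\, P(f_1, \dots, f_n \mid c)
\;=\; P(c) \prod_{i=1}^{n} P(f_i \mid c),
\qquad
\hat{c} \;=\; \arg\max_{c}\; P(c) \prod_{i=1}^{n} P(f_i \mid c).
```

Both $P(c)$ and $P(f_i \mid c)$ are estimated from the pre-classified sample.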

Two different types

Multinomial Bayes classifier: word frequency
Multivariate Bernoulli: word presence

The multinomial model performs better
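
As an illustration of the multinomial variant, here is a minimal Python sketch that learns from word frequencies. The function names, the add-one smoothing parameter `alpha`, and the toy data are assumptions, not the study's implementation. A Bernoulli variant would instead record only whether each word is present in a document.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(documents, labels, alpha=1.0):
    """Estimate log priors and smoothed log word likelihoods from tokenised documents."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in zip(documents, labels):
        word_counts[label].update(tokens)  # word frequencies per class
        vocabulary.update(tokens)
    log_prior = {c: math.log(n / len(labels)) for c, n in class_docs.items()}
    log_likelihood = {}
    for c in class_docs:
        # Additive smoothing (alpha) avoids zero probabilities for unseen words.
        total = sum(word_counts[c].values()) + alpha * len(vocabulary)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocabulary
        }
    return log_prior, log_likelihood


def predict(tokens, log_prior, log_likelihood):
    """Return the class with the highest log posterior score."""
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for w in tokens:
            if w in log_likelihood[c]:  # words never seen in training are skipped
                score += log_likelihood[c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical pre-classified sample, for illustration only.
docs = [["cheap", "pills", "cheap"], ["cheap", "offer"],
        ["meeting", "monday"], ["project", "meeting", "agenda"]]
labels = ["spam", "spam", "ham", "ham"]
prior, likelihood = train_multinomial_nb(docs, labels)
print(predict(["cheap", "offer", "monday"], prior, likelihood))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied.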

Performance: how to assess it?

Evaluated by precision and recall:
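
In the standard definitions, for a given class, with $TP$, $FP$ and $FN$ denoting the numbers of true positives, false positives and false negatives (notation assumed here, not taken from the slides):

```latex
\text{precision} = \frac{TP}{TP + FP},
\qquad
\text{recall} = \frac{TP}{TP + FN}.
```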

And micro and macro F1:
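
F1 is the harmonic mean of precision and recall. Macro F1 averages the per-class F1 scores, while micro F1 pools the counts over all classes before computing a single score. A minimal Python sketch under these standard definitions follows; the function names and toy labels are hypothetical.

```python
def f1(tp, fp, fn):
    """Harmonic mean of precision and recall, with 0.0 for undefined cases."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_macro_f1(y_true, y_pred):
    """Return (micro F1, macro F1) for single-label predictions."""
    classes = sorted(set(y_true) | set(y_pred))
    counts = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        counts[c] = (tp, fp, fn)
    # Macro: every class counts equally, whatever its size.
    macro = sum(f1(*counts[c]) for c in classes) / len(classes)
    # Micro: pool the counts first, so large classes dominate.
    tp_sum = sum(v[0] for v in counts.values())
    fp_sum = sum(v[1] for v in counts.values())
    fn_sum = sum(v[2] for v in counts.values())
    micro = f1(tp_sum, fp_sum, fn_sum)
    return micro, macro


# Hypothetical toy labels, for illustration only.
y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
micro, macro = micro_macro_f1(y_true, y_pred)
print(micro, macro)
```

The two averages diverge when classes are unbalanced, which is why both are usually reported.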

Performance
Study reported by Yiming Yang on the Reuters datasets
Good but not outstanding performance
Still surprising for a classifier whose assumptions are not respected

To summarize

Learning from a corpus of texts
Optimal if independence between features can be assumed
Good performance, but it can be improved

How can we improve this performance?

Empirical Rules

Use the shape of the text to classify
Requires substantial analysis of the texts to classify
Efficient for spam filtering

Poisson distribution and features weighting

The frequency of occurrence of a word follows a Poisson distribution
Weight the features with mutual information
10% improvement
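
The mutual information weighting can be sketched as follows. This Python example only computes the mutual information between the presence of a word and the class label; it does not implement the Poisson model. The function name and toy data are hypothetical.

```python
import math


def mutual_information(documents, labels, word):
    """Mutual information (in bits) between the presence of `word` and the class label."""
    n = len(documents)
    joint = {}
    for tokens, label in zip(documents, labels):
        key = (word in tokens, label)  # (word present?, class)
        joint[key] = joint.get(key, 0) + 1
    mi = 0.0
    for (present, label), count in joint.items():
        p_joint = count / n
        # Marginals are obtained by summing the joint counts.
        p_word = sum(v for (p, _), v in joint.items() if p == present) / n
        p_label = sum(v for (_, l), v in joint.items() if l == label) / n
        mi += p_joint * math.log2(p_joint / (p_word * p_label))
    return mi


# Hypothetical pre-classified sample, for illustration only.
docs = [["cheap", "pills", "today"], ["cheap", "offer"],
        ["meeting", "today"], ["project", "agenda"]]
labels = ["spam", "spam", "ham", "ham"]
mi_cheap = mutual_information(docs, labels, "cheap")  # perfectly separates the classes
mi_today = mutual_information(docs, labels, "today")  # appears equally in both classes
print(mi_cheap, mi_today)
```

A word with high mutual information says a lot about the class, so giving it more weight is one plausible way to use the score.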

Bayesian Networks
[Figure: network over the features F1 and F2]

Modelling relations between features
Reduced model
Increases the performance
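
As a sketch of what modelling a relation between features changes, assume (the extracted slide does not confirm this) that the network adds a dependency from feature $F_1$ to feature $F_2$, with class $c$. The naive factorisation and the network factorisation then differ in a single term:

```latex
\text{Naive Bayes:}\quad P(c, F_1, F_2) = P(c)\, P(F_1 \mid c)\, P(F_2 \mid c)
\qquad
\text{With an edge } F_1 \to F_2\text{:}\quad P(c, F_1, F_2) = P(c)\, P(F_1 \mid c)\, P(F_2 \mid F_1, c)
```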

Bayesian Online Perceptron

Find separating hyperplanes
Minimise the distance of misclassified items from the decision boundary
Better performance than SVM

[Figure: scatter plot of items, axes Feature 1 and Feature 2]
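
To illustrate how a separating hyperplane is found, here is a minimal sketch of the classic perceptron update rule in Python. It is not the Bayesian online variant named on the slide, whose details are not given here. The function names and toy points are hypothetical.

```python
def train_perceptron(points, labels, epochs=20, learning_rate=1.0):
    """Classic perceptron: labels are +1 or -1; returns (weights, bias)."""
    weights = [0.0] * len(points[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # misclassified or on the boundary
                # Move the hyperplane towards the misclassified item.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
                mistakes += 1
        if mistakes == 0:  # every item is on the correct side
            break
    return weights, bias


def classify(x, weights, bias):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


# Hypothetical, linearly separable toy data in the (Feature 1, Feature 2) plane.
points = [(2.0, 1.0), (3.0, 2.0), (-1.0, -1.0), (-2.0, -0.5)]
labels = [1, 1, -1, -1]
weights, bias = train_perceptron(points, labels)
print(weights, bias, [classify(p, weights, bias) for p in points])
```

On linearly separable data this loop stops as soon as a full pass produces no mistakes.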

Conclusion
Good original performance of the Naive Bayes Classifier
Simple to implement and to use
Wide range of possible improvements
Remains convenient for simple uses

Thank you

Any Questions?