
Pattern Recognition

Instructor
Min-Ling Zhang

Email: zhangml@seu.edu.cn
URL: http://cse.seu.edu.cn/PersonalPage/zhangml/

Southeast University
Soochow, Fall Semester
Textbook
Richard O. Duda, Peter E. Hart,
David G. Stork

Pattern Classification,
2nd edition

John Wiley & Sons, 2001



Pattern Recognition Soochow,FallSemester 2


Course Information
Credits
2 credits with 36 credit hours
Week 2 to Week 10, Thursday & Friday
Contents
Chapters 1 to 5
About scores
Attendance: 10%
Quiz (2 times): 20%
Final Exam: 70%
References
Books
S. Theodoridis, K. Koutroumbas. Pattern Recognition, 4th edition. Elsevier Publishers, 2009.
C. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.

Web Resources
International Association for Pattern Recognition (IAPR)
Pattern Recognition Journal (PRJ)
List of pattern recognition web sites



Remarks
Mathematical background
Linear algebra
Probability theory
Statistics
Information theory

Our course isn't a mathematical one
Carefully read and comprehend the materials in the Appendix: Mathematical Foundation



Remarks (Cont.)
No pain, no gain
Classroom lectures are important but not enough
Review what has been taught for at least 4~6 hours per week

Terminologies and Contents
Important and difficult ones will be annotated, and even revisited, in Chinese
Only for reference purposes
Chapter 1
Introduction



The 3W of Pattern Recognition
What is Pattern Recognition (PR)?
What is Pattern?
What is Recognition?
What is Pattern Recognition?

Why do we need Pattern Recognition?
The necessity and importance of pattern recognition

HoW to perform Pattern Recognition?
The building blocks of a pattern recognition system



What is Pattern?
"To understand is to perceive patterns"
Isaiah Berlin
Patterns are essential for human perception and understanding
"A pattern is the opposite of a chaos; it is an entity, vaguely defined, that could be given a name."
Satoshi Watanabe



What is Pattern? (Cont.)
Some examples: molecules, barcode, geometry, fingerprint, footprint, myself



What is Pattern? (Cont.)
"A pattern is the opposite of a chaos; it is an entity, vaguely defined, that could be given a name."

There are various kinds of patterns
Visual patterns, such as eyes, nose, mouth, face, fingerprint, etc.
Temporal patterns, such as speech, audio, videos, data streams, etc.
Logical patterns, such as characters, strings, images, etc.



What is Recognition?
Identification of a pattern as a member of a category we already know, or are familiar with
Two types of recognition
Classification: categories are known, and the task is to assign a proper class label to each pattern
Clustering: categories are unknown, and the task is to learn categories and group the patterns accordingly



Classification vs. Clustering
Classification: An example

We already know the categories of characters, and then classify the handwritten ones into category A and category B

Clustering: An example
We do not know the categories of symbols, and then learn the categories and group the symbols accordingly
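
The contrast between the two tasks can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the lecture: the one-dimensional "stroke width" feature and all numbers are hypothetical, classification uses a nearest-class-mean rule, and clustering uses a basic 2-means loop.

```python
# Hypothetical 1-D feature values (e.g. a "stroke width"); not data from the lecture.
labeled = [(1.0, "A"), (1.2, "A"), (0.8, "A"), (3.0, "B"), (3.3, "B"), (2.9, "B")]
unlabeled = [0.9, 1.1, 3.1, 2.8, 1.3, 3.2]

def mean(values):
    return sum(values) / len(values)

def classify(x, labeled_samples):
    """Classification: categories are known, so pick the class with the nearest mean."""
    classes = {}
    for value, label in labeled_samples:
        classes.setdefault(label, []).append(value)
    return min(classes, key=lambda label: abs(x - mean(classes[label])))

def cluster(values, iterations=10):
    """Clustering: no labels are given, so learn two groups with a basic 2-means loop."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups

predicted = classify(1.05, labeled)   # uses the known categories A and B
groups = cluster(unlabeled)           # discovers two groups without any labels
```

Note that `classify` returns a category name, while `cluster` can only return anonymous groups; naming them is left to whoever inspects the result.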



What is Pattern Recognition?
Pattern recognition is the procedure of processing and analyzing diverse information (numerical, literal, logical) characterizing objects or phenomena, so as to provide descriptions, identifications, classifications and interpretations for them.



What is Pattern Recognition? (Cont.)
A Perceive + Process + Prediction View

It is the study of how machines can
Perceive: observe the environment (i.e. interact with the real world)
Process: learn to distinguish patterns of interest from their background
Prediction: make sound and reasonable decisions about the categories of the patterns



Why do we need pattern recognition?
"The real power of human thinking is based on recognizing patterns. The better computers get at pattern recognition, the more humanlike they will become."
Ray Kurzweil @ New York Times, 2003

"The problem of searching for patterns in data is a fundamental one and has a long and successful history."
Christopher M. Bishop
Pattern recognition is needed in designing almost all automated and intelligent systems!



Applications of Pattern Recognition
1) Character Recognition
Input: images with characters (normally contaminated with noise)
Output: the identified character strings

Useful in scenarios such as automatic license plate recognition (ALPR), optical character recognition (OCR), etc.



Applications of PR (Cont.)
2) Speech Recognition
Input: acoustic signal (e.g. sound waves)
Output: contents of the speech

Useful in scenarios such as speech-to-text (STT), voice command & control, etc.



Applications of PR (Cont.)
3) Fingerprint Recognition
Input: fingerprints of some person
Output: the person's identity

Useful in scenarios such as computerized access control, criminal pursuit, etc.



Applications of PR (Cont.)
4) Signature Identification
Input: signature of some person (sequence of dots)
Output: the signatory's identity

Useful in scenarios such as digital signature verification, credit card anti-fraud, etc.



Applications of PR (Cont.)
5) Face Detection
Input: images with several people
Output: locations of the people's faces in the image

Useful in scenarios such as digital camera capturing, video surveillance, etc.



Applications of PR (Cont.)
6) Text Categorization
Input: documents, web pages, etc.
Output: category of the text, such as political, economic, military, sports, etc.

Useful in scenarios such as information retrieval, document organization, etc.



Applications of PR - More
Problem | Input | Output
Detection and diagnosis of disease | Electrocardiogram (ECG) waveforms, electroencephalogram (EEG) waveforms | Types of cardiac conditions, classes of brain conditions
Natural resource identification | Multispectral images | Terrain forms, vegetation cover
Aerial reconnaissance | Visual, infrared, radar images | Tanks, airfields
Identification and counting of cells | Slides of blood samples, micro-sections of tissues | Type of cells
Inspection (PC boards, IC masks, textiles) | Scanned image (visible, infrared) | Acceptable/unacceptable
Manufacturing | 3-D images (structured light, laser, stereo) | Identify objects, pose, assembly
Web search | Key words specified by a user | Text relevant to the user



Why do we need pattern recognition? (Cont.)
For humans, pattern recognition is natural & easy
recognize a face
understand spoken words
read handwritten characters
identify items by feel
decide whether an apple is ripe by its smell

For computers, pattern recognition is never easy
All in all, pattern recognition is important, useful, attractive, but rather challenging
Challenges and opportunities



Basic Concepts
Model
Descriptions which are typically mathematical in form
e.g. an image as a matrix; sound waves as a frequency vector
Sample
Representatives of the patterns we want to classify
e.g. fingerprint of a suspect; ECG of a patient
Training Set
A set of samples used to train classifiers
Basic Concepts (Cont.)
Test Set
A set of samples to be classified, usually mutually exclusive with the training set
Training set vs. test set is like homework vs. exams
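
As a minimal sketch of keeping the two sets mutually exclusive, the snippet below shuffles a list of samples and cuts it into a training part and a test part. The function name `train_test_split`, the 25% test fraction, and the toy samples are assumptions made for illustration only.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and split it into disjoint training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)       # fixed seed for a repeatable split
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical samples: (sample id, category label).
samples = [(i, "salmon" if i < 10 else "sea bass") for i in range(20)]
train_set, test_set = train_test_split(samples)
```

Every sample ends up in exactly one of the two sets, so performance measured on `test_set` is not inflated by samples the classifier has already seen.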

Feature
Attributes which characterize properties of the samples
e.g. to characterize a person, we may use features such as height, weight, age, salary, occupation, etc.



Basic Concepts (Cont.)
Feature Vector
Vector formed by a group of features, usually in column form

$\mathbf{x} = (x_1, x_2, \ldots, x_d)^T$

$\mathbf{x}$: feature vector (in boldface)
$x_i$: vector component (in italic)
$T$: transpose operator
$d$: dimensionality (number of features)



Basic Concepts (Cont.)
Feature Space
Space containing all the possible feature vectors
e.g. the d-dimensional Euclidean space $\mathbb{R}^d$

Scatter Plot
Each sample is plotted as a point in the feature space
[Figure: a 2D scatter plot]



Basic Concepts (Cont.)
Decision Boundary
Boundaries in feature space which separate different categories

[Figure: linear boundary, quadratic boundary, complex boundary]



How to do pattern recognition?
An Example
The task: automate the process of sorting incoming fish on a conveyor belt according to species

Separate sea bass from salmon
Three basic steps



Example: Sea bass vs. Salmon (Cont.)
Step I: Preprocessing
Goal: preprocess the image captured by the camera, such that subsequent operations can be simplified without losing relevant information

Routine image processing
Adjust the level of illumination
Denoising
Enhance the level of contrast

Segmentation
Isolate different fish from one another
Isolate fish from the background
......



Example: Sea bass vs. Salmon (Cont.)
Step II: Feature Extraction
Goal: extract features (with good distinguishing ability) from the preprocessed image to be used for subsequent classification

Sea bass is usually longer than salmon: length could be a good candidate feature
Sea bass is usually brighter than salmon: lightness of fish scales could be another good candidate feature
......
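
To make the two candidate features concrete, here is a rough sketch that measures length and lightness on a tiny grayscale image. Everything in it is an assumption for illustration: the image is a hypothetical list of pixel rows with a dark background, segmentation is a plain intensity threshold, length is counted in pixel columns, and lightness is the mean intensity of the fish pixels.

```python
def extract_features(image, background_threshold=10):
    """Return (length, lightness) for the single fish in a grayscale image.

    length    = number of columns containing at least one fish pixel
    lightness = mean intensity of the fish pixels
    """
    fish_pixels = []
    fish_columns = set()
    for row in image:
        for col, value in enumerate(row):
            if value > background_threshold:    # crude separation from the background
                fish_pixels.append(value)
                fish_columns.add(col)
    if not fish_pixels:
        raise ValueError("no fish pixels found")
    length = len(fish_columns)
    lightness = sum(fish_pixels) / len(fish_pixels)
    return length, lightness

# Hypothetical 4x6 image: 0 is background, larger values are brighter fish pixels.
image = [
    [0,   0,   0,   0,   0, 0],
    [0, 100, 120, 110,   0, 0],
    [0,  90, 130, 100, 150, 0],
    [0,   0,   0,   0,   0, 0],
]
feature_vector = extract_features(image)   # (length, lightness)
```

The returned pair plays the role of a two-dimensional feature vector for one fish; a real system would work on calibrated camera images rather than raw pixel counts.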



Example: Sea bass vs. Salmon (Cont.)
Step III: Classification
Goal: to distinguish different types of objects (in this case, sea bass vs. salmon) based on the extracted features

[Figure: histogram for length]
Horizontal axis: length of fish
Vertical axis: number of fish with a certain length

On average, sea bass is somewhat longer than salmon

Too much overlap: poor separation with the length feature



Example: Sea bass vs. Salmon (Cont.)
[Figure: histogram for lightness]
Horizontal axis: lightness of fish scales
Vertical axis: number of fish with a certain lightness

On average, sea bass is much brighter than salmon

Less overlap: better separation with the lightness feature, but still a bit unsatisfactory

What if no other single feature yields better performance? Use more features at the same time!
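
Classifying with a single feature amounts to choosing one threshold on that feature. The sketch below searches for the threshold with the fewest training errors; the lightness values are hypothetical (not the data behind the histograms), and the rule "sea bass if brighter than the threshold" follows the observation above.

```python
def best_threshold(salmon_values, sea_bass_values):
    """Return (threshold, errors) minimizing training errors for the rule
    'predict sea bass if value > threshold, else salmon'."""
    values = sorted(set(salmon_values) | set(sea_bass_values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        errors = sum(v > t for v in salmon_values) + sum(v <= t for v in sea_bass_values)
        if best is None or errors < best[1]:
            best = (t, errors)
    return best

# Hypothetical lightness measurements in arbitrary units.
salmon_lightness = [2.0, 2.5, 3.0, 3.5, 4.5]
sea_bass_lightness = [4.0, 5.0, 5.5, 6.0, 6.5]
threshold, errors = best_threshold(salmon_lightness, sea_bass_lightness)
```

Because the two value ranges overlap, even the best threshold still misclassifies one training sample here, which is the "still a bit unsatisfactory" situation that motivates using more than one feature.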



Example: Sea bass vs. Salmon (Cont.)
Using two features simultaneously

[Figure: scatter plot for the feature vectors]
Black dots: salmon samples
Red dots: sea bass samples
Linear decision boundary: much better than a single feature
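
One simple way to obtain a linear decision boundary from two features is the nearest-class-mean rule, which can be rewritten as $g(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b$ with $\mathbf{w}$ the difference of the class means. This is a technique chosen here for illustration, not necessarily how the boundary on the slide was produced, and the (length, lightness) values are hypothetical.

```python
def mean_vector(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def fit_linear_boundary(class0, class1):
    """Nearest-class-mean rule written as g(x) = w . x + bias (g > 0 means class1)."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    w = tuple(c1 - c0 for c0, c1 in zip(m0, m1))
    bias = -(sum(v * v for v in m1) - sum(v * v for v in m0)) / 2
    return w, bias

def predict(x, w, bias):
    g = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return "sea bass" if g > 0 else "salmon"

# Hypothetical (length, lightness) pairs; each single feature overlaps between classes.
salmon = [(8, 4.5), (9, 3.0), (12, 2.0), (10, 3.0)]
sea_bass = [(11, 6.0), (13, 4.0), (14, 5.0), (12, 5.5)]
w, bias = fit_linear_boundary(salmon, sea_bass)
salmon_ok = all(predict(x, w, bias) == "salmon" for x in salmon)
sea_bass_ok = all(predict(x, w, bias) == "sea bass" for x in sea_bass)
```

In this toy data neither length alone nor lightness alone separates the classes, yet the line found from both features classifies every training sample correctly.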



Example: Sea bass vs. Salmon (Cont.)
[Figure: linear decision boundary vs. complex decision boundary]

With the complex decision boundary, all the training samples (i.e. known patterns) have been separated perfectly
Can we truly feel satisfied?



Example: Sea bass vs. Salmon (Cont.)
Generalization
The ultimate goal!
The central aim of designing a classifier is to make correct decisions when presented with novel (unseen/test) patterns, not on training patterns whose labels are already known
e.g. it is useless to get 100% accuracy when answering homework questions while getting low accuracy when answering exam questions

Tradeoff: performance on the training set vs. simplicity of the classifier



Related Fields to PR
Pattern Recognition: Pattern -> Category
Hypothesis Testing: Null hypothesis -> Rejection or not [ref. pp.628]
E.g.: to determine whether a drug is effective; null hypothesis: it has no effect
Image Processing: Image -> Image
Associative Memory: Pattern -> Pattern
Often employed as preliminary steps in pattern recognition
Regression: Pattern -> Real value
Interpolation: Pattern (unexplored input range) -> Interpolated value
Density Estimation: Patterns -> Probability density function (pdf) for different categories
Pattern Recognition System

[Figure: components of a pattern recognition system, listed from the top (output) to the bottom (input)]
A postprocessor decides on the appropriate action based on the classification
A classifier uses extracted features to assign the sensed object to a category
A feature extractor measures object properties that are useful for classification
A segmentor isolates sensed objects from the background or from other objects
A sensor converts physical inputs (e.g. images, sounds) into digital signal data

In addition to the usual bottom-up flow of data, some systems also employ feedback from higher levels back down to lower levels (gray arrows)



Design Cycle of PR System
The design of a PR system usually entails a number of different activities, such as data collection, feature choice, model choice, classifier training, classifier evaluation.
Data collection accounts for a large part of the cost of developing a PR system
Feature choice and model choice are highly domain-dependent, where prior knowledge plays a very important role
e.g.: lightness might be a good feature for distinguishing sea bass and salmon; a linear model might be preferred over nonlinear ones
Various activities may be repeated in order to obtain satisfactory results



Important Issues in Pattern Recognition
Noise
Segmentation
Data Collection
Domain Knowledge
Feature Extraction
Pattern Representation
Missing Features
Model Selection
Overfitting
Context
Classifier Ensemble
Costs and Risks
Computational Complexity



Noise
General definition
Any property of the sensed pattern which is not due to the true underlying model but instead to intrinsic randomness of the world or the sensors
Various types of noise exist
e.g. shadows, shaking of the conveyor belt, etc.
Noise can reduce the reliability of the feature values measured
Knowledge of the noise process can help improve performance



Segmentation
Individual patterns have to be segmented for subsequent pattern recognition operations
One of the deepest as well as hardest problems in pattern recognition
How can we segment the images without having categorized them first?
On the other hand, how can we categorize the images without having segmented them first?
How do we "group" together the proper number of elements?
BEATS: BE, BEAT, EAT, AT, EATS?
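
The BEATS ambiguity can be reproduced by listing every contiguous group of letters that forms a known word. The small vocabulary below is an assumption taken from the slide's own example, and `words_inside` is a hypothetical helper name.

```python
def words_inside(text, vocabulary):
    """List every contiguous substring of `text` that is a vocabulary word."""
    found = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            if piece in vocabulary:
                found.append(piece)
    return found

# Assumed vocabulary, limited to the words mentioned in the example.
vocabulary = {"BE", "BEAT", "BEATS", "EAT", "EATS", "AT"}
candidates = words_inside("BEATS", vocabulary)
```

All six candidates are valid words, so the letters alone do not say which grouping is intended; some extra knowledge is needed to pick the proper one.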



Data Collection
A small set of typical examples: preliminary study of system feasibility
Much more data: assure good performance in the fielded system
How do we know that we have collected:
An adequately large set of examples for training and testing the system?
A representative set of examples for training and testing the system?
The effort of data collection could be rather demanding
Domain Knowledge
When there is not sufficient data for training, incorporate domain knowledge (a.k.a. prior knowledge)
Type I: incorporate domain knowledge on the patterns themselves (difficult!)
To recognize all types of chairs
Astounding variety in number of legs, material, shape, and so on
What do chairs have in common that could be regarded as domain knowledge?
Type II: incorporate domain knowledge on the pattern generation procedure
Optical character recognition: assume handwritten characters are written as a sequence of strokes
First try to recover the stroke representation, then deduce the character from the identified strokes



Feature Extraction
A domain-dependent problem which influences the classifier's performance
Good extracted features make classification easier

What kinds of features are promising?
Distinguishing capability: features whose values are very similar for objects in the same category, while very different for objects in different categories

What if a large set of candidate features is available?
Choose those that are simple to extract
Choose those that are robust to noise
Choose those that can lead to simpler decision boundaries



Pattern Representation
Various ways for pattern representation
Statistical: feature vector (the most popular)
Template matching: prototype templates
Syntactic: rules or grammars

Desired Properties
Patterns from the same class should have similar representations
Patterns from different classes should have dissimilar representations
Pattern representations should be invariant to transformations such as translations, rotations, resizes, reflections, non-rigid deformations
Intra-class variation should be small
Inter-class variation should be large

Missing Features
In practical problems, values for certain features may be missing
Occlusion between fish: fish width can't be measured
How could we train classifiers with missing features?
Naive methods could be used, but may not be optimal
Assuming the value of the missing feature is zero
Assigning the average value of patterns already seen for the missing feature
Sophisticated methods might be better, but require extra effort in terms of storage and time
Fill in the missing values with regression techniques
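
The two naive strategies are easy to compare side by side. In this sketch a missing value is represented by `None`; the function name `impute`, the (lightness, width) layout, and the numbers are assumptions for illustration.

```python
def impute(samples, feature_index, strategy="mean"):
    """Fill missing values (None) of one feature with zero or with the observed mean."""
    observed = [s[feature_index] for s in samples if s[feature_index] is not None]
    if strategy == "zero":
        fill = 0.0
    elif strategy == "mean":
        if not observed:
            raise ValueError("no observed values to average")
        fill = sum(observed) / len(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [
        tuple(fill if (i == feature_index and v is None) else v for i, v in enumerate(s))
        for s in samples
    ]

# Hypothetical (lightness, width) samples; width is None when the fish is occluded.
fish = [(4.0, 10.0), (5.0, None), (6.0, 14.0)]
zero_filled = impute(fish, 1, "zero")
mean_filled = impute(fish, 1, "mean")
```

Zero filling turns the occluded fish into one with width 0, far from any real fish, whereas mean filling gives it the plausible width 12.0; neither uses the other features, which is what a regression-based fill would add.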



Model Selection
Each pattern recognition method employs a certain model hypothesis
Every pattern recognition problem has its own underlying true model
Fundamental questions on model selection
How do we know whether the hypothesized model is (relatively) consistent with the underlying true model?
How are we to know when to reject a class of models and try another one?
Can we automate the process of model selection, instead of trial and error, which is random and tedious?
Overfitting
We can get perfect classification performance on the training data by choosing complex models
Complex models are tuned to the particular training samples, rather than the characteristics of the true model
Models more complex than necessary lead to overfitting
Good performance on the training data, but poor performance on novel data
How can we find principled ways to obtain the best complexity?
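
A small experiment can illustrate the gap between training and test performance. The sketch compares a 1-nearest-neighbour rule, which memorizes every training sample, with a fixed single-threshold rule. All lightness values are hypothetical, one training sample is deliberately mislabeled to play the role of noise, and the threshold 3.75 is an assumed value.

```python
def accuracy(classifier, samples):
    return sum(classifier(x) == label for x, label in samples) / len(samples)

def make_1nn(train):
    """Complex model: copy the label of the closest training sample."""
    return lambda x: min(train, key=lambda s: abs(s[0] - x))[1]

def threshold_rule(x, threshold=3.75):
    """Simple model: one cut on the lightness axis (threshold is an assumed value)."""
    return "sea bass" if x > threshold else "salmon"

# Hypothetical lightness data; (5.2, "salmon") is a mislabeled, noisy training sample.
train = [(2.0, "salmon"), (2.5, "salmon"), (3.0, "salmon"), (5.2, "salmon"),
         (4.5, "sea bass"), (5.0, "sea bass"), (5.5, "sea bass"), (6.0, "sea bass")]
test = [(2.2, "salmon"), (2.8, "salmon"), (3.2, "salmon"),
        (4.8, "sea bass"), (5.25, "sea bass"), (5.8, "sea bass")]

one_nn = make_1nn(train)
results = {
    "1-NN": (accuracy(one_nn, train), accuracy(one_nn, test)),
    "threshold": (accuracy(threshold_rule, train), accuracy(threshold_rule, test)),
}
```

The memorizing rule is perfect on the training set but errs on the test sample that lies next to the noisy point, while the simpler rule gives up one training sample and classifies all test samples correctly.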



Context
Context: input-dependent information, other than from the pattern itself
e.g. context of language, context of videos, etc.

The same pattern within different contexts might have different meanings
Use the context of a conversation to infer the meaning of the speaker

Context is very helpful!

How m ch info mation are y u mi sing
Classifier Ensemble
A classifier ensemble aims to improve generalization performance by employing a number of classifiers for the same task
To improve the performance of a speech recognizer: combine the results of acoustic recognition and lip reading
a.k.a. Multi-classifier System, Mixture of Experts, Classifier Fusion, etc.
Diverse ensemble techniques: Bagging, Boosting, Random subspace, etc. [ref. pp.475]

How to combine different classifiers?
Majority voting: vote for the category where most classifiers agree
Weighted voting: weight each vote by the classifier's confidence
Stacking: learn the rule of combination (more complicated)
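
The first two combination rules can be sketched directly. The predictions and confidence values below are hypothetical outputs of three imaginary classifiers, and the function names are chosen for illustration; stacking is omitted because it needs a second learning stage.

```python
from collections import Counter, defaultdict

def majority_vote(predictions):
    """Return the label predicted by the most classifiers (ties: first encountered)."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_vote(predictions, confidences):
    """Return the label with the largest total confidence."""
    totals = defaultdict(float)
    for label, confidence in zip(predictions, confidences):
        totals[label] += confidence
    return max(totals, key=totals.get)

# Hypothetical outputs of three classifiers for one fish.
predictions = ["salmon", "salmon", "sea bass"]
confidences = [0.40, 0.35, 0.95]

by_majority = majority_vote(predictions)
by_weight = weighted_vote(predictions, confidences)
```

Here the two rules disagree: two hesitant classifiers outvote one confident classifier under majority voting, but not once the votes are weighted by confidence.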



Costs and Risks
Cost is the loss after making incorrect decisions
Equal cost: in OCR, the cost of mistaking 6 as 9 might be equal to that of mistaking 9 as 6
Unequal cost: in AIDS diagnosis, the cost of mistaking positive as negative would be much higher than that of mistaking negative as positive

Risk is the total expected cost which we want to optimize
Error rate (percentage of test patterns being wrongly classified)
Precision, Recall, Area under the ROC curve (AUC), etc.

Questions on costs and risks
How do we incorporate knowledge of costs, e.g. unequal cost?
Can we estimate the lowest possible risk of any classifier?
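
One common way to incorporate unequal costs is to choose the decision with the smallest expected cost given the class probabilities for the current pattern. The sketch below assumes those probabilities are already available; the probabilities, the cost tables, and the 10:1 cost ratio are hypothetical numbers for illustration.

```python
def conditional_risks(posteriors, cost):
    """cost[action][true_class] is the loss of taking `action` when `true_class` holds."""
    return {
        action: sum(cost[action][c] * p for c, p in posteriors.items())
        for action in cost
    }

def min_risk_decision(posteriors, cost):
    risks = conditional_risks(posteriors, cost)
    return min(risks, key=risks.get)

# Hypothetical class probabilities for one patient.
posteriors = {"positive": 0.2, "negative": 0.8}

equal_cost = {
    "say positive": {"positive": 0, "negative": 1},
    "say negative": {"positive": 1, "negative": 0},
}
unequal_cost = {
    "say positive": {"positive": 0, "negative": 1},
    "say negative": {"positive": 10, "negative": 0},   # missing a positive costs 10x more
}

decision_equal = min_risk_decision(posteriors, equal_cost)
decision_unequal = min_risk_decision(posteriors, unequal_cost)
```

With equal costs the more probable class wins; with the unequal table the expected cost of saying "negative" (0.2 x 10 = 2.0) exceeds that of saying "positive" (0.8 x 1 = 0.8), so the decision flips even though the probabilities are unchanged.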

Computational Complexity
How does an algorithm scale with
The number of features (dimensionality)
The number of training patterns
The number of possible categories

Brute-force approaches might lead to perfect classification, but with impractical time and storage requirements
In OCR, label all possible 20x20 binary pixel images with a category, then use simple table lookup to classify incoming patterns
Labeling each of the $2^{20 \times 20}$ ($\approx 10^{120}$) patterns is prohibitive

How can we find a good tradeoff between computational ease and classifier performance?
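
The size of the lookup table in the OCR example can be checked with exact integer arithmetic; the variable names below are chosen for illustration.

```python
import math

n_pixels = 20 * 20                  # pixels in one 20x20 binary image
n_patterns = 2 ** n_pixels          # each pixel is 0 or 1, so 2^400 distinct images
order_of_magnitude = math.floor(math.log10(n_patterns))
n_digits = len(str(n_patterns))
```

The count has 121 decimal digits, i.e. it is on the order of $10^{120}$, so a table with one entry per possible image cannot be labeled or stored in practice.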
Summary
What is Pattern Recognition?
Pattern
The opposite of chaos
Various kinds: visual patterns, temporal patterns, logical patterns, etc.
Recognition
Identification of a pattern as a member of a category
Classification: categories known -> assign a proper class label to each pattern
Clustering: categories unknown -> learn categories and group patterns
Pattern Recognition
Perceive: observe the environment (i.e. interact with the real world)
Process: learn to distinguish patterns of interest
Prediction: make sound and reasonable decisions about the categories



Summary (Cont.)
Why Pattern Recognition?
Pattern recognition is needed in designing almost all automated and intelligent systems
Applications of pattern recognition are ubiquitous
Character recognition (images -> characters)
Speech recognition (speech -> text)
Fingerprint recognition (fingerprints -> person's identity)
Signature identification (signature -> signatory's identity)
Face detection (images -> face locations)
Text categorization (documents -> semantic categories)



Summary (Cont.)
How Pattern Recognition?
Basic concepts
model, sample, training set, test set, feature, feature vector, feature space, scatter plot, decision boundary
An illustrative example: sea bass vs. salmon
Generalization: make correct decisions given novel patterns
Related fields
hypothesis testing, image processing, associative memory, regression, interpolation, density estimation
Components of Pattern Recognition System
sensing -> segmentation -> feature extraction -> classification -> postprocessing



Summary (Cont.)
How Pattern Recognition?
Design Cycle of Pattern Recognition System
collect data -> choose features -> choose model -> train classifier -> evaluate classifier
Important Issues
Noise
Segmentation
Data Collection
Domain Knowledge
Feature Extraction
Pattern Representation
Missing Features
Model Selection
Overfitting
Context
Classifier Ensemble
Costs and Risks
Computational Complexity
