Lecture # 1
Introduction & Fundamentals
Learning Associations
• Basket analysis:
P(Y | X): probability that somebody who buys X also buys Y, where X and Y are products/services.
Example: P(bread | cold drink) = 0.7
Market-Basket transactions
TID Items
1 Bread, Milk
2 Bread, Diaper, Cold Drink, Eggs
3 Milk, Diaper, Cold Drink
4 Bread, Milk, Diaper, Cold Drink
5 Bread, Milk, Diaper, Water
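A minimal sketch of how P(Y | X) could be estimated from the five transactions in the table, by counting baskets. The function name `conditional_probability` is an illustrative choice, not something defined in the slides. On this table the estimate for P(bread | cold drink) is 2/3, close to the 0.7 quoted in the example.

```python
# Market-basket transactions copied from the table (TID 1 to 5).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Cold Drink", "Eggs"},
    {"Milk", "Diaper", "Cold Drink"},
    {"Bread", "Milk", "Diaper", "Cold Drink"},
    {"Bread", "Milk", "Diaper", "Water"},
]


def conditional_probability(y, x, baskets):
    """Estimate P(y | x) as count(baskets with x and y) / count(baskets with x)."""
    with_x = [basket for basket in baskets if x in basket]
    if not with_x:
        raise ValueError(f"no basket contains {x!r}")
    return sum(1 for basket in with_x if y in basket) / len(with_x)


p_bread_given_cold_drink = conditional_probability("Bread", "Cold Drink", transactions)
print(round(p_bread_given_cold_drink, 2))
```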
Classification
• Example: Credit scoring
• Differentiating between low-risk and high-risk customers from their income and savings
Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
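The discriminant can be sketched as a two-threshold rule. This is an illustration only: it assumes that customers above both thresholds are the low-risk ones, and the threshold values `THETA1` and `THETA2` are hypothetical (in practice they would be learned from data).

```python
# Hypothetical thresholds; a learning algorithm would fit these to past customers.
THETA1 = 30_000  # income threshold (assumed value)
THETA2 = 5_000   # savings threshold (assumed value)


def credit_risk(income, savings, theta1=THETA1, theta2=THETA2):
    """Apply the rule: IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"


print(credit_risk(45_000, 12_000))
print(credit_risk(45_000, 1_000))
```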
Prediction: Regression
• Example: Price of a used car
• x: car attributes
y: price
y = g(x | θ)
g( ) model, θ parameters
y = wx + w0
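A small sketch of fitting the linear model y = wx + w0 by ordinary least squares with a single attribute. The data (car age in years, price in thousands) are made up for illustration, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares estimates of w and w0 for the model y = w * x + w0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    w = s_xy / s_xx
    w0 = mean_y - w * mean_x
    return w, w0


# Hypothetical training data: x = age of the car, y = price.
ages = [1, 2, 3, 4, 5]
prices = [18, 16, 14, 12, 10]

w, w0 = fit_line(ages, prices)
predicted_price = w * 6 + w0  # prediction g(x | theta) for a 6-year-old car
print(w, w0, predicted_price)
```

Here θ corresponds to the pair (w, w0), and g is the straight line they define.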
Pattern
A pattern is an entity that can be given a name
Recognition
• Identification of a pattern as a member of a category
Classification
Unsupervised Learning
• Learning “what normally happens”
• No output
• Clustering: Grouping similar instances
• Other applications: Summarization, Association Analysis
• Example applications
– Image compression: Color quantization
– Bioinformatics: Learning motifs
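To make the clustering and color quantization bullets concrete, here is a rough sketch that uses k-means (one common clustering method, chosen here as an assumption since the slide does not name an algorithm) on a handful of made-up grey levels. The starting centroids are also hypothetical.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on scalar values; returns the final centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids


def quantize(values, centroids):
    """Replace every value by its nearest centroid (the reduced palette)."""
    return [min(centroids, key=lambda c: abs(v - c)) for v in values]


grey_levels = [10, 12, 14, 200, 205, 210]  # made-up pixel intensities
palette = kmeans_1d(grey_levels, centroids=[0, 255])
quantized = quantize(grey_levels, palette)
print(palette, quantized)
```

No labels are used anywhere: the grouping comes only from which values are similar.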
Reinforcement Learning
• Topics:
– Policies: what actions should an agent take in a particular situation
– Utility estimation: how good is a state (used by policy)
• Credit assignment problem (what was responsible for the outcome)
• Applications:
– Game playing
– Robot in a maze
– Multiple agents, partial observability, ...
Model Choice
– What type of classifier shall we use? How shall we select its parameters? Is there a best classifier...?
– How do we train...? How do we adjust the parameters of the model (classifier) we picked so that the model fits the data?
Features
• Features: a set of variables believed to carry discriminating and characterizing information about the objects under consideration
• Feature vector: A collection of d features, ordered in some meaningful way into a d-dimensional column vector, that represents the signature of the object to be identified.
• Feature space: The d-dimensional space in which the feature vectors lie. A d-dimensional vector in a d-dimensional space constitutes a point in that space.
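As a rough illustration of these definitions, the sketch below builds d = 2 feature vectors in a fixed order and treats them as points in feature space by measuring the distance between them. The feature names and measurements are hypothetical.

```python
import math

# The agreed order of the d features (hypothetical names, d = 2).
FEATURE_NAMES = ("length", "width")


def feature_vector(measurements):
    """Order the measured features into a d-dimensional vector."""
    return tuple(float(measurements[name]) for name in FEATURE_NAMES)


def euclidean_distance(a, b):
    """Distance between two points of the d-dimensional feature space."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


object_a = feature_vector({"width": 4.0, "length": 3.0})
object_b = feature_vector({"width": 0.0, "length": 0.0})
print(object_a, euclidean_distance(object_a, object_b))
```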
Features
• Feature choice
– Good Features
• Ideally, for a given group of patterns coming from the same class, feature values should all be similar
• For patterns coming from different classes, the feature values should be different.
– Bad Features
• irrelevant, noisy, outlier?
Classification
Learn a method for predicting the instance class
Clustering
Find “natural” grouping of instances given unlabeled data
Supervised Learning
[Figure: instances plotted against features x1 and x2]
Unsupervised Learning
[Figure: instances plotted against features x1 and x2]
Example Applications
Examples: Medicine
Examples: GIS
Examples: Industrial Inspection
Human operators are expensive, slow and unreliable
Make machines do the job instead
Examples: HCI
Try to make human-computer interfaces more natural
Gesture recognition
Facial Expression Recognition
Lip reading
Examples: Sign Language/Gesture Recognition
Examples: Facial Expression Recognition
Implicit customer feedback
Examples: Biometrics
Biometrics - Authentication techniques
Physiological Biometrics: Face, Iris, DNA, Fingerprints
Behavioral Biometrics: Typing Rhythm, Handwriting, Gait
Face Recognition
Test images
Examples: Biometrics – Face Recognition
Faces and Digital Cameras
Automatic lighting correction based on face detection
Examples: Biometrics – Fingerprint Recognition
Examples: Biometrics – Signature Verification
Examples: Robotics
Examples: Optical Character Recognition
Convert document image into text
License Plate Recognition
Example: Hand-written digits
Data representation: Greyscale images
Task: Classification (0, 1, 2, 3, ..., 9)
Problem features:
• Highly variable inputs from same class, including some “weird” inputs,
• imperfect human classification,
• high cost associated with errors, so “don’t know” may be useful.
A classic example of a task that requires machine learning: it is very hard to say what makes a 2
Safety and Security
Resources: Datasets
• UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html
• Statlib: http://lib.stat.cmu.edu/
• Delve: http://www.cs.utoronto.ca/~delve/
Resources: Journals
• Journal of Machine Learning Research, www.jmlr.org
• Machine Learning
• IEEE Transactions on Neural Networks
• IEEE Transactions on Pattern Analysis and Machine Intelligence
• Annals of Statistics
• Journal of the American Statistical Association
• ...
Resources: Conferences
• International Conference on Machine Learning (ICML)
• European Conference on Machine Learning (ECML)
• Neural Information Processing Systems (NIPS)
• Computational Learning
• International Joint Conference on Artificial Intelligence (IJCAI)
• ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
• IEEE Int. Conf. on Data Mining (ICDM)
Course Information
Books
Pattern Recognition & Machine Learning, 1st Edition, Chris Bishop
Pattern Classification, Duda, Hart & Stork
Introduction to Machine Learning, Alpaydin
Course Contents
Katydid or Grasshopper?
For any domain of interest, we can measure features:
• Color {Green, Brown, Gray, Other}
• Has Wings?
• Abdomen Length
• Thorax Length
• Antennae Length
• Mandible Size
• Spiracle Diameter
• Leg Length
My_Collection
We can store features in a database.
Insect ID   Abdomen Length   Antennae Length   Insect Class
1           2.7              5.5               Grasshopper
2           8.0              9.1               Katydid
3           0.9              4.7               Grasshopper
The classification problem: predict the Insect Class of a previously unseen instance.
[Figure: scatter plot of Antenna Length (y-axis, 1 to 10) against Abdomen Length (x-axis, 1 to 10)]
Pigeon Problem 1
Examples of class A (left bar, right bar): (3, 4), (1.5, 5), (6, 8), (2.5, 5)
Examples of class B (left bar, right bar): (5, 2.5), (5, 2), (8, 3), (4.5, 3)
Pigeon Problem 1
What class is this object?
Examples of class A (left bar, right bar): (3, 4), (1.5, 5), (6, 8), (2.5, 5)
Examples of class B (left bar, right bar): (5, 2.5), (5, 2), (8, 3), (4.5, 3)
Unlabeled objects (left bar, right bar): (8, 1.5), (4.5, 7)
(8, 1.5): This is a B!
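Pigeon Problem 2
Examples of class A (left bar, right bar): (4, 4), (5, 5), (6, 6), (3, 3)
Examples of class B (left bar, right bar): (5, 2.5), (2, 5), (5, 3), (2.5, 3)
Unlabeled objects (left bar, right bar): (8, 1.5), (7, 7)
(7, 7): So this one is an A.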
Pigeon Problem 3
Examples of class A (left bar, right bar): (4, 4), (1, 5), (6, 3), (3, 7)
Examples of class B (left bar, right bar): (5, 6), (7, 5), (4, 8), (7, 7)
Unlabeled object (left bar, right bar): (6, 6)
It is a B!
The rule is as follows: if the sum of the two bars is less than or equal to 10, it is an A. Otherwise it is a B.
Pigeon Problem 1
[Figure: the examples of class A and class B plotted as points; Left Bar (y-axis, 1 to 10) against Right Bar (x-axis, 1 to 10)]
Pigeon Problem 2
[Figure: the examples of class A and class B plotted as points; Left Bar (y-axis, 1 to 10) against Right Bar (x-axis, 1 to 10)]
Let me look it up... here it is: the rule is, if the two bars are equal sizes, it is an A. Otherwise it is a B.
Pigeon Problem 3
[Figure: the examples of class A and class B plotted as points; Left Bar (y-axis, 10 to 100) against Right Bar (x-axis, 10 to 100)]
The rule again: if the square of the sum of the two bars is less than or equal to 100, it is an A. Otherwise it is a B.
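The rules stated for Pigeon Problems 2 and 3 can be checked against the example data with a short sketch. The pairs are read off the slides as (left bar, right bar), which is an assumption about how the extracted numbers group; the function names are illustrative.

```python
def problem2_rule(left, right):
    """Pigeon Problem 2: equal bars are class A, otherwise class B."""
    return "A" if left == right else "B"


def problem3_rule(left, right):
    """Pigeon Problem 3: sum of the two bars <= 10 is class A, otherwise class B."""
    return "A" if left + right <= 10 else "B"


# Example pairs as (left bar, right bar), grouped by class label.
problem2 = {
    "A": [(4, 4), (5, 5), (6, 6), (3, 3)],
    "B": [(5, 2.5), (2, 5), (5, 3), (2.5, 3)],
}
problem3 = {
    "A": [(4, 4), (1, 5), (6, 3), (3, 7)],
    "B": [(5, 6), (7, 5), (4, 8), (7, 7)],
}


def agrees(rule, examples):
    """True if the rule reproduces the label of every example."""
    return all(
        rule(left, right) == label
        for label, pairs in examples.items()
        for left, right in pairs
    )


print(agrees(problem2_rule, problem2), problem2_rule(7, 7))
print(agrees(problem3_rule, problem3), problem3_rule(6, 6))
```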
[Figure: Grasshoppers and Katydids plotted by Antenna Length (y-axis, 1 to 10) against Abdomen Length (x-axis, 1 to 10)]
previously unseen instance: Insect ID 11, Abdomen Length 5.1, Antennae Length 7.0, Insect Class ???????
[Figure: Katydids and Grasshoppers plotted as points in space; x-axis: Abdomen Length (1 to 10)]
Simple Linear Classifier
R.A. Fisher (1890-1962)
If previously unseen instance above the line
then class is Katydid
else class is Grasshopper
[Figure: Katydids and Grasshoppers with the separating line; both axes run from 1 to 10]
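A minimal sketch of the "above the line" rule. The slope and intercept below are hypothetical values picked by hand so that the three My_Collection rows are classified correctly; the slides do not give the actual line, and a real classifier would learn it from the training data.

```python
# Hypothetical separating line: antenna_length = SLOPE * abdomen_length + INTERCEPT
SLOPE = -0.5
INTERCEPT = 9.0


def classify(abdomen_length, antenna_length):
    """Katydid if the point lies above the line, Grasshopper otherwise."""
    line_height = SLOPE * abdomen_length + INTERCEPT
    return "Katydid" if antenna_length > line_height else "Grasshopper"


# (abdomen length, antennae length, class) rows from My_Collection.
my_collection = [
    (2.7, 5.5, "Grasshopper"),
    (8.0, 9.1, "Katydid"),
    (0.9, 4.7, "Grasshopper"),
]

for abdomen, antenna, label in my_collection:
    print(abdomen, antenna, label, classify(abdomen, antenna))

# The previously unseen instance (5.1, 7.0); the answer depends on the assumed line.
print(classify(5.1, 7.0))
```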
Which of the “Pigeon Problems” can be solved by the Simple Linear Classifier?
1) Perfect
2) Useless
3) Pretty Good
Problems that can be solved by a linear classifier are called linearly separable.
[Figure: the three Pigeon Problems plotted; two panels with axes from 1 to 10 and one panel with axes from 10 to 100]
MULTI-CLASS CLASSIFICATION
A Famous Problem
R. A. Fisher’s Iris Dataset.
• 3 classes: Setosa, Versicolor, Virginica
• 50 of each class
The task is to classify Iris plants into one of 3 varieties using the Petal Length and Petal Width.
[Figure: Setosa, Versicolor and Virginica]
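Binary classification and multi-class classification
[Figure: two scatter plots in the x1-x2 plane, one for binary classification and one for multi-class classification]
One-vs-all (one-vs-rest):
[Figure: scatter plots in the x1-x2 plane, one for each of Class 1, Class 2 and Class 3]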
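One-vs-all can be sketched as follows: train one binary classifier per class (that class against the rest), then predict the class whose classifier gives the highest score. The perceptron used as the binary learner and the toy points are assumptions made for illustration; the slides do not prescribe either.

```python
def train_perceptron(points, targets, max_epochs=10_000):
    """Find w, b with w.x + b > 0 for targets +1 and < 0 for targets -1."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), t in zip(points, targets):
            if t * (w[0] * x1 + w[1] * x2 + b) <= 0:  # misclassified: update
                w[0] += t * x1
                w[1] += t * x2
                b += t
                mistakes += 1
        if mistakes == 0:  # every point is on the correct side
            break
    return w, b


def train_one_vs_rest(points, labels):
    """One binary classifier per class: that class (+1) against all others (-1)."""
    models = {}
    for cls in sorted(set(labels)):
        targets = [1 if y == cls else -1 for y in labels]
        models[cls] = train_perceptron(points, targets)
    return models


def predict(models, point):
    """Return the class whose classifier scores the point highest."""
    x1, x2 = point

    def score(cls):
        w, b = models[cls]
        return w[0] * x1 + w[1] * x2 + b

    return max(models, key=score)


# Toy data: three well-separated groups in the x1-x2 plane (made up).
points = [(1, 1), (1, 2), (2, 1), (8, 1), (9, 1), (8, 2), (4, 8), (5, 9), (5, 8)]
labels = [1, 1, 1, 2, 2, 2, 3, 3, 3]

models = train_one_vs_rest(points, labels)
print([predict(models, p) for p in points])
```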
SPLITTING OF TRAINING AND TEST DATA
Dividing Up Data
• We need independent data sets to train, set parameters, and test performance
• Thus we will often divide a data set into three
– Training set
– Parameter selection set
– Test set
• These must be independent
• The parameter selection set (data set 2) is not always necessary
Dataset
Inputs Labels
15 95 1
33 90 1
78 70 0
70 45 0
80 18 0
35 65 1
45 70 1
31 61 1
50 63 1
98 80 0
73 81 0
50 18 0
50:50 split
First half:
Inputs Labels
15 95 1
33 90 1
78 70 0
70 45 0
80 18 0
35 65 1
Second half:
Inputs Labels
45 70 1
31 61 1
50 63 1
98 80 0
73 81 0
50 18 0
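The 50:50 split of the twelve rows can be sketched like this. The split keeps the rows in their listed order, as on the slide; `split_in_half` is an illustrative helper, and in practice the rows are often shuffled before splitting.

```python
# (input 1, input 2, label) rows from the Dataset slide.
dataset = [
    (15, 95, 1), (33, 90, 1), (78, 70, 0), (70, 45, 0),
    (80, 18, 0), (35, 65, 1), (45, 70, 1), (31, 61, 1),
    (50, 63, 1), (98, 80, 0), (73, 81, 0), (50, 18, 0),
]


def split_in_half(rows):
    """Return the first and second half of the rows, keeping their order."""
    middle = len(rows) // 2
    return rows[:middle], rows[middle:]


first_half, second_half = split_in_half(dataset)
print(len(first_half), len(second_half))

# The two parts share no row positions, so one can train and the other can test.
```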
Confusion matrix (rows: Truth, columns: Prediction)
          Prediction 0   Prediction 1
Truth 0   TN             FP
Truth 1   FN             TP
Accuracy Measures
• Accuracy
– = (TP+TN)/(P+N)
• Sensitivity or true positive rate (TPR)
– = TP/(TP+FN) = TP/P
• Specificity or true negative rate (TNR)
– = TN/(FP+TN) = TN/N
• Positive predictive value (PPV) or precision
– = TP/(TP+FP)
• Recall (same as sensitivity)
– = TP/(TP+FN)
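The measures can be computed directly from the four confusion-matrix counts. The counts used below are made-up numbers for illustration, and `accuracy_measures` is a hypothetical helper name.

```python
def accuracy_measures(tp, tn, fp, fn):
    """Compute the listed measures from confusion-matrix counts."""
    p = tp + fn  # number of actual positives
    n = tn + fp  # number of actual negatives
    return {
        "accuracy": (tp + tn) / (p + n),
        "sensitivity": tp / p,      # true positive rate, same as recall
        "specificity": tn / n,      # true negative rate
        "precision": tp / (tp + fp),  # positive predictive value
        "recall": tp / (tp + fn),
    }


# Hypothetical counts: 40 TP, 45 TN, 5 FP, 10 FN.
measures = accuracy_measures(tp=40, tn=45, fp=5, fn=10)
for name, value in measures.items():
    print(name, round(value, 3))
```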
Acknowledgements
Material in these slides has been taken from the following:
Pattern Recognition & Machine Learning, 1st Edition, Chris Bishop
Introduction to Machine Learning, Alpaydin
Statistical Pattern Recognition: A Review – A.K. Jain et al., PAMI (22) 2000
Pattern Recognition and Analysis Course – A.K. Jain, MSU