
Introduction to Pattern Recognition
CS 650: Computer Vision

Introduction

What is Pattern Recognition?

Pattern recognition tries to answer one basic question: what is it?

Examples:
- What object is this, based on its shape and position?
- What kind of pixel is this, based on local image properties?

Features and Patterns

- We can't really classify things directly
- We have to classify descriptions of things

Terminology:
- Things we can measure are called features
- A collection of features is called a pattern

Features

- A feature has to be something you can quantify
  (it can be scalar, vector-valued, ordinal, a label, etc.)
- Examples: height, weight, age, gender, etc.
- Questions:
  - How many features?
  - Which ones?
  - What is their relative importance?

Classes

- Pattern recognition doesn't really try to produce general, open-ended descriptions
- It really asks, "which of these possibilities is it?"
- The finite set of possibilities are called classes
- The answer might be a specific yes/no (a two-class problem)
- The answer might be one of multiple options (e.g., OCR)

General Classification Process

- Extract features
- Assemble the features into a feature vector, or pattern
- Assign the pattern to a class
  - Might be a single classification
  - Might rank possible classifications
  - Might produce (relative) probabilities
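
A toy sketch of these three steps in Python; the feature names, prototype values, and the nearest-prototype rule used in the last step are hypothetical illustrations, not from the slides:

    import numpy as np

    def extract_features(measurements):
        """Steps 1-2: extract features and assemble them into a feature vector (pattern)."""
        return np.array([measurements["height_cm"], measurements["weight_kg"]])

    def assign_class(pattern, prototypes):
        """Step 3: assign the pattern to a class (here, simply the nearest prototype)."""
        distances = {c: np.linalg.norm(pattern - m) for c, m in prototypes.items()}
        return min(distances, key=distances.get)

    prototypes = {"child": np.array([120.0, 25.0]), "adult": np.array([170.0, 70.0])}
    pattern = extract_features({"height_cm": 165.0, "weight_kg": 60.0})
    print(assign_class(pattern, prototypes))  # -> "adult"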
Training

How do you make a computer learn?

Possibilities:
- Give it rules (some AI approaches, expert systems, etc.)
- Give it lots of examples (pattern recognition, machine learning, neural networks)

Giving it lots of examples is called training.

Training Sets

The set of training examples is called the training set.

Questions:
- How large a training set / how many examples?
- From what source?
- How general?
- How good?

Training vs. Testing

Pattern recognition systems and problems generally have two phases:
- Training phase (let it learn)
- Testing phase (put it to work)

Some systems may do on-the-fly (online) training if there is some way to get feedback:
- After classification, let the system know whether it got the answer right or wrong, so it can learn for the next time
- Example: OCR combined with proofreading

Supervised vs. Unsupervised

Supervised training: give the system example patterns along with their known classifications.

Unsupervised training: give the system example patterns and let it figure out the natural groupings on its own.

Approaches

- Statistical: training sets are samples from distributions; decisions are based on statistically optimal classification
- Structural:
  - Recognize the pieces and parts (may be a separate problem)
  - Recognize their configuration
- Direct: iterative refinement of rules until the system gets it right (includes perceptrons, SVMs, and neural approaches)

Feature Spaces, Prototypes, Minimum-Distance Classifiers

Patterns and Classes

A pattern is a vector of measured features:

    x = (x_1, x_2, \dots, x_n)^T

Our goal is to assign each unclassified pattern x to one of a set of classes {ω_i}.
Feature Spaces

We can think of each pattern as a point in a feature space.

Key ideas:
- Patterns from the same class should cluster together in feature space
- Supervised training: learn the properties of the cluster (distribution) for each class
- Unsupervised training: find the clusters from scratch
Minimum-Distance Classifier

- Idea: use a single prototype for each class ω_i (usually the class's mean m_i)
- Training: just calculate each class's prototype (mean)
- Classification: assign each unlabeled pattern to the nearest prototype in the feature space
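
A minimal sketch of this classifier, assuming NumPy; the array names and example values are hypothetical:

    import numpy as np

    def train_min_distance(patterns, labels):
        """Training: compute each class's prototype, the mean of its training patterns."""
        classes = np.unique(labels)
        prototypes = np.stack([patterns[labels == c].mean(axis=0) for c in classes])
        return classes, prototypes

    def classify_min_distance(x, classes, prototypes):
        """Classification: assign x to the class whose prototype is nearest."""
        distances = np.linalg.norm(prototypes - x, axis=1)
        return classes[np.argmin(distances)]

    # Hypothetical two-class example in a 2-D feature space.
    patterns = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [4.8, 5.3]])
    labels = np.array([0, 0, 1, 1])
    classes, prototypes = train_min_distance(patterns, labels)
    print(classify_min_distance(np.array([4.5, 4.9]), classes, prototypes))  # -> 1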

Decision Boundaries

If we partition the feature space according to the nearest prototype, we create decision boundaries.

Simpler Calculation

Key ideas:
- We don't really need to know the distances to the prototypes; we just want to know which one is closest.
- Any monotonic function of the distance will also do: scalar multiplication, log, exponent, reciprocal, negation, ...

Minimizing ||x - m_i|| is the same as minimizing

    ||x - m_i||^2 = (x - m_i)^T (x - m_i)
                  = x^T x - 2 x^T m_i + m_i^T m_i,

which (since x^T x is the same for every class) is the same as minimizing

    -2 x^T m_i + m_i^T m_i,

which is the same as maximizing

    x^T m_i - (1/2) m_i^T m_i.

Linear Decision Boundaries

Two-class case:

    g_1(x) = x^T m_1 - (1/2) m_1^T m_1
    g_2(x) = x^T m_2 - (1/2) m_2^T m_2

Create a single decision function:

    g(x) = g_1(x) - g_2(x)
         = (x^T m_1 - (1/2) m_1^T m_1) - (x^T m_2 - (1/2) m_2^T m_2)
         = x^T (m_1 - m_2) - (1/2) (m_1^T m_1 - m_2^T m_2)
This is a linear function of x:

    g(x) = w^T x + w_0

where

    w   = m_1 - m_2
    w_0 = -(1/2) (m_1^T m_1 - m_2^T m_2)

Decision rule:

    g(x) > 0  =>  assign x to ω_1
    g(x) < 0  =>  assign x to ω_2
    g(x) = 0  =>  undecided

Discriminants

In general, a function used to test class membership is called a discriminant.

Three approaches for multiple classes:
1. Construct a single discriminant for each class that separates ω_i from not-ω_i.
2. Construct a discriminant g_ij between each pair of classes ω_i and ω_j:

       g_ij(x) = g_i(x) - g_j(x)

3. Construct a single discriminant g_i(x) for each class ω_i, and assign x to class ω_i if g_i(x) > g_j(x) for all other classes j.
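
To make the algebra concrete, here is a small sketch, assuming NumPy, of the two-class linear decision function and of approach 3's argmax rule; the prototype and pattern values are made up for illustration:

    import numpy as np

    def g_i(x, m):
        """Linear discriminant for one class: g_i(x) = x^T m_i - (1/2) m_i^T m_i."""
        return x @ m - 0.5 * (m @ m)

    # Two-class case: g(x) = w^T x + w_0 with w = m_1 - m_2.
    m1, m2 = np.array([1.0, 1.0]), np.array([5.0, 5.0])
    w = m1 - m2
    w0 = -0.5 * (m1 @ m1 - m2 @ m2)

    x = np.array([2.0, 1.5])
    g = w @ x + w0
    print("omega_1" if g > 0 else "omega_2" if g < 0 else "undecided")

    # Approach 3 for multiple classes: assign x to the class with the largest g_i(x).
    prototypes = [m1, m2, np.array([0.0, 6.0])]  # hypothetical third class mean
    scores = [g_i(x, m) for m in prototypes]
    print("assign to class", int(np.argmax(scores)) + 1)

Because g(x) > 0 exactly when x is nearer m_1 than m_2, this linear test gives the same answer as comparing the two distances directly.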

Unsupervised Training

Motivation:
- Useful when you don't have a pre-labeled training set
- More closely models neural organization

Limitations:
- Can't classify with labels (since none were learned)
- Can't handle complicated distributions

Unsupervised training is also called clustering.

k-means

Goal: find natural groupings of patterns.

- Minimum-distance classifiers assign patterns to the nearest prototype
- Each class's prototype should be at the mean of the class's training patterns

So...
- Assign patterns to the nearest prototype
- Update each prototype to be the mean of the patterns assigned to it
- Repeat until convergence

k-means

Requires:
- the number of classes k
- minimum-distance classification

Algorithm (a Python sketch follows below):

    Start with initial guesses at the class prototypes (means)
    Repeat
        Assign each pattern to the nearest prototype m_i
        Update each cluster's prototype m_i to be the mean of the patterns assigned to it
    until convergence or a maximum number of iterations
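
A minimal sketch of this loop, assuming NumPy; the seeding strategy (sampling k training patterns) and the empty-cluster handling are illustrative choices, echoing the questions listed next:

    import numpy as np

    def k_means(patterns, k, max_iters=100, seed=0):
        """Cluster patterns by alternating nearest-prototype assignment and mean updates."""
        rng = np.random.default_rng(seed)
        # Seed: use k randomly chosen training patterns as the initial prototypes.
        prototypes = patterns[rng.choice(len(patterns), size=k, replace=False)]
        for _ in range(max_iters):
            # Assign each pattern to the nearest prototype m_i (minimum-distance rule).
            dists = np.linalg.norm(patterns[:, None, :] - prototypes[None, :, :], axis=2)
            assignment = np.argmin(dists, axis=1)
            # Update each cluster's prototype to the mean of its assigned patterns;
            # keep the old prototype if the cluster is empty (the "zero-element" case below).
            new_prototypes = np.array(
                [patterns[assignment == i].mean(axis=0) if np.any(assignment == i)
                 else prototypes[i] for i in range(k)])
            if np.allclose(new_prototypes, prototypes):  # converged
                break
            prototypes = new_prototypes
        return prototypes, assignment

    # Hypothetical data: two obvious groupings in a 2-D feature space.
    pts = np.array([[0.0, 0.1], [0.2, -0.1], [5.0, 5.1], [5.2, 4.9]])
    protos, labels = k_means(pts, k=2)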
Things to consider:
- How do you know the number of classes?
- How do you seed the initial prototypes?
- Zero-element clusters (jump to an arbitrary new prototype and restart?)
- How good is the final clustering? (juggle, restart, and see if the result is better)
- Retry with more or fewer clusters?

Other Unsupervised Training Approaches

Alternatives:
- Self-Organizing Maps / Learning Vector Quantization
  - Like k-means, but with more gradual updating of the prototypes
- Hierarchical (splitting)
  - Top-down splitting of clusters until the result is good enough
- Hierarchical (merging)
  - Bottom-up merging of clusters until the result is good enough
- Cluster swapping
  - Moving patterns from one cluster to another if a nearer one exists (like k-means); usually integrated into splitting or merging approaches
- Mixture modelling

Mixture Modelling

Requires:
- the number of classes k
- the form of the distributions p(x|ω_i)
- the fraction of the training set comprised by each class

Idea: for each possible set of parameters for the distributions, ask how well the weighted sum (i.e., the mixture) of the distributions matches the histogram of the training set.

- Strength: handles all parameters and distributions
- Weakness: complicated and not always solvable
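
One standard way to search for those parameters is expectation-maximization (EM); the slides don't prescribe a fitting method, so this 1-D Gaussian-mixture sketch (NumPy assumed) is just one illustrative option:

    import numpy as np

    def fit_gaussian_mixture_1d(x, k=2, iters=50):
        """Fit a k-component 1-D Gaussian mixture with expectation-maximization."""
        n = len(x)
        fractions = np.full(k, 1.0 / k)                   # fraction of the data per class
        means = np.quantile(x, np.linspace(0.1, 0.9, k))  # crude, spread-out initial means
        variances = np.full(k, np.var(x))
        for _ in range(iters):
            # E-step: how responsible is each component for each sample?
            densities = (np.exp(-0.5 * (x[:, None] - means) ** 2 / variances)
                         / np.sqrt(2 * np.pi * variances))        # shape (n, k)
            weighted = fractions * densities
            resp = weighted / weighted.sum(axis=1, keepdims=True)
            # M-step: re-estimate fractions, means, and variances from responsibilities.
            totals = resp.sum(axis=0)
            fractions = totals / n
            means = (resp * x[:, None]).sum(axis=0) / totals
            variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / totals
        return fractions, means, variances

    # Hypothetical data drawn from two Gaussians.
    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(5.0, 1.0, 100)])
    print(fit_gaussian_mixture_1d(x, k=2))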
