Introduction
How do you make a computer learn?

Possibilities:
- Give it rules (some AI approaches, expert systems, etc.)
- Give it lots of examples (pattern recognition, machine learning, neural networks)

Giving it lots of examples is called training. The set of training examples is called the training set.

Questions:
- How large / how many?
- Source?
- Generality?
- How good?

Supervised training: after classification, let the system know whether it got it right or wrong, so it can learn for the next time.
- Example: OCR/proofreading
- Some systems may do on-the-fly (online) training if there is some way to get feedback

Unsupervised training: give the system example patterns and let it figure out natural groupings.
Key ideas:
- Patterns from the same class should cluster together in feature space
- Supervised training: learn the properties of the cluster (distribution) for each class
- Unsupervised training: find the clusters from scratch
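The supervised case above can be sketched in a few lines: learn one prototype (the mean) per labeled class, then classify by nearest prototype. The data, labels, and function names here are made up for illustration; they are not from the slides.

```python
# Supervised training for a minimum-distance classifier:
# learn one prototype (the class mean) from labeled examples,
# then classify new patterns by nearest prototype.

def class_means(labeled_patterns):
    """labeled_patterns: list of (label, (x, y)) pairs."""
    sums, counts = {}, {}
    for label, (x, y) in labeled_patterns:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {lab: (sx / counts[lab], sy / counts[lab])
            for lab, (sx, sy) in sums.items()}

def classify(pattern, prototypes):
    """Assign a pattern to the class with the nearest prototype."""
    x, y = pattern
    return min(prototypes,
               key=lambda lab: (x - prototypes[lab][0]) ** 2
                             + (y - prototypes[lab][1]) ** 2)

training = [("a", (0.0, 0.0)), ("a", (1.0, 0.0)),
            ("b", (5.0, 5.0)), ("b", (6.0, 5.0))]
protos = class_means(training)       # a -> (0.5, 0.0), b -> (5.5, 5.0)
print(classify((0.8, 0.2), protos))  # -> a
print(classify((5.2, 4.9), protos))  # -> b
```

Unsupervised training must recover comparable prototypes without the labels, which is what the clustering procedures below do.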
Motivation:
- Useful when you don't have a pre-labeled training set
- More closely models neural organization
- Can't:
  - classify with labels (since they weren't learned)
  - handle complicated distributions

Clustering

Goal: find natural groupings of patterns
- Minimum-distance classifiers assign patterns to the nearest prototype
- Each class's prototype should be at the mean of the class's training patterns

So...
- Assign patterns to the nearest prototype
- Update each prototype to be the mean of the patterns assigned to it
- Repeat until convergence
k-means

Requires:
- number of classes k
- minimum-distance classification

Algorithm:
    Start with initial guesses at class prototypes (means)
    Repeat
        Assign each pattern to the nearest prototype m_i
        Update each cluster's prototype m_i to be the mean of the patterns assigned to it
    until convergence or maximum number of iterations

Things to consider:
- How do you know the number of classes?
- How do you seed the initial prototypes?
- Zero-element clusters (jump to an arbitrary new prototype and restart?)
- How good is the final clustering? (juggle, restart, and see if better)
- Retry with more/fewer clusters?
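The algorithm above can be sketched directly in code. This is a minimal illustration, not a reference implementation: it assumes 2-D patterns, seeds prototypes by sampling k training patterns (one of several possible seeding strategies), and handles a zero-element cluster by re-seeding it at a random pattern, as suggested in the considerations above.

```python
import random

def k_means(patterns, k, max_iters=100, seed=0):
    """Plain k-means on a list of (x, y) points.
    Returns the final prototypes and the clusters assigned to them."""
    rng = random.Random(seed)
    protos = rng.sample(patterns, k)  # seed with k training patterns
    clusters = [[] for _ in range(k)]
    for _ in range(max_iters):
        # Assignment step: each pattern goes to its nearest prototype.
        clusters = [[] for _ in range(k)]
        for p in patterns:
            i = min(range(k),
                    key=lambda j: (p[0] - protos[j][0]) ** 2
                                + (p[1] - protos[j][1]) ** 2)
            clusters[i].append(p)
        # Update step: move each prototype to the mean of its cluster.
        # A zero-element cluster jumps to an arbitrary new prototype.
        new_protos = []
        for cluster in clusters:
            if cluster:
                n = len(cluster)
                new_protos.append((sum(p[0] for p in cluster) / n,
                                   sum(p[1] for p in cluster) / n))
            else:
                new_protos.append(rng.choice(patterns))
        if new_protos == protos:  # convergence: prototypes stopped moving
            break
        protos = new_protos
    return protos, clusters

pts = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
protos, clusters = k_means(pts, k=2)
# With two well-separated groups, the prototypes end up at the group means.
```

Different seeds can converge to different clusterings, which is why the restart-and-compare strategies above matter in practice.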
Introduction to Pattern Recognition
Unsupervised Training
- Hierarchical (merging): bottom-up merging of clusters until good enough
- Cluster swapping: moving patterns from one cluster to another if nearer (like k-means); usually integrated into splitting or merging approaches
- Mixture modelling
  - Idea: for each possible set of parameters for the distributions, how well does the weighted sum (i.e., mixture) of their distributions match the histogram of the training set?
  - Strength: handles all parameters and distributions
  - Weakness: complicated and not always solvable
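The parameter search that mixture modelling requires is commonly done with expectation-maximization (EM), which fits the mixture by maximizing likelihood rather than literally matching a histogram; the slides do not name a method, so EM here is an assumption. A toy 1-D sketch with two Gaussian components on synthetic data:

```python
import math, random

def em_gmm_1d(xs, iters=50):
    """Toy EM fit of a two-component 1-D Gaussian mixture.
    Returns (weights, means, stdevs). Initialization is ad hoc."""
    w = [0.5, 0.5]
    mu = [min(xs), max(xs)]  # crude but safe initial guesses
    sd = [1.0, 1.0]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w[k] / (sd[k] * math.sqrt(2 * math.pi))
                 * math.exp(-((x - mu[k]) ** 2) / (2 * sd[k] ** 2))
                 for k in range(2)]
            s = p[0] + p[1]
            resp.append([p[0] / s, p[1] / s])
        # M-step: re-estimate weights, means, stdevs from responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = math.sqrt(max(var, 1e-6))
    return w, mu, sd

rng = random.Random(1)
data = [rng.gauss(0.0, 0.5) for _ in range(200)] + \
       [rng.gauss(4.0, 0.5) for _ in range(200)]
w, mu, sd = em_gmm_1d(data)
# The fitted means land near the true component means (0 and 4).
```

The weakness noted above shows up here too: EM only finds a local optimum, so like k-means it may need restarts from different initializations.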