
Lazy Learning

(Or Learning from Your Neighbors)


Lazy Learning vs. Eager Learning
– Eager learning
 Given a training set, constructs a classification model before receiving new
(e.g., test) data to classify
 e.g. decision tree induction, Bayesian classification, rule-based classification
– Lazy learning
 Simply stores the training data (or does only minor processing) and waits until it is
given a new instance
 Lazy learners take less time in training but more time in predicting
 e.g., k-nearest-neighbor classifiers, case-based reasoning classifiers
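The contrast above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names are hypothetical, and the eager side uses a nearest-centroid model as a short stand-in for the eager methods listed above (it is not one of them). The point is where the work happens: in fit for the eager learner, in predict for the lazy one.

```python
import math
from collections import defaultdict


class EagerCentroidClassifier:
    """Eager (illustrative stand-in): builds one mean point per class in fit."""

    def fit(self, points, labels):
        grouped = defaultdict(list)
        for point, label in zip(points, labels):
            grouped[label].append(point)
        # All the work happens here; the raw training data is not kept.
        self.centroids = {
            label: tuple(sum(axis) / len(axis) for axis in zip(*members))
            for label, members in grouped.items()
        }
        return self

    def predict(self, query):
        # Prediction only compares against one centroid per class.
        return min(self.centroids,
                   key=lambda label: math.dist(query, self.centroids[label]))


class LazyNearestNeighborClassifier:
    """Lazy: fit only stores the data; the work is deferred to predict."""

    def fit(self, points, labels):
        self.points = list(points)
        self.labels = list(labels)
        return self

    def predict(self, query):
        # Every stored training point is scanned for each new instance.
        nearest = min(range(len(self.points)),
                      key=lambda i: math.dist(query, self.points[i]))
        return self.labels[nearest]


# Made-up data for illustration only.
points = [(1.0, 1.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
labels = ["a", "a", "b", "b"]
eager = EagerCentroidClassifier().fit(points, labels)
lazy = LazyNearestNeighborClassifier().fit(points, labels)
print(eager.predict((1.5, 2.0)), lazy.predict((1.5, 2.0)))
```

Both classifiers agree on this toy query. The difference is that the lazy one keeps all training points and scans them at prediction time, which is why lazy learners are cheap to train and more expensive to query.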

Typical approaches of lazy learning:

–k-nearest neighbor approach

 Instances represented as points in a Euclidean space.

–Case-based reasoning

 Uses symbolic representations and knowledge-based inference

–Locally weighted regression

k-Nearest-Neighbor Method

–First described in the early 1950s

–It has since been widely used in the area of pattern recognition.

–The training instances are described by n attributes.

–Each instance represents a point in an n-dimensional space.

–A k-nearest-neighbor classifier searches the pattern space for the k training instances that are
closest to the unknown instance.

 Requires 3 things:
o Feature Space (Training Data)
o Distance metric
 to compute distance between records
o The value of k
 the number of nearest neighbors to retrieve, from which the majority class
is taken
 To classify an unknown record:
o Compute the distance to every training record
o Identify k nearest neighbors
o Use class labels of nearest neighbors to determine the class label of unknown
record
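The three classification steps can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the function name knn_classify, the (point, label) record layout, and the sample data are assumptions made for illustration, and Euclidean distance is used as the metric.

```python
import math
from collections import Counter


def knn_classify(training, query, k):
    """Classify `query` by majority vote among its k nearest training records.

    `training` is a list of (point, label) pairs; points are numeric tuples.
    """
    # Step 1: compute the distance from the query to every training record.
    distances = [(math.dist(point, query), label) for point, label in training]
    # Step 2: identify the k nearest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: use the neighbors' class labels; the majority class wins.
    # (On a tied vote, Counter returns the label that was counted first.)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up training data for illustration only.
training = [
    ((1, 1), "square"),
    ((2, 1), "square"),
    ((6, 6), "triangle"),
    ((7, 6), "triangle"),
    ((6, 7), "triangle"),
]
print(knn_classify(training, (2, 2), k=3))
```

For the query (2, 2) with k = 3, the neighborhood holds two squares and one triangle, so the vote goes to the square class.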

 Example (the unknown instance is marked "?" in the figure):

o k = 1: belongs to the square class

o k = 3: belongs to the triangle class

o k = 7: belongs to the square class

 Choosing the value of k:

o If k is too small, the classifier is sensitive to noise points

o If k is too large, the neighborhood may include points from other classes

o Choose an odd value for k to avoid ties (this guarantees a majority only when there are two classes)
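The effect of k on the vote can be seen on a small made-up data set. The sketch below is illustrative only: the helper neighbor_votes and the points are assumptions, chosen so that one stray triangle sits next to the query, three squares surround it, and a larger triangle cluster lies far away.

```python
import math
from collections import Counter


def neighbor_votes(training, query, k):
    """Return the class counts among the k training records nearest to `query`."""
    ranked = sorted(training, key=lambda record: math.dist(record[0], query))
    return Counter(label for _, label in ranked[:k])


# Made-up data for illustration only.
training = [
    ((5.0, 5.5), "triangle"),  # noise point right next to the query
    ((4.0, 5.0), "square"),
    ((5.0, 4.0), "square"),
    ((6.0, 5.0), "square"),
    ((9.0, 9.0), "triangle"),
    ((9.0, 10.0), "triangle"),
    ((10.0, 9.0), "triangle"),
    ((10.0, 10.0), "triangle"),
]
query = (5.0, 5.0)

for k in (1, 2, 3, 8):
    print(k, dict(neighbor_votes(training, query, k)))
```

With k = 1 the noise point alone decides (triangle). With k = 2 the vote is tied, one triangle against one square. With k = 3 the local majority is square. With k = 8 the neighborhood reaches the distant triangle cluster and the vote flips back to triangle, 5 to 3.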

 Common Distance Metrics:

o Euclidean distance (for continuous attributes)
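For reference, the Euclidean distance between two instances described by n attributes can be written as below. The symbols x, y, and d are notation chosen here for illustration, with x = (x_1, ..., x_n) and y = (y_1, ..., y_n).

```latex
d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
```

With n = 2 this is the ordinary straight-line distance between two points in the plane.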

Examples
Name     Acid durability   Strength   Class
Type-1   7                 7          Bad
Type-2   7                 4          Bad
Type-3   3                 4          Good
Type-4   1                 4          Good
Test data: acid durability = 3, strength = 7, class = ?
 The distance from each training record to the test point is calculated using the Euclidean distance

Name     Acid durability   Strength   Class   Distance
Type-1   7                 7          Bad     sqrt((7-3)^2 + (7-7)^2) = 4
Type-2   7                 4          Bad     sqrt((7-3)^2 + (4-7)^2) = 5
Type-3   3                 4          Good    sqrt((3-3)^2 + (4-7)^2) = 3
Type-4   1                 4          Good    sqrt((1-3)^2 + (4-7)^2) = 3.6 (approx.)
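The distance column can be checked with a few lines of Python. This is a small sketch using the four training records and the test point from the example; the variable names are chosen here for illustration.

```python
import math

# (acid durability, strength) for each training record in the example.
training = {
    "Type-1": (7, 7),
    "Type-2": (7, 4),
    "Type-3": (3, 4),
    "Type-4": (1, 4),
}
query = (3, 7)  # test data: acid durability = 3, strength = 7

# Euclidean distance from each training record to the test point.
distances = {name: math.dist(point, query) for name, point in training.items()}

for name, distance in distances.items():
    print(name, round(distance, 1))
```

Rounded to one decimal place, the values match the table: 4, 5, 3, and 3.6.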

Name     Acid durability   Strength   Class   Distance   Rank
Type-1   7                 7          Bad     4          3
Type-2   7                 4          Bad     5          4
Type-3   3                 4          Good    3          1
Type-4   1                 4          Good    3.6        2
K = 1
Based on the nearest neighbor (Type-3, rank 1): Good
K = 2
Based on the two nearest neighbors (Type-3 and Type-4), both Good: Good
K = 3
Based on the three nearest neighbors (Type-3, Type-4, and Type-1): 2 Good and 1 Bad, majority: Good
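The ranking and the three votes of this example can be reproduced with the sketch below. It uses the data from the example; the helper name predict and the record layout are assumptions made for illustration.

```python
import math
from collections import Counter

# (name, (acid durability, strength), class) for each training record.
records = [
    ("Type-1", (7, 7), "Bad"),
    ("Type-2", (7, 4), "Bad"),
    ("Type-3", (3, 4), "Good"),
    ("Type-4", (1, 4), "Good"),
]
query = (3, 7)

# Rank the training records by distance to the test point (closest first).
ranked = sorted(records, key=lambda record: math.dist(record[1], query))


def predict(k):
    """Majority class among the k closest records."""
    votes = Counter(label for _, _, label in ranked[:k])
    return votes.most_common(1)[0][0]


for k in (1, 2, 3):
    print(k, [name for name, _, _ in ranked[:k]], predict(k))
```

The ranking comes out as Type-3, Type-4, Type-1, Type-2, and the predicted class is Good for k = 1, 2, and 3.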
