Академический Документы
Профессиональный Документы
Культура Документы
Machine Learning
Clustering
Lt Suryaprakash Kompalli
Senior Mentor, INSOFE
• Targeted marketing
– It is not uncommon to have over 100,000 segments in insurance
clustering
At each iteration, pick two data points that have least distance between them. Add the points
into a cluster.
Averaging may also be used instead of taking distance to the closest point.
BOS/NY/
DC/CHI/
MIA
DEN/SF/
LA/SEA
SF/LA/SEA
BOS/NY/
SF/LA SEA DC/CHI/
DEN
BOS/NY/
SF LA DEN
DC/CHI
•
X (3,1,4) X
(4,3,4)
(2,3,7)
Y Y Y
(4,0,6) (7,1,6)
Data: (3, 1, 4), (2, 3, 7), (4, 3, 4), (2, 1, 3), (2, 4, 5), (6,
1, 4), (5, 2, 5), (1, 4,4), (0, 5, 7), (4, 0, 6), (7, 1, 6)
The best place for students to learn Applied Engineering 21 http://www.insofe.edu.in
KD-Tree Clustering
From left to right: Original Image, Red: Region of interest selected by user,
Green: Known foreground object, Intermediate result, Final result
A color signature is a set of representative colors,not
necessarily a sub set of the input colors.
The set of background samples from any image
typically consists of a few hundreds of thousands of
colors.
Clustering reduces the background sample to its
representative colors, usually about a few hundreds.
Nice
UNDERSTANDING
DISTANCE
2 , 𝑦2)
(𝑥
Eucledian
| 𝑦 2 − 𝑦 1|
𝑥 2 − 𝑥 1|
|
1 , 𝑦1)
(𝑥 Manhattan
b c *m W* W* *m
dist( xi , x j )
a b c d
Weighted Hamming Distance Weighted Jaccard Distance
bc
dist( xi , x j )
a b c
Distance between different ordinals may not be same !!! On the right, distance
between state 1 and state 2 is much less than distance between state 1 and state 3.
E.g.: If ordinals represent education: class 5, Class 10, Degree. Distance between
class 5 and class 10 may be considered less than distance between class 10 and
degree
(-1,1) (1,1)
x x
(-1,-1) (1,-1)
Plot 63/A, 1st Floor, Road # 13, Film Nagar, Jubilee Hills, Hyderabad - 500 033
This presentation may contain references to findings of various reports available in the public domain. INSOFE makes no representation as to their accuracy or that the organization
subscribes to those findings.