Cluster Analysis
Copyright © Ivy Professional School - All Rights Reserved
Euclidean Distance
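As a small illustration, the Euclidean distance between two points is the square root of the sum of squared coordinate differences. The sketch below assumes points are given as plain tuples of numbers; the function name `euclidean_distance` is chosen here for illustration only.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Square each coordinate difference, add them up, take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# A 3-4-5 right triangle: the distance from (0, 0) to (3, 4) is 5.0.
d = euclidean_distance((0, 0), (3, 4))
```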
K-Means clustering – How it works
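Fix the number of clusters (e.g. 2) and assign initial centroids.
Measure the distance from each point to each centroid and assign each point to its nearest centroid to create clusters.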
Take the average of each group as its new centroid
Reassign the data points to clusters using the new centroids
Keep repeating the process till the
centroids don’t change anymore.
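The steps on the preceding slides can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: it assumes the initial centroids are picked at random from the data points (the slides do not say how they are chosen), and the function name `k_means` and its parameters are chosen here for illustration.

```python
import numpy as np

def nearest_centroid(points, centroids):
    """Index of the closest centroid (Euclidean distance) for every point."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def k_means(points, k=2, max_iter=100, seed=0):
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Fix the number of clusters and pick k distinct data points as
    # starting centroids (assumption: random initialisation).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Measure distances and assign each point to its nearest centroid.
        labels = nearest_centroid(points, centroids)
        # The average of each group becomes the new centroid
        # (an empty group keeps its previous centroid).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer change.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the final centroids.
    return nearest_centroid(points, centroids), centroids

data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = k_means(data, k=2)
```

Because the starting centroids are random, different seeds can give different label numbers (and, on harder data, different clusters), so it is common to run the procedure several times.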
Check if the data can be clustered
Hopkins Test
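One way to compute a Hopkins statistic is sketched below. It is only an illustration under stated assumptions: it uses the convention H = sum(u) / (sum(u) + sum(w)), where u are nearest-neighbour distances from uniformly random points to the data and w are nearest-neighbour distances between sampled data points. Under this convention values near 0.5 suggest no cluster tendency and values near 1 suggest clusterable data; some references use the opposite convention or raise the distances to a power. The function name and the default sample size are hypothetical.

```python
import numpy as np

def hopkins_statistic(data, sample_size=None, seed=0):
    data = np.asarray(data, dtype=float)
    n, dim = data.shape
    m = sample_size or max(1, n // 10)  # assumed default: 10% of the rows
    rng = np.random.default_rng(seed)

    # u: distance from each uniformly random point (drawn inside the data's
    # bounding box) to its nearest real data point.
    uniform = rng.uniform(data.min(axis=0), data.max(axis=0), size=(m, dim))
    u = np.linalg.norm(uniform[:, None, :] - data[None, :, :], axis=2).min(axis=1)

    # w: distance from each sampled real point to its nearest *other* real point.
    idx = rng.choice(n, size=m, replace=False)
    d = np.linalg.norm(data[idx][:, None, :] - data[None, :, :], axis=2)
    d[np.arange(m), idx] = np.inf  # ignore the distance of a point to itself
    w = d.min(axis=1)

    return u.sum() / (u.sum() + w.sum())
```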
How many clusters?
Goodness of Fit of the clusters
Internal Validation
1. Connectivity - the extent to which items are placed in the same cluster as their
nearest neighbors in the data space. It has a value between 0 and infinity and
should be minimized.
3. Dunn index - the ratio of the smallest distance between observations that are
not in the same cluster to the largest intra-cluster distance. It has a value
between 0 and infinity and should be maximized.
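The Dunn index as defined above can be computed directly from pairwise distances. The sketch below assumes Euclidean distance, at least two clusters, and at least one cluster with two distinct points; the function name `dunn_index` is chosen here for illustration.

```python
import numpy as np

def dunn_index(points, labels):
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # All pairwise Euclidean distances between observations.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    # same[i, j] is True when observations i and j share a cluster.
    same = labels[:, None] == labels[None, :]
    min_between = dists[~same].min()  # smallest distance across clusters
    max_within = dists[same].max()    # largest intra-cluster distance
    return min_between / max_within

# Two tight pairs that are 5 units apart: 5 / 1 = 5.0
score = dunn_index([[0, 0], [0, 1], [5, 0], [5, 1]], [0, 0, 1, 1])
```

A larger value means the clusters are compact relative to how far apart they are.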
Stability Validation
External Validation
Visit Ivy’s Blog for Career Tips, Latest Info, Job Alerts -
www.ivyproschool.com/blog