K-Means Cluster Analysis in SPSS


Assign cases to a fixed number of groups (clusters) whose characteristics are
not yet known but are based on a set of specified variables

Characteristics of a good cluster solution

Efficient: Uses as few clusters as possible

Effective: Captures all statistically and commercially important clusters


Construction of initial cluster centers

Have the procedure select k well-spaced observations for the cluster centers

Assign cases to clusters based on distance from the cluster centers

Update the locations of the cluster centers based on the mean values of the cases in
each cluster

These steps are repeated until any reassignment of cases would make the
clusters more internally variable or externally similar
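
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not SPSS's actual implementation: the farthest-first rule in pick_initial_centers is an assumed stand-in for "well-spaced" selection, the stopping rule is simplified to "no case changes cluster", and the function names and sample points are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_initial_centers(points, k):
    """Assumed heuristic: start from the first case, then repeatedly add
    the case that lies farthest from its nearest already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(distance(p, c) for c in centers))
        centers.append(nxt)
    return centers


def k_means(points, k, max_iter=100):
    centers = pick_initial_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each case joins its nearest cluster center.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # simplified stopping rule: no case changed cluster
        labels = new_labels
        # Update step: move each center to the mean of its member cases.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical data: three visibly separate pairs of cases.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (10.0, 0.0), (9.8, 0.2)]
labels, centers = k_means(points, k=3)
print(labels)
print(centers)
```

Each pair of neighbouring points ends up in its own cluster, and each final center is the mean of its pair.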

Case Question

- Classify the records based on K-Means Cluster Analysis using 3 clusters

Data for SPSS.sav / Health related sav

First, save standardized versions (Z-scores) of the selected variables, as follows…

Analyze / Descriptive Statistics / Descriptives / check "Save standardized values as variables"
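
To make the standardization step concrete, here is a small Python sketch of a Z-score calculation. It assumes the sample standard deviation (n - 1 denominator) is used; the z_scores function and the weights values are hypothetical and are not taken from the data file.

```python
import statistics


def z_scores(values):
    """Return (value - mean) / sample standard deviation for each value."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD, n - 1 denominator
    return [(v - mean) / sd for v in values]


# Hypothetical variable with mean 70 and sample SD 10.
weights = [60.0, 70.0, 80.0]
z = z_scores(weights)
print(z)
```

Standardizing puts every variable on the same scale, so a variable measured in large units does not dominate the distance calculation.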

Now run K-Means cluster analysis…

Analyze / Classify / K-Means Cluster / select all the standardized variables …, and specify
the number of clusters (3 for this case)

In the output, under Initial Cluster Centers, each value is a Z-score…
In the ANOVA table, the higher a variable's F value, the greater its contribution to separating the clusters
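
As a rough illustration of how an F value relates to cluster separation, the sketch below computes a one-way ANOVA F statistic (between-cluster mean square divided by within-cluster mean square) for one variable at a time. The f_value function, the cluster labels, and the two variables are hypothetical; it does not handle the degenerate case of zero within-cluster variation.

```python
def f_value(values, labels):
    """One-way ANOVA F for a single variable, grouped by cluster label."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Between-cluster sum of squares: how far cluster means lie from the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Within-cluster sum of squares: spread of cases around their own cluster mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical cluster membership and two hypothetical variables.
labels = [0, 0, 1, 1, 2, 2]
var_a = [1.0, 2.0, 5.0, 6.0, 9.0, 10.0]  # cluster means far apart
var_b = [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]   # cluster means close together
f_a = f_value(var_a, labels)
f_b = f_value(var_b, labels)
print(f_a, f_b)
```

Here var_a gets a much larger F than var_b because its cluster means differ widely relative to the spread inside each cluster.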