
Compact Clustering of Multidimensional Vectors

An algorithm for storing multidimensional temporal and spatial vectors.

Copyright © 2008 Douglas D. Hammer, Principal Investigator

clearlineofsight@gmail.com

Introduction

This paper explores the process of partitioning multidimensional vectors into clusters that
represent both spatial and temporal relations. The process extends k-means clustering to a root
k-means (rK-means), in which each cluster organizes multiple time scales in relation to their
spatial orientation. The dimension, or length, of a vector establishes a radial distance metric
across multiple time frames, while its unit vector, or direction, establishes its principal
direction or axis of rotation. In the clustering process, k clusters are chosen to represent
multiple time scales, and the centroid, or root mean, distance is measured from each cluster's
closest vectors relative to the cluster's maximum dimensional root mean distance. This partitions
vectors by similar time frame as well as by spatial axis of direction. As an illustration, in the
cluster map presented on the front page, nine yellow vectors share equal event horizons. Near
this cluster, two other vectors occupy a similar time horizon but have root mean distances with a
greater angle of separation, or variance of direction, which places them in the larger cluster.

Vector Normalization and Direction

In order to place vectors on a two-dimensional plane, vector sums and normals are calculated to
find each vector's unit vector. Normals are generally calculated using the square root of the sum
of squares. The reason for raising the vector to its dimensional power and taking this root can be
observed when finding differences of vectors in the range [-1, +1]. The equation below is the
vector sum:
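
As a hedged reading of the sentence above, and assuming a vector v with n components v_1, ..., v_n (notation introduced here only for illustration), the vector sum can be written as:

```latex
s(\mathbf{v}) = \sum_{i=1}^{n} v_i
```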

Our next equation calculates the vector normal, or magnitude. Finding the vector normal is vital
in the calculation of vector products and of angles between different vectors.
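
Assuming a vector v with n components v_1, ..., v_n, and taking the normal to be the usual Euclidean one (the square root of the sum of squared components, which is an assumption about the intended definition), the magnitude might be written as:

```latex
\lVert \mathbf{v} \rVert = \sqrt{\sum_{i=1}^{n} v_i^{2}}
```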

In the third equation we calculate the unit vector, which has a magnitude in the range [0, 1] and
is representative of vector direction. In the next section this value is used to find the principal
axis of rotation, or theta, and to calculate a radial transform based on the vector's dimension.
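
The three quantities described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: it assumes the Euclidean norm, and the function names and the handling of a zero vector (returned unchanged) are choices made here for the example.

```python
import math

def vector_sum(v):
    """Sum of the components of v."""
    return sum(v)

def vector_norm(v):
    """Euclidean norm: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))

def unit_vector(v):
    """Scale v to length 1.

    Assumption: a zero vector has no direction, so it is returned unchanged
    instead of dividing by zero.
    """
    norm = vector_norm(v)
    if norm == 0.0:
        return list(v)
    return [x / norm for x in v]

v = [3.0, 4.0]
print(vector_sum(v))   # 7.0
print(vector_norm(v))  # 5.0
print(unit_vector(v))  # [0.6, 0.8]
```

The zero-vector branch matters for inputs such as [0] and [0, 0, 0], whose sum and norm are both zero regardless of dimension.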

Event Horizons of Vector Translation

Taking two vectors v1=[0] and v2=[0,0,0] from their global origin, both have the same magnitude,
or vector normal, and the same sum. As a consequence, their unit vectors under translation sit on
the same axis, while their dimensional components have radial coefficients of 1 for v1 and 3 for
v2. Dimension is the distance from the center, or radius, normalized to the cluster's maximum
vector dimension. This keeps vectors of similar dimension within the same cluster, irrespective of
their direction. Each axis represents the unit vector, with a magnitude in the range [0, 1]. The
unit vector is an indication of direction and is translated onto its principal axis of rotation.
Here theta is 360°.

In the transform above, r is the radius, or the distance scaled by the cluster's maximum
dimension, and n is the vector's dimension. The center represents a universal time scale of zero
for vectors of any dimension, radiating outwards to an infinite time scale. The equation below
gives, for each cluster kj, the maximum vector dimension phi over all vectors partitioned to the
jth of the k clusters.
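
One possible reading of the radial transform is sketched below. Several pieces are assumptions rather than the paper's definitions: the direction is passed in as a scalar in [0, 1] (the text does not say how it is derived from the unit vector), theta is taken as that fraction of a full 360° turn, the radius is n divided by phi, and `scale` is a hypothetical zoom factor standing in for the adjustable r.

```python
import math

def max_dimension(cluster):
    """phi: the largest vector dimension (number of components) in a cluster."""
    return max(len(v) for v in cluster)

def radial_transform(n, direction, phi, scale=1.0):
    """Map a vector of dimension n to a point on the plane.

    direction: assumed scalar in [0, 1] standing in for the unit vector.
    phi:       maximum vector dimension of the vector's cluster.
    scale:     hypothetical zoom factor applied to the radius.
    """
    r = scale * n / phi                # radius normalized by the cluster maximum
    theta = direction * 2.0 * math.pi  # fraction of a full 360-degree turn
    return (r * math.cos(theta), r * math.sin(theta))

cluster = [[0.0], [0.0, 0.0, 0.0]]     # v1 and v2 from the text
phi = max_dimension(cluster)
for v in cluster:
    print(len(v), radial_transform(len(v), 0.0, phi))
```

Under these assumptions v1 and v2 land on the same axis, with v2 on the cluster's outer radius and v1 at one third of that distance from the center.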

Calculating Centroids or Root Mean Cluster Centers

Each cluster has a set of vectors that is the basis for its center, sometimes called the centroid.
The value of the centroid is the standard mean, or average, of those vectors. After successive
iterations the clustering converges to a specific set of vectors. During this convergence the
center of each cluster is recalculated at every iteration.
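
The centroid update and the iteration to convergence can be sketched as follows. This is plain k-means on points already placed on the plane, using squared Euclidean distance; it is an assumption-laden stand-in and does not reproduce the root mean distance used by rK-means. The function names and the `max_iter` cap are introduced here for illustration.

```python
def centroid(points):
    """Arithmetic mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: (point[0] - centers[j][0]) ** 2
        + (point[1] - centers[j][1]) ** 2,
    )

def kmeans(points, initial_centers, max_iter=100):
    """Alternate assignment and centroid update until assignments stop changing."""
    centers = list(initial_centers)  # copy so the caller's list is not modified
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centers) for p in points]
        if new_labels == labels:
            break                    # converged: no vector changed cluster
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:              # an empty cluster keeps its previous center
                centers[j] = centroid(members)
    return labels, centers

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels, centers = kmeans(points, [(0.0, 0.0), (5.0, 5.0)])
print(labels)   # [0, 0, 1, 1]
print(centers)  # [(0.0, 0.5), (5.0, 5.5)]
```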

This process continues until every vector is partitioned into a cluster. After the clustering
stops, new centers are calculated. The final clustering draws the center of each cluster connected
to the locations of its vectors. In addition, changing the value of r in the translation allows us
to zoom in on a specific time scale and search for related events in nearby clusters. Clustering
multidimensional vectors with rK-means is an efficient tool for analyzing relationships on
multiple scales of space and time, and it mimics k-means when the maximum vector dimension is
equal to two. rK-means can also scale to multidimensional vector data for time and frequency
analysis applications, allowing us to understand events on multiple time scales and build better
models.

[Figure panels: (a) (b) top row; (d) (c) bottom row]

The image to the left is a cluster of one hundred vectors at 18% of the
maximum time scale. Figure (a), at 20%, shows the birth of two clusters. In
figure (b), at 23%, the two clusters begin to diverge, and in figure (c), at 25%,
the clusters reach a maximum divergence. In figure (d), at 45%, the second
cluster splits into three separate clusters. On closer inspection, two of the four
clusters in figure (d) have four vectors to the far left of the figure with
different time scales, seen as distance from their respective centroids. The two
clusters share a similar structure with respect to their directional vectors. This
observation is only revealed when all of the vectors are viewed at 45% of their
maximum dimension or time scale.
