
Data abstraction

[Figure: the What? / Why? / How? analysis framework]

Dataset types
Tables: items, attributes
Networks and trees: items (nodes), links, attributes
Fields: grids, positions, attributes
Geometry: items, positions
Clusters, sets, lists: items

Static vs. dynamic

Attribute types
Categorical
Ordered: ordinal, quantitative
Ordering: sequential, diverging, cyclic

Data semantics
The semantics of the data is its real-world meaning.
The type of the data is its structural or mathematical interpretation:
At the data level: is it an item, a link, or an attribute?
At the dataset level: how are these data types combined to create a table, a
tree, or a field?
At the attribute level: what kinds of mathematical operations are
meaningful?
Sometimes semantics can be inferred from the syntax of the data file or
from the names of the variables it contains, but more often they must be
explicitly provided along with the dataset so that it can be interpreted
properly: this is the metadata.
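
The attribute-level question can be illustrated with a small sketch. It assumes a hypothetical metadata dictionary (`METADATA`) and invented attribute names; the point is only that the metadata, not the raw values, decides which summary is meaningful.

```python
from statistics import mean

# Hypothetical metadata: it records the attribute type of each column,
# which cannot be inferred reliably from the raw values alone.
METADATA = {
    "shirt_size": {"type": "ordinal", "order": ["S", "M", "L", "XL"]},
    "salary": {"type": "quantitative"},
    "department": {"type": "categorical"},
}

def summarize(name, values):
    """Pick a summary that is meaningful for the attribute type."""
    info = METADATA[name]
    if info["type"] == "quantitative":
        # Arithmetic is meaningful, so the mean makes sense.
        return mean(values)
    if info["type"] == "ordinal":
        # Only ordering is meaningful: sort by rank and take the middle item.
        rank = info["order"].index
        return sorted(values, key=rank)[len(values) // 2]
    # Categorical: only equality is meaningful, so report the most common value.
    return max(set(values), key=values.count)
```

For example, `summarize("shirt_size", ["L", "S", "M"])` uses the declared order rather than alphabetical order to find the middle value.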

Basic data types


Attributes
Items
Links
Grids
Positions

Basic data types: attributes


An attribute is a specific property that can be measured, observed,
or recorded.
Synonyms: variable, dimension.
Examples: salary, price, number of sales, number of neurons,
temperature, ...

Basic data types: items and links


An item is an individual entity that is discrete.
Examples: a row in a simple table or a node in a network.
A link is a relationship between items, usually within a network.

Basic data types: grids and positions


A grid defines a strategy for sampling continuous data in terms of
geometric and topological relationships between its cells.
A position is spatial data, typically in 2D or 3D.
Examples: latitude and longitude coordinates that determine a
location on the Earth's surface, or three numbers that define a
location within the volume measured by a medical scanner.

Dataset types
A dataset is a collection of information that is the subject of analysis.
The four basic dataset types are:
Tables
Networks
Fields
Geometry
In practice, real-world datasets are often complex combinations of
these four basic types.
Other ways of grouping items are clusters, sets, and lists.
In addition, a dataset may be static, fully available at once from a
file, or dynamic, arriving gradually as a stream that is processed
as it comes in.
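
The static versus dynamic distinction can be sketched as follows. The function name `running_mean` and the sample values are illustrative assumptions; an ordinary iterator stands in for a real stream.

```python
def running_mean(stream):
    """Yield the mean of the values seen so far, one result per incoming value."""
    total = 0.0
    for count, value in enumerate(stream, start=1):
        total += value
        yield total / count

# Static: the whole dataset is available up front.
static_data = [2.0, 4.0, 6.0]
static_mean = sum(static_data) / len(static_data)

# Dynamic: values arrive one at a time, so the summary is updated incrementally.
streamed_means = list(running_mean(iter(static_data)))
```

With a static dataset the summary is computed once; with a stream, every new value produces an updated summary, and the last one matches the static result.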

Dataset types: tables


A table is a set of items structured in rows and columns.
Each row maps to an item.
Each column maps to an attribute.
Each cell is defined by a pair (row, column) and contains a value for
that pair.
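
A minimal sketch of this structure, assuming invented attribute names and values, represents each row as a dictionary:

```python
# Each row is an item; each key is an attribute. The values are illustrative.
table = [
    {"name": "Ana", "salary": 3200, "department": "sales"},
    {"name": "Luis", "salary": 2900, "department": "it"},
]

def cell(table, row, column):
    """Return the value stored at the (row, column) pair."""
    return table[row][column]

def column_values(table, column):
    """Return the values of one attribute across all items."""
    return [item[column] for item in table]
```

Here `cell(table, 1, "salary")` looks up one cell, while `column_values(table, "name")` reads a whole attribute.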

Dataset types: tables


Example table: Point Position. Each row is an item; each column is an attribute.

Point Position X   Point Position Y   Point Position Z   Unit   Index
150.2389984        100.3249969        122.8769989        um     1
150.2689972        100.3529968        122.9680023        um     2
150.8320007        99.69300079        123.0920029        um     3
150.8470001        99.71900177        122.9970016        um     4
152.9389954        99.30999756        123.8359985        um     5
152.9309998        99.3239975         123.9349976        um     6
153.3959961        97.98100281        121.0680008        um     7
153.4340057        98.01799774        121.1529999        um     8
153.9600067        97.41999817        120.9160004        um     9
153.9940033        97.45899963        121.0019989        um     10
154.5910034        96.68399811        121.8769989        um     11
154.6000061        96.68699646        121.9769974        um     12
154.9589996        100.0250015        119.9179993        um     13
154.947998         100.0240021        120.0169983        um     14
154.822998         101.8209991        122.7170029        um     15
154.7890015        101.8010025        122.8089981        um     16
155.8009949        100.2139969        120.6389999        um     17

Between Unit and Index, the columns Name, Time, Object, and Group hold the
same values in every row: 1, Measurement Points 4, Surpass Scene.

Dataset types: networks and trees


Networks (graphs) are used to define relationships between two or
more items.
A network item is called a node (vertex).
A link (edge) is a relationship between two items.
Both items and links can have associated attributes.
A tree is a specific kind of network with a hierarchical structure.
Trees are acyclic networks.
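
As a sketch of how these definitions fit together, the function below checks whether a network is a tree. It assumes undirected links and treats a tree as a connected, acyclic network; the name `is_tree` and the node labels are illustrative.

```python
from collections import deque

def is_tree(nodes, links):
    """Return True if the undirected network is connected and acyclic."""
    # A connected network with n nodes is acyclic exactly when it has n - 1 links.
    if len(links) != len(nodes) - 1:
        return False
    neighbors = {node: [] for node in nodes}
    for a, b in links:
        neighbors[a].append(b)
        neighbors[b].append(a)
    # Breadth-first search from any node to test connectivity.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in neighbors[current]:
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return len(seen) == len(nodes)
```

Adding a single link between two existing nodes of a tree creates a cycle, so the same node set then fails the check.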

Dataset types: fields


Datasets based on fields discretize a continuous domain into cells.
Each cell can also hold attribute values.
We must deal with some mathematical problems associated with
continuous domains: sampling and interpolation.
Types:
Spatial fields.
Grids.
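
Sampling and interpolation can be illustrated in one dimension. This is a minimal sketch assuming regular sampling and linear interpolation; the function names and the sampled function are illustrative, and other interpolation schemes are equally valid.

```python
def sample(f, start, step, count):
    """Sample a continuous function f at `count` regularly spaced positions."""
    return [f(start + i * step) for i in range(count)]

def interpolate(samples, start, step, x):
    """Linearly interpolate between the two samples that bracket x.

    Positions outside the sampled range are extrapolated from the
    nearest pair of samples.
    """
    t = (x - start) / step
    i = min(max(int(t), 0), len(samples) - 2)
    frac = t - i
    return samples[i] * (1 - frac) + samples[i + 1] * frac

# Sampling x squared at 0, 1, 2, 3 gives the values 0, 1, 4, 9.
squares = sample(lambda x: x * x, 0.0, 1.0, 4)
estimate = interpolate(squares, 0.0, 1.0, 1.5)
```

The estimate at 1.5 is 2.5, while the true value is 2.25: the gap is the interpolation error introduced by discretizing the continuous domain.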

Dataset types: fields


Spatial fields
Continuous-domain data are often found as spatial fields, where the
cell structure of the field is based on sampling at spatial positions.
Most of these datasets appear in the context of tasks where the
goal is to understand their spatial structure, mainly their shape.
Scientific visualization (scivis) vs. information visualization (infovis).

Figure: Aerodynamic analysis of an Aston Martin car by the English company TotalSim.

Dataset types: fields


Grids
If data are sampled at regular intervals, the cells define a
uniform grid.
In that case, there is no need to explicitly store the grid geometry
or topology.
A rectilinear grid results from non-uniform sampling.
Structured grids allow the representation of specific geometric
shapes.
Unstructured grids are more flexible, but they demand more
storage because both geometry and topology must be stored
explicitly.
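
The storage difference can be sketched in one dimension. The class names `UniformGrid` and `ExplicitGrid` are hypothetical; the sketch only contrasts computing positions from three numbers with storing every position.

```python
from dataclasses import dataclass

@dataclass
class UniformGrid:
    """Regular sampling: positions are computed, never stored."""
    origin: float
    spacing: float
    count: int

    def position(self, i):
        return self.origin + i * self.spacing

@dataclass
class ExplicitGrid:
    """Non-uniform sampling: every sample position is stored explicitly."""
    positions: list

    def position(self, i):
        return self.positions[i]

uniform = UniformGrid(origin=0.0, spacing=0.5, count=5)
explicit = ExplicitGrid(positions=[0.0, 0.1, 0.4, 1.0])
```

The uniform grid needs three numbers regardless of its size, whereas the explicit grid's storage grows with the number of samples.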

Dataset types: geometric data


Geometric datasets describe the shape of items with explicit
spatial positions.
Items can be points, one-dimensional lines or curves,
two-dimensional surfaces or regions, or three-dimensional volumes.
Spatial data often incorporate hierarchical structures with
different levels of detail.
These datasets do not necessarily have attributes, in contrast to the
other three basic dataset types.
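
As a small illustration, a one-dimensional geometric item can be stored as nothing more than a sequence of positions. The coordinates and the helper `polyline_length` are illustrative assumptions.

```python
import math

# A polyline: an item defined only by explicit 2D positions, with no attributes.
polyline = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]

def polyline_length(points):
    """Sum the Euclidean lengths of the segments between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))
```

Shape properties such as length are derived from the positions themselves rather than read from stored attributes.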

Dataset types: other combinations


A set is just a collection of items.
A list is a group of ordered items.
A cluster is a group of items whose attribute values are similar in some
way.
A path in a network is an ordered sequence of segments formed by the links that
connect the nodes.
A compound network is a network with an associated tree: all of the nodes in
the network are the leaves of the tree, and the interior nodes of the tree provide a
hierarchical structure for the nodes that is different from the network links between
them.
In practice, we can find complex combinations of the basic datasets. In every case,
we must specify the data abstraction required to answer the question:
What data do you want to see?
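
The grouping notions above (set, list, cluster, path) can be sketched as follows. The item names, the values, and the binning rule used as a similarity criterion are assumptions made for illustration; real clustering methods are usually more elaborate.

```python
from collections import defaultdict

# Illustrative items: (name, attribute value).
items = [("a", 1.0), ("b", 1.2), ("c", 5.0), ("d", 5.1)]

as_set = {name for name, _ in items}    # a set: unordered collection of items
as_list = [name for name, _ in items]   # a list: ordered group of items

def cluster_by_value(items, width):
    """Group items whose attribute values fall into the same bin of the given width."""
    clusters = defaultdict(list)
    for name, value in items:
        clusters[int(value // width)].append(name)
    return dict(clusters)

def is_path(node_sequence, links):
    """Return True if every pair of consecutive nodes is joined by a link."""
    link_set = {frozenset(link) for link in links}
    pairs = zip(node_sequence, node_sequence[1:])
    return all(frozenset(pair) in link_set for pair in pairs)
```

With a bin width of 2.0, items "a" and "b" land in one cluster and "c" and "d" in another.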

References
Tamara Munzner. Visualization Analysis and Design. A K Peters
Visualization Series. CRC Press. Nov. 2014.
Stuart K. Card, Jock Mackinlay and Ben Shneiderman. Readings in
Information Visualization: Using Vision to Think. Morgan Kaufmann,
1999.
