
Geographic Information Systems

(GIS)
Chi-Farn Chen
CSRSR NCU
tel: 03-4227151-57624
fax: 03-4254908
e-mail: cfchen@csrsr.ncu.edu.tw


Vector and Raster Data Model


Model and GIS

representation of reality model


GIS itself is based on a model of complexity
GIS is used to model complexity
full representation of reality?

Data model =
limited representation of reality


Data Model
Reality is too complex for even the most
sophisticated GIS software, so in order to
represent reality in a spatial database, a
simplification of reality is created. This
simplification is known as a data model.
In a data model, reality is represented by
geometry and attributes.


Data Model and Reality


[Figure: the real world represented as raster spatial data and as vector spatial data. Source: Defense Mapping School, National Imagery and Mapping Agency]

GIS Data Formats


There are two formats used by GIS
software to store and retrieve spatial
data:
Vector
Raster

Vector Format
Data are associated with points, lines, or areas
Points are located by coordinates
Lines are described by a series of connected points, as in a join-the-dots book
(arcs: nodes and vertices)
Areas are described by a series of lines enclosing
the area.
(Polygons)


Vector Format
Any attributes (name or code) can be associated
with a point, line or polygon.
Data are stored in two files:
a file containing information of coordinates
a file containing information of the attributes
A third file contains information needed to link
positional data with their attributes (Identifier).


Features have unique identifiers:


point ID, line ID, polygon ID

common identifiers provide link to:


coordinates table (for where)
attributes table (for what)

[Figure: five points, numbered 1 to 5, plotted on X and Y axes]

Coordinates Table
ID  x  y
1   1  1
2   3  1
3   2  2
4   1  3
5   4  2

Attributes Table
ID  model  year
1   a      90
2   b      90
3   b      80
4   a      70
5   c      70
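
The link between the two tables through a common identifier can be sketched in Python as follows. The attribute values are those of the slide; the coordinate values and the function name `describe` are assumptions made for illustration only.

```python
# Coordinates table: ID -> (x, y). Values are illustrative assumptions.
coordinates = {1: (1, 1), 2: (3, 1), 3: (2, 2), 4: (1, 3), 5: (4, 2)}

# Attributes table: ID -> attributes (model and year, as on the slide).
attributes = {
    1: {"model": "a", "year": 90},
    2: {"model": "b", "year": 90},
    3: {"model": "b", "year": 80},
    4: {"model": "a", "year": 70},
    5: {"model": "c", "year": 70},
}


def describe(point_id):
    """Join the 'where' and the 'what' of one feature through its ID."""
    x, y = coordinates[point_id]
    return {"id": point_id, "x": x, "y": y, **attributes[point_id]}


print(describe(3))
```

Because the two tables only share the identifier, either one can be edited or replaced without touching the other.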

Raster Format
Data are divided into cells, or pixels (picture elements)
Pixels are organized in arrays
Row and Column Numbers (coordinates) are used
to identify the location of the pixel within the
array.
Each pixel has a single value (attribute)


Raster Format
[Figure: a raster array with row and column axes; the highlighted pixel has coordinate (2,9) and attribute 8]

Vector and Raster Representation of Point Map Features
[Figure: a point map feature shown in GIS vector format as an (X,Y) coordinate in space, and in GIS raster format as a pixel located in an array]

Vector and Raster Representation of Line Map Features
[Figure: a line map feature shown in GIS vector format and in GIS raster format]

Vector and Raster Representation of Area Map Features
[Figure: an area map feature shown in GIS vector format and in GIS raster format]

Vector Model
Vector model uses discrete points, lines and/or areas
corresponding to discrete objects, each identified by a name or code
number and described by attributes.

Raster Model
Raster model uses regularly spaced grid cells in a specific
sequence. An element of the grid is called a pixel
(picture element). The conventional sequence is row by row
from top to bottom and, within each row, column by column from
left to right. Every location is given in two-dimensional
image coordinates (row number and column number) and
contains a single attribute value.

Comparison of Raster and Vector Formats


Vector
Vector formats are efficient when comparing information
whose geographical shapes and sizes are different.
Vector files are much smaller because a relatively
small number of vectors can precisely describe large areas,
and many attributes can be ascribed to these areas.
Vector representations of shapes can be very precise.

Raster
Raster formats are efficient when comparing information
among arrays with the same cell size.
Raster files are generally very large because each cell
occupies a separate line of data, only one attribute can
be assigned to each cell, and cell sizes are relatively small.
Raster representations are relatively coarse and imprecise.

Vector Model


Vector Model
There are different models to store and manage
vector information. Each of them has different
advantages and disadvantages.

Spaghetti Model (list of coordinates)


Vertex Dictionary Model
Topological Model (Dual Independent Map Encoding: DIME)
Arc / Node Model


Vector representation without model


Spaghetti Model (List of coordinates)

simple and easy to manage


lots of duplication, hence need for large storage space
very often used in CAC (computer assisted cartography)

no topology


No Topology


Spaghetti data often contains


crossing lines, loose ends,
double digitization of
common boundaries (slivers)
It's a mess!

Sliver Polygons
Sliver polygons are small, narrow polygon
features that inevitably appear along
borders of polygons following the overlay of
two or more geographic data sets.
Often occur when a shared boundary arc is
digitized twice.
Should be removed, but difficult to find.


Sliver Polygons


Vertex Dictionary Model

no duplication, but still this model does not use topology
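
The difference between the spaghetti model and the vertex dictionary model can be sketched in Python as follows. The two adjacent squares, their coordinates and all names are hypothetical; the sketch only illustrates where duplication occurs and that neither structure records which polygons are neighbors.

```python
# Spaghetti model: each polygon carries its own full list of coordinates,
# so the shared edge (2,0)-(2,2) is stored twice.
spaghetti = {
    "A": [(0, 0), (0, 2), (2, 2), (2, 0)],
    "B": [(2, 0), (2, 2), (4, 2), (4, 0)],
}

# Vertex dictionary model: every coordinate is stored once and polygons
# refer to vertices by number.
vertices = {1: (0, 0), 2: (0, 2), 3: (2, 2), 4: (2, 0), 5: (4, 2), 6: (4, 0)}
polygons = {"A": [1, 2, 3, 4], "B": [4, 3, 5, 6]}


def coords(polygon_id):
    """Rebuild the coordinate list of a polygon from the dictionary."""
    return [vertices[v] for v in polygons[polygon_id]]


stored_spaghetti = sum(len(ring) for ring in spaghetti.values())
stored_dictionary = len(vertices)
print(stored_spaghetti, stored_dictionary)
```

Finding that A and B are neighbors still requires comparing their vertex lists, because adjacency is not stored explicitly.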



Topological Model (DIME) / (TIGER)

assigns a directional code to each link in the form of a
"from node" and a "to node"
nodes (intersections of lines) are identified
with codes
developed by the US Bureau of the Census
both street addresses and UTM coordinates
are explicitly defined for each link

Topology


The GIS vector data model is slightly more complex,
as each vertex, arc, node and polygon is uniquely
identified and the relationships between them are
stored in the database. The relationships between
the elements of a vector data model, in terms of
relative location and connections, are known as
Topology. Topology gives the vector data model a
level of intelligence which means that the GIS can
recognize which arcs are joined to each other, and
identify those polygons which are adjacent to each
other.


Topology:
A GIS topology is a set of rules and behaviors
that model how points, lines, and polygons share
geometry. For example, adjacent features,
such as two countries, share a common edge.
Simple definition:
Topology stores the relationships of one
spatial element with respect to another.


Topology:
Topology is a mathematical approach that
allows us to structure data based on the
principles of feature adjacency and feature
connectivity.
It is in fact the mathematical method used to
define spatial relationships. Without a
topologic data structure in a vector based GIS
most data manipulation and analysis
functions would not be practical or feasible.


Defining Topology
Topology :
the spatial relationships between features in a GIS
Do polygons overlap?
Do lines intersect or connect?
Are points located near each other?


Topology:
Three main concepts
Connectivity
Arc-node topology
Definition of areas / Containment
Polygon-arc topology

Adjacency/Contiguity
Left/right topology


Topology : spatial relationship


Connectivity:
The topological identification of connected arcs by recording the from- and to-node
for each arc. Arcs that share a common node are connected.
Arcs connect to each other at nodes.
Describes linear networks, in which links have direction.

Area:
Arcs that connect to surround an area define a polygon.

Containment:
Accounts for polygons within polygons (islands).
Describes which landscape features are located within, or intersect, the boundary of
polygons.

Adjacency:
The identification of adjacent polygons by recording the left-hand and right-hand
polygons of each arc.
Arcs have direction and left and right sides.
Describes a landscape feature's neighbors.
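
Connectivity and adjacency can both be read from a single arc table. The Python sketch below assumes a hypothetical data set of two polygons (A and B) separated by one shared arc; the node names, arc numbers and function names are illustrative and not taken from any GIS package.

```python
from collections import defaultdict

# arc id -> (from-node, to-node, left polygon, right polygon); hypothetical
arcs = {
    1: ("N2", "N1", "External", "A"),
    2: ("N1", "N2", "B", "A"),  # the arc shared by A and B
    3: ("N1", "N2", "External", "B"),
}


def arcs_at_nodes(arc_table):
    """Connectivity: arcs that share a node are connected."""
    at = defaultdict(list)
    for arc_id, (from_node, to_node, _left, _right) in sorted(arc_table.items()):
        at[from_node].append(arc_id)
        if to_node != from_node:
            at[to_node].append(arc_id)
    return dict(at)


def adjacent_polygons(arc_table):
    """Adjacency: the left and right polygons of an arc are neighbors."""
    pairs = set()
    for _from, _to, left, right in arc_table.values():
        if "External" not in (left, right):
            pairs.add(tuple(sorted((left, right))))
    return pairs


print(arcs_at_nodes(arcs))
print(adjacent_polygons(arcs))
```

No coordinates are needed for either query, which is the point of storing topology explicitly.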

Concept of Topology
Topology distinguishes GIS data models from non-topological
data models supported by many CAD, mapping
and graphics systems.
Topology refers to knowledge about the relative spatial
positioning of features:
knowledge about how features are connected and which features
are adjacent to each other.

Can be viewed as a mathematical procedure that


determines spatial properties and relationships, including:
Connectivity, contiguity (adjacency)
Lengths of arcs and areas of polygons


Topology Rules for Coverages:


Each arc has a beginning node and an ending
node - this determines directionality.
Directionality is determined during digitizing.
Actual direction is important only if your
application requires directional modeling.

Arcs connect to other arcs at nodes


Connected arcs form polygon boundaries - arc
coordinates are stored only once because two
adjacent polygons share the common arc
between them.
Arcs have polygons on their left and right sides

Topology Concept I
Arc-node topology is how Arc/INFO keeps track
of which arcs are connected to other arcs through
shared nodes (nodes are endpoints of arcs). It
defines length, direction, and connectivity for arcs.

The from-node is an arc's starting point; the to-node
is its ending point. They are determined as you
digitize your data. You can see the from-node and
to-node whenever you list attribute records for a
coverage containing lines. Arcs connect if they
share a node.

Topology Concept II
Polygon-arc topology expresses the relationship
between the arc features and the polygon features
for which the arcs create boundaries. It defines
area and adjacency. Arcs or a set of arcs that form
a closed figure define the area of a polygon. Two
polygons are adjacent if they share an arc. Polygons
are stored as a list of arcs to avoid redundancy.


Topology Concept III


Left-right topology refers to contiguity -- how
polygons are associated with their neighboring
polygons. Each arc has a list of which polygons are
on the right side and which are on the left side.
Commands in Arc/INFO use this information to
determine from one polygon what the adjacent
polygons are.

[Figure: adjacent polygons numbered 1 to 7]

Vector Data Model


Points: represent discrete point features
each point location
has a record in the
table

airports are point features


each point is stored as a
coordinate pair

Vector Data Model


Lines: represent linear features
each road segment
has a record in the
table

roads are linear features



Vector Data Model


Lines: fundamental spatial data model
[Figure: a line with a node at each end and vertices in between]

Lines start and end at nodes
line #1 goes from node #2 to node #1
Vertices determine shape of line
Nodes and vertices are stored as coordinate pairs

Vector Data Model


Polygons: represent bounded areas
each bounded polygon
has a record in the
table

landforms and water are


polygonal features

Vector Data Model


Polygons: fundamental spatial data model

Polygon #2 is bounded by lines 1 & 2


Line 2 has polygon 1 on left and polygon 2 on right

Vector Data Model


Polygons: fundamental spatial data model

complex data model, especially for larger data sets


arc-node topology, only used for ArcInfo data sets

Connectivity
Arc-node topology


Definition of areas
Polygon-arc topology


Adjacency/Contiguity
Left/right topology


Arc / Node Model


Arc / Node Model


File 1. Coordinates of nodes and vertices for all the arcs

ARC  F_node    Vertex                    T_node
1    3.2, 5.2  1, 5.2                    1, 3
2    1, 3      1.8, 2.6  2.8, 3  3.3, 4  3.2, 5.2
3    1, 2      3.5, 2  4.2, 2.7          5.2, 2.7

File 2. Arcs topology


ARC

F_node

T_node

R_poly

L_poly

External

External

External

External

File 4. Nodes topology

Node

Arcs

1,2

1,2

File 3. Polygons topology


Polygon

Arcs

1, 2

Geometry and Topology of Vector Data


The geometry of a point is given by two-dimensional
coordinates (x, y), while a line, string or area is given
by a series of point coordinates.
Topology, however, defines additional structure as follows:
Node : an intersection of more than two lines or strings, or the start
or end point of a string, with a node number
Arc : a line or a string with a chain number, start and end
node numbers, and left and right neighboring polygons
Polygon : an area with a polygon number and the series of arcs that
form the area in clockwise order (a minus sign is
assigned in case of anti-clockwise order).
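
A polygon stored as a list of signed arc numbers can be turned back into a coordinate ring. The Python sketch below assumes that a minus sign means "traverse this arc from its end node back to its start node"; the arc coordinates, polygon names and function names are hypothetical.

```python
# arc id -> coordinates from start node to end node (hypothetical values)
arc_coords = {
    1: [(2, 0), (0, 0), (0, 2), (2, 2)],
    2: [(2, 2), (2, 0)],  # arc shared by both polygons
    3: [(2, 2), (4, 2), (4, 0), (2, 0)],
}

# polygon -> arcs in clockwise order; a negative number reverses the arc
polygon_arcs = {"A": [1, 2], "B": [3, -2]}


def ring(polygon_id):
    """Chain the arcs of a polygon into one closed list of points."""
    points = []
    for signed in polygon_arcs[polygon_id]:
        pts = arc_coords[abs(signed)]
        if signed < 0:
            pts = pts[::-1]
        # Skip the first point of later arcs: it repeats the previous node.
        points.extend(pts if not points else pts[1:])
    return points


def signed_area(points):
    """Shoelace formula: negative for clockwise rings, positive otherwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


for name in polygon_arcs:
    print(name, ring(name), signed_area(ring(name)))
```

The coordinates of the shared arc are stored once and reused by both polygons, with only the sign differing.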

Topological Relationships between Spatial Objects


Point-Point Relationships
"is within" : within a certain distance
"is nearest to" : nearest to a certain point
Point-Line Relationships
"on line" : point on a line
"is nearest to" : a point nearest to a line
Point-Area Relationships
"is contained in" : a point in an area
"on border of area" : a point on border of an area
Line-Line Relationships
"intersects" : two lines intersect
"crosses" : two lines cross without an intersection
"flow into" : a stream flows into the river
Line-Area Relationships
"intersects" : a line intersects an area
"borders" : a line is a part of border of an area
Area-Area Relationships
"overlaps" : two areas overlap
"is within" : an island within an area
"is adjacent to" : two areas share a common boundary

Topology Review
Topology is the spatial relationship between
connecting or adjacent features in a geographic
data layer.
A procedure used by the computer to explicitly define
and store the spatial relationships between connecting
or adjacent coverage features.

Think of topology as geometry on a rubber sheet.


This type of geometry is concerned with spatial
relationships rather than rigid coordinate location.

Topology Review
If a map is stretched and distorted, some
properties change:
Distances
Angles
Relative proximities


Topology Review
Other properties (topological properties) remain
constant after distortion:
Adjacency
Containment
Connectivity
Areas remain areas, lines remain lines, points
remain points


Topology Review
By tracking all the arcs that meet at any node,
topology knows which arcs connect to each other.
(Arc-node topology)
A list of arcs is used to construct the polygon.
Storing each arc only once reduces the amount of
data and ensures that the boundaries of adjacent
polygons do not overlap.
(Polygon-arc topology)
It is easy to find the similar characteristic between
adjacent polygons.
(Left-right topology)

Topology Review
Vector Topology helps deal with:

slivers

overshoots
dangles
Not sharing border

Bureau of the census


Address matching to convert street addresses to
geographic coordinates and census reporting zones
With geographic coordinates, data could be
aggregated to user-specified custom reporting zones

DIME files were the major component of the


geocoding approach
TIGER


Address Matching
Address Matching or Geocoding:
A list of addresses is converted to
points on a map by referencing them
to a special street network.


TIGER Address Range Example


Address Matching
Address Matching or Geocoding:
Two input files and one output file
Input:
A database (dbf) file that has the address list that
needs to be geocoded.
A geographic base file or reference layer (commonly
street layer) that will spatially reference the address
location with the address database (input1).

Output:
This will be a point file that will hold the geocoded
address locations with an attribute file that shows the
full address and the matching accuracy.
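
One common way to place an address on a street segment that carries an address range is linear interpolation along the segment. The slides do not state which method is used, so the Python sketch below is an assumption; the segment record, its field names and the function `geocode` are hypothetical.

```python
# One street segment of a hypothetical reference layer.
segment = {
    "name": "MAIN ST",
    "from_xy": (0.0, 0.0),
    "to_xy": (100.0, 0.0),
    "from_addr": 100,
    "to_addr": 198,
}


def geocode(number, street, seg):
    """Return an (x, y) point for a house number, or None if unmatched."""
    if street.upper() != seg["name"]:
        return None
    low, high = seg["from_addr"], seg["to_addr"]
    if not low <= number <= high:
        return None
    # Fraction of the way along the segment, by house number.
    t = 0.0 if high == low else (number - low) / (high - low)
    (x1, y1), (x2, y2) = seg["from_xy"], seg["to_xy"]
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))


print(geocode(149, "Main St", segment))
print(geocode(300, "Main St", segment))
```

A real matcher would also normalise street names and report a match score, which corresponds to the "matching accuracy" attribute of the output file.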

What do you match?


Tabular data
Text
Databases

to

Geographic maps
TIGER Streets
ZIP Codes

Raster Model


Raster Model
A raster data model uses a grid.
One grid cell is one unit or
holds one attribute.
Every cell has a value, even if
it is missing.
A cell can hold a number or
an index value standing for an
attribute.
A cell has a resolution, given
as the cell size in ground units.

Raster Data Structures

Square grid: equal length sides
  conceptually simplest
  cells can be recursively divided into cells of same shape
  4-connected neighborhood (above, below, left, right)
    all neighboring cells are equidistant
  8-connected neighborhood (also include diagonals)
    all neighboring cells not equidistant
    center of cells on diagonal is 1.41 units away (square root of 2)

rectangular
  commonly occurs for lat/long when projected
  data collected at 1 degree by 1 degree will be varying sized rectangles

triangular (3-sided) and hexagonal (6-sided)
  all adjacent cells and points are equidistant

triangulated irregular network (tin):
  vector model used to represent continuous surfaces (elevation)
  more later under vector

Raster Model (Grid, Image)


[Figure: a raster grid with row and column axes; the highlighted pixel has coordinate (2,9) and attribute 3]

Resolution
(pixel size)
choose raster pixel size 1/2 the length (1/4 the area)
of smallest feature to map (smallest feature called
minimum mapping unit or resel--resolution element)

Assignment scheme
The value of a cell may be:
an average over the cell
a total within the cell
max or min or the commonest value in the cell
the value found at the cell's central point
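
The assignment schemes above can be compared on one cell. The Python sketch below assumes the cell covers a hypothetical 3 x 3 block of finer measurements; the values and names are illustrative.

```python
from statistics import mean, mode

# Hypothetical fine measurements falling inside one output cell.
block = [
    [2, 2, 3],
    [2, 5, 3],
    [2, 2, 3],
]
samples = [value for row in block for value in row]

cell_value = {
    "average": mean(samples),
    "total": sum(samples),
    "max": max(samples),
    "min": min(samples),
    "commonest": mode(samples),
    "central point": block[1][1],
}

for scheme, value in cell_value.items():
    print(scheme, value)
```

The same block yields 2, 5, 24 or about 2.67 depending on the scheme, so the scheme has to be recorded with the data.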


Assignment scheme

Line assignment

Polygon assignment

The mixed pixel problem

[Figure: the same mixed pixels classified under three schemes: water dominates, winner takes all, and edges separate]

Raster Data
Each cell can be owned by only one feature.
Raster is easy to understand, easy to read and
write, and easy to draw on the screen.
Spatial analytical operations are faster.
Grids are poor at representing points, lines and
areas, but good at surfaces.
Grids are a natural representation for scanned
or remotely sensed data.
Grids suffer from the mixed pixel problem.
Grid compression is easier (techniques used in
GIS are run-length encoding and quad trees).

Raster Data Sources

Scanned maps
B & W aerial photos
Color aerial photos
Satellite images

Run-Length Encoded Compression

A A A D D D
D D D B B B
B B B B B B
C C C C D D
D D D B B B
B B A A A A

Uncompressed:
AAADDDDDDBBBB
BBBBBCCCCDDDD
DBBBBBAAAA
Run-Length Encoded:
3A6D9B4C5D5B4A
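
The encoding of this grid can be reproduced with a few lines of Python. This is a minimal sketch that assumes single-character cell values and reads the grid row by row; the function names are illustrative.

```python
import re
from itertools import groupby


def rle_encode(cells):
    """Replace each run of equal values by '<count><value>'."""
    return "".join(f"{len(list(run))}{value}" for value, run in groupby(cells))


def rle_decode(encoded):
    """Expand '<count><value>' pairs back into the original sequence."""
    return "".join(value * int(count)
                   for count, value in re.findall(r"(\d+)(\D)", encoded))


grid = ["AAADDD", "DDDBBB", "BBBBBB", "CCCCDD", "DDDBBB", "BBAAAA"]
uncompressed = "".join(grid)
encoded = rle_encode(uncompressed)
print(encoded)
print(rle_decode(encoded) == uncompressed)
```

Runs are allowed to continue across row boundaries here, which is why the first D run has length 6.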

Data Compression
Runlength Compression (for single layer)
Full Matrix--162 bytes
111111122222222223
111111122222222233
111111122222222333
111111222222223333
111113333333333333
111113333333333333
111113333333333333
111333333333333333
111333333333333333

Run Length (row)--44 bytes
1,7,2,17,3,18
1,7,2,16,3,18
1,7,2,15,3,18
1,6,2,14,3,18
1,5,3,18
1,5,3,18
1,5,3,18
1,3,3,18
1,3,3,18

Value thru column coding.
1st number is value, 2nd is
last column with that value.

This is a lossless compression, as opposed to lossy,
since the original data can be exactly reproduced.

Now, GIS packages generally rely on commercial
compression routines. PKZIP is the most common general-purpose
routine. MrSID (from LizardTech) and
ECW (from ER Mapper) are used for images. All these
essentially use the same concept. Occasionally, data is still
delivered to you in run-length compression, especially in
remote sensing applications.
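
The "value thru column" coding of the full matrix can be checked with a short Python sketch. It assumes single-digit cell values, as in the example; the function names are illustrative.

```python
from itertools import groupby

matrix = [
    "111111122222222223",
    "111111122222222233",
    "111111122222222333",
    "111111222222223333",
    "111113333333333333",
    "111113333333333333",
    "111113333333333333",
    "111333333333333333",
    "111333333333333333",
]


def encode_row(row):
    """Emit (value, last column holding that value) for each run."""
    codes, column = [], 0
    for value, run in groupby(row):
        column += len(list(run))
        codes += [int(value), column]
    return codes


def decode_row(codes):
    """Rebuild the row from (value, last column) pairs."""
    cells, start = [], 0
    for value, last in zip(codes[0::2], codes[1::2]):
        cells.append(str(value) * (last - start))
        start = last
    return "".join(cells)


encoded = [encode_row(row) for row in matrix]
print(encoded[0])
print(sum(len(row) for row in matrix), sum(len(codes) for codes in encoded))
```

Counting one byte per stored number reproduces the 162 and 44 byte figures, and decoding returns the original rows, which is what makes the scheme lossless.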

Basic Quadtree Compression


[Figure: basic quadtree structure. Quadrant 1 is subdivided into 10 to 13 and quadrant 2 into 20 to 23; quadrants 20 and 21 are further subdivided into 200 to 203 and 210 to 213]

G = Gray W = White
Quadtree Compression:
G0,G10,G11,G12,W13,G200,G201,G210,G211,W202,W203,W212,
W213,W22,W23,W3
NOTE: There are many variations of the quadtree compression.
The one above represents one of the most basic.
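
A basic quadtree of this kind can be built recursively. The Python sketch below is one possible variant, not the exact scheme of the figure: it assumes a square grid whose side is a power of two and numbers the quadrants 0 = upper left, 1 = upper right, 2 = lower left, 3 = lower right. The grid values are hypothetical.

```python
def quadtree(grid, row=0, col=0, size=None, code=""):
    """Return a list of (quadrant code, value) leaves for a square grid."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    uniform = all(grid[r][c] == first
                  for r in range(row, row + size)
                  for c in range(col, col + size))
    if uniform:
        # A block with a single value is stored once, whatever its size.
        return [(code or "root", first)]
    half = size // 2
    leaves = []
    offsets = [(0, 0), (0, half), (half, 0), (half, half)]
    for number, (dr, dc) in enumerate(offsets):
        leaves += quadtree(grid, row + dr, col + dc, half, code + str(number))
    return leaves


grid = [
    "GGGW",
    "GGWW",
    "WWWW",
    "WWWW",
]
leaves = quadtree(grid)
print(",".join(value + code for code, value in leaves))
```

The 16 cells are stored as 7 leaves; uniform quadrants stop subdividing, and only the mixed upper-right quadrant is split further.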

Data Compression
Quad Tree Representation (for single layer)
Essentially involves compression applied to both row and column.

sides of square grid divided evenly on a recursive basis
  length decreases by half
  # of areas increases fourfold
  area decreases by one fourth

Resample by combining (e.g. average) the four cell values
  although storage increases if all samples are saved, processing costs
  can be saved if some operations don't need high resolution

for nominal or binary data can save storage by using maximum block
representation
  all blocks with same value at any one level in tree can be stored
  as single value

Layer  Width  Cell Count
1      1      1
2      2      4
3      4      16
4      8      64
5      16     256
6      32     1024

[Figure: resampling pyramid in which each cell holds the average of its four child cells]

[Figure: maximum block representation of a binary grid. Quadrant I is stored as 1,0,1,1; quadrant II as a single 1; quadrant III as 0,0,0,1; quadrant IV as a single zero]

Raster Array Representations for multiple layers

How to organize into a one-dimensional data stream for
computer storage & processing?

Band Sequential (BSQ)
  each characteristic in a separate file
  elevation file, temperature file, etc.
  good for compression
  good if focus on one characteristic
  bad if focus on one area
  File 1: Veg   A,B,B,B
  File 2: Soil  I,II,III,IV
  File 3: El.   120,140,150,160

Band Interleaved by Pixel (BIP)
  all measurements for a pixel grouped together
  good if focus on multiple characteristics of
  geographical area
  bad if want to remove or add a layer
  A,I,120, B,II,140 B,III,150 B,IV,160

Band Interleaved by Line (BIL)
  rows follow each other for each characteristic
  A,B,I,II,120,140 B,B,III,IV,150,160

[Figure: three 2 x 2 layers. Veg: B B over A B; Soil: III IV over I II; Elevation: 150 160 over 120 140]
Note that we start in lower left.
Upper left is alternative.
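
The three orderings can be generated from the same layers. In the Python sketch below the rows of each layer are listed in storage order (starting from the lower left, as on the slide); the variable names are illustrative.

```python
# Each layer is a 2 x 2 grid; rows are listed in storage order.
bands = {
    "veg": [["A", "B"], ["B", "B"]],
    "soil": [["I", "II"], ["III", "IV"]],
    "elevation": [[120, 140], [150, 160]],
}
rows, cols = 2, 2

# BSQ: one whole layer after another.
bsq = [v for band in bands.values() for row in band for v in row]

# BIP: all layer values of a pixel, then the next pixel.
bip = [band[r][c]
       for r in range(rows) for c in range(cols)
       for band in bands.values()]

# BIL: one row of each layer in turn, then the next row.
bil = [v for r in range(rows) for band in bands.values() for v in band[r]]

print(bsq)
print(bip)
print(bil)
```

Only the loop order changes between the three layouts; the stored values are identical.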


File Formats for Vector Spatial Data

Vector file formats are the most complex because there are
many ways to store coordinates, attributes,
attribute linkages, database structures, and
display information. Some of the most
common formats are briefly described
below.

File Formats for Vector Spatial Data


Arc Export
Arc Export is a transfer format, either ASCII or
compressed into binary, used to transfer files between
different versions of ARC/INFO. It is undocumented and
will work only with ESRI products.
ARC/INFO Coverages
An ARC/INFO "coverage" is a set of internal binary files
used by ARC/INFO, a GIS program. This file format is
proprietary and not readily usable by other programs.


File Formats for Vector Spatial Data


ArcView Shape
The shapefile format defines the geometry and attributes of
geographically-referenced features in as many as five files with specific
file extensions that must be stored in the same project
workspace. They are:
.shp - the file that stores the feature geometry. (required)
.shx - the file that stores the index of the feature
geometry. (required)
.dbf - the dBASE file that stores the attribute
information of features. (required)
.sbn and .sbx - the files that store the spatial index of the
features. (optional)


ARC/INFO vs. ArcView


ARC/INFO is a topologically based hybrid system
ArcView is a file based, non topological, pseudo
object-oriented graphic data structure


File Formats for Vector Spatial Data


ArcGIS Geodatabase
ArcGIS has a well-defined model for working with data. This
generic model, called the geodatabase (short for geographic
database), defines all the types of data that can be used in
ArcGIS (for example, features, rasters, addresses, and
survey measurements) and how they are represented,
accessed, stored, managed and processed. The geodatabase is
a common framework shared by all ArcGIS products and
applications.

File Formats for Vector Spatial Data


ArcGIS Geodatabase
The geodatabase offers you the ability to
Handle rich data types.
Apply sophisticated rules and relationships.
Access large volumes of geographic data stored in both files
and databases.

File Formats for Vector Spatial Data


ArcGIS Geodatabase
ArcGIS works with geographic data held either as a collection
of files in a file system or as a collection of tables in a
relational database management system (RDBMS).

File-based data includes several well-known data set types, such as
coverages, shapefiles, grids, images, and triangulated
irregular networks (TINs).

ArcGIS manages the same types of geographic information in
an RDBMS such as DB2, Informix, Oracle, SQL Server,
or Microsoft Access.

File Formats for Vector Spatial Data


Geodatabase Data Management
Two categories:
Personal Geodatabase
Single user editing
Stored in MS Access
Size limit of 2 GB
ArcSDE Geodatabase
Enterprise
Supports multiuser editing via
versioning
Requires ArcEditor or ArcInfo Editor
to edit


File Formats for Vector Spatial Data


Comparison of the file and geodatabase implementations
File-Based Data Sets
Coverages
Shapefiles
Grids
TINs
Images
Vector Product Format files
Computer-aided design files
Geography markup language
Tables
XML

Geodatabase
DB2 with its Spatial type
Informix with its Spatial type
SQL Server
Oracle
Oracle with Spatial or Locator
Personal geodatabases (Microsoft
Access)


File Formats for Vector Spatial Data


AutoCAD Drawing Files (DWG)
DWG is the internal, proprietary format used in AutoCAD
software, which is a computer-aided design/drafting (CAD)
program. Despite its proprietary nature, AutoCAD can
convert any DWG file to a DXF file (described below)
without loss of graphic information. As with DXF files,
there are a number of ways to store attribute information
in DWG files. The emerging standard is one that uses
Extended Entity Data (EED) to link attributes, but many
others are possible. However, the lack of one standard for
linking attributes can cause problems when data is
transferred between systems.


File Formats for Vector Spatial Data


Autodesk's Data Interchange File (DXF) Format
DXF is probably the most widely used vector data
transfer format, and a file in DXF format offers some
very strong advantages. It contains very complete
display information, and almost every graphics
program can read it. However, there are several
different ways to store attribute information in DXF
and to link DXF entities to external attributes. Because
there are no attribute standards, many programs that
claim to read DXF files still do not import attribute
information properly.


File Formats for Vector Spatial Data


Digital Line Graphs (DLG)
DLG, a transfer format used by the US Geological
Survey (USGS), depicts vector information portrayed
on printed paper maps. It carries very accurate
coordinate information and sophisticated feature-classification
information but no other attribute data.
DLG does not include any display information. The
DLG standard is significant because the USGS and
other US government agencies have used it to publish
large numbers of digital maps.

File Formats for Vector Spatial Data


MapInfo Data Transfer Files (MIF/MID)
MIF/MID is a transfer standard used by MapInfo, a
desktop mapping system. It carries all three types of GIS
information: geographic, attribute, and display. Attribute
links are implicit in the file format.
MapInfo Map Files.
MapInfo has its own internal binary format, known as a
map file. It is undocumented and proprietary, so it cannot
be used outside a MapInfo system.


File Formats for Vector Spatial Data


MicroStation Design Files (DGN)
DGN is the internal format used by Bentley Systems
Inc.'s MicroStation, a CAD program. It is well
documented and standardized, so it may also be used as
a transfer standard. DGN files contain detailed display
information. The most common way to store attributes
is to place them in an external database file and record
links in the MSLINK field, a data item carried for each
element in the DGN file.

File Formats for Raster Spatial Data


The generic raster data model is actually implemented in several different
computer file formats:
GRID is ESRI's proprietary format for storing and processing raster data
Standard industry formats for image data such as JPEG, TIFF and MrSID
formats can be used to display raster data, but not for analysis (must
convert to GRID)
Georeferencing information required to display images with mapped
vector data
Requires an accompanying world file which provides locational
information

Image    Image File  World File
TIFF     image.tif   image.tfw
Bitmap   image.bmp   image.bpw
BIL      image.bil   image.blw
JPEG     image.jpg   image.jgw
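
A world file is a small text file of six numbers that define an affine transformation from pixel (column, row) positions to map coordinates. The slides do not show its contents, so the Python sketch below relies on the usual six-line convention (x pixel size, two rotation terms, negative y pixel size, then the x and y map coordinates of the centre of the upper-left pixel); the sample values and function names are hypothetical.

```python
def parse_world_file(text):
    """Read the six world-file lines, which are ordered A, D, B, E, C, F."""
    a, d, b, e, c, f = (float(token) for token in text.split())
    return a, b, c, d, e, f


def pixel_to_map(col, row, params):
    """Map coordinates of the centre of pixel (col, row)."""
    a, b, c, d, e, f = params
    return (a * col + b * row + c, d * col + e * row + f)


# Hypothetical image.tfw: 30 m pixels, no rotation.
tfw = "30.0\n0.0\n0.0\n-30.0\n500015.0\n4200985.0\n"
params = parse_world_file(tfw)
print(pixel_to_map(0, 0, params))
print(pixel_to_map(2, 9, params))
```

The y pixel size is negative because image rows count downwards while map northings count upwards.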

Viewing File
Most importantly, managing file information means organizing it so
that people can use it logically without having to know
anything about its physical structure.
The difference between logical and physical:

LOGICAL VIEW

Focus on how you need to arrange and access information


to meet your particular needs.

PHYSICAL VIEW

Deal with how information is physically arranged, stored,


and accessed on some type of secondary storage device.


Logical View and Physical View (Vector Data)


Logical View - Shape file (*.shp)


Logical View and Physical View (Vector Data)


Physical View - Shape file (*.shp)


Logical View and Physical View (Raster Data)


Logical View - Imagine file (*.img)


Logical View and Physical View (Raster Data)


Physical View - Imagine file (*.img)


Metadata
Metadata is data about data


Metadata
Allows a producer to fully describe a dataset so
that users can understand the assumptions and
limitations and evaluate the dataset's
applicability for their intended use.

What is Metadata?
Metadata should contain possible answers for the following
questions:
Who collected the original data and who is responsible for the
dataset?
What is the purpose of the dataset?
What will users find from this dataset?
What elements are mandatory and what are optional?
What terminology standard and scales have been used in this dataset?
How can users access the dataset?
What geographic area(s) does this dataset cover?
What type of transfer protocol is needed to receive the dataset?


What is Metadata?
Resume of spatial data
Who?
When?
How?
What?
Where?
Cost?
Purpose?


Need for Metadata Standards


Metadata standard will promote:
The proper use and effective retrieval of geo-spatial data
The organization and management of geographic data
The provision of information about an organization's database to others

Essential Elements of Metadata for Spatial Information


1. Identification [IDEN]
2. Data Quality [QUAL]
3. Spatial Data Organization [SDOR]
4. Spatial Reference [SREF]
5. Distribution [DIST]
6. Entity and Attributes Information [ENTI]
7. Metadata Reference [REFE]


[IDEN] Identification
[TIT] Title: What is the name of the data set?
[AUT] Author: Who developed the data set?
[COV] Area Coverage: What geographic area does it cover?
[THE] Themes: What themes of information does it include?
[CUR] Currentness: How current are the data?
[RES] Restriction: Are there restrictions on accessing or using the data?
[QUAL] Data Quality
[ACC] Accuracy: What is the positional and attribute accuracy?
[COM] Completeness: Are the data complete?
[LCO] Logical Consistency: Was the consistency of the data verified?
[LIN] Lineage: What data were used to create the data set, and what
processes were applied to those sources?


[SDOR] Spatial Data Organization


[VEC] Vector: Has a vector model been used to encode the spatial data?
[RAS] Raster: Has a raster model been used to encode the spatial data?
[TNE] Type and Number of Elements: What type and how many spatial
objects are there?
[SREF] Spatial Reference
[PRO] Projection: What map projection method was used to represent the
location of spatial objects?
[LOL] Longitude/Latitude: Are coordinate locations encoded using
longitude and latitude?
[GRI] Grid System: Is a grid system such as the State Plane Coordinate
System used?
[DAT] Datum: What horizontal and vertical datums are used?
[COO] Coordinate System: What parameters should be used to convert the
data to another coordinate system?

[DIST] Distribution
[DIS] Distributor: From whom can one obtain the data?
[FOR] Formats: What formats are available?
[MED] Media: What media are available?
[ONL] Online: Are the data available online?
[PRI] Price: What is the price of the data?
[ENTI] Entity and Attributes Information
[FEA] Features: What geographic features are included (roads, houses,
elevation, temperature)?
[ATT] Attributes: What characteristics of those features are included
(lengths, widths, heights)?
[AVA] Attribute Values: What parameters are used to represent the
characteristics of features?

[REFE] Metadata Reference


[CUR] Currentness of Metadata: When were the metadata compiled?
[RES] Responsible Party: By whom were the metadata compiled?
[CIT] Citation: Recommended reference to be used for the dataset.


Currently, a number of metadata standards exist.

USA: Content Standards for Digital Geospatial
Metadata, Federal Geographic Data Committee
(FGDC) http://www.fgdc.gov/metadata/metadata.html
International Organization for Standardization (ISO):
ISO CD 15046 - Part 15: Geographic Information Metadata
Open GIS Consortium (OGC), a private sector
initiative, was formed in 1994 for developing software
specifications to advance geoprocessing interoperability across the GIS industry.

OGC has been working very closely with ISO/TC 211 in identifying the
overlap and division of labor in mutual work programs. The formation of
the ISO/TC 211 - OGC coordination group is a result of such efforts.

Metadata Example ( Image.Lan File )


Position  Field         Type
Byte 0    HDWord        Char[6]
Byte 6    IPACK (Bits)  Short int
Byte 8    NBands        Short int
Byte 10   UnUsed        Char[6]
Byte 16   IColumn       int
Byte 20   IRow          int
Byte 24   XStart        int
Byte 28   YStart        int
Byte 32   UnUsed        Char[56]
Byte 88   MapType       Short int
Byte 90   NClass        Short int
Byte 92   UnUsed        Char[14]
Byte 106  IAUTYP        Short int
Byte 108  ACRE          float
Byte 112  XMap          float
Byte 116  YMap          float
Byte 120  XCell         float
Byte 124  YCell         float
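
A header laid out as in this table can be read with Python's `struct` module. The sketch below assumes little-endian byte order, 2-byte short integers, 4-byte integers and 4-byte floats, which gives a 128-byte header; the sample values, including the header word, are hypothetical.

```python
import struct

# '<' = little-endian without padding (an assumption); 'x' skips unused bytes.
HEADER = struct.Struct("<6shh6xiiii56xhh14xhfffff")
FIELDS = ("HDWord", "IPACK", "NBands", "IColumn", "IRow", "XStart", "YStart",
          "MapType", "NClass", "IAUTYP", "ACRE", "XMap", "YMap",
          "XCell", "YCell")


def read_header(raw):
    """Unpack the first 128 bytes of a file into a field dictionary."""
    return dict(zip(FIELDS, HEADER.unpack(raw[:HEADER.size])))


# Build a hypothetical header in memory instead of reading a real file.
raw = HEADER.pack(b"HEAD74", 0, 3, 512, 400, 0, 0, 0, 0, 0,
                  0.0, 500000.0, 4200000.0, 30.0, 30.0)
header = read_header(raw)
print(HEADER.size)
print(header["NBands"], header["IColumn"], header["IRow"], header["XCell"])
```

The byte offsets implied by the format string match the Position column, so a mismatch in `HEADER.size` would signal a wrong type assumption.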


Data Exchange
Spatial Data Transfer Standard (SDTS)
SDTS, a new transfer format developed by the US government,
was designed to handle all types of geographic data. SDTS
can be either binary or ASCII but is generally binary.
Virtually all geographic concepts can be encoded in SDTS,
including coordinate information, complex attribute
information, and display information. This versatility causes
a corresponding increase in complexity. To simplify things,
several standard subsets of SDTS have been adopted. The
first of these, the Topological Vector Profile (TVP), is used to
store certain types of vector maps. SDTS can also be used for
raster information. Not much data is available in SDTS
format at this time, nor do many software systems support it.
However, it will be the foundation of the US National Spatial
Data Infrastructure (NSDI). Its importance will increase as
more NSDI data becomes available.