
Spatial Data Services

Statistical Analysis for ArcView

Reference Manual
June 1999

List of Topics

1. Licensing
Liability
License Agreement
Copyright
Trademark Acknowledgment

2. Overview
Who is SDS Spatial Data Services?
What is Gstats?
Re-Sale Opportunities
Functionality
ArcView Requirements
Platform
What is Gstats Pro?

3. Using Gstats
Loading Gstats
Unloading Gstats

4. Null Values
Setting a Null Value
Clearing a Null Value

5. Charts
Ternary Diagram
Histogram Chart
Scatter Chart
Probability Plot

6. Data in Plan
Spider Plot
Percentile Plot
Bi-Variate Plot
Point Labeling

7. Processing & Reporting
Correlation
Level to Background
Statistics
Regression
Concatenation
Percentile values
Cumulative values
Adding a unique record identifier

Licensing
Liability
SDS Spatial Data Services accepts no liability arising from the use of Gstats software or use
of Gstats documentation. SDS accepts no responsibility for technical errors or omissions
associated with Gstats software.

License Agreement
Gstats is licensed for your use. The software is licensed for single use and may not be
transferred to a third party for any reason. As the licensee you are granted permission to make
copies of the software for backup purposes only.
The license can be discontinued by returning the original software to SDS and destroying
any copies in your possession. SDS can discontinue your license at any time without reason.
No refund is available for discontinued software.
In no event will SDS be liable to you for any damages, including any lost profits,
lost savings or other incidental or consequential damages arising out of the use of or inability
to use this program, or for any claim by any third party.
In accepting a license for Gstats, you agree to the license terms described above and
acknowledge that SDS will be in no way responsible for any damages resulting from use of
the software by yourself or any third party.

Copyright
SDS is the owner of the Gstats software and all associated documentation.
The software and documentation are protected by Copyright 1999. Neither may be
reproduced without written permission from SDS.

Trademark Acknowledgments
Gstats is a registered trademark of SDS.
ArcView is a registered trademark of ESRI.
Windows 95 and Windows NT are registered trademarks of Microsoft.

Overview
Who is SDS?
SDS Spatial Data Services specialize in the development of GIS applications for the
resources sector. SDS are licensed by ESRI to develop software for ArcView and to resell
ESRI ArcView GIS software.

What is Gstats?
Gstats has been developed to assist in the analysis and presentation of point data using ESRI
ArcView. An advanced version of Gstats, which provides additional gridding, contouring and
visualization functionality, is under development.

Contacting SDS
Address
25 Richardson St, West Perth, Western Australia, 6005
P.O. Box 943, West Perth, Western Australia 6872
Telephone +61 8 9486 7587
Mobile 0412 509 356
Fax +61 8 9322 2994
Internet
E-mail: info@sds.au.com
Web site: http://www.sds.au.com

Functionality
Gstats is an ArcView extension which enhances the existing statistical capability of
ArcView.
The software has been developed to provide basic statistical tools which are easy to
understand and use. Many graphing, table and plan display functions are available.

ArcView Requirements
ArcView 3.0 or later is required to run Gstats.
Gstats has been developed to work with standard ArcView, avoiding the cost of purchasing
additional ArcView extensions such as 3D Analyst or Spatial Analyst. Additional functionality
is available in the form of Gstats Pro, which requires the ArcView extension Spatial
Analyst.

Platform
Gstats is written entirely in ArcView Avenue and as such is platform independent. Gstats
can run with ArcView on Windows 95/NT, UNIX or Apple Macintosh.

For Windows 95 or NT, a Pentium processor with 32 MB of RAM is required.

What is Gstats Pro?
Gstats Pro provides additional functionality not available in Gstats. Many Gstats Pro tools
require the availability of ArcView Spatial Analyst. Tools are available which allow for the
gridding and contouring of point data and extraction of information from surface data.

Using Gstats
Installation
1. Copy the file gstats.avx into your AVHOME\ext32 directory. This will often be
C:\ESRI\AV_GIS30\ARCVIEW\EXT32.
2. Copy the file gstats.bmp into the same directory.

To Load Gstats
Gstats is an ArcView extension which can easily be loaded into your existing ArcView
interface.
1. Select File | Extensions from the ArcView menu. The Extensions dialog box will appear.
2. Scroll down the list of extensions and select the Gstats check box.
3. Select OK to have the Gstats extension available in the current project or Make Default
to have Gstats automatically load into all projects.
As Gstats is being loaded, a dialog will be displayed.

To Unload Gstats
1. Select File | Extensions from the ArcView menu. The Extensions dialog box will appear.
2. Scroll down the list of extensions and clear the Gstats check box (which should already
be selected).
3. Select OK.
The Gstats extension will no longer be available in your current project.

Null Values
Description
Null values are data values which are flagged to indicate that no data is present. This is
different to 0, which could be a valid number.
When a Null value is set, all Gstats functions that perform analysis of data will exclude null
values from the analysis.
ArcView does not store no-data values. Instead, wherever a value has no data, 0 is
applied.
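
The effect of a Null value on analysis can be pictured with a short Python sketch. This is
an illustration only and not Gstats code (Gstats itself is written in Avenue); the function
name valid_values and the sample null value -9999 are assumptions made for the example.

```python
def valid_values(values, null_value=None):
    """Return the values that take part in analysis.

    If a null value is set, every entry equal to it is excluded.
    If no null value is set (None), all entries are kept, including 0.
    """
    if null_value is None:
        return list(values)
    return [v for v in values if v != null_value]


readings = [12.5, -9999, 0, 7.5, -9999]
kept = valid_values(readings, null_value=-9999)
print(kept)                   # [12.5, 0, 7.5]
print(sum(kept) / len(kept))  # mean of the valid values only
```

Note that 0 is kept in the result, which mirrors the distinction made above between a
null value and a valid zero.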

Setting a Null Value
1. From the initial Gstats menu, check the Null tool.
2. In the input box that follows, enter a number. The last selected null value is provided by
default. A non-numerical entry will not be accepted.
The Null value will remain set whilst the Null tool remains checked.

Clearing a Null Value
1. From the Gstats menu, uncheck the Null tool.

Charts
Introduction
ArcView provides many different chart types as standard.
These include:
Bar Charts
Scatter Charts
Line Charts
Pie Charts

Chart Querying and Selections
Charts allow data to be interrogated like a theme in a view. Attributes of a chart point can be
interactively displayed.
A selection made on a table or theme controls which data will be displayed in a chart.
From within a chart, chart elements can be removed, changing the original table or theme
selection.
Gstats will automatically link charts, tables and themes. This can cause delays in processing
as multiple ArcView documents are updated in response to changes in the current document.
If processing becomes slow, remove some of the links between linked tables.
Many standard chart functions are available. Refer to the ESRI guide Using ArcView GIS
for a full description of the creation of charts.

Gstats Charts
Gstats provides a number of additional chart functions not available in the standard release of
ArcView 3.1.
These include:
Ternary Diagrams
Probability Plots
Complex Scatter Plots
Theme and binned histograms (bar chart).

Chart Limits
Standard ArcView charting can display a maximum of around 100 points in a chart.
The number of points which Gstats can process is limited only by memory. Gstats also allows
standard ArcView charts of any size to be created.

Ternary Diagram
Description
Provides a graphical summary of values from 3 different fields along separate axes.
The axes form a triangle, with the sum of values for any given point equaling 100 percent.
When a field representing a unique ID is selected, each point is linked to its source, whether
a table or theme in a view.
Selections in the resulting Ternary Diagram can be viewed interactively with the source data.
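
The scaling of three values to 100 percent and their placement inside a triangle can be
sketched in Python as follows. This is a minimal illustration and not the Gstats
implementation; the function name ternary_point and the vertex layout (first field at the
top, second at the bottom left, third at the bottom right of a unit-sided triangle) are
assumptions made for the example.

```python
import math


def ternary_point(a, b, c):
    """Scale three values to percentages and place them in a triangle.

    Returns ((pa, pb, pc), (x, y)) where the percentages sum to 100 and
    (x, y) lies in an equilateral triangle with side length 1.
    Assumed layout: A at the top, B bottom left, C bottom right.
    """
    total = a + b + c
    if total <= 0:
        raise ValueError("the three values must sum to a positive number")
    pa, pb, pc = (100.0 * v / total for v in (a, b, c))
    x = (pc + pa / 2.0) / 100.0
    y = (pa / 100.0) * math.sqrt(3) / 2.0
    return (pa, pb, pc), (x, y)


percentages, position = ternary_point(2, 1, 1)
print(percentages)  # (50.0, 25.0, 25.0)
print(position)
```

A record made up entirely of the third field, for example ternary_point(0, 0, 5), falls
on the bottom right vertex of the triangle.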

Required Input
3 different numeric fields, one for each axis.
An output point theme (you will be prompted for this upon selection of the create
button).
An output polygon theme (you will be prompted for this upon selection of the create
button).

Optional
A field representing unique values, allowing data points to be linked with the source
table.
Check to have selected or all records processed.
Check to create a report describing the processing.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Not part of the current selection (if Use Selected was checked).

Output
A new view comprising the ternary frame and related points.
Two new themes are created:
A point theme displaying the relationship between the input fields.
A polygon theme displaying the bounding triangle.

Histogram Chart
Description
Displays a histogram of values for a selected field.
Histograms can be used to visualize the distribution of data.
For tables or themes with many records, data from a selected field can be binned.
Two methods for binning data are possible: linear, where the data is placed into bins
whose ranges are evenly spaced across the data range, or percentile, where bins are
created according to the percentile values calculated from the input percentile classes.
When data is binned, the display cumulative option must be checked to display the
binned data cumulatively. When Bin Data is not selected, the selected field will always be
displayed cumulatively.
See Percentile Values for important information on the results of calculating percentile
intervals.
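
The difference between the two binning methods can be sketched in Python. This is an
illustration only; the function names are hypothetical, and the nearest-rank rule used to
find a percentile value is an assumption, since the manual does not state how Gstats
derives it.

```python
import math
from itertools import accumulate


def linear_edges(values, n_bins):
    """Bin edges evenly spaced across the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins)]
    edges.append(hi)  # close the last bin exactly at the maximum
    return edges


def percentile_edges(values, percentiles):
    """Bin edges at the data values found at the given percentiles.

    Uses the nearest-rank rule, which is an assumption of this sketch.
    """
    ordered = sorted(values)
    n = len(ordered)
    edges = [ordered[0]]
    for pct in sorted(percentiles):
        rank = max(1, math.ceil(pct / 100.0 * n))
        edges.append(ordered[rank - 1])
    if edges[-1] != ordered[-1]:
        edges.append(ordered[-1])
    return edges


def bin_counts(values, edges):
    """Count values per bin; the last bin also includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1] or (i == last and v == edges[-1]):
                counts[i] += 1
                break
    return counts


data = list(range(1, 11))
linear = bin_counts(data, linear_edges(data, 3))
print(linear)                    # [3, 3, 4]
print(list(accumulate(linear)))  # [3, 6, 10], the cumulative display

# Percentile classes entered as a white-space separated list
classes = [float(p) for p in "25 50 75".split()]
print(percentile_edges(data, classes))  # [1, 3, 5, 8, 10]
```

Linear bins have equal widths but may hold very different numbers of records, whereas
percentile bins hold roughly equal shares of the records but may have very different
widths.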

Required Input
A field whose values will be used in the construction of the histogram.
If a classified histogram is to be created, a theme with a classified legend must be the
current data source.

Optional
Check to display the cumulative values for a field.
A chart title.
Check to bin data.
Check to select a bin method.
Entry of the number of bins for linear binning, or a list of white-space separated
numbers for percentile binning.
Check to have selected, or all records processed.

Dependents
If cumulative values are used, a field representing unique values for each record is
required.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).

Output
A Bar Graph, linked to the input source.
A table of cumulative values if this option is checked.
A table of binned data values if Bin Data is checked.


Scatter Chart
Description
Graphically displays the relationship between values of 2 fields.
Values are scaled along each of the 2 axes, with a point symbol displayed at the
intersection point.
Calculation of regression is also included and is added to the scatter title.
The chart is linked to the input data source (theme or table). Changes in one selection will
be reflected in the other. Output cumulative or percentile tables can be linked to the original
data source to provide interaction with the original source document.
All charting functions associated with ArcView scatter charts are available.

Required Input
2 fields, one for each axis.

Optional
Check to display cumulative values on the Y axis.
Check to display percentile values on the X axis.
Check to display the X or Y axis in log form.
Check to have selected, or all records processed.
A chart title.

Dependents
If cumulative or percentile values are used, a field representing unique values for each
record is required.
An output table name if cumulative or percentile values are displayed.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).
If cumulative or percentile options are checked, a table to store each type of data will be
created.

Output
A Scatter Graph, linked to the input source.
A table of cumulative or percentile values if either of these options is checked.


Log Probability Chart
Description
Graphically displays the distribution of data in a single field.
Inflection points can be identified where there is a significant change in grade.
The X axis displays the percentile value of each point whilst the Y axis displays the
cumulative frequency of each point.
All charting functions associated with ArcView scatter charts are available.

Required Input
1 numeric field.
A field whose value represents a unique record.

Optional
A chart title.
Check to have selected, or all records processed.

Dependents
Nil.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).
A table is created to store cumulative data.
A table is created to store percentile data.

Output
A log probability plot, linked to the input source.
A table of cumulative or percentile values if either of these options is checked.


Data in Plan
Introduction
Users of ArcView will be familiar with the ease with which point data can be displayed in
plan.
Data can be made to look very different given a different classification method, selection
subset or color display.
Gstats provides a number of tools which enhance ArcView's ability to display and plot data
in plan.
These include:
Percentile Plot.
Spider Plot.
Point Labeling.
Bi-Variate Plot.


Spider Plot
Description
A spider plot is an effective way to visualize multiple values for a single point.
Up to 8 values, representing intervals of 45 degrees from 0 to 360, can be plotted.
All selections not equal to None in the Plot Field list are used.
For each value, a line of specified color is plotted, with its end point calculated according
to the orientation and size of the line.
The line size is defined by the proportion of the current value to the maximum value for the
field being plotted.
The plotting of lines can be further refined by selecting a cutoff above which values will be
considered.
Definition of an intended plot scale and maximum line size allows creation of lines suitable
for plotting.
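
The end point of a single spider line can be sketched in Python from the quantities
described above. This is an illustration only, not the Gstats code; the function name is
hypothetical, and two assumptions are made: the orientation is measured clockwise from
north, and the view's map units are metres.

```python
import math


def spider_endpoint(x, y, value, field_max, bearing_deg, scale, max_len_mm):
    """End point of one spider line drawn from the sample at (x, y).

    The line length on paper is max_len_mm scaled by value / field_max.
    It is converted to map units (assumed metres) using the plot scale,
    e.g. scale=10000 for a 1:10,000 plot.
    bearing_deg is assumed to be measured clockwise from north.
    """
    if field_max <= 0:
        raise ValueError("field maximum must be positive")
    length_mm = max_len_mm * value / field_max
    length_map = length_mm / 1000.0 * scale  # mm on paper to metres on ground
    theta = math.radians(bearing_deg)
    return (x + length_map * math.sin(theta),
            y + length_map * math.cos(theta))


# A value at half the field maximum, plotted due east at 1:10,000
# with a 10 mm maximum line: 5 mm on paper, 50 m on the ground.
print(spider_endpoint(0.0, 0.0, 50, 100, 90, 10000, 10))
```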

Required Input
A field and a color for a single orientation.
A scale.
A maximum line length in mm.

Optional
Check to have selected, or all records processed.

Dependents
NIL

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).
Less than the specified cut-off for a nominated field.

Output

For each orientation selected (up to 8), a graphics group is created which represents a
layer of information which can be deleted, edited or removed from the current view.
A new view is created which contains a legend for referral and plotting.


Percentile Plot
Description
A very effective way to visualize data is to rank and display data by percentile.
A percentile plot displays data in plan, where each symbol represents a different percentile
range.
In some applications, it is useful to examine the low or high percentiles to gain an
understanding of outstanding values, distinct from the remainder of the population.
See Percentile Values for important information on the results of calculating percentile
intervals.

Required Input
A check to display a percentile. An associated percentile value (0 to 100), size and color.
Selection of a field to display percentile value for.

Optional
Check to have selected, or all records processed.

Dependents
If no records are selected in the current theme, the Use Selected box will not be available.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).

Output
A new theme is added to the current view, which displays the classes of percentiles
nominated in the input dialog.
If the Use Selected box was selected, points not in the original selection will not be
displayed.


Bi-Variate Plot
Description
Values of 2 fields are classified independently, producing the same number of classes in
each case.
These classes are combined in a legend classification with each class being displayed with
the same symbol color and size.
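
The idea of classifying two fields independently and then combining the classes can be
sketched in Python. This is an illustration only; the function names are hypothetical and
an equal-interval classification is assumed, whereas Gstats lets a classification method
be chosen for each field.

```python
def equal_interval_class(value, lo, hi, n_classes):
    """Index (0-based) of the equal-interval class that holds value."""
    if hi == lo:
        # No classes can be defined when every value is the same.
        raise ValueError("Data is all equal")
    index = int((value - lo) / (hi - lo) * n_classes)
    return min(index, n_classes - 1)  # the maximum falls in the top class


def bivariate_classes(xs, ys, n_classes):
    """Pair of class indices (x class, y class) for every record."""
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    return [
        (equal_interval_class(x, x_lo, x_hi, n_classes),
         equal_interval_class(y, y_lo, y_hi, n_classes))
        for x, y in zip(xs, ys)
    ]


pairs = bivariate_classes([0, 5, 10], [10, 20, 30], n_classes=2)
print(pairs)  # [(0, 0), (1, 1), (1, 1)]
```

With n classes per field, up to n x n combined classes can appear in the legend.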

Required Input
2 Fields.
A Classification method for each field.
A display color ramp.
The number of classes to generate for each field.

Optional
NIL

Dependents
If no records are selected in the current theme, the Use Selected box will not be
available.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
If all values within a field are equal, no classes can be defined. As a result, the message
"Cannot create classes for field", "Data is all equal" will appear.
If no data is valid, the message
"No valid data in field", "Cannot create display" will appear.

Output
A new theme is added to the current view, whose legend displays multiple classes
created from the combination of the 2 input fields.


Point Labeling
Description
Gstats allows easy labeling of numerical data relative to a point.
Values from up to 8 fields can be plotted, each at 1 of 8 label placements.
A cutoff for each field can be defined. Values equal to or above this value will be plotted.
For each field, the color of text and angle of text can also be defined.

Required Input
A field, and associated selection of cutoff, angle, size and color.
A scale.

Optional
Check to have selected, or all records processed.

Dependents
If no records are selected in the current theme, the Use Selected box will not be available.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).
Less than the specified cut-off for a nominated field.

Output
For each selected field, labels will be drawn on the current view. A graphics group is created
for each field, which represents a layer of information which can be deleted, edited or
removed from the current view.


Processing & Reporting
Introduction
Processing data involves applying a constant or formula to data in a logical manner to output
new data. The new data may reveal information which was previously unseen.
In standard ArcView, there are many ways to process data to generate new information.
These include:
Summarizing data.
Calculating new data using a combination of numerics, constants and functions.
Classifying data into groupings based on the data distribution.
Ratioing values from 2 fields.
Gstats provides additional functionality in the form of:
A leveling tool.
Calculation of correlation coefficients.
Calculation of regression.
Calculation of advanced statistics.
In addition, 2 utility tools are provided which enable you to:
Concatenate values from many fields into a single field.
Add a unique record number to each record.
The leveling tool in particular may require the pre-creation of a concatenated field to
represent groups of data formerly defined in many fields.
Assigning a unique value to each record allows many tools to link newly generated
data to the input data source. This is the case for many of the charting functions.


Correlation Coefficient
Description
A correlation coefficient is a statistic that describes how similar the distributions of two
columns of data are.
A correlation coefficient will be between 0 and 1.
A 0 value indicates that no correlation exists, whilst 1 indicates exact correlation.
Correlation coefficients can be viewed in a popup message or written to a table for
permanent storage and usage.
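
For reference, the widely used Pearson correlation coefficient can be computed as in the
Python sketch below. The manual does not state which formula Gstats applies, so this is
an assumption for illustration only. The Pearson coefficient itself ranges from -1 to 1;
the 0 to 1 range described above would correspond to its magnitude or its square, and
which of these Gstats reports is not stated.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a field with identical values has no correlation")
    return sxy / math.sqrt(sxx * syy)


r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
print(round(r, 6))       # 1.0, an exact positive relationship
print(round(abs(r), 6))  # magnitude, which lies between 0 and 1
```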

Required Input
At least 2 fields for which a correlation will be tested.

Optional
Check to have selected, or all records processed.
Check to have records output to a table.

Dependents
If no records are selected in the current theme, the Use Selected box will not be available.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).

Output
A popup report of correlation values or a table of information of the same data.


Level to Background
Description
A table of data can contain within it groupings of data which, unless standardized in some
way, are not easily compared.
The process of standardizing data is known as leveling, the result of which allows data to be
uniformly compared.
Leveling may be undertaken in a number of ways, depending on the data being leveled.
When leveled by percentile, each group of information has its nominated percentile value
calculated, with this value being used to ratio all data for the current group.
In addition, the mean of the lower selected percentile value can be used as the level-by
value.
When leveled by value, a value within a group will be ratioed to the level value.
Each field within the current table can be leveled in a different way.
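
Leveling by percentile can be sketched in Python as follows. This is an illustration
only, not the Gstats implementation; the function names and the record layout are
hypothetical, and the nearest-rank rule for the percentile value is an assumption.

```python
import math
from collections import defaultdict


def nearest_rank_percentile(values, pct):
    """Value at the given percentile using the nearest-rank rule (assumed)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def level_by_percentile(records, pct):
    """Ratio each value to the percentile value of its own group.

    records is a list of (unique_id, group, value) tuples.
    Returns {unique_id: ratio}. Groups whose percentile value is 0
    are skipped, because a ratio cannot be formed.
    """
    groups = defaultdict(list)
    for _, group, value in records:
        groups[group].append(value)
    background = {g: nearest_rank_percentile(v, pct) for g, v in groups.items()}

    leveled = {}
    for unique_id, group, value in records:
        if background[group] == 0:
            continue  # not processed
        leveled[unique_id] = value / background[group]
    return leveled


samples = [(1, "A", 10), (2, "A", 20), (3, "A", 30),
           (4, "B", 100), (5, "B", 200), (6, "B", 300)]
print(level_by_percentile(samples, 50))
# {1: 0.5, 2: 1.0, 3: 1.5, 4: 0.5, 5: 1.0, 6: 1.5}
```

After leveling, groups A and B share the same scale even though their raw values differ
by a factor of ten, which is what allows them to be compared uniformly.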

Required Input
Selection of at least one field for which data will be leveled (use the ON/OFF buttons
to select/de-select the current field).
Selection of a field which contains a unique value for each record.
Selection of a field which contains values which represent groupings of data.
Selection of a method of leveling for each selected field.

Optional
Check to have selected, or all records processed.
Check to have processing reported to a file.
Check to use the mean of the calculated percentile value.

Dependents
If no records are selected in the current theme, the Use Selected box will not be
available.
When the percentile method of leveling is selected, a percentile value must be input.

Processing
In some cases, a group field may exist, containing a single value, which can be used to group
the data during processing. Sometimes however, a group may be defined by values in more
than 1 field. If this is the case, you may need to combine values from multiple fields into a
single field prior to leveling your data (see Concatenate).

All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).
Processing is undertaken according to collections of data derived from the grouping field.

Output
A table is created of the output leveled data, containing the fields <unique_id>,
<group fld>, <fld>_rat.
If selected, a report is generated, describing the leveling process: the groupings and
associated leveling method and values for each field.
If a value of 0 results from the calculation of a percentile value or its mean, then the
following entry will be placed in the output report file:
"Not Processed. Group <grp> for <fld> returned a value of 0 for percentile <apct>."
If no valid records exist in the current grouping, the following is reported to the output file:
"Not processed. Group <grp> for <fld> has no valid records."
For groups which have been successfully processed, the following is reported to the
output file:
"Processing Successful. Group <grp> for <fld> Percentile value of <apct> = <aval>"

Statistics
Description
Many different types of statistics are useful for analyzing data.
The following statistics are reported, or written to a table as required:
Sum
Count
Maximum
Minimum
Mean
Median
Midrange
Harmonic Mean
Quadratic Mean
Mode
Range
Variance
Standard Deviation
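
The statistics listed above can be reproduced with the Python standard library, as in the
sketch below. This is an illustration only; the function name summary is hypothetical,
and the use of the sample (n - 1) form of variance and standard deviation is an
assumption, since the manual does not say which form Gstats reports.

```python
import math
import statistics


def summary(values):
    """Compute the statistics named in the list above for one field."""
    return {
        "sum": sum(values),
        "count": len(values),
        "maximum": max(values),
        "minimum": min(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "midrange": (min(values) + max(values)) / 2.0,
        "harmonic_mean": statistics.harmonic_mean(values),
        "quadratic_mean": math.sqrt(sum(v * v for v in values) / len(values)),
        "mode": statistics.multimode(values),  # list, as ties are possible
        "range": max(values) - min(values),
        "variance": statistics.variance(values),        # sample form (assumed)
        "standard_deviation": statistics.stdev(values),  # sample form (assumed)
    }


for name, value in summary([2, 4, 4, 8]).items():
    print(name, value)
```

The midrange is the average of the minimum and maximum, and the quadratic mean is the
square root of the mean of the squared values.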

Required Input
If statistics are to be reported, a field must be selected.
If statistics are to be written to a table, one or more fields must be selected.

Optional
Check to have selected, or all records processed.

Dependents
If no records are selected in the current theme, the Use Selected box will not be
available.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).

Output
A table or report of statistical information.


Regression
Description
Least squares regression produces a line of best fit between 2 variables: one which is
considered to be correct (the independent variable), and the other which contains error
(the dependent variable). Regression minimizes the error.
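
Ordinary least squares for a straight line can be sketched in Python as follows. This is
a textbook formulation offered for illustration, not the Gstats code; the function name
is hypothetical, and returning r squared as the measure of fit is an assumption, since
the manual does not define the "regression value" it reports.

```python
def least_squares(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    xs holds the independent field and ys the dependent field.
    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("independent field has no variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A constant dependent field is fitted exactly by a horizontal line.
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


print(least_squares([0, 1, 2, 3], [1, 3, 5, 7]))  # (2.0, 1.0, 1.0)
```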

Required Input
A dependent field.
An independent field.

Optional
Check to have selected, or all records processed.

Dependents
If no records are selected in the current theme, the Use Selected box will not be
available.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).

Output
A report detailing the regression value, intercept and slope.


Concatenate
Description
Concatenation involves the combination of values from multiple fields into a single new
field for each selected record.
Concatenation of data can be useful to combine data to create a unique identifier.
Some functions, such as the Gstats leveling tool, may require that you concatenate field
values to create a grouping field.
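
Concatenation of field values can be pictured with the short Python sketch below. It is
an illustration only; the function name, the record layout and the optional separator are
assumptions, as the manual does not say whether Gstats inserts a separator between values.

```python
def concatenate_fields(record, field_names, separator=""):
    """Join the values of the named fields into one string."""
    return separator.join(str(record[name]) for name in field_names)


rows = [
    {"project": "P1", "zone": "North", "au": 1.2},
    {"project": "P1", "zone": "South", "au": 0.4},
]
for row in rows:
    # The new field can then serve as a grouping field for leveling.
    row["grp"] = concatenate_fields(row, ["project", "zone"], separator="_")
print([row["grp"] for row in rows])  # ['P1_North', 'P1_South']
```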

Required Input
Selection of more than 1 field.
Definition of a new field name.

Optional
Check to have selected, or all records processed.

Dependents
If no records are selected in the current theme, the Use Selected box will not be
available.

Processing
The table containing the fields to be concatenated must be editable.
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).

Output
A field is added to the existing table representing a combination of values from the input
fields.


Percentile Values
Description
A very effective way to visualize data is to rank and display data by percentile.
Percentile data is used in many Gstats functions and written to tables on the fly. In each
case a new percentile table is created.
Creating a table of percentile values for multiple fields prior to processing can make the
management of percentile data more effective.
The number of records assigned to each percentile class will always be correct. The value
range for each class is a weighted average. There is no way to precisely calculate the value
range for each class. Discrepancies may exist between the percentile range and the number
of samples in that range. In this case, the number of records for the calculated percentile is
correct.
Once a percentile table has been created, it can be joined to an existing table by a common
field identifier.
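
One simple way to assign a percentile to every record is to rank the records and express
each rank as a percentage of the record count, as in the Python sketch below. This is an
illustration under that assumption only; the function name is hypothetical, and the
weighted averaging that Gstats applies to class value ranges is not described in this
manual and is not reproduced here.

```python
def percentile_ranks(records):
    """Percentile of each record, keyed by its unique identifier.

    records is a list of (unique_id, value) pairs. Records are ranked by
    value and the rank is expressed as a percentage of the record count.
    Tied values receive consecutive ranks in their input order.
    """
    ordered = sorted(records, key=lambda pair: pair[1])
    n = len(ordered)
    return {uid: 100.0 * (i + 1) / n for i, (uid, _) in enumerate(ordered)}


table = percentile_ranks([("a", 30), ("b", 10), ("c", 20), ("d", 40)])
print(table)  # {'b': 25.0, 'c': 50.0, 'a': 75.0, 'd': 100.0}
```

Because the result is keyed by the unique identifier, it can be joined back to the
source table in the way described above.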

Required Input
A field defining a unique identifier, which can be used at a later date to join (relate) data
to the output table.
At least one field for which percentile values will be created.

Optional
Check to have selected, or all records processed.

Dependents
If no records are selected in the current theme, the Use Selected box will not be
available.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).

Output
A table is created containing an id field for each input field and an output percentile field
called <field>_pct.
A percentile value in any given row has no relationship with a percentile value of another
field in the same row.
The output table can be used in the creation of scatter charts, log probability plots and
any other chart you wish to create.


Cumulative Values
Description
Many Gstats functions make use of cumulative data.
Cumulative data allows different groups of information to be identified from a single dataset.
Creating a table of cumulative values for multiple fields prior to processing can make the
management of cumulative data more effective.
Once a cumulative table has been created, it can be joined to an existing table by a common
field identifier.

Required Input

A field defining a unique identifier, which can be used at a later date to join (relate) data
to the output table.
At least one field for which cumulative values will be created.

Optional

Check to have selected, or all records processed.

Dependents

If no records are selected in the current theme, the Use Selected box will not be
available.

Processing
All valid records are processed. A record may be invalid if it is:
equal to the current Null value
Null
Not part of the current selection (if Use Selected was checked).

Output

A table is created containing an id field for each input field and an output cumulative field
called <field>_cum.
A cumulative value in any given row has no relationship with a cumulative value of
another field in the same row.
The output table can be used in the creation of scatter charts, log probability plots and
any other chart you wish to create.


Adding a unique record identifier
Description
Many Gstats options require the presence of a unique value for each record in a table. The
unique value is most often used to relate newly created data to the original table. Use this tool
to add a unique record number to each record.

Required Input

Selection of an editable theme or table.

Optional

NIL.

Dependents

NIL.

Processing

This tool adds a unique identifier for each record, beginning at 1 and ending with <count
records>. If a field called recno already exists in the current table, this field can be
updated.
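
The numbering described above amounts to the following Python sketch. It is an
illustration only; the function name and the representation of a table as a list of
dictionaries are assumptions, while the field name recno is taken from the text.

```python
def add_record_numbers(rows, field="recno"):
    """Number the rows from 1 to len(rows), updating the field if present."""
    for number, row in enumerate(rows, start=1):
        row[field] = number
    return rows


table = [{"au": 1.2}, {"au": 0.4, "recno": 99}, {"au": 2.7}]
add_record_numbers(table)
print([row["recno"] for row in table])  # [1, 2, 3]
```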

Output

An updated or newly created field containing record numbers.

