
Geospatial Modeling Environment

Data Assembly, Part III
GIS Cyberinfrastructure Module
Day 5

R Questions?

Address any R questions from the tutorial
Become familiar with the Geospatial
Modeling Environment (GME)
Complete and export a dataset suitable for
species distribution modeling
Data Management

Computing Notes
GME is the replacement for Hawth's Tools,
which are no longer updated and are not
guaranteed to function with ArcGIS 9.3 and
If you are running ArcGIS 9.2 or lower, GME will
not run, but you can use Hawth's Tools instead
To obtain GME, follow the installation
instructions here:
NOTE: You MUST be running ArcGIS 10 and have all GME-associated software to run the currently available version
of GME

GME Functionality
Why use GME?
It formally replaces Hawth's Tools for ArcGIS
versions 9.3.1 and above
Hawth's Tools often function with 9.3.1, but not
always. Technical support for Hawth's Tools has
also ceased.

GME contains some of the same functions as
Hawth's Tools, plus added tools
GME (and Hawth's Tools) conducts analyses that
are either not available in ArcToolbox, or run more
efficiently than tools in ArcToolbox

GME Functionality
How does GME work?
GME commands are entered in the GUI, but
processed in R, using ArcGIS only when
Older versions are run from within ArcMap,
but still process commands in R

The GME interface looks the same whether you are
running the stand-alone version or the version that runs
from within ArcMap

[Figure: the GME interface, labeled with the command line, the menu, and the version and use instructions]


Command History
Once you enter a command, it will disappear from the
command line
The entered command will appear below the version and
instructions information, along with any processing notes
or errors.
You cannot cut and paste code from the command
history back into the command line
To avoid re-typing potentially long code, I strongly
suggest that you write your functions in a text editor
(Notepad, Word, etc.) and paste them into GME from
there. Any edits can then be made quickly and the
revised function re-pasted into GME

Basic Command Setup

GME commands are written like R function calls
They are set up as:
function_name(required input*, optional input*)
*there may be multiple inputs of each type
Type buffer in the command line (no quotes)
You will see all of the required and optional
inputs displayed below the version and
instruction information

Command Setup
For the buffer function, there are three required inputs: in,
out, and distance
There are three optional inputs: units, copyfields, and where
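
Because commands are easiest to edit outside GME, one option is to assemble the command text with a short script and paste the result into the command line. The following is a minimal Python sketch, not part of GME: the helper gme_command is hypothetical, both file paths are placeholders, and the quoting style (text in double quotes, numbers bare, TRUE/FALSE for logicals) is an assumption based on the isectpolyrst example in this module. Only the three required buffer inputs named above are used.

```python
def gme_command(name, params):
    """Format a GME-style command string: name(key=value, ...).

    Assumption: text values are double-quoted, numbers are written bare,
    and Python booleans are written as TRUE/FALSE.
    """
    parts = []
    for key, value in params.items():
        if isinstance(value, bool):  # check bool first: bool is a kind of int
            text = "TRUE" if value else "FALSE"
        elif isinstance(value, (int, float)):
            text = str(value)
        else:
            text = '"' + str(value) + '"'
        parts.append(key + "=" + text)
    return name + "(" + ", ".join(parts) + ")"


# "in" is a reserved word in Python, so the inputs are passed as a dict.
# Both paths are placeholders; substitute your own files.
# How the distance value is interpreted depends on the optional units
# input, so check the help text that GME prints for buffer.
buffer_cmd = gme_command("buffer", {
    "in": r"C:\data\hydro_albers.shp",
    "out": r"C:\data\hydro_buffer.shp",
    "distance": 150,
})
print(buffer_cmd)
```

Keeping the inputs in one place like this makes it quick to change a path or a distance and regenerate the text to paste.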

Buffer Example
Open a text editor and modify this function for your data to
calculate a 150 ft buffer around linear hydro features:
The input shapefile should be your linear hydro shapefile
that was clipped to New England and projected to Albers
Can you reason what the output will be? How is this similar
to or different from the ArcToolbox Buffer tool?
Run the command in GME and add the resulting shapefile to
your map.

Data Check
You should have X files, all projected in Albers
Equal Area. These should already be clipped to
New England. Files provided for you are in
black, files you created are in blue.
2 climate rasters (MAT & MAP)
Species observation points
Species observation point buffers
New England Boundary

Data Management
If your current map is very cluttered and disorganized, I
strongly recommend starting a new map and adding only
the layers you now need
Use the Group function to further organize your layers
Recall: hold down the Control key, click the layers you want to
group in the table of contents, then right-click and select Group

Open the attribute tables of the point and point buffer
layers and remove any unneeded fields left over from previous
processing errors

Data Summarization
Some variables are most informative when considered at
the landscape scale.
We have 2 layers to summarize within the point buffers
(landscape scale): roads and LULC
Roads: sum length of road within buffer
LULC: calculate percentage of each category within buffer
GME can be used for these analyses
We then need to append these landscape summarized
variables along with climate, elevation, slope, and aspect
values to the point observations.

Sum Roads in Buffers

The Sum Line Length in Polys function in GME will sum
the lengths of the lines in a specified input line file that fall within
user-specified polygons.
This tool calculates lengths, so the input
datasets MUST be in a projected coordinate system
Type sumlinelengthinpolys in the GME command line
to see the required and optional tool inputs
Modify this function to run on your data:

Once the tool is complete, open the attribute table of
your point buffers to see the result
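
To make the idea of summing line length within a buffer concrete, the sketch below computes how much of a polyline falls inside one circular point buffer. It is a minimal pure-Python illustration, not GME's implementation: the function names segment_length_in_circle and line_length_in_buffer are hypothetical, the buffer is assumed to be an exact circle, and the coordinates are assumed to be in a projected system so that distances come out in map units.

```python
import math


def segment_length_in_circle(p0, p1, center, radius):
    """Length of the straight segment p0-p1 that lies inside a circle."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    fx, fy = p0[0] - center[0], p0[1] - center[1]
    a = dx * dx + dy * dy
    if a == 0:
        return 0.0  # zero-length segment
    # Points on the segment are p0 + t * (dx, dy) for t in [0, 1].
    # Solve a*t^2 + b*t + c = 0 for where that line crosses the circle.
    b = 2 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - 4 * a * c
    if disc <= 0:
        return 0.0  # the line misses, or only touches, the circle
    root = math.sqrt(disc)
    t_enter = (-b - root) / (2 * a)
    t_exit = (-b + root) / (2 * a)
    # Keep only the part of [t_enter, t_exit] that is on the segment.
    t_start = max(0.0, t_enter)
    t_end = min(1.0, t_exit)
    return max(0.0, t_end - t_start) * math.sqrt(a)


def line_length_in_buffer(vertices, center, radius):
    """Sum the in-buffer length over every segment of a polyline."""
    total = 0.0
    for p0, p1 in zip(vertices, vertices[1:]):
        total += segment_length_in_circle(p0, p1, center, radius)
    return total


# A made-up road that runs east into the buffer centre, then turns north.
road = [(-10.0, 0.0), (0.0, 0.0), (0.0, 10.0)]
print(line_length_in_buffer(road, center=(0.0, 0.0), radius=3.0))
```

With a radius of 3 map units, each leg of this road contributes about 3 units, so the total is about 6. The same arithmetic in unprojected degrees would not give a meaningful length, which is why the inputs must be projected.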

Thematic Raster Summary

LULC data are an example of thematic data: the
numeric categories represent classes rather than
real numeric values
We can use the isectpolyrst command in GME to
calculate the percentage of each point buffer
represented by each LULC class
Enter isectpolyrst in the GME command line to see
the required and optional inputs.
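
The calculation behind a thematic summary can be sketched in a few lines. This is a minimal Python illustration rather than GME's own code: class_proportions is a hypothetical helper, the list of cell values stands in for the raster cells that fall inside one buffer, and the class codes are made up for the example. The output names combine a prefix with V and the class code, which is the pattern GME uses for its result fields.

```python
from collections import Counter


def class_proportions(cell_values, prefix="lu"):
    """Return the share of cells in each class, keyed as <prefix>V<code>."""
    counts = Counter(cell_values)
    total = sum(counts.values())
    return {
        f"{prefix}V{code}": counts[code] / total
        for code in sorted(counts)
    }


# Made-up class codes for the cells inside one buffer.
cells_in_buffer = [11, 41, 41, 41, 82, 82, 41, 11]
print(class_proportions(cells_in_buffer))
```

Here half of the cells belong to class 41 and a quarter each to classes 11 and 82. Averaging the codes themselves would be meaningless, because the codes are labels rather than measurements.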

Thematic Raster Summary

To run the isectpolyrst command, modify the
following function for your file path:
isectpolyrst(in="C:\Users\Jenica\Documents\UConn\GIS Course\Day
raster="C:\Users\Jenica\Documents\UConn\GIS Course\Day2\ne_lulc",

The results should be added automatically to the
attribute table of the in file

Thematic Raster Summary

Open the buffer layer attribute table to view the results
Each field is named luV#, which corresponds to the prefix we
defined (lu) plus V#, where there is one # for each LULC category
(see LULC layer for definitions of numeric classes)
Keep in mind that the outputs are proportions of each LULC type in
each buffer. Raw pixel counts in each class could be obtained by
omitting the proportion = TRUE statement. Proportions could
then be calculated manually after all geoprocessing is complete and
the dataset is exported.

This tool can also be used to summarize
continuous rasters (e.g., climate) within
polygons (e.g., buffers)
What tool have we already used that does
this type of summarization?
Use the isectpolyrst command in GME to
summarize the MAT and MAP climate
rasters within the point buffers

Where does your data assembly stand?
We have a point buffer layer with roads and
LULC summarized
We need to append those buffer values to the
specimen points, along with point values for
MAT, MAP, elevation, slope, and aspect
How might you proceed?

Data Organization
Organize your map data
Group the following layers that you will need for final assembly:
Processed point buffers (with road_sum and luV# fields)
Specimen observation points
These layers should all be in Albers and clipped to New England

Data Organization
A well organized map before final assembly:

Combining Data
Use the Extract Multi Values to Points tool to append
MAT, MAP, DEM, slope, and aspect to the IPANE
specimen observation points
You now have 2 data files: IPANE specimen observation
points with associated environmental data (MAT, MAP,
DEM, slope, aspect) and IPANE point buffers with LULC
and road length summaries.
Use a spatial join to append the buffer data to the points
Export the data to a new shapefile
Delete redundant fields (use the Delete Field tool)

Final Dataset
Check your processing results on the map and the
attribute table to ensure everything looks correct
Are there any features that seem odd?
Do your data need additional quality control (QC)?

Final Dataset
For example, there are points in my dataset that have values for road_sum
and luV#, but all zeros or -1 for the other raster values. Each of our data
layers had slightly different extents, so coastal points may have fallen
outside the data area for some layers. There isn't much we can do about
this, other than to be sure that missing data are coded as such.
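
One way to code such records consistently is sketched below in Python. It is only an illustration under stated assumptions: mask_missing_rasters is a hypothetical helper, the field names are placeholders for whatever your exported table uses, and the rule applied is the one described above, where a point is treated as outside the raster extents only when every raster-derived value is 0 or -1. The road_sum field is left untouched because zero can be a real value there.

```python
import csv
import io

# Placeholder names for the raster-derived fields in the exported table.
RASTER_FIELDS = ["mat", "map", "dem", "slope", "aspect"]
SENTINELS = {0.0, -1.0}  # values that may indicate "no data" here


def mask_missing_rasters(rows, raster_fields=RASTER_FIELDS):
    """Set raster fields to NA when all of them hold 0 or -1."""
    cleaned = []
    for row in rows:
        row = dict(row)  # copy so the input rows are not modified
        values = [float(row[field]) for field in raster_fields]
        if all(value in SENTINELS for value in values):
            for field in raster_fields:
                row[field] = "NA"
        cleaned.append(row)
    return cleaned


# A tiny made-up table standing in for the exported text file.
sample = io.StringIO(
    "id,road_sum,mat,map,dem,slope,aspect\n"
    "1,120.5,8.1,1100,35,2.0,180\n"
    "2,40.0,0,0,0,0,-1\n"
)
rows = mask_missing_rasters(csv.DictReader(sample))
for row in rows:
    print(row)
```

Requiring all raster fields to match before masking is a deliberately cautious choice: a single 0 or -1 is left alone, since a flat cell, for example, can legitimately have zero slope. Any record the rule flags is still worth checking on the map.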

Final Dataset
After exploring your dataset and determining which points
need QC, export the data in tabular form
You can export a variety of file types
Text files are very flexible and can be used easily in Excel or R
dBASE files are commonly used in ArcGIS and can be used in R and

Open your exported file in Excel

Do any needed quality control
Enter NA for cells that you feel
should be quality controlled

Final Dataset
Save your quality-controlled table
You now have a dataset that could be used to
model your species distribution based on
environmental factors. Most statistical modeling
(such as species distribution modeling) is
conducted external to ArcGIS, which is why we
needed to export the dataset to tabular form.

What Have We Learned?!

Geospatial data are complex and must be processed with
Designing a workflow before processing can save a lot of
Datum and projections are both critical and sometimes
difficult to handle
Many analyses and transformations can be accomplished in
ArcGIS, but add-on tools can also be quite helpful
Geospatial processing can be time consuming and should
not be approached with short timelines if possible
Practicing consistent data management is the best way to
prevent file chaos and wasted memory

Skills Summary
Geospatial Modeling Environment
Function coding

Sum Line Lengths in Polygons

Thematic Raster Summary
Effective Data Organization
Quality Control of Processed Data

Read the two papers posted on the course website
Complete sections 1.5, 1.6, 2.1-2.3, and 3.1
of the R tutorial on the course website