An Introduction to Variography using Variowin

Luc Anselin
Spatial Analysis Laboratory
Department of Agricultural and Consumer Economics
University of Illinois, Urbana-Champaign
http://sal.agecon.uiuc.edu/
June 24, 2003

Introduction

This is a brief introduction to the exploration and modeling of variograms using Yvan
Pannatier’s Variowin 2.21 software package. The package is freely available from
http://www-sst.unil.ch/research/variowin/index.html . However, no manual is
available on the web. The “official” manual is the book by Pannatier (Springer Verlag,
1996), which includes a disk with an older version of the software.

The data used in this tutorial are the Baltimore house prices (baltprice.dat) and the Los
Angeles ozone data (laozone.dat), both obtainable from the SAL sample data repository
http://sal.agecon.uiuc.edu/stuff/data.html. The data sets come in the format required by
Variowin.

Program Basics

Variowin consists of a collection of four programs (as .exe files) that need to be run
separately. You will be using three of the four in this tutorial: Prevar2D (a utility to
construct a distance matrix for all point pairs in the data set), Vario2D with PCF
(exploring variograms) and Model (fitting theoretical variogram models). You can start
each of the programs in the usual way, by clicking on the matching shortcuts (see Figure
1), or by running the executables (prevar2d.exe, vario2dp.exe and model.exe) in the
directory where Variowin was installed.

Figure 1. Variowin program shortcuts for Prevar2D, Vario2D and Model.

The data input files for Variowin need to be in a specific format (Geo-EAS), common to
many geostatistical software packages. Each data file starts with a header line containing
a descriptive title. Next follows a line with the number of variables. The following set of
lines contains the variable names, one per line. Next are the actual values, with a new line
for each observation, and the values separated by tabs or spaces, but not by commas. The
last line in the file should be a blank line. For example, in Figure 2, the first few lines of
the file baltprice.dat are shown. This data set contains six variables: an ID (STATION), X
and Y coordinates, house sales price (PRICE), residuals from a first order trend surface
regression (r_p_1) and residuals from a second order trend surface regression (r_p_2).

Figure 2. Input data file in Geo-EAS format.
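
The layout described above (title line, variable count, one name per line, then whitespace-separated rows) can be illustrated with a short reader. This is only a sketch for checking your own files outside Variowin: the function name read_geoeas and the toy file contents are made up for illustration, and the sketch assumes all values are numeric.

```python
import io

def read_geoeas(stream):
    """Parse a Geo-EAS style file: title, variable count, names, then data rows."""
    title = stream.readline().strip()
    n_vars = int(stream.readline().split()[0])
    names = [stream.readline().strip() for _ in range(n_vars)]
    rows = []
    for line in stream:
        fields = line.split()      # splits on tabs or spaces, not on commas
        if not fields:             # skip the blank line that ends the file
            continue
        if len(fields) != n_vars:
            raise ValueError(f"expected {n_vars} values, got {len(fields)}")
        rows.append([float(v) for v in fields])
    return title, names, rows

# Hypothetical three-variable file; the values are made up.
sample = "Toy data set\n3\nX\nY\nPRICE\n1.0\t2.0\t50.0\n3.0 4.0 60.5\n\n"
title, names, rows = read_geoeas(io.StringIO(sample))
print(title, names, len(rows))   # Toy data set ['X', 'Y', 'PRICE'] 2
```

A row with the wrong number of values raises an error, which is a quick way to catch a stray comma or a missing value before loading the file in Prevar2D.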

Creating a Pair Comparison File (pcf)

The pair comparison file (with a pcf file extension) contains the distances between
observations in a binary format. It is used instead of the ascii input file (the .dat file) for
all subsequent analyses in Variowin. Once you have created a baltprice.pcf file, there is
no further need for the baltprice.dat file.
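
To get a feel for what a pair comparison file holds, the sketch below computes the distance for every unordered pair of points. It assumes plain Euclidean distance and made-up coordinates, and it says nothing about the binary layout Prevar2D actually writes; whether Prevar2D counts each pair once or twice is also an assumption here.

```python
import math
from itertools import combinations

def pair_distances(xy):
    """Return (i, j, distance) for every unordered pair of points."""
    return [(i, j, math.dist(xy[i], xy[j]))
            for i, j in combinations(range(len(xy)), 2)]

# Made-up coordinates, chosen so the distances are whole numbers.
xy = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 8.0)]
pairs = pair_distances(xy)

n = len(xy)
print(len(pairs), n * (n - 1) // 2)   # 6 6: n points give n(n-1)/2 pairs
max_dist = max(d for _, _, d in pairs)
```

The number of pairs grows with the square of the number of points, which is why the distances are computed once and stored rather than recomputed for every plot.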

Start the Prevar2D program by double clicking its shortcut or by running the executable.
This brings up the Prevar2D welcome window, as in Figure 3. Click OK to move on.
Next, you see a File Open dialog to select the input data file (Figure 4). This is a little
tricky since it is not in the usual Windows file explorer format. Click on the [..]
item to move up in the directory tree and continue navigating until you are in the working
directory, then select the baltpr~1.dat file, as in Figure 5. Note that Variowin still uses the
DOS style file name length limitation, so file names longer than 8 characters are truncated.

Next, the coordinates of the points need to be specified. The Settings > XY Coordinates
menu item (Figure 6) brings up a dialog to select the X and Y coordinates (Figure 7).

Figure 3. Prevar2D opening screen.

Figure 4. File open dialog in Prevar2D.

Figure 5. Opening the baltprice.dat input file in Prevar2D.

Figure 6. Specifying X, Y coordinates in Prevar2D.

Once the X-Y coordinates are specified, the Run menu becomes enabled, as shown in
Figure 8. Click on this to start the computation of the distance file. At the end of the
process, the summary output appears, which simply lists the number of pairs for which
the distances were computed, as shown in Figure 9.

Practice

Use the laozone.dat file to create a distance pcf file using X_Coord and Y_Coord as the
coordinate variables (these are projected coordinates).

Figure 7. Selecting the variables for the X, Y coordinates in Prevar2D.

Figure 8. Run menu enabled in Prevar2D.

Figure 9. Summary output of Prevar2D.

Variogram Cloud Plot

A variogram cloud plot is created in the Vario2D program of the Variowin suite. This
program requires a pcf file as input, not a data file. Start the program by double clicking
on its shortcut or by running the executable in its directory. A welcome screen appears, as
in Figure 10. Clicking OK removes the welcome screen and activates a File Open dialog.
Navigate the directories using the [..] button until you reach the working directory with
the pcf file, as in Figure 11. Opening the pcf file activates the menu items.

Figure 10. Vario2D welcome screen.

Figure 11. Vario2D open pcf file.

Simple mapping functionality is available in the Map! item on the Data menu, as shown
in Figure 12. Selecting this function creates a point map in an x-y coordinate system, as
illustrated in Figure 13. You can change the look of the points in the map with the
Settings … menu (Figure 14, default symbols for scatter plot). The map will later be
linked to the variogram cloud plot. Selecting (by mouse click) any of the points reveals
the associated values in a popup, as in Figure 15. Click on OK to remove the popup.

Figure 12. Vario2D Data menu with Map function.

Figure 13. Map in Vario2D.

Figure 14. Settings for scatterplots and variography.

Figure 15. Identify values associated with a point.

A variogram cloud plot is constructed by selecting Calculate > Variogram Cloud in the
menu (Figure 16). This brings up a dialog to specify the variable to be analyzed, as well
as some parameters, such as the maximum distance and direction parameters, shown in
Figure 17. If you did not change the settings (i.e., kept them as in Figure 14), the default
is “direct” variography for a single variable. The Maximum distance should be set to the
“distance of reliability,” roughly ½ of the maximum distance for all pairs. Variowin does
not provide you with this maximum distance in an obvious way, but you can obtain it
indirectly. Select Calculate > Variogram Cloud and enter a large value in the text box for
Maximum distance, such as 200. Variowin will bring up a Warning that the maximum
distance must be between 0 and 127.96, as in Figure 18. The distance of reliability would
therefore be about 64 (half of 127.96). For now, you can leave the default of 70 as is. Also leave the “angular
tolerance” at the default of 90. This option is useful for directional variogram cloud plots,
but in this exercise you are only considering an isotropic variogram (no directional
effects). Finally, you must select a variable from the list in the dialog. Choose PRICE and
click on OK to generate the cloud plot, as in Figure 19.

Figure 16. Variogram cloud plot command.

Figure 17. Direct variogram cloud parameter settings.

Figure 18. Maximum distance warning.

Figure 19. Variogram cloud plot for Baltimore house price data.
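
The two ideas in this section, the distance of reliability and the cloud itself, can be sketched in a few lines. The sketch assumes Euclidean distances and uses the squared difference as the cloud value, following the description in this tutorial (many texts plot half the squared difference instead). The coordinates and values are made up; only the 127.96 comes from the warning in Figure 18.

```python
import math
from itertools import combinations

def variogram_cloud(xy, z, max_distance):
    """Return (distance, squared difference) for each pair within max_distance."""
    cloud = []
    for i, j in combinations(range(len(z)), 2):
        h = math.dist(xy[i], xy[j])
        if h <= max_distance:
            cloud.append((h, (z[i] - z[j]) ** 2))
    return cloud

# Largest admissible distance reported by the Variowin warning.
reliability = 127.96 / 2
print(round(reliability, 2))   # 63.98, which the text rounds to 64

xy = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]   # made-up coordinates
z = [10.0, 12.0, 20.0]                      # made-up values
cloud = variogram_cloud(xy, z, max_distance=6.0)
print(len(cloud))   # 2: the pair that is 10 units apart is dropped
```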

Practice

Using the laozone.pcf file you created, make a map of the points. Use the identify feature
to check on the values for some of the points. Create a variogram cloud plot for the
maxday variable. Experiment with changing the maximum distance.

Linking Map and Variogram Cloud Plot

Each point in the variogram cloud plot is linked to a pair of points in the location map.
This provides you with a means to check outliers in the cloud plot. An outlier would be a
point in the cloud plot that is much higher than the other points for that distance. This
suggests that the locations in question are much more different (larger squared difference)
than is the case for other pairs that are a similar distance apart. For example, consider the
cloud plot in Figure 19 together with the map in Figure 13 (recreate these if they are not
present on your desktop).

In the cloud plot, click on the outlier for distance 16, as shown in Figure 20. A dialog will
appear listing the pair that corresponds to this point, as well as the distance that separates
them (h), the value of the variable (z) and the variogram value (squared difference,
variogram). The two points are now also connected in the map by a red line, as shown in
the bottom half of Figure 20. If you select the two records in the dialog and click on
“Keep selected pairs on map and quit” the arrowed line will turn black and remain on the
map. Next, select the outlier in the cloud plot at distance 30 and note how the two line
segments in the map connect to a common point, record 53 (Figure 21). Now check the
values in the dialog again (or click on the points in the map) to see what may cause this.
Station 53 has a house sales price of 8, whereas its “neighbors” (for the given distance
band) have sales prices of respectively 145 and 165. The squared differences between
these prices and the value for Station 53 are much higher than for other pairs a similar
distance apart, suggesting Station 53 might be an outlier (or a data recording error).

Figure 20. Linked variogram cloud plot and map.
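
The reasoning used to single out Station 53 can be written down as a small worked example. The three prices are the ones quoted in the text; the neighbor labels "A" and "B" are placeholders, since the tutorial does not give their station numbers. The idea is simply that a location appearing in several outlying pairs is the likely culprit.

```python
from collections import Counter

# Prices from the text; "A" and "B" are placeholder labels for the two
# neighbors, whose station numbers are not given in the tutorial.
price = {53: 8, "A": 145, "B": 165}
flagged_pairs = [(53, "A"), (53, "B")]

for i, j in flagged_pairs:
    print(i, j, (price[i] - price[j]) ** 2)
# 53 A 18769
# 53 B 24649

# The station that appears in the most flagged pairs is the suspect.
counts = Counter(s for pair in flagged_pairs for s in pair)
suspect, hits = counts.most_common(1)[0]
print(suspect, hits)   # 53 2
```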

You can experiment some more with the Baltimore data, constructing a variogram cloud
plot for the variables r_p_1 (residuals from a linear trend surface) and r_p_2 (residuals
from a quadratic trend surface), and assessing whether the trend has removed the
indication of an outlier. Also check other outlying points in the cloud plot and the
locations with which they correspond.

Practice

Use the laozone.pcf file to assess the existence of outliers in the variogram cloud plot for
the variables maxday (maximum July 96 daily ozone emission) and av8top (average of
the 8 highest readings per day). Hint: focus on the pair #26 and #31.

Figure 21. Outlier in linked map and variogram cloud plot.

Variogram

A variogram is also calculated in the Vario2D program (vario2dp.exe). If this program is still active,
make sure you have the baltprice.pcf file selected (if not, restart the program and load the
file). The variogram is part of the Calculate menu (Figure 16). Select Calculate >
Directional Variogram to start the process. In the dialog, select r_p_1 as the variable, and
leave all the other settings at their default values, as in Figure 22. Click OK to calculate
the variogram. A graph will appear that shows the estimates for each distance bin (lag) as
well as how many pairs were used in the computation, as in Figure 23. There are 12
circles on this graph, corresponding to h = 0 (zero distance) and 11 distance bands. Note
how the largest distance is 81 (a little over the distance of reliability). Also note how the
graph decreases at higher distances, which is not supposed to happen: as points are
further apart, they are supposed to be less similar, hence the variogram should increase
with distance up to a point and then become more or less flat.

Start another calculation (Calculate > Directional Variogram) and change the number of
lags to 8. The new variogram is a little more acceptable and has a maximum distance of
60, as shown in Figure 24. However, the variogram still shows somewhat of an upward
trend, suggesting that a spatial trend may still be present. Carry out a third calculation,
now using the residuals from the second order trend surface. The new variogram (Figure
25) is almost flat beyond distance 15, suggesting that the range of spatial autocorrelation
ends at that distance (points more than 15 distance units apart show no change in their
variogram with increases in distance and thus are not spatially correlated).

Figure 22. Variogram dialog.

Figure 23. Variogram for first order trend surface residuals (lags = 11).

Figure 24. Variogram for first order trend surface residuals (lags = 8).

Figure 25. Variogram for second order trend surface residuals (lags = 8).
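
The estimate plotted for each lag is an average over all pairs that fall in that distance band. The sketch below uses the common estimator (half the mean squared difference) with simple equal-width bins; Variowin's own lag spacing and tolerance rules are not reproduced here, and the function name and data are made up for illustration.

```python
import math
from itertools import combinations

def empirical_variogram(xy, z, lag_width, n_lags):
    """Average half squared differences within equal-width distance bins.

    Bin k collects pairs with k * lag_width <= distance < (k + 1) * lag_width.
    Returns (bin index, number of pairs, estimate) for every non-empty bin.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i, j in combinations(range(len(z)), 2):
        h = math.dist(xy[i], xy[j])
        k = int(h // lag_width)
        if k < n_lags:                      # pairs beyond the last bin are ignored
            sums[k] += 0.5 * (z[i] - z[j]) ** 2
            counts[k] += 1
    return [(k, counts[k], sums[k] / counts[k])
            for k in range(n_lags) if counts[k]]

# Made-up points on a line, with values that drift upward.
xy = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
z = [1.0, 2.0, 4.0, 7.0]
result = empirical_variogram(xy, z, lag_width=1.5, n_lags=2)
for k, n_pairs, gamma in result:
    print(k, n_pairs, round(gamma, 3))
```

Changing lag_width or n_lags changes which pairs are pooled together, which is why the graph is sensitive to the number of lags. Bins with few pairs give unstable estimates, so the pair counts shown on the Variowin graph are worth checking.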

You can experiment by changing some of the settings, such as the number of lags, or by
using a different estimator, such as a Madogram (in the Settings dialog). Note that the
correlogram shown in Variowin is not the usual one: it is expressed as a difference. As a
result, it goes up rather than down with increased distance. Finally, compute the
variogram for the PRICE variable itself and try to explain why the exercise started with
the trend surface residuals instead. Make sure you save one of the variograms as a “var”
file. With a variogram window active, select File > Save as, and specify a file name (8
character limit). The new file will be saved in the working directory.

Practice

Construct variograms for the maxday and av8top variables in the LA ozone data set.
Assess the sensitivity of the graph to the choice of settings. Try to formulate some
tentative conclusions about the range of spatial correlation.

Fitting a Spherical Variogram

Theoretical variogram models are fit to the empirical variogram with
the Model program in Variowin. Start this program by double clicking on its shortcut or
by running the executable in the Variowin program directory. Make sure you saved a .var
file at the end of the variogram computation. Otherwise, you first need to go back to
Vario2D, compute the variogram and save the result.

Starting the Model program brings up the usual Welcome window, as in Figure 26. Click
on OK to open the File Open dialog, as in Figure 27. The same dialog can also be
obtained later from the menu as File > Open. Select the var file you saved in the previous
Vario2D session and click OK.

Figure 26. Model welcome screen.

Figure 27. File open dialog for var file.

The next dialog is used to specify the variogram data (Experimental Variogram) to which
the fit will be applied. This is useful when several var files have been loaded. In our case,
there is only one listed in the dialog. Select r_p_2 omnidirectional, as in Figure 28. Click
on OK to bring up the Model user interface with menu items and two windows, shown in
Figure 29. This dialog can also be generated by selecting the Model item in the main
menu.

The interface contains two main windows: the one on the left is used to select the theoretical
model and its parameters, and the one on the right shows the fit of the model to the
experimental variogram. Variowin allows one to fit additive structures to the variogram,
but that will not be pursued here (experiment with this later by filling in values for the
parameters for the 2nd and 3rd structure). For now, only a single model will be fit.

Figure 28. Experimental variogram dialog.

Figure 29. Model user interface.

The first model will be a spherical variogram model. Select this specification in the
Model drop down list of the model dialog, as shown in Figure 30. Next, set Dir to 90,
which is required for an isotropic variogram (no directional effects). Either type in the
value of 90 or use the slider bar to move the value. Also, specify the range and sill as 15
and 320, respectively, as illustrated in Figure 30. Once you specify the parameter values,
the model fit is calculated and shown on the top two lines of the dialog (smaller number
is better). It is compared to the best fit found so far, so when the fit on the second line is
better than the current fit, you are moving in the wrong direction. At the same time, a
curve is drawn on the variogram graph, as in Figure 31.

Figure 30. Variogram model parameters.

Figure 31. Estimated spherical variogram (sill = 320, range = 15).
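
For reference, the curve drawn in the graph can be reproduced from the standard textbook form of the spherical model. In this sketch the sill parameter is treated as the amount added on top of the nugget; that this matches how the Variowin dialog interprets its Sill field is an assumption.

```python
def spherical(h, nugget, sill, rng):
    """Spherical variogram model.

    Rises from the nugget to nugget + sill at distance rng and stays flat
    beyond it. By convention the value at h = 0 is zero.
    """
    if h <= 0:
        return 0.0
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

# Parameters used in the text: no nugget, sill = 320, range = 15.
for h in (0, 7.5, 15, 30):
    print(h, spherical(h, nugget=0.0, sill=320.0, rng=15.0))
# 0 0.0
# 7.5 220.0
# 15 320.0
# 30 320.0
```

At half the range the model has already reached about 69 percent of the sill, and it is exactly flat beyond the range, which is what makes the range parameter directly interpretable as the distance at which spatial correlation ends.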

Variowin does not use a statistical method to estimate the parameters of the theoretical
variogram, but instead relies on an interactive procedure. By changing the parameters in
the model dialog and monitoring the change in fit, the user is supposed to converge to a
“best fit” model. Once this is obtained, the model parameters can be saved in a file for
use with kriging software, such as GSLIB. For example, in Figure 32, a value for the
nugget was specified as well, and after some experimentation, the parameters were
selected as nugget = 13.09, range = 13.4 and sill = 303.8, with the graph shown in Figure
33. The overall fit improved from 2.488 x 10^-2 to 2.003 x 10^-2.

Figure 32. Parameters for improved spherical model.

Figure 33. Plot for improved spherical model.
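
The interactive procedure amounts to trying parameter combinations and keeping the one with the smallest misfit. The sketch below mimics this with a coarse grid and a plain sum of squared errors. This criterion is a stand-in chosen for simplicity, not the fit measure Variowin reports, and the "experimental" points are generated from a known model so that the search has a known answer.

```python
from itertools import product

def spherical(h, nugget, sill, rng):
    """Spherical model; sill is the amount added on top of the nugget."""
    if h <= 0:
        return 0.0
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def sse(points, nugget, sill, rng):
    """Sum of squared differences between the points and the model curve."""
    return sum((g - spherical(h, nugget, sill, rng)) ** 2 for h, g in points)

# Made-up experimental variogram, generated from known parameters.
truth = (10.0, 300.0, 12.0)   # nugget, sill, range
points = [(h, spherical(h, *truth)) for h in (2, 4, 6, 8, 10, 12, 14, 16)]

# Try every combination on a coarse grid and keep the best one.
grid = product((0.0, 10.0, 20.0), (280.0, 300.0, 320.0), (10.0, 12.0, 14.0))
best = min(grid, key=lambda p: sse(points, *p))
print(best)   # (10.0, 300.0, 12.0)
```

With real data no combination fits exactly, and different misfit measures can favor different parameters, so it is worth looking at the graph as well as the number.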

Practice

Experiment with changing the parameters to improve the fit of the model. Also, use the
laozone data set to find the “best” spherical variogram for that data set for one (or both)
of the ozone variables. Interpret the result in terms of the range of the spatial correlation.

Fitting other Variogram Models

Variowin also fits other theoretical models to the experimental variogram. These can be
found in the Model drop down list of the model dialog (Figure 34). The procedure is
identical to that outlined for the spherical variogram: set the parameters interactively and
move towards a better fit for the model.

Experiment with an exponential variogram for the second order trend surface residuals.
Compare the fit to that of the spherical variogram. Try other models as well if time
permits.
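
When comparing the two models, keep in mind that the range means something different in each. The sketch below uses one common convention for the exponential model, in which the curve reaches about 95 percent of the sill at the stated range (the factor of 3 in the exponent). Whether Variowin uses this convention or the one without the factor of 3 is an assumption you should check against the fitted curve.

```python
import math

def exponential(h, nugget, sill, rng):
    """Exponential variogram model with a 'practical range' convention.

    The curve approaches nugget + sill asymptotically and is at about
    95 percent of the sill when h equals rng.
    """
    if h <= 0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-3.0 * h / rng))

at_range = exponential(15.0, nugget=0.0, sill=320.0, rng=15.0)
far_away = exponential(150.0, nugget=0.0, sill=320.0, rng=15.0)
print(round(at_range / 320.0, 3))   # 0.95
```

Unlike the spherical model, the exponential model never becomes exactly flat, so the range is best read as the distance beyond which the remaining spatial correlation is small rather than zero.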

Practice

Compare the results of the spherical variogram fit for the laozone variable(s) to those
obtained with alternative models, such as an exponential model. Contrast the implications
for the range of the spatial correlation.

Figure 34. Theoretical variogram models.
