Monitoring Design W/ Support Vector Machines

WATER RESOURCES RESEARCH, VOL. 40, W11509, doi:10.1029/2004WR003304, 2004
Support vectors–based groundwater head observation networks design
Tirusew Asefa, Mariush W. Kemblowski, Gilberto Urroz, Mac McKee,
and Abedalrazq Khalil
Department of Civil and Environmental Engineering and Utah Water Research Laboratory, Utah State University, Logan,
Utah, USA
Received 26 April 2004; revised 30 August 2004; accepted 20 September 2004; published 25 November 2004.
[1] This study presents a methodology for designing long-term groundwater head
monitoring networks in order to reduce spatial redundancy. A well is spatially redundant if
leaving it unsampled does not appreciably change the potentiometric surface estimation error.
This methodology, based on Support Vector Machines, makes use of a uniquely solvable
quadratic optimization problem that minimizes the bound on generalized risk, rather
than just the mean square error of differences between measured and “predicted”
groundwater head values. The nature of the optimization problem results in sparse
approximation of the function defining the potentiometric surface that was utilized to
select the number and locations of long-term monitoring wells and guide future data
collection efforts, which is a prerequisite in building and calibrating regional flow and
transport models. The methodology is applied to the design of regional groundwater
monitoring networks in the Water Resources Inventory Area (WRIA) 1, Whatcom County,
northern Washington State, USA.
INDEX TERMS: 1829 Hydrology: Groundwater hydrology; 1848 Hydrology: Networks; 9820 General or Miscellaneous: Techniques applicable in three or more fields;
KEYWORDS: Support Vector Machines, groundwater monitoring networks, statistical learning theory
Citation: Asefa, T., M. W. Kemblowski, G. Urroz, M. McKee, and A. Khalil (2004), Support vectors–based groundwater head observation networks design, Water Resour. Res., 40, W11509, doi:10.1029/2004WR003304.
Copyright 2004 by the American Geophysical Union.
0043-1397/04/2004WR003304$09.00

1. Background
[2] This article is concerned with the design of long-term groundwater head observation networks. Groundwater head observations are important calibration constraining data. Under ideal conditions, physical models that are based on governing physical processes of groundwater flow do not need calibration. In reality, since model input parameters are subject to uncertainties and since they are observed locally and in sparse locations only, it is necessary to adjust these parameters so that the observed value of a dependent variable (e.g., groundwater head) matches the one simulated. Groundwater monitoring network design is defined as the selection of sampling points (spatial) and sampling frequency (temporal) to determine the physical, chemical, and biological characteristics of groundwater [Loaiciga et al., 1992].
[3] Broadly speaking, groundwater monitoring networks may be classified into two categories: (1) groundwater contaminant monitoring networks, and (2) groundwater head observation networks. On the basis of design objectives, the former, in turn, may be classified into three categories: initial groundwater contamination detection, characterization, and long-term monitoring networks. Initial groundwater contamination detection networks enable one to detect unexpected leaks before they reach a compliance boundary, which is usually located at some relatively short distance, say 100 m, from a landfill. Examples of such studies are Massmann and Freeze [1987a, 1987b], Meyer and Brill [1988], Morisawa and Inoue [1991], Meyer et al. [1994], Jardine et al. [1996], Storck et al. [1997], and Angulo and Tang [1999]. Contaminant characterization networks are concerned with characterizing the nature and extent of the pollutant once initial detection is made. Specifically, the design procedure provides a methodology for augmenting existing monitoring wells, if there are any, or for siting new wells. Examples of such studies are Hudak and Loaiciga [1992], Mahar and Datta [1997], Datta and Dhiman [1996], and Montas et al. [2000]. In long-term monitoring network design, the aim is, given an adequately characterized plume, to develop a cost-effective monitoring plan. The issues considered are the selection of the subset of monitoring wells to be sampled in a given period and the frequency of monitoring those wells. Examples of such studies are Molina et al. [1996], Cameron and Hunter [2000], Nunes et al. [2004a], and Reed et al. [2000, 2001, 2003]. We refer interested readers to a recent publication of the American Society of Civil Engineers (ASCE) task committee on the state of the art in long-term monitoring network design [Minsker and Task Committee, 2003].
[4] On the basis of design objectives, groundwater head observation wells may be classified into two types: (1) characterization wells, where one tries to locate new observation wells; and (2) long-term monitoring wells, where one selects subsets from (many) existing wells to make frequent (monthly, quarterly) observations at those locations. Examples of such studies are Rouhani [1985], Gangopadhyay et al. [2001], and Nunes et al. [2004b].

Figure 1. Study area and 1990 phased groundwater head measurement locations. Cross sections are shown in Figure 4.

[5] On the basis of the design approach, Loaiciga et al. [1992] classified all types of monitoring networks (both
groundwater head and contaminant monitoring) as hydrologic (those that do not include advanced statistical methods) or statistical (otherwise). The statistical approaches were further divided into simulation, variance-based, and probability-based. Differences between these methods came from differences in the objective function formulation.
[6] Variance-based methods, also known as variance reduction and redundancy reduction methods, assess the suitability of a given network by relying on the variance of estimation error obtained by kriging [Rouhani, 1985; Ben-Jemaa et al., 1994; Nunes et al., 2004a, 2004b]. A given monitoring network (number and locations) has associated uncertainty explained by the variance of estimation error; and if wells in the network are to be added, removed, or displaced, the associated network accuracy will change. These methods then systematically search through different combinations of monitoring well locations for the one that would result in minimum variance of estimation error.
[7] Ben-Jemaa et al. [1994] applied a branch-and-bound algorithm in designing monitoring networks for observing aquifer properties. The algorithm consists of searching for optimal monitoring nodes along preconstructed tree branches. If one has to select a monitoring network of n nodes from a total of N locations, there will be $\binom{N}{n}$ possible network layouts, each layout corresponding to one branch of the search tree. The limitation of this approach is that if $n \ll N$, the number of combinations becomes very large and the problem becomes a difficult combinatorial optimization problem. Improvements on such an approach were made using heuristic algorithms that iteratively look for a better solution by trial and error, rather than searching the entire state space [Wagner, 1995; Nunes et al., 2004a, 2004b]. As in all heuristic searches, these approaches may not guarantee that the final network corresponds to a global optimum. In addition, more than one network may be classified as optimal, making the final solution nonunique and requiring an additional criterion to choose between these equally optimal networks. This technique also does not depend on actual values of measured variables, but on the relative distribution of the measuring locations.
[8] In this paper, we present a methodology that is based on Support Vector Machines (SVMs) for long-term groundwater head monitoring network design that globally optimizes an objective function to identify monitoring well locations based on their importance in explaining the potentiometric surface without going through exhaustive trial and error searches on alternative monitoring well configurations.

2. Case Study
[9] The study area, located in northwestern Washington State, USA, and including a small portion inside Canada, is part of what is known as Water Resources Inventory Area #1 (hereafter WRIA 1) (Figure 1). It covers an area of 629 km². As part of a concerted effort in tackling water resources management problems in WRIA 1, Utah State University (USU), the United States Geological Survey (USGS), and the Public Utility District No. 1 (PUD) of Whatcom County undertook different tasks within three phases: Phase I – Organization; Phase II – Technical assessment; and Phase III – Plan development and implementation. Different water resources management issues currently being looked at include: (1) Groundwater quantity/quality assessments; (2) Surface water quantity/quality assessments; (3) Instream flow and fish habitat requirements; and (4) Database management and decision support systems that integrate these different activities and present an easy-to-use computer model for the decision makers. All these processes are interrelated. One of the central components of the system is groundwater flow and transport modeling, knowledge of which is a prerequisite for other processes. Effective groundwater flow and transport modeling, in turn, would require, among other things, groundwater head observations that should be collected in a timely fashion and used for model building and calibration. Therefore acquisition of this calibration constraining data is the first step in flow modeling. In the past, our experience in the project area showed that budget and practical constraints (for example, arrangements with private well owners, and arrangements for transboundary measurements due to the fact that some well owners are within Canada) resulted in asynchronous groundwater head measurements. The USGS conducted one of the most complete surveys in 1990. Within six months (March to August 1990), observations were made covering the entire present study area. These inventoried wells are shown in Figure 1.
[10] Since it is not feasible to measure all these wells at all times, the management problem to be addressed by the present study is to identify subsets of these wells to be monitored simultaneously on a regular basis. Cost-effective acquisition of these data is then crucial in flow and transport modeling. Consequently, regional groundwater monitoring network design that identifies wells to be monitored on a regular basis while characterizing the potentiometric surface adequately is the subject of this study. In doing so, we present a novel approach to regional groundwater monitoring network design that uses a new learning methodology called Support Vector Machines (SVM) based on Statistical Learning Theory (SLT).
[11] Despite enjoying success in other fields [Schölkopf et al., 1999], there are few applications of SVM in hydrology. Dibike et al. [2001] applied SVM successfully in both remotely sensed image classification and regression (rainfall/runoff modeling) problems and reported a superior performance over the traditional artificial neural networks. Kaneviski et al. [2000] used SVM for mapping soil pollution by Chernobyl radionuclide Sr90 and concluded that the SVM was able to extract spatially structured information from the raw data. Liong and Sivapragasam [2000] also reported a superior SVM performance compared to artificial neural networks in forecasting flood stage. Asefa and Kemblowski [2002] used SVM to reproduce the behavior of a Monte Carlo–based groundwater flow and transport model that was, in turn, utilized in the design of initial groundwater contamination detection monitoring systems. Training and testing examples were derived using plumes generated from a random contaminant leak resulting from failure of a landfill cell and a random hydraulic conductivity field. Monitoring networks designed by the trained machine were nearly identical with those obtained by the physical models. Contaminant plume detection reliabilities provided by both methods were also close.

3. Methodology
3.1. Estimation Variance–Based Method
[12] Suppose one wants to predict the value of a random variable Z at a location $x_0$, where no observation is made, from a space of functions F (also named feature space), using observations in the vicinity, $x_1, x_2, \ldots, x_N$. The most common theory considers second-order stationarity. The kriging estimator of Z at point $x_0$ is given as the best linear unbiased estimation (BLUE) expressed as a linear combination of

$$\hat{Z}(x_0) = \sum_{i=1}^{N} w_k^{(i)} Z\left(x^{(i)}\right), \qquad (1)$$

where the $w_k^{(i)}$ are kriging weights; $\hat{Z}(x_0)$ is the kriging estimator; and $Z(x^{(i)})$ is the observation made at location $x_i$. The kriging weights are determined by requiring unbiasedness ($\sum_{i=1}^{n} w_k^{(i)} = 1$) and minimum estimation variance [Journel and Huijbregts, 1978].
[13] The groundwater monitoring network optimization problem is then posed as follows: for a given network size of n, find the best n monitoring locations out of the total of N that result in minimum mean estimation variance. This is done through an exhaustive search, as in the branch-and-bound algorithm that guarantees a global optimum, or through heuristic methods, such as simulated annealing and genetic algorithms, that yield near-optimal solutions. For applications of this approach see studies by Rouhani [1985], Ben-Jemaa et al. [1994], and Nunes et al. [2004a, 2004b].
3.2. Support Vector–Based Method
[14] The support vector methodology [Vapnik, 1995, 1998], based on Statistical Learning Theory (SLT) (see Appendix A for a detailed presentation of the SVM algorithm), estimates the value of Z at unsampled location $x_0$ (vector of measurement locations x and y) by

$$\hat{Z}(x) = \langle w_{sv}, x_0 \rangle + b, \qquad (2)$$

where $\langle \cdot , \cdot \rangle$ indicates an inner or dot product between $x_0$ and $w_{sv}$; $w_{sv}$ is the support vector weight (basis function), and b is the bias. For simplicity, we will just use x rather than $x_0$. The weights and bias are found by minimizing a regularized $\varepsilon$-insensitive loss function. This loss function is depicted in Figure 2 and given below:

$$G = |\hat{Z}(x) - Z(x)|_{\varepsilon} = \begin{cases} 0 & \text{if } |\hat{Z}(x) - Z(x)| \le \varepsilon \\ |\hat{Z}(x) - Z(x)| - \varepsilon & \text{otherwise} \end{cases}, \qquad (3)$$

where Z(x) is the measured quantity (groundwater head in this case) and $\varepsilon$ represents the precision by which $\hat{Z}(x)$ is estimated.
[15] In Figure 2, each data point schematically represents measurements made at a monitoring well, and the $\xi$ are slack variables that measure distances of these data points from the $\varepsilon$ tube. Data points that lie inside the $\varepsilon$ tube have a zero value of the loss function and do not have associated slack variables.
[16] In order to find $w_{sv}$ and b, if one only minimizes equation (3), it is an ill-posed problem in Tikhonov's sense [Tikhonov and Arsenin, 1977]. Therefore in practice one imposes a convex penalty term on some quantity related to the complexity of Z. Vapnik's [Vapnik, 1995, 1998] choice of the regularization term is given by $\frac{1}{2}\|w\|^2$.
[17] The optimization problem is then cast as follows: What is the “best” subset of long-term monitoring wells (number and locations) out of the existing N wells that would result in the “best” estimation of potentiometric
surface for a prespecified error level (“best” in the sense of the minimum regularized loss function given below). This is mathematically expressed as follows:
minimize

$$\frac{1}{2}\|w_{sv}\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*), \qquad (4a)$$

subject to

$$\begin{cases} Z(x) - \langle w_{sv}, x \rangle - b \le \varepsilon + \xi_i \\ \langle w_{sv}, x \rangle + b - Z(x) \le \varepsilon + \xi_i^* \\ \xi_i, \xi_i^* \ge 0 \end{cases} \qquad (4b)$$

to obtain

$$\hat{Z}(x) = \langle w_{sv}, x_0 \rangle + b. \qquad (4c)$$

This objective function minimizes the complexity of the SVM estimator (i.e., the estimator will tend to be flat) and penalizes monitoring points that lie outside the $\varepsilon$ tube. In other words, for any (absolute) error smaller than $\varepsilon$, $\xi_i = \xi_i^* = 0$, hence these data points do not enter the objective function. This means that not all groundwater head observations made at existing monitoring well locations will be used to estimate $\hat{Z}(x)$. The constant C > 0 determines the trade-off between the complexity of the function Z and the amount up to which deviations larger than $\varepsilon$ are tolerated. A smaller value of C means more weight is given to the regularizer, while increasingly higher values of C make the problem more and more unconstrained.

Figure 2. The $\varepsilon$-insensitive loss function G.

[18] In addition to algorithmic differences, the main difference between the SVM and kriging estimators is the fact that the SVM estimator uses a subset of the data (monitoring wells) from the total set (existing wells) based on their importance in defining the potentiometric surface.
3.3. Optimizing Long-Term Monitoring (LTM) Networks
[19] LTM networks are designed by selecting subsets from (many) existing wells to make frequent (monthly, quarterly) observations at those locations. This study addresses only the problem of spatial redundancy, assuming that future sampling plans will be evaluated as site conditions change. The reason for this restriction is that there is no consistent time series data at the project area that may be used for the purpose of analyzing temporal redundancy. Examples of some previous monitoring network design studies that are based on data collected at a snapshot in time are Rouhani [1985], Reed et al. [2001], and Reed and Minsker [2004].
[20] Usually the optimization problem given in equations (4a)–(4c) is solved in its dual form using Lagrange multipliers. In addition, the dual formulation lends itself to introducing nonlinearity in potentiometric surface estimation (shown below). Writing equations (4a)–(4c) in dual form, differentiating with respect to the primal variables ($w$, $b$, $\xi_i$, $\xi_i^*$), and rearranging gives (see Appendix A for details) the following:
maximize

$$W(\alpha^*, \alpha) = -\varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{N} Z_i (\alpha_i^* - \alpha_i) - \frac{1}{2} \sum_{i,j=1}^{N} (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j)\, k(x_i, x_j), \qquad (5a)$$

subject to constraints

$$\sum_{i=1}^{N} (\alpha_i^* - \alpha_i) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C, \qquad (5b)$$

to obtain

$$\hat{Z}(x) = \sum_{i=1}^{n} (\alpha_i^* - \alpha_i)\, k(x, x_i) + b, \qquad (5c)$$

where $\alpha_i^*$ and $\alpha_i$ are Lagrange multipliers, $k(x, x_i)$ is a kernel that replaces dot products of input examples, n is the number of selected long-term monitoring wells, and the $x_i$ are their locations. One observes that from the Kuhn-Tucker (KT) condition it follows that only for $|\hat{Z}(x) - Z(x)| \ge \varepsilon$ may the Lagrange multipliers be nonzero. In other words, for all samples inside the $\varepsilon$ tube (Figure 2) the $\alpha_i$, $\alpha_i^*$ vanish. The samples that have nonvanishing coefficients are called support vectors, hence the name Support Vector Machines.
Intuitively, one can imagine the support vectors as monitoring well locations that “support” the estimated potentiometric surface. Observe the difference between equation (5c) and equation (2). It arises because, in differentiating the dual form, the SVM weights are shown to be equal to $w_{sv} = \sum_{i=1}^{n} (\alpha_i^* - \alpha_i) x_i$ (equation (A6)). Substituting this expression in equation (2) would result in $\hat{Z}(x) = \sum_{i=1}^{n} (\alpha_i^* - \alpha_i) \langle x, x_i \rangle + b$.
[21] The dot product is then substituted by a kernel: $\langle x, x' \rangle \rightarrow \langle \Phi(x), \Phi(x') \rangle = K(x, x')$. This is the so-called “kernel trick” depicted in Figure 3, where nonlinear transformation is achieved. It is possible because the SVM algorithm depends only on the dot product between monitoring well locations (see equations (A9a)–(A9c)).

Figure 3. Conceptual representation of kernel transformation.

[22] Kernels may be viewed as dot products of nonlinear transformation functions. The connection between Reproducing Kernel Hilbert Space and random processes is well documented [see, e.g., Wahba, 1990]. According to the Bayesian interpretation, the first term in equation (4a) is a stabilizer that is a prior on the regression function Z in the Reproducing Kernel Hilbert Space (RKHS) induced by kernel K, and the data term is the noise model. If we assume that the data, $z_i$, are affected by an additive independent Gaussian noise process ($z_i = z(x_i) + e_i$), then the squared norm, $\|Z\|^2$, can be thought of as the generalization of the expression $Z \Sigma^{-1} Z$ (also called the Mahalanobis distance from the mean Z) with covariance $\Sigma$ [Wahba, 1990; Poggio and Girosi, 1998a, 1998b]. The density, P(Z), is then a multivariate Gaussian zero-mean function in the Hilbert space defined by the covariance function. The existence of such a well-defined family of random variables is guaranteed by the Kolmogorov consistency theorem [Wahba, 1990]. Therefore choosing kernel K may be viewed as assuming a Gaussian prior on Z with covariance equal to K [Poggio and Girosi, 1998b]. This is also the link between SVMs and kriging theory, where the kernel is given by the covariance function: $K(x, x') = \mathrm{cov}(Z(x), Z(x')) = \Sigma$.
[23] The optimization problem given by equations (5a)–(5c) estimates the best function that defines the potentiometric surface as a function of support vector locations only. Measurements at other locations (those that lie inside the $\varepsilon$ tube) do not contribute to the function defining the potentiometric surface. Since support vectors define the potentiometric surface, future groundwater head observations at those locations will explain the nature of this surface better than measurements taken at other locations. Therefore support vector locations are assumed to be the best long-term monitoring well locations. In addition, the SVM algorithm directly gives the number of wells to be monitored.

4. Application to a Case Study
[24] SVM-based regional groundwater monitoring network design may be summarized in two steps: (1) inventory of groundwater head observations and hydrogeological characterization of different layers within which existing piezometers are located; and (2) SVM implementation.
4.1. Hydrogeological Characterization
[25] In the present study area, groundwater observation wells are distributed within different aquifer layers and one has to delineate these aquifers in order to select wells in each layer. At a regional scale, the study area is classified as what is known as the Puget Sound Lowland, which has been influenced in large part by the tectonic and glacial events during the Tertiary and Quaternary periods [Jones, 1999]. This part of the Puget Sound Lowland is named the Fraser-Whatcom Basin. Cox and Kahle [1999] identified two classes of aquifers (from top down): (1) Sumas aquifer (Qsa) and (2) Everson-Vashon aquifer (Qev). The latter may be further divided into the Everson-Vashon fine-grained confining unit (Qevf) and the Everson-Vashon coarse-grained layer (Qevc), a confined aquifer. The Qevc consists of discontinuous patches (lenses, pools). Therefore the hydrogeology of the present study area is a two-aquifer, three-layer system.
[26] Characterization data were obtained from Cox and Kahle [1999], the Whatcom County Health and Human Services Department (WCHHSD) well log database (2826 geographically referenced points), and the Department of Ecology's scanned well logs (6967 data points). These data were analyzed to select well logs that were subsequently used to delineate these identified hydrogeologic layers. Well log selection criteria, among other factors, include depth of completion and uniform areal coverage. Figure 4 shows
two cross sections (east-west and south-north) of the present study area. Locations of the cross sections are shown in Figure 1 (Figure 4).

Figure 4. Cross sections of (a) east-west and (b) south-north. The locations of the cross sections are shown in Figure 1.

[27] Because most of the water supply need in the project area is satisfied by the Sumas Aquifer, and this aquifer is practically disconnected from the underlying Qevc layer by a thick, low-permeability Qevf layer, the present study is concerned only with the Sumas Aquifer. In addition, most of the inventoried observation wells are also sited in this aquifer. We note that several localized
previous hydrological investigations also considered the bottom of the Sumas Aquifer as an impermeable unit [Associated Earth Sciences, Inc., 1994, 1995; GeoEngineers Hydrogeologic Services, 1994; Water Resources Consulting, LLC, 1997].
4.2. SVM Implementation
[28] Equations (5a)–(5c) define a quadratic optimization problem that guarantees a global optimum and can be solved using any off-the-shelf quadratic optimization algorithm, such as LOQO [Vanderbei, 1994]. We used the SVM optimization code developed by the Royal Holloway University of London and AT&T Speech and Image Processing Service Research Lab [Saunders et al., 1998]. The data required to solve equations (5a)–(5c) are the observed groundwater heads, Z(x), at monitoring locations x (X and Y coordinates), and a kernel $k(x, x_i)$ that describes the (nonlinear) dependency between observation points. Table 1 shows the most commonly used SVM kernels. Here we used a radial basis kernel that is translation invariant and estimated its parameter using cross validation (see below). From Table 1, notice that use of the two-layer neural network kernel in SVM is not the same as that of the traditional Artificial Neural Network (ANN) [Govindaraju and Rao, 2000]. This important difference between ANNs and SVMs is explained below.

Table 1. Commonly Used Kernels
Kernel Type                 Expression
Simple dot product (a)      $K(x, x') = x \cdot x'$
Polynomial                  $K(x, x') = (x \cdot x' + 1)^d$, d is user specified
Two-layer neural network    $K(x, x') = \tanh(b(x \cdot x') - c)$, b and c are user specified
Radial basis (b)            $K(x, x') = \exp(-\gamma^2 \|x - x'\|^2)$, $\gamma^2$ is user specified
(a) This kernel corresponds to a linear machine.
(b) This kernel is translation invariant. It can be written as a Gaussian covariance kernel with unit variance: $K(x, x') = \sigma^2 \exp(-\|x - x'\|^2 / r^2) = \exp(-h^2 / r^2)$, where $\sigma^2 = 1$, $r^2 = 1/\gamma^2$, and $h^2 = \|x - x'\|^2$.

[29] Although the transformation function (kernel) used by ANNs and SVMs with the two-layer neural network kernel is similar, the loss function used by ANNs (based on least squares) does not result in a sparse solution [Girosi, 1998], as it does in the SVM. Therefore, because of the nature of the loss function employed, if ANNs were to be used to estimate the potentiometric surface, they would use all the measured data at monitoring well locations. Consequently, ANNs will not be able to directly select a subset of monitoring wells to be used as LTM networks as a function of different levels of potentiometric surface approximations. Lastly, most training methods in ANN such as the back propagation algorithm may not guarantee a global optimum [Hastie et al., 2001, p. 359; Vapnik, 1998, p. 399].
[30] The SVM algorithm is used in two stages: (1) training/validation, and (2) design. The training/validation stage aims at finding the optimal kernel parameter and SVM parameter C for a range of potentiometric surface approximations ($\varepsilon$) that will be used in the design stage. The design stage then uses the trained SVM to provide a long-term monitoring network as a function of groundwater head surface approximations. Each of these steps is explained below.
[31] Three hundred and fifty well locations and groundwater head observations extracted from the Sumas Aquifer were used to estimate the SVM parameter, C, and the radial basis kernel parameter, $\gamma$. One way of conducting the training/validation is with a split sample approach. This approach divides the available data into two parts and uses one for training and the other for validation. Optimal SVM parameters are then selected based on performance (e.g., minimum root mean square error) on the validation set. We used a K-fold cross-validation technique. The K-fold cross-validation approach splits the available data into K more or less equal parts. K-1 parts of the data are used to find the SVM estimator, $\hat{Z}(x)$, and the validation error of the fitted model is calculated while predicting the kth part of the data. The procedure then continues for k = 1, 2, . . ., K, and the selection of parameters is based on minimum prediction error estimates over all K parts.
[32] Now the question is what value to use for K. Hastie et al. [2001] recommend the use of K = 5 or 10 based on the shape of a “learning curve.” A learning curve is a plot of training error versus training size. For given SVM parameters ($\gamma$, $\varepsilon$ and C), different training errors are calculated by progressively estimating $\hat{Z}(x)$ for increasing training sizes, constituting a plot of the learning curve. For smaller training sizes, the learning curve has a steep slope, and it gradually flattens as the training size increases and changes in training error become small. At this point, the training error is said to be independent of the training size. Consequently, the value 4K or 9K will correspond to the training size where the learning curve starts to be flat. We note that even though the actual value of the training error may differ for different combinations of SVM parameters, the shape of the learning curve remains more or less the same (i.e., the training size that corresponds to the flattened portion of the training curve stays nearly the same).

Figure 5. A “learning curve.” The broken line corresponds to fivefold cross validation (280 data points).

[33] Figure 5 shows a representative learning curve in our case. The curve was made using $\varepsilon$ = 0.1, C = 10, and $\gamma$ = 6. The value of the kernel parameter was derived from data. This was done by noting that the radial basis kernel, in fact, is a Gaussian covariance with unit variance (see Table 1), the relation being $r^2 = 1/\gamma^2$, where r is the distance after which no spatial autocorrelation is evident. Figure 6 shows the experimental and Gaussian covariance that was used to estimate the value of $\gamma$. We would like to point out
observed data (groundwater head observations at monitor-

ing wells and their corresponding locations, X and Y
coordinates) for various levels of potentiometric surface
approximations. At the end of the quadratic optimization
procedure, the support vectors were extracted and geograph-
ically referenced, thus producing a set of long-term moni-
toring well locations. Different magnitude of errors in
defining the potentiometric surface would then result in
different numbers and locations of monitoring wells. There-
fore the relation between e and the number and locations of
monitoring wells can be used to decide the size of the
network as shown in Figure 7. For example, Figure 8 shows
the locations of monitoring wells for four different error
levels. Sixty-five monitoring wells would be required to
maintain an error level of 5%; 23 wells for e = 10%; and so
Figure 6. Experimental covariance along with Gaussian
on. Wells selected in networks of higher error level (for
covariance fit.
example, e = 15%) were found to be progressively included
in the set when e is smaller, rendering consistency in the
solution.
that this kernel parameter value obtained from covariance fit is used to obtain the learning curve and we do not imply an assumption of underlying Gaussian random field for the head distribution. One could also use an arbitrarily selected kernel parameter value and adjust its value during training.

[34] As shown in Figure 5, the learning curve is relatively flat after it reaches a training size of 250. The five- and tenfold training sizes correspond to sample sizes of 280 and 315, respectively, at which performance is virtually the same as that of the complete set. Thus cross validation would not suffer from much bias. The case K = 5 will have almost the same performance as the case K = 10, but it will result in a smaller computational time and, therefore, was used to conduct the cross validation. If the five- or tenfold training size (training size corresponding to 4K or 9K) indicates a location where the learning curve has a considerable slope, from Figure 5 we observe that the true prediction error (where the curve flattens) will be overestimated [Hastie et al., 2001].

[35] Consequently, we conducted a fivefold cross validation for a range of potentiometric surface approximations (ε = 0.01–0.5) and obtained optimal values of SVM parameters to be C = 7 and γ = 2. These values were then used in the design stage as explained in the next section.

4.3. Selecting Optimal Long-Term Monitoring Networks

[36] The design of a groundwater monitoring network is a multiobjective optimization problem [Knopman et al., 1991; Cieniawski et al., 1995; Wagner, 1995; Reed et al., 2001, 2003]. If one monitors all the available wells, the error associated with defining the potentiometric surface will be minimal, but this also means a higher cost of monitoring. Using a small number of monitoring wells would be less costly but would also give a higher error in explaining the potentiometric surface. Therefore our interest in this study lies in (1) finding how many wells would be required to define the groundwater flow field, (2) identifying the locations of those wells, and (3) providing a decision curve that shows trade-offs between the number of wells and corresponding relative error in groundwater table elevation estimates.

[37] Using the optimal SVM parameters estimated in the previous section, we fit a potentiometric surface to all the

Figure 7. Network size versus potentiometric surface approximation.

[38] It is interesting to observe that selected monitoring well locations (Figure 8) are at the areas where the observed heads are most uncertain. Inspection of the equipotential lines shows that the support vector points follow approximately the groundwater watershed boundaries. If two or more monitoring locations are very close to each other, it is because the local differences between groundwater heads at those locations are large, therefore requiring more monitoring wells to explain the groundwater head variation in those areas. Figure 9 depicts the SVM prediction error surface for different sizes of monitoring networks. Recall that from the definition of support vectors, at selected monitoring well locations we have (absolute) prediction errors equal to or greater than the prespecified error level. In other words, at those locations training points are on or outside the ε tube. Nonmonitoring observation wells at other locations lie inside the ε tube and hence do not contribute toward the definition of the potentiometric surface. This confirms common intuition, as the SVM procedure puts observation wells at the most uncertain locations. The groundwater surface is then "supported" at those locations.

[39] We also investigated the performance of the kernel parameter value (correlation length scale) estimated from covariance fit, compared to the one obtained through
Figure 8. SVM predicted groundwater head (m) surface and selected monitoring wells for different levels of a prespecified error level (ε, number of monitoring wells): (a) (5%, 65); (b) (10%, 23); (c) (15%, 11); and (d) (20%, 8).
fivefold cross validation, on the complete set of data. Figure 10 depicts this comparison.

[40] For small values of ε, the covariance fit value (smaller correlation scale or higher gamma) gave better Root Mean Square Error (RMSE) results; as ε increases, the kernel parameter derived from the fivefold cross validation (selected based on overall best performance) gave better RMSE. This observation can be explained as follows: At lower values of ε, the estimated potentiometric surface will be close to the observed groundwater surface, requiring one to use a highly localized kernel and, hence, such a kernel is expected to produce a smaller RMSE. As ε increases, the estimated potentiometric surface is flatter, with support vectors far apart and, hence, a kernel with a larger length scale would result in smaller RMSE values.

[41] Two types of measurement errors may be identified in the process: (1) piezometer dislocation (X, Y coordinates); and (2) groundwater head measurement errors. The former type is a onetime error (although piezometer locations could be updated through resurvey). Usually, groundwater observations are made from ground surface to water table and are converted to groundwater heads by subtracting these values from estimated ground surface elevations. When the variation in topography within the neighborhood of a piezometer head is large, the impact of dislocation error could be significant and may affect subsequent estimates in both groundwater network design and flow and transport modeling. In the present study, we extracted piezometer head information from a high-resolution (10 m) Digital Elevation Model (DEM) using GIS operations and assumed that the dislocation error is negligible.

[42] In order to investigate groundwater head measurement errors, we conducted experiments for different Noise to Signal Ratios (NSR) using Gaussian noise. NSR is defined as the ratio between the variance of the noise and the variance of the observed data. Table 2 shows comparisons between designed networks with and without Gaussian noise. At ε = 5%, network size increased marginally, and this change grows with increasing NSR, whereas at higher ε values (for example, ε = 10%) the change in network size remains steady with increasing NSR. As seen in the table, the network size changes are higher at lower values of ε, which is in agreement with our intuition, indicating that designed networks at higher ε levels are more tolerant of measurement corruption. For example, at ε = 15%, the NSR value has to be increased to 50% in order to cause changes in the designed network. Overall, we have found the support vector–based designed network to be robust.

Figure 9. Error surface and selected monitoring wells for different levels of a prespecified error level (ε, number of monitoring wells): (a) (5%, 65); (b) (10%, 23); (c) (15%, 11); and (d) (20%, 8).

5. Conclusions

[43] We have presented a regional groundwater network design procedure that used a new machine learning methodology called Support Vector Machines (SVM) based on Statistical Learning Theory (SLT). The SLT procedure allows for an unbiased selection of monitoring points based on their importance in constructing the groundwater potentiometric surface without going through an exhaustive search on different monitoring network configurations. The approach utilized consists of two parts: one related to the regularization of the solution (i.e., the estimated function will always tend to be flat, avoiding overfitting), and the second related to the goodness-of-fit, resulting in remarkable generalization capabilities. The current procedure evaluates minimal information (number of monitoring wells) to design a regional groundwater monitoring network by selecting from (many) existing wells. The locations of existing wells are mapped to the potentiometric surface using a nonlinear kernel transformation chosen a priori. The unique ε-insensitive feature of SVMs was used to select (the number and locations of) monitoring wells. The ability of SVMs to construct potentiometric surface approximations using a very rich set of functions and to control the
trade-off between accuracy of approximation and complexity of the approximating function was the key to the present design procedure. Different accuracy of groundwater head surface approximations would then result in different sizes of networks that will be used to guide future data collection efforts. The nature of the optimization problem resulted in a sparse set of actually used observation locations. In accordance with our intuition, the SVM procedure placed monitoring wells at the most uncertain locations (for example, groundwater watershed boundaries). The procedure also retained a selected monitoring well for a higher ε while progressively adding monitoring wells for lower error levels (higher number of wells), rendering consistency in the solutions.

Figure 10. Data-driven (γ = 6) and fivefold cross-validated (γ = 2) kernel performances.

[44] There are three important parameters to consider when using SVM for the cases presented in this paper. For a range of error levels (ε-insensitive potentiometric surface approximation), complexity, C, and kernel parameter, γ, were estimated using a fivefold cross validation approach. The Gaussian covariance kernel parameter was derived from observed data and used to constrain the search space, incorporating domain knowledge into the design process. The performance of this data-driven kernel parameter was also found to be fairly comparable to the one obtained from fivefold cross validation in terms of prediction error of the complete set.

[45] This study has looked at the problem of spatial redundancy only, assuming future sampling plans will change as site conditions change. This is because there is no consistent time series of groundwater head observations that would enable us to include the problem of temporal redundancy in the project area. The methodology presented can be extended to analyze temporal redundancy, for example, by following the same steps as for the spatial redundancy problem but at different time steps. The temporal-spatial long-term monitoring network may then be selected based on some criterion (for example, the frequency with which a given monitoring well occurs across all the time steps [see Gangopadhyay et al., 2001; Nunes et al., 2004a, 2004b]). The method presented here could also be extended to include economic analysis in the design of monitoring networks provided that a utility function for ε can be meaningfully formulated.

Appendix A: Support Vector Machines Algorithm

[46] Suppose one wants to estimate a functional dependency, $\hat{Z}(x)$, between input points $\{x_1, x_2, \ldots, x_N\}$ taken from $x \in \mathbb{R}^K$ and $\{z_1, z_2, \ldots, z_N\}$ with $z \in \mathbb{R}$ drawn from a set of N independent and identically distributed (i.i.d.) observations. We seek a function $\hat{Z}(x)$ by minimizing the following regularized risk functional [Vapnik, 1995, 1998]:

minimize

$$\frac{1}{2}\|w_{sv}\|^2 + C\sum_{i=1}^{N}(\xi_i + \xi_i^*), \tag{A1a}$$

subject to

$$\begin{cases} z_i - \langle w_{sv}, x_i\rangle - b \le \varepsilon + \xi_i \\ \langle w_{sv}, x_i\rangle + b - z_i \le \varepsilon + \xi_i^* \\ \xi_i,\ \xi_i^* \ge 0 \end{cases} \tag{A1b}$$

to obtain

$$\hat{Z}(x) = \langle w_{sv}, x\rangle + b, \tag{A1c}$$

where $w_{sv}$ is the vector of support vector weights (we will just use $w$ for simplicity), $\xi_i$ and $\xi_i^*$ are slack variables that determine the degree to which samples with error greater than $\varepsilon$ are penalized (Figure 2), and $C > 0$ determines the trade-off between function complexity and closeness to data.
Table 2. Designed Network Comparisons Under Noise-Free and Corrupted Groundwater Head Observations

                    ε = 5%                        ε = 10%                       ε = 15%
NSR,a %             10        25        50        10        25        50        10        25        50
Network size (NZ)b  67        70        68        23        22        22        11        11        10
ΔNZ,c %             (+)3.0    (+)7.7    (+)4.6    0         (−)4.3    (−)4.3    0         0         (−)9.1
(+)/(−),d %         (+)10     (+)10.8   (+)13.8   (+/−)4.3  (+)8.7    (+)8.7    0         0         (+)18.2
                    (−)7.7    (−)3.1    (−)9.2              (−)13     (−)13     0         0         (−)9.1

a Noise-to-signal ratio (%), defined as the ratio between noise variance and variance of the observed data.
b Size of network obtained using corrupted groundwater head observations.
c Percentage change in network size compared to noise-free network size.
d Percentage increase (+) or reduction (−) in wells compared to a noise-free network. (+/−) means that the number of wells added and reduced are equal. This is the case when network size remains constant.
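The noise experiment summarized in Table 2 can be sketched as follows, using the footnote definition of the noise-to-signal ratio. This is an illustration under stated assumptions rather than the authors' code: the NSR is passed as a fraction (0.25 for 25%), the population variance is used because the text does not say which estimator was applied, and the head values are hypothetical. The noise-free network sizes 65, 23, and 11 are taken from the captions of Figures 8 and 9.

```python
import math
import random
import statistics


def noise_std(heads, nsr):
    """Standard deviation of noise whose variance is nsr times the data variance."""
    return math.sqrt(nsr * statistics.pvariance(heads))


def corrupt(heads, nsr, seed=0):
    """Add zero-mean Gaussian noise at the requested noise-to-signal ratio."""
    rng = random.Random(seed)
    sigma = noise_std(heads, nsr)
    return [h + rng.gauss(0.0, sigma) for h in heads]


def pct_change(network_size, noise_free_size):
    """Percentage change in network size relative to the noise-free design."""
    return 100.0 * (network_size - noise_free_size) / noise_free_size


heads = [10.0, 12.0, 14.0, 16.0]      # hypothetical heads (m)
noisy = corrupt(heads, 0.25, seed=1)  # NSR = 25%
print(round(pct_change(70, 65), 1), round(pct_change(22, 23), 1))
```

The two printed values correspond to the ΔNZ entries for NSR = 25% at ε = 5% and ε = 10%.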
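[47] The dual form is obtained by using Lagrange multipliers. Equations (A1a)–(A1c) written in dual form are as follows:

$$\begin{aligned} G(w, \xi, \xi^*, \alpha, \alpha^*, \eta, \eta^*, b) ={}& \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}(\xi_i + \xi_i^*) \\ &- \sum_{i=1}^{N}\alpha_i\Big[\varepsilon + \xi_i - z_i + \sum_{j=1}^{K} w_j x_{ji} + b\Big] \\ &- \sum_{i=1}^{N}\alpha_i^*\Big[\varepsilon + \xi_i^* + z_i - \sum_{j=1}^{K} w_j x_{ji} - b\Big] \\ &- \sum_{i=1}^{N}\big[\eta_i \xi_i + \eta_i^* \xi_i^*\big], \end{aligned} \tag{A2}$$

where $\alpha^*$, $\alpha$, $\eta^*$, $\eta$ are Lagrange multipliers, and $j = 1, \ldots, K$ indexes the input dimension. The saddle point condition states that the partial derivatives of $G$ with respect to the primal variables ($w$, $b$, $\xi_i$, $\xi_i^*$) have to vanish for optimality, i.e.,

$$\frac{\partial G}{\partial b} = \sum_{i=1}^{N}(\alpha_i^* - \alpha_i) = 0, \tag{A3}$$

$$\frac{\partial G}{\partial w} = \sum_{j=1}^{K}\frac{\partial G}{\partial w_j}\,\vec{z}_j = \sum_{j=1}^{K} w_j\,\vec{z}_j + \sum_{i=1}^{N}(\alpha_i^* - \alpha_i)\sum_{j=1}^{K} x_{ji}\,\vec{z}_j = \{0\}, \tag{A4}$$

$$\frac{\partial G}{\partial w} = w - \sum_{i=1}^{N}(\alpha_i - \alpha_i^*)\,x_i = \{0\}, \tag{A5}$$

and thus

$$w = \sum_{i=1}^{N}(\alpha_i - \alpha_i^*)\,x_i, \tag{A6}$$

where $\vec{z}_j$ is a unit vector. Also,

$$\frac{\partial G}{\partial \xi_i} = C - \alpha_i - \eta_i = 0, \tag{A7}$$

$$\frac{\partial G}{\partial \xi_i^*} = C - \alpha_i^* - \eta_i^* = 0. \tag{A8}$$

Substituting equations (A3) to (A8) in equation (A2) results in the following quadratic optimization problem. Maximize the following functional with respect to the forcings ($\alpha$s):

$$W(\alpha^*, \alpha) = -\varepsilon\sum_{i=1}^{N}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{N} z_i(\alpha_i - \alpha_i^*) - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\langle x_i, x_j\rangle, \tag{A9a}$$

subject to constraints

$$\sum_{i=1}^{N}(\alpha_i^* - \alpha_i) = 0, \qquad 0 \le \alpha_i,\ \alpha_i^* \le C, \tag{A9b}$$

to obtain

$$\hat{Z}(x) = \sum_{i=1}^{N}(\alpha_i - \alpha_i^*)\langle x, x_i\rangle + b. \tag{A9c}$$

Since the above expression depends only on inner products between input examples, kernel substitution (also called the kernel trick (see Figure 3)) of $\langle x, x'\rangle \rightarrow \langle \Phi(x), \Phi(x')\rangle = K(x, x')$ would result in the SVM algorithm:

maximize

$$W(\alpha^*, \alpha) = -\varepsilon\sum_{i=1}^{N}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{N} z_i(\alpha_i - \alpha_i^*) - \frac{1}{2}\sum_{i,j=1}^{N}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\,k(x_i, x_j), \tag{A10a}$$

subject to constraints

$$\sum_{i=1}^{N}(\alpha_i^* - \alpha_i) = 0, \qquad 0 \le \alpha_i,\ \alpha_i^* \le C, \tag{A10b}$$

to obtain

$$\hat{Z}(x) = \sum_{i=1}^{N}(\alpha_i - \alpha_i^*)\,k(x, x_i) + b. \tag{A10c}$$

The bias $b$ of the function that we are seeking is found from the Kuhn-Tucker (KT) condition, which requires that for the optimal solution the product between dual variables and constraints vanish. Mathematically, this is expressed as

$$\alpha_i\Big(\varepsilon + \xi_i - z_i + \sum_{j=1}^{K} w_j x_{ji} + b\Big) = 0, \tag{A11}$$

$$\alpha_i^*\Big(\varepsilon + \xi_i^* + z_i - \sum_{j=1}^{K} w_j x_{ji} - b\Big) = 0, \tag{A12}$$

$$(\alpha_i - C)\,\xi_i = 0, \qquad (\alpha_i^* - C)\,\xi_i^* = 0. \tag{A13}$$

From the relations shown above, it follows that: (1) only samples $(x_i, z_i)$ with corresponding $\alpha_i^*$ or $\alpha_i = C$ lie outside the $\varepsilon$ tube; (2) the dual variables are mutually exclusive ($\alpha_i^*\alpha_i = 0$); if both dual variables had nonzero values, it would require nonzero slack variables in both directions; and (3) for $\alpha_i^*$ and $\alpha_i \in (0, C)$ it follows that $\xi = \xi^* = 0$, i.e., the $z_i$ lie on the $\varepsilon$ tube. Since the second term has to vanish also to satisfy the KT condition, this result would allow the estimation of $b$. Even though a single $x_i$ would be enough to solve the problem, in practice one uses the average of all the support vectors that lie on the $\varepsilon$ tube for the purpose of assuring stability [Müller et al., 1999]. Thus the proper formulation for estimating $b$ is

$$b = \begin{cases} \dfrac{1}{M}\displaystyle\sum_{m=1}^{M}\big(z_m - \langle w, x_m\rangle - \varepsilon\big), & \text{for } \alpha_m \in (0, C) \\[2ex] \dfrac{1}{M}\displaystyle\sum_{m=1}^{M}\big(z_m - \langle w, x_m\rangle + \varepsilon\big), & \text{for } \alpha_m^* \in (0, C) \end{cases} \tag{A14}$$

where $M$ is the number of sample points on the $\varepsilon$ tube.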
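The kernel expansion (A10c) and the bias average (A14) can be sketched in a few lines, under assumptions that go beyond the text. The kernel is taken to be k(x, x') = exp(-γ‖x - x'‖²), since the appendix does not spell out the Gaussian kernel's exact form. The expansion coefficient of each support vector is α_i - α_i*, the sign that stationarity of (A2) in w gives when α_i multiplies the constraint z_i - Ẑ(x_i) ≤ ε + ξ_i. All on-tube points are averaged together, with -ε or +ε chosen per point. The dual values below are hypothetical and are not the solution of any fitted problem.

```python
import math


def rbf(x, y, gamma):
    """Assumed Gaussian kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def predict(x, support, coef, b, gamma):
    """Kernel expansion: sum_i coef_i * k(x, x_i) + b, coef_i = alpha_i - alpha_i*."""
    return sum(c * rbf(x, xi, gamma) for c, xi in zip(coef, support)) + b


def estimate_bias(support, z, alpha, alpha_star, C, eps, gamma):
    """Average the bias implied by every support vector lying on the eps tube."""
    coef = [a - s for a, s in zip(alpha, alpha_star)]
    terms = []
    for xm, zm, a, s in zip(support, z, alpha, alpha_star):
        f0 = predict(xm, support, coef, 0.0, gamma)  # expansion without bias
        if 0.0 < a < C:
            terms.append(zm - f0 - eps)
        elif 0.0 < s < C:
            terms.append(zm - f0 + eps)
    if not terms:
        raise ValueError("no support vector lies on the eps tube")
    return sum(terms) / len(terms)


# Hypothetical one-dimensional example with two on-tube support vectors.
support = [[0.0], [1.0]]
z = [2.0, 1.0]
alpha, alpha_star = [0.5, 0.0], [0.0, 0.5]
b = estimate_bias(support, z, alpha, alpha_star, C=7.0, eps=0.1, gamma=1.0)
coef = [a - s for a, s in zip(alpha, alpha_star)]
print(b, predict([0.0], support, coef, b, gamma=1.0))
```

Averaging over all on-tube points instead of using a single one follows the stability argument given in the text.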
[48] Acknowledgments. We are grateful for the thoughtful review and suggested improvements by two anonymous reviewers that helped improve this manuscript. We especially thank the first reviewer for her/his substantive guidance in improving the manuscript.

References

Angulo, M., and W. H. Tang (1999), Optimal groundwater detection monitoring system design under uncertainty, J. Geotech. Geoenviron. Eng., 125, 510–517.
Asefa, T., and M. W. Kemblowski (2002), Support vector machines approximation of flow and transport models in initial groundwater contamination network design, Eos Trans. AGU, 83(47), Fall Meet. Suppl., Abstract H72D-0882.
Associated Earth Sciences, Inc. (1994), Wellhead protection plan for the city of Everson, Whatcom County, Washington, report, Kirkland, Wash.
Associated Earth Sciences, Inc. (1995), Wellhead protection program, Sumas, Washington, for city of Sumas, report, Kirkland, Wash.
Ben-Jemaa, F., M. A. Marino, and H. A. Loaiciga (1994), Multivariate geostatistical design of groundwater monitoring networks, J. Water Resour. Plann. Manage., 120, 505–522.
Cameron, K., and P. Hunter (2000), Optimization of LTM networks using GTS: Statistical approaches to spatial and temporal redundancy, report, Air Force Cent. for Environ. Excell., Brooks AFB, Tex.
Cieniawski, S. E., J. W. Eheart, and S. R. Ranjithan (1995), Using genetic algorithms to solve a multiobjective groundwater monitoring problem, Water Resour. Res., 31, 399–409.
Cox, S. E., and S. C. Kahle (1999), Hydrogeology, ground water quality, and sources of nitrate in lowland glacial aquifer of Whatcom County, Washington, and British Columbia, Canada, U. S. Geol. Surv. Water Resour. Invest. Rep., 98-4195.
Datta, B., and S. D. Dhiman (1996), Chance-constrained optimal monitoring network design for pollutants in groundwater, J. Water Resour. Plann. Manage., 122, 180–188.
Dibike, B. Y., S. Velickov, D. Solomatine, and B. M. Abbot (2001), Model induction with support vector machines: Introduction and applications, J. Comput. Civ. Eng., 15, 208–216.
Gangopadhyay, S., A. D. Gupta, and M. H. Nachabe (2001), Evaluation of groundwater monitoring network by principal component analysis, Ground Water, 39, 181–191.
GeoEngineers Hydrogeologic Services (1994), Wellhead protection study: Dodson's IGA well, Whatcom County, Washington, U.S.A., report, Bellingham, Wash.
Girosi, F. (1998), An equivalence between sparse approximation and support vector machines, Neural Comput., 10, 1455–1480.
Govindaraju, R. S., and A. R. Rao (2000), Artificial Neural Network in Hydrology, 348 pp., Kluwer Acad., Norwell, Mass.
Hastie, T., R. Tibshirani, and J. Friedman (2001), The Elements of Statistical Learning: Data Mining, Inference and Prediction, Springer-Verlag, New York.
Hudak, P. F., and H. A. Loaiciga (1992), A location modeling approach for groundwater monitoring network augmentation, Water Resour. Res., 28, 643–649.
Jardine, K., L. Smith, and T. Clemo (1996), Monitoring networks in fractured rocks: A decision analysis approach, Ground Water, 34, 504–518.
Jones, M. A. (1999), Geologic framework for the Puget Sound aquifer system, Washington and British Columbia, U.S. Geol. Surv. Prof. Pap., 1424-C.
Journel, A., and C. Huijbregts (1978), Mining Geostatistics, Academic, San Diego, Calif.
Kaneviski, M., A. Pozdnukhov, S. Canu, and M. Maignan (2000), Advanced spatial data analysis and modeling with support vector machines, Int. J. Fuzzy Syst., 4, 606–615.
Knopman, D. S., C. I. Voss, and S. P. Garabedian (1991), Sampling design for groundwater solute transport: Tests of methods and analysis of Cape Code tracer test data, Water Resour. Res., 27, 925–949.
Liong, S. Y., and C. Sivapragasam (2000), Flood stage forecasting with SVM, J. Am. Water Resour. Assoc., 38, 173–186.
Loaiciga, H. A., R. J. Charbeneau, L. G. Everett, G. E. Fogg, B. F. Hobbs, and S. Rouhani (1992), Review of groundwater quality monitoring network design, J. Hydrol. Eng., 118, 11–37.
Mahar, P. S., and B. Datta (1997), Optimal monitoring network and groundwater pollution source identification, J. Water Resour. Plann. Manage., 123, 199–207.
Massmann, J., and R. A. Freeze (1987a), Groundwater contamination from waste management sites: The interaction between risk-based engineering design and regulatory policy: 1. Methodology, Water Resour. Res., 23, 351–367.
Massmann, J., and R. A. Freeze (1987b), Groundwater contamination from waste management sites: The interaction between risk-based engineering design and regulatory policy: 2. Results, Water Resour. Res., 23, 368–380.
Meyer, P. D., and E. D. Brill, Jr. (1988), A method for locating wells in a groundwater monitoring network under conditions of uncertainty, Water Resour. Res., 24, 1277–1282.
Meyer, P. D., A. J. Valocchi, and J. W. Eheart (1994), Monitoring network design to provide initial detection of groundwater contamination, Water Resour. Res., 30, 2647–2659.
Minsker, B., and Task Committee (2003), Long-term groundwater monitoring design: State of the art applications, report, Am. Soc. of Civ. Eng., Reston, Va.
Molina, G. R., J. J. Beauchamp, and T. Wright (1996), Determining an optimal sampling frequency for measuring bulk temporal changes in groundwater quality, Ground Water, 34, 579–587.
Montas, H. J., R. H. Mohtar, A. E. Hassan, and F. AlKhad (2000), Heuristic space-time design of the monitoring wells for contaminant plume characterization in stochastic flow fields, J. Contamin. Hydrol., 43, 271–301.
Morisawa, S., and Y. Inoue (1991), Optimum allocation of monitoring wells around a solid-waste landfill site using precursor indicators and fuzzy utility functions, J. Contamin. Hydrol., 7, 337–370.
Müller, K. R., A. Smola, G. Rätsch, B. Schölkopf, J. Kohlmorgen, and V. Vapnik (1999), Predicting time series with support vector machines, in Advances in Kernel Methods: Support Vector Learning, edited by B. Schölkopf, C. J. C. Burges, and A. J. Smola, pp. 243–254, MIT Press, Cambridge, Mass.
Nunes, L. M., E. Paralta, M. C. Cunha, and L. Ribeiro (2004a), Groundwater nitrate monitoring network optimization with missing data, Water Resour. Res., 40, W02406, doi:10.1029/2003WR002469.
Nunes, L. M., M. C. Cunha, and L. Ribeiro (2004b), Groundwater monitoring network optimization with redundancy reduction, J. Water Resour. Plann. Manage., 130, 33–43.
Poggio, T., and F. Girosi (1998a), A sparse representation for function approximation, Neural Comput., 10, 1445–1454.
Poggio, T., and F. Girosi (1998b), Notes on PCA, regularization, sparsity and support vector machines, AI Memo. 1632, CBCI Pap. 161, Mass. Inst. of Technol., Cambridge.
Reed, P., and B. S. Minsker (2004), Striking the balance: Long-term groundwater monitoring design for conflicting objectives, J. Water Resour. Plann. Manage., 130, 140–149.
Reed, P., B. Minsker, and A. J. Valocchi (2000), Cost-effective long-term groundwater monitoring design using a genetic algorithm and global mass interpolation, Water Resour. Res., 36, 3731–3741.
Reed, P., B. S. Minsker, and D. E. Goldberg (2001), A multiobjective approach to cost effective long-term groundwater monitoring using an Elitist Nondominated Sorted Genetic Algorithm with historical data, J. Hydroinformatics, 3, 71–90.
Reed, P., B. S. Minsker, and D. E. Goldberg (2003), Simplifying multiobjective optimization: An automated design methodology for the nondominated sorted genetic algorithm–II, Water Resour. Res., 39(7), 1196, doi:10.1029/2002WR001483.
Rouhani, S. (1985), Variance reduction analysis, Water Resour. Res., 21, 837–846.
Saunders, C., M. O. Stitson, J. Weston, L. Bottou, B. Scholkopf, and A. Smola (1998), Support vector machine reference manual, Tech. Rep. CSD-TR-98-03, Royal Holloway Univ. of London, London.
Schölkopf, B., J. C. Burges, and A. Smola (1999), Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, Mass.
Storck, P., J. W. Eheart, and A. J. Valocchi (1997), A method for the optimal location of monitoring wells for detection of groundwater contamination in three-dimensional heterogeneous aquifers, Water Resour. Res., 33, 2081–2088.
Tikhonov, A., and V. Arsenin (1977), Solution of Ill-Posed Problems, W. H. Winston, Washington, D. C.
Vanderbei, R. J. (1994), LOQO: An interior point code for quadratic programming, Rep. TRSOR-94-15, Stat. and Oper. Res. Princeton Univ., Princeton, N. J.
Vapnik, V. (1995), The Nature of Statistical Learning Theory, Springer-Verlag, New York.
Vapnik, V. (1998), Statistical Learning Theory, John Wiley, Hoboken, N. J.
Wagner, B. J. (1995), Sampling design methods for groundwater modeling under uncertainty, Water Resour. Res., 31, 2581–2591.
Wahba, G. (1990), Spline Models for Observation Data, Ser. Appl. Math., vol. 59, Soc. for Indust. and Appl. Math., Philadelphia, Pa.
Water Resources Consulting, LLC (1997), Wellhead protection program, report, Pole Road Water Assoc., Whatcom County, Wash.

T. Asefa, M. W. Kemblowski, A. Khalil, M. McKee, and G. Urroz, Department of Civil and Environmental Engineering, Utah State University, Logan, UT 84322, USA. (tasefa@cc.usu.edu; mkem@cc.usu.edu; akhalil@cc.usu.edu; mmckee@cc.usu.edu; gurro@cc.usu.edu)
