Journal of Hydrology: K.S. Kasiviswanathan, Jianxun He, K.P. Sudheer, Joo-Hwa Tay

Journal of Hydrology 536 (2016) 161–173
Potential application of wavelet neural network ensemble to forecast streamflow for flood management
K.S. Kasiviswanathan a, Jianxun He a,⇑, K.P. Sudheer b, Joo-Hwa Tay a
a Department of Civil Engineering, Schulich School of Engineering, University of Calgary, 2500 University Drive NW, Calgary T2N 1N4, Canada
b Department of Civil Engineering, Indian Institute of Technology Madras, Chennai 600 036, India
Article info

Article history:
Received 30 November 2015
Received in revised form 4 February 2016
Accepted 23 February 2016
Available online 2 March 2016
This manuscript was handled by Andras Bardossy, Editor-in-Chief, with the assistance of Fi-John Chang, Associate Editor

Keywords:
Artificial neural network
Block bootstrap
Forecast accuracy
Forecast precision
Streamflow forecast
Wavelet neural network

Summary

Streamflow forecasting, especially the long lead-time forecasting, is still a very challenging task in hydrologic modeling. This could be due to the fact that the forecast accuracy measured in terms of both the amplitude and phase or temporal errors and the forecast precision/reliability quantified in terms of the uncertainty significantly deteriorate with the increase of the lead-time. In the model performance evaluation, the conventional error metrics, which primarily quantify the amplitude error and do not explicitly account for the phase error, have been commonly adopted. For the long lead-time forecasting, the wavelet based neural network (WNN) among a variety of advanced soft computing methods has been shown to be promising in the literature. This paper presented and compared WNN and artificial neural network (ANN), both of which were combined with the ensemble method using block bootstrap sampling (BB), in terms of the forecast accuracy and precision at various lead-times on the Bow River, Alberta, Canada. Apart from conventional model performance metrics, a new index, called percent volumetric error, was proposed, especially for quantifying the phase error. The uncertainty metrics including percentage of coverage and average width were used to evaluate the precision of the modeling approaches. The results obtained demonstrate that the WNN-BB consistently outperforms the ANN-BB in both the categories of the forecast accuracy and precision, especially in the long lead-time forecasting. The findings strongly suggest that the WNN-BB is a robust modeling approach for streamflow forecasting and thus would aid in flood management.
© 2016 Elsevier B.V. All rights reserved.
⇑ Corresponding author. Tel.: +1 403 220 4112. E-mail address: jianhe@ucalgary.ca (J. He).
http://dx.doi.org/10.1016/j.jhydrol.2016.02.044
0022-1694/© 2016 Elsevier B.V. All rights reserved.

1. Introduction

Although a variety of hydrologic models have been adopted for forecasting streamflow, data-driven models have gained significant interest. The primary reason for promoting data-driven models is that many of them can capture non-linear processes numerically without a full understanding of the underlying physical processes involved. These models include artificial neural networks (ANN) (Prakash et al., 2014), recurrent neural networks (Chen et al., 2013; Chang et al., 2014), support vector machines (Han et al., 2007), genetic programming (Kisi et al., 2013), and neuro-fuzzy models (Nayak et al., 2005). Among these modeling approaches, no single approach consistently outperforms the others. In general, data-driven models rely primarily on historical observations, unlike physically-based models, which explicitly account for watershed characteristics and the physical processes involved. However, the development of physically-based models often requires intensive computation, the expertise of hydrologic modelers, and the determination of a number of physical parameters through field measurement, which can be very challenging (Srivastav et al., 2007). In addition, physically-based models sometimes cannot capture a slight change in watershed response (Alvisi and Franchini, 2011). All these facts about physically-based models encourage the application of data-driven models, which primarily aim to produce accurate forecasts while ignoring the complicated underlying physical processes.

Most recently, the wavelet based data-driven modeling approach has gained significant attention among hydrologists due to its potential to capture both the periodic and chaotic behavior and the trend of time series data (Adamowski and Sun, 2010). The wavelet decomposes the original signal into several different resolution levels to capture the useful information and hence increases the performance of model prediction (Nourani et al., 2011). The application of the wavelet based neural network (WNN) for hydrologic modeling was initially explored by Wang and Ding (2003), who concluded that the combination of ANN and wavelet techniques could enhance the model accuracy, especially in the long
lead-time forecasting. Since then, a large number of studies have applied WNN for both the long and short lead-time flow forecasting (e.g., Shiri and Kisi, 2010). Many studies have demonstrated that WNNs often produce more consistent and accurate results when compared to the traditional ANNs (Nourani et al., 2009; Adamowski and Sun, 2010; Seo et al., 2015). Furthermore, several studies have quantified uncertainty in wavelet based flood forecasting, aiming to improve the model reliability (Tiwari and Chatterjee, 2010; Sang et al., 2015).

In streamflow forecasting, the model accuracy, in general, deteriorates with the increase of the lead-time, which can be attributed to the weak dependence between the modeled variable and the input(s). The forecasting error is generally classified into three categories: amplitude error, phase/temporal error and shape error (Prakash et al., 2014). The amplitude error is mainly caused by the presence of noise in the input data or by deficiencies in the model structure. This would lead to either overestimation or underestimation of streamflow (Shamseldin and O'Connor, 2001). The phase error indicates the lag in timing of the simulated hydrograph, which is critical in flow forecasting (Prakash et al., 2014). The shape error is mainly determined by the rate of flow change in the rising and falling limbs of a hydrograph. The evaluation of the model performance (in terms of the model accuracy) has often been conducted using intuitive graphical representation as well as statistical measures such as the root mean square error (RMSE) and the Nash–Sutcliffe coefficient (EI), which are objective and quantitative in nature but do not consider the temporal dimension of the time series (i.e., the hydrograph). A few studies have addressed the temporal error, but only for the peak flow rather than the entire hydrograph, aiming to improve the prediction accuracy of peak flow in terms of magnitude and/or timing (e.g., Liu et al., 2011). In such a situation, the model may not warrant the best possible solution when there are multiple peaks, as the model performance is often biased toward a particular peak flow. Furthermore, several researchers have made efforts to reduce the phase error by modifying the modeling approaches. For instance, Abrahart et al. (2007) applied a correction factor when calibrating ANN models to minimize the phase error; however, they concluded that this method is only effective for the short lead-time forecasting but not for the long lead-time forecasting.

Besides, the model precision, often assessed in terms of the prediction uncertainty, degrades as the lead-time of forecasting increases (Kasiviswanathan et al., 2013). In the long lead-time forecasting, several possible reasons, including the decreasing influence of the input(s) on the output, increased ranges of model parameters, and inadequate model structure, could lead to the increase of prediction uncertainty. However, to date, most studies have concentrated on the model accuracy and not on the model precision, with few exceptions (e.g., Alvisi and Franchini, 2011; Kasiviswanathan et al., 2013).

In view of the above, the primary objective of this paper is to identify the more robust modeling approach of the two most popular data-driven methods, namely ANN and WNN, especially for the long lead-time forecasting, by assessing both modeling accuracy and precision. To the authors' best knowledge, no research article to date has compared ANN and WNN for forecasting from both aspects of model accuracy and precision. The ensemble modeling approach has been demonstrated to have the capability to improve the model accuracy in hydrologic modeling (Tiwari and Chatterjee, 2010). Therefore, the combinations of the data-driven modeling approaches and the block bootstrap (BB) sampling method (named ANN-BB and WNN-BB) are used and compared in a real application using data collected from the Bow River, Alberta, Canada. The advantage of BB lies in preserving the correlation and periodicity inherent in time series data (Ebtehaj et al., 2010). In addition, a new index, called percent volumetric error (PVE), is proposed to quantify the phase error; it considers the entire hydrograph rather than focusing on the peak flow. For flood management purposes, a week (7-day) of lead-time is required to issue flood awareness, activate mitigation measures, and evacuate the most vulnerable groups (Golding, 2009). Therefore, lead-times up to 7 days are investigated in the paper.

2. Methodology

In this paper, the streamflow forecasting is conducted using both WNN and ANN. The models are developed for forecasting streamflow at several lead-times including 1-day, 3-day, 5-day and 7-day. In the model development, BB is employed to sample data to generate an ensemble of models. The following sections briefly describe the methods of ANN, WNN, and BB, respectively.

2.1. Artificial neural network

An ANN is characterized as massively parallel interconnections of simple neurons that function as a collective system. The network topology consists of a set of nodes (neurons) connected by links that are usually organized in a number of layers. Each node in a layer receives and processes weighted input from the previous layer and transmits its output to nodes in the following layer through links. Each link is assigned a weight, which is a numerical estimate of the connection strength. The weighted summation of inputs to a node is converted to an output according to a transfer function (typically a sigmoid function). Most ANNs have three or more layers: an input layer, which is used to feed data to the network; an output layer, which is used to produce an appropriate response to the given inputs; and one or more hidden layers, which act as a collection of feature detectors. The multilayer perceptron (MLP) network is one of the most popular ANN architectures in use today (Maier et al., 2010). A plethora of works using ANN has been reported in the field of hydrology and water resources. Unfortunately, to date, there are no systematic rules for ANN training to determine the optimal network weights and biases. Among many available training techniques, this paper used the Levenberg–Marquardt optimization algorithm to train the neural networks.

2.2. Wavelet based neural network

Wavelet analysis is a signal processing technique employed for the purpose of extracting the useful information in data series, which are either stationary or non-stationary. The advantage of wavelet analysis lies in producing high resolution information of data series in both time and frequency domains, which is not available in other transformations. The wavelet transform has been shown to be a more effective tool than the Fourier transform when dealing with non-stationary hydrologic time series (Partal and Kişi, 2007). The width of the wavelet changes with each spectral component in the wavelet transform, whereas the Fourier transform has a constant width. The wavelet function generates the signal to be analyzed and the signal transform is computed for each segment generated. The scaling and shifting are the two main parameters used in the wavelet transform to decompose the original time series signal into different resolutions. The large-scale signal provides the detailed information, while the small-scale signal compresses the original signal in order to preserve the global information about the signal (Cannas et al., 2006). The time and frequency domains are responsible for maintaining the localization property.

This paper used the discrete wavelet transform (DWT), as the continuous wavelet transform (CWT) necessitates a large amount of computation time and resources. The proper mother wavelet should be identified prior to analyzing the input time series signals when using DWT. In
the paper, the modified form of DWT given in Adamowski and Sun (2010), which is derived from the CWT, was adopted. The DWT operates two sets of functions and, consequently, the original time series data are processed through two types of filters, namely high-pass and low-pass. The high-pass filters are further separated into different levels of details depending on the time scale required. The low-pass filters comprise the trend present in the actual input time series signal and are named the approximation. More details of the mathematical formulations of DWT are provided in Mallat (1989).

A WNN has an architecture similar to that of the standard ANN, but it is fed by the decomposed signals instead of the actual time series data. Therefore, the model development approach for WNN is the same as that for ANN.

2.3. Block bootstrap method

In the deterministic modeling approach, the model parameters are determined assuming that they represent the time invariant properties of a process to be modeled. However, due to the uncertainty mentioned above, the model parameters should be treated as non-deterministic parameters, which consequently are required to be contained within certain ranges. One approach for quantifying such ranges is the probabilistic approach, in which an assumption about the prior distribution of the model parameters is required. In practice, however, it is very difficult, if not impossible, to make such an assumption with statistical justification for the distribution selection. Furthermore, the assumption might lead to poor convergence in the posterior distribution of model parameters (Adkison and Peterman, 1996). Alternatively, the bootstrap resampling method, which can avoid the drawback of the probabilistic approach, has been applied in the model calibration to determine parameter ranges. Although a variety of bootstrap resampling methods are available in the literature, the BB is utilized to sample the input–output patterns for calibrating the models, as the BB has the advantage of preserving the correlation structure exhibited in time series data (Ebtehaj et al., 2010).

The BB essentially requires two parameters: block length and moving length. The block length refers to the total number of data that each block contains and the moving length is the overlap of the data between two blocks. In order to determine these two parameters, different approaches have been proposed in the literature (Hall et al., 1995; Politis and White, 2004). The optimal block length can be determined based on model performance (i.e., RMSE) so that the model produces less error. In hydrologic modeling, an early study by Ebtehaj et al. (2010) applied a BB approach to improve the robustness of hydrologic parameter estimation. In their approach, the pseudo time series with a fixed arbitrary block length are sampled uniformly from an independent and identical distribution. However, the block length should be determined considering the characteristics of the data used for the model development. Therefore, in this paper, the optimal block length was determined through the model calibration, so that the model produces less error (i.e., RMSE). Besides, the moving length is also required to be selected initially. The selected moving length should ensure that the sample data in each block preserve the statistical characteristics, such as the auto-correlation and heteroscedasticity, of the original time series data. In this paper, the auto-correlation is used to determine the moving length.

The novel algorithm proposed to determine the block length is described as follows:

(a) Let 'l' and 'k' denote the block length and the moving length, respectively. The input ($x_i$) and output ($y_i$) patterns of any bootstrap block are represented as $(x_i, y_i)$, where i = 1, 2, ..., l.
(b) Determine the moving length (k) (here, 1 year) using the auto-correlation. The moving length is then fixed throughout the iteration.
(c) Initialize the block length 'l' (here, 1 year).
(d) Sample the input–output patterns from the entire calibration data set.
(e) Train the model (here, WNN or ANN model) using the samples and store the model parameters (i.e., weights and biases).
(f) Evaluate the model performance by estimating RMSE and the associated variability in the form of the standard deviation of the ensemble mean.
(g) Increase the block size (i.e., l = l + Δl, where Δl = 1 year).
(h) Repeat the steps from (d) to (f) and compute the RMSE (updated RMSE) of the ensemble mean and the standard deviation.
(i) Continue the steps (g) to (h) if the updated RMSE is less than the RMSE estimated in the previous step (f); otherwise store the block length 'l' and estimate the model performance and uncertainty indices such as percentage of coverage (POC) and average width (AW) as described by Eqs. (9) and (10) in the following section.

3. Model performance evaluation

The models developed are evaluated in terms of both the model accuracy and the model precision using various metrics.

3.1. Evaluation of model accuracy

The model results are evaluated with regard to accuracy using various error metrics including the coefficient of determination ($R^2$), EI, RMSE, and mean bias error (MBE). In addition to these indices, the persistence index (PI) and the extrapolation index (EXI) are also included. PI compares the model forecast with the forecast obtained by assuming that the process to be modeled is a Wiener process (variance increasing linearly with time), in which the best estimate for the future is given by the latest measurement (Kitanidis and Bras, 1980). EXI assesses whether the model just extrapolates the values from the recent measurements. All selected indices (except the mean bias error and the extrapolation index) can also be found in Crochemore et al. (2015), which summarizes various error metrics for evaluating modeled hydrographs. These indices are calculated by:

$$R^2 = \left\{ \frac{\sum_{i=1}^{n} \left(y_i^o - \bar{y}^o\right)\left(y_i^f - \bar{y}^f\right)}{\sqrt{\sum_{i=1}^{n} \left(y_i^o - \bar{y}^o\right)^2 \sum_{i=1}^{n} \left(y_i^f - \bar{y}^f\right)^2}} \right\}^2 \qquad (1)$$

$$EI = \left\{ 1 - \frac{\sum_{i=1}^{n} \left(y_i^o - y_i^f\right)^2}{\sum_{i=1}^{n} \left(y_i^o - \bar{y}^o\right)^2} \right\} \times 100 \qquad (2)$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i^o - y_i^f\right)^2} \qquad (3)$$

$$MBE = \frac{1}{n} \sum_{i=1}^{n} \left(y_i^o - y_i^f\right) \qquad (4)$$

$$PI = 1 - \frac{\sum_{i=1}^{n} \left(y_i^o - y_i^f\right)^2}{\sum_{i=1}^{n} \left(y_i^o - y_{i-j}^o\right)^2} \qquad (5)$$

$$EXI = 1 - \frac{\sum_{i=1}^{n} \left(y_i^o - y_i^f\right)^2}{\sum_{i=1}^{n} \left(y_i^o - y_i^{lf}\right)^2} \qquad (6)$$
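The block-length search in steps (a) to (i) of Section 2.3 can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: lengths are in samples rather than years, the moving length is read as the spacing between candidate block starts, a least-squares line (`fit_line`) stands in for ANN/WNN training, and the standard deviation of the ensemble mean is omitted. All function names are hypothetical.

```python
import random
from math import sqrt

def rmse(obs, sim):
    return sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def fit_line(pairs):
    """Stand-in for ANN/WNN training (assumption): least-squares y = a*x + b."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    a = sum((x - mx) * (y - my) for x, y in pairs) / sxx if sxx else 0.0
    b = my - a * mx
    return lambda x: a * x + b

def block_sample(pairs, block_len, move_len, rng):
    """Step (d): draw blocks with replacement until the original length is reached."""
    starts = list(range(0, len(pairs) - block_len + 1, move_len))
    out = []
    while len(out) < len(pairs):
        s = rng.choice(starts)
        out.extend(pairs[s:s + block_len])
    return out[:len(pairs)]

def ensemble_rmse(pairs, block_len, move_len, n_members, rng):
    """Steps (e)-(f): train one model per bootstrap sample, score the ensemble mean."""
    models = [fit_line(block_sample(pairs, block_len, move_len, rng))
              for _ in range(n_members)]
    obs = [y for _, y in pairs]
    mean_sim = [sum(m(x) for m in models) / n_members for x, _ in pairs]
    return rmse(obs, mean_sim)

def search_block_length(pairs, move_len, start_len, step, n_members=20, seed=0):
    """Steps (c), (g)-(i): grow the block length while the RMSE keeps improving."""
    rng = random.Random(seed)
    best_len = start_len
    best = ensemble_rmse(pairs, best_len, move_len, n_members, rng)
    while best_len + step <= len(pairs):
        cand = ensemble_rmse(pairs, best_len + step, move_len, n_members, rng)
        if cand >= best:  # no improvement: keep the stored block length
            break
        best_len, best = best_len + step, cand
    return best_len, best
```

Calling `search_block_length(pairs, move_len=365, start_len=365, step=365)` on daily input-output pairs would mimic the one-year settings mentioned in the steps; the number of ensemble members is a free choice here.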
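As an illustration of Eqs. (1) to (6), the sketch below computes the six accuracy indices for an observed series and a forecast series at lead-time `lead`. It is a hedged sketch, not the authors' code. Two choices are assumptions: PI and EXI are summed only over time steps where the lagged observations exist, and the straight-line forecast is taken as the line through the observations at steps i − j − 1 and i − j extrapolated j steps ahead. The function name is hypothetical.

```python
from math import sqrt

def accuracy_metrics(obs, sim, lead):
    """Return R2, EI, RMSE, MBE, PI and EXI for equal-length sequences."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_f = sum(sim) / n
    sse = sum((o - f) ** 2 for o, f in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_f = sum((f - mean_f) ** 2 for f in sim)
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(obs, sim))

    # Assumption: PI and EXI use only steps with both lagged observations available.
    idx = range(lead + 1, n)
    sse_tail = sum((obs[i] - sim[i]) ** 2 for i in idx)
    # Persistence benchmark: the latest measurement is the forecast.
    naive = sum((obs[i] - obs[i - lead]) ** 2 for i in idx)
    # Linear benchmark: extrapolate the last two measurements `lead` steps ahead.
    linear = sum(
        (obs[i] - (obs[i - lead] + lead * (obs[i - lead] - obs[i - lead - 1]))) ** 2
        for i in idx
    )
    return {
        "R2": (cov / sqrt(var_o * var_f)) ** 2,
        "EI": (1.0 - sse / var_o) * 100.0,
        "RMSE": sqrt(sse / n),
        "MBE": sum(o - f for o, f in zip(obs, sim)) / n,
        "PI": 1.0 - sse_tail / naive,
        "EXI": 1.0 - sse_tail / linear,
    }
```

A forecast that overestimates every observation by one unit gives RMSE = 1 and MBE = −1 while R2 stays at 1, which is why several indices are read together. Constant series (zero variance) are not handled in this sketch.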
In Eqs. (1)–(6), n is the number of data points; $y_i^o$ and $y_i^f$ are the observed and forecasted values at time i, respectively; $\bar{y}^o$ and $\bar{y}^f$ are the means of the observed and forecasted values; 'j' is the lead-time in the forecasting; and $y_i^{lf}$ is the forecast corresponding to a straight line fitted to the two most recent measurements, $y_{i-j}^o$ and $y_{i-j-1}^o$.

When conducting hydrograph evaluation (comparing modeled and observed hydrographs), there are a large number of error metrics currently available for selection and implementation. Crochemore et al. (2015) provided a number of error metrics commonly used for this purpose. These error metrics, including the ones selected above in the paper, primarily estimate the amplitude error in a variety of ways. The phase error, which is commonly present in forecasts (e.g., flow forecasts here) and is especially obvious in long lead-time forecasting, has to date not been explicitly accounted for when evaluating the model performance. Therefore, hydrograph evaluation based on several existing error metrics might convey misleading information, since it does not sufficiently assess the modeled results from the aspect of phase error. As a result, an error metric which explicitly considers the phase error would be beneficial in better evaluating the developed model or minimizing the phase error in the model development.

Therefore, in the paper, a new metric is proposed especially for quantifying the phase error by considering the second dimension, time, when evaluating the model accuracy. Typically there are three different possible cases in terms of the relative location of a simulated hydrograph to an observed hydrograph (Fig. 1). Case I represents a model that captures the dynamics or preserves the shape of a hydrograph, but consistently underestimates the observations. Thus the simulated hydrograph shows little to no phase error (time lag). Note that a model can also overestimate the observations without an obvious phase error. In Case II, the simulated hydrograph is right-skewed relative to the observed hydrograph; in addition, the time lags in the rising limb and the falling limb are not symmetric and the peak flow can be either overestimated or underestimated. There is also another possibility (Case III) that the simulated hydrograph has a similar geometry (in terms of both magnitude and shape) to the observed hydrograph, but there is a clear shift in time between the observed and simulated hydrographs. Note that in streamflow forecasting, the simulated hydrograph often shifts toward the right-hand side of the observed hydrograph. In this case, the model overall underestimates flows in the region of the rising limb of the hydrograph; however, it overestimates flows in the region of the falling limb of the hydrograph (Case II). Besides the three cases, a model may produce perfect forecasts, which do not have obvious amplitude and phase errors.

In streamflow forecasting for flood management purposes, the hydrograph geometry and the timing are two critical measures for assessing the model accuracy. Therefore, a new index, percent volumetric error (PVE), is proposed, aiming to quantitatively estimate the phase error for the entire hydrograph. Fig. 2 illustrates a typical example of observed and simulated hydrographs and the volume error (VE) in a time interval, Δt (the shaded area in Fig. 2). PVE is calculated by Eqs. (7) and (8). The calculation of PVE for overestimation (negative VE) and underestimation (positive VE) is separated.

$$VE_i^{i+1} = \left\{ \frac{\left(y_i^o - y_i^f\right) + \left(y_{i+1}^o - y_{i+1}^f\right)}{2} \right\} \Delta t \qquad (7)$$

$$PVE_{u\ \mathrm{or}\ o} = \frac{\sum_{i=1}^{u\ \mathrm{or}\ o} VE_i^{i+1}}{\sum_{i=1}^{n} y_i^o \, \Delta t} \times 100 \qquad (8)$$

where u and o indicate the numbers of positive and negative VE, respectively.

In contrast to the conventional model accuracy measures, PVE quantitatively accounts for the difference in the locations of the observed and simulated hydrographs by adding time into the calculation instead of concentrating on the magnitude of the simulated results. As illustrated in Fig. 1, either PVEu or PVEo is zero
Fig. 1. Three typical relationships between observed (solid line) and simulated hydrographs (dotted line). Case I: no phase error, but consistent underestimation (or
overestimation – not shown in the graph); Case II: right-shifted with uneven phase errors in the rising and falling limbs; Case III: same shape, but evenly right-shifted in the
rising and falling limbs.
in Case I; PVEu and PVEo are obviously different in terms of magnitude in Case II; while PVEu and PVEo are equivalent in Case III, which suggests that the model yields small errors in magnitude.

Fig. 2. The illustration of the calculation of the volume error (VE). The shaded area between the observed and simulated hydrographs indicates the VE in a time interval, Δt. The VE in a time interval can be a positive (underestimation) or negative number (overestimation). [Axes: time, t; streamflow, m3/s. Curves: observed, simulated.]

3.2. Evaluation of model precision

Uncertainty has often been presented mathematically in the forms of probability distribution, joint probability distribution, and interval bound. This paper assesses the uncertainty in the model prediction using the interval bound approach. Graphical representations such as the rank histogram and the reliability diagram (Boucher et al., 2010) visually demonstrate the model uncertainty, but they might be subjective. Hence, two quantitative uncertainty indices, the percentage of coverage (POC) and the average width (AW), which have often been adopted in the literature (Zhang et al., 2009; Alvisi and Franchini, 2011), are selected to evaluate the magnitude of the uncertainty. The AW quantifies the average width of the prediction intervals (ranging from the minimum to the maximum simulated results from an ensemble of models) of a data set. The POC measures the percentage of observed values that fall within the prediction intervals, out of the number of data used. These two indices are calculated as follows.

$$AW = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i^U - \hat{y}_i^L\right) \qquad (9)$$

$$POC = \left( \frac{1}{n} \sum_{i=1}^{n} c_i \right) \times 100 \qquad (10)$$

where $\hat{y}_i^U$ and $\hat{y}_i^L$ are the upper and lower bound estimates of the ith sample, respectively; $c_i$ is 1 if the ith observation falls within the prediction interval $[\hat{y}_i^L, \hat{y}_i^U]$, otherwise $c_i$ is equal to 0.

4. Study area and data description

The Bow River originates from the Canadian Rockies and flows from west toward east through three different geographic regions including the mountains, the foothills, and the prairies. The Bow River is fed by several water sources, including a large amount of rainfall–runoff during late spring to early summer, groundwater recharge, which is the major water source during winter, and snowmelt in spring and early summer. In each year, the Bow River is partially covered by ice between December and March. Open water begins in April and flow generally starts increasing from May as the mountain snowpack melts. Flow in the Bow River at Calgary often peaks after mid-June due to the contribution of both rainfall–runoff and mountain snowmelt and then slowly decreases toward the winter. The City of Calgary, which is the most populated community within the river basin, is located in the foothills at 1050 m above mean sea level. In the last decade, the City experienced two major flood events, in 2005 and 2013. The 2013 flood event is recorded as one of the most catastrophic flooding events in Alberta's history. These recent floods, with magnitudes of 1750 m3/s in 2013 and 602 m3/s in 2005, posed serious threats to Calgary.

There are several flow gauge stations, which are operated by Water Survey of Canada, on the river. The flow gauge station in the City of Calgary (Bow River at Calgary, 05BH004) (Fig. 3) was used in this paper to demonstrate the methodology. The historical daily average flow data collected from 1912 to 2013 at the gauge station in Calgary were used. The data set has an average flow of 90.14 m3/s and a standard deviation of 72.81 m3/s. The distribution of the data set is right-skewed (skewness = 2.68), as the majority of data points are located in the range between the low flow (approximately 50 m3/s) and moderate flow, while the data are very sparse in the high flow region (above 300 m3/s). About 2% of the data are identified as outliers (outside of mean ± 3 times standard deviation). The floods recorded in 2013 and 2005 were found to largely deviate from the ranges by 483% and 100%, respectively. Strong auto-correlations (above 0.90) were calculated between the current and antecedent flows up to a time lag of 9 days. Hence, this paper investigated using the antecedent flow(s) to forecast flow at the current time.

5. Model development

The model development, which includes identifying input variables and optimizing network architectures, was conducted based on the case of 7-day lead-time forecasting, as this paper focuses on the long lead-time forecasting. However, the performances of the models were also assessed for several other lead-times: 1-day, 3-day and 5-day. To facilitate the comparison across different lead-times for each modeling approach, the same model architecture was adopted for different lead-times to avoid uncertainty introduced by the model structure. The data collected from 1912 to 2002 were used to optimize the model structures in the model calibration and the remaining data were used in the model validation.

5.1. Wavelet decomposition

The DWT function was employed to decompose the original time series of daily flow into the wavelet components. The selection of a particular mother wavelet is problem dependent and based on the complex nonlinearity of the input time series data. The most commonly used wavelet transform functions include Daubechies and Morlet; whereas the Daubechies wavelet transform function can produce identical events with different shapes across the time series that are in general difficult to recognize with most other transform functions (Benaouda et al., 2006). Hence, this paper used the Daubechies-10 mother wavelet (db 10) as suggested by Seo et al. (2015). The optimal level of decomposition is another parameter in the wavelet transform function that should be determined to ensure model performance. Many previous works (e.g., Nourani et al., 2009; Tiwari and Chatterjee, 2010) have adopted an empirical equation to determine the decomposition level. This paper used the following empirical equation suggested by Nourani et al. (2009) to determine the decomposition level (L).

$$L = \mathrm{int}[\log(N)] \qquad (11)$$
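The volume error and PVE of Eqs. (7) and (8) can be illustrated with the short sketch below. It is a hedged example, not the authors' code: the function name is hypothetical, a uniform time step `dt` is assumed, and PVEo is returned with its negative sign so that the two components can be told apart.

```python
def percent_volumetric_error(obs, sim, dt=1.0):
    """Return (PVE_u, PVE_o) from equal-length observed and simulated series."""
    # Eq. (7): trapezoidal volume error for each interval [i, i+1].
    ve = [
        ((obs[i] - sim[i]) + (obs[i + 1] - sim[i + 1])) / 2.0 * dt
        for i in range(len(obs) - 1)
    ]
    total_volume = sum(obs) * dt
    # Eq. (8): positive VE (underestimation) and negative VE (overestimation)
    # are summed separately and scaled by the observed volume.
    pve_u = 100.0 * sum(v for v in ve if v > 0) / total_volume
    pve_o = 100.0 * sum(v for v in ve if v < 0) / total_volume
    return pve_u, pve_o

# Case I: same shape, consistent underestimation, so PVE_o is zero.
case1 = percent_volumetric_error([10, 20, 30, 20, 10], [8, 18, 28, 18, 8])

# Case III: same shape shifted one step to the right, so |PVE_u| equals |PVE_o|.
case3 = percent_volumetric_error([0, 10, 20, 10, 0, 0], [0, 0, 10, 20, 10, 0])
```

In the shifted example the rising limb is underestimated and the falling limb is overestimated by the same volume, so the two components cancel in a plain bias measure but remain visible in PVE.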
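A minimal sketch of Eqs. (9) and (10) is given below, assuming, as described in the text, that the prediction interval at each time step spans the minimum to maximum value across the ensemble members. The function name and the data layout (one list per ensemble member) are illustrative assumptions.

```python
def interval_metrics(obs, ensemble):
    """Return (AW, POC) for observations and a list of member forecast series."""
    n = len(obs)
    lower = [min(member[i] for member in ensemble) for i in range(n)]
    upper = [max(member[i] for member in ensemble) for i in range(n)]
    # Eq. (9): average width of the prediction intervals.
    aw = sum(u - l for l, u in zip(lower, upper)) / n
    # Eq. (10): share of observations inside their interval, in percent.
    covered = sum(1 for o, l, u in zip(obs, lower, upper) if l <= o <= u)
    poc = 100.0 * covered / n
    return aw, poc

aw, poc = interval_metrics([1, 2, 3, 4], [[0, 1, 4, 3], [2, 3, 5, 5]])
```

The two indices pull in opposite directions: widening the intervals raises POC but also raises AW, so a precise ensemble is one with high coverage at a small average width.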
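As a worked check of Eq. (11), the sketch below evaluates the decomposition level for record lengths of the order used here. The base-10 logarithm is an assumption (it is the reading consistent with a four-level decomposition of a multi-decade daily record), and the sample counts are rough illustrative values, not figures taken from the paper.

```python
from math import log10

def decomposition_level(n_samples):
    """Eq. (11) with a base-10 logarithm (assumed): L = int[log10(N)]."""
    return int(log10(n_samples))

# Roughly 91 and 102 years of daily values (illustrative counts only).
level_calibration = decomposition_level(91 * 365)
level_full_record = decomposition_level(102 * 365)
```

Both counts lie between 10^4 and 10^5, so either choice of N gives the same level under this reading.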
Fig. 3. Study area map (the extent of the figure is from the origin of the Bow River to the City of Calgary).
where N is the number of data to be decomposed. Hence, four levels of decomposition (L = 4) were determined herein. The decomposed signals, D1, D2, D3 and D4, as well as the approximation signal, A4, and the original flow data are plotted in Fig. 4.

Fig. 4. The daily flow, decomposed signals (D1, D2, D3 and D4), and the approximation signal (A4) based on data between 1912 and 2002.

5.2. Input variable identification

The development of effective data-driven models largely depends on the identification of input variables that govern the underlying physical processes (Bowden et al., 2005a,b). This requires some degree of a priori knowledge of the system to be modeled (Thirumalaiah and Deo, 2000). It is, however, difficult to select the initial input variables based on the underlying physical processes, especially in complex systems, and thereby analytical techniques such as auto- or cross-correlation are often employed (Sudheer et al., 2002). Although most such analytical techniques account only for the linear dependence between variables, they are mostly used for selecting input variables (Bowden et al., 2005a,b). This paper employed a heuristic approach suggested by Sudheer et al. (2002), in which the potential influencing variables corresponding to different time lags are identified through statistically analyzing the data series using cross-, auto-, and partial auto-correlations between variables. In this paper, the output variable was correlated significantly with $Q_t$ (current flow) and $Q_{t-1}$ (antecedent flow with a one-day time lag 't − 1'), which were thus selected as the input variables. The selected variables were fed into the ANNs, while their decomposed signals as well as their approximation signals were the inputs of the WNNs. Thus, the ANN-BB and the WNN-BB are represented as follows, respectively.

$$Q_{t+j} = f(Q_t, Q_{t-1}) \qquad (12)$$

$$Q_{t+j} = f(D1_t, D1_{t-1}, D2_t, D2_{t-1}, D3_t, D3_{t-1}, D4_t, D4_{t-1}, A4_t, A4_{t-1}) \qquad (13)$$

The subscript 'j' denotes the forecast lead time.

5.3. Network structure optimization

In both the ANN-BB and the WNN-BB, the MLP ANN of a single hidden layer was employed. The single layer of ANN is preferred for reducing the inherent model structure uncertainty by minimizing the number of network parameters (Kasiviswanathan et al., 2013). The number of hidden neurons, which is responsible for capturing the dynamic and complex relationship between input and output variables, is determined in the model calibration. Although complex optimization procedures for determining the number of hidden neurons are available in the literature (e.g., Maier et al., 2010), the simple trial and error method, which has been often employed (e.g., Srivastav et al., 2007; Kasiviswanathan and Sudheer, 2013), was adopted instead in this paper. The trial and error approach started with two hidden neurons initially, and the number of hidden neurons was increased by 1 neuron in each trial. For each set of hidden neurons, the network was trained in a batch mode (offline learning) to minimize the mean square error at the output layer. In order to avoid over-fitting in the model calibration, a cross validation was performed. The stopping criteria, mean square error (0.0001) and/or maximum epochs (10,000), were adopted in the training of ANN/WNN. The model calibration was stopped when there was no significant improvement in the model performance. The optimal network architecture 2-2-1 (neurons in the input layer – neurons in the hidden layer – neurons in the output layer) was determined for the ANN-BB; while the network architecture 10-2-1 was selected for the WNN-BB.

5.4. Block bootstrap convergence

In order to assess the model precision, this paper applied BB to sample different realizations of the input–output patterns during the model calibration and subsequently an ensemble of models, which have the same network architecture but different network parameters, was generated. This approach helps in improving model generalization by calibrating the model using different sets of input–output patterns.
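As a rough illustration of what one input–output pattern looks like for the ANN-BB, the sketch below pairs the inputs Q_t and Q_{t-1} of Eq. (12) with the target Q_{t+j}. The function name make_patterns, the toy flow values and the 2-day lead time are assumptions made for illustration only; they are not taken from the study.

```python
def make_patterns(flow, lead):
    """Pair inputs (Q[t], Q[t-1]) with the target Q[t+lead], as in Eq. (12).

    flow: list of daily flows; lead: forecast lead time j in days.
    """
    inputs, targets = [], []
    # t must satisfy t-1 >= 0 and t+lead <= len(flow)-1
    for t in range(1, len(flow) - lead):
        inputs.append((flow[t], flow[t - 1]))
        targets.append(flow[t + lead])
    return inputs, targets


# Hypothetical six-day flow record (m3/s), used only to show the indexing
flow = [10.0, 12.0, 15.0, 20.0, 18.0, 16.0]
inputs, targets = make_patterns(flow, lead=2)
```

The WNN-BB patterns of Eq. (13) would be built the same way, with each decomposed signal (D1 to D4 and A4) contributing its own pair of lagged values.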
To apply this calibration approach based on BB, the moving
length and the block length were determined using auto-
correlation of the calibration data set and the model performance
in the model calibration, respectively. As shown in Fig. 5, which
illustrates the calculated auto-correlation coefficients, there is a
1-year periodic cycle. To preserve the statistical characteristics
between each block, 1-year moving length was selected.
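One plausible reading of the moving length and block length can be sketched as follows: candidate blocks of a fixed number of years are shifted forward by the moving length, and each ensemble member is calibrated on a randomly drawn block. This is a minimal sketch under that assumption, not the authors' implementation; the helper names, the 30-year record length and the ensemble size of 5 are hypothetical.

```python
import random


def overlapping_blocks(n_years, block_len, step=1):
    """Return (start, end) year indices of blocks of block_len years,
    each shifted by step years (the assumed 'moving length')."""
    return [(s, s + block_len)
            for s in range(0, n_years - block_len + 1, step)]


def draw_blocks(blocks, n_members, seed=0):
    """Draw one block (with replacement) per ensemble member."""
    rng = random.Random(seed)
    return [rng.choice(blocks) for _ in range(n_members)]


# Hypothetical 30-year calibration record, 8-year blocks, 1-year moving length
blocks = overlapping_blocks(n_years=30, block_len=8, step=1)
members = draw_blocks(blocks, n_members=5)
```

Keeping whole years together inside a block is what preserves the 1-year periodicity seen in the auto-correlation, which an ordinary bootstrap of individual days would destroy.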
In the model calibration, the selection of the calibration dataset should ensure model generalization and consequently achieve appreciable results in the model validation. To maintain the statistical characteristics of the whole calibration data set when sampling the input–output patterns in the model development, the block length was determined by minimizing the RMSE of the models constructed using different block lengths, varying from 1 year up to 30 years, for the WNN-BB and ANN-BB, respectively. Fig. 6 displays the change of RMSE with respect to the block length for the WNN-BB and ANN-BB, respectively. It is evident in Fig. 6 that there is no significant variation in RMSE when the block lengths are longer than 8 and 15 years in the WNN-BB and ANN-BB, respectively. The outputs of the WNN-BB models with a block length of 8 years have an average RMSE of 29.37 m³/s and a standard deviation of ±2.05 m³/s; while the outputs of the ANN-BB models with a block length of 15 years have an average RMSE of 54.43 m³/s and a standard deviation of ±1.05 m³/s. The WNN-BB appears to produce considerably lower error than the ANN-BB does. Although the ANN-BB converges to the minimum RMSE when the block length is 15 years, there is no prominent variation in both the RMSE and its deviation when the block lengths are longer than 8 years. Therefore, the 8-year block length was used in the model development for both the WNN-BB and the ANN-BB. The results from the model development suggest that an arbitrarily selected block length without justification based on the model performance might yield less optimal results.

Fig. 5. Auto-correlation coefficients of the calibration data set.

Fig. 6. Variation of RMSE and its standard deviation with the block length. The dot and the line denote the mean and the standard deviation of RMSEs of the ensemble of the WNN-BB and ANN-BB, respectively.

6. Results and discussion

The WNN-BB and the ANN-BB developed previously were then used to conduct streamflow forecasting using the data set for the model validation. To examine the efficacy of the developed models under different flow regimes, three time periods, June 5–August 7 of 2005 (Flood I), May 9–August 25 of 2007 (Flood II), and May 24–August 7 of 2010 (Flood III), containing high, medium, and low flow peaks, respectively, were selected from the validation data set. The peak flows in the three selected time periods are 602 m³/s, 331 m³/s and 185 m³/s, respectively. As the model accuracy and precision in general degrade with the increase of the lead-time of forecasting, the results and discussion on the long lead-time forecasting (7-day) are provided in detail, while some results from the relatively short lead-time forecasting (1-day, 3-day, and 5-day) are also provided in the following sections.

6.1. Forecasting accuracy

The assessments were conducted for Floods I, II and III, respectively, based on the ensemble means, and the calculated error metrics for the 7-day lead-time forecasting are presented in Table 1. Fig. 7 further shows the simulated hydrographs along with the quantified ranges of forecasting for each flood, respectively. Fig. 8 illustrates the scatter plots of the forecasted ensemble means and observed flows along with the least square regression (LSR) lines.

Table 1
Calculated various amplitude errors in the model validation.

Performance     WNN-BB                             ANN-BB
metrics         Flood I    Flood II   Flood III    Flood I    Flood II   Flood III
R²              0.79       0.92       0.86         0.24       0.62       0.37
EI (%)          78.31      91.65      83.12        18.36      60.66      20.62
RMSE (m³/s)     49.21      19.31      12.70        95.47      41.92      27.54
MBE (m³/s)      1.78       4.22       0.10         20.98      7.44       −1.43
PI              0.88       0.81       0.79         0.14       0.13       0.02
EXI             0.77       0.84       0.89         0.55       0.24       0.49

As illustrated in Table 1, the WNN-BB outperforms the ANN-BB in a wide range of flow conditions, as the WNN-BB yields smaller RMSE and MBE and higher R², EI, PI and EXI than the ANN-BB does in all three floods. In general, both the WNN-BB and the ANN-BB underestimated the observations, as they yielded positive MBEs in the floods; however, the ANN-BB slightly overestimated the flows in Flood III, which has the minimum peak flow, as it yielded a negative MBE (−1.43 m³/s). The R² estimated from the WNN-BB were close to or above 0.90 in all three floods, which suggests that the developed model reproduces the variations in the observed flows fairly well. The low R² values produced by the ANN-BB are primarily due to the obvious phase errors as demonstrated in Fig. 7. The positive and close to one PIs, which are 0.88, 0.81, and 0.79 for the three Floods, respectively, resulting from the WNN-BB indicate that the model is capable of yielding appreciable results in the long lead-time forecasting. In contrast, the ANN-BB leads to very low PIs (less than 0.15) in all three Floods. Similarly, the positive and high EXIs (0.77, 0.84 and 0.89 for Floods I, II and III, respectively) obtained from the WNN-BB support that the model does not just forecast the flows by simply extrapolating the two most recent observations. However, the relatively low EXIs from the ANN-BB suggest that the ANN-BB has the tendency to memorize the antecedent flows, which might lead to low performance of the ANN-BB, especially in the long lead-time forecasting. All the results support that the decomposed wavelet signals can capture the dynamics/periodic variations of the original data set and consequently can lead to the superior performance of the WNN-BB. The superior performance of the WNN-BB becomes obvious when increasing the forecasting lead-time, whereas there is no prominent difference in the model performance between the WNN-BB and the ANN-BB in the short lead-time forecasting (results on 1-day, 3-day and 5-day forecasting not shown).

For different magnitudes of peak flow, the WNN-BB always yielded more accurate forecasts than the ANN-BB did. Fig. 7 clearly demonstrates that the ensemble means of the WNN-BB closely represent the observed flows, whereas the forecasted hydrographs, especially the timing of peak flows, of the ANN-BB appear to deviate significantly from the observed hydrographs in all three Floods. Compared with Floods II and III, the ensemble mean of the WNN-BB appears to obviously under-forecast the peak flow of Flood I; however, as illustrated in Fig. 7, the forecast ranges given by the WNN-BB successfully cover the observed peak flow. As displayed in Fig. 8, the LSR lines by the WNN-BB match with the 45° lines in all three Floods; whereas the LSR lines of the ANN-BB largely deviate from the 45° lines, which confirms the under-forecast of flow, especially high flows, by the ANN-BB. The ANN-BB, overall, has a tendency to underestimate the high flows, which can be clearly observed in Floods I and II. These results imply that ANNs might not be suitable for the long lead-time forecasting although they often produce appreciable results in the short lead-time forecasting. ANN-based models are suspected to over-fit the data in the long lead-time forecasting (Prakash et al., 2014). In addition, the WNN-BB produced more or less consistent performance in the three time periods; while the WNN-BB produced slightly better performance in Floods II and III compared to Flood I, which has the highest peak flow. This may be attributed to better model generalization of the developed model in the range from low to medium flows, which contains a large portion of observations in the calibration data set.

Fig. 7. Time series of 7-day lead-time forecasted and observed flows in Flood I (June 5–August 7, 2005), Flood II (May 9–August 25, 2007) and Flood III (May 24–August 7, 2010), respectively, in the model validation.

Fig. 8. The scatter plots of observed (Obs.) and 7-day lead-time forecasted (For.) flows in Floods I, II, and III, respectively, in the model validation. The red lines are the least square regression lines. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

In addition, the WNN-BB and the ANN-BB were compared in terms of reproducing the statistical characteristics of observations. Descriptive statistics, including mean, standard deviation, and skewness, calculated from the ensemble means of the WNN-BB and the ANN-BB for each Flood are presented in Table 2. The results demonstrate that the forecasted flows by the WNN-BB reproduced the statistical characteristics of the observations fairly well overall compared to the ANN-BB. Both the means of forecasted flows obtained from the WNN-BB and the ANN-BB appear to be equivalent and also closely match with observed means in all three Floods; however, there are considerable differences in the other two statistics, standard deviation and skewness, between these two models. Compared to the ANN-BB, the WNN-BB, in general, yielded standard deviation and skewness that are closer to those of the observations. One possible explanation of the results is that the amplitude and the shape of the observed hydrographs are well forecasted by the WNN-BB; but the phase error could result in the inferior performance of the ANN-BB in reproducing the observed hydrographs. These results suggest that the WNN-BB might be more efficient in capturing the dynamics of the observations.

Table 2
Comparison of the descriptive statistics between the observed flows and the forecasted flows by the WNN-BB and the ANN-BB for each Flood in the model validation.

Statistical     Flood I                      Flood II                     Flood III
index           Obs.     WNN-BB   ANN-BB     Obs.     WNN-BB   ANN-BB     Obs.     WNN-BB   ANN-BB
Mean (m³/s)     223.99   222.21   203.01     172.43   168.21   164.99     113.35   113.24   114.77
Std (a) (m³/s)  106.49   96.01    64.21      67.15    65.21    54.08      31.12    35.08    31.28
Skewness        1.50     0.76     0.87       0.68     0.63     0.33       1.02     0.99     0.70
(a) Std stands for standard deviation.

6.2. Phase error in forecasting

The indices, PVEo and PVEu, were calculated for each Flood for 1-day, 3-day, 5-day and 7-day lead-time forecasting, respectively, and the results are presented in Table 3. As expected, the magnitudes of both PVEo and PVEu tend to increase with the increase of the lead-time in both the WNN-BB and the ANN-BB. This implies that the phase error increases with the increase of the lead-time. The WNN-BB, in general, yielded more or less equivalent PVEo and PVEu in terms of magnitude but with different signs in all cases, which suggests the WNN-BB can reproduce the shape of the observed hydrograph fairly well although there is a phase error between forecasted and observed hydrographs (Case III in Fig. 1). The consistent forecast error in terms of PVEo and PVEu produced by the WNN-BB might imply that the model can be naive. However, the EXI values reported in Table 1 suggest that the WNN-BB model does not just extrapolate the previous values. In the forecasts produced by the ANN-BB, the magnitudes of PVEu are approximately two times those of PVEo in Floods I and II (Case II), but a similar result is not obtained in Flood III. Compared to the WNN-BB, the ANN-BB led to large phase error, especially in medium and high flow years (Floods I and II), as the magnitudes of the indices are higher than those of the WNN-BB in all Floods.

Table 3
Estimates of percent volumetric error by both the WNN-BB and the ANN-BB for various lead-times in the model validation.

Modeling    Lead-time   Flood I             Flood II            Flood III
approach    (day)       PVEo (%)  PVEu (%)  PVEo (%)  PVEu (%)  PVEo (%)  PVEu (%)
WNN-BB      1           4.12      4.02      1.19      1.47      1.19      1.75
            3           8.88      9.04      2.95      3.78      3.91      4.34
            5           12.00     13.01     4.34      7.32      5.40      5.34
            7           12.75     14.62     4.97      9.87      8.33      8.77
ANN-BB      1           5.13      9.44      2.29      3.53      2.95      4.65
            3           10.57     24.69     7.32      10.74     10.92     11.81
            5           12.98     29.29     9.47      16.72     16.21     11.97
            7           16.37     34.82     11.02     19.68     20.07     17.32

Compared on the metrics R², EI, RMSE, MBE, PI and EXI illustrated in Table 1, the WNN-BB overall outperformed the ANN-BB in the 7-day lead-time forecasting. As discussed previously, all these metrics are calculated based on the amplitude error, namely the magnitude difference between forecasted and observed values. To a certain degree, some of these metrics, such as RMSE and MBE, can imply the phase error in the forecasted hydrograph. However, RMSE cannot distinguish between Cases I and III (Fig. 1); namely, given similar RMSE magnitudes, the model might either over-/under-forecast or shift the hydrograph. On the other hand, MBE cannot distinguish Case III (Fig. 1) from perfect forecasting, as MBEs in these two cases are zero or very close to zero. In addition, the skewness, which is a measure of symmetry, cannot accurately capture the relative location of the forecasted hydrograph to the observed hydrograph when the magnitudes of the forecasts do not match with the observations fairly well, for example in Flood I (Table 2). The result of skewness may convey biased information, though the WNN-BB model performance was better than the ANN-BB in Flood I (Table 1 and Fig. 7). However, the skewnesses produced by the WNN-BB in Floods II and III are equivalent to those of the observations. Therefore, the proposed PVE, which offers useful information to identify the differences between the observed and forecasted hydrographs in terms of both the magnitude and the time lag, is shown to be another promising metric to assess the model accuracy.

6.3. Forecasting precision

The model prediction uncertainty was quantified using the ensemble of simulations obtained through BB sampling. As the long lead-time forecasting is challenging, this section focuses on the 7-day lead-time forecasting, while the results from 1-day, 3-day, and 5-day are not shown here. The POC and AW calculated by the WNN-BB and ANN-BB, respectively, for each Flood in the model validation are presented in Table 4. As illustrated in Table 4, the WNN-BB produced less uncertainty than the ANN-BB did, as it has higher POC but slightly lower AW in Floods II and III. These results can also be seen from Fig. 7, which shows a better coverage of the observations with comparable prediction interval when comparing the WNN-BB to the ANN-BB. However, the WNN-BB appeared to achieve the high POC by sacrificing AW in Flood I, which has the highest peak flow among the three Floods. The result might be ascribed to several possible reasons such as the selection of wavelet function, uncertainty associated with wavelet parameters, inadequate model structure, and input uncertainty. However, the understanding of hydrologic uncertainty is still beyond current available knowledge of the system and interaction among various physical phenomena (Zhang et al., 2009).
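To make the two precision indices reported in Table 4 concrete, the sketch below computes them from an ensemble, assuming that POC is the percentage of observations falling inside the ensemble band and AW is the average width of that band. Taking the band as the ensemble minimum and maximum at each time step is an assumption of this sketch (the study may use other bounds), and the function name and toy numbers are hypothetical.

```python
def poc_and_aw(observed, ensemble):
    """Return (POC in %, AW in flow units).

    observed: list of observed flows.
    ensemble: list of member forecasts, each as long as observed.
    The band is assumed to be the ensemble min/max at each time step.
    """
    lower = [min(vals) for vals in zip(*ensemble)]
    upper = [max(vals) for vals in zip(*ensemble)]
    inside = sum(lo <= obs <= up
                 for obs, lo, up in zip(observed, lower, upper))
    poc = 100.0 * inside / len(observed)
    aw = sum(up - lo for lo, up in zip(lower, upper)) / len(observed)
    return poc, aw


# Toy example: 4 time steps, 3 ensemble members (values in m3/s)
obs = [100.0, 150.0, 200.0, 120.0]
ens = [[90.0, 140.0, 170.0, 110.0],
       [105.0, 155.0, 180.0, 125.0],
       [98.0, 149.0, 190.0, 130.0]]
poc, aw = poc_and_aw(obs, ens)
```

In this toy case the third observation lies above every member, so a wider band would raise POC only at the cost of a larger AW, which is the trade-off noted for Flood I.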
Table 4
Prediction uncertainties estimated in each Flood in the 7-day lead-time forecasting by the WNN-BB and ANN-BB, respectively, in the model validation.

Uncertainty     WNN-BB                             ANN-BB
index           Flood I    Flood II   Flood III    Flood I    Flood II   Flood III
POC (%)         53.13      52.78      33.33        32.81      35.19      21.33
AW (m³/s)       97.34      23.03      12.94        50.00      27.02      16.40

The histograms of the forecasted peak flows along with the observed peak flows for each Flood are presented in Fig. 9 for the WNN-BB and the ANN-BB, respectively. It can be seen that the ranges of the forecasts by the WNN-BB encompass the observed peak flows in all events; whereas the peak flows consistently fall outside of the interval bounds of forecasts by the ANN-BB. The under-forecast of peak flows by the ANN-BB can also be seen from Fig. 7. By closely observing Fig. 7, the under-forecast of peak flows by the ANN-BB might be ascribed to the fact that the ANN-BB has a tendency to produce a large phase error in the long lead-time forecasting as discussed previously.

Fig. 9. Histograms of forecasted peak flows in each Flood by the WNN-BB and the ANN-BB, respectively. The blue points indicate the observed peak flows. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

In water resource management, the decision makers always desire accurate prediction with narrow prediction interval, when further analysis and management is needed based on modeling/forecasting results. Based on the results in comparing the WNN-BB and the ANN-BB, it can be concluded that the WNN-BB, in general, produces more accurate and precise forecasts, in particular in the long lead-time forecasting.

6.4. Peak flow forecasting

From the flood management perspective, for instance flood early warning, it is crucial to accurately forecast the timing of the peak flow besides the magnitude of the peak flow. Several errors were calculated based on the observed and forecasted peaks. The peak error (PE), which is the percentage of difference between the observed peak and the ensemble mean of forecasted peak to the observed peak, is less than ±3.5% in Floods II and III in the WNN-BB; whereas the PE of Flood I is approximately 30%. In contrast, the ANN-BB resulted in much higher PEs, for example, a PE of 63.56% in Flood I. In addition, the ANN-BB appeared not to be capable of forecasting the timing of the peak flows, as there was always a prominent time lag (4–6 days) between the forecasted and observed peak flows in the 7-day lead-time forecasting. The WNN-BB, which yielded less than 2-day time lag in all events, is obviously superior to the ANN-BB from this perspective. These results are consistent with the results of PVE, which demonstrate relatively large error in the ANN-BB compared to the WNN-BB.

Based on the results of model accuracy and model precision discussed previously, and peak flow forecasting, overall, the WNN-BB performed better than the ANN-BB in most criteria. The superior performance of the WNN-BB can be ascribed to the decomposition of input variables, which aids in capturing the non-linear dynamics of the processes to be modeled in terms of both magnitude and timing. However, in flow forecasting, the model accuracy and precision appeared to degrade in high flow years (Flood I) compared to low to medium flow years (Floods II and III). A flow data set commonly has very sparse data points located in the high flow region, which consequently compromises the model generalization in modeling/forecasting high flows.

6.5. Forecasting extremely high flow

The foregoing discussions argue the applicability of the WNN-BB for the long lead-time forecasting. The developed WNN-BB (for the 7-day lead-time forecasting) was also applied to forecast the flows in 2013, when the highest peak flow of 1750 m³/s was recorded in Alberta's history, and the results are illustrated in Fig. 10. As shown in this figure, the WNN-BB forecasted the peak flow of 750 m³/s against the observed peak flow of 1750 m³/s, which is an extreme outlier in the whole data set. Although the model cannot forecast the extremely high flow, it can reproduce the low and medium flows reasonably. While observing the flows measured at Calgary over the historical period, the mechanisms of generating peak flows, which can be rain-runoff, snow melting, and the combination of rain-runoff and snow melting, vary from year to year. In addition, as discussed previously, the lack of data in the high flow region can also be blamed for the inferior performance of data-driven models in forecasting high flows, as the models generally perform well within the range where they are sufficiently calibrated (Minns and Hall, 1996; Tokar and Johnson, 1999). Thus, data-driven models might not be able to accurately forecast/predict high and extremely high flows; however, the models might be further enhanced by adding other influencing factors, such as weather forecasting, as input variables into the models. The future study on integrating physically-based and/or empirical models with data-driven models is also recommended for improving the forecast accuracy and precision for high and extreme flows.

Fig. 10. The scatter plot of observed (Obs.) and forecasted (For.) flows using the developed WNN-BB for the 7-day lead-time forecasting in 2013.

7. Summary and conclusions

Ensemble data-driven modeling approaches, including the WNN-BB and the ANN-BB, were presented and illustrated in forecasting flows at Calgary on the Bow River, Alberta, especially for the long lead-time forecasting (here 7-day lead-time). These two modeling approaches were evaluated and compared in terms of both the model accuracy and the model precision using several statistical metrics. According to the evaluation metrics including R², EI, RMSE, MBE, PI and EXI for model accuracy and POC and AW for the model precision, the WNN-BB appeared to always outperform the ANN-BB, especially when forecasting high flows in the long lead-time forecasting. The newly proposed index, PVE, was shown to have the potential to evaluate the phase error between the modeled and observed hydrographs, rather than only focusing on the amplitude error. The proposed BB, which is capable of determining the optimal block length in the model calibration, was shown to be effective in ensuring quality of the model calibration. The WNN-BB also yielded low phase error represented by PVE. Therefore, the WNN-BB might be, overall, preferable for flow forecasting, especially for the long lead-time forecasting. The better performance of the WNN-BB might be attributed to the detailed information encompassed into the decomposed multi-level wavelet signals of input variables, which can effectively diagnose the main frequency component of the original signal and abstract the local information of a time series. However, the results also revealed the limitations in the use of the data-driven models for forecasting high and extremely high flows, as the capability of data-driven models is always constrained by the availability of high flow data for model development. Therefore it is recommended that a hybrid modeling approach, which integrates the data-driven models with physically-based/conceptual models and/or empirical relationships between high flows and influencing factors, might be required to further enhance forecasting of high flows.

Acknowledgements

The authors would like to thank the University of Calgary (Eyes High Program), Canada, for the financial support of this study. The authors would also like to thank the anonymous reviewers for their insightful comments.

References

Abrahart, R.J., Heppenstall, A.J., See, L.M., 2007. Timing error correction procedure applied to neural network rainfall–runoff modelling. Hydrol. Sci. J. 52, 414–431. http://dx.doi.org/10.1623/hysj.52.3.414.
Adamowski, J., Sun, K., 2010. Development of a coupled wavelet transform and neural network method for flow forecasting of non-perennial rivers in semi-arid watersheds. J. Hydrol. 390, 85–91. http://dx.doi.org/10.1016/j.jhydrol.2010.06.033.
Adkison, M., Peterman, R.M., 1996. Results of Bayesian methods depend on details of implementation: an example of estimating salmon escapement goals. Fish. Res. 25, 155–170.
Alvisi, S., Franchini, M., 2011. Fuzzy neural networks for water level and discharge forecasting with uncertainty. Environ. Model. Softw. 26, 523–537. http://dx.doi.org/10.1016/j.envsoft.2010.10.016.
Benaouda, D., Murtagh, F., Starck, J.L., Renaud, O., 2006. Wavelet-based nonlinear multiscale decomposition model for electricity load forecasting. Neurocomputing 70, 139–154. http://dx.doi.org/10.1016/j.neucom.2006.04.005.
Boucher, M.A., Laliberté, J.P., Anctil, F., 2010. An experiment on the evolution of an ensemble of neural networks for streamflow forecasting. Hydrol. Earth Syst. Sci. 14, 603–612.
Bowden, G.J., Maier, H.R., Dandy, G.C., 2005a. Input determination for neural network models in water resources applications. Part 1 – Background and methodology. J. Hydrol. 301, 75–92. http://dx.doi.org/10.1623/hysj.54.5.852.
Bowden, G.J., Maier, H.R., Dandy, G.C., 2005b. Input determination for neural network models in water resources applications. Part 2 – Case study: forecasting salinity in a river. J. Hydrol. 301, 93–107. http://dx.doi.org/10.1016/j.jhydrol.2004.06.020.
Cannas, B., Fanni, A., See, L., Sias, G., 2006. Data preprocessing for river flow forecasting using neural networks: wavelet transforms and data partitioning. Phys. Chem. Earth 31, 1164–1171. http://dx.doi.org/10.1016/j.pce.2006.03.020.
Chang, F.-J., Chen, P.-A., Lu, Y.-R., Huang, E., Chang, K.-Y., 2014. Real-time multistep-ahead water level forecasting by recurrent neural networks for urban flood control. J. Hydrol. 517, 836–846. http://dx.doi.org/10.1016/j.jhydrol.2014.06.013.
Chen, P.A., Chang, L.C., Chang, F.J., 2013. Reinforced recurrent neural networks for multi-step-ahead flood forecasts. J. Hydrol. 497, 71–79. http://dx.doi.org/10.1016/j.jhydrol.2013.05.038.
Crochemore, L., Perrin, C., Andréassian, V., Ehret, U., Seibert, S.P., Grimaldi, S., Gupta, H., Paturel, J.-E., 2015. Comparing expert judgement and numerical criteria for hydrograph evaluation. Hydrol. Sci. J. 60 (3), 402–423.
Ebtehaj, M., Moradkhani, H., Gupta, H.V., 2010. Improving robustness of hydrologic parameter estimation by the use of moving block bootstrap resampling. Water Resour. Res. 46, 1–14. http://dx.doi.org/10.1029/2009WR007981.
Golding, B., 2009. Long lead time flood warnings: reality or fantasy? Meteorol. Appl. 16, 3–12.
Hall, P., Horowitz, J.L., Jing, B.Y., 1995. On blocking rules for the bootstrap with dependent data. Biometrika 82, 561–574. http://dx.doi.org/10.2307/2337534.
Han, D., Chan, L., Zhu, N., 2007. Flood forecasting using support vector machines. J. Hydroinformatics 9 (4), 267–276.
Kasiviswanathan, K.S., Cibin, R., Sudheer, K.P., Chaubey, I., 2013. Constructing prediction interval for artificial neural network rainfall runoff models based on ensemble simulations. J. Hydrol. 499, 275–288. http://dx.doi.org/10.1016/j.jhydrol.2013.06.043.
Kasiviswanathan, K.S., Sudheer, K.P., 2013. Quantification of the predictive uncertainty of artificial neural network based river flow forecast models. Stoch. Environ. Res. Risk Assess. 27, 137–146. http://dx.doi.org/10.1007/s00477-012-0600-2.
Kisi, O., Shiri, J., Tombul, M., 2013. Modeling rainfall-runoff process using soft computing techniques. Comput. Geosci. 51, 108–117. http://dx.doi.org/10.1016/j.cageo.2012.07.001.
Kitanidis, P.K., Bras, R.L., 1980. Real-time forecasting with a conceptual hydrologic model: 2. Applications and results. Water Resour. Res. 16, 1034–1044. http://dx.doi.org/10.1029/WR016i006p01034.
Liu, Y., Brown, J., Demargne, J., Seo, D.J., 2011. A wavelet-based approach to assessing timing errors in hydrologic predictions. J. Hydrol. 397, 210–224. http://dx.doi.org/10.1016/j.jhydrol.2010.11.040.
Maier, H.R., Jain, A., Dandy, G.C., Sudheer, K.P., 2010. Methods used for the development of neural networks for the prediction of water resource variables in river systems: current status and future directions. Environ. Model. Softw. 25, 891–909. http://dx.doi.org/10.1016/j.envsoft.2010.02.003.
Mallat, S.G., 1989. A theory for multi-resolution signal decomposition: the wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell. 11 (7), 674–693.
Minns, W., Hall, M.J., 1996. Artificial neural networks as rainfall-runoff models. Hydrol. Sci. J. 41, 399–417.
Nayak, P.C., Sudheer, K.P., Rangan, D.M., Ramasastri, K.S., 2005. Short-term flood forecasting with a neurofuzzy model. Water Resour. Res. 41, 1–16. http://dx.doi.org/10.1029/2004WR003562.
Nourani, V., Kisi, Ö., Komasi, M., 2011. Two hybrid Artificial Intelligence approaches for modeling rainfall–runoff process. J. Hydrol. 402, 41–59. http://dx.doi.org/10.1016/j.jhydrol.2011.03.002.
Nourani, V., Komasi, M., Mano, A., 2009. A multivariate ANN-wavelet approach for rainfall–runoff modeling. Water Resour. Manage. 23, 2877–2894. http://dx.doi.org/10.1007/s11269-009-9414-5.
Partal, T., Kişi, Ö., 2007. Wavelet and neuro-fuzzy conjunction model for precipitation forecasting. J. Hydrol. 342, 199–212. http://dx.doi.org/10.1016/j.jhydrol.2007.05.026.
Politis, D.N., White, H., 2004. Automatic block-length selection for the dependent bootstrap. Econ. Rev. 23, 53–70. http://dx.doi.org/10.1081/ETC-120028836.
Prakash, O., Sudheer, K.P., Srinivasan, K., 2014. Improved higher lead time river flow forecasts using sequential neural network with error updating. J. Hydrol. Hydromech. 62, 60–74. http://dx.doi.org/10.2478/johh-2014-0010.
Sang, Y.-F., Wang, Z., Liu, C., 2015. Wavelet neural modeling for hydrologic time series forecasting with uncertainty evaluation. Water Resour. Manage. http://dx.doi.org/10.1007/s11269-014-0911-9.
Seo, Y., Kim, S., Kisi, O., Singh, V.P., 2015. Daily water level forecasting using wavelet decomposition and artificial intelligence techniques. J. Hydrol. 520, 224–243. http://dx.doi.org/10.1016/j.jhydrol.2014.11.050.
Shamseldin, A.Y., O'Connor, K.M., 2001. A non-linear neural network technique for updating of river flow forecasts. Hydrol. Earth Syst. Sci. 5, 577–598. http://dx.doi.org/10.5194/hess-5-577-2001.
Shiri, J., Kisi, O., 2010. Short-term and long-term streamflow forecasting using a wavelet and neuro-fuzzy conjunction model. J. Hydrol. 394, 486–493. http://dx.doi.org/10.1016/j.jhydrol.2010.10.008.
Srivastav, R.K., Sudheer, K.P., Chaubey, I., 2007. A simplified approach to quantifying predictive and parametric uncertainty in artificial neural network hydrologic models. Water Resour. Res. 43, 1–12. http://dx.doi.org/10.1029/2006WR005352.
Sudheer, K.P., Gosain, A.K., Ramasastri, K.S., 2002. A data-driven algorithm for constructing artificial neural network rainfall-runoff models. Hydrol. Process. 16, 1325–1330. http://dx.doi.org/10.1002/hyp.554.
Thirumalaiah, K., Deo, M.C., 2000. Hydrological forecasting using neural networks. J. Hydrol. Eng. 5 (2), 180–189.
Tiwari, M.K., Chatterjee, C., 2010. Development of an accurate and reliable hourly flood forecasting model using wavelet–bootstrap–ANN (WBANN) hybrid approach. J. Hydrol. 394, 458–470. http://dx.doi.org/10.1016/j.jhydrol.2009.12.013.
Tokar, B.A.S., Johnson, P.A., 1999. Rainfall-runoff modeling using artificial neural networks. J. Hydrol. Eng. 4 (3), 232–239.
Wang, W., Ding, J., 2003. Wavelet network model and its application to the prediction of hydrology. Nat. Sci. 1 (1), 67–71.
Zhang, X.S., Liang, F.M., Srinivasan, R., Van Liew, M., 2009. Estimating uncertainty of streamflow simulation using Bayesian neural networks. Water Resour. Res. 45. http://dx.doi.org/10.1029/2008wr007030.

