Gis Ann

Environ Earth Sci (2012) 67:251–264
DOI 10.1007/s12665-011-1504-z
ORIGINAL ARTICLE
An artificial neural network model for flood simulation using GIS: Johor River Basin, Malaysia

Masoud Bakhtyari Kia · Saied Pirasteh · Biswajeet Pradhan · Ahmad Rodzi Mahmud · Wan Nor Azmin Sulaiman · Abbas Moradi

Received: 8 November 2010 / Accepted: 13 December 2011 / Published online: 31 December 2011
© Springer-Verlag 2011
M. B. Kia · B. Pradhan (✉) · A. R. Mahmud
Institute of Advanced Technology (ITMA), University Putra Malaysia (UPM), 43400 Serdang, Selangor Darul Ehsan, Malaysia
e-mail: biswajeet@mailcity.com; biswajeet24@gmail.com

S. Pirasteh
Dezful Branch, Islamic Azad University, Dezful, Iran

W. N. A. Sulaiman · A. Moradi
Faculty of Environmental Studies, University Putra Malaysia, Serdang, Malaysia

Electronic supplementary material: The online version of this article (doi:10.1007/s12665-011-1504-z) contains supplementary material, which is available to authorized users.

Abstract Flooding is one of the most destructive natural hazards that cause damage to both life and property every year, and therefore the development of a flood model to determine the inundation area in watersheds is important for decision makers. In recent years, data mining approaches such as artificial neural network (ANN) techniques are being increasingly used for flood modeling. Previously, the ANN method was frequently used for hydrological and flood modeling by taking rainfall as input and runoff data as output, usually without taking into consideration other flood causative factors. The specific objective of this study is to develop a flood model using various flood causative factors, ANN techniques and a geographic information system (GIS) to model and simulate flood-prone areas in the southern part of Peninsular Malaysia. The ANN model for this study was developed in MATLAB using seven flood causative factors. Relevant thematic layers (including rainfall, slope, elevation, flow accumulation, soil, land use, and geology) are generated using GIS, remote sensing data, and field surveys. In the context of objective weight assignments, the ANN is used to directly produce water levels and then the flood map is constructed in GIS. To measure the performance of the model, four performance criteria, including the coefficient of determination (R²), the sum squared error, the mean square error, and the root mean square error, are used. The verification results showed satisfactory agreement between the predicted and the real hydrological records. The results of this study could be used to help local and national government plan for the future and develop new infrastructure, appropriate to the local environmental conditions, to protect the lives and property of the people of Johor.

Keywords Flood modeling · Neural network · GIS · Remote sensing · Spatial modeling · Johor · Malaysia

Introduction

Globally, floods are frequent natural disasters that cause severe damage to both lives and property. It is estimated that of the total economic loss caused by all kinds of disasters, 40% is due to flooding (Feng and Lu 2010). Recurring floods are a severe problem in Malaysia, causing many deaths. In Malaysia, floods are the most important natural hazard in terms of population affected, frequency, areal extent, and socio-economic damage (Pradhan and Youssef 2011). According to the Department of Irrigation and Drainage (DID), 9% of the land area (29,800 km²) in the country is prone to flood, 22% of the population (4.82 million) is affected by floods (Fig. 1), and the average annual flood damage is about USD 0.3 billion (Pradhan 2010).

In Malaysia, there are many reports available on flooding since the 1920s. According to DID, Malaysia has
experienced major floods in the past decades and most recently in December 2006 and January 2007, which severely affected Johor state. In fact, during the recent 2006–2007 floods in Johor a couple of abnormally heavy rainfall events caused massive flooding. The estimated total cost in terms of loss of property of these flood disasters amounted to USD 0.5 billion, making them one of the most costly flood events in Malaysian history. At the peak of the recent Johor flood, around 110,000 people were evacuated and moved to relief centers and the death toll was 18 lives.

Fig. 1 a Flood-prone area in west of Malaysia (Hassan and Ghani 2006) and b location of the study area

Despite the large costs of managing and controlling these floods in Malaysia, flooding occurs repeatedly every year in various parts of the country, destroying property and killing people (Pradhan 2010a). Even though flood
events are unavoidable (Maidment 2002; Youssef et al. 2011), to reduce the losses, authorities and the general public should know when and where the next flood is going to happen and what areas are going to be inundated due to such events. Therefore, there is a need to develop an accurate technique for flood forecasting in order to prevent future disasters.

Hydrologists have a long history of research into constructing models of hydrological processes. It is the goal of flood modeling to provide timely and accurate estimates of future discharge conditions at specific watershed locations. The wide variety of forecasting techniques used by hydrologists today includes physically based rainfall-runoff modeling techniques, data-driven techniques, and varying degrees of combination of both, with forecasts ranging in scale from short-term, involving a number of hours, through to long-term, involving a number of months or years (Smith and Ward 1998).

Although hydrologists have used many models to predict flooding, the problem remains. Some of the models cannot cope with dynamic changes inside the watersheds. Some models are too difficult to calibrate and need robust optimization tools, and some models require an understanding of the physical processes inside the basin. These problems have led to the exploration of a more data-driven approach (Varoonchotikul 2003).

In recent years, there have been many studies on flood susceptibility and hazard mapping using remote sensing data and GIS tools. Radar remote sensing data have been extensively used for flood monitoring across the globe (Hess et al. 1995, 1990; Pradhan and Shafie 2009; Pirasteh et al. 2010) and many of these studies have applied probabilistic methods (Farajzadeh 2001, 2002; Horritt and Bates 2002; Pradhan and Shafie 2009). Hydrological and stochastic rainfall methods for flood susceptibility mapping have been employed in other areas (Blazkova and Beven 1997; Cunderlik and Burn 2002). Flood susceptibility mapping using GIS and neural network methods has also been applied in various case studies (Islam and Sado 2001, 2002; Dixon 2005).

Therefore, to increase the precision of flood models and to cope with some of the above limitations, in recent years, several hydrological studies have used new techniques such as ANN, fuzzy logic and neuro-fuzzy to make flood predictions (Dixon 2005). These techniques are capable of dealing with uncertainties in the inputs and can extract information from incomplete or contradictory datasets (Rashid et al. 1992; Pradhan 2010a, b, c, 2011a, b; Pradhan et al. 2006, 2010a, b, c, d; Rogers et al. 1995; Pradhan and Youssef 2010; Oh and Pradhan 2011; Sezer et al. 2011; Lorrai and Sechi 1995; Tamari et al. 1996; Woldt et al. 1996; Holger and Dandy 1996; Zhu et al. 1997; Schaap et al. 1998; Lin et al. 1999; Ray and Klindworth 2000; See and Openshaw 2000; Liu and Chandrashekar 2000; Dixon 2005). These new methods are frequently developed for hydrological and flood modeling with only rainfall and runoff as input and output, usually without taking into consideration other flood causative factors. The specific objective of this study is to develop a flood model using various flood causative factors, the ANN technique and GIS to model and simulate flood-prone areas in the Johor River Basin, Malaysia.

This paper is presented in seven major sections: first, the Introduction; followed by Study area, which presents the data and methodology used; third, The data set, which presents the background of the artificial neural network model; then Flood simulation, which presents the application of the neural network in the Johor River Basin; followed by Flood map generated by ANN, which consists of the main results and discussion; followed by Model performance assessments; and finally, Discussion and conclusions, which summarizes the conclusions.

Study area

The Johor River Basin is located in the south-east of Peninsular Malaysia (Fig. 1). The river originates from Gunung Belumut (at an elevation of 1,010 m) and Bukit Gemuruh (at an elevation of 109 m) in the north of the basin. It flows south-east to discharge into the Johor Straits. Natural forest and lowland swamps are the dominant land cover in the northern and central parts, and oil palm and rubber plantations with swamps occupy the southern part of the basin. The most important population center is Kota Tinggi, the administrative center of the Kota Tinggi District. Average annual rainfall is 2,500 mm and the mean annual discharge at Rantau Panjang station (1,130 km²) is 37.7 m³/s. The catchment area of the Johor River at Kota Tinggi is about 1,620 km². The main tributaries of this river are Sayong, Linggiu, Semanggar, Tiram and Lebam.

The data set

Like most modeling methods, the techniques used in this research are based on the well-known principle that "past and today are keys to the future". In order to develop the flood model, understanding and determining the flood causative factors is crucial for this study area. These factors are selected based on the knowledge acquired from a literature review and from previous research such as the United Nations Environment Program (UNEP 2002), Kingma (2002), Smith and Ward (1998), World Meteorological Organization (WMO 2008), and field studies. Although many factors may be important with respect to the
flood occurrence for a particular region, the same factors may not be important for other regions (Pradhan 2010a).

Hence, different thematic data layers corresponding to causative and intensifying flood factors, namely, topography, topographic slope, soil, land cover/land use, lithology, and drainage were prepared as input for this study. Besides the intense rainfall, these factors are classified as direct and indirect causative factors in floods. They are considered as being responsible for flood occurrence in the study area.

DEM and its derivatives

Topography, as an intensifying factor, plays an important role in flood severity and in the determination of a flood-prone area. On one hand, topographic factors have a direct effect on flow size and runoff velocity. On the other hand, river flood-prone areas mostly have low elevation and also slight topographic slope.

Digital elevation models (DEM) are an excellent source from which to derive the topographic factors responsible for flood activity in a region. Because the results of the flooding model have to be shown on the DEM to define flood-prone areas, the DEM must have appropriate accuracy (Pradhan 2009). Therefore, a DEM has been generated from contours on 1:25,000 topographic maps (Fig. 2a).

Topographic slope is defined as the angle between the surface and a horizontal datum. It determines how strongly gravity induces runoff and controls its velocity. Therefore, this factor is very important in hydrology (Gomez and Kavzoglu 2005). Although steeper slopes produce more rapid flows, floods tend to occur on gentle slopes. The slope angle for the Johor basin was derived from the DEM and divided into four classes (Fig. 2b). Nearly 64% of the Johor basin has slope angles ranging from 0° to 5°. The mean slope angle is 6° while the maximum slope angle is 49°.

Many of the floods occur in areas of high drainage density due to the accumulation of a large quantity of water. To build this layer, a drainage data layer has been prepared from the DEM with the flow direction for each filled DEM cell, one of the keys to deriving the hydrologic characteristics of a surface. This function directs the flow out of each cell (Fig. 2c).

The values in the resultant pixels of the flow direction grid indicate the direction of the steepest descent from that pixel. Later, the flow accumulation was computed as the accumulated number of upstream pixels. The results of flow accumulation can be used to create a stream network by classification of pixel values (Fig. 2d).

Geology

The watershed has different geological formations. The major rock units, which occupy about 67% of the study area, are granite, adamellite, and minor granodiorite (source: Department of Mineral & Geosciences, Malaysia). Riverine and swamp alluvium, colluvium, sand, silt and clay with some gravel occupy about 10% of the study area along the river valleys. The center of the watershed and the area around the Linggiu Dam lake are covered by massive cross-bedded sandstone with intercalations of maroon and greenish gray mudstone, and grit (about 7%). Sandstone, siltstone, conglomerate, shale, tuff and lava are found in the south-east part of the lake and cover about 4.2%. Acid to intermediate pyroclastics, lava, shale and microadamellite are other geological formations in this area. The geological map is shown in Fig. 2e.

Soil types

The study area is characterized by five different types of soil series. The most dominant soil series of the study area is Ultisols. This soil occupies about 73% of the study area. Steep lands comprise about 16.5% of the study area and are found in the north and north-east part of the watershed. Entisols cover about 8.3% and are found mainly along the stream valleys. Main lands are found in the eastern part of the watershed center, whereas small patches of Oxisols are found in the south-east part of the study area. The spatial distribution of the major soil series in the study area is shown in Fig. 2f.

Land use

Land use and type of land cover are also key factors responsible for flood incidence. The occurrence of flooding is inversely related to the vegetation density. Rain falling on barren slopes runs over the surface more rapidly than in forested areas. Consequently, some land use areas (for instance, those with a high percentage of cropland or urban land use) yield more storm runoff in comparison with similar areas covered by grassland or forest. To define this factor, a layer was prepared in which seven dominant land use/land cover classes, namely forest, agriculture, mine area, built up, water bodies, barren area and swamp, have been considered (Fig. 2g).

Rainfall and runoff data

To develop a flood model, rainfall and simultaneous runoff data were obtained from seven rain gauges and one water level station within the Johor River Basin area; the summary of the rain gauge data is shown in Table 1. In total, 267 samples of the highest peak discharges (flood events) and their rainfalls, as published by the DID, were selected.
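The DEM derivatives described in this section (slope angle, flow direction along the steepest descent, and flow accumulation as the count of upstream pixels) can be illustrated with a minimal NumPy sketch. This is not the GIS workflow used in the study: the function names, the eight-neighbour (D8) scheme, and the toy grid are assumptions chosen for illustration, and the DEM is assumed to be already filled.

```python
import numpy as np

# Eight-neighbour (D8) offsets as (row, col); an assumed encoding, not the study's.
D8_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def slope_degrees(dem, cell_size):
    """Slope angle in degrees between the surface and a horizontal datum."""
    dz_dy, dz_dx = np.gradient(dem.astype(float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def d8_flow_direction(dem, cell_size):
    """Index into D8_OFFSETS of the steepest-descent neighbour, or -1 if none is lower."""
    rows, cols = dem.shape
    direction = np.full((rows, cols), -1, dtype=int)
    for r in range(rows):
        for c in range(cols):
            best_drop = 0.0
            for k, (dr, dc) in enumerate(D8_OFFSETS):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    distance = cell_size * np.hypot(dr, dc)
                    drop = (dem[r, c] - dem[rr, cc]) / distance
                    if drop > best_drop:
                        best_drop = drop
                        direction[r, c] = k
    return direction

def flow_accumulation(dem, direction):
    """Number of upstream cells draining through each cell."""
    rows, cols = dem.shape
    acc = np.zeros((rows, cols), dtype=int)
    # Visit cells from highest to lowest so upstream totals are final before use.
    for flat in np.argsort(dem, axis=None)[::-1]:
        r, c = divmod(int(flat), cols)
        k = direction[r, c]
        if k >= 0:
            dr, dc = D8_OFFSETS[k]
            acc[r + dr, c + dc] += acc[r, c] + 1
    return acc

# Toy DEM: a plane dropping 1 m per 1 m cell towards the east.
dem = np.array([[3.0, 2.0, 1.0]] * 3)
slope = slope_degrees(dem, 1.0)
direction = d8_flow_direction(dem, 1.0)
accumulation = flow_accumulation(dem, direction)
```

On the toy plane every cell has a 45° slope, flow runs east, and the accumulation grows from 0 to 2 across each row. Thresholding such an accumulation grid is one way to extract a stream network, as the text notes.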
Fig. 2 Input thematic layers: a DEM, b slope angle, c flow direction, d flow accumulation, e geology, f soil types, and g land use
Table 1 Summary of the rain gauges of Johor River Basin

Station no.                    1737001    1834001    1833123    1834122    1835001    1836001    1739002
Mean annual rainfall (mm)      1,946.8    1,844.6    2,111.1    2,058.9    2,601.2    2,285      2,345.3
Minimum annual rainfall (mm)   1,251      865.2      1,225      1,355.5    1,847.5    1,291      1,292
Maximum annual rainfall (mm)   2,613.9    2,929      3,014      2,575.7    5,152.5    3,274      3,178
Period of record               1987–2005  1989–2005  1987–2006  1987–2006  1987–2006  1987–2006  1987–1997
Flood simulation

Artificial neural networks

ANNs are mathematical models of human perception that can be trained to perform a particular task based on available empirical data. When the relationships between data are unknown, they can be a powerful tool for modeling (Lek et al. 1996; Lek and Guegan 1999; Mas 2004; Pradhan and Buchroithner 2010; Pradhan and Lee 2009, 2010a, b, c; Pradhan and Pirasteh 2010). The theory and mathematical basis of ANNs are explained in detail by many researchers (Bishop 1995; Haykin 1999). Therefore, only a brief description is presented here.

As shown in Fig. 3, an ANN includes a number of neurons or nodes that work in parallel to transform the input data into output categories. Typically, an ANN consists of three kinds of layers, namely input, hidden and output. Each layer, depending on the specific application in a network, has some neurons. Each neuron is connected to other neurons in the next consecutive layer by direct links. These links have a weight that represents the strength of the outgoing signal (Atkinson and Tatnall 1997; Varoonchotikul 2003).

The input layer receives the data from different sources (e.g., thematic layers). Hence, the number of neurons in the input layer depends on the number of input data sources. The data are actively processed in the hidden and output layers. The number of hidden layers and their neurons are often defined by trial and error (Atkinson and Tatnall 1997). The number of neurons in the output layer is fixed by the application and is represented by the class being processed. Each hidden neuron responds to the weighted inputs it receives from the connected neurons in the preceding input layer. Once the combined effect on each hidden neuron is determined, the activation at this neuron is determined via a transfer function. Many differentiable nonlinear functions are available as a transfer function. Since the sigmoid function enables a network to map any nonlinear process, most networks of practical interest make use of it (Bishop 1994; ASCE Task Committee 2000).

A typical artificial neuron and the modeling of a multilayered neural network are illustrated in Fig. 3. Referring to Fig. 3, the signal flow from the inputs $x_1, \ldots, x_n$ is considered to be unidirectional, as indicated by the arrows, as is a neuron's output signal flow ($O$). The neuron output signal $O$ is given by the following relationship:

$$O = f(\mathrm{net}) = f\left(\sum_{j=1}^{n} w_j x_j\right) \qquad (1)$$

where $w_j$ are the components of the weight vector, and the function $f(\mathrm{net})$ is referred to as an activation (transfer) function. The variable net is defined as a scalar product of the weight and input vectors,

$$\mathrm{net} = w^{T} x = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n \qquad (2)$$

where $T$ is the transpose of a matrix, and, in the simplest case, the output value $O$ is computed as

$$O = f(\mathrm{net}) = \begin{cases} 1 & \text{if } w^{T} x \ge \theta \\ 0 & \text{otherwise} \end{cases} \qquad (3)$$

where $\theta$ is called the threshold level, and this type of node is called a linear threshold unit (Abraham 2005).
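The neuron relationships above (the weighted sum net, a sigmoid transfer function, and the linear threshold unit) can be written as a short NumPy sketch. The function names and the example weights are illustrative assumptions, not values from the study.

```python
import numpy as np

def net_input(w, x):
    """Scalar product of the weight and input vectors: net = w^T x."""
    return float(np.dot(w, x))

def sigmoid(net):
    """Sigmoid transfer function, a common differentiable choice for f(net)."""
    return 1.0 / (1.0 + np.exp(-net))

def linear_threshold_unit(w, x, theta):
    """Output 1 if w^T x reaches the threshold level theta, otherwise 0."""
    return 1 if net_input(w, x) >= theta else 0

# Hypothetical weights and inputs, for illustration only.
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 1.0, 1.0])

net = net_input(w, x)                          # 0.5 - 1.0 + 2.0 = 1.5
fires = linear_threshold_unit(w, x, theta=1.0)
silent = linear_threshold_unit(w, x, theta=2.0)
```

With a net input of 1.5 the unit fires for a threshold of 1.0 and stays at 0 for a threshold of 2.0; replacing the hard threshold with `sigmoid(net)` gives the smooth activation used in the hidden layers.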
Fig. 3 Architecture of a multilayer neural network and an artificial neuron

Application of ANN to flood simulation

The ANN flood model development consisted of three steps: the ANN architecture, training, and testing (Principe et al. 1999).

ANN architectures

The ANN architecture refers to the number of layers and connection weights. It also defines the flow of information
in the ANN network. Design of a suitable structure is the most important and also the most difficult part of the ANN modeling process (Maier and Dandy 1996). There are no strict rules in the literature to define the number of hidden layers and neurons. Most researchers have been using the trial and error method to determine them. Although some studies use a single hidden layer in the ANN architecture, Sarle (1994) showed that higher flexibility can be obtained using more than one hidden layer, and subsequent studies used two hidden layers as a starting point (Flood and Kartam 1994; Tamura and Tateishi 1997). However, the optimal design of ANN architectures depends on the type of problem under investigation.

In this research, a three-interconnection ANN architecture comprising an input layer, two hidden layers, and an output layer was used. The input layer contains seven neurons (one each for elevation, topographic slope, flow accumulation, geology, land use, soil, and rainfall data), each representing a causative factor that contributes to the occurrence of the flood in the catchment. The output layer contains a single neuron representing river flow. The hidden layers and their number of neurons are used to define the complex relationship between the input and output variables.

Once the input and output variables were defined, to identify the hidden layer neurons, the ANN architecture 7-N-N-1 (where N represents the number of neurons in each hidden layer) was examined (Fig. 4).

Fig. 4 A schematic architecture of ANN for flood modeling

Training of the network

The aim of the training process is to decrease the error between the ANN output and the real data by changing the weight values based on a given algorithm (Pradhan and Lee 2010a). Typically, back-propagation algorithms are used by the ANN at this stage.

A successful ANN model can predict target data from a given set of input data. Once the minimal error is achieved and training is completed, the feed-forward structure is applied by the ANN to generate a classification of the whole data set (Paola and Schowengerdt 1995).

The flowchart for the determination of weights in the ANN is shown in Fig. 5. The weights between different layers were calculated by training the ANN through a reverse calculation process in which the contribution or importance of each factor was computed. Then, the contribution or importance of each factor, i.e., the weight, was determined.
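The 7-N-N-1 layout described above can be sketched as a feed-forward pass in NumPy. This is only an illustration of the structure (seven inputs, two sigmoid hidden layers, one output for river flow), not the MATLAB program used in the study. The hidden-layer sizes, the random initial weights, and the linear output neuron are assumptions.

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Create (weights, biases) for each pair of consecutive layers."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def forward(params, x):
    """Feed-forward pass: sigmoid hidden layers, linear output (assumed)."""
    activation = np.asarray(x, dtype=float)
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        net = weights @ activation + biases
        activation = net if i == last else 1.0 / (1.0 + np.exp(-net))
    return activation

def count_parameters(params):
    """Total number of adjustable weights and biases."""
    return sum(w.size + b.size for w, b in params)

# Hypothetical sizes: 7 inputs, hidden layers of 20 and 10 neurons, 1 output.
params = init_mlp([7, 20, 10, 1])
# One (already normalized) sample of the seven causative factors; values are made up.
sample = np.array([0.8, 0.1, 0.2, 0.5, 0.3, 0.6, 0.4])
flow = forward(params, sample)
n_params = count_parameters(params)
```

Training would then adjust every entry of `params` to reduce the error between `flow` and the observed river flow; trying different hidden-layer sizes only changes the list passed to `init_mlp`.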
Fig. 5 Flow chart showing weight determination of flood factors using ANN model (modified after Pradhan and Lee 2010a)

To train the ANN, a 7-N-N-1 format was used in this study, where N represents the hidden layer nodes. By varying the number of neurons in both hidden layers, the neural networks were run several times to identify the most appropriate neural network architecture based on training and testing accuracies. Therefore, the neurons in the hidden layers were changed by repeating the training process 20 times and then taking the minimum mean square error. The numbers of neurons in the first and second layers were checked from 7 to 25 and 3 to 14, respectively. For each ANN configuration the training procedure was repeated starting from independent initial conditions, ultimately ensuring selection of the best performing network. The decreasing trend in the minimum mean square error in the training and validation sets was used to decide the optimal learning. The training was stopped when the minimum mean square error was achieved. This was done by adding an early stopping technique in the MATLAB software. Early stopping guards against the network becoming over-trained; an over-trained ANN would perform very well in the training stage but would fail to maintain that level of performance when applied to a different dataset.

The over-fitting error can occur during the training stage. With over-fitting, the training set error is very small, but the error is large when new data are used, and the solution cannot be generalized to a new situation by the ANN. The early stopping method is used to prevent this error during training. In this method, to develop the algorithms, train the ANN, and prevent any over-fitting error, the data are divided into three independent parts: training, validation, and testing. The training part is used for training and updating the parameters of the ANN. The error in the validation part is checked during the training stage. Normally, the validation error decreases during the initial stage of training. When over-fitting starts, the validation error increases. When the validation error increases for a specified number of iterations, the training is stopped, and the weights that produced the minimum error on the validation set are retrieved.

In this study, the data were randomly divided: 60% were used for training, 20% for validation in order to stop training before over-fitting, and the remaining 20% as a completely independent test of the network generalization. The aim of this last part is to confirm the ANN accuracy by applying untrained data to the model.

The multilayer perceptron (MLP) program has been used in MATLAB software. The input portions of the program were modified for easy computation and handling of GIS data. The program was used to construct the input files, scale (normalize) the data, train the network with the Levenberg–Marquardt algorithm, and do the post-processing needed to obtain the model output (Pradhan and Lee 2010a).

The number of epochs was set to 2,000, and the sum square error (SSE) value used for the stopping criterion was set to 0.001. The experimental data sets met the 0.001 SSE goals in the case of 0 slopes (ESM Fig. 1). However, if the SSE goal was not achieved, training was terminated at the maximum of 2,000 epochs. The rate of change of SSE is displayed for the training set (blue line), the validation set (green line) and the test set (red line) (ESM Fig. 2). The validation and test lines in ESM Fig. 2 are very similar. As can be observed in ESM Fig. 2, the rate of change was very similar and the error decreased with increasing epoch, reaching 5.6498e-05 at epoch 1,811. This indicates a good fit, meaning that the ANN is well trained and can be used for prediction and modeling.

Seven input nodes, each representing a flood causative parameter (rainfall, slope, elevation, soil, geology, flow accumulation, and land use), were used during the ANN modeling. These factors are denoted as I1, I2, I3, I4, I5, I6 and I7, respectively (Table 2). The 20 hidden nodes are denoted as HA1 to HA20 (Table 2, column 1). Table 2 shows the connection weights between the input and first hidden layers. From Table 2, it can be observed that there is little variation between the maximum and minimum connection weights between the input and hidden layer nodes except for the rainfall parameter (I1). For the rainfall parameter (I1), the variation between maximum and minimum connection weights is larger than for the other factors (e.g.
I2, I3, …, I7) in the corresponding input layers. This indicates that the rainfall factor is the main factor in training the neural network compared to the other inputs. For the ANN modeling, we used a back-propagation algorithm to adjust and tune the connection weights between the input, first hidden, second hidden and output layers (Pradhan et al. 2010a). In the algorithms used here, the error and variation between the ANN outputs and measured data (target) are back propagated through the ANN and are minimized by updating the interconnection weights between the layers (Arora et al. 2004; Lee et al. 2004; Pradhan et al. 2010a, b; Pradhan and Lee 2010b). The best weight adjustment may happen between the first and second hidden layers.

The ANN training results are shown in ESM Fig. 2. In this figure the observed river flow and ANN training results are compared. High and low river flow values are predicted very well, and there is high convergence and good agreement between the predicted flows and observed data. Generally, the relationship between the observed and predicted data is shown by a regression plot. Since the ANN is perfectly trained, the ANN output equals the predicted data (ESM Fig. 3).

The next step was to investigate this data point in order to determine if it represents extrapolation (i.e., is it outside of the training data set). If so, then it should be included in the training set, and additional data should be collected to be used as the test set.

Table 2 Input (I)–hidden layer 1 (HA) connection weights

Node   I1 (rainfall)  I2 (slope)  I3 (elevation)  I4 (soil)  I5 (geology)  I6 (flow accumulation)  I7 (land use)
HA1     23.6799       -9.9E-09     1.6E-09         1.6E-09   -3.2E-09      -1.3E-09                -4.1E-09
HA2      0.2785        1.4E-05    -2.6E-06        -2.3E-06    5.1E-06       1.8E-06                 6.2E-06
HA3     -1.4384       -5.2E-06     1.0E-06         8.4E-07   -1.9E-06      -6.2E-07                -2.4E-06
HA4    -10.7464        1.7E-06    -2.9E-07        -2.6E-07    3.7E-07       2.1E-07                 6.9E-07
HA5      2.0906       -2.1E-07     3.8E-08         3.2E-08   -6.8E-08      -2.3E-08                -9.1E-08
HA6      3.6231        1.0E-05    -1.8E-06        -1.6E-06    3.3E-06       1.3E-06                 4.2E-06
HA7      5.3119        4.4E-07    -8.1E-08        -6.9E-08    1.5E-07       5.5E-08                 1.9E-07
HA8      0.1299        3.6E-06    -6.3E-07        -5.9E-07    1.2E-06       4.9E-07                 1.5E-06
HA9     -2.9412        1.8E-06    -3.1E-07        -2.8E-07    5.6E-07       2.3E-07                 7.4E-07
HA10   -26.4989        2.5E-08    -4.6E-09        -3.8E-09    8.2E-09       2.7E-09                 1.1E-08
HA11    -0.6151        2.1E-07    -3.2E-08        -3.2E-08    5.5E-08       2.8E-08                 7.8E-08
HA12     4.4757       -1.2E-06     2.2E-07         1.9E-07   -4.2E-07      -1.4E-07                -5.2E-07
HA13     4.2550        1.8E-06    -3.2E-07        -2.9E-07    5.9E-07       2.4E-07                 7.7E-07
HA14    -0.5180       -6.8E-06     1.2E-06         1.1E-06   -2.2E-06      -8.2E-07                -2.8E-06
HA15     1.0745        5.4E-05    -9.4E-06        -8.6E-06    1.7E-05       6.9E-06                 2.2E-05
HA16     9.6419       -2.6E-07     4.7E-08         4.2E-08   -9.1E-08      -3.0E-08                -1.1E-07
HA17     4.3609       -3.1086      0.673           0.5908    -1.213        -0.421                  -1.634
HA18    -1.1144        6.0E-05    -1.1E-05        -8.9E-06    1.9E-05       6.0E-06                 2.5E-05
HA19     1.05169      -3.7E-05     6.2E-06         5.2E-06   -9.7E-06      -3.3E-06                -1.5E-05
HA20    -3.6817        3.3E-07    -5.4E-08        -5.3E-08    9.8E-08       4.6E-08                 1.3E-07

Testing the network

After the ANN training process was completed, different datasets (testing data) were used to extend and to determine the model accuracy. Using new data, the network performance was evaluated. These data had the same properties as the training data but had not been used during the training of the model.

An important result in testing these data was that the ANN was able to identify all values as in the training stage. This result yields an R² value of 1, which is an acceptable result and shows a high level of prediction. The simulated and ANN predicted river flow, and the regression plot, are shown in ESM Figs. 4 and 5, respectively.

Flood map generated by ANN

Flood maps are helpful for disaster planning in addition to the actual emergency response to floods. Estimation of the flood inundation area is the most important duty and highest priority for decision makers and is most relevant for national and local governments. The outputs of the ANN model can be used in GIS for visualization of the flood extent and the flood inundation areas. This map is the best tool to use quickly for a potential impact assessment and for
rescue operations for any flood as well as to compute the type and number of buildings affected by the flood.

To further test the model performance, the ANN model was used to simulate the recent floods of January 2007 in Johor state. The 2006–2007 Johor flood, caused by a couple of abnormally heavy rainfall events, is considered the most costly flood event in Malaysian history, with an estimated total cost of USD 0.5 billion. At the peak of that flood, around 110,000 people were evacuated and sheltered in relief centers, and the death toll was 18 persons. The simulated hydrograph is compared to the observed river flow for this event at Kota Tinggi station (Fig. 6). At this station, three critical flood levels are designated by DID, namely Alert (1.7 m), Warning (2.2 m) and Danger (2.7 m).

Fig. 6 The comparison of simulated flood hydrographs with observed hydrographs at the Kota Tinggi gauging station

Comparison between the forecasted and observed river flow in Fig. 6 indicates that the accuracy of the model is quite good, especially at high river flows. Since topography is the main factor determining the flood inundation extent, the DEM map was used to determine the flood-prone area. The flood inundation area is derived from the DEM based on water levels in the river cross-section (Fig. 7).

Model performance assessments

The model accuracy assessment is described in terms of the forecasting error, that is, the difference between the observed and predicted values. The literature offers many performance assessment methods for measuring accuracy, each with its own advantages and limitations. In this study, the most widely used measures, namely the coefficient of determination (R2), sum squared error (SSE), mean squared error (MSE), and root mean squared error (RMSE), were used to check the performance of the ANN. Each measure is estimated from the ANN predicted values and the measured discharges (targets).

To check the model performance, the multilayered perceptron (MLP) was used for forecasting flood events, and the measures were calculated for both the training and the testing data (Table 3). The MLP forecasts showed excellent agreement with the real data in terms of the coefficient of determination (R2), which is 1 for both the training and the testing data. The SSE, MSE, and RMSE values were also low. Overall, the errors were negligible.
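As a minimal sketch of how the four performance measures can be computed from paired observed and predicted discharges, the following Python fragment may be helpful. It assumes that R2 is computed as 1 - SSE/SST; the text does not give its formula, and the squared correlation coefficient is another common definition. The function name and the sample discharges are hypothetical and are not taken from the study.

```python
import math


def performance_metrics(observed, predicted):
    """Return R2, SSE, MSE and RMSE for paired observed/predicted values.

    Assumption: R2 = 1 - SSE/SST (not stated in the source text).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)      # sum squared error
    mse = sse / n                            # mean squared error
    rmse = math.sqrt(mse)                    # root mean squared error
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - sse / sst if sst > 0 else float("nan")
    return {"R2": r2, "SSE": sse, "MSE": mse, "RMSE": rmse}


# Hypothetical discharges (m^3/s), for illustration only.
observed = [120.0, 340.0, 560.0, 410.0, 180.0]
predicted = [118.0, 345.0, 552.0, 415.0, 183.0]
metrics = performance_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.5f}")
```

With these definitions, MSE is SSE divided by the number of samples and RMSE is the square root of MSE, so the three error measures always rank models identically on the same data set.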
Fig. 7 Flood inundation area in January 2007 at Johor River Basin
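The derivation of an inundation area from a DEM and a water level, as mapped in Fig. 7, can be illustrated with a minimal sketch. It assumes a simple "bathtub" rule, in which a cell counts as flooded when its elevation lies below the water surface elevation; the text does not spell out the GIS procedure that was actually used. The grid values, the cell size, and the function names are hypothetical.

```python
def inundation_mask(dem, water_surface_elevation):
    """Mark each DEM cell as flooded (True) if it lies below the water surface."""
    return [[elev < water_surface_elevation for elev in row] for row in dem]


def flooded_area(mask, cell_size_m):
    """Return the flooded area in square metres for square cells."""
    flooded_cells = sum(cell for row in mask for cell in row)
    return flooded_cells * cell_size_m ** 2


# Hypothetical 3 x 4 DEM (elevations in metres, same vertical datum as the water level).
dem = [
    [4.0, 3.1, 2.5, 3.0],
    [3.2, 2.4, 1.9, 2.6],
    [2.9, 2.0, 1.5, 2.2],
]
mask = inundation_mask(dem, water_surface_elevation=2.7)  # hypothetical level
area_m2 = flooded_area(mask, cell_size_m=30.0)            # hypothetical 30 m cells
print(area_m2)
```

A rule of this kind ignores hydraulic connectivity, so isolated low-lying cells are marked as flooded even where water could not reach them; a connectivity check against the river cells would be a natural refinement.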
Table 3 Comparison of model performance for MLP during training and testing

        Train     Test
R2      1         1
SSE     6.5E-08   6.4E-08
MSE     4.9E-12   4.9E-12
RMSE    2.5E-23   2.4E-23

Table 4 Sensitivity analysis results for the input factors

Factors             SSE    MSE      RMSE     R2
Slope               3.70   0.00028  0.01687  0.963
Elevation           4.13   0.00032  0.01782  0.931
Soil                2.37   0.00018  0.01350  0.988
Geology             5.28   0.00041  0.02015  0.990
Flow accumulation   6.38   0.00049  0.02215  0.988
Land use            4.19   0.00032  0.01795  0.986

Some researchers have used sensitivity analysis to improve hydrological models (Liu et al. 2003; Bahremand and De Smedt 2008; Pappenberger et al. 2008; Fernandez and Lutz 2010). Sensitivity analysis is a common tool for finding important model factors, testing the model conceptualization, and developing the model structure (Sieber and Uhlenbrook 2005). In this study, a sensitivity analysis was applied to define the relative importance of each input factor for the river flow at the output, with the exception of rainfall, which is the main cause of floods. The input factors slope, elevation, soil, geology, flow accumulation, and land use were considered in turn. Table 4 shows the performance of the ANN models with one input factor eliminated in each case. As can be seen in the table, R2 varies within the range 0.931–0.990, which indicates that all input causative factors influence the river flow. Table 4 also indicates that, among the input factors, elevation has the most significant influence on river flow and flooding (R2 = 0.931), since eliminating it degrades the fit the most, and geology has the least (R2 = 0.990).

Discussion and conclusions

Over the last decade, ANN has been used in many geohazard applications. The integration of GIS and neural network techniques in the field of water resources has opened various new approaches in hydrological modeling, improved our ability to create more accurate flood models, and helped to present the results in a spatial environment.

Floods are affected by several factors such as rainfall, initial soil moisture, geology, land use, evaporation, watershed infiltration, geomorphology, etc. The relationship between these factors is very complicated, and they have a significant influence on each other and on the runoff. Understanding these factors and the interactions between them is necessary for hydrological modeling. This study gives a detailed selection of the most important flood causative factors and an understanding of the interactions between them.

The sensitivity analysis performed here shows that elevation is the most important factor for flood susceptibility mapping. The average normalized value shows that elevation has the highest weight (R2 = 0.931), followed by slope (R2 = 0.963) and then land use (R2 = 0.986). Scientific weights and ratings are essential to flood susceptibility mapping. The back-propagation training algorithm presents difficulties when trying to follow the internal processes of the procedure. The method also involves a long execution time with a heavy computing load. Therefore, the thematic data layers were converted into ASCII format to speed up the computing process. Computation of the factor weights and the artificial neural network modeling were performed in MATLAB; the outputs were exported to GIS for map production and visual interpretation. The flood susceptibility map was analyzed qualitatively using equal area classification schemes.

A prototype of the MLP was developed and integrated with GIS. These new methods apply different flood causative factors to model floods and present the results in a spatial form. The results showed that the models could simulate peak river flows as well as base flows.

These results can be used as basic data to assist slope management and land use planning. The methods used in the study can also be used for generalized planning and assessment purposes, although they may be less useful on a site-specific scale, where local geomorphologic and geographic heterogeneities may prevail. The study suggests that the model can be used to predict floods in the study area with acceptable accuracy. Integrating this model with a real-time warning system would likely provide a great advantage and could reduce flood damage significantly.

Acknowledgments This article greatly benefited from very helpful reviews by two anonymous reviewers and editorial comments by James W. LaMoreaux.

References

Abraham A (2005) Artificial Neural Networks. In: Peter H. Sydenham, Richard Thorn (eds) Handbook of measuring system design. John Wiley and Sons, London, pp 901–908
Arora MK, Das Gupta AS, Gupta RP (2004) An artificial neural network approach for landslide hazard zonation in the Bhagirathi (Ganga) Valley, Himalayas. Int J Remote Sens 25(3):559–572
ASCE Task Committee (2000) Artificial neural networks in hydrology I: preliminary concepts. J Hydrol Eng 5(2):115–123
Atkinson PM, Tatnall ARL (1997) Neural networks in remote sensing. Int J Remote Sens 18:699–709
Bahremand A, De Smedt F (2008) Distributed hydrological modeling and sensitivity analysis in Torysa Watershed, Slovakia. Water Resour Manag 22:393–408
Bishop CM (1994) Neural networks and their application. Rev Sci Instrum 65(6):1803–1830
Bishop CM (1995) Neural networks for pattern recognition. Clarendon Press, Oxford, UK
Blazkova S, Beven K (1997) Flood frequency prediction for data limited catchments in the Czech Republic using a stochastic rainfall model and TOPMODEL. J Hydrol 195(1–4):256–278
Cunderlik JM, Burn DH (2002) Analysis of the linkage between rain and flood regime and its application to regional flood frequency estimation. J Hydrol 261(1–4):115–131
Dixon B (2005) Applicability of neuro-fuzzy techniques in predicting ground-water vulnerability: a GIS-based sensitivity analysis. J Hydrol 309:17–38
Farajzadeh M (2001) The flood modeling using multiple regression analysis in Zohre & Khyrabad Basins. In: 5th International Conference of Geomorphology, August, Tokyo, Japan
Farajzadeh M (2002) Flood susceptibility zonation of drainage basins using remote sensing and GIS, case study area: Gaveh rod Iran. In: Proceeding of international symposium on geographic information systems, Istanbul, Turkey, 23–26 Sept 2002
Feng LH, Lu J (2010) The practical research on flood forecasting based on artificial neural networks. Expert Syst Appl 37:2974–2977
Fernandez DS, Lutz MA (2010) Urban flood hazard zoning in Tucuman Province, Argentina, using GIS and multicriteria decision analysis. Eng Geol 111:90–98
Flood I, Kartam N (1994) Neural networks in civil engineering. I: principles and understanding. J Comput Civil Eng 8(2):131–148
Gomez H, Kavzoglu T (2005) Assessment of shallow landslide susceptibility using artificial neural networks in Jabonosa River Basin, Venezuela. Eng Geol 78(1–2):11–27
Hassan AJ, Ghani AA (2006) Development of flood risk map using gis for sg. Selangor Basin. http://redac.eng.usm.my/html/publish/2006_11.pdf. Accessed 19 April 2008
Haykin S (1999) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall, New Jersey
Hess LL, Melack JM, Simonett DS (1990) Radar detection of flooding beneath the forest canopy: a review. Int J Remote Sens 11:1313–1325
Hess LL, Melack J, Filoso S, Wang Y (1995) Delineation of inundated area and vegetation along the Amazon floodplain with the SIR-C Synthetic Aperture Radar. IEEE T Geosci Remote 33:896–903
Holger RM, Dandy GC (1996) The use of artificial neural networks for the prediction of water quality parameters. Water Resour Res 32:1013–1022
Horritt MS, Bates PD (2002) Evaluation of 1D and 2D numerical models for predicting river flood inundation. J Hydrol 268:87–99
Islam MM, Sado K (2001) Flood damage and modeling using satellite remote sensing data with GIS: case study of Bangladesh. In: Jerry Ritchie et al (eds) Remote sensing and hydrology 2000. IAHS Publication, Oxford, pp 455–458
Islam MM, Sado K (2002) Development priority map for flood countermeasures by remote sensing data with geographic information system. J Hydrol Eng 9:346–355
Kingma NC (2002) Flood hazard assessment and zonation, Lecture Note. ITC, Enschede
Lee S, Ryu J, Won J, Park H (2004) Determination and application of the weights for landslide susceptibility mapping using an artificial neural network. Eng Geol 71:289–302
Lek S, Guegan JF (1999) Artificial neural networks as a tool in ecological modelling, an introduction. Ecol Model 120:65–73
Lek S, Delacoste M, Baran P, Dimopoulos I, Lauga J, Aulanier S (1996) Application of neural networks to modelling non-linear relationships in ecology. Ecol Model 90:39–52
Lin HS, McInnes KJ, Wilding LP, Hallmark CT (1999) Effects of soil morphology on hydraulic properties: I. Quantification of soil morphology. Soil Sci Soc Am J 63:948–953
Liu H, Chandrashekar V (2000) Classification of hydrometers based on polarimetric radar measurements: development of fuzzy logic and neuro-fuzzy systems and in situ verifications. J Atmos Ocean Tech 17:140–164
Liu YB, Gebremeskel S, De Smedt F, Hoffmann L, Pfister L (2003) A diffusive transport approach for flow routing in GIS-based flood modelling. J Hydrol 283:91–106
Lorrai M, Sechi GM (1995) Neural nets for modeling rainfall-runoff transformations. Int Ser Prog Water Res 9:299–313
Maidment DR (2002) Arc Hydro: GIS for water resources. ESRI Press, Redlands
Maier HR, Dandy GC (1996) The use of artificial neural networks for the prediction of water quality parameters. Water Resour Res 32(4):1013–1022
Mas JF (2004) Mapping land use/cover in a tropical coastal area using satellite sensor data, GIS and artificial neural networks. Estuar Coast Shelf S 59:219–230
Oh JJ, Pradhan B (2011) Application of a neuro-fuzzy model to landslide susceptibility mapping in a tropical hilly area. Comput Geosci 37(9):1264–1276. doi:10.1016/j.cageo.2010.10.012
Paola JD, Schowengerdt RA (1995) A review and analysis of backpropagation neural networks for classification of remotely sensed multi-spectral imagery. Int J Remote Sens 16:3033–3058
Pappenberger F, Beven KJ, Ratto M, Matgen P (2008) Multi-method global sensitivity analysis of flood inundation models. Adv Water Resour 31:1–14
Pirasteh S, Rizvi SMA, Ayazi MH, Mahmoodzadeh A (2010) Using microwave remote sensing for flood study in Bhuj Taluk, Kuchch District Gujarat, India. Int Geoinform Res Dev J 1(1):13–24
Pradhan B (2009) Groundwater potential zonation for basaltic watersheds using satellite remote sensing data and GIS techniques. Central Eur J Geosci 1(1):120–129. doi:10.2478/v10085-009-0008-5
Pradhan B (2010a) Flood susceptible mapping and risk area delineation using logistic regression, GIS and remote sensing. J Spatial Hydrol 9(2):1–18
Pradhan B (2010b) Landslide susceptibility mapping of a catchment area using frequency ratio, fuzzy logic and multivariate logistic regression approaches. J Indian Soc Remote Sens 38(2):301–320. doi:10.1007/s12524-010-0020-z
Pradhan B (2010c) Application of an advanced fuzzy logic model for landslide susceptibility analysis. Int J Comput Int Sys 3(3):370–381
Pradhan B (2011a) Manifestation of an advanced fuzzy logic model coupled with geoinformation techniques for landslide susceptibility analysis. Environ Ecol Stat 18(3):471–493. doi:10.1007/s10651-010-0147-7
Pradhan B (2011b) Use of GIS based fuzzy relations and its cross application to produce landslide susceptibility maps in three test areas in Malaysia. Environ Earth Sci 63(2):329–349. doi:10.1007/s12665-010-0705-1
Pradhan B, Buchroithner MF (2010) Comparison and validation of landslide susceptibility maps using an artificial neural network model for three test areas in Malaysia. Environ Eng Geosci 16(2):107–126. doi:10.2113/gseegeosci.16.2.107
Pradhan B, Lee S (2009) Landslide risk analysis using artificial neural network model focusing on different training sites. Int J Phys Sci 3(11):1–15
Pradhan B, Lee S (2010a) Landslide susceptibility assessment and factor effect analysis: backpropagation artificial neural networks and their comparison with frequency ratio and bivariate logistic regression modeling. Environ Modell Softw 25:747–759. doi:10.1016/j.envsoft.2009.10.016
Pradhan B, Lee S (2010b) Delineation of landslide hazard areas using frequency ratio, logistic regression and artificial neural network model at Penang Island, Malaysia. Environ Earth Sci 60:1037–1054. doi:10.1007/s12665-009-0245-8
Pradhan B, Lee S (2010c) Regional landslide susceptibility analysis using backpropagation neural network model at Cameron Highland, Malaysia. Landslides 7(1):13–30. doi:10.1007/s10346-009-0183-2
Pradhan B, Pirasteh S (2010) Comparison between prediction capabilities of neural network and fuzzy logic techniques for landslide susceptibility mapping. Disaster Adv 3(2):26–34
Pradhan B, Shafie M (2009) Flood hazard assessment for cloud prone rainy areas in a typical tropical environment. Disaster Adv 2(2):7–15
Pradhan B, Youssef AM (2010) Manifestation of remote sensing data and GIS for landslide hazard analysis using spatial-based statistical models. Arab J Geosci 3(3):319–326. doi:10.1007/s12517-009-0089-2
Pradhan B, Youssef AM (2011) A 100-year maximum flood susceptibility mapping using integrated hydrological and hydrodynamic models: Kelantan River Corridor, Malaysia. J Flood Risk Manag 4:189–202. doi:10.1111/j.1753-318X.2011.01103.x
Pradhan B, Singh RP, Buchroithner MF (2006) Estimation of stress and its use in evaluation of landslide prone regions using remote sensing data. Adv Space Res 37:698–709. doi:10.1016/j.asr.2005.03.137
Pradhan B, Lee S, Buchroithner MF (2010a) A GIS-based backpropagation neural network model and its cross application and validation for landslide susceptibility analyses. Comput Environ Urban Sys 34:216–235. doi:10.1016/j.compenvurbsys.2009.12.004
Pradhan B, Lee S, Buchroithner M (2010b) Remote sensing and GIS-based landslide susceptibility analysis and its cross-validation in three test areas using a frequency ratio model. Photogramm Fernerkun 1:17–32. doi:10.1127/1432-8364/2010/0037
Pradhan B, Youssef AM, Varathrajoo R (2010c) Approaches for delineating landslide hazard areas using different training sites in an advanced artificial neural network model. Geospatial Inf Sci 13(2):93–102. doi:10.1007/s11806-010-0236-7
Pradhan B, Sezer E, Gokceoglu C, Buchroithner MF (2010d) Landslide susceptibility mapping by neuro-fuzzy approach in a landslide prone area (Cameron Highland, Malaysia). IEEE T Geosci Remote 48(12):4164–4177. doi:10.1109/TGRS.2010.2050328
Principe JC, Euliano NR, Lefebvre WC (1999) Neural and adaptive systems: fundamentals through simulations. John Wiley and Sons, New York
Rashid A, Aziz A, Wong KFV (1992) A neural network approach to the determination of aquifer parameters. Ground Water 30:164–166
Ray C, Klindworth KK (2000) Neural networks for agrichemical vulnerability assessment of rural private wells. J Hydrol Eng 4:162–171
Rogers SJ, Chen HC, Kopaska-Merkel DC, Fang JH (1995) Predicting permeability from porosity using artificial neural networks. AAPG Bull 79:1786–1797
Sarle WS (1994) Neural networks and statistical models. In: Proceedings of the nineteenth annual SAS users group international conference, SAS Institute, pp 1538–1550
Schaap MG, Leij FJ, VanGenuchten MT (1998) Neural network analysis for hierarchical prediction of soil hydraulic properties. Soil Sci Soc Am J 62:847–855
See L, Openshaw S (2000) A hybrid multi-model approach to river level forecasting. Hydrol Sci J 45:523–536
Sezer E, Pradhan B, Gokceoglu C (2011) Manifestation of an adaptive neuro-fuzzy model on landslide susceptibility mapping: Klang valley, Malaysia. Expert Syst Appl 38(7):8208–8219. doi:10.1016/j.eswa.2010.12.167
Sieber A, Uhlenbrook S (2005) Sensitivity analyses of a distributed catchment model to verify the model structure. J Hydrol 310:216–235
Smith K, Ward R (1998) Floods: physical processes and human impacts. John Wiley and Sons Ltd, West Sussex, pp 333
Tamari S, Wosten JHM, Ruiz-Suarez JC (1996) Testing an artificial neural network for predicting soil hydraulic conductivity. Soil Sci Soc Am J 57:1088–1095
Tamura SI, Tateishi M (1997) Capabilities of a four-layered feedforward neural network: Four layers versus three. IEEE T Neural Netw 8(2):251–255
United Nations Environment Program (2002) Early warning, forecasting and operational flood risk monitoring in Asia (Bangladesh, China and India). http://www.unep.org/geo/geo3.asp. Accessed 21 Aug 2010
Varoonchotikul P (2003) Flood forecasting using artificial neural networks. Taylor & Francis, The Netherlands, p 102
World Meteorological Organisation (2008) Urban flood management: a tool for integrated flood management. http://www.wmo.int/pages/mediacentre/press_releases/pr_835_en.html. Accessed 15 July 2010
Woldt W, Dahab I, Bogardi C, Dou C (1996) Management of diffuse pollution in groundwater under imprecise conditions using fuzzy models. Water Sci Technol 33:249–257
Youssef AM, Pradhan B, Hassan AM (2011) Flash flood risk estimation along the St. Katherine road, southern Sinai, Egypt using GIS based morphometry and satellite imagery. Environ Earth Sci 62(3):611–623. doi:10.1007/s12665-010-0551-1
Zhu XY, SHi Xu, Zhu J-J, Zhou N-Q, Wu C-Y (1997) Study on the contamination of fracture karst water in Boshan District, China. Ground Water 35:538–545