Accepted Manuscript
Hybrid Artificial Intelligence Approach Based on Neural Fuzzy Inference Model
and Metaheuristic Optimization for Flood Susceptibility Modelling in A
High-Frequency Tropical Cyclone Area using GIS
Dieu Tien Bui, Biswajeet Pradhan, Haleh Nampak, Thanh Quang Bui, Quynh-An Tran,
Quoc Phi Nguyen
PII: S0022-1694(16)30378-X
DOI: http://dx.doi.org/10.1016/j.jhydrol.2016.06.027
Reference: HYDROL 21345
To appear in: Journal of Hydrology
Received Date: 4 April 2016

Revised Date: 8 June 2016
Accepted Date: 14 June 2016
Please cite this article as: Bui, D.T., Pradhan, B., Nampak, H., Quang Bui, T., Tran, Q-A., Nguyen, Q.P., Hybrid
Artificial Intelligence Approach Based on Neural Fuzzy Inference Model and Metaheuristic Optimization for Flood
Susceptibility Modelling in A High-Frequency Tropical Cyclone Area using GIS, Journal of Hydrology (2016),
doi: http://dx.doi.org/10.1016/j.jhydrol.2016.06.027
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers
we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and
review of the resulting proof before it is published in its final form. Please note that during the production process
errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Hybrid Artificial Intelligence Approach Based on Neural Fuzzy Inference
Model and Metaheuristic Optimization for Flood Susceptibility Modelling
in A High-Frequency Tropical Cyclone Area using GIS
Dieu Tien Bui a,*; Biswajeet Pradhan b,f; Haleh Nampak b; Thanh Quang Bui c; Quynh-An Tran d; Quoc Phi
Nguyen e
a Geographic Information System Group, Department of Business Administration and Computer
Science, University College of Southeast Norway, Hallvard Eikas Plass 1, N-3800 Bø i Telemark,
Norway
b Department of Civil Engineering, Geospatial Information Science Research Center (GISRC),
Faculty of Engineering, University Putra Malaysia, Serdang, Selangor Darul Ehsan 43400,
Malaysia
c Faculty of Geography, VNU University of Science, 334 Nguyen Trai, Thanh Xuan, Hanoi,
Vietnam
d Faculty of Geomatics and Land Administration, Hanoi University of Mining and Geology, Duc
Thang, Bac Tu Liem, Hanoi, Vietnam
e Department of Environmental Sciences, Hanoi University of Mining and Geology, Duc Thang,
Bac Tu Liem, Hanoi, Vietnam
f Department of Geoinformation Engineering, Choongmu-gwan, Sejong University, 209
Neungdong-ro Gwangjingu, Seoul 05006, Republic of Korea
* Corresponding author: D. Tien Bui; e-mail: Dieu.T.Bui@hit.no / BuiTienDieu@gmail.com
Abstract
This paper proposes a new artificial intelligence approach based on neural fuzzy inference system
and metaheuristic optimization for flood susceptibility modeling, namely MONF. In the new
approach, the neural fuzzy inference system was used to create an initial flood susceptibility model
and then the model was optimized using two metaheuristic algorithms, Evolutionary Genetic and
Particle Swarm Optimization. A high-frequency tropical cyclone area of the Tuong Duong district
in Central Vietnam was used as a case study. First, a GIS database for the study area was
constructed. The database that includes 76 historical flood inundated areas and ten flood influencing
factors was used to develop and validate the proposed model. Root Mean Square Error (RMSE),
Mean Absolute Error (MAE), Receiver Operating Characteristic (ROC) curve, and area under the
ROC curve (AUC) were used to assess the model performance and its prediction capability.
Experimental results showed that the proposed model has high performance on both the training
(RMSE = 0.306, MAE = 0.094, AUC = 0.962) and validation dataset (RMSE = 0.362, MAE =
0.130, AUC = 0.911). The usability of the proposed model was evaluated by comparing its results with those
obtained from state-of-the-art benchmark soft computing techniques such as J48 Decision Tree,
Random Forest, Multi-layer Perceptron Neural Network, Support Vector Machine, and Adaptive
Neuro Fuzzy Inference System. The results show that the proposed MONF model outperforms the
above benchmark models; we conclude that the MONF model is a new alternative tool that should
be used in flood susceptibility mapping. The results of this study are useful for planners and decision
makers for sustainable management of flood-prone areas.
Key words: Flood susceptibility mapping; Neural Fuzzy; Genetic Evolutionary; Particle Swarm
Optimization; GIS; Vietnam
1. Introduction
Floods are commonly regarded as one of the most frequent and destructive natural hazards because of the
huge economic losses they cause and their spatial extent (Du et al., 2013). According to Ceola et al. (2014), around
200 million people throughout the world were affected by floods in the two years 2011 and 2012, and
total losses reached $95 billion. Southeast Asian countries are considered the most frequently
flood-affected region due to monsoonal rainfall (Loo et al., 2015). Hence, identifying
flood-susceptible areas can significantly help to reduce damage to settlements,
agriculture, and livelihoods by avoiding further construction and development in flood-prone
areas.
Vietnam is one of the most disaster-prone countries in the world, and the central
region in particular is highly susceptible to natural disasters such as tropical cyclones, tropical depressions,
floods, landslides, droughts, whirlwinds, and salinity intrusion (Arouri et al., 2015; Brown and Pty,
2009; Fernandez et al., 2012). Flooding and tropical cyclones are considered to be the most
devastating natural disasters in this area. More than 71% of Vietnam's population and 59% of
its total land area are susceptible to the impacts of such hazards (Shaw et al., 2011).
According to Kreft et al. (2014), from 1994 to 2013 the country suffered an estimated yearly economic
loss equivalent to 1.01% of GDP (PPP), or $2.9 billion.
Floods cannot be completely prevented, but they can be predicted, and flood risk
and vulnerable areas can be mapped through prediction models (Alfieri et al., 2015; Cloke and
Pappenberger, 2009). Traditionally, the main purpose of flood models has been to obtain an
accurate assessment of discharge over watersheds (Smith and Ward, 1998). The literature
shows that many hydrological techniques have been applied in flood prediction (Fenicia et al.,
2008). However, the complex, dynamic, and non-linear structure of watersheds prevents
accurate flood modelling with simple and non-linear hydrological methods and techniques
(Sahoo et al., 2006). In addition, extensive field work is required for data collection, which is
often time-consuming and costly (Fenicia et al., 2008). Therefore, conventional
hydrological methods cannot fulfil the requirements of comprehensive flood evaluation
in regional studies (Li et al., 2012).
Remote sensing and GIS techniques have made important contributions to hydrological
studies, and in particular to flood management (Correia et al., 1999; Haq et al., 2012; Pradhan et
al., 2009). Some hydrological models integrated with geospatial techniques have received considerable
attention, such as HYDROTEL (Fortin et al., 2001), WetSpa (Liu and De Smedt, 2005), and SWAT
(Jayakrishnan et al., 2005). However, conventional hydrological models still need to be
further developed and optimized, or replaced by robust and automated methods, in order to overcome the
limitations of the traditional hydrological techniques (Bai et al., 2015; Hostache et al., 2010; Tsai et
al., 2015).
Because flooding is such a critical problem, statistical and data-driven approaches have been proposed in
flood studies, such as the analytic hierarchy process (Kazakis et al., 2015), frequency ratio (Lee et al.,
2012; Tehrany et al., 2015a), logistic regression (Fekete, 2009; Tehrany et al., 2014a),
weights-of-evidence (Tehrany et al., 2014b), and fuzzy logic (Pulvirenti et al., 2011). However, flooding is a
complex process that is difficult to assess and model (Dottori et al., 2016); therefore, non-linear
machine learning algorithms have been proposed for flood modelling with promising results, such
as artificial neural networks (Kia et al., 2012; Seckin et al., 2013), decision trees (Tehrany et al.,
2013), support vector machines (Tehrany et al., 2015b), k-nearest neighbors (Liu et al., 2016), and M5
model trees (Sattari et al., 2013).
Among machine learning techniques, artificial neural networks have been the most widely used
because of their computational efficiency (Ghalkhani et al., 2013; Rezaeianzadeh et al., 2014;
Tokar and Johnson, 1999). However, modelling errors and poor predictions caused by the
length of the dataset and by dissimilar value ranges in the validation and training datasets are weak points
of artificial neural networks (Toth et al., 2000). Therefore, neural fuzzy models that combine fuzzy
logic and neural networks have been proposed and have achieved high accuracy (Chang and Tsai, 2016; Güçlü
and Şen, 2016; Lohani et al., 2012; Shu and Ouarda, 2008). Several authors compared neural fuzzy
models with neural network models and concluded that neural fuzzy models performed
better (Mukerji et al., 2009; Nayak et al., 2005; Nayak et al., 2004). However, the application of neural
fuzzy models is restricted by their inability to find the best weight parameters, which heavily
influence their prediction performance. In addition, these neural fuzzy
models train slowly and are sensitive to noise in hydrological modeling (Hong and
White, 2009); therefore, local learning paradigms were proposed (Talei et al., 2013).
Nevertheless, it is still difficult to find optimal parameters for neural fuzzy models, and thus the best
parameters should be searched for through soft computing optimization processes (Tien Bui et al.,
2016b). We address this issue in this paper by proposing a hybrid artificial intelligence approach
based on metaheuristic optimization and a neural fuzzy inference model (namely MONF) for flood
susceptibility modelling, with a case study in the Tuong Duong district in Central Vietnam. In the
proposed soft computing approach, the neural fuzzy model was used to create an initial
flood susceptibility model, and then two metaheuristic algorithms, Evolutionary Genetic (Goldberg
and Holland, 1988) and Particle Swarm Optimization (Eberhart and Shi, 2001), were adopted to
optimize the model. The overall performance of the proposed model was assessed on the
training and validation datasets using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), the
Receiver Operating Characteristic (ROC) curve, and the area under the ROC curve (AUC). Finally, the
usability of the proposed model for flood susceptibility mapping was verified through a comparison with
current state-of-the-art machine learning methods such as J48 Decision Tree (J48DT), Random
Forests (RF), Multi-layer Perceptron Neural Networks (MLP Neural Nets), Support Vector Machines
(SVM), and the popular Adaptive Neuro-Fuzzy Inference System (ANFIS), and concluding remarks
are given.
2. The study area and the flood database
2.1 Description of the study area
The study area is the Tuong Duong district, located in the mountainous region of Nghe An
province (around 350 km south of Hanoi), one of the most flood-affected provinces in
Vietnam (Reynaud and Nguyen, 2016). The district covers an area of around 2803.1 km² (Fig. 1)
and lies between latitudes 18°58'42''N and 19°39'16''N and between longitudes 104°15'58''E
and 104°55'57''E. The topography of the district is quite complex, dissected by mountains, hills,
rivers, and streams. The altitude ranges from 2.9 m to 2122.2 m a.s.l., with a mean altitude of 527.8
m and a standard deviation of 315.1 m. Slope angles vary from 0° to 84.7°. Areas with slope angles
larger than 20° occupy 62.6% of the total study area, whereas areas with slopes of less than 5° and
between 5° and 20° cover 14% and 23.5% of the total study area, respectively.
The district belongs to the tropical coastal highland area, with two distinct seasons: (i) a cold and
dry season caused by the north-east monsoon, lasting from November until March, and (ii) a hot and
humid season with dry winds, influenced by the south-west monsoon, from April to October (Tottrup,
2004). The highest temperatures can peak at 42.7°C (in June and July), whereas the lowest
temperatures can drop to 0.5°C (in December, January, and February). The mean temperature is
23–24°C. Our analysis of rainfall from 1979 to 2010 shows that the annual rainfall
of the study area varies from 1679 mm to 3259 mm. Rainfall is mainly concentrated in the rainy
season from April to October, which accounts for 88.6–93.3% of the total yearly rainfall.
Fig. 1. Location of the study area and flood inventories.
Due to its geographical location and the topographic characteristics of the mountainous highlands, the
district is highly vulnerable to flood hazards, with large losses of life and property. Floods happen
every year due to extreme rainfall during the tropical cyclone season. Investigations by Reynaud
and Nguyen (2012) showed that around 40.4% of households have been flooded and approximately
20.3% of households were evacuated, with the average cost of floods accounting for 24.1% of the total
annual household income. Statistics from the Ministry of Agriculture and Rural Development of
Vietnam show that floods have killed 30 people each year on average during the last 10 years in
Nghe An province (including the study area) (Reynaud and Nguyen, 2016).
2.2 Flood inventory map
In order to estimate future flood zones, analyzing past records of flood occurrence is essential
(Toth et al., 2000; Tropeano and Turconi, 2004). An inventory map is therefore considered the
most important factor for predicting future disaster occurrence, and it can represent single or
multiple events in a specific area (Tien Bui et al., 2012a). The flood inventory map in this study,
which contains the historical flood records, was constructed based on: (i) documentary sources of
the Tuong Duong district; (ii) flood locations collected during field work; and (iii) interpretation of
Landsat 8 Operational Land Imager imagery acquired from 2010 to 2014 (30 m resolution,
available from the USGS archive at http://earthexplorer.usgs.gov). In addition, the flood inventory map
was checked during field work conducted in 2014 using a handheld GPS. A total of 76 flood
locations that occurred during the last five years were prepared.
2.3 Flood conditioning factors
It is essential to determine the flood conditioning factors in order to perform flood susceptibility
mapping. The selection of conditioning factors varies from one study area to another based on
the characteristics of each place, because a variable can have a high degree of impact
on flooding in one area but no influence in other regions (Kia et al.,
2012). For the current research, these variables were chosen based on a field survey and information
derived from the literature. Ten flood conditioning factors were selected for the susceptibility
analysis, and a GIS database of these factors was compiled. The factors are: slope, elevation,
curvature, topographic wetness index (TWI), stream power index (SPI), distance to river, stream
density, NDVI, lithology, and rainfall. Once the dataset was prepared, each conditioning factor was
transformed into a raster grid with a 20 m cell size; the grid of the Tuong Duong district
consists of 3900 columns and 4125 rows.
Topographical factors play a significant role in distinguishing flood-prone areas and
have a direct impact on modeling output in many studies (Cook and Merwade, 2009);
therefore, a Digital Elevation Model (DEM) for the study area was generated first. The DEM has a
resolution of 20 m and was generated from topographical maps at 1:50,000 scale with a
contour interval of 10 m. Runoff volume is frequently influenced by slope and curvature, and
floods usually occur at low elevations (Heerdegen and Beran, 1982; Niedda, 2004; Qi et al.,
2009); therefore, slope, elevation, and curvature derived from the DEM were used in this
analysis. The slope map (Fig. 2a) was classified into eight classes: 0°–0.5°, 0.5°–2°, 2°–5°,
5°–8°, 8°–13°, 13°–20°, 20°–30°, and > 30°. The elevation map (Fig. 2b) was grouped into ten
classes between < 100 m and > 1300 m. The slope curvature map (Fig. 2c) was compiled with five
categories: < -2; -2 to -0.05; -0.05 to 0.05; 0.05 to 2; and > 2.
Fig. 2. Flood influencing factor: (a) slope; (b) Elevation; (c) Curvature; (d) TWI; (e) SPI; (f)
Distance to river.
The spatial variation of floods may be influenced by hydrological conditions; therefore, TWI and
SPI should be used in flood susceptibility analysis. TWI, a topographic index developed by
Beven et al. (1984) within the runoff model, is expressed as follows:
$$\mathrm{TWI} = \ln\left(\frac{\alpha}{\tan\beta}\right) \qquad (1)$$
where $\alpha$ is the local upslope area draining through a certain point per unit contour length and $\beta$ is
the local slope.
SPI, which measures the erosive power of streams, is calculated using the following equation:
$$\mathrm{SPI} = A_s \tan\beta \qquad (2)$$
where $A_s$ is the specific catchment area and $\beta$ is the local slope gradient in degrees.
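As a minimal illustration of Eqs. 1 and 2, the sketch below evaluates both indices for a single grid cell. The function names, the sample values, and the small `eps` guard against division by zero on flat cells are assumptions made here for illustration; they are not taken from the GIS workflow used in the study.

```python
import math


def twi(upslope_area, slope_deg, eps=1e-6):
    """Topographic wetness index, ln(a / tan(beta)), for one cell.

    eps is an assumed lower bound on tan(beta) so that flat cells
    (slope of 0 degrees) do not cause a division by zero.
    """
    tan_beta = max(math.tan(math.radians(slope_deg)), eps)
    return math.log(upslope_area / tan_beta)


def spi(catchment_area, slope_deg):
    """Stream power index, A_s * tan(beta), for one cell."""
    return catchment_area * math.tan(math.radians(slope_deg))


# Hypothetical cell: specific catchment area of 100 and a 45-degree slope.
twi_value = twi(100.0, 45.0)
spi_value = spi(100.0, 45.0)
```

In a raster workflow the same expressions would be applied cell by cell to the flow-accumulation and slope grids derived from the DEM.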
In this study, the TWI map (Fig. 2d) was constructed with eight classes whereas the SPI map (Fig.
2e) was built with ten categories.
The distance from the river and the stream density have significant impacts on the spread and magnitude
of flooding (Glenn et al., 2012); therefore, these two factors were selected. In this study, the river
network was extracted from the aforementioned topographical maps and was used to construct the
distance to river map (Fig. 2f) and the stream density map (Fig. 2g), respectively.
NDVI is a proxy for vegetation attributes, and areas with dense vegetation are less prone to flooding
due to the negative relationship between flooding and vegetation density (Tehrany et al., 2013);
therefore, NDVI should be used for flood modeling. The NDVI map with 8 classes (Fig. 2h) in the
current study was produced from the aforementioned Landsat 8 OLI imagery using the following equation
(Tucker and Sellers, 1986):
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - R}{\mathrm{NIR} + R} \qquad (3)$$
where NIR is the near-infrared portion of the electromagnetic spectrum (0.76–0.90 µm) and R is the red
portion of the electromagnetic spectrum (0.63–0.69 µm).
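The NDVI ratio can be sketched for a single pixel as follows. The function name, the reflectance values, and the choice to return 0 when both bands are zero are illustrative assumptions, not part of the processing chain described above.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    denominator = nir + red
    if denominator == 0:
        # Assumption: pixels with no signal in either band are set to 0.
        return 0.0
    return (nir - red) / denominator


# Hypothetical reflectances: a vegetated pixel and a sparsely vegetated one.
dense_vegetation = ndvi(0.5, 0.1)
sparse_vegetation = ndvi(0.1, 0.5)
```

Values close to 1 indicate dense vegetation, while values near or below 0 indicate bare surfaces or water.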
Fig. 2. (continues): (g) Stream density; (h) NDVI; (i) Lithology; (j) Rainfall.
Lithology is considered to be a main control on channel shape, which influences the development of
inset floodplains (Heitmuller et al., 2015); therefore, lithology was selected for flood
susceptibility modeling. The lithology map (Fig. 2i) in this study was generated from the Geological
and Mineral Resources Map of Vietnam at a scale of 1:50,000. The lithology map was constructed
with 12 classes based on lithological similarities (Ayalew and Yamagishi, 2005; Pham et al., 2015).
The rainfall data for the study area from 1979 to 2010 were extracted from the Climate
Forecast System Reanalysis (CFSR) database (available at https://www.ncdc.noaa.gov/). The
average rainy-season rainfall recorded over this period was used to construct the rainfall
map with 8 classes (Fig. 2j) using the Inverse Distance Weighted method.
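For readers unfamiliar with Inverse Distance Weighted interpolation, a minimal sketch is given below. The distance exponent of 2, the station coordinates, and the rainfall values are assumptions for illustration only; the text does not state which settings were used to build the rainfall map.

```python
import math


def idw(x, y, stations, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    stations is a list of (x_s, y_s, value) tuples; power is the
    distance exponent (2 is a common default, assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x_s, y_s, value in stations:
        distance = math.hypot(x - x_s, y - y_s)
        if distance == 0.0:
            return value  # the query point coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical rainfall stations: (x, y, rainy-season rainfall in mm).
stations = [(0.0, 0.0, 1500.0), (2.0, 0.0, 2500.0)]
midpoint_rainfall = idw(1.0, 0.0, stations)
```

A point equidistant from the two stations receives their plain average, and the estimate approaches a station's value as the query point approaches that station.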
3. Theoretical background of the methods used
3.1 Neural fuzzy inference model
The neural fuzzy inference model is an amalgamation of fuzzy logic and neural networks that is
able to map the input space to the output space through approximating functions. The behavior of
the neural fuzzy system is strongly influenced by the fuzzy inference engine and the algorithms used to
find its optimal weights. Two widely used types of inference engine are the
Mamdani fuzzy model and the Takagi and Sugeno fuzzy model (Hellendoorn and Driankov, 2012).
The main difference between them is that the Mamdani model uses fuzzy propositions for both
the antecedent and the consequent parts, whereas in the Takagi and Sugeno fuzzy model a fuzzy
proposition is used for the premise part but an affine linear function is used for the consequent part.
In this research, the Takagi and Sugeno fuzzy model and the neural fuzzy structure proposed by
Jang et al. (1997) were used to construct the flood susceptibility model.
Structurally, this is a five-layered feed-forward neural fuzzy network in which the first and fourth
layers contain adaptive nodes, whereas the second, third, and fifth layers contain fixed
nodes. Fig. 3 shows the structure of the neural fuzzy model with two inputs (x, y), one output, and
two rules. In our case study, the input layer consists of ten factors, whereas the output is the flood
susceptibility index.
The operating mechanism of the neuro-fuzzy network can be summarized as follows. First, the
fuzzification process transforms the values of the 10 flood influencing factors into
fuzzy membership values using a fuzzy membership function. In this study, the Gaussian function
(Eq. 4) is used.
$$\mu_{A_i}(x) = \exp\left(-\frac{(x - c)^2}{2\sigma^2}\right) \qquad (4)$$
where $\mu_{A_i}(x)$ and $\mu_{B_i}(y)$ are the membership functions; $A$ and $B$ are the linguistic labels of the inputs;
and $\sigma$ and $c$ are the width and the centre of the function, respectively.
Fig. 3. Architecture of the neuro fuzzy network.
Second, the antecedent (or premise) part of the rules is constructed in the second layer. The outputs
are called the firing strengths ($\omega_i$) and are computed by multiplying all the incoming signals (Eq. 5):
$$\omega_i = \mu_{A_i}(x) \cdot \mu_{B_i}(y); \quad i = 1, 2 \qquad (5)$$
In the next step, the normalized firing strengths ($\bar{\omega}_i$) are computed in layer 3 using Eq. 6, and then
the defuzzification is performed in layer 4 using Eq. 7:
$$\bar{\omega}_i = \frac{\omega_i}{\omega_1 + \omega_2}; \quad i = 1, 2 \qquad (6)$$
$$\bar{\omega}_i f_i = \bar{\omega}_i (p_i x + q_i y + r_i) \qquad (7)$$
where $p_i$, $q_i$, and $r_i$ are the consequent parameters of the affine linear function.
Finally, flood susceptibility values are calculated in layer 5 (the aggregation layer) using Eq. 8
as follows:
$$\text{Flood susceptibility values} = \sum_i \bar{\omega}_i f_i; \quad i = 1, 2 \qquad (8)$$
The set of two if–then rules that is generated from the neural fuzzy model can be written as follows
(Tien Bui et al., 2012c):
Rule 1: if x is A1 and y is B1, then f1 = p1x + q1y + r1.
Rule 2: if x is A2 and y is B2, then f2 = p2x + q2y + r2.
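The five layers described above can be traced with a small forward-pass sketch for the two-input, two-rule case of Fig. 3. The rule parameters below are hypothetical values chosen for illustration, and the dictionary layout is an assumption of this sketch rather than the data structure used by the authors.

```python
import math


def gaussian_mf(value, centre, sigma):
    """Gaussian membership value (layer 1)."""
    return math.exp(-((value - centre) ** 2) / (2.0 * sigma ** 2))


def ts_forward(x, y, rules):
    """Forward pass of a two-input Takagi-Sugeno neural fuzzy network.

    Each rule holds (centre, sigma) pairs under "A" and "B" and the
    consequent parameters (p, q, r) under "pqr".
    """
    # Layers 1-2: firing strength = product of the membership values.
    strengths = [
        gaussian_mf(x, *rule["A"]) * gaussian_mf(y, *rule["B"])
        for rule in rules
    ]
    # Layer 3: normalized firing strengths.
    total = sum(strengths)
    normalized = [s / total for s in strengths]
    # Layer 4: affine consequent of each rule, f_i = p*x + q*y + r.
    consequents = [p * x + q * y + r for p, q, r in (rule["pqr"] for rule in rules)]
    # Layer 5: weighted sum of the rule outputs.
    return sum(w * f for w, f in zip(normalized, consequents))


# Hypothetical parameters for Rule 1 and Rule 2.
rules = [
    {"A": (0.0, 1.0), "B": (0.0, 1.0), "pqr": (0.0, 0.0, 0.0)},
    {"A": (1.0, 1.0), "B": (1.0, 1.0), "pqr": (0.0, 0.0, 1.0)},
]
susceptibility = ts_forward(0.5, 0.5, rules)
```

With these symmetric parameters the two rules fire equally at (0.5, 0.5), so the output is the average of the two consequents. The sketch does not guard against all firing strengths underflowing to zero for inputs far from every centre.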
3.2 Metaheuristic optimization algorithms
The neural fuzzy network described in the above section was proposed by Jang et al.
(1997), in which the best values of the antecedent parameters ($\sigma$, $c$) and the consequent parameters ($p$, $q$, $r$)
were determined using the hybrid learning algorithm. In this study, we propose two relatively new
algorithms, Evolutionary Genetic optimization (Goldberg and Holland, 1988) and Particle Swarm
optimization (Eberhart and Shi, 2001), for searching for the best values of the aforementioned
parameters of the neural fuzzy model.
3.2.1 Evolutionary Genetic optimization algorithm

Evolutionary Genetic optimization (EG) is a stochastic algorithm based on Darwin's theory of evolution
and natural selection (Mitchell, 1998) that has been widely used for solving optimization
problems. The purpose of EG is to find the best population, that is, the one with the lowest difference
between output and target values. In the current research context, RMSE is used to measure the
difference between the output of the neural fuzzy model (flood susceptibility values) and the flood
target values.
The evolutionary process of EG for flood susceptibility modeling can be summarized as follows:
Step 1 (Initialization): An initial population is generated, where the total number of chromosomes is
called the population size.
Step 2 (GA procedure): This step aims to create an updated population with higher fitness
(lower RMSE) through three operators (selection, crossover, and mutation):
- Selection: The fitness (RMSE) of each individual in the population is calculated, and
then all of them are used to create a mating pool. These individuals are called
parents.
- Crossover: The parents are randomly paired to produce a new generation of chromosomes,
called offspring, that inherit some characteristics of their parents.
- Mutation: Some chromosomes are randomly changed to produce offspring, and the
population is updated.
Step 3 (Fitness evaluation): The fitness of each offspring in the new population is evaluated.
This is an iterative process, and the offspring in the final population that has the lowest RMSE is
used to extract the best antecedent and consequent parameters of the neural fuzzy model.
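The loop described in Steps 1 to 3 can be sketched as a small real-coded genetic algorithm that minimizes RMSE. This is an illustrative sketch only: tournament selection, one-point crossover, Gaussian mutation, elitism, the initial range, and the toy linear model are all assumptions introduced here, and the operators used in the authors' Matlab implementation may differ.

```python
import math
import random


def rmse(outputs, targets):
    return math.sqrt(sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets))


def evolve(fitness, dim, pop_size=30, generations=50,
           crossover_rate=0.4, mutation_rate=0.6, seed=0):
    """Minimize fitness over real-valued chromosomes of length dim."""
    rng = random.Random(seed)
    # Step 1: random initial population (the range [-1, 1] is assumed).
    population = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        offspring = [population[0][:]]  # elitism: keep the current best

        def pick_parent():
            # Selection: binary tournament (an assumed scheme).
            return min(rng.sample(population, 2), key=fitness)

        while len(offspring) < pop_size:
            parent_a, parent_b = pick_parent(), pick_parent()
            child = parent_a[:]
            if rng.random() < crossover_rate:
                # Crossover: one-point recombination of the two parents.
                cut = rng.randrange(1, dim) if dim > 1 else 0
                child = parent_a[:cut] + parent_b[cut:]
            if rng.random() < mutation_rate:
                # Mutation: perturb one randomly chosen gene.
                child[rng.randrange(dim)] += rng.gauss(0.0, 0.1)
            offspring.append(child)
        population = offspring
    # Step 3: the lowest-RMSE individual of the final population.
    best = min(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy problem: recover (p, r) of f(x) = p * x + r from three samples.
xs = [0.0, 0.5, 1.0]
targets = [0.0, 0.5, 1.0]
best, history = evolve(
    lambda c: rmse([c[0] * x + c[1] for x in xs], targets), dim=2)
```

Because the best individual is always copied into the next generation, the best RMSE recorded in `history` never increases from one generation to the next.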
3.2.2 Particle Swarm optimization
Particle Swarm optimization (PSO) is a relatively new algorithm that simulates social behavior, for
instance birds flocking, to obtain the best position in a multidimensional space (Kennedy and
Eberhart, 1997). In the current context of flood susceptibility modeling, PSO aims to find the best
position of the population (called the swarm). This position will have the lowest RMSE.
In the swarm, the position of each individual (called a particle) is updated from iteration to iteration
by modifying its position and velocity, and these positions are then compared to
select the globally best position for the swarm. Thus, a particle is considered as a point in a
D-dimensional space, and its status is characterized by its position and velocity. The D-dimensional
position of particle i at iteration t can be represented as $x_i^t = (x_{i1}^t, x_{i2}^t, \ldots, x_{iD}^t)$,
whereas the velocity of particle i at iteration t can be described as $v_i^t = (v_{i1}^t, v_{i2}^t, \ldots, v_{iD}^t)$. Let
$p_i^t = (p_{i1}^t, p_{i2}^t, \ldots, p_{iD}^t)$ represent the best position found by particle i up to iteration t, and let
$p_g^t = (p_{g1}^t, p_{g2}^t, \ldots, p_{gD}^t)$ denote the globally best position at iteration t. To search for the optimal solution,
each particle changes its velocity and position according to the following equations:
$$v_{id}^{t+1} = w\, v_{id}^{t} + c_1 r_1 \left(p_{id}^{t} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd}^{t} - x_{id}^{t}\right), \quad d = 1, 2, \ldots, D \qquad (6)$$
where $w$ is the inertia weight; $c_1$ indicates the cognition learning factor; $c_2$ indicates the social
learning factor; and $r_1$ and $r_2$ are random numbers uniformly distributed in [0,1].
$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \qquad (7)$$
The basic process of the PSO algorithm is as follows:
Step 1 (Initialization): An initial swarm is randomly generated.
Step 2 (Fitness evaluation): The fitness of each particle in the swarm is evaluated based on RMSE.
Step 3 (Update): The velocity of each particle is computed using Eq. 6, and the position
of each particle is updated using Eq. 7.
Step 4 (Construction): Each particle moves to its next position based on Eq. 7.
Step 5 (Termination): Stop the algorithm if the termination criterion is satisfied; otherwise, return to Step 2.
For flood susceptibility modeling in this study, the iteration is terminated when the
number of iterations reaches the pre-determined maximum.
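A compact sketch of the five steps is given below, using the standard inertia-weight velocity and position updates. The learning factors (both set to 2.0), the initial range, zero initial velocities, and the sum-of-squares test function are assumptions for illustration; in the study the fitness would be the RMSE of the neural fuzzy model.

```python
import random


def pso(fitness, dim, n_particles=40, iterations=50,
        inertia=0.9, c1=2.0, c2=2.0, seed=0):
    """Minimize fitness with a basic global-best particle swarm."""
    rng = random.Random(seed)
    # Step 1: random initial swarm (range [-1, 1] assumed), zero velocities.
    positions = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    # Step 2: evaluate fitness and record personal and global bests.
    personal_best = [p[:] for p in positions]
    personal_fit = [fitness(p) for p in positions]
    g = min(range(n_particles), key=lambda i: personal_fit[i])
    global_best, global_fit = personal_best[g][:], personal_fit[g]
    initial_fit = global_fit
    # Step 5: terminate after a fixed number of iterations.
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 3: velocity update.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                    + c2 * r2 * (global_best[d] - positions[i][d])
                )
                # Step 4: move the particle to its next position.
                positions[i][d] += velocities[i][d]
            fit = fitness(positions[i])
            if fit < personal_fit[i]:
                personal_best[i], personal_fit[i] = positions[i][:], fit
                if fit < global_fit:
                    global_best, global_fit = positions[i][:], fit
    return global_best, global_fit, initial_fit


# Toy objective standing in for the RMSE of the neural fuzzy model.
best_position, best_fit, initial_fit = pso(
    lambda p: sum(v * v for v in p), dim=3)
```

The global best is only replaced when a strictly better position is found, so the returned fitness can never be worse than that of the initial swarm.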
3.3 Performance assessment
The quantitative evaluation of model accuracy is defined in terms of the forecasting error, that is, the
difference between the observed and predicted values. In this study, the efficiency of the flood
prediction models was evaluated using statistical evaluation criteria: Root Mean
Squared Error (RMSE), Mean Absolute Error (MAE), the Receiver Operating Characteristic (ROC) curve,
and the area under the curve (AUC).
RMSE (Eq. 8) and MAE (Eq. 9) are statistical measures for evaluating the performance of flood models
(Kia et al., 2012). Although many studies state that RMSE can be used as a standard metric for
model errors in geosciences (Savage et al., 2013), RMSE is sensitive to large values and
outliers (Chai and Draxler, 2014); therefore, MAE is also used. Thus, RMSE and MAE are used
together in this study to identify the variation in the errors in a set of predictions. RMSE is always
larger than or equal to MAE; the greater the difference between them, the greater the variance in the
individual errors in the samples.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(t_i - o_i\right)^2} \qquad (8)$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|t_i - o_i\right| \qquad (9)$$
where $n$ is the total number of samples in the training dataset or the validation dataset; $t_i$ is the target value in the
training dataset or the validation dataset; and $o_i$ is the output value from the flood susceptibility models.
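The two error measures can be computed directly from their definitions, as in the short sketch below. The target and output values are made-up numbers used only to show that RMSE is never smaller than MAE.

```python
import math


def rmse(targets, outputs):
    """Root mean square error between targets and model outputs."""
    n = len(targets)
    return math.sqrt(sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n)


def mae(targets, outputs):
    """Mean absolute error between targets and model outputs."""
    n = len(targets)
    return sum(abs(t - o) for t, o in zip(targets, outputs)) / n


# Hypothetical flood (1) and non-flood (0) targets with model outputs.
targets = [1, 1, 0, 0]
outputs = [0.9, 0.6, 0.2, 0.1]
rmse_value = rmse(targets, outputs)
mae_value = mae(targets, outputs)
```

Here the single larger error (0.4) pulls RMSE above MAE, which illustrates the sensitivity to large errors mentioned above.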
To determine the global performance of prediction models, the ROC curve is used (Tien Bui
et al., 2016a). The ROC curve is a two-dimensional graph plotting sensitivity (true-positive rate)
against the false-positive rate (1 - specificity) at various cut-off values; the closer the curve is to the
upper left corner, the better the model. The global performance of the final neural fuzzy model is
quantified using the AUC. An AUC of 1 indicates a perfect model, whereas a model with an AUC of 0.5 is
non-informative. The qualitative relationship between AUC and the prediction capability of models can
be categorized into the following groups: 0.5–0.6 (poor); 0.6–0.7 (average); 0.7–0.8 (good); 0.8–0.9
(very good); and 0.9–1 (excellent) (Kantardzic, 2011; Tien Bui et al., 2016c).
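One way to compute the AUC without drawing the curve is the pairwise (rank-based) formulation: the AUC equals the fraction of flood/non-flood pairs in which the flood point receives the higher score, with ties counted as one half. The sketch below uses this formulation as an illustrative alternative; it is not necessarily how the AUC values in this study were obtained, and the labels and scores are made up.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = flood, 0 = non-flood) and susceptibility scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc_value = auc(labels, scores)
```

Three of the four flood/non-flood pairs are ranked correctly in this example. The double loop is quadratic in the number of samples, which is acceptable for datasets of the size used here.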
4. Proposed novel integration model based on metaheuristic optimization
algorithms and neural fuzzy Inference model (MONF) for flood
susceptibility mapping
This research aims to propose and verify a novel soft computing approach (namely
MONF) that combines a neural fuzzy inference system with two metaheuristic optimization
algorithms (EG and PSO) for flood susceptibility modeling. This part of the paper describes the
proposed MONF model. Data preparation and processing were carried out using
ArcGIS 10.2 and IDRISI Selva 17.01. The neural fuzzy inference system is available in the
Matlab Fuzzy Logic Toolbox (The MathWorks, 2014). The proposed MONF model was
programmed by the authors in the Matlab 2014b environment. In addition, a C++ application was
developed to convert the flood susceptibility indices to a GIS format for ArcGIS 10.2. The structure
of the proposed MONF model for flood susceptibility mapping is shown in Fig. 4.
(1) GIS database
The first stage is data preparation and processing, in which historical flood locations,
topographic maps, Landsat 8 OLI imagery, geological maps, and precipitation data were collected,
processed, and compiled. The result was then used to construct a GIS database that consists of the
flood inventory map and 10 influencing factors (slope, elevation, curvature, TWI, SPI, distance to
river, stream density, NDVI, lithology, and rainfall). Since these factors were constructed with
categorical classes (see Section 2.3), a transformation process suggested by Tien Bui et al. (2012b)
was used to convert these categorical classes to numeric values. Accordingly, each class of these
maps was assigned an attribute value and then rescaled into the range 0.01 to 0.99 using
max-min normalization. This avoids negative effects when the differences in magnitude of the
categorical values are large (Tien Bui et al., 2012c).
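The rescaling step can be sketched as an ordinary max-min normalization mapped onto the interval 0.01 to 0.99. The class attribute values below are hypothetical, and the handling of a factor whose classes all share one value (mapped to the midpoint) is an assumption of this sketch.

```python
def rescale(values, low=0.01, high=0.99):
    """Max-min normalization of attribute values into [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Assumption: a constant factor is mapped to the middle of the range.
        return [(low + high) / 2.0 for _ in values]
    return [low + (v - v_min) * (high - low) / (v_max - v_min) for v in values]


# Hypothetical attribute values assigned to three classes of one factor.
class_values = [1.0, 2.0, 3.0]
rescaled = rescale(class_values)
```

The smallest attribute value maps to 0.01, the largest to 0.99, and intermediate values are placed linearly in between.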
In flood modeling, prediction models should be validated against unknown future flood
occurrences; without this task, the models have no scientific meaning. Thus, flood locations
that are not employed during model building are used to validate the resulting map. For this
purpose, the flood points in the flood inventory map were randomly partitioned into
two subsets with a ratio of 70/30 for training (54 locations) and validating (22 locations) the models,
respectively (Hoang and Tien Bui, 2016; Tien Bui et al., 2016c). These flood points were assigned
a value of ‘1’. The same number of non-flood points were randomly generated from non-flood areas of the
study area and assigned a value of ‘0’. Finally, a sampling process was carried out on the ten influencing
factors to construct the training and validation datasets. The training dataset was used to build
flood susceptibility models, whereas the validation dataset was employed to estimate the prediction
capability of the models.
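The random 70/30 partition can be illustrated as follows. The integer identifiers, the fixed seed, and rounding the training size up are assumptions of this sketch; rounding up is chosen only because it reproduces the 54/22 split reported above for 76 flood locations.

```python
import math
import random


def split_points(points, train_fraction=0.7, seed=0):
    """Randomly partition points into training and validation subsets."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    # Assumption: the training size is rounded up.
    n_train = math.ceil(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers for the 76 flood locations.
flood_ids = list(range(76))
training, validation = split_points(flood_ids)
```

The same routine could be applied to the randomly generated non-flood points so that both classes are split in the same proportion.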
Fig. 4. The structure of the proposed MONF model based on metaheuristic optimization and neural
fuzzy Inference model for flood susceptibility mapping.
(2) Model configuration:
The initial neural fuzzy model was generated using the training dataset, where fuzzy sets and fuzzy
membership values were derived using the Gaussian membership function. Because the neural fuzzy
model works more efficiently when the training data are represented more concisely (Tien Bui et
al., 2012c), the fuzzy c-means clustering algorithm, which is capable of distilling natural groups
from data, was used (Abdulshahed et al., 2015). A trial-and-error test was conducted to analyze the
relationship between RMSE and the number of clusters, and 15 clusters were found to be the best for
this study. These clusters are used to generate the if-then rules of the neural fuzzy model, where
the best values of the antecedent and consequent parameters of these rules are determined by the
metaheuristic optimization in the next step.
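For readers unfamiliar with fuzzy c-means, the following is a minimal generic sketch of the algorithm, not the implementation used in the study. The fuzzifier m = 2, the fixed iteration count, the seed, and the six one-dimensional toy points are all assumptions; the study used 15 clusters on the ten-factor training data.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means; returns cluster centers and the membership matrix."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([r / total for r in row])
    centers = []
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership**m.
        centers = []
        for c in range(n_clusters):
            w = [u[i][c] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum for d in range(dim)]
            )
        # Memberships are proportional to (squared distance)**(-1/(m-1)).
        for i in range(n):
            d2 = [
                max(sum((points[i][d] - centers[c][d]) ** 2 for d in range(dim)), 1e-12)
                for c in range(n_clusters)
            ]
            inv = [dist ** (-1.0 / (m - 1.0)) for dist in d2]
            inv_sum = sum(inv)
            u[i] = [v / inv_sum for v in inv]
    return centers, u


# Toy data with two obvious groups (hypothetical values).
points = [[0.10], [0.12], [0.08], [0.90], [0.88], [0.92]]
centers, memberships = fuzzy_c_means(points, n_clusters=2)
```

Each resulting cluster would correspond to one if-then rule, with the cluster center suggesting initial centers for the Gaussian membership functions of that rule.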
(3) Metaheuristic optimization
The aim of this step is to find the optimal antecedent and consequent parameters of the neural fuzzy
model using the EG and PSO algorithms. Because the diversity of the initial populations is
influenced by the number of individuals (in EG) and particles (in PSO), a trial-and-error test was
conducted, and 30 individuals and 40 particles were found to be the best for EG and PSO,
respectively. For EG, a crossover rate of 0.4 and a mutation rate of 0.6 were used for all the
calculations, because these rates have been shown to maintain sufficient diversity of the population
(Musharavati, 2010; Sarker and Ray, 2010). For PSO, an inertia weight of 0.9 was used because of
its ability to yield high model performance (Poli et al., 2007).
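The PSO update can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' code: only the 40 particles and the inertia weight of 0.9 come from the text, while the acceleration coefficients (1.5), the velocity clamp, the search bounds, the iteration count, and the two-parameter toy model standing in for the neural fuzzy parameters are all hypothetical.

```python
import math
import random


def rmse(params, xs, ys):
    """RMSE of a toy linear model y = a*x + b (stand-in for the fuzzy model)."""
    a, b = params
    return math.sqrt(sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def pso(fitness, dim, n_particles=40, n_iter=200, w=0.9, c1=1.5, c2=1.5,
        lo=-1.0, hi=1.0, seed=0):
    """Minimize `fitness` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    v_max = 0.2 * (hi - lo)  # assumed velocity clamp
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: p_val[i])
    g_best, g_val = p_best[g][:], p_val[g]
    history = [g_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (p_best[i][d] - pos[i][d])
                     + c2 * r2 * (g_best[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = pos[i][:], val
                if val < g_val:
                    g_best, g_val = pos[i][:], val
        history.append(g_val)  # best RMSE found so far
    return g_best, g_val, history


xs = [i / 10 for i in range(11)]
ys = [0.5 * x + 0.2 for x in xs]
best, best_rmse, history = pso(lambda p: rmse(p, xs, ys), dim=2)
```

In the MONF setting, each particle position would instead encode the full vector of antecedent and consequent parameters.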
(4) Evaluation of fitness and (5) Stopping criteria
To evaluate the performance of the neural fuzzy model, the RMSE function in Eq. 8 was used. The
lower the RMSE, the better the neural fuzzy model. During the optimization process, the EG and PSO
algorithms explore various combinations of antecedent and consequent parameters. In each
generation, the EG algorithm creates chromosomes and evaluates them with RMSE to select the best
chromosome population. In contrast, the PSO algorithm updates the positions of particles based on
RMSE to find the best position of the swarm. This is an iterative process, and a maximum of 1000
generations was used as the stopping criterion. The final RMSEs of the best chromosomes and the
best swarm position are compared, and then the optimal antecedent and consequent parameters are
derived.
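The generational loop of the EG algorithm can be sketched in the same spirit. The code below is a generic real-coded genetic algorithm and not the authors' implementation: the population size (30), crossover rate (0.4), and mutation rate (0.6) follow the text, whereas tournament selection, blend crossover, Gaussian mutation, elitism, the bounds, the reduced generation count, and the toy fitness function are assumptions.

```python
import math
import random


def rmse(params, xs, ys):
    """RMSE of a toy linear model y = a*x + b (stand-in for the fuzzy model)."""
    a, b = params
    return math.sqrt(sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def genetic_search(fitness, dim, pop_size=30, n_gen=100, cx_rate=0.4,
                   mut_rate=0.6, lo=-1.0, hi=1.0, seed=0):
    """Minimize `fitness` with a simple elitist real-coded genetic algorithm."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=fitness)[:]
    best_val = fitness(best)
    history = [best_val]
    for _ in range(n_gen):
        children = [best[:]]  # elitism: carry over the best chromosome
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            p1 = min(rng.sample(pop, 2), key=fitness)
            p2 = min(rng.sample(pop, 2), key=fitness)
            child = p1[:]
            if rng.random() < cx_rate:
                alpha = rng.random()  # blend crossover
                child = [alpha * a + (1 - alpha) * b for a, b in zip(p1, p2)]
            if rng.random() < mut_rate:
                d = rng.randrange(dim)  # Gaussian mutation of one gene
                child[d] = max(lo, min(hi, child[d] + rng.gauss(0.0, 0.1)))
            children.append(child)
        pop = children
        cand = min(pop, key=fitness)
        cand_val = fitness(cand)
        if cand_val < best_val:
            best, best_val = cand[:], cand_val
        history.append(best_val)  # stopping criterion here: fixed n_gen
    return best, best_val, history


xs = [i / 10 for i in range(11)]
ys = [0.5 * x + 0.2 for x in xs]
ga_best, ga_rmse, ga_history = genetic_search(lambda p: rmse(p, xs, ys), dim=2)
```

The final RMSE of the best chromosome could then be compared with the RMSE of the best swarm position, and the parameter set with the lower value retained.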
(6) Final flood susceptibility model
Using the optimal antecedent and consequent parameters, the final neural fuzzy model was
constructed and subsequently used to calculate the flood susceptibility index for all pixels in the
study area. The degree-of-fit of the model with the training dataset and the prediction capability
of the model are assessed using RMSE, MAE, the ROC curve, and AUC, as mentioned in section 3.3.
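The three scalar measures can be computed as in the following minimal sketch. The AUC is obtained here from its rank interpretation (the probability that a randomly chosen flood point scores higher than a randomly chosen non-flood point), which is equivalent to the area under the ROC curve; the six labels and scores are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison; ties count one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical targets (1 = flood, 0 = non-flood) and model outputs.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
scores = (rmse(y_true, y_pred), mae(y_true, y_pred), auc(y_true, y_pred))
```

For any set of errors, MAE never exceeds RMSE, so the two measures are best read together rather than ranked against each other.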
5. Results and discussion
Although the flood influencing factors have been selected based on an analysis of flood occurrence
and the characteristics of the study area, it is logical to expect that the degree of impact of
these factors differs, and in some cases, some factors may have no influence on flooding.
Therefore, the predictive power of the flood influencing factors should be analyzed, and factors
with no predictive value should be eliminated. This helps to reduce noise and enhance the
performance of the resulting model (Tien Bui et al., 2016d).
For this purpose, the Pearson correlation technique (Guyon and Elisseeff, 2003) was adopted to
quantify the predictive power of the ten flood influencing factors in this study. The Pearson
technique was selected because of its efficiency for feature selection in machine learning (Canuto
et al., 2008; Lloyd et al., 2011). The higher the Pearson correlation value, the more useful the
influencing factor is to the flood model. The analysis result is shown in Table 1. The column
‘average predictive power’ is the average Pearson correlation value, reported with its standard
deviation, derived from the Pearson correlation values calculated in a 10-fold cross-validation
process.
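One plausible reading of this procedure is sketched below. It is an assumption, not a description of the software actually used, that the correlation is computed on the nine retained folds in each round and that its absolute value is averaged; the toy factor and label vectors are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def cv_predictive_power(factor, label, n_folds=10, seed=0):
    """Mean and sample standard deviation of |r| over the folds."""
    rng = random.Random(seed)
    idx = list(range(len(factor)))
    rng.shuffle(idx)
    values = []
    for k in range(n_folds):
        held_out = set(idx[k::n_folds])
        keep = [i for i in idx if i not in held_out]
        r = pearson([factor[i] for i in keep], [label[i] for i in keep])
        values.append(abs(r))
    mean = sum(values) / n_folds
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n_folds - 1))
    return mean, sd


# Toy data: a factor that separates the two classes perfectly.
label = [i % 2 for i in range(20)]
factor = [0.2 + 0.6 * l for l in label]
mean_r, sd_r = cv_predictive_power(factor, label)
```

A factor whose average value stays close to zero across folds would be a candidate for elimination.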
A clear distinction can be observed among the predictive powers of the 10 influencing factors in
relation to flooding in the study area. The highest predictive powers are for the elevation (0.599)
and the stream density (0.596), followed by the NDVI (0.568), the distance to river (0.362), the
slope (0.307), the TWI (0.282), the SPI (0.241), the lithology (0.134), the rainfall (0.084), and
the curvature (0.030). Because all the factors have predictive power for flooding, none of them was
eliminated in this analysis.
Table 1 Predictive power of flood influencing factors for the study area using the Pearson
correlation with the 10-fold cross validation.
No.  Flood influencing factor  Average predictive power  Standard deviation
1    Elevation                 0.599                     0.016
2    Stream density            0.596                     0.021
3    NDVI                      0.568                     0.016
4    Distance to river         0.362                     0.026
5    Slope                     0.307                     0.037
6    TWI                       0.282                     0.033
7    SPI                       0.241                     0.039
8    Lithology                 0.134                     0.031
9    Rainfall                  0.084                     0.035
10   Curvature                 0.030                     0.016
The final structure of the neural fuzzy model in this study is shown in Fig. 5. The model consists
of 10 input factors, one output, and 15 rules with 300 antecedent parameters and 165 consequent
parameters. Using the training dataset, the MONF model was trained with 1000 iterations. The
results (Tables 2, 3 and Fig. 6) show that the RMSEs of the MONF model are 0.306 and 0.362 on the
training dataset and the validation dataset, respectively. These values are clearly lower than the
standard deviation of the target values (0.5) in the two datasets, indicating that the model has
high performance. The AUC of 0.962 on the training dataset (Table 2) indicates that the global
degree-of-fit of the model is 96.2% (Fig. 6).
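The parameter counts and the reference value of 0.5 can be checked with a short calculation. The sketch assumes two parameters (center and width) per Gaussian membership function and first-order consequents with one coefficient per input plus a constant term; this interpretation is consistent with the reported counts but is not stated explicitly in the text.

```python
import statistics


def parameter_counts(n_inputs, n_rules, mf_params=2):
    """Antecedent and consequent parameter counts of a rule-based fuzzy model."""
    antecedent = n_inputs * n_rules * mf_params  # one Gaussian MF per input per rule
    consequent = n_rules * (n_inputs + 1)        # linear coefficients plus a constant
    return antecedent, consequent


counts = parameter_counts(n_inputs=10, n_rules=15)

# Balanced binary targets (54 flood and 54 non-flood training points).
targets = [1] * 54 + [0] * 54
target_sd = statistics.pstdev(targets)
```

With 10 inputs and 15 rules this gives 300 and 165 parameters, and the population standard deviation of a balanced 0/1 target is 0.5, which is the RMSE a constant prediction of 0.5 would achieve.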
Because RMSE is a quadratic scoring index (Chai and Draxler, 2014) that does not describe possible
variations in the error distribution, MAE is additionally used. The MAEs, 0.094 and 0.130 on the
training dataset and the validation dataset, respectively (Tables 2, 3), are clearly lower than the
RMSEs, indicating that the model has low variance of the individual errors in the two datasets.
The prediction capability of the MONF model is assessed using AUC on the validation dataset, and
AUC = 0.911 (Fig. 6) demonstrates high prediction capability.
Fig. 5. Structure of the proposed MONF model for flood susceptibility mapping for the study area.
Table 2 Performance of the proposed MONF model, the J48DT model, the RF model, the MLP
Neural Nets model, the SVM model, and the ANFIS model using the training dataset.
No.  Flood susceptibility model  Training dataset
                                 RMSE   MAE    AUC
1    MONF                        0.306  0.094  0.962
2    J48DT                       0.393  0.246  0.852
3    RF                          0.327  0.237  0.934
4    MLP Neural Nets             0.369  0.178  0.909
5    SVM                         0.318  0.207  0.938
6    ANFIS                       0.497  0.211  0.905
Table 3 Prediction capability of the proposed MONF model, the J48DT model, the RF model, the
MLP Neural Nets model, the SVM model, and the ANFIS model using the validation dataset.
No.  Flood susceptibility model  Validation dataset
                                 RMSE   MAE    AUC
1    MONF                        0.362  0.130  0.911
2    J48DT                       0.352  0.230  0.895
3    RF                          0.354  0.257  0.894
4    MLP Neural Nets             0.380  0.193  0.903
5    SVM                         0.356  0.246  0.905
6    ANFIS                       0.615  0.388  0.767
Fig. 6. ROC curve and AUC of the proposed MONF model using (a) the training dataset and (b)
validation dataset.
6. Model comparison and generation of flood susceptibility map
Because the major purpose of flood susceptibility modeling is to identify the zones with a higher
probability of flood occurrence, the usability of the proposed MONF model for flood susceptibility
mapping is further assessed through a comparison with models derived from current state-of-the-art
machine learning methods using the same data, including J48DT, RF, MLP Neural Nets, SVM, and ANFIS.
J48DT is a Java reimplementation of the C4.5 algorithm proposed by Quinlan (1993) that constructs
a hierarchical tree-like classifier with internal nodes, leaf nodes, and branches. For flood
susceptibility mapping in this study, the J48DT model was constructed with a minimum of 1 sample
per leaf and a confidence factor of 0.35 for tree pruning. These are the best parameter values
determined in a test suggested by Tien Bui et al. (2014). RF is a variation of the bagging ensemble
that combines decision tree classifiers (Breiman, 2001) to form the final flood susceptibility
model. Each individual decision tree classifier is constructed from a bootstrapped replica of the
training data, possibly with a different feature subset (Polikar, 2006). For this study, the number
of trees was set to 500, as suggested by Catani et al. (2013).
MLP Neural Nets is a popular method for modeling complex problems such as floods (Darras et al.,
2015; Dawson and Wilby, 2001). The structure of the MLP Neural Nets model for flood susceptibility
mapping in this study was determined using the same method as in Tien Bui et al. (2016d), and a
structure with an input layer of 10 neurons, one hidden layer with 6 neurons, and an output layer
was selected. The logistic sigmoid was selected as the activation function, and the other
parameters, i.e. learning rate, momentum, and training time, were set to 0.3, 0.2, and 500,
respectively (Were et al., 2015). For SVM, the radial basis function (RBF) kernel (Hoang and Tien
Bui, 2016) was employed. The SVM model was constructed with the best kernel width (γ = 0.195) and
regularization parameter (C = 10), which were determined using the grid-search method (Tien Bui et
al., 2012a). For ANFIS, the Gaussian membership function was used, whereas the parameters for the
subtractive clustering algorithm (Akbulut et al., 2004) were 0.55, 1.52, 0.5, and 0.25 for the
range of influence, squash factor, accept ratio, and reject ratio, respectively. These are the best
values for the study area and were determined by a trial-and-error process. For training ANFIS, the
default hybrid learning algorithm was used, and the training data were split into a building
dataset and a checking dataset, as suggested by Tien Bui et al. (2012c), to control overfitting.
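As an illustration of the role of γ in the SVM, the RBF kernel can be written in a few lines. The sketch assumes the common parameterization K(x, z) = exp(-γ‖x - z‖²); the software used in the study may define the kernel width differently, and the two sample vectors are hypothetical.

```python
import math


def rbf_kernel(x, z, gamma=0.195):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Two hypothetical pixels described by rescaled factor values.
pixel_a = [0.10, 0.50, 0.99]
pixel_b = [0.10, 0.50, 0.01]
similarity = rbf_kernel(pixel_a, pixel_b)
```

Under this form, a smaller γ makes distant pixels appear more similar, which is why γ is tuned jointly with C in a grid search.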
Fig. 7. Graphic curve of the flood susceptibility map for the study area.
Fig. 8. Flood susceptibility map using the proposed MONF model for the study area.
The results of the flood susceptibility models using J48DT, RF, SVM, MLP Neural Nets, and ANFIS
are shown in Table 2. The results show that all five flood models have high performance on the
training data. The highest degree of fit is for the SVM model (RMSE = 0.318, MAE = 0.207, and
AUC = 0.938), closely followed by the RF model (RMSE = 0.327, MAE = 0.237, and AUC = 0.934), the
MLP Neural Nets model (RMSE = 0.369, MAE = 0.178, and AUC = 0.909), the J48DT model (RMSE = 0.393,
MAE = 0.246, and AUC = 0.852), and the ANFIS model (RMSE = 0.497, MAE = 0.211, and AUC = 0.905).
Compared to the proposed MONF model, the performance of the five flood models is clearly lower.
The prediction capability of the five flood models was assessed using the validation dataset, and
the results are shown in Table 3. In terms of MAE and AUC, the prediction capability of all five
models is lower than that of the MONF model. The RMSE of the MONF model (0.362) is slightly higher
than those of the J48DT model (0.352), the RF model (0.354), and the SVM model (0.356), but lower
than those of the MLP Neural Nets model (0.380) and the ANFIS model (0.615); however, the MAE and
AUC of the MONF model are better than those of all the other models (Table 3). Based on the above
analysis, it can be concluded that the proposed MONF model attained the most desirable overall
performance on both the training dataset and the validation dataset; therefore, the MONF model is
an effective tool for flood susceptibility modeling.
Because the MONF model was deemed best suited for the flood dataset, the model was then used to
calculate the flood susceptibility index for all the pixels of the study area. The result was
transformed to a GIS format to derive a flood susceptibility map in ArcGIS 10.2. The map was
overlaid on the flood inventory map to calculate the percentage of flood locations and the
percentage of the susceptibility map, and then a graphic curve was constructed (Fig. 7). The map
(Fig. 8) was then visualized using five susceptibility classes: very high (10%), high (10%), medium
(15%), low (15%), and very low (50%) (Table 4).
Table 4 Description of the five flood susceptibility classes obtained from the MONF model for
the study area.
No.  Flood index range  Flood susceptibility (%)  Verbal expression  Flood location (%)  Area (km²)
1    1.465 – 0.689      100-90                    Very high          67.1                280.3
2    0.688 – 0.505      90-80                     High               19.7                280.3
3    0.504 – 0.334      80-65                     Medium             6.6                 420.5
4    0.333 – 0.192      65-50                     Low                6.6                 420.5
5    0.192 – 0.000      50-0                      Very low           0.0                 1401.6
Analysis of the graphic curve and Table 4 shows that 10% of the study area was classified into the
very high class, yet this class contains 67.1% of the total flood locations, whereas around 19.7%
of the total flood locations are located in the high class, which covers 10% of the total study
area. The medium class and the low class together cover 30% of the total study area; however, only
6.6% of the total flood locations are located in each class. In contrast, 50% of the total study
area is classified into the very low class, which contains no flood locations.
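The class assignment by area share and the resulting share of flood locations per class can be sketched as follows. This is an illustrative reconstruction, not the ArcGIS workflow used in the study: only the area shares (10%, 10%, 15%, 15%, 50%) come from the text, and the 20 index values and flood flags are hypothetical.

```python
def classify_by_area(index_values, area_shares=(0.10, 0.10, 0.15, 0.15, 0.50)):
    """Assign class 0 (very high) to 4 (very low) by ranked share of pixels."""
    n = len(index_values)
    order = sorted(range(n), key=lambda i: index_values[i], reverse=True)
    # Cumulative pixel counts at which each class ends.
    bounds, cum = [], 0.0
    for share in area_shares:
        cum += share
        bounds.append(round(cum * n))
    classes = [0] * n
    c = 0
    for rank, i in enumerate(order):
        while c < len(bounds) - 1 and rank >= bounds[c]:
            c += 1
        classes[i] = c
    return classes


def flood_share_per_class(classes, is_flood, n_classes=5):
    """Percentage of all flood locations that fall in each class."""
    total = sum(is_flood)
    return [
        100.0 * sum(f for c, f in zip(classes, is_flood) if c == k) / total
        for k in range(n_classes)
    ]


# Hypothetical susceptibility indices for 20 pixels; the top three are flood points.
index_values = list(range(20))
is_flood = [1 if v >= 17 else 0 for v in index_values]
classes = classify_by_area(index_values)
shares = flood_share_per_class(classes, is_flood)
```

Accumulating these shares from the very high class downward gives the points of a curve such as the one in Fig. 7; in Table 4 the first two classes together account for 67.1% + 19.7% = 86.8% of the flood locations.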
7. Concluding remarks
This study proposes and verifies a new hybrid intelligent approach (named MONF) that integrates
two meta-heuristic optimization algorithms, EG and PSO, with the neural fuzzy model for flood
susceptibility mapping, with a case study of the Tuong Duong district in the Nghe An province of
Vietnam. This is a typical mountainous highland district that is frequently hit by tropical
cyclones, and floods occur every year during the tropical cyclone season.
The proposed MONF model was established and evaluated based on the GIS database of the study
area with 76 flood locations and ten influencing factors. The degree-of-fit and the prediction
capability of the MONF model were assessed using RMSE, MAE, the ROC curve, and AUC. In
addition, five flood models derived from J48DT, RF, MLP Neural Nets, SVM, and ANFIS were
used to compare and confirm the usability of the proposed model.
Experimental results with the MONF model established from the GIS database demonstrate that the
proposed model attained high performance on both the training dataset and the validation dataset.
The degree-of-fit and the prediction capability are 96.2% and 91.1%, respectively, indicating
satisfactory accuracy for flood susceptibility mapping. The ten flood influencing factors revealed
positive predictive powers for flooding, indicating that these factors were selected, processed,
and coded successfully. The elevation, the stream density, and the NDVI have the highest predictive
powers for flood susceptibility in this study.
The flood susceptibility index produced by the MONF model varies from 0 to 1.465; however, flood
indices below a threshold of 0.5 are dominant. The flood susceptibility index was transformed into
the susceptibility map by means of five classes. Interpretation of these classes shows that the
very high and high classes cover a small part of the study area (20%) but contain 86.8% of the
total flood locations, whereas 50% of the study area falls in the very low class with no flood
locations, indicating that the MONF model produced a highly accurate result.
Overall, the contribution of this research to the body of flood modelling knowledge can be
highlighted as follows: (i) the proposed neural fuzzy model is capable of producing a high-quality
flood susceptibility map; therefore, it should be used for flood susceptibility mapping in other
flood-prone regions; (ii) the integration of the two meta-heuristic algorithms, EG and PSO, with
the neural fuzzy model provides a solution for building the flood susceptibility model
autonomously; (iii) the MONF model outperforms the J48DT, RF, MLP Neural Nets, SVM, and ANFIS
models; therefore, the proposed MONF model is an alternative tool for flood susceptibility
modeling.
In summary, integrating the advantages of the neural fuzzy model with the strengths of the two
optimization methods yielded higher efficiency of the proposed model for flood susceptibility
mapping in the tropical cyclone area of Vietnam. The result may be useful to planners and decision
makers for the sustainable management of flood-prone areas in the study area. It is also
recommended that meta-heuristic algorithms be utilized as tools for pertinent studies in
hydrological sciences.
Acknowledgement
This research was funded by the Project No. B2014-02-21 (Ministry of Training and Education of
Vietnam) and was partially supported by the Geographic Information System group, University
College of Southeast Norway.
Conflict of Interest: The authors declare that there is no conflict of interest.
References
Abdulshahed, A.M., Longstaff, A.P., Fletcher, S., 2015. The application of ANFIS prediction models for thermal error
compensation on CNC machine tools. Applied Soft Computing, 27: 158-168.
DOI:http://dx.doi.org/10.1016/j.asoc.2014.11.012
Akbulut, S., Hasiloglu, A.S., Pamukcu, S., 2004. Data generation for shear modulus and damping ratio in reinforced
sands using adaptive neuro-fuzzy inference system. Soil Dynamics and Earthquake Engineering, 24(11): 805-
814.
Alfieri, L., Feyen, L., Dottori, F., Bianchi, A., 2015. Ensemble flood risk assessment in Europe under high end climate
scenarios. Global Environmental Change, 35: 199-212.
Arouri, M., Nguyen, C., Youssef, A.B., 2015. Natural Disasters, Household Welfare, and Resilience: Evidence from
Rural Vietnam. World Development, 70: 59-77.
Ayalew, L., Yamagishi, H., 2005. The application of GIS-based logistic regression for landslide susceptibility mapping
in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology, 65(1-2): 15-31.
DOI:10.1016/j.geomorph.2004.06.010
Bai, T. et al., 2015. Synergistic gains from the multi-objective optimal operation of cascade reservoirs in the Upper
Yellow River basin. Journal of Hydrology, 523: 758-767.
Beven, K., Kirkby, M., Schofield, N., Tagg, A., 1984. Testing a physically-based flood forecasting model
(TOPMODEL) for three UK catchments. Journal of Hydrology, 69(1): 119-143.
Breiman, L., 2001. Random forests. Machine Learning, 45(1): 5-32. DOI:10.1023/a:1010933404324
Brown, K., Pty, R., 2009. Vietnam water sector review (Report). Hanoi: Ministry of Natural Resources and
Environment.
Canuto, A.M.P., Santana, L.E.A., Abreu, M.C.C., Xavier Jr, J.C., 2008. An analysis of data distribution in the ClassAge
system: An agent-based system for classification tasks. Neurocomputing, 71(16–18): 3319-3325.
DOI:http://dx.doi.org/10.1016/j.neucom.2008.01.032
Catani, F., Lagomarsino, D., Segoni, S., Tofani, V., 2013. Landslide susceptibility estimation by random forests
technique: sensitivity and scaling issues. Nat. Hazards Earth Syst. Sci., 13(11): 2815-2831.
DOI:10.5194/nhess-13-2815-2013
Ceola, S., Laio, F., Montanari, A., 2014. Satellite nighttime lights reveal increasing human exposure to floods
worldwide. Geophysical Research Letters, 41(20): 7184-7190.
Chai, T., Draxler, R.R., 2014. Root mean square error (RMSE) or mean absolute error (MAE)?–Arguments against
avoiding RMSE in the literature. Geoscientific Model Development, 7(3): 1247-1250.
Chang, F.-J., Tsai, M.-J., 2016. A nonlinear spatio-temporal lumping of radar rainfall for modeling multi-step-ahead
inflow forecasts by data-driven techniques. Journal of Hydrology, 535: 256-269.
DOI:http://dx.doi.org/10.1016/j.jhydrol.2016.01.056
Cloke, H., Pappenberger, F., 2009. Ensemble flood forecasting: a review. Journal of Hydrology, 375(3): 613-626.
Cook, A., Merwade, V., 2009. Effect of topographic data, geometric configuration and modeling approach on flood
inundation mapping. Journal of Hydrology, 377(1): 131-142.
Correia, F.N., Da Silva, F.N., Ramos, I., 1999. Floodplain management in urban developing areas. Part II. GIS-based
flood analysis and urban growth modelling. Water Resources Management, 13(1): 23-37.
Darras, T., Borrell Estupina, V., Vayssade, B., Johannet, A., Pistre, S., 2015. Identification of spatial and temporal
contributions of rainfalls to flash floods using neural network modelling: case study on the Lez basin (southern
France). Hydrology and Earth System Sciences, 19(10): 4397-4410.
Dawson, C., Wilby, R., 2001. Hydrological modelling using artificial neural networks. Progress in physical Geography,
25(1): 80-108.
Dottori, F., Martina, M.L.V., Figueiredo, R., 2016. A methodology for flood susceptibility and vulnerability analysis in
complex flood scenarios. Journal of Flood Risk Management: n/a-n/a. DOI:10.1111/jfr3.12234
Du, J., Fang, J., Xu, W., Shi, P., 2013. Analysis of dry/wet conditions using the standardized precipitation index and its
potential usefulness for drought/flood monitoring in Hunan Province, China. Stochastic environmental research
and risk assessment, 27(2): 377-387.
Eberhart, R.C., Shi, Y., 2001. Particle swarm optimization: developments, applications and resources, Evolutionary
Computation, 2001. Proceedings of the 2001 Congress on. IEEE, pp. 81-86.
Fekete, A., 2009. Validation of a social vulnerability index in context to river-floods in Germany. Nat. Hazards Earth
Syst. Sci., 9(2): 393-403. DOI:10.5194/nhess-9-393-2009
Fenicia, F., Savenije, H.H., Matgen, P., Pfister, L., 2008. Understanding catchment behavior through stepwise model
concept improvement. Water Resources Research, 44(1).
Fernandez, G., Uy, N., Shaw, R., 2012. Chapter 11 Community-Based Disaster Risk Management Experience of the
Philippines. Community-Based Disaster Risk Reduction (Community, Environment and Disaster Risk
Management, Volume 10) Emerald Group Publishing Limited, 10: 205-231.
Fortin, J.-P. et al., 2001. Distributed watershed model compatible with remote sensing and GIS data. I: Description of
model. Journal of Hydrologic Engineering, 6(2): 91-99.
Ghalkhani, H., Golian, S., Saghafian, B., Farokhnia, A., Shamseldin, A., 2013. Application of surrogate artificial
intelligent models for real‐time flood routing. Water and Environment Journal, 27(4): 535-548.
Glenn, E.P. et al., 2012. Roles of saltcedar (Tamarix spp.) and capillary rise in salinizing a non-flooding terrace on a
flow-regulated desert river. Journal of Arid Environments, 79: 56-65.
Goldberg, D.E., Holland, J.H., 1988. Genetic algorithms and machine learning. Mach Learn, 3(2): 95-99.
Güçlü, Y.S., Şen, Z., 2016. Hydrograph estimation with fuzzy chain model. Journal of Hydrology, 538: 587-597.
DOI:http://dx.doi.org/10.1016/j.jhydrol.2016.04.057
Guyon, I., Elisseeff, A., 2003. An introduction to variable and feature selection. The Journal of Machine Learning
Research, 3: 1157-1182.
Haq, M., Akhtar, M., Muhammad, S., Paras, S., Rahmatullah, J., 2012. Techniques of remote sensing and GIS for flood
monitoring and damage assessment: a case study of Sindh province, Pakistan. The Egyptian Journal of Remote
Sensing and Space Science, 15(2): 135-141.
Heerdegen, R.G., Beran, M.A., 1982. Quantifying source areas through land surface curvature and shape. Journal of
Hydrology, 57(3-4): 359-373.
Heitmuller, F.T., Hudson, P.F., Asquith, W.H., 2015. Lithologic and hydrologic controls of mixed alluvial–bedrock
channels in flood-prone fluvial systems: Bankfull and macrochannels in the Llano River watershed, central
Texas, USA. Geomorphology, 232: 1-19.
Hellendoorn, H., Driankov, D., 2012. Fuzzy model identification: selected approaches. Springer Science & Business
Media.
Hoang, N.-D., Tien Bui, D., 2016. A Novel Relevance Vector Machine Classifier with Cuckoo Search Optimization for
Spatial Prediction of Landslides. Journal of Computing in Civil Engineering.
DOI:10.1061/(ASCE)CP.1943-5487.0000557
Hong, Y.-S.T., White, P.A., 2009. Hydrological modeling using a dynamic neuro-fuzzy system with on-line and local
learning algorithm. Advances in water resources, 32(1): 110-119.
Hostache, R., Lai, X., Monnier, J., Puech, C., 2010. Assimilation of spatially distributed water levels into a shallow-
water flood model. Part II: Use of a remote sensing image of Mosel River. Journal of hydrology, 390(3): 257-
268.
Jang, J.S.R., Sun, C.T., Mizutani, E., 1997. Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning
and Machine Intelligence (Matlab Curriculum Series). Prentice Hall, 614 pp.
Jayakrishnan, R., Srinivasan, R., Santhi, C., Arnold, J., 2005. Advances in the application of the SWAT model for water
resources management. Hydrological processes, 19(3): 749-762.
Kantardzic, M., 2011. Data mining: concepts, models, methods, and algorithms. John Wiley & Sons, Hoboken, New
Jersey.
Kazakis, N., Kougias, I., Patsialis, T., 2015. Assessment of flood hazard areas at a regional scale using an index-based
approach and Analytical Hierarchy Process: Application in Rhodope–Evros region, Greece. Science of the Total
Environment, 538: 555-563.
Kia, M.B. et al., 2012. An artificial neural network model for flood simulation using GIS: Johor River Basin, Malaysia.
Environ Earth Sci, 67(1): 251-264.
Kreft, S., Eckstein, D., Junghans, L., Kerestan, C., Hagen, U., 2014. Global climate risk index 2015. Who suffers most
from extreme weather events: 1-31.
Lee, M.-J., Kang, J.-e., Jeon, S., 2012. Application of frequency ratio model and validation for predictive flooded area
susceptibility mapping using GIS, Geoscience and Remote Sensing Symposium (IGARSS), 2012 IEEE
International. IEEE, pp. 895-898.
Li, X.H., Zhang, Q., Shao, M., Li, Y.L., 2012. A comparison of parameter estimation for distributed hydrological
modelling using automatic and manual methods, Advanced Materials Research. Trans Tech Publ, pp. 2372-2375.
Liu, K. et al., 2016. Coupling the k-nearest neighbor procedure with the Kalman filter for real-time updating of the
hydraulic model in flood forecasting. International Journal of Sediment Research.
Liu, Y., De Smedt, F., 2005. Flood modeling for complex terrain using GIS and remote sensed information. Water
Resources Management, 19(5): 605-624.
Lloyd, A.J., Beckmann, M., Favé, G., Mathers, J.C., Draper, J., 2011. Proline betaine and its biotransformation products
in fasting urine samples are potential biomarkers of habitual citrus fruit consumption. British Journal of
Nutrition, 106(06): 812-824.
Lohani, A., Kumar, R., Singh, R., 2012. Hydrological time series modeling: A comparison between adaptive neuro-
fuzzy, neural network and autoregressive techniques. Journal of Hydrology, 442: 23-35.
Loo, Y.Y., Billa, L., Singh, A., 2015. Effect of climate change on seasonal monsoon in Asia and its impact on the
variability of monsoon rainfall in Southeast Asia. Geoscience Frontiers, 6(6): 817-823.
DOI:http://dx.doi.org/10.1016/j.gsf.2014.02.009
Mitchell, M., 1998. An introduction to genetic algorithms. MIT press.
Mukerji, A., Chatterjee, C., Raghuwanshi, N.S., 2009. Flood forecasting using ANN, neuro-fuzzy, and neuro-GA
models. Journal of Hydrologic Engineering, 14(6): 647-652.
Musharavati, F., 2010. Process planning optimization in reconfigurable manufacturing systems. Universal-Publishers.
Nayak, P., Sudheer, K., Rangan, D., Ramasastri, K., 2005. Short‐term flood forecasting with a neurofuzzy model. Water
Resources Research, 41(4).
Nayak, P.C., Sudheer, K.P., Rangan, D.M., Ramasastri, K.S., 2004. A neuro-fuzzy computing technique for modeling
hydrological time series. Journal of Hydrology, 291(1-2): 52-66. DOI:10.1016/j.jhydrol.2003.12.010
Niedda, M., 2004. Upscaling hydraulic conductivity by means of entropy of terrain curvature representation. Water
resources research, 40(4).
Pham, B., Tien Bui, D., Pourghasemi, H., Indra, P., Dholakia, M.B., 2015. Landslide susceptibility assesssment in the
Uttarakhand area (India) using GIS: a comparison study of prediction capability of naïve bayes, multilayer
perceptron neural networks, and functional trees methods. Theoretical and Applied Climatology: 1-19.
DOI:10.1007/s00704-015-1702-9
Poli, R., Kennedy, J., Blackwell, T., 2007. Particle swarm optimization. Swarm intelligence, 1(1): 33-57.
Polikar, R., 2006. Ensemble based systems in decision making. Circuits and systems magazine, IEEE, 6(3): 21-45.
Pradhan, B., Shafiee, M., Pirasteh, S., 2009. Maximum Flood Prone Area Mapping using RADARSAT Images and
GIS: Kelantan River Basin. International Journal of Geoinformatics, 5(2).
Pulvirenti, L., Pierdicca, N., Chini, M., Guerriero, L., 2011. An algorithm for operational flood mapping from Synthetic
Aperture Radar (SAR) data based on the fuzzy logic. Natural Hazard and Earth System Sciences.
Qi, S. et al., 2009. Inundation extent and flood frequency mapping using LANDSAT imagery and digital elevation
models. GIScience & Remote Sensing, 46(1): 101-127.
Quinlan, J.R., 1993. C4.5: programs for machine learning. Morgan Kaufmann, San Mateo, CA, USA.
Reynaud, A., Nguyen, M.-H., 2012. Monetary Valuation of Flood Insurance in Vietnam, Mimeo, Toulouse School of
Economics.
Reynaud, A., Nguyen, M.-H., 2016. Valuing Flood Risk Reductions. Environmental Modeling & Assessment: 1-15.
DOI:10.1007/s10666-016-9500-z
Rezaeianzadeh, M., Tabari, H., Yazdi, A.A., Isik, S., Kalin, L., 2014. Flood flow forecasting using ANN, ANFIS and
regression models. Neural Computing and Applications, 25(1): 25-37.
Sahoo, B., Chatterjee, C., Raghuwanshi, N.S., Singh, R., Kumar, R., 2006. Flood estimation by GIUH-based Clark and
Nash models. Journal of Hydrologic Engineering, 11(6): 515-525.
Sarker, R.A., Ray, T., 2010. Agent-Based Evolutionary Search, 5. Springer Science & Business Media.
Sattari, M.T., Pal, M., Apaydin, H., Ozturk, F., 2013. M5 model tree application in Daily River flow forecasting in Sohu
Stream, Turkey. Water Resources, 40(3): 233-242.
Savage, N. et al., 2013. Air quality modelling using the Met Office Unified Model (AQUM OS24-26): model
description and initial evaluation. Geoscientific Model Development, 6(2): 353-372.
Seckin, N., Cobaner, M., Yurtal, R., Haktanir, T., 2013. Comparison of artificial neural network methods with L-
moments for estimating flood flow at ungauged sites: the case of East Mediterranean River Basin, Turkey. Water
resources management, 27(7): 2103-2124.
Shaw, R., Ishiwatari, M., Arnold, M., 2011. Community-based Disaster Risk Management.
Shu, C., Ouarda, T., 2008. Regional flood frequency analysis at ungauged sites using the adaptive neuro-fuzzy inference
system. Journal of Hydrology, 349(1): 31-43.
Smith, K., Ward, R., 1998. Floods: physical processes and human impacts. John Wiley and Sons Ltd.
Talei, A., Chua, L.H.C., Quek, C., Jansson, P.-E., 2013. Runoff forecasting using a Takagi–Sugeno neuro-fuzzy model
with online learning. Journal of Hydrology, 488: 17-32.
Tehrany, M.S., Lee, M.-J., Pradhan, B., Jebur, M.N., Lee, S., 2014a. Flood susceptibility mapping using integrated
bivariate and multivariate statistical models. Environ Earth Sci, 72(10): 4001-4015.
Tehrany, M.S., Pradhan, B., Jebur, M.N., 2013. Spatial prediction of flood susceptible areas using rule based decision
tree (DT) and a novel ensemble bivariate and multivariate statistical models in GIS. Journal of Hydrology, 504:
69-79. DOI:http://dx.doi.org/10.1016/j.jhydrol.2013.09.034
Tehrany, M.S., Pradhan, B., Jebur, M.N., 2014b. Flood susceptibility mapping using a novel ensemble weights-of-
evidence and support vector machine models in GIS. Journal of Hydrology, 512: 332-343.
Tehrany, M.S., Pradhan, B., Jebur, M.N., 2015a. Flood susceptibility analysis and its verification using a novel
ensemble support vector machine and frequency ratio method. Stochastic Environmental Research and Risk
Assessment, 29(4): 1149-1165.
Tehrany, M.S., Pradhan, B., Mansor, S., Ahmad, N., 2015b. Flood susceptibility assessment using GIS-based support
vector machine model with different kernel types. Catena, 125: 91-101.
The MathWorks, Inc., 2014. Fuzzy Logic Toolbox User's Guide R2014b.
Tien Bui, D., Le, K.-T., Nguyen, V., Le, H., Revhaug, I., 2016a. Tropical Forest Fire Susceptibility Mapping at the Cat
Ba National Park Area, Hai Phong City, Vietnam, Using GIS-Based Kernel Logistic Regression. Remote
Sensing, 8(4): 347.
Tien Bui, D., Nguyen, Q.-P., Hoang, N.-D., Klempe, H., 2016b. A Novel Fuzzy K-Nearest Neighbor Inference model
with Differential Evolution for Spatial Prediction of Rainfall-Induced Shallow Landslides in a Tropical Hilly
Area using GIS. Landslides. DOI:10.1007/s10346-016-0708-4
Tien Bui, D., Pham, T.B., Nguyen, Q.-P., Hoang, N.-D., 2016c. Spatial Prediction of Rainfall-Induced Shallow
Landslides Using Hybrid Integration Approach of Least Squares Support Vector Machines and Differential
Evolution Optimization: A Case Study in Central Vietnam. International Journal of Digital Earth.
DOI:10.1080/17538947.2016.1169561
Tien Bui, D., Pradhan, B., Lofman, O., Revhaug, I., Dick, O.B., 2012a. Application of support vector machines in
landslide susceptibility assessment for the Hoa Binh province (Vietnam) with kernel functions analysis, iEMSs
2012 - Managing Resources of a Limited Planet: Proceedings of the 6th Biennial Meeting of the International
Environmental Modelling and Software Society, pp. 382-389.
Tien Bui, D., Pradhan, B., Lofman, O., Revhaug, I., Dick, O.B., 2012b. Landslide susceptibility assessment in the Hoa
Binh province of Vietnam: A comparison of the Levenberg-Marquardt and Bayesian regularized neural
networks. Geomorphology, 171-172: 12-29.
Tien Bui, D., Pradhan, B., Lofman, O., Revhaug, I., Dick, O.B., 2012c. Landslide susceptibility mapping at Hoa Binh
province (Vietnam) using an adaptive neuro-fuzzy inference system and GIS. Computers & Geosciences, 45:
199-211. DOI:10.1016/j.cageo.2011.10.031
Tien Bui, D., Pradhan, B., Revhaug, I., Trung Tran, C., 2014. A Comparative Assessment Between the Application of
Fuzzy Unordered Rules Induction Algorithm and J48 Decision Tree Models in Spatial Prediction of Shallow
Landslides at Lang Son City, Vietnam. In: Srivastava, P.K., Mukherjee, S., Gupta, M., Islam, T. (Eds.), Remote
Sensing Applications in Environmental Research. Society of Earth Scientists Series. Springer International
Publishing, Cham, Switzerland, pp. 87-111. DOI:10.1007/978-3-319-05906-8_6
Tien Bui, D., Tuan, T.A., Klempe, H., Pradhan, B., Revhaug, I., 2016d. Spatial prediction models for shallow landslide
hazards: a comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel
logistic regression, and logistic model tree. Landslides, 13: 361-378. DOI:10.1007/s10346-015-0557-6
Tokar, A.S., Johnson, P.A., 1999. Rainfall-runoff modeling using artificial neural networks. Journal of Hydrologic
Engineering, 4(3): 232-239.
Toth, E., Brath, A., Montanari, A., 2000. Comparison of short-term rainfall prediction models for real-time flood
forecasting. Journal of Hydrology, 239(1): 132-147.
Tottrup, C., 2004. Improving tropical forest mapping using multi-date Landsat TM data and pre-classification image
smoothing. International Journal of Remote Sensing, 25(4): 717-730.
Tropeano, D., Turconi, L., 2004. Using historical documents for landslide, debris flow and stream flood prevention.
Applications in Northern Italy. Natural Hazards, 31(3): 663-679.
Tsai, W.-P., Chang, F.-J., Chang, L.-C., Herricks, E.E., 2015. AI techniques for optimizing multi-objective reservoir
operation upon human and riverine ecosystem demands. Journal of Hydrology, 530: 634-644.
Tucker, C., Sellers, P., 1986. Satellite remote sensing of primary production. International Journal of Remote Sensing,
7(11): 1395-1416.
Were, K., Tien Bui, D., Dick, Ø.B., Singh, B.R., 2015. A comparative assessment of support vector regression, artificial
neural networks, and random forests for predicting and mapping soil organic carbon stocks across an
Afromontane landscape. Ecological Indicators, 52: 394-403.
Highlights:
► MONF is proposed for flood susceptibility modeling.
► MONF has high performance on the training and validation datasets.
► MONF outperforms the benchmark models, namely J48DT, RF, MLP Neural Nets, and SVM.

