
Accepted Manuscript

Hybrid Artificial Intelligence Approach Based on Neural Fuzzy Inference Model
and Metaheuristic Optimization for Flood Susceptibility Modelling in A
High-Frequency Tropical Cyclone Area using GIS

Dieu Tien Bui, Biswajeet Pradhan, Haleh Nampak, Thanh Quang Bui, Quynh-
An Tran, Quoc Phi Nguyen

PII: S0022-1694(16)30378-X
DOI: http://dx.doi.org/10.1016/j.jhydrol.2016.06.027
Reference: HYDROL 21345

To appear in: Journal of Hydrology

Received Date: 4 April 2016


Revised Date: 8 June 2016
Accepted Date: 14 June 2016

Please cite this article as: Bui, D.T., Pradhan, B., Nampak, H., Quang Bui, T., Tran, Q-A., Nguyen, Q.P., Hybrid
Artificial Intelligence Approach Based on Neural Fuzzy Inference Model and Metaheuristic Optimization for Flood
Susceptibility Modelling in A High-Frequency Tropical Cyclone Area using GIS, Journal of Hydrology (2016),
doi: http://dx.doi.org/10.1016/j.jhydrol.2016.06.027

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers
we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and
review of the resulting proof before it is published in its final form. Please note that during the production process
errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Hybrid Artificial Intelligence Approach Based on Neural Fuzzy Inference

Model and Metaheuristic Optimization for Flood Susceptibility Modelling

in A High-Frequency Tropical Cyclone Area using GIS

Dieu Tien Bui a,*; Biswajeet Pradhan b,f; Haleh Nampak b; Thanh Quang Bui c; Quynh-An Tran d; Quoc Phi Nguyen e

a
Geographic Information System Group, Department of Business Administration and Computer
Science, University College of Southeast Norway, Hallvard Eikas Plass 1, N-3800 Bø i Telemark,
Norway
b
Department of Civil Engineering, Geospatial Information Science Research Center (GISRC),
Faculty of Engineering, University Putra Malaysia, Serdang, Selangor Darul Ehsan 43400,
Malaysia
c
Faculty of Geography, VNU University of Science, 334 Nguyen Trai, Thanh Xuan, Hanoi,
Vietnam
d
Faculty of Geomatics and Land Administration, Hanoi University of Mining and Geology, Duc
Thang, Bac Tu Liem, Hanoi, Vietnam
e
Department of Environmental Sciences, Hanoi University of Mining and Geology, Duc Thang,
Bac Tu Liem, Hanoi, Vietnam
f
Department of Geoinformation Engineering, Choongmu-gwan, Sejong University, 209
Neungdong-ro Gwangjingu, Seoul 05006 Republic of Korea

* Corresponding author to D. Tien Bui; e-mail: Dieu.T.Bui@hit.no/BuiTienDieu@gmail.com

Abstract

This paper proposes a new artificial intelligence approach based on neural fuzzy inference system

and metaheuristic optimization for flood susceptibility modeling, namely MONF. In the new

approach, the neural fuzzy inference system was used to create an initial flood susceptibility model

and then the model was optimized using two metaheuristic algorithms, Evolutionary Genetic and

Particle Swarm Optimization. A high-frequency tropical cyclone area of the Tuong Duong district

in Central Vietnam was used as a case study. First, a GIS database for the study area was

constructed. The database that includes 76 historical flood inundated areas and ten flood influencing

factors was used to develop and validate the proposed model. Root Mean Square Error (RMSE),

Mean Absolute Error (MAE), Receiver Operating Characteristic (ROC) curve, and area under the

ROC curve (AUC) were used to assess the model performance and its prediction capability.

Experimental results showed that the proposed model has high performance on both the training

(RMSE = 0.306, MAE = 0.094, AUC = 0.962) and validation dataset (RMSE = 0.362, MAE =

0.130, AUC = 0.911). The usability of the proposed model was evaluated by comparing its results with

those obtained from state-of-the-art benchmark soft computing techniques such as J48 Decision Tree,

Random Forest, Multi-layer Perceptron Neural Network, Support Vector Machine, and Adaptive

Neuro Fuzzy Inference System. The results show that the proposed MONF model outperforms the

above benchmark models; we conclude that the MONF model is a new alternative tool that should

be used in flood susceptibility mapping. The result in this study is useful for planners and decision

makers for sustainable management of flood-prone areas.

Key words: Flood susceptibility mapping; Neural Fuzzy; Genetic Evolutionary; Particle Swarm

Optimization; GIS; Vietnam

1. Introduction

Flooding is widely recognized as one of the most frequent and destructive natural hazards because of the

huge economic losses it causes and its spatial extent (Du et al., 2013). According to Ceola et al. (2014), around

200 million people worldwide were affected by floods in 2011 and 2012, and total

losses reached $95 billion. Southeast Asian countries are considered the most frequently

flood-affected region because of monsoonal rainfall (Loo et al., 2015). Hence, identifying

flood-susceptible areas can significantly help to reduce damage to settlements,

agriculture, and livelihoods by avoiding further construction and development in flood-prone

areas.

Vietnam is one of the most disaster-prone countries in the world, and in particular, the central

region is highly susceptible to natural disasters such as tropical cyclones, tropical depressions,

floods, landslides, droughts, whirlwinds, and salinity intrusion (Arouri et al., 2015; Brown and Pty,

2009; Fernandez et al., 2012). Flooding and tropical cyclones are considered to be the most

devastating natural disasters in this area. More than 71% of Vietnam’s population and 59% of

the total land area of Vietnam are susceptible to the impacts of such hazards (Shaw et al., 2011).

According to Kreft et al. (2014), from 1994 to 2013 the country endured an estimated annual economic

loss equivalent to 1.01% of GDP (PPP), or $2.9 billion.

Prevention of flood events may not be completely possible but they can be predicted, and flood risk

and vulnerable areas can be mapped out through prediction models (Alfieri et al., 2015; Cloke and

Pappenberger, 2009). Traditionally, the main purpose of producing flood models is to obtain an

accurate assessment of discharge over the watersheds (Smith and Ward, 1998). Literature review

shows that many hydrological techniques have been applied in flood prediction (Fenicia et al.,

2008). However, the complex, dynamic, and non-linear structure of watersheds prevents

accurate flood modelling through simple and non-linear hydrological methods and techniques

(Sahoo et al., 2006). In addition, extensive fieldwork is required for data collection, which is

often time-consuming and costly (Fenicia et al., 2008). Therefore, it can be deduced that conventional

hydrological methods cannot fulfil the requirements of comprehensive flood evaluation

in regional studies (Li et al., 2012).

Application of remote sensing and GIS techniques has made important contributions in hydrological

studies, and in particular, for flood management (Correia et al., 1999; Haq et al., 2012; Pradhan et

al., 2009). Some hydrological models integrated with geospatial techniques have received considerable

attention, such as HYDROTEL (Fortin et al., 2001), WetSpa (Liu and De Smedt, 2005), and SWAT

(Jayakrishnan et al., 2005). However, conventional hydrological models still need to be further

developed and optimized, or replaced by robust and automated methods, in order to overcome the

limitations of the traditional hydrological techniques (Bai et al., 2015; Hostache et al., 2010; Tsai et

al., 2015).

Because flooding is such a critical problem, statistical and data-driven approaches have been proposed in

flood studies such as analytic hierarchy process (Kazakis et al., 2015), frequency ratio (Lee et al.,

2012; Tehrany et al., 2015a), logistic regression (Fekete, 2009; Tehrany et al., 2014a),

weights-of-evidence (Tehrany et al., 2014b), and fuzzy logic (Pulvirenti et al., 2011). However, flooding is

a complex process that is difficult to assess and model (Dottori et al., 2016); therefore, non-linear

machine learning algorithms have been proposed for flood modelling with promising results, such

as artificial neural networks (Kia et al., 2012; Seckin et al., 2013), decision trees (Tehrany et al.,

2013), support vector machines (Tehrany et al., 2015b), k-nearest neighbors (Liu et al., 2016), and M5

model trees (Sattari et al., 2013).

Among machine learning techniques, artificial neural networks have been the most widely used

technique due to their computational efficiency (Ghalkhani et al., 2013; Rezaeianzadeh et al., 2014;

Tokar and Johnson, 1999). However, modelling errors and poor predictions caused by

the length of the dataset and by dissimilar value ranges of the validation and training datasets are weak points

of artificial neural networks (Toth et al., 2000). Therefore, neural fuzzy models that combine fuzzy

logic and neural networks have been proposed with high accuracy (Chang and Tsai, 2016; Güçlü

and Şen, 2016; Lohani et al., 2012; Shu and Ouarda, 2008). Several authors compared neural fuzzy

models with neural network models with conclusions that performance of neural fuzzy models was

better (Mukerji et al., 2009; Nayak et al., 2005; Nayak et al., 2004). However, the application of neural

fuzzy models is restricted by their inability to find the best weight parameters, which heavily

influence their prediction performance. In addition, these neural fuzzy

models suffer from slow training and sensitivity to noise in hydrological modeling (Hong and

White, 2009); therefore, local learning paradigms were proposed (Talei et al., 2013).

Nevertheless, it is still difficult to find optimal parameters for neural fuzzy models and thus the best

parameters should be searched through soft computing optimization processes (Tien Bui et al.,

2016b). We address this issue in this paper by proposing a hybrid artificial intelligence approach

based on metaheuristic optimization and neural fuzzy inference model (namely MONF) for flood
susceptibility modelling with a case study at the Tuong Duong district in Central Vietnam. In the

proposed soft computing methodological approach, the neural fuzzy was used to create an initial

flood susceptibility model, and then, two metaheuristic algorithms, Evolutionary Genetic (Goldberg

and Holland, 1988) and Particle Swarm Optimization (Eberhart and Shi, 2001), were adopted to

optimize the model. The overall performance of the proposed model was assessed on the

training and validation datasets using Root Mean Square Error (RMSE), Mean Absolute Error (MAE),

the Receiver Operating Characteristic (ROC) curve, and the area under the ROC curve (AUC). Finally, the

usability of the proposed model for flood susceptibility mapping is verified through a comparison with

current state-of-the-art machine learning methods such as J48 Decision Tree (J48DT), Random

Forests (RF), Multi-layer Perceptron Neural Networks (MLP Neural Nets), Support Vector Machines

(SVM), and the popular Adaptive Neuro-Fuzzy Inference System (ANFIS), and concluding remarks

are given.

2. The study area and the flood database

2.1 Description of the study area

The study area is the Tuong Duong district, located in the mountainous region of Nghe An

province (around 350 km south of Hanoi), one of the most flood-affected provinces in

Vietnam (Reynaud and Nguyen, 2016). The district covers an area of around 2803.1 km² (Fig. 1)

and lies between latitudes 18°58'42''N and 19°39'16''N and between longitudes 104°15'58''E

and 104°55'57''E. The topography of the district is quite complex and is dissected by mountains, hills,

rivers, and streams. The altitude ranges from 2.9 m to 2122.2 m a.s.l., with a mean altitude of 527.8

m and a standard deviation of 315.1 m. Slope angles vary from 0° to 84.7°. Areas with slope angles

larger than 20° occupy 62.6% of the total study area, whereas areas with slopes of less than 5° and

of 5° to 20° cover 14% and 23.5% of the total study area, respectively.

The district belongs to the tropical coastal highland area with two separate seasons: (i) a cold and

dry season due to the north-east monsoon, which lasts from November until March, and (ii) a hot and

humid season with dry winds, influenced by the south-west monsoon, which lasts from April to October (Tottrup,

2004). The highest temperatures can peak at 42.7°C (in June and July), whereas the lowest

temperatures can drop to 0.5°C (in December, January, and February). The mean temperature is

23–24°C. Our analysis of rainfall from 1979 to 2010 shows that the annual rainfall

of the study area varies from 1679 mm to 3259 mm. The rainfall is mainly concentrated in the rainy

season from April to October, which accounts for 88.6–93.3% of the total annual rainfall.

Fig. 1. Location of the study area and flood inventories.

Due to the geographical location and topographic characteristics of the mountainous highlands, the

district is highly vulnerable to flood hazards, with large losses of life and property. Floods happen

every year due to extreme rainfall during the tropical cyclone season. Investigations by Reynaud

and Nguyen (2012) showed that around 40.4% of households have been flooded and approximately

20.3% of households were evacuated, with the average cost of flooding accounting for 24.1% of the total

annual household income. Statistics from the Ministry of Agriculture and Rural Development of

Vietnam show that floods have killed 30 people each year on average during the last 10 years in

the Nghe An province (including the study area) (Reynaud and Nguyen, 2016).

2.2 Flood inventory map

In order to estimate future flood zones, analyzing past records of flood occurrence is essential

(Toth et al., 2000; Tropeano and Turconi, 2004). Therefore, an inventory map is considered the

most important factor for prediction of future disaster occurrence and it can represent single or

multiple events in a specific area (Tien Bui et al., 2012a). The flood inventory map in this study,

which contains the historical flood records, was constructed based on: (i) documentary sources of

the Tuong Duong district; (ii) flood locations collected during fieldwork; and (iii) interpretation of

Landsat 8 Operational Land Imager (OLI) imagery acquired from 2010 to 2014 (30 m resolution,

available at the USGS archive at http://earthexplorer.usgs.gov). In addition, the flood inventory map

was checked during fieldwork conducted in 2014 using a handheld GPS. A total of 76 flood

locations that occurred during the last five years were prepared.

2.3 Flood conditioning factors

It is essential to determine the flood conditioning factors in order to perform flood susceptibility

mapping. The selection of the conditioning factors varies from one study area to another based on

different characteristics of each place. This is because a variable can have a high degree of impact

on flooding in one area but no influence at all in another region (Kia et al.,

2012). For the current research, these variables were chosen based on a field survey and information

derived from the literature. Hence, ten flood conditioning factors were selected for the susceptibility

analysis and a GIS database of these factors was compiled. Those factors are: slope, elevation,

curvature, topographic wetness index (TWI), stream power index (SPI), distance to river, stream

density, NDVI, lithology, and rainfall. Once the dataset was prepared, each conditioning factor was

transformed into a raster grid with a 20 m cell size; the grid of the Tuong Duong district

consists of 3900 columns and 4125 rows.

Topographical factors play a significant role in distinguishing flood-prone areas and

have a direct impact on the output of modelling, as reported in many studies (Cook and Merwade, 2009);

therefore, a Digital Elevation Model (DEM) for the study area was generated first. The DEM has a

resolution of 20 m and was generated from topographical maps at 1:50,000 scale with a

contour interval of 10 m. Runoff volume is frequently influenced by slope and curvature, and

floods are usually identified at low elevations (Heerdegen and Beran, 1982; Niedda, 2004; Qi et al.,

2009); therefore, slope, elevation, and curvature derived from the DEM were used for this

analysis. The slope map (Fig. 2a) was classified into eight classes: 0°–0.5°, 0.5°–2°, 2°–5°,

5°–8°, 8°–13°, 13°–20°, 20°–30°, and > 30°. The elevation map (Fig. 2b) was grouped into ten different

classes between < 100 m and >1300 m. The slope curvature map (Fig. 2c) was compiled with five

categories: <-2; -2 to -0.05; -0.05 to 0.05; 0.05 to 2; > 2.

Fig. 2. Flood influencing factors: (a) Slope; (b) Elevation; (c) Curvature; (d) TWI; (e) SPI; (f)
Distance to river.

The spatial variation of floods may be influenced by hydrological conditions; therefore, TWI and

SPI should be used in flood susceptibility analysis. TWI, a topographic index developed by

Beven et al. (1984) within their runoff model, is expressed as follows:

$$\mathrm{TWI} = \ln\left(\frac{\alpha}{\tan\beta}\right) \quad (1)$$

where $\alpha$ is the local upslope area draining through a certain point per unit contour length and $\beta$ is

the local slope.

SPI, which measures the erosive power of streams, is calculated using the following equation:

$$\mathrm{SPI} = A_s \tan\beta \quad (2)$$

where $A_s$ is the specific catchment area and $\beta$ is the local slope gradient in degrees.

In this study, the TWI map (Fig. 2d) was constructed with eight classes whereas the SPI map (Fig.

2e) was built with ten categories.
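
As a rough illustration of the two indices, the sketch below evaluates them for a single grid cell, assuming the commonly used definitions TWI = ln(α / tan β) and SPI = A_s · tan β. The function names, the choice of degrees as the slope unit, and the small minimum slope used to avoid division by zero on flat cells are assumptions made for this example and are not taken from the paper.

```python
import math

# Assumption (not from the paper): flat cells are clamped to a small minimum
# slope so that tan(beta) is never zero in the TWI denominator.
MIN_SLOPE_DEG = 0.1


def twi(upslope_area, slope_deg):
    """Topographic wetness index, ln(alpha / tan(beta)), slope in degrees."""
    beta = math.radians(max(slope_deg, MIN_SLOPE_DEG))
    return math.log(upslope_area / math.tan(beta))


def spi(catchment_area, slope_deg):
    """Stream power index, A_s * tan(beta), slope in degrees."""
    return catchment_area * math.tan(math.radians(slope_deg))
```

For a fixed contributing area, a flatter cell yields a higher TWI and a lower SPI, which matches the intuition that water accumulates on gentle slopes and erodes on steep ones.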

Distance from the river and stream density have significant impacts on the spread and magnitude

of flooding (Glenn et al., 2012); therefore, these two factors were selected. In this study, the river

network was extracted from the aforementioned topographical maps and was used to construct the

distance to river map (Fig. 2f) and the stream density map (Fig. 2g).

NDVI is a proxy for vegetation attributes and areas with dense vegetation are less prone to flooding

due to the negative relationship between flooding and vegetation density (Tehrany et al., 2013),

therefore, NDVI should be used for flood modeling. The NDVI map with 8 classes (Fig. 2h) of the

current study was produced from the aforementioned Landsat 8 OLI imagery using the equation

(Tucker and Sellers, 1986) as follows:

$$\mathrm{NDVI} = \frac{NIR - R}{NIR + R} \quad (3)$$

where NIR is the near-infrared portion of the electromagnetic spectrum (0.76–0.90 µm) and R is the red

portion of the electromagnetic spectrum (0.63–0.69 µm).
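
A minimal per-pixel sketch of the NDVI ratio, (NIR − R) / (NIR + R), is given below. The function name and the convention of returning 0.0 when both bands are zero are assumptions for illustration; the paper does not describe how such pixels were handled.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel.

    Assumption: pixels where both bands are zero return 0.0 instead of
    raising a division error.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total
```

Values close to 1 indicate dense vegetation, while values near or below 0 indicate bare soil or water.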

Fig. 2. (continued): (g) Stream density; (h) NDVI; (i) Lithology; (j) Rainfall.

Lithology is considered to be a main control of channel shape that influences the development of

inset floodplains (Heitmuller et al., 2015), therefore the lithology was selected for flood

susceptibility modeling. The lithology map (Fig. 2i) in this study was generated from Geological

and Mineral Resources Map of Vietnam on a scale of 1:50,000. The lithology map was constructed

with 12 classes based on lithological similarities (Ayalew and Yamagishi, 2005; Pham et al., 2015).

The rainfall data for the study area from 1979 to 2010 were extracted from the Climate

Forecast System Reanalysis (CFSR) database (available at https://www.ncdc.noaa.gov/). The

average rainy-season rainfall over this period was used to construct the rainfall

map with 8 classes (Fig. 2j) using the Inverse Distance Weighted method.

3. Theoretical background of the methods used

3.1 Neural fuzzy inference model

The neural fuzzy inference model is an amalgamation of fuzzy logic and neural networks that is

able to map the input space to the output space through approximating functions. The behavior of

the neural fuzzy system is strongly influenced by the fuzzy inference engine and algorithms used to

find its optimal weights. Two types of inference engine models that are widely used are the

Mamdani fuzzy model and the Takagi and Sugeno fuzzy model (Hellendoorn and Driankov, 2012).

The main difference between them is that the Mamdani model uses the fuzzy proposition for both

the antecedent and the consequent parts, whereas in the Takagi and Sugeno fuzzy model, the fuzzy

proposition is used for the premise part, but an affine linear function is used for the consequent part.

In this research, the Takagi and Sugeno fuzzy model and the neural fuzzy structure proposed by

Jang et al. (1997) were used to construct the flood susceptibility model.

Structurally, this is a five-layered feed-forward neural fuzzy network in which the first and fourth

layers contain adaptive nodes, whereas the second, third, and fifth layers contain fixed

nodes. Fig. 3 shows the structure of the neural fuzzy model with two inputs (x, y), one output, and

two rules. In our case study, the input layer consists of ten factors whereas the output is flood

susceptibility index.

The operating mechanism of the neuro-fuzzy network can be summarized as follows: first, the

fuzzification process is performed that transforms the values of 10 flood influencing factors into

fuzzy membership values using a fuzzy membership function. In this study the Gaussian function

(Eq. 4) is used.

$$\mu_{A_i}(x) = \exp\left(-\frac{(x - c)^2}{2\sigma^2}\right) \quad (4)$$

where $\mu_{A_i}(x)$ and $\mu_{B_i}(y)$ are the membership functions; $A_i$ and $B_i$ are the linguistic labels of the inputs;

$\sigma$ and $c$ are the width and the centre of the function, respectively.

Fig. 3. Architecture of the neuro fuzzy network.

Second, the antecedent (or premise) part of the rules is constructed in the second layer. The output

is called the firing strength ($\omega_i$) and is computed by multiplying all the incoming signals (Eq. 5):

$$\omega_i = \mu_{A_i}(x) \cdot \mu_{B_i}(y); \quad i = 1, 2 \quad (5)$$

In the next step, the normalized firing strengths ($\bar{\omega}_i$) are computed in layer 3 using Eq. 6, and then

the defuzzification is performed in layer 4 using Eq. 7:

$$\bar{\omega}_i = \frac{\omega_i}{\omega_1 + \omega_2}; \quad i = 1, 2 \quad (6)$$

$$\bar{\omega}_i f_i = \bar{\omega}_i (p_i x + q_i y + r_i) \quad (7)$$

where $p_i$, $q_i$, and $r_i$ are the consequent parameters of the affine linear function.

Finally, flood susceptibility values are calculated in layer 5 (the aggregation layer) using Eq. 8

as follows:

$$\text{Flood susceptibility value} = \sum_i \bar{\omega}_i f_i; \quad i = 1, 2 \quad (8)$$

The set of two if–then rules generated from the neural fuzzy model can be written as follows

(Tien Bui et al., 2012c):

Rule 1: if x is A1 and y is B1, then f1 = p1x + q1y + r1.

Rule 2: if x is A2 and y is B2, then f2 = p2x + q2y + r2.
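
The five layers described above can be sketched for the two-input, two-rule case as follows. This is only an illustrative forward pass under the Gaussian membership function and the Takagi and Sugeno rule form given in the text; the rule dictionary layout and the function names are assumptions for this example, not the authors' Matlab implementation.

```python
import math


def gaussian_mf(value, centre, sigma):
    """Gaussian membership value with centre c and width sigma."""
    return math.exp(-((value - centre) ** 2) / (2.0 * sigma ** 2))


def sugeno_forward(x, y, rules):
    """Forward pass of a first-order Takagi-Sugeno neural fuzzy network.

    Each rule is a dict (hypothetical layout) with keys:
      "A":   (centre, sigma) of the membership function on x
      "B":   (centre, sigma) of the membership function on y
      "pqr": (p, q, r) consequent parameters of f = p*x + q*y + r
    """
    # Layers 1-2: fuzzify both inputs and multiply to get firing strengths.
    w = [gaussian_mf(x, *r["A"]) * gaussian_mf(y, *r["B"]) for r in rules]
    # Layer 3: normalize the firing strengths (assumes their sum is non-zero).
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: evaluate the affine consequent of each rule.
    f = [r["pqr"][0] * x + r["pqr"][1] * y + r["pqr"][2] for r in rules]
    # Layer 5: aggregate the weighted rule outputs.
    return sum(wb * fi for wb, fi in zip(w_bar, f))
```

With ten input factors, as in the case study, each rule would carry ten membership functions and eleven consequent parameters, but the layer structure stays the same.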

3.2 Metaheuristic optimization algorithms

It is noted that the neural fuzzy network described in the above section was proposed by Jang et al.

(1997), in which the best values of the antecedent parameters (σ, c) and the consequent parameters (p, q, r)

were determined using the hybrid learning algorithm. In this study, we propose using two relatively new

algorithms, Evolutionary Genetic optimization (Goldberg and Holland, 1988) and Particle Swarm

optimization (Eberhart and Shi, 2001), for searching the best values for the aforementioned

parameters of the neural fuzzy model.

3.2.1 Evolutionary Genetic optimization algorithm


Evolutionary Genetic optimization (EG) is a stochastic algorithm based on Darwin's Evolution

Theory and Natural Selection (Mitchell, 1998) that has been widely used for solving optimization

problems. The purpose of EG is to find the best solution, i.e. the one that has the lowest difference

between output and target values. In the current research context, RMSE is used to measure the

difference between the output of the neural fuzzy model (flood susceptibility values) and the flood

target values.

The evolutionary process of EG for flood susceptibility modeling can be summarized as follows:

Step 1 (Initialization): An initial population is generated where the total number of chromosomes is

called the population size.

Step 2 (GA procedure): This step aims to create a new population with higher fitness (lower RMSE) through three

operators (selection, crossover, and mutation):

- Selection: The fitness (RMSE) of each individual in the population is calculated, and

then all of them are used to create a mating pool. These individuals are called

parents.

- Crossover: The parents are randomly paired to produce a new generation of chromosomes,

called offspring, that inherit some characteristics of their parents.

- Mutation: Some chromosomes are randomly changed to produce offspring, and the new

population is updated.

Step 3 (Fitness evaluation): The fitness of each offspring in the new population is evaluated.

This is an iterative process, and the offspring in the final population that has the lowest RMSE will

be used to extract the best antecedent and consequent parameters of the neural fuzzy model.
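
The three steps can be sketched as a small real-coded genetic algorithm that minimizes RMSE. The sketch is illustrative only: binary tournament selection, arithmetic crossover, Gaussian mutation, elitism, the initial range [-1, 1], and the toy linear model are all assumptions chosen to keep the example short, and they are not stated in the paper. Only the population size (30) and the crossover and mutation rates (0.4 and 0.6) follow values reported later in the text.

```python
import math
import random


def rmse(outputs, targets):
    return math.sqrt(sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets))


def evolve(fitness, n_params, pop_size=30, generations=50,
           crossover_rate=0.4, mutation_rate=0.6, seed=0):
    """Minimize `fitness` over real-valued chromosomes (pop_size must be even)."""
    rng = random.Random(seed)
    # Step 1: random initial population.
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_params)] for _ in range(pop_size)]
    best = min(pop, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        # Step 2a, selection: binary tournaments fill the mating pool (assumption).
        pool = [min(rng.sample(pop, 2), key=fitness) for _ in range(pop_size)]
        children = []
        # Step 2b, crossover: consecutive parents are paired and blended.
        for a, b in zip(pool[0::2], pool[1::2]):
            a, b = a[:], b[:]
            if rng.random() < crossover_rate:
                t = rng.random()
                a, b = ([t * u + (1 - t) * v for u, v in zip(a, b)],
                        [t * v + (1 - t) * u for u, v in zip(a, b)])
            children.extend([a, b])
        # Step 2c, mutation: perturb one gene of some children.
        for child in children:
            if rng.random() < mutation_rate:
                child[rng.randrange(n_params)] += rng.gauss(0.0, 0.1)
        # Elitism (assumption): keep the best individual found so far.
        children[0] = best[:]
        pop = children
        # Step 3: evaluate the new population.
        best = min(pop, key=fitness)
        history.append(fitness(best))
    return best, history


# Toy stand-in for the neural fuzzy model: fit f(x) = p*x + r to four targets.
XS = [0.0, 0.25, 0.5, 1.0]
TARGETS = [0.2, 0.325, 0.45, 0.7]


def toy_fitness(params):
    p, r = params
    return rmse([p * x + r for x in XS], TARGETS)
```

Calling `evolve(toy_fitness, n_params=2)` returns the best chromosome and the best RMSE per generation. In the setting of the paper, the chromosome would instead hold the antecedent and consequent parameters of the neural fuzzy model.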

3.2.2 Particle Swarm optimization

Particle Swarm optimization (PSO) is a relatively new algorithm that simulates social behavior, for

instance bird flocking, to obtain the best position in a multidimensional space (Kennedy &

Eberhart, 1997). In the current context of flood susceptibility modeling, PSO aims to find the best

position of the population (called swarm). This position will have the lowest RMSE.
In the swarm, the position of each individual (called a particle) is updated from iteration to iteration

by modifying its position and velocity, and these positions are then compared to

select the globally best position for the swarm. Thus, a particle is considered as a point in a

D-dimensional space, and its status is characterized by its position and velocity. The

D-dimensional position of particle i at iteration t can be represented as $x_i^t = (x_{i1}^t, x_{i2}^t, \ldots, x_{iD}^t)$,

whereas the velocity of particle i at iteration t can be described as $v_i^t = (v_{i1}^t, v_{i2}^t, \ldots, v_{iD}^t)$. Let

$p_i^t = (p_{i1}^t, p_{i2}^t, \ldots, p_{iD}^t)$ represent the best position that particle i has found up to iteration t, and let

$p_g^t = (p_{g1}^t, p_{g2}^t, \ldots, p_{gD}^t)$ denote the globally best position at iteration t. To search for the optimal solution,

each particle changes its velocity and position based on the following equations:

$$v_{id}^{t+1} = w\, v_{id}^{t} + c_1 r_1 \left(p_{id}^{t} - x_{id}^{t}\right) + c_2 r_2 \left(p_{gd}^{t} - x_{id}^{t}\right), \quad d = 1, 2, \ldots, D \quad (6)$$

where $w$ is the inertia weight; $c_1$ indicates the cognition learning factor; $c_2$ indicates the social

learning factor; and $r_1$ and $r_2$ are random numbers uniformly distributed in [0,1].

$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \quad (7)$$

The basic process of the PSO algorithm is given as follows:

Step 1 (Initialization): An initial swarm is randomly generated.

Step 2 (Fitness evaluation): The fitness of each particle in the swarm is evaluated based on RMSE.

Step 3 (Update): The velocity of each particle is computed using Eq. 6, and the new position

of each particle is computed using Eq. 7.

Step 4 (Construction): Each particle moves to its next position based on Eq. 7.

Step 5 (Termination): Stop the algorithm if the termination criterion is satisfied; otherwise, return to Step 2.

For flood susceptibility modeling in this study, the iteration is terminated when the

number of iterations reaches the pre-determined maximum number of iterations.
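
The five steps can be sketched as a basic inertia-weight PSO loop. The sketch uses the standard velocity and position updates; the learning factors c1 = c2 = 2.0, the initial range [-1, 1], and the zero initial velocities are assumptions for illustration, whereas the swarm size of 40 and the inertia weight of 0.9 follow values reported later in the text. The `fitness` argument stands for any function to be minimized, such as the RMSE of the neural fuzzy model.

```python
import random


def pso(fitness, n_params, n_particles=40, iterations=100,
        w=0.9, c1=2.0, c2=2.0, seed=0):
    """Minimize `fitness` with a basic inertia-weight particle swarm."""
    rng = random.Random(seed)
    # Step 1: random initial positions, zero initial velocities (assumption).
    x = [[rng.uniform(-1.0, 1.0) for _ in range(n_params)] for _ in range(n_particles)]
    v = [[0.0] * n_params for _ in range(n_particles)]
    # Step 2: evaluate the initial swarm.
    p_best = [xi[:] for xi in x]
    p_fit = [fitness(xi) for xi in x]
    g = min(range(n_particles), key=p_fit.__getitem__)
    g_best, g_fit = p_best[g][:], p_fit[g]
    history = [g_fit]
    # Step 5: stop after a fixed number of iterations.
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(n_params):
                r1, r2 = rng.random(), rng.random()
                # Step 3: velocity update (inertia + cognitive + social terms).
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                # Step 4: move the particle to its next position.
                x[i][d] += v[i][d]
            # Step 2 (repeated): re-evaluate and update personal/global bests.
            f = fitness(x[i])
            if f < p_fit[i]:
                p_best[i], p_fit[i] = x[i][:], f
                if f < g_fit:
                    g_best, g_fit = x[i][:], f
        history.append(g_fit)
    return g_best, history
```

Because the global best is only replaced by a strictly better position, the recorded best fitness never increases from one iteration to the next.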

3.3 Performance assessment

The quantitative evaluation of model accuracy is defined in terms of the forecasting error or the

difference between the observed and anticipated values. In this study, the efficiency of the flood

prediction models was evaluated employing statistical evaluation criteria such as Root Mean

Squared Error (RMSE), Mean Absolute Error (MAE), Receiver Operating Characteristic (ROC),

and area under the curve (AUC).

RMSE (Eq. 8) and MAE (Eq. 9) are statistical measures for evaluating the performance of flood models

(Kia et al., 2012). Although many studies state that RMSE can be used as a standard metric for

model errors in geosciences (Savage et al., 2013), RMSE is sensitive to large values and

outliers (Chai and Draxler, 2014); therefore, MAE is also used. Thus, RMSE and MAE are used

together in this study to identify the variation in the errors in a set of predictions. RMSE is always

greater than or equal to MAE; the greater the difference between them, the greater the variance in the

individual errors in the samples.

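$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - T_i\right)^2} \quad (8)$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|O_i - T_i\right| \quad (9)$$

where $n$ is the total number of samples in the training dataset or the validation dataset; $T_i$ is the target value in the

training dataset or the validation dataset; and $O_i$ is the output value from the flood susceptibility model.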
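
The two error measures can be computed directly from paired model outputs and targets, as in the short sketch below. The function names and the sample values are made up for illustration only.

```python
import math


def rmse(outputs, targets):
    """Root mean square error between model outputs and target values."""
    n = len(targets)
    return math.sqrt(sum((o - t) ** 2 for o, t in zip(outputs, targets)) / n)


def mae(outputs, targets):
    """Mean absolute error between model outputs and target values."""
    n = len(targets)
    return sum(abs(o - t) for o, t in zip(outputs, targets)) / n


# Hypothetical susceptibility outputs for two flood (1) and two non-flood (0) points.
outputs = [0.9, 0.2, 0.7, 0.1]
targets = [1, 0, 1, 0]
errors = (rmse(outputs, targets), mae(outputs, targets))
```

On this sample the RMSE exceeds the MAE because the single larger error (0.3) is weighted more heavily by squaring.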

In order to determine the global performance of prediction models, the ROC curve is used (Tien Bui

et al., 2016a). The ROC curve is a bi-dimensional graph plotting sensitivity (true-positive rate)

versus false-positive rate (1-specificity) with various cut-off values, and the closer the curve to the

upper left corner, the better the model is. The global performance of the final neural fuzzy model is

quantified using the area under the ROC curve (AUC). An AUC of 1 indicates a perfect model, whereas a model with an AUC of 0.5 is

non-informative. The qualitative relationship between AUC and the prediction capability of models can

be categorized into the following groups: 0.5–0.6 (poor); 0.6–0.7 (average); 0.7–0.8 (good); 0.8–0.9

(very good); and 0.9–1 (excellent) (Kantardzic, 2011; Tien Bui et al., 2016c).
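
One simple way to obtain the AUC without drawing the curve is the rank-based formulation: the AUC equals the fraction of (flood, non-flood) pairs in which the flood point receives the higher score, with ties counted as one half. The sketch below uses this formulation; it is an illustrative alternative to integrating the ROC curve and is not necessarily how the authors computed AUC.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds 1 for flood points and 0 for non-flood points; both
    classes must be present.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for datasets of the size used here; a sort-based version would be preferable for large rasters.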

4. Proposed novel integration model based on metaheuristic optimization
algorithms and neural fuzzy inference model (MONF) for flood
susceptibility mapping

This research aims to propose and verify a novel soft computing methodological approach (namely

MONF) through combination of neural fuzzy inference system and two metaheuristic optimization

algorithms (EG and PSO) for flood susceptibility modeling. This part of the paper describes the

proposed MONF model. It is noted that the data preparation and processing were carried out using

ArcGIS 10.2 and IDRISI Selva 17.01. The neural fuzzy inference system is available in the

Matlab™ Fuzzy Logic Toolbox (The MathWorks, 2014). The proposed MONF model was

programmed by the authors in the Matlab 2014b environment. In addition, a C++ application was

developed to convert the flood susceptibility indices to a GIS format for ArcGIS 10.2. The structure

of the proposed MONF model for flood susceptibility mapping is shown in Fig. 4.

(1) GIS database

The first stage is the data preparation and processing, in which historical flood locations,

topographic maps, Landsat 8 OLI imagery, geological maps, and precipitation data were collected,

processed, and compiled. The result was then used to construct a GIS database that consists of the

flood inventory map and 10 influencing factors (slope, elevation, curvature, TWI, SPI, distance to

river, stream density, NDVI, lithology, and rainfall). Since these factors were constructed with

categorical classes (see Section 2.3), a transformation process suggested by Tien Bui et al. (2012b)

was used to convert these categorical classes to numeric values. Accordingly, each class of these

maps was assigned an attribute value and then rescaled into the range 0.01 to 0.99 using

max-min normalization. This avoids negative effects when the differences in magnitudes of the

categorical values are large (Tien Bui et al., 2012c).
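
The rescaling step can be illustrated with a short max-min normalization helper that maps class attribute values linearly onto [0.01, 0.99]. The function name and the handling of the degenerate case where all values are equal are assumptions for this sketch.

```python
def rescale(values, lo=0.01, hi=0.99):
    """Max-min normalization of attribute values into the range [lo, hi]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Assumption: a constant factor is mapped to the middle of the range.
        return [(lo + hi) / 2.0 for _ in values]
    scale = (hi - lo) / (v_max - v_min)
    return [lo + (v - v_min) * scale for v in values]


# Example: attribute values 1..8 assigned to the eight slope classes.
slope_scaled = rescale([1, 2, 3, 4, 5, 6, 7, 8])
```

The lowest class maps to 0.01 and the highest to 0.99, so no input to the fuzzy model is exactly 0 or 1.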

In flood modeling, prediction models should be validated against unknown future flood

occurrences; without this task, the models have no scientific meaning. Thus, flood locations

that are not employed during model building are used to validate the resulting map. For this

purpose, in this study, the flood points in the flood inventory map were randomly partitioned into

two subsets with a ratio of 70/30 for training (54 locations) and validating (22 locations) the models,

respectively (Hoang and Tien Bui, 2016; Tien Bui et al., 2016c). These flood points were assigned

a value of ‘1’. The same number of non-flood points was randomly generated from non-flood areas of the

study area and assigned a value of ‘0’. Finally, a sampling process was carried out on the ten influencing

factors to construct the training and validation datasets. The training dataset was used to build

flood susceptibility models whereas the validation dataset was employed to estimate the prediction

capability of the models.
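
The partitioning described above can be sketched as follows. This is an illustration under stated assumptions: the point identifiers are placeholders, the pool of 500 candidate non-flood points is hypothetical, and rounding the training share up is an assumption chosen because it reproduces the 54/22 split reported for the 76 flood locations.

```python
import math
import random


def build_labelled_sets(flood_ids, non_flood_ids, train_fraction=0.7, seed=0):
    """Split flood points 70/30 and pair them with equally many non-flood points."""
    rng = random.Random(seed)
    flood = list(flood_ids)
    rng.shuffle(flood)
    # Assumption: round the training share up (0.7 * 76 = 53.2 gives 54).
    n_train = math.ceil(train_fraction * len(flood))
    # Draw as many non-flood points as there are flood points.
    non_flood = rng.sample(list(non_flood_ids), len(flood))
    train = [(p, 1) for p in flood[:n_train]] + [(p, 0) for p in non_flood[:n_train]]
    valid = [(p, 1) for p in flood[n_train:]] + [(p, 0) for p in non_flood[n_train:]]
    return train, valid


flood_points = [f"F{i}" for i in range(76)]            # placeholder identifiers
candidate_non_flood = [f"N{i}" for i in range(500)]    # hypothetical candidate pool
train_set, valid_set = build_labelled_sets(flood_points, candidate_non_flood)
print(len(train_set), len(valid_set))
```

Each labelled point would then be used to sample the ten factor maps, giving one row of ten inputs and one target per point.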

Fig. 4. The structure of the proposed MONF model based on metaheuristic optimization and neural

fuzzy inference model for flood susceptibility mapping.

(2) Model configuration

The initial neural fuzzy model was generated using the training dataset, where fuzzy sets and fuzzy

membership values were derived using the Gaussian membership function. Because the neural fuzzy

model works more efficiently when the training data are represented more concisely (Tien Bui et

al., 2012c), the fuzzy c-means clustering algorithm, which is capable of distilling natural groups

from data, was used (Abdulshahed et al., 2015). A trial-and-error test was

conducted to analyze the relationship between RMSE and the number of clusters, and 15 clusters were

found to be the best for this study. These clusters are used to generate the if-then rules of the neural

fuzzy model, where the best values of the antecedent and consequent parameters of these rules are

determined by the metaheuristic optimization in the next step.
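
To make the roles of the antecedent and consequent parameters concrete, the sketch below evaluates a small rule base with Gaussian membership functions. It assumes a first-order Takagi-Sugeno form with a product of memberships as the rule firing strength; this form, the function names, and the two-rule example are illustrative assumptions, not the configuration used in the study.

```python
import math


def gaussian_mf(x, centre, sigma):
    """Gaussian membership value of x for a fuzzy set (centre, sigma)."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)


def sugeno_predict(x, rules):
    """Weighted average of linear rule outputs.

    Each rule is (centres, sigmas, coeffs): centres and sigmas are the
    antecedent parameters, coeffs = [constant, w1, ..., wn] are the
    consequent parameters.
    """
    weights, outputs = [], []
    for centres, sigmas, coeffs in rules:
        strength = 1.0
        for xi, c, s in zip(x, centres, sigmas):
            strength *= gaussian_mf(xi, c, s)   # product of memberships
        linear = coeffs[0] + sum(w * xi for w, xi in zip(coeffs[1:], x))
        weights.append(strength)
        outputs.append(linear)
    total = sum(weights)
    if total == 0.0:
        return sum(outputs) / len(outputs)      # no rule fires: plain average
    return sum(w * y for w, y in zip(weights, outputs)) / total


# Two hypothetical rules on two inputs: "low" inputs give 0, "high" inputs give 1.
rules = [
    ([0.2, 0.2], [0.1, 0.1], [0.0, 0.0, 0.0]),
    ([0.8, 0.8], [0.1, 0.1], [1.0, 0.0, 0.0]),
]
print(sugeno_predict([0.5, 0.5], rules), sugeno_predict([0.2, 0.2], rules))
```

In the study, one rule would correspond to one of the 15 clusters, and the centres, widths, and coefficients are the quantities tuned by the optimization step.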

(3) Metaheuristic optimization

The aim of this step is to find the optimal antecedent and consequent parameters of the neural fuzzy

model using the EG and PSO algorithms. Because the diversity of the initial populations is influenced by

the numbers of individuals (in EG) and particles (in PSO), a trial-and-error test was

conducted; 30 individuals and 40 particles were found to be the best for EG and PSO, respectively. For

EG, a crossover rate of 0.4 and a mutation rate of 0.6 are used for all calculations,

because these rates have been shown to maintain sufficient diversity of the population

(Musharavati, 2010; Sarker and Ray, 2010). For PSO, an inertia weight of 0.9 is used

because of its ability to obtain high model performance (Poli et al., 2007).
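
A minimal particle swarm sketch with 40 particles and an inertia weight of 0.9 is given below. The cognitive and social coefficients (both 2.0), the search bounds, the iteration count, and the toy two-parameter fitness function are assumptions made for illustration; the source does not state them, and the real search space would be the antecedent and consequent parameters of the neural fuzzy model.

```python
import math
import random


def pso_minimise(fitness, dim, n_particles=40, n_iter=200, inertia=0.9,
                 c1=2.0, c2=2.0, bounds=(-1.0, 1.0), seed=0):
    """Global-best PSO that minimises `fitness` over a box-bounded space."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal best positions
    best_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g][:], best_fit[g]     # best position of the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Keep particles inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f < best_fit[i]:
                best_fit[i], best_pos[i] = f, pos[i][:]
                if f < g_fit:
                    g_fit, g_pos = f, pos[i][:]
    return g_pos, g_fit


# Hypothetical stand-in for the RMSE fitness: a two-weight linear model.
samples = [((0.1, 0.9), 0.0), ((0.8, 0.2), 1.0), ((0.5, 0.5), 0.5), ((0.9, 0.1), 1.0)]


def toy_rmse(params):
    sq = [(params[0] * x1 + params[1] * x2 - y) ** 2 for (x1, x2), y in samples]
    return math.sqrt(sum(sq) / len(sq))


best_params, best_rmse = pso_minimise(toy_rmse, dim=2)
print(best_params, best_rmse)
```

The swarm-best fitness can only stay the same or improve from one iteration to the next, because it is replaced only when a particle finds a lower value.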

(4) Evaluation of fitness and (5) Stopping criteria

To evaluate the performance of the neural fuzzy model, the RMSE function in Eq. 8 was used. The

lower the RMSE, the better the neural fuzzy model. During the optimization

process, the EG and PSO algorithms explore various combinations of antecedent and consequent

parameters. In each generation, the EG algorithm creates chromosomes and evaluates them with

RMSE to select the best chromosome population. In contrast, the PSO algorithm updates the

positions of particles based on RMSE to find the best position of the swarm. This is an iterative

process, and a maximum of 1000 generations was used as the stopping criterion. The final RMSEs of

the best chromosomes and the best swarm position are compared, and then the optimal antecedent

and consequent parameters are derived.
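
The generation loop with a fixed stopping criterion can be sketched with a small genetic algorithm. The population size (30), crossover rate (0.4), and mutation rate (0.6) follow the text; tournament selection, one-point crossover, Gaussian mutation, elitism, and the toy fitness function are assumptions added for illustration and may differ from the EG implementation used in the study.

```python
import random


def ga_minimise(fitness, dim, pop_size=30, generations=1000,
                crossover_rate=0.4, mutation_rate=0.6, sigma=0.1, seed=0):
    """Minimise `fitness` and stop after a fixed number of generations."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    history = []                                   # best fitness per generation
    for _ in range(generations):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        children = [pop[0][:]]                     # elitism: keep the best chromosome
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            a = min(rng.sample(pop, 2), key=fitness)
            b = min(rng.sample(pop, 2), key=fitness)
            child = a[:]
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, dim) if dim > 1 else 0
                child = a[:cut] + b[cut:]          # one-point crossover
            if rng.random() < mutation_rate:
                j = rng.randrange(dim)
                child[j] += rng.gauss(0.0, sigma)  # Gaussian mutation of one gene
            children.append(child)
        pop = children
    best = min(pop, key=fitness)
    return best, fitness(best), history


def toy_fitness(params):
    # Hypothetical stand-in for the RMSE of the neural fuzzy model.
    return sum((p - 0.3) ** 2 for p in params) ** 0.5


best_chromosome, best_value, history = ga_minimise(toy_fitness, dim=4, generations=50)
print(best_value)
```

The best value returned by such a loop would then be compared with the best swarm position from PSO, and the parameter set with the lower RMSE retained.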

(6) Final flood susceptibility model

Using the optimal antecedent and consequent parameters, the final neural fuzzy model was

constructed and subsequently used to calculate the flood susceptibility index for all pixels in the study

area. The degree-of-fit of the model with the training dataset and the prediction capability of the

model are assessed using RMSE, MAE, ROC curve, and AUC, as mentioned in section 3.3.

5. Results and discussion

Although the flood influencing factors were selected based on an analysis of flood occurrence and the

characteristics of the study area, the degree of impact of these factors is likely to

differ, and in some cases, some factors may have no influence on flooding. Therefore, the predictive

power of the flood influencing factors should be analyzed, and factors with no predictive value

should be eliminated. This helps to reduce noise and enhance the performance of the resulting model

(Tien Bui et al., 2016d).

For this purpose, the Pearson correlation technique (Guyon and Elisseeff, 2003) was adopted to

quantify the predictive power of the ten flood influencing factors in this study. The Pearson technique was

selected because of its efficiency for feature selection in machine learning (Canuto et al., 2008; Lloyd et

al., 2011). The higher the Pearson correlation value, the more the influencing factor contributes to the flood

model. The analysis result is shown in Table 1. The column ‘average predictive power’ is the

average Pearson correlation value and its standard deviation derived from calculation of the Pearson

correlation values in a 10-fold cross-validation process.
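
The averaging over folds can be sketched as follows. This is an illustration under assumptions: the correlation is computed on the nine retained folds in each round, its absolute value is used as the predictive power, and the example factor is synthetic; the exact procedure used in the study may differ.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cv_predictive_power(factor, labels, k=10, seed=0):
    """Mean and standard deviation of |r| over k leave-one-fold-out subsets."""
    idx = list(range(len(factor)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        skip = set(held_out)
        keep = [i for i in idx if i not in skip]
        r = pearson_r([factor[i] for i in keep], [labels[i] for i in keep])
        scores.append(abs(r))
    return statistics.fmean(scores), statistics.stdev(scores)


# Synthetic example: a factor that separates flood (1) from non-flood (0) exactly.
labels = [0, 1] * 20
factor = [0.2 + 0.6 * label for label in labels]
mean_power, sd_power = cv_predictive_power(factor, labels)
print(mean_power, sd_power)
```

A factor whose average value is close to zero, such as curvature in Table 1, contributes little linear information about flood occurrence.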

A clear distinction can be observed among the predictive powers of the 10 influencing factors in

relation to flooding in the study area. The highest predictive powers are those of elevation (0.599) and

stream density (0.596), followed by NDVI (0.568), distance to river (0.362), slope

(0.307), TWI (0.282), SPI (0.241), lithology (0.134), rainfall (0.084), and curvature

(0.030). Because all the factors have predictive power for flooding, none of them was eliminated in

this analysis.

Table 1 Predictive power of flood influencing factors for the study area using the Pearson

correlation with 10-fold cross-validation.

No.  Flood influencing factor  Average predictive power  Standard deviation
1    Elevation                 0.599                     0.016
2    Stream density            0.596                     0.021
3    NDVI                      0.568                     0.016
4    Distance to river         0.362                     0.026
5    Slope                     0.307                     0.037
6    TWI                       0.282                     0.033
7    SPI                       0.241                     0.039
8    Lithology                 0.134                     0.031
9    Rainfall                  0.084                     0.035
10   Curvature                 0.030                     0.016
The final structure of the neural fuzzy model in this study is shown in Fig. 5. The

model consists of 10 input factors, one output, and 15 rules with 300 antecedent parameters

and 165 consequent parameters. Using the training dataset, the MONF model was trained with 1000

iterations. The results (Tables 2, 3 and Fig. 6) show that the RMSEs of the MONF model are 0.306 and

0.362 on the training dataset and the validation dataset, respectively. These values are clearly

lower than the standard deviation of the target values (0.5) in the two datasets, indicating that the

model has high performance. The AUC of 0.962 on the training dataset (Table 2) indicates that the global

degree-of-fit of the model is 96.2% (Fig. 6).
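
The parameter counts and the 0.5 reference value can be checked with a short calculation. It assumes two parameters (centre and width) per Gaussian membership function and a first-order linear consequent per rule; both assumptions are consistent with the reported totals but are not stated explicitly in the text.

```python
import statistics

n_inputs, n_rules = 10, 15

# Assumption: each rule has one Gaussian set per input, with a centre and a width.
antecedent_params = n_inputs * n_rules * 2
# Assumption: each rule has a linear consequent with one weight per input plus a constant.
consequent_params = n_rules * (n_inputs + 1)

# Balanced 0/1 targets, as in the training set (54 flood and 54 non-flood points).
targets = [1] * 54 + [0] * 54
target_sd = statistics.pstdev(targets)

print(antecedent_params, consequent_params, target_sd)
```

A model that always predicts 0.5 on such balanced targets has an RMSE equal to this standard deviation, which is why 0.5 serves as the reference against which the reported RMSEs are compared.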

Because RMSE is a quadratic scoring index (Chai and Draxler, 2014) that does not describe the

variation of the error distribution, MAE is additionally used. The MAEs are 0.094 and 0.130

on the training dataset and the validation dataset, respectively (Tables 2, 3), and are clearly lower than

the RMSEs, indicating that most individual errors in the two datasets are small.

The prediction capability of the MONF model is assessed using AUC on the validation dataset;

AUC = 0.911 (Fig. 6) demonstrates high prediction capability.
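
The three measures can be computed as in the sketch below. The labels and scores are invented for illustration, and the AUC is obtained by pairwise comparison of flood and non-flood scores, which is equivalent to the area under the ROC curve.

```python
import math
import statistics


def rmse(y_true, y_pred):
    return math.sqrt(statistics.fmean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def mae(y_true, y_pred):
    return statistics.fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def auc(y_true, scores):
    """Probability that a flood point receives a higher score than a non-flood point."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: three flood points and three non-flood points.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
print(rmse(y_true, y_pred), mae(y_true, y_pred), auc(y_true, y_pred))
# RMSE^2 - MAE^2 equals the variance of the absolute errors.
print(rmse(y_true, y_pred) ** 2 - mae(y_true, y_pred) ** 2,
      statistics.pvariance(abs_errors))
```

Because the squared RMSE exceeds the squared MAE by exactly the variance of the absolute errors, MAE can never exceed RMSE, and the gap between them reflects how unevenly the errors are distributed.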

Fig. 5. Structure of the proposed MONF model for flood susceptibility mapping for the study area.

Table 2 Performance of the proposed MONF model, the J48DT model, the RF model, the MLP

Neural Nets model, the SVM model, and the ANFIS model using the training dataset.

                                 Training dataset
No.  Flood susceptibility model  RMSE   MAE    AUC
1    MONF                        0.306  0.094  0.962
2    J48DT                       0.393  0.246  0.852
3    RF                          0.327  0.237  0.934
4    MLP Neural Nets             0.369  0.178  0.909
5    SVM                         0.318  0.207  0.938
6    ANFIS                       0.497  0.211  0.905

Table 3 Prediction capability of the proposed MONF model, the J48DT model, the RF model, the

MLP Neural Nets model, the SVM model, and the ANFIS model using the validation dataset.

                                 Validation dataset
No.  Flood susceptibility model  RMSE   MAE    AUC
1    MONF                        0.362  0.130  0.911
2    J48DT                       0.352  0.230  0.895
3    RF                          0.354  0.257  0.894
4    MLP Neural Nets             0.380  0.193  0.903
5    SVM                         0.356  0.246  0.905
6    ANFIS                       0.615  0.388  0.767

Fig. 6. ROC curve and AUC of the proposed MONF model using (a) the training dataset and (b)

validation dataset.

6. Model comparison and generation of flood susceptibility map

Because the major purpose of flood susceptibility modeling is to identify the zones with a higher

probability of flood occurrence, the usability of the proposed MONF model for flood

susceptibility mapping is further assessed through a comparison with models derived from current

state-of-the-art machine learning methods using the same data, including J48DT, RF, MLP Neural

Nets, SVM, and ANFIS.

J48DT is a Java reimplementation of the C4.5 algorithm proposed by Quinlan (1993) that constructs

a hierarchical tree-like classifier with internal nodes, leaf nodes, and branches. For flood

susceptibility mapping in this study, the J48DT model was constructed with a minimum number

of samples per leaf of 1 and a confidence factor for tree pruning of 0.35. These are the best

parameter values determined in a test suggested by Tien Bui et al. (2014). RF is a variation of the

bagging ensemble that combines decision tree classifiers (Breiman, 2001) to form the final flood

susceptibility model. Each individual decision tree classifier is constructed from a bootstrapped

replica of the training data and may use a different feature subset (Polikar, 2006). For

this study, the number of trees was set to 500, as suggested by Catani et al. (2013).
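
The bootstrapping idea behind RF can be sketched as a sampling plan for 500 trees. This is a simplification under stated assumptions: one random feature subset is drawn per tree (Breiman's random forest draws a subset at each split), the subset size is the integer square root of the number of features, and the 108 training rows correspond to the 54 flood and 54 non-flood points.

```python
import math
import random


def bootstrap_plan(n_samples, n_features, n_trees=500, seed=0):
    """For each tree, draw a bootstrap replica and a random feature subset."""
    rng = random.Random(seed)
    k = max(1, int(math.sqrt(n_features)))         # assumed subset size
    plan = []
    for _ in range(n_trees):
        # Sampling with replacement: some rows repeat, others are left out.
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]
        feats = sorted(rng.sample(range(n_features), k))
        plan.append((rows, feats))
    return plan


plan = bootstrap_plan(n_samples=108, n_features=10)
rows0, feats0 = plan[0]
out_of_bag = set(range(108)) - set(rows0)
print(len(plan), feats0, len(out_of_bag))
```

The rows left out of each replica (the out-of-bag rows) are what allows a bagging ensemble to estimate its own error without a separate dataset.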

MLP Neural Nets is a popular method for modeling complex problems such as floods (Darras

et al., 2015; Dawson and Wilby, 2001). The structure of the MLP Neural Nets model for flood

susceptibility mapping in this study was determined using the same method as in Tien Bui et al. (2016d),

and a structure with an input layer of 10 neurons, one hidden layer with 6 neurons, and an output layer was

selected. The logistic sigmoid was selected as the activation function, and the other parameters, i.e.,

learning rate, momentum, and training time, were set to 0.3, 0.2, and 500, respectively (Were et

al., 2015). For SVM, another popular machine learning method, the radial basis function (RBF) kernel

(Hoang and Tien Bui, 2016) was employed. The SVM model was constructed with the best kernel

width (γ = 0.195) and regularization parameter (C = 10), which were determined using the grid-search method

(Tien Bui et al., 2012a). For ANFIS, the Gaussian membership function was used,

whereas the parameters for the subtractive clustering algorithm (Akbulut et al., 2004) were 0.55,

1.52, 0.5, and 0.25 for range of influence, squash factor, accept ratio, and reject ratio, respectively.

These are the best values for the study area and were determined by a trial-and-error process.

For training ANFIS, the default hybrid learning algorithm was used, and the training data were split

into a building dataset and a checking dataset, as suggested by Tien Bui et al. (2012c), to control

overfitting.
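
Two of the building blocks named above can be written out directly. The sketch assumes the common RBF form exp(-γ‖x − y‖²) with γ = 0.195 and a 10-6-1 network with logistic sigmoid units in both the hidden and output layers; the weight layout and the all-zero weights in the example are illustrative assumptions, not trained values.

```python
import math


def rbf_kernel(x, y, gamma=0.195):
    """RBF kernel value, assuming the form exp(-gamma * squared distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, hidden, output):
    """Forward pass of a one-hidden-layer network.

    hidden: list of (bias, weights) per hidden neuron.
    output: (bias, weights) of the single output neuron.
    """
    h = [logistic(b + sum(w * xi for w, xi in zip(ws, x))) for b, ws in hidden]
    out_bias, out_weights = output
    return logistic(out_bias + sum(w * hi for w, hi in zip(out_weights, h)))


x = [0.5] * 10                                   # ten rescaled factor values
hidden = [(0.0, [0.0] * 10) for _ in range(6)]   # six hidden neurons, untrained
output = (0.0, [0.0] * 6)

print(rbf_kernel([0.0, 0.0], [1.0, 0.0]), mlp_forward(x, hidden, output))
```

With untrained zero weights the network returns 0.5 for any input, which is the uninformative starting point that training moves away from.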

Fig. 7. Graphic curve of the flood susceptibility map for the study area.

Fig. 8. Flood susceptibility map using the proposed MONF model for the study area.

The results of the flood susceptibility models using J48DT, RF, SVM, MLP Neural Nets, and

ANFIS are shown in Table 2. The results show that all five flood models perform

well on the training data. The highest degree of fit is for the SVM model (RMSE = 0.318,

MAE = 0.207, and AUC = 0.938), closely followed by the RF model (RMSE = 0.327, MAE = 0.237,

and AUC = 0.934), the MLP Neural Nets model (RMSE = 0.369, MAE = 0.178, and AUC = 0.909),

the J48DT model (RMSE = 0.393, MAE = 0.246, and AUC = 0.852), and the ANFIS model

(RMSE = 0.497, MAE = 0.211, and AUC = 0.905). Compared with the proposed MONF model, the performance of

the five flood models is clearly lower.

The prediction capability of the five flood models was assessed using the validation dataset, and the

results are shown in Table 3. In terms of MAE and AUC, the prediction capability of all five models is

lower than that of the MONF model. Although the RMSE of the MONF model (0.362) is slightly

higher than those of the J48DT model (0.352), the RF model (0.354), and the SVM model (0.356), it is

lower than those of the MLP Neural Nets model (0.380) and the ANFIS model (0.615), and the MAE and AUC of the MONF model

are better than those of all the other models (Table 3). Based on the above analysis, it can be concluded that

the proposed MONF model attained the most desirable overall performance on both the training dataset

and the validation dataset; therefore, the MONF model is an effective tool for flood susceptibility

modeling.

Because the MONF model was deemed best suited for the flood dataset, it was then used to

calculate the flood susceptibility index for all pixels of the study area. The result was transformed to

a GIS format to derive a flood susceptibility map in ArcGIS 10.2. The map was overlaid on the

flood inventory map to calculate the percentage of flood locations and the percentage of the

susceptibility map, and then a graphic curve was constructed (Fig. 7). The map (Fig. 8) was then

visualized using five susceptibility classes: very high (10%), high (10%), medium (15%), low

(15%), and very low (50%) (Table 4).
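
The area-based classing can be sketched as follows, assuming that pixels are ranked by susceptibility index and assigned to classes by cumulative area share (10%, 10%, 15%, 15%, 50%). The function name and the 20 synthetic pixel values are hypothetical.

```python
AREA_SHARES = (
    ("very high", 0.10),
    ("high", 0.10),
    ("medium", 0.15),
    ("low", 0.15),
    ("very low", 0.50),
)


def classify_by_area(index_values, shares=AREA_SHARES):
    """Assign each pixel a class so that classes cover fixed shares of the area."""
    n = len(index_values)
    # Rank pixels from the highest to the lowest susceptibility index.
    order = sorted(range(n), key=lambda i: index_values[i], reverse=True)
    labels = [shares[-1][0]] * n
    start, cumulative = 0, 0.0
    for name, share in shares:
        cumulative += share
        end = min(n, round(cumulative * n))
        for i in order[start:end]:
            labels[i] = name
        start = end
    return labels


pixels = [i / 19 for i in range(20)]     # synthetic index values from 0 to 1
classes = classify_by_area(pixels)
print({name: classes.count(name) for name, _ in AREA_SHARES})
```

Applied to the real index raster, the class boundaries found this way would correspond to index thresholds such as those listed in the flood index range column of Table 4.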

Table 4 Description of the five flood susceptibility classes obtained from the MONF model for
the study area.

No.  Flood index range  Flood susceptibility (%)  Verbal expression  Flood location (%)  Area (km²)
1    1.465 – 0.689      100-90                    Very high          67.1                280.3
2    0.688 – 0.505      90-80                     High               19.7                280.3
3    0.504 – 0.334      80-65                     Medium             6.6                 420.5
4    0.333 – 0.192      65-50                     Low                6.6                 420.5
5    0.192 – 0.000      50-0                      Very low           0.0                 1401.6

Analysis of the graphic curve and Table 4 shows that 10% of the study area was classified into the

very high class, but this class contains 67.1% of the total flood locations, whereas 19.7% of

the total flood locations are located in the high class, which covers 10% of the total study area. The

medium class and the low class together cover 30% of the total study area; however, only 6.6% of the total

flood locations are located in each class. In contrast, 50% of the total study area is classified into the

very low class, which contains no flood locations.
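
The cumulative relationship between area and flood locations can be reproduced from the percentages in Table 4. The sketch below only accumulates those published values; the function name is hypothetical.

```python
# (class, share of study area in %, share of flood locations in %) from Table 4.
table4 = [
    ("very high", 10.0, 67.1),
    ("high", 10.0, 19.7),
    ("medium", 15.0, 6.6),
    ("low", 15.0, 6.6),
    ("very low", 50.0, 0.0),
]


def cumulative_curve(rows):
    """Cumulative (area %, flood location %) points, most susceptible class first."""
    area = flood = 0.0
    points = [(0.0, 0.0)]
    for _, area_share, flood_share in rows:
        area += area_share
        flood += flood_share
        points.append((round(area, 1), round(flood, 1)))
    return points


curve = cumulative_curve(table4)
for area_pct, flood_pct in curve:
    print(area_pct, flood_pct)
```

The second point of the curve shows that the two most susceptible classes, covering 20% of the area, account for 86.8% of the flood locations.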

7. Concluding remarks

This study proposes and verifies a new hybrid intelligent approach (named MONF) that

integrates two metaheuristic optimization algorithms, EG and PSO, with the neural fuzzy inference model for

flood susceptibility mapping, with a case study of the Tuong Duong district, Nghe An province,

Vietnam. This is a typical mountainous highland district that is frequently affected by tropical

cyclones, and floods occur every year during the tropical cyclone season.

The proposed MONF model was established and evaluated based on the GIS database of the study

area with 76 flood locations and ten influencing factors. The degree-of-fit and the prediction

capability of the MONF model were assessed using RMSE, MAE, ROC curve, and AUC. In

addition, five flood models derived from J48DT, RF, MLP Neural Nets, SVM, and ANFIS were

used to compare and confirm the usability of the proposed model.

Experimental results have demonstrated

that the proposed model attained high performance on both the training dataset and the validation

dataset. The degree-of-fit and the prediction capability are 96.2% and 91.1%, respectively,

indicating satisfactory accuracy for flood susceptibility mapping. The ten flood influencing

factors revealed positive predictive powers for flooding, indicating that these factors were

selected, processed, and coded successfully. Elevation, stream density, and NDVI have

the highest predictive powers for flood susceptibility in this study.

The flood susceptibility index produced by the MONF model varies from 0 to 1.465;

however, flood indices below a threshold of 0.5 are dominant. The flood susceptibility index was

transformed into the susceptibility map by means of five classes. Interpretation of these classes shows

that the very high and high classes cover a small part of the study area (20%) but contain 86.8% of the total

flood locations, whereas the very low class covers 50% of the study area and contains no flood locations,

indicating that the MONF model produced a highly accurate result.

Overall, the contribution of this research to the body of flood modelling knowledge can be

summarized as follows: (i) the proposed neural fuzzy model is capable of producing high-quality

flood susceptibility maps and should therefore be considered for flood susceptibility mapping in other

flood-prone regions; (ii) the integration of the two metaheuristic algorithms, EG and PSO, with the

neural fuzzy model provides a solution for building the flood susceptibility model autonomously; (iii) the

MONF model outperforms the J48DT, RF, MLP Neural Nets, SVM, and ANFIS

models; therefore, the proposed MONF model is an alternative tool for flood susceptibility

modeling.

In summary, integrating the advantages of the neural fuzzy model with the strengths of the two optimization methods

yielded higher efficiency of the proposed model for flood susceptibility mapping in the tropical

cyclone area of Vietnam. The result may be useful to planners and decision makers for

sustainable management of flood-prone areas in the study area. It is also recommended that metaheuristic

algorithms be utilized as tools for pertinent studies in hydrological sciences.

Acknowledgement

This research was funded by the Project No. B2014-02-21 (Ministry of Training and Education of
Vietnam) and was partially supported by the Geographic Information System group, University
College of Southeast Norway.
Conflict of Interest: The authors declare that there is no conflict of interest.
References
Abdulshahed, A.M., Longstaff, A.P., Fletcher, S., 2015. The application of ANFIS prediction models for thermal error
compensation on CNC machine tools. Applied Soft Computing, 27: 158-168.
DOI:http://dx.doi.org/10.1016/j.asoc.2014.11.012
Akbulut, S., Hasiloglu, A.S., Pamukcu, S., 2004. Data generation for shear modulus and damping ratio in reinforced
sands using adaptive neuro-fuzzy inference system. Soil Dynamics and Earthquake Engineering, 24(11): 805-
814.
Alfieri, L., Feyen, L., Dottori, F., Bianchi, A., 2015. Ensemble flood risk assessment in Europe under high end climate
scenarios. Global Environmental Change, 35: 199-212.
Arouri, M., Nguyen, C., Youssef, A.B., 2015. Natural Disasters, Household Welfare, and Resilience: Evidence from
Rural Vietnam. World Development, 70: 59-77.

Ayalew, L., Yamagishi, H., 2005. The application of GIS-based logistic regression for landslide susceptibility mapping
in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology, 65(1-2): 15-31.
DOI:10.1016/j.geomorph.2004.06.010
Bai, T. et al., 2015. Synergistic gains from the multi-objective optimal operation of cascade reservoirs in the Upper
Yellow River basin. Journal of Hydrology, 523: 758-767.
Beven, K., Kirkby, M., Schofield, N., Tagg, A., 1984. Testing a physically-based flood forecasting model
(TOPMODEL) for three UK catchments. Journal of Hydrology, 69(1): 119-143.
Breiman, L., 2001. Random forests. Machine Learning, 45(1): 5-32. DOI:10.1023/a:1010933404324
Brown, K., Pty, R., 2009. Vietnam water sector review (Report). Hanoi: Ministry of Natural Resources and
Environment.
Canuto, A.M.P., Santana, L.E.A., Abreu, M.C.C., Xavier Jr, J.C., 2008. An analysis of data distribution in the ClassAge
system: An agent-based system for classification tasks. Neurocomputing, 71(16–18): 3319-3325.
DOI:http://dx.doi.org/10.1016/j.neucom.2008.01.032
Catani, F., Lagomarsino, D., Segoni, S., Tofani, V., 2013. Landslide susceptibility estimation by random forests
technique: sensitivity and scaling issues. Nat. Hazards Earth Syst. Sci., 13(11): 2815-2831. DOI:10.5194/nhess-
13-2815-2013
Ceola, S., Laio, F., Montanari, A., 2014. Satellite nighttime lights reveal increasing human exposure to floods
worldwide. Geophysical Research Letters, 41(20): 7184-7190.
Chai, T., Draxler, R.R., 2014. Root mean square error (RMSE) or mean absolute error (MAE)?–Arguments against
avoiding RMSE in the literature. Geoscientific Model Development, 7(3): 1247-1250.
Chang, F.-J., Tsai, M.-J., 2016. A nonlinear spatio-temporal lumping of radar rainfall for modeling multi-step-ahead
inflow forecasts by data-driven techniques. Journal of Hydrology, 535: 256-269.
DOI:http://dx.doi.org/10.1016/j.jhydrol.2016.01.056
Cloke, H., Pappenberger, F., 2009. Ensemble flood forecasting: a review. Journal of Hydrology, 375(3): 613-626.
Cook, A., Merwade, V., 2009. Effect of topographic data, geometric configuration and modeling approach on flood
inundation mapping. Journal of Hydrology, 377(1): 131-142.
Correia, F.N., Da Silva, F.N., Ramos, I., 1999. Floodplain management in urban developing areas. Part II. GIS-based
flood analysis and urban growth modelling. Water Resources Management, 13(1): 23-37.
Darras, T., Borrell Estupina, V., Vayssade, B., Johannet, A., Pistre, S., 2015. Identification of spatial and temporal
contributions of rainfalls to flash floods using neural network modelling: case study on the Lez basin (southern
France). Hydrology and Earth System Sciences, 19(10): 4397-4410.
Dawson, C., Wilby, R., 2001. Hydrological modelling using artificial neural networks. Progress in physical Geography,
25(1): 80-108.
Dottori, F., Martina, M.L.V., Figueiredo, R., 2016. A methodology for flood susceptibility and vulnerability analysis in
complex flood scenarios. Journal of Flood Risk Management. DOI:10.1111/jfr3.12234
Du, J., Fang, J., Xu, W., Shi, P., 2013. Analysis of dry/wet conditions using the standardized precipitation index and its
potential usefulness for drought/flood monitoring in Hunan Province, China. Stochastic environmental research
and risk assessment, 27(2): 377-387.
Eberhart, R.C., Shi, Y., 2001. Particle swarm optimization: developments, applications and resources, Evolutionary
Computation, 2001. Proceedings of the 2001 Congress on. IEEE, pp. 81-86.
Fekete, A., 2009. Validation of a social vulnerability index in context to river-floods in Germany. Nat. Hazards Earth
Syst. Sci., 9(2): 393-403. DOI:10.5194/nhess-9-393-2009
Fenicia, F., Savenije, H.H., Matgen, P., Pfister, L., 2008. Understanding catchment behavior through stepwise model
concept improvement. Water Resources Research, 44(1).
Fernandez, G., Uy, N., Shaw, R., 2012. Chapter 11 Community-Based Disaster Risk Management Experience of the
Philippines. Community-Based Disaster Risk Reduction (Community, Environment and Disaster Risk
Management, Volume 10) Emerald Group Publishing Limited, 10: 205-231.
Fortin, J.-P. et al., 2001. Distributed watershed model compatible with remote sensing and GIS data. I: Description of
model. Journal of Hydrologic Engineering, 6(2): 91-99.
Ghalkhani, H., Golian, S., Saghafian, B., Farokhnia, A., Shamseldin, A., 2013. Application of surrogate artificial
intelligent models for real‐time flood routing. Water and Environment Journal, 27(4): 535-548.
Glenn, E.P. et al., 2012. Roles of saltcedar (Tamarix spp.) and capillary rise in salinizing a non-flooding terrace on a
flow-regulated desert river. Journal of Arid Environments, 79: 56-65.
Goldberg, D.E., Holland, J.H., 1988. Genetic algorithms and machine learning. Mach Learn, 3(2): 95-99.
Güçlü, Y.S., Şen, Z., 2016. Hydrograph estimation with fuzzy chain model. Journal of Hydrology, 538: 587-597.
DOI:http://dx.doi.org/10.1016/j.jhydrol.2016.04.057
Guyon, I., Elisseeff, A., 2003. An introduction to variable and feature selection. The Journal of Machine Learning
Research, 3: 1157-1182.
Haq, M., Akhtar, M., Muhammad, S., Paras, S., Rahmatullah, J., 2012. Techniques of remote sensing and GIS for flood
monitoring and damage assessment: a case study of Sindh province, Pakistan. The Egyptian Journal of Remote
Sensing and Space Science, 15(2): 135-141.

Heerdegen, R.G., Beran, M.A., 1982. Quantifying source areas through land surface curvature and shape. Journal of
Hydrology, 57(3-4): 359-373.
Heitmuller, F.T., Hudson, P.F., Asquith, W.H., 2015. Lithologic and hydrologic controls of mixed alluvial–bedrock
channels in flood-prone fluvial systems: Bankfull and macrochannels in the Llano River watershed, central
Texas, USA. Geomorphology, 232: 1-19.
Hellendoorn, H., Driankov, D., 2012. Fuzzy model identification: selected approaches. Springer Science & Business
Media.
Hoang, N.-D., Tien Bui, D., 2016. A Novel Relevance Vector Machine Classifier with Cuckoo Search Optimization for
Spatial Prediction of Landslides. Journal of Computing in Civil Engineering.
DOI:10.1061/(ASCE)CP.1943-5487.0000557
Hong, Y.-S.T., White, P.A., 2009. Hydrological modeling using a dynamic neuro-fuzzy system with on-line and local
learning algorithm. Advances in water resources, 32(1): 110-119.
Hostache, R., Lai, X., Monnier, J., Puech, C., 2010. Assimilation of spatially distributed water levels into a shallow-
water flood model. Part II: Use of a remote sensing image of Mosel River. Journal of hydrology, 390(3): 257-
268.
Jang, J.S.R., Sun, C.T., Mizutani, E., 1997. Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning
and Machine Intelligence (Matlab Curriculum Series). Prentice Hall, 614 pp.
Jayakrishnan, R., Srinivasan, R., Santhi, C., Arnold, J., 2005. Advances in the application of the SWAT model for water
resources management. Hydrological processes, 19(3): 749-762.
Kantardzic, M., 2011. Data mining: concepts, models, methods, and algorithms. John Wiley & Sons, Hoboken, New
Jersey.
Kazakis, N., Kougias, I., Patsialis, T., 2015. Assessment of flood hazard areas at a regional scale using an index-based
approach and Analytical Hierarchy Process: Application in Rhodope–Evros region, Greece. Science of the Total
Environment, 538: 555-563.
Kia, M.B. et al., 2012. An artificial neural network model for flood simulation using GIS: Johor River Basin, Malaysia.
Environ Earth Sci, 67(1): 251-264.
Kreft, S., Eckstein, D., Junghans, L., Kerestan, C., Hagen, U., 2014. Global climate risk index 2015. Who suffers most
from extreme weather events: 1-31.
Lee, M.-J., Kang, J.-e., Jeon, S., 2012. Application of frequency ratio model and validation for predictive flooded area
susceptibility mapping using GIS, Geoscience and Remote Sensing Symposium (IGARSS), 2012 IEEE
International. IEEE, pp. 895-898.
Li, X.H., Zhang, Q., Shao, M., Li, Y.L., 2012. A comparison of parameter estimation for distributed hydrological
modelling using automatic and manual methods, Advanced Materials Research. Trans Tech Publ, pp. 2372-2375.
Liu, K. et al., 2016. Coupling the k-nearest neighbor procedure with the Kalman filter for real-time updating of the
hydraulic model in flood forecasting. International Journal of Sediment Research.
Liu, Y., De Smedt, F., 2005. Flood modeling for complex terrain using GIS and remote sensed information. Water
Resources Management, 19(5): 605-624.
Lloyd, A.J., Beckmann, M., Favé, G., Mathers, J.C., Draper, J., 2011. Proline betaine and its biotransformation products
in fasting urine samples are potential biomarkers of habitual citrus fruit consumption. British Journal of
Nutrition, 106(06): 812-824.
Lohani, A., Kumar, R., Singh, R., 2012. Hydrological time series modeling: A comparison between adaptive neuro-
fuzzy, neural network and autoregressive techniques. Journal of Hydrology, 442: 23-35.
Loo, Y.Y., Billa, L., Singh, A., 2015. Effect of climate change on seasonal monsoon in Asia and its impact on the
variability of monsoon rainfall in Southeast Asia. Geoscience Frontiers, 6(6): 817-823.
DOI:http://dx.doi.org/10.1016/j.gsf.2014.02.009
Mitchell, M., 1998. An introduction to genetic algorithms. MIT press.
Mukerji, A., Chatterjee, C., Raghuwanshi, N.S., 2009. Flood forecasting using ANN, neuro-fuzzy, and neuro-GA
models. Journal of Hydrologic Engineering, 14(6): 647-652.
Musharavati, F., 2010. Process planning optimization in reconfigurable manufacturing systems. Universal-Publishers.
Nayak, P., Sudheer, K., Rangan, D., Ramasastri, K., 2005. Short‐term flood forecasting with a neurofuzzy model. Water
Resources Research, 41(4).
Nayak, P.C., Sudheer, K.P., Rangan, D.M., Ramasastri, K.S., 2004. A neuro-fuzzy computing technique for modeling
hydrological time series. Journal of Hydrology, 291(1-2): 52-66. DOI:10.1016/j.jhydrol.2003.12.010
Niedda, M., 2004. Upscaling hydraulic conductivity by means of entropy of terrain curvature representation. Water
resources research, 40(4).
Pham, B., Tien Bui, D., Pourghasemi, H., Indra, P., Dholakia, M.B., 2015. Landslide susceptibility assesssment in the
Uttarakhand area (India) using GIS: a comparison study of prediction capability of naïve bayes, multilayer
perceptron neural networks, and functional trees methods. Theoretical and Applied Climatology: 1-19.
DOI:10.1007/s00704-015-1702-9
Poli, R., Kennedy, J., Blackwell, T., 2007. Particle swarm optimization. Swarm intelligence, 1(1): 33-57.
Polikar, R., 2006. Ensemble based systems in decision making. Circuits and systems magazine, IEEE, 6(3): 21-45.

Pradhan, B., Shafiee, M., Pirasteh, S., 2009. Maximum Flood Prone Area Mapping using RADARSAT Images and
GIS: Kelantan River Basin. International Journal of Geoinformatics, 5(2).
Pulvirenti, L., Pierdicca, N., Chini, M., Guerriero, L., 2011. An algorithm for operational flood mapping from Synthetic
Aperture Radar (SAR) data based on the fuzzy logic. Natural Hazard and Earth System Sciences.
Qi, S. et al., 2009. Inundation extent and flood frequency mapping using LANDSAT imagery and digital elevation
models. GIScience & Remote Sensing, 46(1): 101-127.
Quinlan, J.R., 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, USA.
Reynaud, A., Nguyen, M.-H., 2012. Monetary Valuation of Flood Insurance in Vietnam, Mimeo, Toulouse School of
Economics.
Reynaud, A., Nguyen, M.-H., 2016. Valuing Flood Risk Reductions. Environmental Modeling & Assessment: 1-15.
DOI:10.1007/s10666-016-9500-z
Rezaeianzadeh, M., Tabari, H., Yazdi, A.A., Isik, S., Kalin, L., 2014. Flood flow forecasting using ANN, ANFIS and
regression models. Neural Computing and Applications, 25(1): 25-37.
Sahoo, B., Chatterjee, C., Raghuwanshi, N.S., Singh, R., Kumar, R., 2006. Flood estimation by GIUH-based Clark and
Nash models. Journal of Hydrologic Engineering, 11(6): 515-525.
Sarker, R.A., Ray, T., 2010. Agent-Based Evolutionary Search, 5. Springer Science & Business Media.
Sattari, M.T., Pal, M., Apaydin, H., Ozturk, F., 2013. M5 model tree application in Daily River flow forecasting in Sohu
Stream, Turkey. Water Resources, 40(3): 233-242.
Savage, N. et al., 2013. Air quality modelling using the Met Office Unified Model (AQUM OS24-26): model
description and initial evaluation. Geoscientific Model Development, 6(2): 353-372.
Seckin, N., Cobaner, M., Yurtal, R., Haktanir, T., 2013. Comparison of artificial neural network methods with L-
moments for estimating flood flow at ungauged sites: the case of East Mediterranean River Basin, Turkey. Water
resources management, 27(7): 2103-2124.
Shaw, R., Ishiwatari, M., Arnold, M., 2011. Community-based Disaster Risk Management.
Shu, C., Ouarda, T., 2008. Regional flood frequency analysis at ungauged sites using the adaptive neuro-fuzzy inference
system. Journal of Hydrology, 349(1): 31-43.
Smith, K., Ward, R., 1998. Floods: physical processes and human impacts. John Wiley and Sons Ltd.
Talei, A., Chua, L.H.C., Quek, C., Jansson, P.-E., 2013. Runoff forecasting using a Takagi–Sugeno neuro-fuzzy model
with online learning. Journal of Hydrology, 488: 17-32.
Tehrany, M.S., Lee, M.-J., Pradhan, B., Jebur, M.N., Lee, S., 2014a. Flood susceptibility mapping using integrated
bivariate and multivariate statistical models. Environ Earth Sci, 72(10): 4001-4015.
Tehrany, M.S., Pradhan, B., Jebur, M.N., 2013. Spatial prediction of flood susceptible areas using rule based decision
tree (DT) and a novel ensemble bivariate and multivariate statistical models in GIS. Journal of Hydrology, 504:
69-79. DOI:http://dx.doi.org/10.1016/j.jhydrol.2013.09.034
Tehrany, M.S., Pradhan, B., Jebur, M.N., 2014b. Flood susceptibility mapping using a novel ensemble weights-of-
evidence and support vector machine models in GIS. Journal of Hydrology, 512: 332-343.
Tehrany, M.S., Pradhan, B., Jebur, M.N., 2015a. Flood susceptibility analysis and its verification using a novel
ensemble support vector machine and frequency ratio method. Stochastic Environmental Research and Risk
Assessment, 29(4): 1149-1165.
Tehrany, M.S., Pradhan, B., Mansor, S., Ahmad, N., 2015b. Flood susceptibility assessment using GIS-based support
vector machine model with different kernel types. Catena, 125: 91-101.
The MathWorks, Inc., 2014. Fuzzy Logic Toolbox User's Guide R2014b.
Tien Bui, D., Le, K.-T., Nguyen, V., Le, H., Revhaug, I., 2016a. Tropical Forest Fire Susceptibility Mapping at the Cat
Ba National Park Area, Hai Phong City, Vietnam, Using GIS-Based Kernel Logistic Regression. Remote
Sensing, 8(4): 347.
Tien Bui, D., Nguyen, Q.-P., Hoang, N.-D., Klempe, H., 2016b. A Novel Fuzzy K-Nearest Neighbor Inference model
with Differential Evolution for Spatial Prediction of Rainfall-Induced Shallow Landslides in a Tropical Hilly
Area using GIS. Landslides. DOI:10.1007/s10346-016-0708-4
Tien Bui, D., Pham, T.B., Nguyen, Q.-P., Hoang, N.-D., 2016c. Spatial Prediction of Rainfall-Induced Shallow
Landslides Using Hybrid Integration Approach of Least Squares Support Vector Machines and Differential
Evolution Optimization: A Case Study in Central Vietnam. International Journal of Digital Earth.
DOI:10.1080/17538947.2016.1169561
Tien Bui, D., Pradhan, B., Lofman, O., Revhaug, I., Dick, O.B., 2012a. Application of support vector machines in
landslide susceptibility assessment for the Hoa Binh province (Vietnam) with kernel functions analysis, iEMSs
2012 - Managing Resources of a Limited Planet: Proceedings of the 6th Biennial Meeting of the International
Environmental Modelling and Software Society, pp. 382-389.
Tien Bui, D., Pradhan, B., Lofman, O., Revhaug, I., Dick, O.B., 2012b. Landslide susceptibility assessment in the Hoa
Binh province of Vietnam: A comparison of the Levenberg-Marquardt and Bayesian regularized neural
networks. Geomorphology, 171-172: 12-29.
Tien Bui, D., Pradhan, B., Lofman, O., Revhaug, I., Dick, O.B., 2012c. Landslide susceptibility mapping at Hoa Binh
province (Vietnam) using an adaptive neuro-fuzzy inference system and GIS. Computers & Geosciences, 45:
199-211. DOI:10.1016/j.cageo.2011.10.031

Tien Bui, D., Pradhan, B., Revhaug, I., Trung Tran, C., 2014. A Comparative Assessment Between the Application of
Fuzzy Unordered Rules Induction Algorithm and J48 Decision Tree Models in Spatial Prediction of Shallow
Landslides at Lang Son City, Vietnam. In: Srivastava, P.K., Mukherjee, S., Gupta, M., Islam, T. (Eds.), Remote
Sensing Applications in Environmental Research. Society of Earth Scientists Series. Springer International
Publishing, Cham, Switzerland, pp. 87-111. DOI:10.1007/978-3-319-05906-8_6
Tien Bui, D., Tuan, T.A., Klempe, H., Pradhan, B., Revhaug, I., 2016d. Spatial prediction models for shallow landslide
hazards: a comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel
logistic regression, and logistic model tree. Landslides, 13: 361-378. DOI:10.1007/s10346-015-0557-6
Tokar, A.S., Johnson, P.A., 1999. Rainfall-runoff modeling using artificial neural networks. Journal of Hydrologic
Engineering, 4(3): 232-239.
Toth, E., Brath, A., Montanari, A., 2000. Comparison of short-term rainfall prediction models for real-time flood
forecasting. Journal of Hydrology, 239(1): 132-147.
Tottrup, C., 2004. Improving tropical forest mapping using multi-date Landsat TM data and pre-classification image
smoothing. International Journal of Remote Sensing, 25(4): 717-730.
Tropeano, D., Turconi, L., 2004. Using historical documents for landslide, debris flow and stream flood prevention.
Applications in Northern Italy. Natural Hazards, 31(3): 663-679.
Tsai, W.-P., Chang, F.-J., Chang, L.-C., Herricks, E.E., 2015. AI techniques for optimizing multi-objective reservoir
operation upon human and riverine ecosystem demands. Journal of Hydrology, 530: 634-644.
Tucker, C., Sellers, P., 1986. Satellite remote sensing of primary production. International Journal of Remote Sensing,
7(11): 1395-1416.
Were, K., Tien Bui, D., Dick, Ø.B., Singh, B.R., 2015. A comparative assessment of support vector regression, artificial
neural networks, and random forests for predicting and mapping soil organic carbon stocks across an
Afromontane landscape. Ecological Indicators, 52: 394-403.

Highlights:

► MONF is proposed for flood susceptibility modeling.

► MONF has high performance on the training and validation datasets.

► MONF outperforms the benchmark models, i.e., J48DT, RF, MLP Neural Nets, and SVM.
