
Prediction of Driver Injury Severity in Road Crashes using Artificial Neural Networks: Literature Review

Ashutosh Arun QHS Trainee Central Road Research Institute, New Delhi Email:

The two papers reviewed here investigate the factors governing the injury severity outcome of a road crash using non-parametric Artificial Neural Networks (ANNs). In the first paper (1), the authors tried two well-known neural network paradigms, the Multi-Layer Perceptron (MLP) and Fuzzy Adaptive Resonance Theory (ART) neural networks, to establish relationships between driver injury severity and driver, vehicle, road and environment factors. Using 1997 crash data on two-vehicle crashes at signalized intersections in the Central Florida area, they showed that the MLP neural network performed better than Fuzzy ART, with a classification accuracy of 65.6 and 60.4 percent for the training and testing phases, respectively. Comparing these outcomes with those of an Ordered Logit Model, they argued that the MLP was indeed the better performer. In the second paper (2), the same authors studied traffic safety at toll plazas using MLP and Radial Basis Function (RBF) neural networks. They used these two paradigms, first, to examine the positional distribution of crash locations (i.e. before, at or after the toll plaza) and, second, to predict driver injury severity using the same covariates as in the previous research. They also developed Ordered and Nested Logit Models and compared the performance of the two techniques. They found the Nested Logit model to be the best for crash location analysis, while the RBF neural network worked best for driver injury severity analysis.

Several methods have been employed to determine the important factors that affect the severity outcome of a road crash. Conventional statistical models such as Multiple Linear Regression, Generalized Linear Models and Generalized Estimating Equations have been used extensively in the past. Abdelwahab and Abdel-Aty (1) reviewed the extant literature and reported that various forms of logistic regression, such as the log-linear model, ordered logit model, sequential logit model and nested logit model, have generally been used for this analysis. According to their review, driver- and vehicle-related factors largely govern the severity outcome.

Artificial Neural Networks are non-parametric approaches, so an appropriate probability distribution characterizing the input-output relationships need not be known a priori. Abdelwahab and Abdel-Aty (1) report that ANNs have been used successfully in several transportation problems, including road crashes. They therefore use ANNs to predict driver injury severity at signalized intersections, given that a crash has occurred. Two popular ANN architectures were investigated for the purpose: the Multi-Layer Perceptron (MLP) and Fuzzy Adaptive Resonance Theory (ART) neural networks. The output was classified into one of three injury severity levels: no injury, possible/evident injury (minor injury), and disabling injury/fatality. The factors that they studied were:

Driver characteristics: age, gender, alcohol usage (driving under the influence or not), fault (at fault or not), seat belt usage, and speed.
Vehicle characteristics: vehicle type and point of impact.
Roadway/environment conditions: area type (rural versus urban), day (weekend versus weekday), time (off peak versus peak), light condition (daylight versus night), and weather condition (clear versus not clear).

Abdelwahab and Abdel-Aty (2) note, however, that little research effort has been devoted to the traffic safety of toll plazas and to the impact of Electronic Toll Collection (ETC) systems on highway safety. They cite various papers to argue that, although numerous studies have developed tools and guidelines for toll plaza design, there are no design standards at present, so designs are mostly ad hoc and depend on the experience of the toll operators. Further, they argue, most studies have brought out only the traffic operational benefits of ETC systems; the safety aspects have mostly been neglected, although some studies have shown that the E-Pass system increased the potential for all crash severity types. They therefore applied the MLP and a potentially stronger technique, the Radial Basis Function (RBF) neural network, to study these two aspects. They also investigated the suitability of Ordered Logit and Nested Logit models.

Artificial Neural Networks
An ANN is developed to mimic the problem-solving capabilities of the human brain. Like the brain, an ANN features a network of simple processing units (neurons). These are connected by unidirectional links (connections), which carry numerical data. The processing ability of the network is stored in the strengths of these connections, known as synaptic weights, which are obtained by a process of learning from a set of training patterns. ANNs have been highly successful in mapping non-linear input-output relationships through their hidden layers of neurons. Besides that, according to Abdelwahab and Abdel-Aty (1), they offer certain other advantages through their generalization, adaptive and fault-tolerance capabilities. An ANN is characterized by three features: network architecture, model of a neuron, and learning algorithm. A multilayer ANN has three kinds of layers of neurons: an input layer (to which the input is presented), an output layer (at which the output is produced) and one or more hidden layers (where activation functions are applied to the input). Each layer can have any number of neurons. While the input and hidden layers have one bias node each, no such node is required for the output layer. The ANN architectures reviewed here are all of the feedforward type. The neurons are the information-processing units, and two elements are essential for their functioning:

A formula for calculating the net input of a neuron.
An activation function to define the output of a neuron in terms of the activity level at its input. For example, logistic, hyperbolic tangent, and Gaussian functions are generally adopted as activation functions.
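The two elements above can be sketched in a few lines of Python. This is a minimal illustration, not the formulation used in the reviewed papers: the net input is assumed to be the usual weighted sum plus a bias, and the function names, weights and inputs are hypothetical.

```python
import math

def net_input(inputs, weights, bias):
    """Net input of a neuron, assumed here to be a weighted sum plus a bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

def logistic(v):
    """Logistic activation: squashes the net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def gaussian(v):
    """Gaussian activation: peaks at 1 when the net input is 0."""
    return math.exp(-v * v)

# The three activation families named in the text.
ACTIVATIONS = {"logistic": logistic, "tanh": math.tanh, "gaussian": gaussian}

def neuron_output(inputs, weights, bias, activation="logistic"):
    """Output of a neuron: activation function applied to the net input."""
    return ACTIVATIONS[activation](net_input(inputs, weights, bias))

# Hypothetical example values (not taken from the papers).
x = [0.5, -1.0, 0.25]
w = [0.4, 0.3, -0.8]
v = net_input(x, w, bias=0.1)  # 0.2 - 0.3 - 0.2 + 0.1, i.e. about -0.2
```

Calling `neuron_output(x, w, 0.1, "tanh")` applies the hyperbolic tangent to the same net input; swapping the activation name is the only change needed to compare the three functions.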

The operation of a multilayer ANN is divided into two phases: the training phase and the testing phase. A learning algorithm is used to train the network, while a testing algorithm is used to test it. The training phase works as follows: given a set of N input-desired output patterns [X(1), D(1)], . . . , [X(N), D(N)], the objective is to map X(1) to D(1), . . . , and X(N) to D(N). A quadratic error function is defined, in the standard sum-of-squares form, as

$$Q(W) = \frac{1}{2} \sum_{n=1}^{N} \sum_{i} \left[ D_i(n) - Y_i(n) \right]^2$$

where Q(W) = error function to be minimized, W = vector of network parameters, D_i(n) = desired output at node i for pattern n, and Y_i(n) = actual output calculated by the network at node i. The objective is then to minimize this error function by changing the values of W. In the testing phase, a list of testing patterns [X(1), D(1)], . . . , [X(M), D(M)], where M is the number of patterns in the testing list, is presented to the trained network to find an output. Abdelwahab and Abdel-Aty (2) provide the following formula for checking the classification accuracy:

$$\%CC = \frac{\text{number of testing patterns classified correctly}}{M} \times 100$$

where %CC = percentage of correct classification of the trained neural network and M = number of patterns in the testing list. In both papers, the authors mainly used the back-propagation (BP) algorithm for the training phase, although in (1) they also used the Levenberg-Marquardt (LM) algorithm to train the MLP. Abdelwahab and Abdel-Aty (2) provide detailed procedures for the computation through the BP algorithm for MLP and RBF neural networks.
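The two measures used in the training and testing phases can be illustrated as follows. This is a sketch under stated assumptions: the quadratic error is taken in the common sum-of-squares form with a factor of 1/2, and a pattern is counted as correctly classified when the output node with the largest value matches the desired class (a winner-take-all rule that the review does not spell out). The example targets and outputs are made up.

```python
def quadratic_error(desired, actual):
    """Q = 1/2 * sum over patterns n and output nodes i of (D_i(n) - Y_i(n))^2."""
    return 0.5 * sum(
        (d - y) ** 2
        for d_vec, y_vec in zip(desired, actual)
        for d, y in zip(d_vec, y_vec)
    )

def predicted_class(output_vec):
    """Index of the output node with the largest value (assumed decision rule)."""
    return max(range(len(output_vec)), key=lambda i: output_vec[i])

def percent_correct(desired, actual):
    """%CC: share of patterns whose predicted class matches the desired class."""
    correct = sum(
        predicted_class(d) == predicted_class(y) for d, y in zip(desired, actual)
    )
    return 100.0 * correct / len(desired)

# Hypothetical targets and network outputs for three injury severity levels.
D = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
Y = [[0.8, 0.1, 0.1], [0.3, 0.6, 0.1], [0.2, 0.5, 0.3], [0.4, 0.5, 0.1]]

q = quadratic_error(D, Y)   # about 0.86
cc = percent_correct(D, Y)  # 50.0: the first two patterns are classified correctly
```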

Figure 1: A simple BP algorithm (Courtesy: Abdelwahab and Abdel-Aty, 2001)
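As a rough illustration of the back-propagation idea named in the figure caption, the sketch below trains a one-hidden-layer MLP by gradient descent on the quadratic error. It is not the authors' procedure: the tanh hidden units, single logistic output, pattern-by-pattern updates, learning rate, and the toy data set are all assumptions chosen to keep the example short.

```python
import math
import random

def train_mlp(patterns, n_hidden=3, lr=0.5, epochs=200, seed=0):
    """Train a one-hidden-layer MLP by back-propagation.

    patterns: list of (input_vector, desired_output) with desired_output in {0, 1}.
    Returns the quadratic error accumulated over each epoch.
    """
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    # Each weight row carries a trailing bias weight.
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, d in patterns:
            # Forward pass: input -> tanh hidden layer -> logistic output.
            xb = list(x) + [1.0]
            h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in w_hid]
            hb = h + [1.0]
            y = 1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(w_out, hb))))
            total += 0.5 * (d - y) ** 2
            # Backward pass: local gradients at the output and hidden nodes.
            delta_out = (d - y) * y * (1.0 - y)
            delta_hid = [delta_out * w_out[j] * (1.0 - h[j] ** 2) for j in range(n_hidden)]
            # Gradient-descent weight updates.
            for j in range(n_hidden + 1):
                w_out[j] += lr * delta_out * hb[j]
            for j in range(n_hidden):
                for k in range(n_in + 1):
                    w_hid[j][k] += lr * delta_hid[j] * xb[k]
        history.append(total)
    return history

# Toy, linearly separable data: the class depends only on the sign of the first input.
toy_patterns = [([-1.0, -1.0], 0), ([-1.0, 1.0], 0), ([1.0, -1.0], 1), ([1.0, 1.0], 1)]
history = train_mlp(toy_patterns)
```

The returned `history` lets the error be inspected epoch by epoch, which is also where a stopping criterion (such as an error goal or a maximum number of epochs) would be applied.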

Abdelwahab and Abdel-Aty (1) define Fuzzy ARTMAP as a clustering algorithm that maps a set of input vectors to a set of clusters. They used it because fuzzy ARTMAP networks were the most recent models in the ART family. A fuzzy ARTMAP neural network consists of two fuzzy ART modules, ARTa and ARTb, and an inter-ART module (1). Inputs are presented to the ARTa module, and their corresponding outputs to the ARTb module. The inter-ART module is equipped with a MAP field to determine whether the correct mapping has been established. The authors cite the study by Carpenter et al. for further reference on Fuzzy ARTMAP. Because the performance of fuzzy ARTMAP is affected by the order in which patterns are presented, Abdelwahab and Abdel-Aty (1) used K-means clustering to identify a fixed order for presenting the input patterns during training. Hence, they called their model ordered fuzzy ARTMAP, or O-ARTMAP.
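To give a feel for how a single fuzzy ART module forms clusters, the sketch below implements the commonly described fuzzy ART steps: complement coding, a choice function, a vigilance test and a weight update. It covers only one module; the ARTb module, the inter-ART MAP field and the K-means ordering step are omitted. The parameter values (vigilance `rho`, choice parameter `alpha`, learning rate `beta`) and the input points are assumptions, not values from the reviewed papers.

```python
def complement_code(a):
    """Append the complement (1 - a_i) of every input component."""
    return list(a) + [1.0 - v for v in a]

def fuzzy_and(u, v):
    """Component-wise minimum (fuzzy AND)."""
    return [min(p, q) for p, q in zip(u, v)]

def fuzzy_art(inputs, rho=0.75, alpha=0.001, beta=1.0):
    """Cluster inputs scaled to [0, 1]; returns (category per input, weights)."""
    weights, labels = [], []
    for a in inputs:
        i_vec = complement_code(a)
        # Rank committed categories by the choice function |I ^ w| / (alpha + |w|).
        order = sorted(
            range(len(weights)),
            key=lambda j: sum(fuzzy_and(i_vec, weights[j])) / (alpha + sum(weights[j])),
            reverse=True,
        )
        chosen = None
        for j in order:
            # Vigilance test: |I ^ w| / |I| must reach rho for resonance.
            if sum(fuzzy_and(i_vec, weights[j])) / sum(i_vec) >= rho:
                chosen = j
                break
        if chosen is None:
            # No category matched well enough: commit a new one (fast commit).
            weights.append(i_vec[:])
            chosen = len(weights) - 1
        else:
            # Move the chosen weight vector toward I ^ w.
            w = weights[chosen]
            weights[chosen] = [
                beta * m + (1.0 - beta) * wv for m, wv in zip(fuzzy_and(i_vec, w), w)
            ]
        labels.append(chosen)
    return labels, weights

# Hypothetical two-dimensional inputs forming two well-separated groups.
points = [[0.1, 0.1], [0.15, 0.12], [0.9, 0.85], [0.88, 0.9]]
labels, weights = fuzzy_art(points)  # labels == [0, 0, 1, 1]
```

Because categories are created as patterns arrive, presenting the same points in a different order can yield different clusters, which is the sensitivity that the fixed ordering in O-ARTMAP is meant to address.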

Abdelwahab and Abdel-Aty (1) analyzed the 1997 crash data for the Central Florida area, focusing on two-vehicle crashes occurring at signalized intersections. The data comprised 1168 crashes involving 2336 drivers. Of these 2336 driver records, 2000 (about 86%) were used for the training phase and the remaining 336 (about 14%) for the testing phase. They used Categorical Data Analysis as well as an ANN approach to identify the candidate variables from the full list of variables. With the ANN approach, they started with a full model containing all variables and then compared its performance with that of a model that excluded one particular variable. In this way, the significance of each input variable was assessed.
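The data split and the leave-one-variable-out screening described above can be sketched as follows. This is an illustration only: the random shuffling, the function names and the toy scoring function are assumptions, and `score_model` stands in for whatever routine trains a network on the given inputs and reports its accuracy.

```python
import random

def split_records(records, n_train, seed=0):
    """Shuffle the records and split them into training and testing sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def screen_variables(variables, score_model):
    """Compare the full model with models that each exclude one variable.

    score_model(list_of_variables) -> accuracy (hypothetical callable).
    Returns {variable: drop in accuracy when that variable is excluded}.
    """
    full = score_model(variables)
    return {
        v: full - score_model([u for u in variables if u != v]) for v in variables
    }

# 2336 driver records split into 2000 for training and 336 for testing.
train, test = split_records(list(range(2336)), n_train=2000)

# Toy scoring function with made-up contributions, used only to exercise the loop.
def toy_score(selected):
    contribution = {"seat_belt": 5.0, "speed": 3.0, "day": 0.0}
    return 50.0 + sum(contribution[v] for v in selected)

importance = screen_variables(["seat_belt", "speed", "day"], toy_score)
# importance == {"seat_belt": 5.0, "speed": 3.0, "day": 0.0}
```

A variable whose exclusion leaves the score essentially unchanged, like `day` in the toy example, would be a candidate for removal from the model.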

The MLP was trained on normalized data (in the range -1 to 1) with the Levenberg-Marquardt algorithm. All runs were carried out with a maximum of 100 epochs and a goal mean squared error (MSE) of 0.0001. The input layer had nine neurons representing eight different factors, and the output layer had three neurons representing the three injury levels. All transfer functions at the hidden and output layers were hyperbolic tangent sigmoid functions. For O-ARTMAP, the data were normalized to the range [0, 1], and K-means clustering was then used for ordering. The network that O-ARTMAP created had 285 nodes at the ARTa module and 3 nodes at the ARTb module.

In the second paper (2), they used the same method for screening the significant variables for the ANN models, but used Wald and likelihood-ratio tests for the logit models. Here, they used the 1999 and 2000 accident reports for the Central Florida expressway system, which consists of 10 main-line toll plazas and 42 on/off-ramp toll plazas. They used the following variables in this study:

Accident-related factors: accident location with respect to the plaza structure, accident type, and number of vehicles involved in the accident.
Driver factors: age, gender, driver license type, alcohol involvement, driver violation, stopped in an E-Pass lane or not, and whether the driver is an E-Pass user or not.
Vehicle characteristics: vehicle type, point of impact, number of impacts, and speed ratio (ratio between the running speed at the time of the crash and the posted speed at that section).
Plaza factors: type of toll plaza (main-line versus ramp).
Road/environmental factors: weather condition, lighting condition, and time and day of the accident.

They used 725 cases for modeling, of which 600 (about 83%) were used for the training phase and 125 (about 17%) for the testing phase. For the crash location models, they found that the MLP worked best with 10 hidden nodes and the RBF network with 35 hidden nodes. For the injury severity models, the MLP had 15 hidden nodes and the RBF network had 40 hidden nodes.
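The normalization of inputs to [-1, 1] for the MLP and to [0, 1] for O-ARTMAP, mentioned above for the first study, can be illustrated with simple min-max scaling. The exact scaling rule is not stated in this review, so min-max scaling is an assumption, and the driver ages below are made-up values.

```python
def min_max_scale(values, lo=0.0, hi=1.0):
    """Linearly rescale values so the minimum maps to lo and the maximum to hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Constant column: map everything to the lower bound.
        return [lo for _ in values]
    span = v_max - v_min
    return [lo + (hi - lo) * (v - v_min) / span for v in values]

# Hypothetical driver ages.
ages = [18, 30, 42, 66]
mlp_inputs = min_max_scale(ages, -1.0, 1.0)  # [-1.0, -0.5, 0.0, 1.0]
art_inputs = min_max_scale(ages, 0.0, 1.0)   # [0.0, 0.25, 0.5, 1.0]
```

In practice the minimum and maximum would be taken from the training data and reused for the testing data, so that both sets are scaled consistently.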

In both studies, the MATLAB Neural Network Toolbox was used to develop the ANN models. For the nested logit models, the full information maximum likelihood (FIML) estimation method was used.

Abdelwahab and Abdel-Aty (1) compared the MLP and O-ARTMAP and found the former to have the higher classification accuracy (65.6 and 60.4 percent for the training and testing phases, respectively). The Ordered Logit model, by comparison, had an accuracy of 58.9 and 57.1 percent for the training and testing phases, respectively. Thus, the MLP gave the best performance of all the models. Using the MLP for prediction, they carried out a simulation experiment and found the following results:

Rural intersections are more dangerous in terms of driver injury severity than urban intersections.
Female drivers are more likely to experience a severe injury than male drivers.
A higher speed ratio increases the likelihood of a severe injury.
At-fault drivers are less likely to experience a severe injury than those not at fault.
Wearing a seat belt decreases the chance of having severe injuries.
Drivers of passenger cars are more likely to experience a higher injury severity level than drivers of passenger vans or pickup trucks.
Drivers exposed to impact at their side experience a higher injury severity level than those exposed to impact elsewhere.

Abdelwahab and Abdel-Aty (2) showed that the Nested Logit model was the best for crash location analysis (testing classification accuracy of 63.8%). The model elasticity values showed that plaza type, peak period, vehicle type, and E-Pass use affect the likelihood of crash location. They found the RBF neural network to be the best model for analyzing driver injury severity (testing classification accuracy of 79.2%). They performed a simulation experiment with the RBF network, which gave the following results:

Older drivers tend to have a higher risk of being injured in traffic accidents than younger drivers.
Female drivers have a higher chance of experiencing a severe injury than male drivers.
E-Pass users have a higher chance of being injured when involved in accidents.
Wearing a seat belt decreases the chance of having severe injuries.
Drivers in passenger cars are more likely to experience higher injury severities.
Drivers stopped in E-Pass lanes are more likely to experience severe injuries.

They also suggested that lane markings and illumination in the approach zone of a toll plaza be improved and that essential warning signs be placed at the approach to and downstream of the toll plaza. Drivers should not stop in the ETC lanes, and these lanes should be wide enough to accommodate large trucks.

The two papers reviewed provide useful insight into the application of ANNs to crash modeling. Through their results, the authors argue that the MLP provides reasonable statistical accuracy when used for modeling driver injury severity. Although in (2) they showed the RBF network to give better results than the MLP, they used the same training algorithm for the two models, so the MLP might perform better if another training algorithm (the LM algorithm, for example) were used. Moreover, since they do not give details of the criteria used to stop the iterations, choosing these criteria appropriately is another possible route to better performance. Driver-related factors such as age, gender, speed and seat belt use have been shown to be significant in both studies and should therefore be investigated thoroughly in subsequent research.

1. Abdelwahab, H.T., and Abdel-Aty, M.A. (2001). Development of Artificial Neural Network Models to Predict Driver Injury Severity in Traffic Accidents at Signalized Intersections. Transportation Research Record 1746, pp. 6-13.
2. Abdelwahab, H.T., and Abdel-Aty, M.A. (2002). Artificial Neural Networks and Logit Models for Traffic Safety Analysis of Toll Plazas. Transportation Research Record 1784, pp. 115-125.