
JOURNAL OF COMPUTING, VOLUME 3, ISSUE 3, MARCH 2011, ISSN 2151-9617

HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING/
WWW.JOURNALOFCOMPUTING.ORG

Data Mining Based on Extreme Learning Machines 
for the Classification of Premium and Regular 
Gasoline in Arson and Fuel Spill Investigation
S. O. Olatunji, Imran A. Adeleke, Alaba Akingbesote

Abstract— In this work, we developed a data mining approach based on extreme learning machines (ELM) for identifying gasoline types. The detection and correct identification of gasoline types during arson and fuel spill investigations are very important in forensic science. As arson and spillage incidents become commonplace, it becomes ever more important to have an accurate means of detecting and classifying gasoline found at such incident sites. However, only a small number of classification models have so far been explored in this germane field of forensic science, particularly as it relates to gasoline identification. The proposed model was constructed using gas chromatography–mass spectrometry (GC–MS) spectral data obtained from gasoline sold in Canada over one calendar year. The prediction accuracy of the model was evaluated and compared with methods used earlier on the same datasets. Empirical results from simulation showed that the proposed ELM-based approach achieved better performance than the other techniques implemented earlier.
Index Terms— Data mining; extreme learning machine; gas chromatography–mass spectrometry; pattern recognition; principal component analysis; artificial neural networks.
• S. O. Olatunji is on study leave from the Computer Science Department, Adekunle Ajasin University, Akungba Akoko, Ondo State, Nigeria
• Imran A. Adeleke is with the Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia
• Alaba Akingbesote is with the Computer Science Department, Adekunle Ajasin University, Akungba Akoko, Nigeria
——————————  ——————————

I. INTRODUCTION

The importance of detecting and accurately classifying gasoline in both arson and environmental spill investigations cannot be over-emphasized. In this work, ELM was used to classify premium and regular gasolines from gas chromatography–mass spectrometry (GC–MS) spectral data [1, 2] obtained from gasoline sold in Canada over one calendar year [3, 4]. In arson, petroleum-based accelerants such as gasoline, kerosene, and paint thinners are often used to accelerate a fire. In some cases, liquid accelerant is left at the scene, which may be matched to samples associated with the suspect. In the environment, gasoline spills are commonplace, but identification of the source is not always straightforward.

The identification of gasoline is crucial for the successful prosecution of an offending individual and/or company. Frequently, gas chromatography is used to fingerprint fuel spills, with the gas chromatograms of the spill sample and the different candidate fuels compared visually in order to seek a best match. This method has some shortcomings. One problem is that the interpretation and classification of the data are limited by the skill and experience of the analyst. Also, visual analysis of gas chromatograms is subjective and is not always persuasive in a court of law. Pattern recognition methods offer a better approach to the problem of matching gas chromatograms of weathered fuels [5]. They involve less subjectivity in the interpretation of the data and are capable of identifying the samples correctly.

Among the pattern recognition methods that have been used for this identification purpose are artificial neural networks (ANN) and principal component analysis (PCA) [6]. Unfortunately, the accuracy of some of these earlier approaches is often limited and is sometimes bedeviled by problems such as over-fitting.
Recently, ELM has been proposed as a new intelligence framework for both prediction and classification [7-11]. It has featured in a wide range of journals, often with promising results.

In this work, we developed an ELM-based identification model for identifying gasoline types. The model is constructed using gas chromatography–mass spectrometry (GC–MS) spectral data obtained from gasoline sold in Canada over one calendar year, as contained in [6]. The prediction accuracy of the model is evaluated and compared with methods used earlier on the same datasets. Based on the excellent performance of ELM on various identification problems surveyed in the literature, coupled with the empirical results from our simulation, we found that the ELM-based model produced accurate and promising results, better than or at least equal to the best among the other methods implemented earlier on the same datasets. To demonstrate the usefulness of the extreme learning machine technique for spill and arson investigation in particular, and for forensic science in general, we describe both the steps and the use of the extreme learning machine as a pattern recognition modeling approach for identifying liquid accelerant left at the scene of an arson or spill, which may be matched to samples associated with suspects.

An ELM classifier has been developed and used to identify gasoline types in arson and spill investigations. Comparative studies were carried out to compare the performance of ELM as a classifier with that of other classifiers already used for the same purpose on the same datasets, namely ANN and PCA. This study thus presents a comparison of ELM with PCA and ANN for the classification of liquid premium and regular gasoline from their GC–MS chromatograms. The ANN referred to was trained using the classical back-propagation learning algorithm [6]; for interested readers, details and structures of both the PCA and ANN models can be found in [6]. ELM has been used in the past for many different types of pattern recognition problems, but this is the first report of applying ELM to the classification of summer and winter, premium and regular grade gasoline from GC–MS chromatogram data.

II. EXTREME LEARNING MACHINES

Extreme learning machine (ELM) is a recently introduced learning algorithm for single-hidden-layer feed-forward neural networks (SLFNs) which randomly chooses the hidden nodes and analytically determines the output weights of the SLFN. In general, the learning speed of feed-forward neural networks (FFNN) is slower than required, and this has become a bottleneck in their applications. According to [9], there are two main reasons for this behavior: one is the slow gradient-based learning algorithms used to train the neural network (NN), and the other is the iterative tuning of the network parameters by these learning algorithms. To overcome these problems, [7-9] proposed a learning algorithm called the extreme learning machine (ELM) for SLFNs. It is stated that "In theory, this algorithm tends to provide the best generalization performance at extremely fast learning speed since it is a simple tuning-free algorithm" [8].

The advent of ELM has been viewed as a welcome and timely development: in the past there seemed to exist an unbreakable virtual speed barrier that classic learning algorithms could not break through, so feed-forward network implementations took a very long time to train, regardless of whether the application was simple or complex. ELM also tends to reach the minimum training error while considering the magnitude of the weights, in contrast to classic gradient-based learning algorithms, which aim only to reach the minimum training error and do not consider the magnitude of the weights. Furthermore, unlike the classical gradient-based learning algorithms, which work only for differentiable activation functions, the ELM learning algorithm can be used to train SLFNs with non-differentiable activation functions [7]. According to [9], "Unlike the traditional classic gradient-based learning algorithms, like back-propagation method, facing several issues like local minimum, improper learning rate and over-fitting, etc, the ELM is a simple tuning-free three-step algorithm that tends to reach the solutions straightforward without such trivial issues".
A. The Learning Process for the Proposed ELM-Based Framework
Let us first define the standard SLFN (single-hidden-layer feed-forward neural network). Given N samples (x_i, t_i), where x_i = [x_i1, x_i2, ..., x_in]^T ∈ R^n and t_i = [t_i1, t_i2, ..., t_im]^T ∈ R^m, the standard SLFN with Ñ hidden neurons and activation function g(x) is defined as

$$\sum_{i=1}^{\tilde{N}} \beta_i \, g(w_i \cdot x_j + b_i) = o_j, \qquad j = 1, \ldots, N, \qquad (1)$$

where w_i = [w_i1, w_i2, ..., w_in]^T is the weight vector connecting the ith hidden neuron and the input neurons, β_i = [β_i1, β_i2, ..., β_im]^T is the weight vector connecting the ith hidden neuron and the output neurons, and b_i is the threshold of the ith hidden neuron. The "·" in w_i · x_j denotes the inner product of w_i and x_j.

The SLFN aims to minimize the difference between o_j and t_j. This can be expressed mathematically as

$$\sum_{i=1}^{\tilde{N}} \beta_i \, g(w_i \cdot x_j + b_i) = t_j, \qquad j = 1, \ldots, N, \qquad (2)$$

or, more compactly, as

$$H\beta = T, \qquad (3)$$

where

$$H(w_1, \ldots, w_{\tilde{N}}, b_1, \ldots, b_{\tilde{N}}, x_1, \ldots, x_N) = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_1 + b_{\tilde{N}}) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_N + b_{\tilde{N}}) \end{bmatrix}_{N \times \tilde{N}}, \qquad (4a)$$

$$\beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_{\tilde{N}}^T \end{bmatrix}_{\tilde{N} \times m} \qquad \text{and} \qquad T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}. \qquad (4b)$$

As proposed by Huang and Babri [12], H is called the hidden-layer output matrix of the neural network.

With the above SLFN specification as background, the training procedure for the proposed ELM-based framework can be summarized in the following algorithmic steps; see [8, 9] for further details on the workings of the ELM algorithm.

Input: the input parameters, i.e. the independent variables (inputs x_i ∈ R^n) and the target variable, i.e. the dependent variable (t_i ∈ R^m); the activation function; and the number of hidden neurons, Ñ.

Output: the target output values and the weights of the output layer.

Mathematically, given a training set {(x_i, t_i) | x_i ∈ R^n, t_i ∈ R^m, i = 1, ..., N}, an activation function g(x), and a number of hidden neurons Ñ, do the following:

Step 0: Initialization. Assign random values to the input weights w_j and the biases b_j, j = 1, ..., Ñ.
Step 1: Compute the hidden-layer output matrix H.
Step 2: Compute the output weights as β = H†T, where H† is the Moore–Penrose generalized inverse of H, and β, H and T are defined as in equations (1)-(4) above.
III. EMPIRICAL STUDIES

A. Experimental data

The experimental data were originally taken from a 1993 Canadian Petroleum Products Institute report on the composition of unleaded summer and winter gasolines [4], and were also reported in [6]. In this report, 44 samples of regular gasoline (22 winter, 22 summer) and 44 samples of premium gasoline (22 winter, 22 summer) were analyzed by GC–MS. The gasolines were collected over the course of one year from different regions across the country. Forty-four compounds that may be present in automotive gasoline in concentrations greater than 1% were reported.

B. Determining the Optimal Parameters through Cross-Validation

The percentage of correctly classified samples is employed as the criterion for determining the ELM parameters using the test-set-cross-validation method. This criterion also provides a more accurate evaluation of the classifier.

An experimental procedure based on test-set-cross-validation was employed in our study. We used stratified sampling to divide the data set into training and testing data, such that the training set comprises 70% of the available data and the testing set the rest. The parameters associated with the ELM were optimized through test-set-cross-validation on the available data set. This entire process was repeated 10 times with different random splittings of the training and testing data sets, again using stratified sampling; the final results were averaged over the 10 runs.

The details of the test-set-cross-validation for optimizing the ELM parameters go as follows. For each run of the generated training and testing sets, the percentage of correctly classified samples was noted for each combination of the parameters, viz. the number of hidden neurons and the activation function type. Searching through all possible values of the parameters in a given range identifies the best value of the performance measure and the corresponding parameter values for the fixed set of features. The optimal parameter values with the best performance measure were thereby identified.

C. Optimal parameter search procedure for ELM

For each run of the generated training and testing sets, the performance measures were monitored for each combination of parameters, namely the number of hidden neurons (N) and the activation function (AF). In our experiment, this process was repeated for every available ELM activation function, each time stepping incrementally through the hidden-neuron values. The optimal parameter values and the activation function option associated with the best performance measure were identified. A summary of the procedure is as follows.

Step 1: Choose the initial activation function option from the list of available options.
Step 2: Identify the best value of the number of hidden neurons through test-set-cross-validation and store the corresponding performance measures.
Step 3: If there is no activation function option left, go to Step 4. Otherwise, take the next activation function option and go to Step 2.
Step 4: Identify the best performance measure and its associated parameter values.
Step 5: Use the optimized activation function option and parameter values to train the final ELM.
Step 6: Calculate the performance measures for both the training and testing sets using the system obtained in Step 5.

This is presented in mathematical form as follows. Let the set A contain all the possible activation function options, with elements of the form A_i(j), where i is the activation function number, j is the selected number of hidden neurons, nf is the total number of activation functions available, and nh is the maximum number of hidden neurons assumed. Also let pm represent the performance measure taken, ix the index of the best activation function, and jx the index of the best number of hidden neurons. The algorithm then goes as follows:

Initialization: jx = 0, vx = 0, ix = 0
for i = 1 → nf
    for j = 1 → nh
        pm = f(A_i(j))   {performance measure for the present parameter combination}
        if pm is better than vx then
            vx = pm
            ix = i; jx = j   {store the indices of the better parameters}
    end
end
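A direct Python rendering of this search, reusing the elm_train and elm_predict helpers sketched in Section II, might look as follows; the particular set of candidate activation functions is our illustrative assumption, since the available options of the implementation are not listed here.

```python
import numpy as np

# Candidate activation options A_1..A_nf (illustrative choices)
ACTIVATIONS = {
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),
    "sine": np.sin,
    "tanh": np.tanh,
}

def grid_search_elm(X_tr, T_tr, X_te, y_te, max_hidden):
    """Search over activation options (i = 1..nf) and hidden-neuron
    counts (j = 1..nh), keeping the best accuracy vx and its indices."""
    ix, jx, vx = None, None, -np.inf
    for name, act in ACTIVATIONS.items():
        for j in range(1, max_hidden + 1):
            W, b, beta = elm_train(X_tr, T_tr, j, activation=act, rng=0)
            out = elm_predict(X_te, W, b, beta, activation=act)
            pm = 100.0 * (out.argmax(axis=1) == y_te).mean()  # pm = f(A_i(j))
            if pm > vx:                    # pm better than vx
                vx, ix, jx = pm, name, j   # store the better indices
    return ix, jx, vx
```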

The optimum parameters identified through the above procedure are then used to build the final ELM, whose results were compared with those of the earlier implemented models. The ELM experiments were subjected to the same conditions as the earlier implemented models; for instance, the data set was divided into a 50% training set and a 50% testing set.
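As a sketch of this evaluation protocol under the stated conditions, the repeated stratified splitting and averaging could be written as follows, using scikit-learn's StratifiedShuffleSplit together with the earlier elm_train and elm_predict helpers; test_size=0.5 mirrors the 50/50 condition mentioned above, while 0.3 would mirror the 70/30 split used during parameter optimization.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

def evaluate_elm(X, y, n_hidden, test_size=0.5, n_runs=10):
    """Average percentage of correctly classified test samples
    over n_runs stratified random train/test splits."""
    T = np.eye(y.max() + 1)[y]            # one-hot class targets
    splitter = StratifiedShuffleSplit(n_splits=n_runs,
                                       test_size=test_size,
                                       random_state=0)
    scores = []
    for tr, te in splitter.split(X, y):   # stratified sampling per run
        W, b, beta = elm_train(X[tr], T[tr], n_hidden, rng=0)
        pred = elm_predict(X[te], W, b, beta).argmax(axis=1)
        scores.append(100.0 * (pred == y[te]).mean())
    return float(np.mean(scores))         # final result: mean over runs
```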

IV. RESULTS AND DISCUSSION

We present here the results of classifying the gasoline into regular and premium in the first instance, and then the classification into premium winter, regular winter, premium summer and regular summer; the second case has four classes, while the first has just two.

A. Results of classification into two groups: premium and regular gasoline

In this section, the results of classifying gasoline as either premium unleaded or regular unleaded using PCA, ANN and ELM are presented. From the results in Table 1 (illustrated in Figure 1), we found that ELM and ANN performed on a par, with PCA performing worse than the two.

Table 1: The percentage of correctly classified regular and premium gasoline

Figure 1: Graphical illustration of the percentage of correctly classified regular and premium gasoline

B. Results of classification into four groups: premium winter, premium summer, regular winter and regular summer gasoline

In this section, the results of classifying gasoline as premium winter, regular winter, premium summer, or regular summer are presented in Table 2 (illustrated in Figure 2), using PCA, ANN and ELM. This second classification is necessary because in a climate such as Canada's, where the samples under investigation were obtained, a higher fuel vapour pressure is required for cold engine-starting in the winter and a lower fuel vapour pressure is required to prevent vapour lock in the fuel line in the summer. Therefore, it was expected that the winter gasolines would tend to have more volatile compounds than the summer gasolines and that significant variations between winter and summer gasoline would be detectable.

From the reported results of the two classification cases, contained in Tables 1 and 2 and Figures 1 and 2, we found that as the number of classes increases from two to four, the performance of PCA and ANN decreases relative to that of ELM. In fact, PCA performed very badly in the four-class case, with just 47.72% correct classification on the testing data set. ANN still performed reasonably, but its testing performance was far lower than that of ELM. Thus, ELM has again distinguished itself as a viable tool for correctly classifying gasoline in the field of forensic science during arson and oil spillage investigations.
Table 2: The percentage of correctly classified PUW, RUW, PUS, and RUS

Figure 2: Graphical illustration of the percentage of correctly classified PUW, RUW, PUS, and RUS

V. CONCLUSION

A data mining approach based on the extreme learning machine has been implemented in this work as a classificatory model for classifying gasoline types. It has been shown that PCA performed to some extent in the classification of gasoline as either premium or regular, though with lower accuracy than ANN and ELM. However, PCA performed very poorly when it was used to sub-classify the gasoline samples into their respective summer/winter groupings, resulting in four classes. As for ANN, it performed excellently on the two-class classification of the gasoline samples, with 100% correct classification, just as the ELM did. But on the four-class classification of gasoline, ANN performance on the testing set dropped to 84.09%, far lower than that of ELM, which stayed at 96.47%. Thus we conclude that ELM has again distinguished itself as a viable tool in the field of forensic science for the correct classification of gasoline samples in the course of arson and oil spill investigations. It could also serve as a powerful classificatory tool in other fields of forensic science, such as the identification of broken glass found at a crime scene during arson investigation.

REFERENCES

1. ASTM Method E1618-97: 'Standard Guide for Identification of Ignitable Liquid Residues in Extracts from Fire Debris Samples by Gas Chromatography–Mass Spectrometry', 1997
2. Ichikawa, M.N., Nonaka, I., and Takada, S.I.: 'Mass spectrometric analysis for distinction between regular and premium motor gasolines', Anal. Sci., 1993, 9, pp. 261-266
3. Lavine, B.K., Brzozowski, D., Moores, A.J., Davidson, C.E., and Mayfield, H.T.: 'Genetic algorithm for fuel spill identification', Analytica Chimica Acta, Elsevier, 2001, 437, (2), pp. 233-246
4. Canadian Petroleum Products Institute: 'Composition of Canadian Summer and Winter Gasolines', 1993, Report No. 945
5. Tan, B.J., and Hardy, R.S.: 'Accelerant classification by gas chromatography/mass spectrometry and multivariate pattern recognition', Analytica Chimica Acta, Elsevier, 2000, 422, pp. 37-46
6. Philip, D., Mark, S., Eric, D.P., Peter, P., Claude, R., and Michael, D.: 'Classification of premium and regular gasoline by gas chromatography/mass spectrometry, principal component analysis and artificial neural networks', Forensic Science International, Elsevier, 2003, 132, pp. 26-39
7. Huang, G.B., Zhu, Q.Y., Mao, K.Z., Siew, C.K., Saratchandran, P., and Sundararajan, N.: 'Can threshold networks be trained directly?', IEEE Trans. Circuits Syst. II, 2006, 53, (3), pp. 187-191
8. Huang, G.B., Zhu, Q.Y., and Siew, C.K.: 'Extreme learning machine: theory and applications', Neurocomputing, Elsevier, 2006, 70, pp. 489-501
9. Huang, G.B., Zhu, Q.Y., and Siew, C.K.: 'Extreme learning machine: a new learning scheme of feedforward neural networks', Proceedings of the International Joint Conference on Neural Networks (IJCNN2004), Budapest, Hungary, 25-29 July 2004
10. Li, M.B., Huang, G.B., Saratchandran, P., and Sundararajan, N.: 'Fully complex extreme learning machine', Neurocomputing, Elsevier, 2005, 68, pp. 306-314
11. Zhu, Q.Y., Qin, A.K., Suganthan, P.N., and Huang, G.B.: 'Evolutionary extreme learning machine', Pattern Recognition, Elsevier, 2005, 38, pp. 1759-1763
12. Huang, G.B., and Babri, H.A.: 'Feedforward neural networks with arbitrary bounded nonlinear activation functions', IEEE Trans. Neural Networks, 1998, 9, (1), pp. 224-229

S. O. Olatunji received the B.Sc. (Hons) degree in Computer Science from Ondo State University, Ado Ekiti, Nigeria (1999), the M.Sc. in Computer Science from the University of Ibadan, Nigeria (2003), and the M.S. degree in Information and Computer Science from King Fahd University of Petroleum and Minerals (KFUPM), Saudi Arabia (2008). He is currently pursuing his PhD in Computer Science. He is a member of ACM and IEEE. He has several years of experience as a lecturer of computer science and has authored several research papers.

Imran A. Adeleke is a PhD candidate at the Department of Information Systems, Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia. He received his M.Sc. and B.Sc. in Computer Science at the University of Agriculture, Abeokuta, and the University of Benin, Benin City, Nigeria, respectively. He is a member of AIS and ACM.

Alaba Akingbesote is with the Computer Science Department, Adekunle Ajasin University, Akungba Akoko, Nigeria. He received the B.Sc. and M.Sc. in Computer Science from Ogun State University, Ago-Iwoye, and the Federal University of Technology, Akure, Nigeria, respectively. He is currently pursuing his PhD in Computer Science. He has several years of experience as a lecturer of computer science.
