
Credit Card Fraud Detection System by using Naive Bayes, Generative Adversarial
Network and Neural Networks


Ishtiaq Ali^a, Sohail Asghar^a, Aziz ul Hassan^a, Ijaz Hussain^a,∗

a Computer Science Department, COMSATS University Islamabad, 44000, Pakistan.


During the past few years, the use of e-commerce has grown on a large scale,
and with it the use of credit cards. Many people now use credit cards for
online shopping, e-billing and other online payments. This frequent use of
credit cards is pushing organizations and banks to implement credit card fraud
detection systems that distinguish between illicit and legitimate transactions.
These systems are trained on existing datasets and then applied to new
transactions. Many techniques are used to detect fraudulent transactions, such
as Genetic Algorithm (GA), Support Vector Machine (SVM) and Artificial Immune
System (AIS). In most of these techniques, the classification results are
biased towards the majority class, and this bias increases the False Positive
Rate (FPR) and False Negative Rate (FNR). To overcome this problem, we have
implemented three techniques: Naive Bayes (NB), Generative Adversarial Networks
(GAN) and Neural Networks (NN). The final results are compared in terms of
accuracy, precision, recall and f-measure. Our main objective is to minimize
the FPR and FNR, which ultimately improves the identification of fraudulent and
legitimate transactions. The results show that NN outperforms NB and GAN in
terms of accuracy, precision and f-measure.
Keywords: Credit Card Fraud Detection, Classification, Naive Bayes,
Generative Adversarial Network and Neural Networks.

∗ Corresponding author
Email address: ijazhussain7979@hotmail.com (Ijaz Hussain)

Preprint submitted to Elsevier May 24, 2018

1. Introduction

Over the last decade, e-commerce and online payments have grown rapidly [1].
Credit cards are now commonly used for online payments, which opens the door to
many types of fraud [2]. For all credit card issuers and online payment
management organizations, an effective fraud detection solution is therefore
very important in order to simultaneously improve customer confidence and
reduce losses [3]. The purpose of a fraud detection system is to find
suspicious usage patterns in a set of transactions in which legitimate
transactions are mixed with illicit ones, by using data mining techniques and
sophisticated analytics [4]. This means analyzing large datasets and applying
machine learning to discriminate legitimate transactions from illicit ones.
Machine learning is extremely effective for credit card fraud detection,
particularly supervised classification, in which a classifier is trained on
pre-classified datasets of labeled transactions to build a detection model.
The resulting model is able to find anomalous transactions among legitimate
ones. The class imbalance problem affects the supervised classification
approach because very few illicit transactions are available compared with
legitimate ones [5]. Because of class imbalance, the legitimate class is
over-represented in binary classification. For the rare class, on the other
hand, the number of examples is so low that a learning algorithm may treat the
minority class as noise, ignore it and classify all records as majority class
instances [6]. Standard machine learning algorithms are poorly suited to
imbalanced datasets because they typically aim to maximize accuracy [7].
Classification is therefore biased toward the class with the majority of
instances in the training data: the classification error for the majority class
is lower than the classification error for classes with fewer instances.
However, the prior probabilities of the classes are not always reflected by the
class frequencies in the training data. In many practical binary classification
systems, such as network intrusion prevention, fraud detection or medical
diagnosis support, the False Positive Rate (FPR) and False Negative Rate (FNR)
carry different benefits or costs [8], and it may be necessary to control the
compromise between these errors to a certain degree. In the Neyman-Pearson (NP)
framework [9], the goal is to minimize the FNR subject to the condition that
the FPR does not exceed a maximum value α.
This setting arises quite naturally when the minority class is the class of
interest, as in our case. The FPR and FNR increase because traditional
techniques such as Support Vector Machine (SVM) and Artificial Immune System
(AIS), like many other machine learning techniques, aim to maximize accuracy
and therefore treat the minority class as noise and ignore its instances.
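
The Neyman-Pearson constraint described above can be illustrated on a finite
sample. The following is a minimal sketch, not the procedure of [9]: it assumes
a classifier that outputs a fraud score per transaction, and it simply scans
candidate thresholds, keeping the one with the lowest empirical FNR among those
whose empirical FPR does not exceed α. The function names, scores and labels
are hypothetical and chosen only for illustration.

```python
def rates(scores, labels, threshold):
    """Return (FPR, FNR) when score >= threshold is flagged as fraud (label 1)."""
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    negatives = labels.count(0)
    positives = labels.count(1)
    return fp / negatives, fn / positives


def np_threshold(scores, labels, alpha):
    """Among thresholds with FPR <= alpha, return (threshold, FPR, FNR) with lowest FNR."""
    best = None
    # An infinite threshold flags nothing, so FPR = 0 is always feasible.
    for t in sorted(set(scores)) + [float("inf")]:
        fpr, fnr = rates(scores, labels, t)
        if fpr <= alpha and (best is None or fnr < best[2]):
            best = (t, fpr, fnr)
    return best


# Toy fraud scores (hypothetical): a higher score means more suspicious.
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]
threshold, fpr, fnr = np_threshold(scores, labels, alpha=0.2)
print(threshold, fpr, fnr)
```

Lowering α in this sketch can only keep the FNR equal or push it up, which is
the compromise between the two error types discussed above.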
Our main objectives are to minimize the FPR and FNR and to improve the accuracy
of detecting fraudulent transactions. For this, we used Naive Bayes (NB),
Generative Adversarial Network (GAN) and Neural Networks (NN). GAN is
appropriate for credit card fraud detection because it generates fraudulent
transactions from noise. These generated instances are then combined with the
training set to remove the class imbalance from the data, and a binary
classifier is trained on the resulting data. The main problem in GAN is
stability, which means that the generator and the discriminator must be trained
equally well. If either the generator or the discriminator outperforms the
other, the GAN will not be trained correctly. The back propagation algorithm is
used for training NN. During training, the predicted values are compared with
the actual values to find the error. If the error is greater than a constant
value, the error is back propagated, the weights of the layers are re-assigned
and the network is trained again. By using back propagation in NN, the accuracy
increased and the FPR and FNR decreased.
The next section of our paper is dedicated to related work. In section 3 we
describe the techniques used. In section 4 we discuss the simulations and
results. Finally, we conclude the paper in section 5.

2. Related Work

Data mining based credit card fraud detection has gained serious attention
from researchers. The large volumes of data made available by many data
warehouses need to be analyzed carefully. The most promising and effective
solutions for credit card fraud detection are supervised approaches based on
machine learning. In [10], credit card fraud detection was proposed for the
first time, using several data-visualization methods with supervised learning.
In [11], NB, the K-Nearest Neighbor algorithm and SVM were applied to credit
card fraud detection. The authors also introduced a bagging ensemble classifier
based on decision trees, which gives more accurate results than traditional
decision tree algorithms. In [12], a model was proposed to address the credit
card fraud detection problem by combining the Simple K-Means and Principal
Component Analysis (PCA) algorithms. The geographical positions of the client
and the transaction are added to the traditionally studied data to augment the
model. The proposed model gives the best results for accuracy. However, the
execution time increases because the k-means process is repeated for different
initial clusters. In [13], a hybrid approach of danger theory and SVM was
proposed for credit card fraud detection. Danger theory filters out fraudulent
transactions and SVM then classifies these transactions. The accuracy of credit
card fraud detection is improved by danger theory because it removes abnormal
data. However, the execution time increases because extra time is needed to
remove the abnormalities from the data. In [14], an NN was trained with an
evolutionary simulated annealing algorithm to detect credit card fraud in a
real-time scenario. The proposed solution performed well in terms of time and
cost for both users and organizations. However, many transactions are
mis-classified by this solution, i.e. a fraudulent transaction is classified as
genuine or vice versa. In [15], two data-driven approaches based on optimal
anomaly detection techniques were proposed for fraud detection in a real-time
scenario. The efficiency of the approaches was checked on real data of European
credit card holders. Both approaches provide good results on real-time data in
terms of accuracy and false alarm rate, and both benefit individual users and
organizations in terms of time efficiency and cost. However, when applied to
large datasets they did not give good results in terms of accuracy. In [3], a
personalized SVM-based system was proposed that protects credit cards from
fraud by using data collected in advance on the behavior of consumers. In [16],
an algorithm was proposed to apply SVM successfully to class-imbalanced data,
and the results were compared with different algorithms. In [17], models based
on decision trees and SVM were proposed for credit card fraud detection and
their results were compared. In [18], a mechanism was implemented with the
Neuroph IDE that uses NN to detect credit card fraud. With this mechanism, the
classification is very accurate and errors are within the maximum error rate.
However, the number of iterations is not limited in advance: the mechanism
trains itself for as many iterations as are required, so the execution time
grows with the number of iterations. In [19], a neural network was designed for
credit card fraud detection with the help of a Genetic Algorithm (GA). GA is
used to find the best network topology, i.e. the number of nodes and hidden
layers used in the neural network. In [20], AIS was implemented for credit card
fraud detection; GA and exhaustive search were used for parameter optimization,
and the results were compared with NN, NB and decision trees. In [21], a fraud
detection model was proposed based on AIS and an immune-system-inspired
algorithm. In [22], a case-based genetic artificial immune system for credit
card fraud detection was proposed, which can learn online with limited cost and
time. In [7], the minority class was over-sampled by duplicating minority class
instances. However, this strategy adds no informative content to the dataset.
"A data mining with hybrid approach based Transaction Risk Score Generation
Model (TRSGM) for fraud detection of online financial transactions" was
implemented in [23]. In [24] and [25], decision tree classifiers based on
modified C4.5 algorithms were used for rule-based fraud detection systems. In
[26], genetic algorithms were used for real-time fraud detection while also
minimizing false alerts. In [27], GA and scatter search were used to develop a
credit card fraud detection system that minimizes the mis-classification cost
instead of the mis-classification error.
Like many of the above techniques, we aim to re-balance the training set and to
minimize the FPR and FNR. We use GAN and NN to pursue this goal. In GAN, the
example instances generated by the generator are injected into the training set
to balance the minority class, instead of over-sampling by duplication. In NN,
the predicted results are repeatedly compared with the actual labels to
minimize false predictions. Our techniques, especially GAN and NN, not only
increase accuracy like other machine learning techniques, but also minimize the
FPR and FNR.

3. Credit Card Fraud Detection System

The credit card payment system is continuously targeted by fraudsters, which is
why enterprises have to monitor transactions, detect suspicious behavior and
try to prevent it in order to preserve their customers' trust in the electronic
payment system [28]. We propose a credit card fraud detection system that
monitors transactions by using NB, GAN and NN. The proposed system is presented
in Figure 1. The dataset is provided as input and preprocessed to remove
ambiguities, and the preprocessed dataset is then passed to the techniques
used.
NB, GAN and NN are trained and tested on a dataset of credit card transactions
taken from Kaggle, and their results are compared in terms of accuracy, recall,
precision and f-measure. We take precision, recall and f-measure into account
because our aim is to minimize the FPR and FNR. We compare the results to find
out which technique has high accuracy and minimum FPR and FNR.
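
The measures named above are all derived from the four cells of a binary
confusion matrix. The sketch below uses the standard definitions, with fraud as
the positive class (label 1); the helper names and the toy labels are
illustrative assumptions, not part of our implementation.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN with fraud (1) as the positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Standard evaluation measures computed from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),  # legitimate transactions flagged as fraud
        "fnr": fn / (fn + tp),  # frauds that were missed
    }


# Toy labels (hypothetical): 4 frauds and 6 legitimate transactions.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
result = metrics(*confusion_counts(actual, predicted))
print(result)
```

Since recall equals 1 − FNR, a higher recall directly means fewer missed
frauds, while precision falls as false positives grow.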



[Figure 1 diagram. Box labels recovered from the diagram: Testing Set (30%),
Training Set (70%), Adversarial Neural Network, Naïve Bayes, Generated Data,
Combined Data, XgBoost Classifier (twice), Predictions (four times).]

Figure 1: Credit Card Fraud Detection System.

3.1. Naive Bayes

NB classification is based on Bayes' theorem. NB has been studied extensively
since the 1950s. In the 1960s it was introduced under a different name in the
text retrieval community, and it remains a popular method for text
categorization, the process of dividing text documents into different
categories. NB classifiers are very scalable: the learning process requires a
number of parameters that is linear in the number of variables or features, so
NB is well suited to high-dimensional data. The method of maximum likelihood is
used for parameter estimation. In spite of its oversimplified assumptions, NB
often performs well in many complex real-world situations. In our system, the
input data is split into a training set and a testing set. Prior probabilities
are then calculated from the training set, and the model is trained on the
basis of these prior probabilities to produce predictions. To test the model,
the test set is passed through it and the predictions are compared with the
actual values to determine the accuracy.
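
As an illustration of the procedure just described, the following is a minimal
Gaussian NB sketch. The Gaussian likelihood for continuous features is an
assumption made here for the example; the text above does not state which
likelihood model is used. Priors, means and variances are maximum likelihood
estimates, and the function names and toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # The small constant keeps variances positive (an assumption of this sketch).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log prior plus log likelihood."""
    def log_posterior(prior, means, variances):
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(row, means, variances))
        return math.log(prior) + log_likelihood
    return max(model, key=lambda label: log_posterior(*model[label]))


# Toy data (hypothetical): class 0 is legitimate, class 1 is fraudulent.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.1, 2.1], [5.0, 8.0], [5.2, 7.9]]
y = [0, 0, 0, 0, 1, 1]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [1.0, 2.0]), predict_gaussian_nb(model, [5.1, 8.0]))
```

The number of stored parameters is one prior plus two values per feature for
each class, which is the linear scaling mentioned above.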

3.2. Generative Adversarial Networks

GAN is composed of two feed-forward neural networks, a Generator (G) and a
Discriminator (D), which compete with each other. G produces new data instances
from noise, while D evaluates the quality of these data instances. Both the
generator and the discriminator are deep neural networks with many layers,
connected in such a way that the output of one layer is the input to the units
of the next layer. Each layer thus represents what has been learned from the
original inputs, and the layers correspond to levels of composition or
abstraction. By changing the size and number of layers, one can achieve
different levels of abstraction. The farther a level is from the original
input, the higher the level of abstraction of its representation. In this way,
deep networks can discover good hierarchical models in which abstract concepts
are learned at the higher levels. Mechanisms of this kind are well suited to
modeling datasets precisely in many real-world scenarios.
The main idea of GAN is to refine G until it can outwit D, whose goal is to
distinguish between real and generated examples. The input of the generator is
random noise n, which it transforms through a function to produce examples,
while the discriminator's job is to differentiate between the instances
generated by the generator and real instances. The generator learns the
probability distribution of the training data by mapping n to this
distribution, and generates new examples that look like real data instances.
The adversarial discriminator network, on the other hand, has to correctly
differentiate between the artificial examples and the real data instances, in
order to beat the generator.
The generator's training goal is to trick the discriminator into believing that
the examples it generates are real. The discriminator is trained to minimize
its prediction error, whereas the generator is trained to maximize the
prediction error of the discriminator. This competition between generator and
discriminator can be formalized as a minimax game:

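$$\min_{\theta_G} \max_{\theta_D} \left( \mathbb{E}_{x \sim p_D}[\log D(x)] + \mathbb{E}_{z \sim p_Z}[\log(1 - D(G(z)))] \right) \tag{1}$$

where $p_D$ is the distribution of the data, $p_Z$ is the prior distribution of
the generator network's noise input $z$ (the random noise written n in the
text), $\theta_G$ denotes the parameters of the generator network and
$\theta_D$ the parameters of the discriminator network.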
The generator's goal is to keep the difference between generated and real data
small, while the discriminator's goal is to maximize the probability of
distinguishing generated examples from real ones [4].
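
To make the objective in Eq. (1) concrete, the sketch below evaluates its
empirical estimate for given discriminator outputs. The probabilities are
hypothetical numbers, not outputs of a trained model. The discriminator wants
this value to be large, and the generator wants it to be small.

```python
import math


def gan_value(d_real, d_fake):
    """Empirical estimate of the objective in Eq. (1).

    d_real: D(x) on real samples; d_fake: D(G(z)) on generated samples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that separates real from generated samples well.
confident = gan_value([0.90, 0.95, 0.85], [0.10, 0.05, 0.20])
# A discriminator that cannot tell them apart and outputs 0.5 everywhere.
fooled = gan_value([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
print(confident, fooled)
```

When the discriminator outputs 0.5 for every sample, the value is 2 log 0.5,
i.e. −log 4, which is lower than the value obtained by the confident
discriminator in this example.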
Training a GAN is not an easy task; stability is the main critical issue. If
the discriminator outperforms the generator during training, the whole GAN will
not be trained correctly. Conversely, if the discriminator is weaker than the
generator, the GAN also ends up in a bad configuration. The two networks
compete to beat one another, which means that each depends heavily on the other
for effective training. When one component fails against the other because the
two are badly unbalanced, the whole GAN fails. The GAN-based procedure works in
the following steps.

1. Train a binary classifier C on the original data, to identify the
hyper-parameters of the classifier that provide the best performance on the
training set.
2. Take all the illicit transactions from the training set T and put them in
another set denoted by F.
3. Train the GAN on the set F, tuning its hyper-parameters.
4. Generate a set of synthetic examples F' from random noise n by using the
trained generator G* of the GAN.
5. Merge the training set T with F' and compare the performance of C trained
on the augmented set with that of C trained on the original training set.
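
The data flow of steps 2, 4 and 5 can be sketched as follows. The generator
here is only a placeholder: it is not a GAN, it merely adds Gaussian noise to
existing fraud rows so that the example runs without a deep learning library.
All function names and the toy rows are hypothetical, and classifier training
(steps 1 and 3) is omitted.

```python
import random


def isolate_fraud(training_set):
    """Step 2: collect the illicit rows (label 1) of T in a separate set F."""
    return [features for features, label in training_set if label == 1]


def placeholder_generator(fraud_rows, n_samples, rng):
    """Stand-in for the trained generator G*. This is NOT a GAN.

    It perturbs randomly chosen fraud rows with small Gaussian noise.
    """
    synthetic = []
    for _ in range(n_samples):
        base = rng.choice(fraud_rows)
        synthetic.append([v + rng.gauss(0.0, 0.05) for v in base])
    return synthetic


def augment(training_set, synthetic_rows):
    """Step 5: merge T with the generated examples, all labeled as fraud."""
    return training_set + [(row, 1) for row in synthetic_rows]


rng = random.Random(0)
# Toy training set T: 8 legitimate rows and 2 fraudulent rows (hypothetical).
T = [([rng.random(), rng.random()], 0) for _ in range(8)]
T += [([2.0, 2.1], 1), ([1.9, 2.3], 1)]

F = isolate_fraud(T)
F_synthetic = placeholder_generator(F, n_samples=6, rng=rng)
T_augmented = augment(T, F_synthetic)
print(len(T), len(T_augmented), sum(label for _, label in T_augmented))
```

In this toy case the fraud class grows from 2 of 10 rows to 8 of 16 rows, so
the augmented set is balanced while the original set T is left unchanged for
the comparison in step 5.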

3.3. Neural Networks

NN is an information processing network inspired by the biological nervous
system, which enables a computer to learn from observational data. The key
element of this information processing paradigm is the novel structure of the
information processing system. The NN field was established before the advent
of computers, but computer simulation of NN started only recently. Many
advances have been made in the field thanks to inexpensive computer emulations.
Before computer emulation, the field went through a period of disrepute and
frustration because of limited funding and a lack of professionals.
NN can derive meaning from imprecise or complicated data, so it can be used to
extract patterns that are hard for humans and other computer techniques to
extract. Once trained, an NN can be regarded as an expert in the type of
information it was given to analyze. This expert is then used to make
predictions when new information is given. NN is used because of its abilities
of adaptive learning, self-organization, fault tolerance via redundant
information coding, and real-time operation. Neural networks provide effective
solutions to many problems such as speech recognition, natural language
processing and image recognition.
A common NN consists of three layers or groups of units: input units, hidden
units and output units. The activity of the input units represents the raw
information that is fed to the hidden units of the network. The activity of
each hidden unit is determined by the activities of the input units and the
weights assigned to the connections between the input and hidden units. The
behavior of the output units depends on the activity of the hidden units and
the weights between the hidden and output units. This simple network is
interesting because the hidden units are free to construct their own
representation of the input. The weights between the input and hidden units
determine when each hidden unit is active, so by modifying these weights a
hidden unit can choose what it represents. In our implementation, the first
layer's dimensions are initialized randomly, the weights between layers are
initialized to zero, and the data is fed in. For training the NN we use the
back propagation algorithm. During training, the produced results are
repeatedly compared with the actual results and the Mean Squared Error (MSE) is
calculated. If the MSE is less than a constant threshold, the network is
returned; otherwise the error is back propagated to re-assign the weights and
the network is trained again.

Algorithm 1: Neural Network Back Propagation Algorithm.

Data: Credit Card Fraud Dataset
Result: Trained Neural Network

initialize weights;
repeat
    for each example in training examples do
        prediction = output of neural network (network, example);
        actual = actual-output(example);
        compute error (prediction − actual) at the output units;
        compute ∆(w_h) from hidden layer to output layer;
        compute ∆(w_i) from input layer to hidden layer;
        update network weights;
until stopping criterion is satisfied;
return network
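
A runnable sketch of Algorithm 1 is given below under simplifying assumptions
that differ from our experimental setup: a single hidden layer, small random
initial weights, and hypothetical values for the learning rate, the MSE
threshold and the toy data. It is meant only to show the compare, back
propagate and update loop with the MSE stopping criterion.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(network, x):
    """Return (hidden activations, output) for one input row x (a list)."""
    w_hidden, w_out = network
    # Each weight row ends with a bias weight, hence the appended 1.0.
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden + [1.0])))
    return hidden, output


def mse(network, data):
    return sum((forward(network, x)[1] - y) ** 2 for x, y in data) / len(data)


def train(data, n_hidden=3, lr=0.5, threshold=0.01, max_epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    network = (w_hidden, w_out)
    for _ in range(max_epochs):
        for x, actual in data:
            hidden, prediction = forward(network, x)
            # Output delta: gradient of the squared error through the sigmoid
            # (up to a constant factor).
            delta_out = (prediction - actual) * prediction * (1.0 - prediction)
            # Hidden deltas use the output weights before they are updated.
            delta_hidden = [delta_out * w_out[j] * h * (1.0 - h)
                            for j, h in enumerate(hidden)]
            for j, h in enumerate(hidden + [1.0]):
                w_out[j] -= lr * delta_out * h
            for j, row in enumerate(w_hidden):
                for i, v in enumerate(x + [1.0]):
                    row[i] -= lr * delta_hidden[j] * v
        # Stopping criterion: return once the MSE falls below the threshold.
        if mse(network, data) < threshold:
            break
    return network


# Toy data (hypothetical): the label equals the first feature.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
network = train(data)
print(mse(network, data))
```

Calling `train(data, max_epochs=0)` returns the untrained network for the same
seed, which gives a baseline MSE to compare against.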

4. Simulation and Results

Credit card fraud datasets are difficult to obtain because banks do not share
their data publicly. We performed experiments on a publicly available credit
card dataset [29], which contains 284,807 transactions made by European card
holders over two days in September 2013. The dataset contains 492 fraudulent
transactions, which is 0.172% of all transactions. Time is the time in seconds
elapsed between a transaction and the first transaction in the dataset, Amount
is the amount of the transaction, and Class is the label indicating whether the
transaction is legitimate or illicit. The numerical features V1 to V28 are
principal components obtained by applying principal component analysis to the
original attributes, at the privacy request of the releasing institution. We
further divided the dataset into a training set and a test set with a split
ratio of 0.7, i.e. 70% of the data for training and 30% for testing.
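
The figures above already show why accuracy alone is misleading on this
dataset. The short calculation below uses only the counts quoted in the text;
the "always legitimate" classifier is a hypothetical baseline, and truncating
the training-set size to an integer is an assumption about how the 0.7 split is
rounded.

```python
TOTAL = 284_807  # transactions in the dataset
FRAUD = 492      # fraudulent transactions

fraud_share = FRAUD / TOTAL

# Hypothetical baseline that labels every transaction as legitimate.
baseline_accuracy = (TOTAL - FRAUD) / TOTAL
baseline_fnr = 1.0  # every fraud is missed
baseline_fpr = 0.0  # no legitimate transaction is flagged

# Sizes of a 70/30 split (truncating the training size to an integer).
train_size = int(TOTAL * 0.7)
test_size = TOTAL - train_size

print(f"fraud share: {fraud_share:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
print(f"train/test sizes: {train_size}/{test_size}")
```

Such a baseline exceeds 99.8% accuracy while detecting no fraud at all, which
is why precision, recall and f-measure are reported alongside accuracy.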
Experiments are performed in Spyder using Python 3.6. For NB, we loaded the
data and split it into training and testing sets with a split ratio of 0.7.
Afterwards, we calculated the prior probabilities of the classes, assigned
class labels to new instances on the basis of these prior probabilities, and
then calculated the accuracy.
For GAN, we loaded the dataset of credit card transactions, removed duplicates,
performed data exploration and applied an XGBoost classifier to the dataset.
After that, we isolated all fraudulent transactions from the dataset and
trained the generator on that fraudulent set. We randomly set the number of
generation steps between 1 and 2000 and generated example instances through the
generator. We then combined the generated examples with the training set and
trained XGBoost on the augmented set. Finally, we compared the performance of
the classifiers trained on the original training set and on the augmented set.

$$\sigma(z) = \frac{1}{1 + e^{-z}} \tag{2}$$
For NN, we split the loaded data into training and testing sets with a split
ratio of 0.7. We modeled an NN consisting of 5 layers, 3 of which are hidden
layers. The sigmoid function of Eq. (2) is used to introduce non-linearity into
the model: each NN element computes a linear combination of its input signals
and applies the sigmoid function to the result [30]. We use the backward
propagation algorithm for supervised learning of the NN and forward propagation
to find the error in the model. Finally, we compute the accuracy of the model.
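
The forward propagation through a 5-layer network with sigmoid units can be
sketched as below. The input size of 30 corresponds to the features Time, V1 to
V28 and Amount; the hidden layer sizes, the weight initialization and the
random input row are hypothetical choices made only for this illustration.

```python
import math
import random


def sigmoid(z):
    """Sigmoid activation of Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layers(sizes, rng):
    """Create one (weights, biases) pair per pair of consecutive layers."""
    return [([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(layers, x):
    """Each unit applies the sigmoid to a linear combination of its inputs."""
    activation = x
    for weights, biases in layers:
        activation = [sigmoid(sum(w * a for w, a in zip(row, activation)) + b)
                      for row, b in zip(weights, biases)]
    return activation


# Input layer, three hidden layers (hypothetical sizes) and one output unit.
LAYER_SIZES = [30, 16, 8, 4, 1]
rng = random.Random(0)
layers = init_layers(LAYER_SIZES, rng)
x = [rng.gauss(0.0, 1.0) for _ in range(30)]  # one hypothetical transaction
score = forward(layers, x)[0]
print(score)
```

The single output lies strictly between 0 and 1 and can be thresholded to
obtain a legitimate or illicit label.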

Table 1: Precision and recall of GAN, NB and NN with respect to Number of Generations Ng

              Precision                    Recall
Ng      GAN      NB       NN         GAN      NB       NN
1       0.9492   0.6900   0.9878     0.9878   0.9739   0.9876
81      0.9493   0.6920   0.9865     0.9878   0.9739   0.9867
161     0.9493   0.6910   0.9874     0.9883   0.9746   0.9870
301     0.9644   0.7010   0.9876     0.9918   0.9748   0.9871
651     0.9717   0.7030   0.9878     0.9796   0.9747   0.9874
1000    0.9959   0.7100   0.9897     0.9918   0.9746   0.9876
2000    0.9457   0.7130   0.9899     0.9920   0.9742   0.9878

Simulations are performed in Python and the results are presented in Table 1
and Table 2.

Table 2: Accuracy and f-measure of GAN, NB and NN with respect to Number of Generations Ng

              Accuracy                     F-measure
Ng      GAN      NB       NN         GAN      NB       NN
1       0.9710   0.9739   0.9867     0.9681   0.9867   0.9882
81      0.9700   0.9742   0.9982     0.9681   0.9867   0.9884
161     0.9720   0.9744   0.9984     0.9682   0.9869   0.9886
301     0.9810   0.9740   0.9987     0.9779   0.9870   0.9889
651     0.9820   0.9736   0.9979     0.9717   0.9872   0.9887
1000    0.9900   0.9737   0.9980     0.9939   0.9871   0.9888
2000    0.9730   0.9734   0.9985     0.9682   0.9870   0.9884

We implement the credit card fraud detection system not only to improve
accuracy, but also to minimize the FPR and FNR in classification.



[Plot omitted: x-axis No. of Steps (1, 81, 161, 301, 651, 1000, 2000).]

Figure 2: Precision of GAN, NB and NN.

In Figure 2, the precision of NN is higher than that of the other two
techniques, which means that the FPR is minimized better in NN than in NB and
GAN.





[Plot omitted: x-axis No. of Steps (1, 81, 161, 301, 651, 1000, 2000).]

Figure 3: Accuracy of GAN, NB and NN

Figure 3 shows the accuracy of all three techniques. In terms of accuracy, NN
outperforms NB and GAN because the back propagation algorithm is used for
training NN. In back propagation training, the predictions are compared with
the actual data labels after every iteration.




[Plot omitted: x-axis No. of Steps (1, 81, 161, 301, 651, 1000, 2000).]

Figure 4: Recall of GAN, NB and NN

GAN accuracy is high at step 1000 because that is the point where the generator
and discriminator are equally trained. The fake examples generated by the
generator look real, and the discriminator also performs well in distinguishing
between real and fake examples. That is why the accuracy increases. Figure 4
shows the recall of NN, NB and GAN. The recall of GAN is better than that of NN
and NB, which means that the FNR of GAN is lower than that of NN and NB. The
recall of GAN at step 651 is low because the generator is trained better than
the discriminator and fools it into assigning real labels to the generated
examples; the FNR therefore increases and the recall decreases.

Figure 5 shows the f-measure of NN, NB and GAN. In terms of f-measure, NN is
better than GAN and NB because NN classifies records more accurately than NB
and GAN and its FPR is lower. So, the overall performance of NN is better than
that of GAN and NB.






[Plot omitted: x-axis No. of Steps (1, 81, 161, 301, 651, 1000, 2000).]

Figure 5: F-Measure of GAN, NB and NN

5. Conclusion

In this paper, we proposed a credit card fraud detection system using different
techniques, namely NB, GAN and NN, to deal with the class imbalance problem.
Because of class imbalance, machine learning algorithms ignore the minority
class and treat it as noise, which affects the accuracy, FPR and FNR. We used
GAN to over-sample the minority class by generating example instances from
noise and combining these examples with the training set to train a binary
classifier. The performance of the classifier improved in terms of accuracy,
FPR and FNR. In NN, the back propagation algorithm is used to train the
network: the predictions are repeatedly compared with the actual labels. By
using back propagation for training NN, the FPR and FNR are reduced and the
accuracy is increased. The overall performance of NN is better than that of NB
and GAN. We used these techniques for credit card fraud detection, and in
future we plan to use them in other domains where the class of interest is the
minority class.


[1] S. Maes, K. Tuyls, B. Vanschoenwinkel, B. Manderick, Credit Card Fraud
Detection. Applying Bayesian and Neural networks, in: Proceedings of 1st
International Naiso Congress on Neuro Fuzzy Technologies, 2002, pp. 261-

[2] N. W. Wen-Fang YU, Research on Credit Card Fraud Detection Model
Based on Distance Sum, International Joint Conference on Artificial
Intelligence, 2009, pp. 230-234.

[3] R.-C. Chen, M.-L. Chiu, Y.-L. Huang, L.-T. Chen, Detecting Credit Card
Fraud by Using Questionnaire-Responded Transaction Model Based on
Support Vector Machines, Intelligent Data Engineering and Automated
Learning (2004) 800–806.

[4] U. Fiore, A. De Santis, F. Perla, P. Zanetti, F. Palmieri, Using generative
adversarial networks for improving classification effectiveness in credit card
fraud detection, Information Sciences (000) (2017) 1–8.

[5] O. C. C. A. Andrea Dal Pozzolo, Giacomo Boracchi, G. Bontempi, Credit
card fraud detection and concept-drift adaptation with delayed supervised
information, In Neural Networks (IJCNN), International Joint Conference
on. IEEE, 2015, pp. 305-312.

[6] S. S. N. Japkowicz, The class imbalance problem: a systematic study, Intell.
Data Anal. 6 (2002) 429–449.

[7] H. He, E. A. Garcia, Learning from imbalanced data, IEEE Transactions
on Knowledge and Data Engineering 21 (9) (2009) 1263–1284.

[8] C. Elkan, Proceedings of the 17th International Joint Conference on
Artificial Intelligence, Morgan Kaufmann Publishers Inc., 1, 2001, pp. 973-978.

[9] C. Scott, R. Nowak, A Neyman Pearson Approach to Statistical Learning,
IEEE Transactions on Information Theory 51 (11) (2005) 3806–3819.

[10] B. G. Becker, Using mineSet for knowledge discovery, IEEE Computer
Graphics and Applications 17 (4) (1997) 75–78.

[11] M. Zareapoor, P. Shamsolmoali, Application of credit card fraud detection:
Based on bagging ensemble classifier, In: Proceeding of International
Conference on Intelligent Computing, Communication and Convergence, 2015,
pp. 679-685.

[12] M. R. Lepoivre, C. O. Avanzini, G. Bignon, L. Legendre, A. K. Piwele,
Credit Card Fraud Detection with Unsupervised Algorithms, Journal of
Advances in Information Technology 7 (1) (2016) 34–38.

[13] I. Rajak, K. J. Mathai, Immune Support Vector Machine Approach for
Credit Card Fraud Detection System, International Journal of Advance
Foundation and Research in Computer 1 (9) (2014) 32–37.

[14] A. Khan, N. Akhtar, M. Qureshi, Real-Time Credit-Card Fraud Detection
using Artificial Neural Network Tuned by Simulated Annealing Algorithm,
In: Proceeding of International Conference on Recent Trends in
Information, Telecommunication and Computing, 2014.

[15] K. P. TRAN, T. Thu Huong, C. Heuchenne, Real Time Data-Driven
approaches for Credit Card Fraud Detection, In: Proceeding of International
Conference on E-Business and Applications, 2018, pp. 23-28.

[16] R. Akbani, S. Kwek, N. Japkowicz, Applying Support Vector Machine to
Imbalanced Datasets, Machine Learning: ECML (2004) 39–50.

[17] Y. Sahin, E. Duman, Detecting Credit Card Fraud by Decision Trees and
Support Vector Machines, International Multiconference of Engineers and
computer scientists, 2011, pp. 442-447.

[18] D. M. S. J. D. J. S. Nath, Credit Card Fraud Detection Using Neural
Network, International Journal of Soft Computing and Engineering (IJSCE)
2 (April) (2014) 84–88.

[19] R. Oberoi, Improving a credit card fraud detection system using genetic
algorithm, International Journal of Computer and Mathematical Sciences
6 (6) (2017) 436–440.

[20] M. F. A. Gadi, X. Wang, A. P. D. Lago, Credit Card Fraud Detection with
Artificial Immune System, Proceedings of 7th International Conference,
2008, pp. 119-131.

[21] N. Soltani Halvaiee, M. K. Akbari, A novel model for credit card fraud
detection using Artificial Immune Systems, Applied Soft Computing Journal
24 (2014) 40–49.

[22] J. Tuo, S. Ren, W. Liu, X. Li, B. Li, L. Lei, Artificial immune system for
fraud detection, Proceeding of IEEE International Conference on Systems,
Man and Cybernetics, 2004, pp. 1407-1411.

[23] A. R. Jyotindra, N.D., A data mining with hybrid approach based
Transaction Risk Score Generation Model (TRSGM) for fraud detection of online
financial transaction, Int. J. Comput. Appl. 16 18–25.

[24] S. Rosset, U. Murad, E. Neumann, Y. Idan, G. Pinkas, Discovery of fraud
rules for telecommunications—challenges and solutions, Proceedings of the
fifth ACM SIGKDD international conference on Knowledge discovery and
data mining, 1999, pp. 409-413.

[25] H. S. H. Shao, H. Z. H. Zhao, G.-R. C. G.-R. Chang, Applying data mining
to detect fraud behavior in customs declaration, Proceedings. International
Conference on Machine Learning and Cybernetics, 3, 2002, pp. 1241-1244.

[26] K. Ramakalyani, D. Umadevi, Fraud Detection of Credit Card Payment
System by Genetic Algorithm, International Journal of Scientific &
Engineering Research 3 (7) (2012) 1–6.

[27] E. Duman, M. H. Ozcelik, Detecting credit card fraud by genetic algorithm
and scatter search, Expert Systems with Applications 38 (10) (2011) 13057–

[28] S. Bhattacharyya, S. Jha, K. Tharakunnel, J. C. Westland, Data mining for
credit card fraud: A comparative study, Decision Support Systems 50 (3)
(2011) 602–613.

[29] A. D. Pozzolo, O. Caelen, R. A. Johnson, G. Bontempi, Calibrating
probability with undersampling for unbalanced classification, Proceedings - IEEE
Symposium Series on Computational Intelligence (2015) 159–166.

[30] G. Cybenko, Approximation by superpositions of a sigmoidal function,
Springer International 18 (2006) 303–314.