
Proceedings of the 2nd International Conference on Inventive Communication and Computational Technologies (ICICCT 2018)

IEEE Xplore Compliant - Part Number: CFP18BAC-ART; ISBN:978-1-5386-1974-2

Database Intrusion Detection System Using Octraplet and Machine Learning

Souparnika Jayaprakash
Amrita Center for Cyber Security Systems and Networks
Amrita School of Engineering
Amrita Vishwavidyapeetham
Kollam, Kerala-690525
Email: souparnikassj@gmail.com

Kamalanathan Kandasamy
Amrita Center for Cyber Security Systems and Networks
Amrita School of Engineering
Amrita Vishwavidyapeetham
Kollam, Kerala-690525
Email: ammaskamal@gmail.com

Abstract—Over the years, digitization has increased to such an extent that nearly every service is being automated and moved online. Online services have gained such popularity and trust that all kinds of user information, both personal and private, are stored in databases. This in turn has shifted the focus of attackers towards the databases that store this valuable information. Although security mechanisms exist for host-based systems as well as networks, security breaches still occur every day and data are being stolen. A focus on database security has therefore become a necessity. This paper proposes a fully automated database intrusion detection system that addresses both insider and outsider attacks and can thwart breaches that go undetected by network-based or host-based intrusion detection systems. The proposed system is flexible and can be fine-tuned to the increasing complexity and dynamic nature of databases. Our architecture is an anomaly-based detection mechanism that implements role-based access control (RBAC). A new data structure called Octraplet is used for storing the SQL queries. The system uses the Naive Bayes classifier, a supervised machine learning method, for detecting anomalous queries. The proposed approach can improve the detection rate as well as the performance of the system.

Keywords—Database Security, Octraplet, Role-Based Access Control, Naive Bayes Classifier.

I. INTRODUCTION
The recent advances in digitization have increased the popularity of online services. Companies have to store sensitive information in their databases, such as bank account details, transactions, patient medical records, private data, contracts and so on. It is the duty of the companies to preserve the confidentiality, integrity and availability of the data. Any violation of the policies or any data breach brings heavy financial loss as well as loss of the company's goodwill. A Database Intrusion Detection System (DIDS) can detect attackers firing malicious queries at the database and report them.

There are two kinds of intrusion detection mechanisms: signature based and anomaly based. A signature-based mechanism checks for pre-defined patterns and detects only those. The variety of attacks it covers is small, since new attack patterns appear every now and then, so this mechanism cannot be applied in the case of dynamic databases. Anomaly-based detection keeps track of all normal queries, and any deviation from a normal query is categorized as an attack. Our proposed system makes use of the anomaly detection scheme.

In general, two kinds of attacks are possible in databases. An insider attack is one where a person inside the organization misuses existing privileges. External attacks are carried out by persons outside the organization who gain excessive privileges. This happens when users are granted database privileges that exceed their requirements. The DIDS thwarts both these attacks. Unauthorized privilege elevation is another attack, in which the attacker escalates his privileges against company policies. This can be done using inference techniques. In a privilege abuse attack, users access data for unauthorized purposes such as copying data or taking screen captures. One of the most dangerous attacks is the SQL injection attack, which exploits front-end vulnerabilities such as weak authentication and weak validation schemes. Our proposed system addresses all these vulnerabilities and can detect any of these attacks on occurrence and raise an alarm to the database administrator.

The Octraplet structure stores all information related to the database transaction-wise. In the transaction-based approach, all queries in a transaction are stored as a single Octraplet. The Naive Bayes classifier (NBC) is a supervised machine learning algorithm that is used in our system for training and for detection of malicious queries.

II. RELATED WORK
Several intrusion detection systems have been developed for host systems [1] and networks [2], but relatively few notable works exist for database intrusion detection. One of the earlier works, by Chung et al. [3], proposes a misuse detection approach for database intrusion detection. Here frequent data patterns are mined and stored as normal profiles. The main disadvantage is that it does not create role profiles. Users perform different actions based on their roles, so user profiles cannot be used as the only criterion: users performing legitimate actions based on their roles will be detected as malicious. Lee et al. [4] proposed a real-time intrusion detection system based on time signatures. Real-time database systems use temporal data objects whose values change with time. Each time a value is updated, a sensor transaction is generated. The temporal data is updated over a period of time. If a transaction tries to change temporal data that has already been updated within that period, an alarm is raised. The disadvantage of this method is that it focuses on updates only and not on role profiles. Hu and Panda [5] use log files to generate user profiles. Frequently accessed data and tables are stored for comparison. The problem with this approach is that maintenance of the data is very difficult when the size of the database is

978-1-5386-1974-2/18/$31.00 ©2018 IEEE 1413



too large and the number of users also increases dynamically. The method suggested by Bertino et al. [6] creates a normal profile for each role for anomalous pattern detection. They also used the naive Bayes classifier [7] to detect anomalous SQL queries. The problem encountered here was the false detection rate: even normal queries were marked as malicious, which makes data unavailable to legitimate users. The method proposed by Kamra et al. [8] creates a new data structure called Triplet that stores three pieces of information about the SQL query. Here too the naive Bayes classifier is used to generate normal role profiles. The disadvantage of this system is that it considers neither the correlation among queries nor information about the WHERE clause. Hashemi et al. [9] mine correlations among data items, where each data item is treated as time-series data. The method proposed by C. A. Ronao et al. [10] is based on principal component analysis [10] and random forests [10] to classify malicious queries. The major disadvantage is that it shows no change in detection rate compared with the naive Bayes classifier, and it is not effective against SQL injection and insider attacks. The method proposed by Indu Singh et al. [11] is based on a counting Bloom filter [12] and token management [11]. The drawback of this approach is that it cannot perform inference detection for complex dynamic queries, and privilege escalation is not addressed. The method suggested by Niklas Rappel et al. [13] uses a weighted naive Bayes classifier and the MLMS approach [14]. Its disadvantages are poor performance during frequent weight updates and ineffective insider attack detection. Finally, the approach suggested by Saad M. Darwish [15] uses the naive Bayes classifier and the Hexplet data structure for storing information related to SQL statements. His paper uses a transaction-based approach rather than a query-based approach. The disadvantage is that the role-related information must be present in the log files. There is also a significant amount of overhead with the dynamically changing structure and size of the database.

III. PROPOSED SYSTEM

Our system uses a new data structure called Octraplet for storing information related to SQL transactions. The naive Bayes classifier [] is the machine learning algorithm used for training and classification. The advantage of our proposed approach is that the Octraplet stores data efficiently, thus improving performance irrespective of dynamic changes in the structure and size of the database. The system also addresses database attacks such as SQL injection, privilege escalation and unauthorized privilege abuse. When the user fires a query at the database, it is processed to generate an Octraplet. The roles are maintained by the database administrator and can be changed dynamically in accordance with requirements. The Octraplet is an eight-array structure. Based on the octraplet, a role profile is generated. It is then compared with the existing profiles generated by the classifier from the log file. The checking system does the comparison. The response engine generates the appropriate response or informs the administrator in accordance with the comparison.

A. SYSTEM DESIGN
The proposed system consists of four components: database log files, a profile generator [15], a comparison mechanism and a response engine. The normal profiles are generated from the database log files; they are learned from the log files using the naive Bayes classifier [6]. The profiles are stored separately for comparison. Every time a transaction is issued by the user, an Octraplet is generated from the transaction. Based on the generated octraplet, a role profile is generated. The generated structure is then compared with the existing structure. The architecture is shown in Fig. 1. If there is a positive match, the query is executed. Otherwise the response engine can generate three kinds of responses based on the matching percentage. An alarm is given to the administrator if the severity rate is above 8.0 on a scale of 10. The query is blocked automatically if the rate is between 6.0 and 8.0. Otherwise the query is executed.

In the architecture diagram (Fig. 1), the data for analysis is taken from the database log files. The log file has all information regarding past queries and transactions. The logs are then transferred to a transaction processor, where the queries are converted to binary values so that they can be applied to the naive Bayes classifier [8]. Based on the roles, normal profiles are generated. Any deviation from the normal profiles is considered an abnormality. Octraplets are generated from the logged queries. Similarly, Octraplets are generated from the queries entered by the user and fed to the naive Bayes classifier to obtain probabilities. A comparison is performed, and an alarm is generated based on the threshold values.

B. OCTRAPLET DATA STRUCTURE
For extracting user profiles and conveniently storing the SQL query attributes, our proposed system uses the octraplet. Normal user behavior is extracted using the attributes stored in the octraplet. The octraplet of the user-entered query is calculated. Log files are examined and their data is also converted to octraplets for further processing. The octraplet uses a transaction-based approach, where all queries under a transaction are stored as a single octraplet. The attributes are stored as a binary relation. The Octraplet is an eight-array relation.
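As a rough illustration of storing attributes "as a binary relation", the following minimal Python sketch encodes the attributes referenced by one transaction as 0/1 vectors over a schema. The schema, the field names of the `Octraplet` class, the helper names `encode` and `build_octraplet`, and the idea of pairing a per-relation bit vector with a per-attribute bit matrix are all assumptions made for illustration; the paper does not spell out the exact layout. SQL parsing is deliberately left out: the sets of projected and selected attributes are taken as given, and support and confidence are passed in as placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

# Hypothetical schema: relation name -> ordered list of attribute names.
SCHEMA: Dict[str, List[str]] = {
    "R1": ["A1", "B1", "C1", "D1"],
    "R2": ["A2", "B2", "C2", "D2"],
}

Attr = Tuple[str, str]  # (relation, attribute)


@dataclass
class Octraplet:
    """Eight fields; the names are illustrative, not taken from the paper."""
    cmd_id: int                 # id of the first query of the transaction
    proj_rel: List[int]         # 1 if the relation has a projected attribute
    proj_attr: List[List[int]]  # per relation, 1 for each projected attribute
    sel_rel: List[int]          # 1 if the relation appears in a WHERE clause
    sel_attr: List[List[int]]   # per relation, 1 for each selection attribute
    txn_id: int
    support: float
    confidence: float


def encode(schema: Dict[str, List[str]], used: Set[Attr]):
    """Turn a set of (relation, attribute) pairs into binary vectors."""
    rel_bits, attr_bits = [], []
    for rel, attrs in schema.items():
        row = [1 if (rel, a) in used else 0 for a in attrs]
        rel_bits.append(1 if any(row) else 0)
        attr_bits.append(row)
    return rel_bits, attr_bits


def build_octraplet(schema, cmd_id, txn_id, projected, selected,
                    support, confidence) -> Octraplet:
    """Build one octraplet for a whole transaction (all queries merged)."""
    proj_rel, proj_attr = encode(schema, projected)
    sel_rel, sel_attr = encode(schema, selected)
    return Octraplet(cmd_id, proj_rel, proj_attr, sel_rel, sel_attr,
                     txn_id, support, confidence)


if __name__ == "__main__":
    # A transaction that projects R1.A1, R1.C1, R2.A2 and filters on
    # R1.C1 = R2.C2; support and confidence are placeholder values here.
    o = build_octraplet(
        SCHEMA, cmd_id=1, txn_id=1,
        projected={("R1", "A1"), ("R1", "C1"), ("R2", "A2")},
        selected={("R1", "C1"), ("R2", "C2")},
        support=0.0, confidence=0.0,
    )
    print(o)
```

Because every field is a fixed-length bit vector for a given schema, octraplets built this way can be flattened into binary feature vectors of equal length, which is the kind of input a naive Bayes classifier over binary features expects.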

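The three-way response policy stated in the System Design subsection (alarm above 8.0 on a scale of 10, automatic block between 6.0 and 8.0, execute otherwise) could be sketched as below. This is only a sketch under assumptions: the names `Response` and `respond` are hypothetical, the boundary values 6.0 and 8.0 are treated here as belonging to the "block" band because the text does not say which side they fall on, and the way a severity rate is derived from the matching percentage is not specified in the paper, so the function simply takes the severity as input.

```python
from enum import Enum


class Response(Enum):
    EXECUTE = "execute"   # severity below 6.0
    BLOCK = "block"       # severity from 6.0 to 8.0 (boundaries assumed inclusive)
    ALARM = "alarm"       # severity above 8.0: notify the administrator


def respond(severity: float) -> Response:
    """Map a severity rate on a 0-10 scale to one of the three responses."""
    if not 0.0 <= severity <= 10.0:
        raise ValueError("severity must lie on the 0-10 scale")
    if severity > 8.0:
        return Response.ALARM
    if severity >= 6.0:
        return Response.BLOCK
    return Response.EXECUTE


if __name__ == "__main__":
    for s in (2.5, 6.0, 7.9, 8.0, 9.3):
        print(s, respond(s).value)
```

Whether a query that triggers an alarm is also blocked is not stated in the text, so the sketch returns only the kind of response and leaves that decision to the caller.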



The octraplet is represented as O(m, S_R, S_A, W_R, W_A, I, S, C). Consider a database schema that consists of two relations, R1(A1, B1, C1, D1) and R2(A2, B2, C2, D2); then the octraplet is represented as (SQL_CMD_1, PROJ_ATTR[], PROJ_ATTR_1[][], SELECT_ATTR[], SELECT_ATTR_1[][], ID, SUPPORT, CONFIDENCE). Here m represents a unique id for the first query of each transaction. S_R represents the projection attribute array of each query. S_A represents the projection attributes for a transaction as a two-dimensional array. W_R represents the selection attribute array for each query []. W_A represents the selection attributes for a transaction as a two-dimensional array. S is the support value [16]; it indicates the probability of appearance of a particular dataset in all transactions. C is the confidence value for a query [16]; it indicates how often a rule is found to be true.

Consider the following example of a transaction:

start transaction
SELECT R1.A1, R1.C1, R2.A2 FROM R1, R2 WHERE R1.C1 = R2.C2
end transaction

Then the corresponding octraplet is

<1>, <1, 1>, <1, 0, 1, 0>, <1, 0, 0, 0>, <1, 1>, <0, 0, 1, 0>, <0, 0, 1, 0>, <1>, S, C

where S and C correspond to the respective support and confidence values [16].

C. NAIVE BAYES CLASSIFIER
The naive Bayes classifier is used because of its simplicity. This classifier is advantageous when the dimension of the input is very high. The naive Bayes classifier determines the likelihood of A given B. A training set is required for classification. The naive Bayes classification is described in Kamra et al. [8]. The dataset consists of all normal queries. The NBC is directly applied to the proposed anomaly detection framework by considering the set of roles in the system as classes and the log file octraplets as the observations, using the following equation [15].

[Figure: fig2]

IV. RESULT ANALYSIS
For experimental purposes, a training set was created with approximately 236 queries. The sample database has 7 tables. The result analysis was done using WEKA, the Waikato Environment for Knowledge Analysis [17]. Of these queries, 188 were used as the training set and the remainder as the test set. Fig. 1 shows the classifier results from the WEKA [17] Explorer. Out of 239 queries, 142 were correctly classified and the remainder were false positives.

[Figure: fig3]

V. CONCLUSION AND FUTURE WORK
This paper proposed a transaction-based approach based on naive Bayes classification and the octraplet. In comparison with previous data structures such as the Hexplet and the Triplet, the octraplet can offer better performance and an improved detection rate. The learning algorithm can effectively detect role violations. The naive Bayes classifier is the simplest classification algorithm that can extract all the information available in the log files. Our proposed system will be able to reduce the false positive rate, since it takes support and confidence values into account. The overall performance of the system can remain unchanged under the dynamic nature of the database. The current work included implementing the Octraplet data structure and analyzing its performance in comparison with existing data structures. Future work includes analyzing dependencies of attribute values [18] to improve the detection rate.

REFERENCES

[1] G. Creech and J. Hu, "A semantic approach to host-based intrusion detection systems using contiguous and discontiguous system call patterns," in IEEE Transactions on Computers, 2014.

[2] Poonam Sinai Kenkre, Anusha Pai and Louella Colaco, "Real Time Intrusion Detection and Prevention System," in Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA), 2014.

[3] C. Y. Chung, M. Gertz and K. Levitt, "DEMIDS: A misuse detection system for database systems," in Proceedings of 3rd International Working Conference on Integrity and Internal Control in Information Systems, Netherlands, November 2014.

[4] V. Lee, J. Stankovic and S. Son, "Intrusion detection in real-time database systems via time signatures," in Proceedings of 6th IEEE Real-Time Technology and Applications Symposium, USA, May 2000.

[5] Y. Hu and B. Panda, "Identification of malicious transactions in database systems," in Proceedings of 7th International Database Engineering and Applications Symposium, Hong Kong, July 2003.

[6] E. Bertino, A. Kamra, E. Terzi and A. Vakali, "Intrusion detection in RBAC-administered databases," in Proceedings of the 21st Annual Computer Security Applications Conference, USA, December 2005.




[7] Indu Singh, Lakshya, Kejriwal, Adithya Agarwal, "Conditional adherence based classification of transactions for database intrusion detection and prevention," in International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2016.

[8] A. Kamra, E. Bertino and G. Lebanon, "Mechanisms for database intrusion detection and response," in Proceedings of the 2nd SIGMOD PhD Workshop on Innovative Database Research, Canada, June 2008.

[9] S. Hashemi, Y. Yang, D. Zabihzadeh and M. Kangavari, "Detecting intrusion transactions in databases using data item dependencies and anomaly analysis," in Expert Sys., 2008.

[10] C. A. Ronao, "Mining SQL Queries to Detect Anomalous Database Access using Random Forest and PCA," in International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, 2015.

[11] Indu Singh, Tapasya Singh and Tanya Verma Singh, "Detecting Intrusive Malicious Transactions in Database using Session and Token Management," in International Conference on Computer Systems, Data Communication and Security, GRENZE Scientific Society, 2015.

[12] F. Bonomi, M. Mitzenmacher, R. Panigrahy, Singh, Varghese, "An Improved Construction for Counting Bloom Filters," in LNCS, Springer-Verlag Berlin Heidelberg, 2006.

[13] Niklas Rappel, Julius-Maximilians, "Dynamic Intrusion Detection in Database Systems: A Machine-Learning Approach," in ICIS Proceedings, Dublin, Ireland, 2016.

[14] A. Harel, A. Shabtai, L. Rokach and Y. Elovici, "M-score: A misuseability weight measure," in IEEE Transactions on Dependable and Secure Computing, 2004.

[15] S. M. Darwish, Journal of Electrical Systems and Information Technology, Volume 3, Issue 2, September 2016.

[16] Mostafa Doroudian and Hamid Reza Shahriari, "Database intrusion detection system for detecting malicious behaviors in transaction and inter-transaction levels," in 7th International Symposium on Telecommunications (IST), 2014.

[17] G. Holmes, A. Donkin and I. H. Witten, "WEKA: a machine learning workbench," in Proceedings of ANZIIS '94 - Australian New Zealand Intelligent Information Systems Conference, 1994.

[18] Aleks Jakulin and Ivan Bratko, "Analyzing Attribute Dependencies," in European Conference on Principles of Data Mining and Knowledge Discovery, 2003.

