Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 161 (2019) 1173–1181
www.elsevier.com/locate/procedia

The Fifth Information Systems International Conference 2019

Web Application Security: An Investigation on Static Analysis with other Algorithms to Detect Cross Site Scripting

Abdalla Wasef Marashdih a, Zarul Fitri Zaaba a,*, Khaled Suwais b, Nur Azimah Mohd a

a School of Computer Sciences, Universiti Sains Malaysia, 11800 Pulau Pinang, Malaysia
b Faculty of Computer Studies, Arab Open University, Saudi Arabia

Abstract

Among web application vulnerabilities, XSS is the most frequently occurring. Wherever a web application accepts user input, this vulnerability makes it possible to inject malicious scripts. The greater part of the literature has concentrated on the application of static analysis to locate XSS vulnerabilities. The reason for this is its capability of achieving virtually 100 percent code coverage and observing every path of the program. Nevertheless, the main limitation of static analysis, namely the false positive rate in its results, persists. Consequently, researchers began to merge static analysis with other algorithms, such as genetic algorithms, machine learning and pattern matching, in order to improve the XSS detection results as well as the run time of static analysis. This paper describes the algorithms that have been used to improve static analysis outcomes for XSS vulnerability detection. Furthermore, the limitation of each method is discussed, showing that the studies still lack an efficient means of detecting XSS vulnerability in PHP web applications.

© 2019 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the scientific committee of The Fifth Information Systems International Conference 2019.

Keywords: Cross Site Scripting; Security; Security Vulnerability; Software Security; Vulnerability Detection; Web Application Security

* Corresponding author. Tel.: +60-46-53-4758; fax: +60-46-53-3335.
E-mail address: zarulfitri@usm.my
1877-0509 © 2019 The Authors. Published by Elsevier B.V.
10.1016/j.procs.2019.11.230

1. Introduction

Web applications are now considered the standard platform for representing data and delivering services across the Web. These services include banking, educational, financial and news sites as well as social media and television channels. Moreover, the most fundamental means of collecting information about any topic is to use a web application, which requires only an internet connection and can be accessed from any location.
As the use of web applications has increased, they have become more attractive to hackers as well as to users. A considerable volume of sensitive data is stored in web applications, which hackers can steal for financial gain. The most recent research and web security reports indicate that cross-site scripting (XSS) is the vulnerability most likely to be found in web applications [1, 2, 3, 4]. XSS is an injection type of attack that leads to the theft of sessions, cookies and sensitive data. In this type of attack, malicious scripts are injected into the pages of a web application; it may occur in any web application that uses user input without encoding or validating it. Consequently, an XSS vulnerability is exploited when malicious scripts are stored on a website or when a user is deceived by a URL containing malicious scripts.
Many approaches and methods focus on detecting this type of vulnerability in a given source code [5, 6, 7, 8], although it remains a problem in web applications. The majority of the previous tools and approaches focused on static analysis for the detection of XSS vulnerabilities, because it can achieve virtually 100% code coverage and observe every path of the program. Moreover, recent studies have found that static analysis is better than other approaches at detecting this type of vulnerability [4, 9]. Combining static analysis with other algorithms (e.g. genetic algorithm (GA), pattern matching and machine learning) has enhanced both the detection results and the run time of static analysis [6, 10, 11, 12]. However, the primary limitation of static analysis, the false positive rate, can still be observed in their results [7, 8]. A false positive occurs when a path is reported as vulnerable even though it is in reality safe or infeasible.
Therefore, this paper highlights approaches that combine algorithms with static analysis to detect XSS vulnerability in PHP web applications. Section 2 discusses related approaches to detecting XSS vulnerability. Section 3 describes XSS vulnerability, followed by the analysis types used to detect it in Section 4. The algorithms combined with static analysis are presented in Section 5. Section 6 compares the various algorithms combined with static analysis and discusses the findings. Finally, Section 7 presents the conclusion and future work.

2. Related Work

Several tools and approaches have focused on detecting this kind of vulnerability in a given source code [5, 6, 7, 8]. However, it is still a current problem in web applications. Gupta et al. [5] suggested an HTML context-sensitive approach based on taint analysis and defensive programming to precisely detect XSS vulnerability in the source code of PHP applications. They also gave automatic suggestions for fixing the vulnerable source code. They found that ignoring the context in which input data is used leads to false detection results. Preliminary experiments and results indicated that their approach was efficient. However, the approach is not fully automated, which would make it difficult to apply to large web applications. An automated tool, by contrast, offers a cost-effective, highly scalable, ongoing security baseline that starts from the initial steps of the Software Development Lifecycle (SDLC) [13, 14].
From another perspective, Shar and Tan [6] suggested that a set of static code attributes could be used to characterize input sanitization code patterns. Vulnerability prediction models were then built from historical data, consisting of these static attributes together with known vulnerability data, in order to predict XSS and SQLI vulnerabilities. A prototype tool known as PhpMinerI was produced to gather the data, and their models were assessed on eight open-source web applications. Their results reveal a false positive rate of 11 percent for detecting SQLI vulnerabilities and a false positive rate of 6 percent for detecting XSS vulnerabilities.
They later improved their method by developing a strategy for constructing vulnerability predictors using machine learners [7]. Information from static and dynamic analyses and the available vulnerability information were used to train these learners. Static analysis involves only the computation of program slices, whereas dynamic analysis is used for deducing the security-checking types of sanitization and validation functions. Subsequently, this deduced data, rather than a precise analysis, was used for the prediction. They attained a successful outcome in predicting XSS vulnerability sinks, at approximately 72 percent. Nevertheless, these outcomes have a false positive rate of approximately 9 percent, which is caused by the application of static code analysis.
From 2006 onwards, researchers began to identify XSS vulnerability in PHP using Pixy, the first static analysis tool for this purpose. However, it is now regarded as outdated and is unable to work on recent versions of PHP [15]. Until the present time, XSS vulnerability remains because of the limitations of the analysis, for example the false positive rate in its results. In the next section, we consider XSS vulnerability and its types.

3. Cross Site Scripting (XSS)

Cross-Site Scripting is a JavaScript code injection attack which enables an attacker to execute injected JavaScript in the victim’s web browser, thereby gaining access to sensitive resources such as credit card numbers, cookies and passwords. Although XSS is an attack against the client-side web browser, the vulnerability it exploits resides on the web server side. The attacker creates and injects a malicious JavaScript payload into the web application in order to exploit the XSS vulnerabilities. The script is injected so that it appears to be a benign part of the website, and it is finally executed within the website’s trust domain [16].
XSS vulnerability can generally be classified into two principal categories: Client-side XSS Vulnerability (DOM-Based XSS vulnerability) and Server-side XSS Vulnerability (Persistent and Non-Persistent XSS Vulnerability) [1, 3, 17]. For the server-side XSS vulnerability, the data is sourced directly from the server and transmitted onto the page; that is, the unsanitised text arrives in the HTTP response that contains the vulnerable page. For the client-side XSS vulnerability, the data comes from JavaScript that manipulates the page. In this case the JavaScript is responsible for adding the unsanitised text to the page, rather than the text being present at that location when the page is first loaded in the browser. Fig. 1 depicts the taxonomy of XSS vulnerability.

Fig. 1. Taxonomy of XSS vulnerability.

A reflected XSS attack occurs when user input containing a malicious script is immediately echoed back in the web page response without correct validation. A stored XSS attack occurs when unvalidated user input containing malicious scripts is stored in the application’s database, and the stored data initiates an attack when it is accessed in a web page. Incorrect validation of user input on the server side causes these two types of vulnerability. A DOM-based XSS attack occurs at the application’s client side [18]. This type of attack causes the client-side script to act unpredictably, since unvalidated data from the DOM (document object model) structure is used by the script to process the application [18].
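The difference between a reflected and a stored server-side XSS flaw can be illustrated with a minimal sketch. It is written in Python rather than PHP purely for illustration; the function names and the in-memory `COMMENTS` list (standing in for the application's database) are hypothetical and do not come from any of the cited tools. The only library call used is `html.escape` from the Python standard library.

```python
import html

PAYLOAD = "<script>alert(1)</script>"

# In-memory list standing in for the application's database (assumption).
COMMENTS = []

def search_page_unsafe(query: str) -> str:
    # Reflected XSS: the input is echoed straight back into the response.
    return "<p>Results for: " + query + "</p>"

def search_page_safe(query: str) -> str:
    # HTML metacharacters are encoded before the input reaches the page.
    return "<p>Results for: " + html.escape(query) + "</p>"

def store_comment(text: str) -> None:
    # Stored XSS: the unvalidated input is persisted first ...
    COMMENTS.append(text)

def comments_page_unsafe() -> str:
    # ... and triggers later, whenever the stored data is rendered.
    return "".join("<li>" + c + "</li>" for c in COMMENTS)

def comments_page_safe() -> str:
    # Encoding at output time neutralises the stored payload as well.
    return "".join("<li>" + html.escape(c) + "</li>" for c in COMMENTS)
```

Calling `search_page_unsafe(PAYLOAD)` returns markup that still contains the `<script>` element, whereas the `_safe` variants return it as inert text.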
The defence consists of ensuring that the application validates every cookie, header, query string, form field and hidden field (that is, all parameters) against a precise specification of what is permissible. Nevertheless, XSS is still regarded as the most frequently occurring vulnerability, particularly in PHP web applications [4]. Various tools and methods have been proposed to identify this kind of vulnerability [5, 6, 7, 8], yet it persists in web applications today, especially those written in PHP. The next section considers the types of analysis used to identify XSS vulnerability in PHP web applications.
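Validation against a specification of what is permissible can be sketched as an allow-list check. The following is an illustrative Python sketch, not a method taken from the cited works; the parameter names and patterns in `SPEC` are assumptions chosen for the example.

```python
import re

# Hypothetical specification: every accepted parameter and the pattern
# its value must match in full. Anything not listed is rejected.
SPEC = {
    "user_id": re.compile(r"[0-9]{1,10}"),
    "lang": re.compile(r"[a-z]{2}"),
}

def validate(params: dict) -> dict:
    """Return the parameters unchanged if all conform to SPEC, else raise."""
    clean = {}
    for name, value in params.items():
        pattern = SPEC.get(name)
        if pattern is None or pattern.fullmatch(value) is None:
            raise ValueError("rejected parameter: " + name)
        clean[name] = value
    return clean
```

The design choice here is to reject unknown or malformed parameters outright instead of trying to repair them, which keeps the specification the single source of truth.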

4. Detection of XSS vulnerability

The term ‘vulnerability detection’ means the procedure of identifying weaknesses inside the application’s source code. It involves identifying vulnerabilities by means of source code analysis, either prior to an application’s deployment or before an attacker discovers them [19, 20]. Vulnerability analysis tools can be classified into dynamic, static and hybrid analysis tools on the basis of the type of information which they utilise from the web application.

4.1. Static analysis

A static analysis tool examines the source code of the web application for the purpose of discovering vulnerabilities. By analyzing the source code, it is able to consider every potential program path through the application, and can therefore detect vulnerabilities on every program path [21]. However, the primary limitation of static analysis, the false positive rate in its results, still persists; it makes the results of static analysis inaccurate and affects the final results of the vulnerability detection. A false positive occurs when a path is reported as vulnerable although in reality it is safe and contains no vulnerability.
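How a static analysis produces such a false positive can be shown with a toy taint analysis. The sketch below is in Python and uses a hypothetical three-statement program representation invented for this example; it is not the algorithm of any tool cited in this paper. It assumes that the condition `admin` does not change between the two guarded statements.

```python
from itertools import product

# Toy program: each entry is (guard, statement). A guard of None means the
# statement always runs; otherwise it runs only when that condition is true.
PROGRAM = [
    (None, ("source", "x")),       # x = $_GET['q']
    ("admin", ("sanitize", "x")),  # if ($admin) x = htmlspecialchars(x)
    ("admin", ("sink", "x")),      # if ($admin) echo x
]

def tainted_sink_reached(statements):
    """Propagate taint along one path; True if tainted data reaches a sink."""
    tainted = set()
    for op, var in statements:
        if op == "source":
            tainted.add(var)
        elif op == "sanitize":
            tainted.discard(var)
        elif op == "sink" and var in tainted:
            return True
    return False

def analyse(program, path_sensitive):
    """Enumerate every branch combination and report the vulnerable ones."""
    guards = [i for i, (g, _) in enumerate(program) if g is not None]
    reports = []
    for choice in product([True, False], repeat=len(guards)):
        taken = dict(zip(guards, choice))
        # A path is infeasible if one condition is assumed both true and false.
        assumed, feasible = {}, True
        for i in guards:
            if assumed.setdefault(program[i][0], taken[i]) != taken[i]:
                feasible = False
        if path_sensitive and not feasible:
            continue
        path = [s for i, (g, s) in enumerate(program) if g is None or taken[i]]
        if tainted_sink_reached(path):
            reports.append(choice)
    return reports
```

With `path_sensitive=False` the analysis reports the path that skips the sanitizer but still reaches the sink, although no execution can take it; with `path_sensitive=True` that infeasible path is discarded and nothing is reported.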

4.2. Dynamic analysis

Dynamic tools do not use the source code; instead, they interact with the web application under examination in a similar way to a user with a web browser [9]. Dynamic analysis tools have the advantage over static analysis of producing fewer false positives, because the fuzzing attempt actually tries to trigger the vulnerability.
Nevertheless, a dynamic analysis tool has the disadvantage that it cannot guarantee to identify every vulnerability in a web application. The reason for this restriction is that a dynamic analysis tool can identify vulnerabilities only in the program paths which it executes [9, 22]. A static analysis tool, in contrast, is able to consider every program path in an application.
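The trade-off described above can be sketched with a tiny black-box fuzzer. This is an illustrative Python sketch; the two handlers and the payload list are hypothetical, and a real dynamic scanner would send HTTP requests rather than call functions directly.

```python
import html

PAYLOADS = ["<script>alert(1)</script>", "<img src=x onerror=alert(1)>"]

def search_handler(q):
    # Hypothetical endpoint that reflects its input without encoding.
    return "<p>" + q + "</p>"

def profile_handler(name, debug=False):
    # The vulnerable branch is executed only when debug is switched on.
    if debug:
        return "<pre>" + name + "</pre>"
    return "<p>" + html.escape(name) + "</p>"

def fuzz(handler):
    """Return the payloads that come back unencoded in the response."""
    return [p for p in PAYLOADS if p in handler(p)]
```

Every finding of `fuzz` is confirmed by an actual response, so there are no false positives, but `fuzz(profile_handler)` returns an empty list because the fuzzer never exercises the `debug` branch where the flaw lives.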

4.3. Hybrid analysis

As their name implies, hybrid analysis tools merge the functions of dynamic and static analysis. Static analysis is applied to produce a list of likely vulnerabilities; a confirmation stage then follows in which the tool attempts to exploit each of them. The tool reports a vulnerability only if this stage is successful. Nevertheless, hybrid analysis also inherits the limitation of static analysis, which is the false positive rate in the results [23]. Consequently, this type of tool is less popular than static analysis and dynamic analysis tools.
Static analysis reveals security problems by examining the application’s code without executing it. Therefore, there is no run-time overhead, and it can potentially achieve 100% code coverage since it can analyze all possible execution paths (unlike testing, where code coverage is a common problem). It is also advantageous because it can be applied early in the software development lifecycle, even when only a portion of the code is available.
Numerous developers consider static analysis the most efficient way of finding vulnerabilities within a web application [24]. For instance, Microsoft believes that code review is approximately 20 to 30 times more effective than software testing in finding bugs [25]. Furthermore, it can reveal about half of the existing bugs when it is implemented adequately [9, 26]. To improve static analysis results, researchers started to combine static analysis with other algorithms. The next section describes the algorithms combined with static analysis to detect XSS vulnerability in PHP web applications.

5. Combined static analysis with other algorithms

5.1. Genetic algorithm

We can define Genetic Algorithms (GAs) as a subset of Evolutionary Algorithms (EAs), both of which are
metaheuristic optimization algorithms inspired by biology and based on population. They adopt natural evolution
mechanisms, like natural selection, crossover, mutation and the survival of the fittest in order to discover the best
solutions in a search space. GAs differ from other EAs because they have a crossover (recombination) operation and
apply binary coding in bits or bit-strings in order to represent a population.
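The mechanisms named above (selection, crossover, mutation and survival of the fittest over a bit-string population) can be sketched in a few lines of Python. The fitness function is a toy objective that simply counts 1-bits; in a security-testing setting it would be replaced by a measure of how close a candidate input comes to a target path, which is an assumption of this sketch and not a description of any cited tool. All parameter values are arbitrary.

```python
import random

def fitness(bits):
    # Toy objective: the number of 1-bits in the individual.
    return sum(bits)

def evolve(n_bits=16, pop_size=20, generations=40, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        # Selection: the fittest half survives unchanged.
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_bits)      # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:    # mutation: flip one bit
                child[rng.randrange(n_bits)] ^= 1
            children.append(child)
        pop = survivors + children
    pop.sort(key=fitness, reverse=True)
    history.append(fitness(pop[0]))
    return pop[0], history
```

Because the survivors are carried over unchanged, the best fitness recorded in `history` never decreases from one generation to the next.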
Andrea and Mariano [27] presented a strategy that combines taint analysis with GA heuristics as a method of enhancing the procedure. They applied hybrid analysis to locate execution paths in the control flow of applications that are likely to contain vulnerabilities, by statically detecting tainted sources (such as input from potential attackers) that are used in sensitive statements without being sanitized. The Pixy tool [15] was used to report control flow paths from the source code that reach exposed sinks while skipping sanitization, and these paths served as the target paths of the genetic search. Since this strategy is effective only for the identification of reflected XSS vulnerability in PHP web-based applications, it was improved by Ahmed and Ali [10], who applied a set of patterns to uncover likely XSS vulnerabilities. They conducted experiments only on reflected and stored vulnerabilities. According to their findings, some paths in the Control Flow Graph (CFG) may not be executable at all, and such non-executable paths need to be eliminated in order to minimize the rate of false positives in the outcomes.
Marashdih et al. [11] improved the strategy of Ahmed and Ali [10] by manually eliminating all infeasible paths from the control flow graph. This yields a more accurate outcome, identifying all of the XSS vulnerabilities without any false positives in the results. However, their experiment included only 15 to 20 lines of source code. Every infeasible path can be eliminated manually only after a meticulous examination of each path to determine whether or not it is feasible. It would therefore be practically impossible to apply their method to large web-based applications composed of considerable volumes of source code. Furthermore, it is necessary to automate the work of identifying vulnerabilities in web-based applications in order to reduce the cost and time required, particularly for large web-based applications.

5.2. Machine learning

Machine learning is a branch of artificial intelligence in which machines learn and adapt to changes in a dataset without the need for repeated explicit programming [28]. It is commonly divided into two types: supervised and unsupervised learning. With supervised learning, the expected outputs (labels) are provided with the training data as a basis for the learner to adapt to and learn from. With unsupervised learning, no labelled outputs are provided, and the data are instead clustered systematically. Supervised learning includes classification techniques such as Support Vector Machines (SVM), while clustering is a part of unsupervised learning.
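As an illustration of supervised learning, the following Python sketch trains a nearest-centroid classifier on labelled feature vectors. The training data are invented for the example: each vector is assumed to hold (number of input sources, number of sanitization calls) for a program slice, with label 1 meaning vulnerable. Neither the features nor the classifier are taken from the tools discussed in this section.

```python
# Hypothetical labelled data: ((input sources, sanitization calls), label).
TRAIN = [((2, 0), 1), ((3, 0), 1), ((1, 1), 0), ((2, 3), 0)]

def fit_centroids(train):
    """Supervised step: average the feature vectors of each labelled class."""
    sums, counts = {}, {}
    for vec, label in train:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(model, vec):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(model, key=lambda lab: dist2(model[lab]))
```

An unsupervised method would receive the same vectors without the labels and would have to group them by similarity alone.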
Scholte et al. [29] introduced the IPAAS method, which supports the secure development of web-based applications by transparently learning types for web application parameters during testing and by automatically applying robust validators for these parameters at runtime. An assessment of their PHP implementation shows that IPAAS can automatically protect real-world applications against most XSS and SQL injection vulnerabilities while incurring a low false positive rate. However, the approach has two limitations. First, when custom query string formats are used, the IPAAS parameter extractor may be unable to parse parameter key-value pairs reliably. Second, the tool continues to be subject to false positives in its results, which remains the primary constraint on the use of static analysis.
Conversely, Medeiros et al. [12] introduced a newer strategy wherein static analysis tools learn to automatically
identify vulnerabilities through machine learning. Their strategy utilized a sequence scheme that learns to characterize
vulnerabilities according to sets of marked source code slices. This analytical framework considers the order in which
encoded elements emerge and execute in the program slices. This strategy was applied in the DEKANT tool and was
assessed experimentally using sets of open-source PHP packages as well as WordPress plugins, leading to findings of
XSS vulnerability along with false positive outcomes.

5.3. Data mining

Data mining refers to the task of finding interesting information and patterns from massive sets of information.
Sources of data can comprise datasets, data warehouses, internet and other data repositories, or any dynamically
streamed information feeding into systems [30]. The practice comprises knowledge discovery processes in massive
and complex datasets that involve the extraction, i.e. “mining” of discoverable knowledge through the systematic
evaluation of such massive amounts of information. Furthermore, data mining can be applied to predict outcomes for
given entities.
Through the use of data mining for identifying XSS vulnerability, Shar and Tan [6, 31] introduced the PhpMinerI
and PhpMinerII data mining tools for assessing the incidence of PHP program vulnerabilities. These toolsets extract
attribute sets from program slices and subsequently apply algorithmic methods to examine the features. Nonetheless,
their findings contain an 11% false positive rate in the detection of SQLI vulnerabilities and a 6% false positive rate
in the detection of XSS vulnerability.
WAP is rather different in that it must detect the locations of vulnerabilities in the code base in order to apply corrective fixes [32]. Furthermore, the approach does not use data mining to detect vulnerabilities but rather to predict whether a vulnerability discovered through taint analysis is real or a false positive. Its outcomes were some 5% better than those of PhpMinerII [6], and some 45% better than those of Pixy [15]. Nevertheless, the problem of false positives from static analysis still persists in the outcomes.

6. Discussion

Although several studies have concentrated on identifying XSS vulnerability, XSS remains prevalent. Static analysis is the most frequently used type of analysis to identify XSS. Nevertheless, this type of analysis continues to be affected by a false positive rate in its outcomes, and it therefore lacks efficiency in identifying XSS vulnerability in PHP web applications. Table 1 depicts the presence of false positives in the results when these algorithms are merged with static analysis.

Table 1. The False Positive Results after merging Algorithms with Static Analysis.

Author                   Year  Automated  Technique/Algorithm                  Type of XSS                   Limitation
                                                                               Reflected  Stored  DOM-Based
-----------------------  ----  ---------  -----------------------------------  ---------  ------  ---------  ----------------------------------------
Andrea and Mariano [27]  2010  No         Static analysis + genetic algorithm  √          -       -          False positive results; manual approach
Scholte et al. [29]      2012  Yes        Static analysis + machine learning   -          -       -          False positive results; the type learning can fail on custom query string formats
Shar and Tan [31]        2012  Yes        Static analysis + data mining        √          √       -          False positive results
Shar and Tan [6]         2013  Yes        Static analysis + data mining        √          √       -          False positive results
Gupta et al. [33]        2015  No         Static analysis + data mining        -          -       -          False positive results; manual approach
Ahmed and Ali [10]       2016  Yes        Static analysis + genetic algorithm  √          √       -          False positive results
Medeiros et al. [12]     2016  Yes        Static analysis + machine learning   √          √       -          False positive results
Medeiros et al. [32]     2016  Yes        Static analysis + data mining        √          √       -          False positive results
Marashdih et al. [11]    2017  No         Static analysis + genetic algorithm  √          √       √          Manual approach


Two main limitations were found in the previous studies that combined algorithms with static analysis to improve its results: the first is the false positive rate in the results, and the second is the manual implementation of the approaches, which limits them to small datasets. As depicted in Table 1, after merging GA, machine learning and data mining with static analysis, the problem of false positives in static analysis persists. On the other hand, Marashdih et al. [11] proposed a manual approach based on GA and static analysis to detect XSS vulnerability, and their results contain no false positives. One explanation for this is that they manually eliminated every infeasible path in the code and ran test cases to detect XSS on feasible paths only [33]. Static analysis outcomes were thus improved through the use of GA to check for the presence of XSS in every feasible path.
However, the approach was applied only to compact programs of 15 to 20 lines of code, and this small dataset constrains its applicability to the detection of XSS vulnerabilities. Because the approach was implemented manually, it would be difficult to apply to different datasets or to large PHP web applications (e.g. more than 1000 lines of code), since finding the feasible paths manually before detecting XSS vulnerability requires considerable time. The approach should be fully automated to be useful in the detection of XSS vulnerability in PHP. Therefore, the main limitation of static analysis persists even after combining it with such algorithms, and it should be solved in order to address the problem of XSS efficiently. Removing the infeasible paths from the source code may reduce the false positive rate in the results of static analysis, and implementing the approaches automatically would also offer a cost-effective, highly scalable, ongoing security baseline that starts from the initial steps of the Software Development Lifecycle (SDLC) [13, 14].
The approaches that use a genetic algorithm can be enhanced to obtain the benefits of both Ahmed and Ali [10] (an automated approach suitable for small and large web applications) and Marashdih et al. [11] (which generates fewer false positives). Promising results are expected from combining these two approaches. Other algorithms and approaches can also benefit from removing infeasible paths from the source code, which are considered the main source of the high false positive rate in static analysis results.

7. Conclusion

Cross-Site Scripting (XSS) vulnerability is among the most frequently occurring vulnerabilities in web applications. This type of vulnerability can lead to violations for the user or the site. Many tools and methods concentrate on discovering this vulnerability in PHP source code. Nevertheless, identifying XSS vulnerability in PHP web applications remains an open issue. The majority of the previous tools and approaches focused on static analysis for the detection of XSS vulnerabilities, because it can achieve virtually 100% code coverage and observe every path of the program. Moreover, recent studies have found that static analysis is better than other approaches at detecting this type of vulnerability. Combining static analysis with other algorithms (e.g. genetic algorithm, pattern matching and machine learning) enhanced the detection results and the run time of static analysis. However, even after merging static analysis with such algorithms, its main problem, the false positive rate in the results, still persists. Only one approach, based on static analysis and a genetic algorithm, has been shown to detect XSS vulnerability without false positive results. However, this approach is still manual, which means it is not suitable for large web applications; it could therefore be improved by automating it. On the other hand, other algorithms (such as Particle Swarm Optimization and Ant Colony) have not yet been applied in this area, and they may also provide better results when combined with static analysis to detect XSS vulnerability.

References

[1] Marashdih, Abdalla Wasef, and Zarul Fitri Zaaba. (2017) “Detection and Removing Cross Site Scripting Vulnerability in PHP Web
Application.” International Conference on Promising Electronic Technologies (ICPET). pp. 26-31.
[2] Veracode. (2016) State of Software Security. Available from: https://www.veracode.com. [Accessed January 2019].
[3] Marashdih, Abdalla Wasef, Zarul Fitri Zaaba, and Khaled Suwais. (2018) “Cross Site Scripting: Investigations in PHP Web Application.”
International Conference on Promising Electronic Technologies (ICPET). pp. 25-30.
[4] OWASP. (2019) Top 10 Threats for Web Application Security. Available from: www.owasp.org/index.php/Top_10_2017-Top_10.
[Accessed January 2019].
[5] Gupta, Mukesh Kumar, Mahesh Chand Govil, and Girdhari Singh. (2014) “A Context-Sensitive Approach for Precise Detection of Cross-
Site Scripting Vulnerabilities.” International Conference on Innovations in Information Technology (IIT). pp. 7-12.
[6] Shar, Lwin Khin, and Hee Beng Kuan Tan. (2013) “Predicting SQL Injection and Cross Site Scripting Vulnerabilities Through Mining
Input Sanitization Patterns.” Information and Software Technology 55 (10): 1767-1780.
[7] Shar, Lwin Khin, Lionel C. Briand, and Hee Beng Kuan Tan. (2015) “Web Application Vulnerability Prediction Using Hybrid Program
Analysis and Machine Learning.” IEEE Transactions on Dependable and Secure Computing 12 (6): 688-707.
[8] Bozic, Josip, and Franz Wotawa. (2013) “XSS Pattern for Attack Modeling in Testing.” International Workshop on Automation of
Software Test (AST). pp. 71-74.
[9] Damodaran, Anusha, Fabio Di Troia, Corrado Aaron Visaggio, Thomas H. Austin, and Mark Stamp. (2017) “A Comparison of Static,
Dynamic, and Hybrid Analysis for Malware Detection.” Journal of Computer Virology and Hacking Techniques 13 (1): 1-12.
[10] Ahmed, A. Moataz, and Fakhreldin Ali. (2016) “Multiple-path Testing for Cross Site Scripting Using Genetic Algorithms.” Journal of
Systems Architecture 64: 50-62.
[11] Marashdih, Abdalla Wasef, Zarul Fitri Zaaba, and Herman Khalid Omer. (2017) “Web Security: Detection of Cross Site Scripting in PHP
Web Application Using Genetic Algorithm.” International Journal of Advanced Computer Science and Applications 8 (5): 64-75.
[12] Medeiros, Ibéria, Nuno Neves, and Miguel Correia. (2016) “DEKANT: A Static Analysis Tool That Learns to Detect Web Application
Vulnerabilities.” International Symposium on Software Testing and Analysis. pp. 1-11.
[13] Zhang, Na, Biao Wu, and Xiaoan Bao. (2015) “Automatic Generation of Test Cases Based On Multi-Population Genetic Algorithm.”
Technology 10 (6): 113-122.
[14] Acunetix. (2016) Acunetix Web Application Vulnerability. Available from: https://www.acunetix.com. [Accessed January 2019].
[15] Jovanovic, Nenad, Christopher Kruegel, and Engin Kirda. (2006) “Pixy: A Static Analysis Tool for Detecting Web Application
Vulnerabilities.” IEEE Symposium on Security and Privacy (S&P'06).
[16] Gupta, Shashank, and Brij Bhooshan Gupta. (2017) “Cross-Site Scripting (XSS) Attacks and Defense Mechanisms: Classification and
State-Of-The-Art.” International Journal of System Assurance Engineering and Management 8 (1): 512-530.
[17] Fonseca, Jose, Nuno Seixas, Marco Vieira, and Henrique Madeira. (2014) “Analysis of Field Data on Web Security Vulnerabilities.” IEEE
Transactions on Dependable and Secure Computing 11 (2): 89-100.
[18] Yusof, Imran, and Al-Sakib Khan Pathan. (2016) “Mitigating Cross-Site Scripting Attacks with a Content Security Policy.” Computer 49
(3): 56-63.
[19] Fonseca, José Carlos Coelho Martins da, and Marco Paulo Amorim Vieira. (2014) “A Practical Experience on The Impact of Plugins in
Web Security.” International Symposium on Reliable Distributed Systems. pp. 21-30.
[20] Prokhorenko, Victor, Kim-Kwang Raymond Choo, and Helen Ashman. (2016) “Web Application Protection Techniques: A Taxonomy.”
Journal of Network and Computer Applications 60: 95-112.
[21] Marashdih, Abdalla Wasef, and Zarul Fitri Zaaba. (2016) “Cross Site Scripting: Detection Approaches in Web Application.” International
Journal of Advanced Computer Science and Applications 7 (10): 155-160.
[22] Bradley, Andrew P. (1997) “The Use of The Area Under The ROC Curve in The Evaluation of Machine Learning Algorithms.” Pattern
Recognition 30 (7): 1145-1159.
[23] Fawcett, Tom. (2006) “An Introduction to ROC Analysis.” Pattern Recognition Letters 27 (8): 861-874.
[24] Wiesmann, Adrian, et al. (2005) “A Guide to Building Secure Web Applications and Web Services.” The Open Web Application Security
Project.
[25] Krsul, Ivan, Eugene Spafford, and Mahesh Tripunitara. (1998) Computer Vulnerability Analysis. COAST Laboratory, Purdue University,
West Lafayette.
[26] Chess, Brian, and Jacob West. (2007) “Secure Programming With Static Analysis.” Pearson Education.
[27] Avancini, Andrea, and Mariano Ceccato. (2010) “Towards Security Testing with Taint Analysis and Genetic Algorithms.” ICSE Workshop
on Software Engineering for Secure Systems. pp. 65-71.
[28] Buczak, Anna L., and Erhan Guven. (2015) “A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion
Detection.” IEEE Communications Surveys & Tutorials 18 (2): 1153-1176.
[29] Scholte, Theodoor, William Robertson, Davide Balzarotti, and Engin Kirda. (2012) “Preventing Input Validation Vulnerabilities in Web
Applications Through Automated Type Analysis.” IEEE Annual Computer Software and Applications Conference. pp. 233-243.
[30] Han, Jiawei, Jian Pei, and Micheline Kamber. (2011) “Data Mining: Concepts and Techniques.” Elsevier.
[31] Shar, Lwin Khin, and Hee Beng Kuan Tan. (2012) “Mining Input Sanitization Patterns for Predicting SQL Injection and Cross Site
Scripting Vulnerabilities.” International Conference on Software Engineering. pp. 1293-1296.
[32] Medeiros, Ibéria, Nuno Neves, and Miguel Correia. (2016) “Detecting and Removing Web Application Vulnerabilities with Static
Analysis and Data Mining.” IEEE Transactions on Reliability 65 (1): 54-69.
[33] Gupta, Mukesh Kumar, Mahesh Chandra Govil, and Girdhari Singh. (2015) “Text-mining Based Predictive Model to Detect XSS
Vulnerable Files in Web Applications.” IEEE India Conference (INDICON). pp. 1-6.
