

Protocol to conduct Systematic Literature Reviews in Software Engineering: a chronological point of view of
the changes made

Technical Report · August 2013


DOI: 10.13140/RG.2.1.2349.7444


2 authors: Samuel Sepúlveda (Universidad de La Frontera) and Ania Cravero (Universidad de La Frontera)


All content following this page was uploaded by Samuel Sepúlveda on 17 August 2015.



TECHNICAL REPORT

TR-DCI-01-13 - V1.0

Protocol to conduct Systematic Literature Reviews in Software Engineering: a chronological point of view
of the changes made.

Samuel Sepúlveda
(samuel.sepulveda@ceisufro.cl)

Ania Cravero
(ania.cravero@ceisufro.cl)

Dpto. Ciencias de Computación e Informática (DCI)


Centro de Estudios en Ingeniería de Software (CEIS)
Universidad de La Frontera (UFRO)
Av. Francisco Salazar, 01145, Temuco, Chile.
Protocol to conduct Systematic Literature Reviews in Software Engineering: a chronological point of view of
the changes made TR-DCI-01-13 v1.0

ABSTRACT

Background: Systematic literature reviews (SLRs) have reached a considerable level of adoption in
software engineering (SE). However, the protocol adaptations made for their implementation have been
addressed only tangentially, which prevents SLRs from reaching their full potential as a research
methodology and as a source of information for the software industry.
Objective: To account for the use and adaptation of the SLR as a research methodology in SE, providing a
chronological study that includes its current status.
Methodology: A systematic literature search was performed, reviewing two sets of articles published
between 2004 and 2011, using digital data sources recognized by the scientific community. The first set
includes 151 articles that report SLRs in SE. The second set comprises 26 articles containing adaptations
for conducting SLRs in SE, of which 11 were selected according to the inclusion/exclusion criteria.
Results: A chronological study is provided that includes the current state of the SLR as a research
methodology in SE, together with a summary of the main proposals for protocol adaptations to conduct SLRs
in SE.
Conclusions: Although other papers have presented observations and critiques of SLRs in SE, no
evidence has been found of papers that specifically report results on their adaptations as a research
methodology applied in software engineering. The results indicate areas where the quantity and quality of
investigations need to be increased.

Key words: Systematic literature review, software engineering, chronological study.

DCI-CEIS-UFRO Sepúlveda and Cravero 2



1. Introduction

Research activity in Software Engineering* (SE) aims to produce knowledge based on the scientific method,
and this has become one of the main challenges in strengthening the foundations of SE as a discipline on
its path to maturity (Rodriguez 2005). This is not limited to the academic world: industry has also
benefited from the scientific method to validate its software technologies (Zelkowitz, Wallace et al. 2003)
and to improve its software processes (Chrissis, Konrad et al. 2003).

Different types of experimental studies can be used in SE (Wohlin, Höst et al. 2006). Some proposals to
support the conduct of these studies can be found in the technical literature (Wohlin, Runeson et al. 2000).
Researchers have applied primary studies to improve the knowledge of SE (Basili, Shull et al. 1999) in
order to support the processes related to SE technologies, mainly those related to appraising the technology
(Shull, Carver et al. 2001). On the other hand, researchers also use secondary studies.

Secondary studies are those designed to produce or assemble systematic comparisons between the individual
investigations, scientifically selected within a series of primary studies that can support the creation of an
evidence-based body of knowledge (Kitchenham, Brereton et al. 2009).

Evidence-based research was developed initially in medicine, since research based on the expert opinion of
medical doctors is not as reliable as the results of scientific experiments (Dybå, Kitchenham et al. 2005).
Since then, many fields have adopted this approach, e.g. criminology, social policy, economics, and
increasingly over the last few years SE (Jørgensen and Shepperd 2007). Evidence-based Software Engineering
(EBSE) is designed to provide the means to obtain the best current evidence from research and to integrate
it with practical experience and human values in decision-making about software development and
maintenance (Dybå, Kitchenham et al. 2005). Here, evidence is understood as a synthesis of high-quality
scientific studies on a specific topic or research problem.

In 2004 the concept of EBSE was introduced as an approach that integrated academic research and industrial
practice in SE (Kitchenham, Dybå et al. 2004). EBSE was then presented from the point of view of the SE
practitioner (Dybå, Kitchenham et al. 2005) and was complemented with a practical way of teaching EBSE to
university students (Jørgensen, Dybå et al. 2005). Later, a far-sighted view was offered of the use of
empirical methods and of how they could contribute to improving research and practice in SE. It identified
the main challenges, chief among them the proposal to increase the resources available for empirical
studies in SE in proportion to the importance of software systems in their social context (Sjøberg, Dybå
et al. 2007).

By analogy with evidence-based medicine, five steps are needed to practice EBSE (Sackett, Rosenberg et al.
1996): (1) convert the need for information (regarding the practice of SE) into answerable questions, (2)
identify with maximum efficiency the best evidence to respond to these questions, (3) critically assess the
evidence for its validity and utility, (4) put the results of this evaluation into practice in SE and (5)
evaluate the outcome of this implementation. The ultimate goal of EBSE is for professionals to use the
appropriate guidelines to provide SE solutions in a specific context (Kitchenham, Brereton et al. 2009).

* SE: Software Engineering
SLR: Systematic Literature Review
EBSE: Evidence-Based Software Engineering

The preferred method for the application of steps 2 and 3 is the systematic literature review (SLR) (da
Silva, Santos et al. 2011). Unlike a peer review, an SLR is a rigorous methodological review of research
results, the aim of which is not only to gather all the existing evidence on a research question, but also
to support the development of evidence-based guidelines for professionals (Kitchenham, Charters et al. 2007).

It was Barbara Kitchenham (2004) who adapted the guidelines for conducting SLRs from medicine to SE.
Later, these guidelines were updated using concepts from the social sciences (Kitchenham, Charters et al.
2007). Nevertheless, an SLR process uses specific concepts and terms that may be unknown to researchers
who carry out ad-hoc literature reviews. In addition, SLRs require additional management effort: they must
be planned prior to execution, and the entire process must be documented, including the interim results
(Biolchini, Mian et al. 2005). This indicates the need to direct research efforts into the development of
planning and execution methodologies that guide researchers through the SLR process; the need to adapt
the process to the field of SE must therefore be considered.

The aim of this work is to account for the use and adaptation of the SLR as a research methodology in SE,
providing a chronological study that includes its current status. The motivation that guides this work
originates in the increase of SLRs conducted in SE, which is why it is interesting to show how the use of
SLRs has been adapted in SE to date. This article may be of interest particularly to researchers planning to
conduct additional studies on SLRs and their application in SE, as well as to industry professionals and new
researchers who wish to approach SLRs as a relevant source of information in SE.

The structure of the article presents a set of related works in section 2. In section 3 the stages that comprise a
SLR are explained and briefly discussed. In section 4 the adaptations in the use of SLRs in SE are shown and
discussed. Finally, in section 5 the main conclusions and considerations of this work are presented.

2. Protocol for Systematic Literature Reviews


The main stages that comprise the SLR process are presented in Figure 1. These stages are planning,
execution, and results analysis (Kitchenham 2004). The activities at each stage are illustrated in
Figure 2, which indicates in parentheses their relationship with the phases of the SLR, as well as the
stages that present changes according to the evidence collected; these changes are detailed in section 3.3.
During the planning stage, the aims of the investigation are enumerated and the review protocol is defined.
This protocol specifies the central research question and the methods that will be used to execute the review.
The execution stage consists of the identification of the primary studies and their selection and
evaluation according to the inclusion and exclusion criteria established in the review protocol. Once the
studies are selected, the data from the articles are extracted and synthesized during the results analysis
stage (Kitchenham 2004).


Figure 1. Systematic Review Process

There are two checkpoints in the systematic review process: (1) before executing the systematic review, it
is necessary to guarantee that the planning is adequate, and (2) the protocol must be evaluated and, if
there are problems, the investigator must return to the planning stage to review the protocol. Likewise, if
problems with the Internet search engines are found in the execution stage, the systematic review must be
executed again (Mian, Conte et al. 2005).
The aforementioned stages may seem sequential, but it is important to recognize that many of them
involve iteration. In particular, many activities begin during the protocol development stage and are
then refined and adapted before being carried out again (Kitchenham 2004; Brereton, Kitchenham et al. 2007).

3. SLR adaptation in SE
Next, we present the results of studying the adaptations made to the SLR in SE. First, the methodology
applied is reported, and then the adaptation of SLRs in SE is explained, presenting the selected proposals
and their main characteristics. Afterwards, some works are shown that, although they do not propose changes
to the SLR protocol, add elements to the discussion that must be considered when improving the design and
use of the SLR in SE in the near future. Finally, some threats to the validity of this work are identified,
and we end with a discussion of the topics treated in this section.

3.1 Methodology applied


To clarify how the data shown in this work were obtained, the fundamental steps of the applied methodology
are presented below. A systematic search of the literature was carried out, compiling background on the
SLRs undertaken in SE. We speak of a systematic search and not of an SLR in the strict sense defined by
(Kitchenham, Charters et al. 2007) because, although the search criteria and the criteria for the inclusion
and exclusion of works were all established, we did not strictly follow every step defined in the protocol.
Instead, we concentrated on compiling SLRs in SE for the period between 2004 (the year of publication of
the first SLR protocol adapted for SE) and 2011. In addition, works were compiled that provided proposals
for, or changes to, the protocol for carrying out an SLR in SE.

- Research questions: This work seeks to answer two research questions.
• RQ1: How has the use of the SLR methodology in SE increased?
• RQ2: How has the original protocol been modified for SLR implementation in SE? The latter was
divided into two questions:
o RQ2.1: How many protocol proposals to develop SLRs in SE or changes to these have been
published?
o RQ2.2: At what stages and activities are the proposals for changes to the protocol concentrated?

Figure 2. Activities at each stage of a SLR

- Search for works: To attempt to answer the previous questions, the systematic search was based on
identifying: (1) SLRs performed between 2004 and 2011 and (2) the works that mention changes or proposals
to the protocol to guide SLRs in SE.


These works were sought in some of the sources most frequently used by the SE community (Brereton,
Kitchenham et al. 2007); in our case we consulted IEEE Xplore, ACM Digital Library and Science Direct. In
these sources the following search strings were used: (1) initially “systematic literature review” OR
“systematic review” and (2) then refining with the string “software engineering”. For the proposals
reporting changes to the protocol for conducting SLRs in SE, the concepts “guidelines”, “protocols”,
“lessons” and “studies” were added to the search string.
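
The search strings described above can be sketched as Boolean queries. This is only an illustrative sketch: the helper names (`or_group`, `build_query`) are hypothetical, the generic AND/OR syntax is an assumption (each digital library has its own query syntax), and the report does not state how the four additional concepts were combined, so joining them with OR is also an assumption.

```python
def or_group(terms):
    """Quote each term and join the alternatives with OR."""
    quoted = [f'"{term}"' for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """AND together several OR-groups of search terms."""
    return " AND ".join(or_group(group) for group in groups)


# Terms taken from the report's description of the search.
BASE_TERMS = ["systematic literature review", "systematic review"]
DOMAIN_TERMS = ["software engineering"]
# Assumption: the extra concepts are alternatives (OR) that narrow the search.
PROTOCOL_TERMS = ["guidelines", "protocols", "lessons", "studies"]

# Query for SLRs published in SE.
slr_query = build_query(BASE_TERMS, DOMAIN_TERMS)
# Query for works proposing changes to the SLR protocol.
protocol_query = build_query(BASE_TERMS, DOMAIN_TERMS, PROTOCOL_TERMS)

print(slr_query)
print(protocol_query)
```

Keeping the term groups as data makes it easy to record in the review documentation exactly which string was run against each source.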

- Selection of works: Once the data sources were identified and the queries executed according to the
defined search strings, the title, abstract and key words of each work were reviewed. All works that
reported the results of carrying out SLRs on SE topics were compiled, and those that did not were excluded.
The selection of the works reporting changes to the protocol to perform an SLR was much more detailed,
including additional reading about the methodology used, the work carried out by the researchers and the
results. This was necessary to verify that each work provided proposals to modify the original protocol
established by Kitchenham in 2004. Initially 26 works were compiled, of which 11 were selected; these are
the ones analyzed and described in detail in section 3.3.
Because the inclusion or exclusion of the works was decided by reviewing and interpreting the text (which
is potentially ambiguous), the reliability between the reviewers was calculated using Cohen’s Kappa
statistic (Gwet 2002). The results were satisfactory (K = 0.827), which indicates that the scale
presented in (Clark, Sammut et al. 2004) provides a basis of sufficiently clear criteria, and that it does
not induce significant differences between the reviewers. In any case, whenever the investigators had
doubts about whether or not to include a certain work, it was subjected to an individual review and then a
decision was made by group consensus.
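
For readers unfamiliar with the statistic, the following minimal sketch shows how Cohen's Kappa can be computed for two reviewers' include/exclude decisions. The function name and the decision lists are hypothetical and purely illustrative; they are not the data behind the reported K = 0.827.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:  # both raters used a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical decisions for ten candidate works ("I" = include, "E" = exclude).
reviewer_1 = ["I", "I", "I", "I", "I", "E", "E", "E", "E", "E"]
reviewer_2 = ["I", "I", "I", "I", "E", "E", "E", "E", "E", "E"]

print(round(cohen_kappa(reviewer_1, reviewer_2), 3))  # 0.8
```

In this toy case the reviewers agree on 9 of 10 works (observed agreement 0.9) while chance agreement is 0.5, giving a Kappa of 0.8.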

- Inclusion and exclusion criteria: The following criteria established the relevance of the compiled
articles for inclusion, with respect to their approach to the protocol for developing SLRs in SE.
(i) Inclusion criteria: included were all the works that address the topic of the SLR and that specifically
mention aspects dealing with the modification of the protocol to carry out an SLR, i.e. how to conduct an
SLR and the stages/activities that this entails.
(ii) Exclusion criteria: excluded were all those works that deal with SLR topics, but do not put forward
proposals on how to carry out or modify the defined protocol to develop SLRs in SE.

- Data extraction and synthesis: For the SLRs in SE, data extraction consisted of counting the works that
reported undertaking an SLR in SE between 2004 and 2011 and identifying the sources of their publication.
The results of these counts are summarized in the graphs in Figures 3 to 5.
As for the works that propose amendments to the SLR protocol in SE, a preliminary literature review
established the activities for which changes or proposals had been put forward, and then the proposal for
each defined activity was extracted from each paper. The results of this review are summarized in Tables 1
to 9 in section 3.3.

3.2 Number of SLRs in SE


Before reviewing the selected proposals in detail, the number of SLRs conducted in SE was counted in order
to provide an overview of their growth over time. Figure 3 illustrates the increase in the total number of
SLRs published (unique works, without considering indexing of the same work in more than one source)
between 2004 and 2011 according to what could be verified in the data sources already mentioned.
In total, 151 works state that they conducted an SLR and report its results.

Figure 3. Number of publications with SLR in SE (sources: ACM, IEEE and Science Direct)

Solely to confirm what is shown in Figure 3, two complementary searches were done using the
aforementioned search strings:
(1) A specific search in one of the selected sources, in this case Science Direct, which returned a total
of 4314 matches and, once narrowed, 63 matches; the annual numbers for the period 2004-2011 are shown in
Figure 4. The difference in the adoption of SLRs between SE and other branches of science such as medicine
should be emphasized, because the latter have a history of performing SLRs since the mid-1960s.
(2) An extended search using Google Scholar as the reference, which returned a total of 30900 matches
(without considering references to other articles) and, once narrowed, 1035 matches; the annual numbers
for the period 2004-2011 are shown in Figure 5.


   
Figure 4. Number of publications with SLR in SE (source: Science Direct)
Figure 5. Number of publications with SLR in SE (source: Google Scholar)

Comparing the trends in the three previous figures, a significant increase in the number of SLR publications
in SE is observed in the 2007-2008 period.
From another perspective, the data shown in Figure 3 make it possible to establish the rate of growth of
SLRs in SE published between 2004 and 2011, by computing the increase in SLRs published between each pair
of consecutive years and averaging over the period. The average absolute increase between two consecutive
years is approximately 7 works, and the average rate of increase between two consecutive years is
approximately 44%.
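
The two averages mentioned above can be computed as in the following sketch. The yearly counts used here are hypothetical placeholders, not the values behind Figure 3, and the report does not state whether the average rate is an arithmetic mean of the yearly rates or a compound rate; the arithmetic mean is assumed.

```python
from statistics import mean


def yearly_increases(counts):
    """Absolute and relative increases between consecutive years."""
    pairs = list(zip(counts, counts[1:]))
    absolute = [later - earlier for earlier, later in pairs]
    # Relative increase is undefined when the earlier year has no publications.
    relative = [(later - earlier) / earlier for earlier, later in pairs if earlier > 0]
    return absolute, relative


# Hypothetical publication counts for four consecutive years (illustration only).
counts = [4, 6, 9, 18]

absolute, relative = yearly_increases(counts)
print(absolute)  # [2, 3, 9]
print(relative)  # [0.5, 0.5, 1.0]
print(round(mean(absolute), 2), round(mean(relative), 2))
```

A compound annual growth rate would be a reasonable alternative summary, but it generally yields a different figure from the mean of yearly rates, so the choice should be stated alongside the result.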

3.3 Proposals for changes to the protocol


This section analyzes in greater detail the selected works that present proposals for changes to the
protocol for conducting an SLR in SE. Table 1 shows the different activities and their identifiers for each
of the stages of an SLR; these were chosen by reviewing the literature for evidence of changes to them,
after applying the aforementioned inclusion/exclusion criteria. The following sections detail the
proposals for each stage and each activity described.

Table 1. Stages and activities reviewed


Stage | Activity
Planning | A1: Definition of the Research Questions (RQ)
Implementation | A2: Identification of the relevant works
Implementation | A3: Selection of the relevant works
Implementation | A4: Evaluation of the quality of the works selected
Implementation | A5: Data extraction
Implementation | A6: Data synthesis
Documentation | A7: Report of the results

After the works were selected and the inclusion/exclusion criteria applied, 11 works were found that
present proposals or modifications to the protocol to develop SLRs in SE.
In order to evaluate the impact of each work that presents proposals/changes for developing an SLR in SE,
Table 2 displays the total and annual number of references to each work between 2004 and 2011. These
counts were compiled from a public source, Google Scholar, and the works are ordered from the highest to
the lowest number of references.

Table 2. Number of references/year for selected works.


Nº | Authors | Total refs. | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011
1 | (Kitchenham 2004) | 339 | 3 | 18 | 11 | 38 | 55 | 57 | 78 | 79
2 | (Brereton, Kitchenham et al. 2007) | 137 | -- | -- | -- | 5 | 17 | 28 | 45 | 42
3 | (Dybå, Dingsøyr et al. 2007) | 66 | -- | -- | -- | 0 | 10 | 11 | 23 | 22
4 | (Petersen, Feldt et al. 2007) | 48 | -- | -- | -- | 0 | 1 | 1 | 16 | 30
5 | (Staples and Niazi 2007) | 45 | -- | -- | -- | 2 | 8 | 10 | 14 | 11
6 | (Mian, Conte et al. 2005) | 7 | -- | 0 | 0 | 1 | 3 | 0 | 2 | 1
7 | (Kitchenham, Brereton et al. 2010) | 5 | -- | -- | -- | -- | -- | -- | 2 | 3
8 | (Zhang, Babar et al. 2011) | 3 | -- | -- | -- | -- | -- | -- | -- | 3
9 | (Grimán and Juristo 2007) | 1 | -- | -- | -- | 1 | 0 | 0 | 0 | 0
10 | (Kitchenham, Charters et al. 2007) | 1 | -- | -- | -- | 0 | 0 | 1 | 0 | 0
11 | (Caro, Ríos et al. 2005) | 0 | -- | 0 | 0 | 0 | 0 | 0 | 0 | 0
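
As a quick consistency check on Table 2, the sketch below confirms that each work's yearly reference counts add up to its reported total. The values are transcribed from the table; the variable and function names are illustrative, and `None` stands for the "--" entries (years before the work was published).

```python
N = None  # marks "--": the work had not yet been published that year

# (work, reported total, counts for 2004..2011), transcribed from Table 2.
TABLE_2 = [
    ("Kitchenham 2004", 339, [3, 18, 11, 38, 55, 57, 78, 79]),
    ("Brereton, Kitchenham et al. 2007", 137, [N, N, N, 5, 17, 28, 45, 42]),
    ("Dybå, Dingsøyr et al. 2007", 66, [N, N, N, 0, 10, 11, 23, 22]),
    ("Petersen, Feldt et al. 2007", 48, [N, N, N, 0, 1, 1, 16, 30]),
    ("Staples and Niazi 2007", 45, [N, N, N, 2, 8, 10, 14, 11]),
    ("Mian, Conte et al. 2005", 7, [N, 0, 0, 1, 3, 0, 2, 1]),
    ("Kitchenham, Brereton et al. 2010", 5, [N, N, N, N, N, N, 2, 3]),
    ("Zhang, Babar et al. 2011", 3, [N, N, N, N, N, N, N, 3]),
    ("Grimán and Juristo 2007", 1, [N, N, N, 1, 0, 0, 0, 0]),
    ("Kitchenham, Charters et al. 2007", 1, [N, N, N, 0, 0, 1, 0, 0]),
    ("Caro, Ríos et al. 2005", 0, [N, 0, 0, 0, 0, 0, 0, 0]),
]


def row_total(yearly):
    """Sum the yearly counts, skipping years before publication."""
    return sum(count for count in yearly if count is not None)


# Works whose yearly counts do not add up to the reported total.
mismatches = [work for work, total, yearly in TABLE_2 if row_total(yearly) != total]
print(mismatches)  # []
```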

As Table 2 illustrates, the work with the greatest number of references is the initial SLR protocol in SE
defined by (Kitchenham 2004), which could be attributed to this being the first work dealing with the
subject of SLRs and proposing a protocol for how to perform them in SE. It is also the oldest work in this
selection, and the trend suggests that it will continue to be cited as the standard work on this topic. In
addition, the first three works on the list, those with the greatest number of references, were led by or
included the participation of Barbara Kitchenham; consequently, according to the evidence compiled to
date, she and her research group are leading the work on reviews of SLRs in SE. The work by (Caro, Ríos
et al. 2005) has received no references; we believe this is because it proposes an adaptation of the SLR
protocol for undergraduate students and, unlike the rest of the works, is published in Spanish.
The work by (Petersen, Feldt et al. 2007), although it specifies the basis for developing systematic
mapping processes, was included because it offers a comparison between these and SLRs, establishing
criteria and comments that positively influence the design of the protocol for carrying out an SLR.
Also noteworthy is the proposal by (Mian, Conte et al. 2005), who suggested the use of templates to make the
SLR easier. The final work to be mentioned is (Grimán and Juristo 2007), who went as far as changing the
stages of the protocol defined by Kitchenham, proposing an alternative with other stages and activities.

3.3.1 Stages of a SLR-SE analysis


In order to conduct a more detailed analysis of each of the stages and the activities developed within each
stage according to the SLR protocol defined by (Kitchenham 2004), a table was designed for each activity
reviewed at each stage.


The idea is to reflect the changes that have been proposed for the selected activities, showing who
proposed them, which activity they deal with and when they were made, according to what is reported in the
literature. In addition, each table for the stages reviewed has a column called Code, which identifies
each proposal by its author and the activity to which it relates, so that it can be located in the
timelines that appear in the following sections. This code is built from the first three letters of the
last name of the main author of the work, followed by the activity to which it refers; when these letters
coincide with those of another work, a sequential number is added between the letters and the activity.
For example, for activity 1 (A1), the code that identifies the work of (Brereton, Kitchenham et al. 2007)
is BreA1.
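
The coding rule just described can be sketched as follows. The function names and the dictionary layout are hypothetical; the sketch also assumes that colliding works are numbered in the order in which they are listed (here chronological), with the first one left unnumbered, which matches codes such as KitA1, Kit2A1 and Kit3A3 in the tables.

```python
from collections import defaultdict


def assign_prefixes(works):
    """Give each work a prefix: three letters of the first author's last name,
    plus a sequence number (2, 3, ...) when the letters are already taken."""
    seen = defaultdict(int)
    prefixes = {}
    for work in works:  # assumption: works are listed in chronological order
        stem = work["first_author"][:3]
        seen[stem] += 1
        prefixes[work["key"]] = stem if seen[stem] == 1 else f"{stem}{seen[stem]}"
    return prefixes


def proposal_code(prefix, activity):
    """Combine a work's prefix with an activity number, e.g. ('Bre', 1) -> 'BreA1'."""
    return f"{prefix}A{activity}"


works = [
    {"key": "Kitchenham 2004", "first_author": "Kitchenham"},
    {"key": "Brereton, Kitchenham et al. 2007", "first_author": "Brereton"},
    {"key": "Kitchenham, Charters et al. 2007", "first_author": "Kitchenham"},
    {"key": "Kitchenham, Brereton et al. 2010", "first_author": "Kitchenham"},
]

prefixes = assign_prefixes(works)
print(proposal_code(prefixes["Brereton, Kitchenham et al. 2007"], 1))  # BreA1
print(proposal_code(prefixes["Kitchenham, Charters et al. 2007"], 1))  # Kit2A1
```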

3.3.1.1 Planning stage and activities reviewed


For the planning stage the idea was to review the data collected for the activity “Definition of the RQ
(A1)”. Table 3 presents the changes or proposals for this stage and activity A1 (Kitchenham 2004; Mian,
Conte et al. 2005; Brereton, Kitchenham et al. 2007; Grimán and Juristo 2007; Kitchenham, Charters et al.
2007; Staples and Niazi 2007).

Table 3. Proposals for the stage “Definition of the RQ (A1)”


Changes - Proposals | Code | Authors
Adaptation and guidelines to define the RQ from medicine in the SE context | KitA1 | (Kitchenham 2004)
Proposes a specific section to define the RQ according to the context in which the SLR will be developed and the questions that the study must answer (syntax), and endeavors to determine the specificity (semantic). | MiaA1 | (Mian, Conte et al. 2005)
Proposes the use of a predefined structure to guide RQ construction. | Kit2A1 | (Kitchenham, Charters et al. 2007)
Different way to pose the RQs because they do not fit exactly with what is proposed in the original protocol. Suggests defining and using complementary RQs to clarify the topic and scope of the investigation. | StaA1 | (Staples and Niazi 2007)
Proposes review of the RQs during protocol development as understanding of the problem increases. Also suggests carrying out a pre-review using a mapping study that helps define the RQs. | BreA1 | (Brereton, Kitchenham et al. 2007)
Proposes not defining RQs a priori, but previously developing an exploratory study to make progress on the subject dealt with. | GriA1 | (Grimán and Juristo 2007)

It can be observed that generally the proposals mentioned here suggest: (1) a guideline to help define
high-quality RQs, (2) guidelines to check that the defined RQs are indeed the most appropriate and (3) that
the RQs not be defined a priori, but rather as greater knowledge of the subject is gained.

3.3.1.2 Implementation stage and activities reviewed


For the implementation stage the idea was to review the references to the following activities:
identification of relevant works (A2), selection of relevant works (A3), evaluation of the quality of the
works selected (A4), data extraction (A5) and data synthesis (A6).


- Identification of relevant works (A2): Table 4 contains the changes or proposals for activity A2 (Kitchenham
2004; Caro, Ríos et al. 2005; Mian, Conte et al. 2005; Brereton, Kitchenham et al. 2007; Grimán and Juristo
2007; Kitchenham, Charters et al. 2007; Petersen, Feldt et al. 2007; Zhang, Babar et al. 2011).

Table 4. Proposals for the stage “Identification of relevant works (A2)”


Changes - Proposals | Code | Authors
Defines a search strategy and suggests data sources to consider, to avoid bias and to document the searches conducted. | KitA2 | (Kitchenham 2004)
Within the template, the selection criteria are determined, the languages of the works are emphasized, and the relevant sources are identified, generating an initial list of relevant works. | MiaA2 | (Mian, Conte et al. 2005)
Clearly establishes a list of sources to review for a SLR. | Kit2A2 | (Kitchenham, Charters et al. 2007)
Defines the terms and their combinations to conduct the search. Determines the main sources to consider and records the results obtained that can be used to justify the need to investigate a specific area or the rigor of the work, among others. | CarA2 | (Caro, Ríos et al. 2005)
Classifies each work into one of six possible categories: validation research, evaluation research, solution proposal, philosophical papers, opinion papers, experience papers. | PetA2 | (Petersen, Feldt et al. 2007)
Selection and justification of the search strategies suitable for the defined RQs. Seeks different sources and prepares several searches for the same concept because these depend on the source. | BreA2 | (Brereton, Kitchenham et al. 2007)
This activity is carried out in the first stage of their proposal, compiling a set of papers that may potentially be relevant to the investigation. | GriA2 | (Grimán and Juristo 2007)
Proposes a systematic strategy, based on evidence and rigorously developed to implement and evaluate the search for relevant works. | ZhaA2 | (Zhang, Babar et al. 2011)

It can be observed that generally the proposals mentioned here suggest: (1) identification and selection of
relevant data sources, (2) definition and justification of a systematic search strategy in accordance with
the defined RQs and (3) identification of categories for classifying the works identified.

- Selection of the relevant works (A3): Table 5 contains the changes or proposals for activity A3 (Kitchenham
2004; Caro, Ríos et al. 2005; Mian, Conte et al. 2005; Brereton, Kitchenham et al. 2007; Grimán and Juristo
2007; Kitchenham, Charters et al. 2007; Staples and Niazi 2007; Kitchenham, Brereton et al. 2010).

Table 5. Proposals for the stage “Selection of relevant works (A3)”


Changes - Proposals | Code | Authors
Proposes guidelines for the definition of the inclusion/exclusion criteria and of the process for selecting the works. | KitA3 | (Kitchenham 2004)
Defines inclusion and exclusion criteria, as well as peer review of the initial list to avoid bias. Defines the types of studies to be reviewed. | MiaA3 | (Mian, Conte et al. 2005)
Proposes a guideline based on a set of parameters to evaluate the inclusion/exclusion of a work. | Kit2A3 | (Kitchenham, Charters et al. 2007)
Proposes breaking down the structure of a paper and reviewing the parts individually. | CarA3 | (Caro, Ríos et al. 2005)
The first selection of articles is made based on the title and abstract. Then the detail of each work selected is reviewed. The work of two reviewers is suggested: one for the first stage and another for the second. | StaA3 | (Staples and Niazi 2007)
One must look beyond the abstracts, in SE and IT these are generally of low quality; the conclusions should also be reviewed. | BreA3 | (Brereton, Kitchenham et al. 2007)
Carried out in the first stage of the proposal on the basis of the title and abstract, obtaining an initial list of papers to review. The aim of the review and classification of the papers selected are determined in detail, executing a refined search that complements the list. | GriA3 | (Grimán and Juristo 2007)
Use of automated searches versus manual annotated searches, based on whether the number or quality of the works selected is desired. | Kit3A3 | (Kitchenham, Brereton et al. 2010)

It can be observed that generally the proposals mentioned here suggest: (1) definition of guidelines to
establish the inclusion/exclusion criteria, (2) guidelines to resolve disagreements between reviewers when
selecting works, (3) use of peer review to avoid bias when selecting a work and (4) review of other elements
of the paper such as the conclusions, because abstracts are usually of low quality.
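
The two-stage selection summarized above can be sketched in code. This is only an illustration under stated assumptions, not the procedure of any of the cited authors: the `Paper` fields, the keyword-based criteria and the function names are hypothetical and were introduced here for the example.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    abstract: str
    conclusions: str  # also reviewed, since abstracts may be of low quality


def matches(text, terms):
    """Return True if any term occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


def first_stage(papers, include_terms, exclude_terms):
    """Stage 1 (first reviewer): screen on title and abstract only."""
    selected = []
    for paper in papers:
        summary = f"{paper.title} {paper.abstract}"
        if matches(summary, include_terms) and not matches(summary, exclude_terms):
            selected.append(paper)
    return selected


def second_stage(papers, include_terms):
    """Stage 2 (second reviewer): look beyond the abstract, here at the conclusions."""
    return [paper for paper in papers if matches(paper.conclusions, include_terms)]
```

In practice the inclusion/exclusion criteria are rarely reducible to keywords; the point of the sketch is only that each stage applies explicit, documented criteria and that a different reviewer can run each stage.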

- Evaluation of the quality of the works selected (A4): Table 6 contains the changes or proposals for activity
A4 (Kitchenham 2004; Dybå, Dingsøyr et al. 2007; Grimán and Juristo 2007; Kitchenham, Charters et al.
2007; Staples and Niazi 2007; Kitchenham, Brereton et al. 2010).

Table 6. Proposals for the stage “Evaluation of the quality of the works selected (A4)”
Changes - Proposals | Code | Authors
Suggests guidelines for the definition of a criterion that makes it possible to assess the quality of the selected works. Establishes hierarchies as far as types of works in SE, as well as the development and use of quality instruments. | KitA4 | (Kitchenham 2004)
Proposes a set of checklists with factors that can evaluate the quality of the selected works. | Kit2A4 | (Kitchenham, Charters et al. 2007)
The proposal suggests that the same investigator do A4 and A5 simultaneously. | StaA4 | (Staples and Niazi 2007)
It is a complete stage of the proposal, third and last that is carried out on the selected works to determine their quality, ensuring that they receive some particular treatment during the synthesis. | GriA4 | (Grimán and Juristo 2007)
Frame to evaluate the quality of the work selected based on eleven quality criteria including rigor, relevance and credibility. | DybA4 | (Dybå, Dingsøyr et al. 2007)
Quality evaluations should be based on the participation of three independent evaluators and include at least two rounds of discussion to settle disagreements in the evaluation. | Kit3A4 | (Kitchenham, Brereton et al. 2010)

It can be observed that generally the proposals mentioned here suggest: (1) guidelines and framework to
evaluate the quality of the selected work, (2) use of checklists with defined factors to evaluate the quality of
the work and (3) participation of multiple evaluators and discussion rounds to reach a consensus on criteria.
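
A checklist-based evaluation with several evaluators could look like the following sketch. The three checklist items are a shortened, hypothetical stand-in for the criteria defined in the cited works, and the 0 / 0.5 / 1 scoring scale and the disagreement tolerance are assumptions made only for this example.

```python
from statistics import mean

# Hypothetical, shortened checklist; the cited works define their own criteria.
CHECKLIST = ["rigor", "relevance", "credibility"]


def paper_score(answers):
    """Average checklist score given by one evaluator (each item scored 0, 0.5 or 1)."""
    return mean(answers[item] for item in CHECKLIST)


def needs_discussion(scores, tolerance=0.25):
    """Flag a paper when the evaluators' scores differ by more than the tolerance."""
    return max(scores) - min(scores) > tolerance
```

A paper flagged by `needs_discussion` would go to a discussion round between the evaluators before its quality score is recorded.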


- Data extraction (A5): Table 7 contains the changes or proposals for activity A5 (Kitchenham 2004; Caro,
Ríos et al. 2005; Mian, Conte et al. 2005; Brereton, Kitchenham et al. 2007; Grimán and Juristo 2007;
Kitchenham, Charters et al. 2007; Petersen, Feldt et al. 2007; Staples and Niazi 2007).

Table 7. Proposals for the stage “Data Extraction (A5)”


Changes - Proposals | Code | Authors
Defines guidelines for forms that make it possible to collect the data extracted from the selected works. | KitA5 | (Kitchenham 2004)
Defines four steps to extract data from the selected works: review of compliance with inclusion/exclusion criteria, standardization to document extracted data, execution of the extraction (subjective and objective results) and the review and explanation of disagreements between reviewers. | MiaA5 | (Mian, Conte et al. 2005)
Recommends that software tools be used to store the most relevant data from each paper. It proposes a strategy divided into two stages, on which sections of the paper to read from the sections identified in A3. | CarA5 | (Caro, Ríos et al. 2005)
It proposes the design and use of forms to collect data from the studies, as well as a sample of guidelines to improve this task. | Kit2A5 | (Kitchenham, Charters et al. 2007)
The proposal suggests that the same investigator do A4 and A5 simultaneously. | StaA5 | (Staples and Niazi 2007)
Use of electronic spreadsheets to document the process, when a reviewer adds a work to a certain category he/she must justify briefly the reason for that decision. | PetA5 | (Petersen, Feldt et al. 2007)
Proposes working in pairs, even more so with great numbers of papers: one reviewer as the data extractor and the other as the validator, ensuring that both understand the protocol and data extraction process. | BreA5 | (Brereton, Kitchenham et al. 2007)
In the second stage of the proposal, endeavors to record certain parameters of the selected work and which section the data are stored in. Data regarding the quality of the work are not recorded. | GriA5 | (Grimán and Juristo 2007)

It can be observed that generally the proposals mentioned here suggest: (1) design and use of forms to record
data, (2) use of software tools to support the documentation of data, (3) use of peer review and (4) recording
of the section of the article where the selected data is found.
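
As an illustration of these suggestions, a data extraction form can be represented as a small record type and exported for use in a spreadsheet. This is a minimal sketch: the field names are hypothetical and are not taken from any of the cited forms.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class ExtractionRecord:
    paper_id: str
    category: str
    justification: str   # brief reason for assigning the category
    source_section: str  # section of the paper where the data were found
    extractor: str       # reviewer who extracted the data
    validator: str       # second reviewer who checked the extraction


def to_csv(records):
    """Serialize the records as CSV text, e.g. to load into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ExtractionRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()
```

Keeping the extractor and the validator as separate fields reflects the suggestion of working in pairs, and the `source_section` field records where in the article each datum was found.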

- Data synthesis (A6): Table 8 contains the changes or proposals for activity A6 (Kitchenham 2004; Caro,
Ríos et al. 2005; Mian, Conte et al. 2005; Brereton, Kitchenham et al. 2007; Grimán and Juristo 2007;
Petersen, Feldt et al. 2007; Staples and Niazi 2007).

Table 8. Proposals for the stage “Data Synthesis (A6)”


Changes - Proposals | Code | Authors
Defines guidelines for synthesizing data both quantitatively and qualitatively, and sensitivity analysis of the collected data. | KitA6 | (Kitchenham 2004)
Series of six steps to summarize the results: calculation of statistical results, preparation of results in tables, sensitivity analysis and possible meta-analysis, graphic of the data, final comments and recommendations of how the results must be applied. | MiaA6 | (Mian, Conte et al. 2005)
Proposes criteria/suggestions so that data can be synthesized from selected papers. Also provides an example application. | CarA6 | (Caro, Ríos et al. 2005)
Suggests the use of databases to improve data analysis and queries. | StaA6 | (Staples and Niazi 2007)
Use of categorized table allows publication frequencies of each work to be obtained. Starting from the categories and aspects defined in the study, a bubble graph is constructed that shows the number of works on each topic in terms of the size of the bubble. | PetA6 | (Petersen, Feldt et al. 2007)
Use of tabulated data to facilitate their combination and thus to clarify how the data answer the RQs. | BreA6 | (Brereton, Kitchenham et al. 2007)
Forms part of the gathering phase, but here is called experiment codification. The aim is to synthesize the input data in as formalized a way as possible, avoiding investigator bias. | GriA6 | (Grimán and Juristo 2007)

It can be observed that generally the proposals mentioned here suggest: (1) guidelines for synthesizing data,
(2) summary with statistical results from quantitative and qualitative data, and (3) use of tables and databases
to facilitate data queries and analysis.
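
The categorized table behind a bubble graph (PetA6) amounts to counting works per pair of category and aspect. The following sketch shows that counting step only; the tuple layout `(paper_id, category, aspect)` is an assumption made for this example, and any plotting is left out.

```python
from collections import Counter


def category_frequencies(classified):
    """Count works per (category, aspect) pair; each count is the size of one bubble."""
    return Counter((category, aspect) for _, category, aspect in classified)
```

Each key of the resulting counter corresponds to one bubble, positioned by category and aspect, and its value gives the bubble size.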

3.3.1.3 Documentation stage/Report and reviewed activities


For the documentation/report stage, the collected data were reviewed for the activity Report of the results
(A7).
- Report of the results (A7): Table 9 contains the changes or proposals selected for activity A7 (Kitchenham
2004; Brereton, Kitchenham et al. 2007; Staples and Niazi 2007).

Table 9. Proposal for the stage “Report of the results (A7)”


Changes - Proposals | Code | Authors
Proposes formats where (Tech. Reports, section in a PhD Thesis, journal or conference paper) and how to publish the results (structure and contents of a report) of the SLR. | KitA7 | (Kitchenham 2004)
The results of the final protocol must be reported, which includes reviews/changes regarding the process, but explaining the nature and reason of the changes to the original protocol. | StaA7 | (Staples and Niazi 2007)
Working groups must make a detailed record of the decisions taken during the process. Need to establish a mechanism that allows SLR results to be published (more extensive than traditional papers) or use of appendices stored in electronic repositories. | BreA7 | (Brereton, Kitchenham et al. 2007)

It can be observed that generally the proposals mentioned here suggest: (1) formats and guidelines to publish
results of the SLR and (2) the reviews and decisions made during the process must be reported.

3.3.1.4 Comments to the stages and activities reviewed


Having reviewed the stages and the proposals identified for each, we can say that they focus essentially on
defining guidelines for: (1) supporting the definition of the RQs, (2) identifying and selecting relevant data
sources, defining a search strategy aligned with the RQs and classifying the identified works by category,
(3) defining the inclusion/exclusion criteria, resolving disagreements between reviewers when selecting works
and exercising caution in relying only on abstracts because of their low quality, (4) evaluating the quality of
the selected works, with the participation of several evaluators and a way to reach a consensus on the
criteria, (5) synthesizing the data, obtaining statistical results from quantitative and qualitative data and
using tables and databases to facilitate their analysis, and finally (6) publishing the results and reporting the
reviews and decisions made in the process.

3.3.2 Timelines
From the stages and activities identified, as well as from the changes proposed for each of these activities, a
timeline has been prepared for each stage (planning, implementation and documentation) to illustrate
graphically where in time these proposals are concentrated. The timelines use the previously defined codes
for each work reviewed.

3.3.2.1 Planning stage and proposed changes


As can be seen in Figure 6, the proposals defined for activity A1 fall between 2004 and 2007, totaling six
proposals, three of which are from 2007.

Fig. 6 Proposals for changes to the SLR protocol for the planning stage.

3.3.2.2 Implementation stage and proposed changes


As can be seen in Figure 7, a large number of proposals are defined for activities A2 to A6 between 2004 and
2011, totaling 37 proposals, 20 of which are from 2007. What follows is a breakdown of the analysis for each
activity.
The proposals defined for activity A2 fall in 2004 to 2007 and 2011, totaling eight proposals, four of which
are from 2007. The proposals defined for activity A3 fall in 2004 to 2007 and 2010, totaling eight proposals,
four of which are from 2007. The proposals defined for activity A4 fall in 2004, 2006 to 2008 and 2010,
totaling six proposals, two of which are from 2007. The proposals defined for activity A5 fall in 2004 to 2007,
totaling eight proposals, five of which are from 2007. Finally, the proposals defined for activity A6 fall in
2004 to 2005 and 2007, totaling seven proposals, five of which are from 2007.
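
The per-activity figures for the implementation stage can be checked against the stated totals with a few lines of code. The sketch below uses only the counts given in the text; the variable names are introduced here for illustration.

```python
# Counts as stated in the text: activity -> (total proposals, proposals from 2007)
implementation = {
    "A2": (8, 4),
    "A3": (8, 4),
    "A4": (6, 2),
    "A5": (8, 5),
    "A6": (7, 5),
}

total = sum(count for count, _ in implementation.values())
from_2007 = sum(count for _, count in implementation.values())
print(total, from_2007)  # 37 20
```

The sums agree with the 37 proposals, 20 of them from 2007, reported for the stage as a whole.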


Fig. 7 Proposals for changes to the SLR protocol for the implementation stage.

3.3.2.3 Documentation stage and proposed changes


As can be seen in Figure 8, the proposals defined for activity A7 fall in 2004 and 2007, totaling three
proposals, two of which are from 2007.

Fig. 8 Proposals for changes to the SLR protocol for the documentation/report stage.

3.3.2.4 Comments to the timelines reviewed


From what is shown in Figures 6 to 8 it can be observed that: (1) as expected, at the beginning everything was
based on the proposal by (Kitchenham 2004), (2) the greatest number of changes are concentrated in 2007,
with a total of 25 proposals, (3) the activities with the most proposed changes are A2, A3 and A5, which
correspond to the implementation stage with a total of eight proposals each, (4) the stage that seems to be the
most stable is documentation/report, because between 2004 and 2011 only 3 change proposals are recorded,
and (5) according to the changes after 2008, it could be argued that these are more focused on improving and
controlling quality aspects of the SLRs. Figure 9 shows a quantification of the proposals for changes to the
SLR protocol with respect to the year in which these were published, where it can be corroborated that the
greatest number of proposals appears in 2007.
From the information obtained to date, we can say that the greatest number of proposals regarding the original
SLR protocol in SE appeared in 2007. This coincides with a considerable increase in the number of SLRs
published in the same year. That growth has been maintained to the present day, whereas the number of changes
proposed to the SLR protocol has dropped dramatically. This allows us to venture a hypothesis: that the
protocol to perform SLRs in SE has generally attained a certain acceptance and stability within the SE
community and the emphasis of the community is now migrating towards improving the quality of primary
studies. It is beyond the scope of this work to verify whether this hypothesis is true or false, but we believe
that it may give rise to a new type of research on SLRs in SE with respect to the quality of primary works and
the need to establish more tertiary studies dedicated to reviewing the quality of secondary studies.

Figure 9. Annual number of proposals for changes to SLR protocol.

3.4 SLR Publication Sources


Another aspect to consider is the main sources where the SLRs are published, in order to see which journals,
conferences or workshops specialize in publishing them. To this end, the source of each selected SLR
was checked, and the results obtained are summarized in Figure 10.
Figure 10 shows that: (1) the journal with the most SLRs in SE published is Information and Software
Technology with 35 publications, (2) it is followed by the International Symposium on ESEM and the
International Conference on Evaluation and Assessment in Software Engineering with 8 publications
each and (3) practically 60% of the remaining sources reviewed contain fewer than 3 publications
related to SLRs in SE, although this count is not shown in the graph. Some examples of sources with 1 or 2
publications are ACM SIGSOFT Software Engineering and ICSE '11 Proceedings of the 33rd International
Conference on Software Engineering, among others. As a complement, a classification by thematic
areas covered by the SLRs in SE can be found in (Zhang, Babar et al. 2011).


Figure 10. Number of publications according to main publication sources

3.5 Other considerations


The following presents a set of considerations drawn from works that, although they do not propose changes
to the SLR protocol in SE, mention aspects that we consider relevant for improving the development of SLRs.
These aspects include the relevance and role that abstracts play when reviewing and selecting a work, as well
as considerations on how to plan and conduct the search for works, among others.

- Abstracts: With respect to the use that can be made of abstracts in SE when selecting articles, (Staples and
Niazi 2007) criticize their low quality, given that they may be considered a key element in making this decision.
For their part, (Jedlitschka and Pfahl 2005) emphasize the structured abstract and suggest its use as
an important source of information that serves readers in general and summarizes the main aspects of the
work, noting that it is the only section of the publication that is accessible free of charge. This is
complemented by the recommendations and considerations on the use of structured abstracts proposed in
(Budgen, Kitchenham et al. 2008).

- Searches: When searching for relevant works with the search engines offered on the websites of the
main digital sources used by the SE community (IEEEXplore, ACM Digital Library, Springer Link, Science
Direct), it is necessary to use different search strings for the different sources, try them out and evaluate the
results (Kitchenham, Mendes et al. 2007; Chen, Ali Babar et al. 2009). This is also supported by (Staples and
Niazi 2007), who illustrate that the search engines do not support the search strings needed to
conduct SLRs. Efforts have also been made to review the processes used to search for works,
comparing manual searches with broad automated searches and evaluating the
importance of grey literature (Kitchenham, Brereton et al. 2009). The work by (Kitchenham, Brereton et al.
2010), in addition to proposing changes in activities A3 and A4, presents: (1) a comparison between the
guidelines for medicine and SE in conducting SLRs, with respect to how to perform searches for relevant
works and (2) a glossary of terms adopted from evidence-based medicine that are not widely
known in SE, which can be of great help for those starting out with SLRs. As for having a unified source of
SLRs in SE, (Staples and Niazi 2007) propose generating a centralized index of SLRs in SE similar to the one
in medicine, the Cochrane Collaboration†.

- Quality: According to (Cruzes and Dybå 2011), the quality of the SLRs conducted can be positively
influenced if the challenges of synthesizing research in SE are better understood; in
addition, despite the focus being placed on SLRs, limited attention is given to synthesis, which needs to
become a central aspect of the SLR so as to increase its importance and utility both in the research and
practice of the discipline. For their part, (Staples and Niazi 2007) suggest a simplification of the original
criterion raised by Kitchenham to evaluate the quality of the work shown in each paper, thus facilitating the
undertaking. In the future, instruments should be developed that support the implementation and control of an
SLR, similar to the PRISMA‡ proposal for medicine (Moher, Liberati et al. 2010).

- Protocol and stages: In addition to improvements and critiques regarding how to conduct an SLR,
(Brereton, Kitchenham et al. 2007) present a set of lessons learned that have accumulated with the
development of the SLR in SE. They also define the stages of an SLR and which of these are used “as-is” and
which need to be adapted to the field or practice of SE. With respect to the original protocol for
SLRs, (Staples and Niazi 2007) point to the lack of clarity in the directives for synthesizing data and, although
they agree with Brereton about the importance of running a pilot project, they criticize Kitchenham
for not clarifying when a pilot project must be run or when to stop. Based on his experience, Staples discusses
the non-trivial nature of validating the protocol of an SLR, because it is not easy to find reviewers, and he
attributes this to the paucity of experience in developing SLRs.
Finally, it should be emphasized that (Biolchini, Mian et al. 2006) propose the use of templates to
conduct SLRs and also define an ontology that describes the knowledge of experimental studies; the
application of this template can be seen in the technical report (Biolchini, Mian et al. 2005). As far as the
reporting of results is concerned, (Jedlitschka and Pfahl 2005) provide guidelines about how to report results
in empirical SE and establish a comparison between the different guidelines for reporting results, which
include SLRs.

3.6 Threats to the results of the work carried out


† http://www.cochrane.org/cochrane-reviews
‡ http://www.prisma-statement.org/index.htm

The final selection included 151 works that report having conducted an SLR on SE subjects; 11 works were
also included that report protocol proposals for carrying out SLRs in SE or changes to these, between 2004
and 2011. We think that the specificity of the latter topic has caused the sample to be rather small and that,
owing to this same specificity, the review provides a reliable overall view of the state of research in this area.
We are aware of some threats that may affect the validity of the findings to date, the
most important being:
• Possible bias when selecting works, such that we considered only a subgroup of the existing
SLRs.
• In addition, although data sources that are highly recognized within the SE
community were used (IEEEXplore, ACM Digital Library, Science Direct and Google Scholar), we
left out others that were equally relevant, basically for reasons of scope and time.
• Limitations of the tools used to conduct the searches in the electronic data sources, as already mentioned
in previous sections.
We tried to mitigate these threats by means of an individual selection and a joint validation of the works, thus
avoiding individual bias. To avoid works being left out of the study as a result of the searches, we sought
to review all the versions of a work, whether these were journal articles, conference proceedings or technical
reports.

3.7 Discussion and comments


Next, the main results and findings are discussed, as well as possible consequences that these may have for
research around SLRs in SE.
As far as how this work answers the RQs posed, we can argue the following from the data collected.

- RQ1: How has the use of the SLR methodology in SE increased?
The sources consulted revealed a significant increase in the number of SLRs conducted, going from zero in
2004 to a total of 50 in 2011; over the entire 2004-2011 period, 151 works were published. In addition, it
was possible to observe that from 2007 on the number of SLRs in SE published per year increased
significantly. To confirm this upward trend in SLRs published from 2004 to 2011, it should be
emphasized that the average absolute increase between 2 consecutive years is approximately 7 works, and the
average rate of increase between 2 consecutive years is approximately 44%. For details see Figures 1-3
and 9.
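
The two trend indicators can be computed as sketched below. The average absolute increase depends only on the first and last yearly counts, so it can be reproduced from the endpoints given in the text (0 in 2004 and 50 in 2011). The average rate of increase needs the full yearly series, which is not listed in this section, and the exact calculation behind the 44% figure is not specified; the function shown is one plausible definition, and skipping years that follow a zero count is our assumption.

```python
def average_absolute_increase(first, last, n_years):
    """Mean year-on-year difference; the intermediate years cancel out."""
    return (last - first) / (n_years - 1)


def average_growth_rate(counts):
    """Mean year-on-year relative change, skipping years that follow a zero count."""
    rates = [(cur - prev) / prev for prev, cur in zip(counts, counts[1:]) if prev > 0]
    if not rates:
        raise ValueError("no pair of consecutive years with a non-zero base")
    return sum(rates) / len(rates)


# Endpoints stated in the text: 0 SLRs in 2004, 50 in 2011 (8 years, 7 intervals).
print(round(average_absolute_increase(0, 50, 8), 1))  # 7.1
```

The result of about 7.1 works per year is in line with the figure of approximately 7 works reported above.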

- RQ2: How has the original protocol been modified for SLR implementation in SE?
The protocol for the implementation of SLRs in SE was originally defined by (Kitchenham 2004), and later
works were published proposing changes to it, in one or more activities of the three stages included in the
original protocol.


Generally we can say that the proposals for changes to the SLR protocol in SE focus essentially on defining
guidelines for: (1) supporting the definition of the RQs, (2) identifying and selecting relevant data sources as
well as the definition of a search strategy aligned with the RQs and classification of the identified works by
category, (3) defining the inclusion/exclusion criteria, the solution of disagreements between reviewers when
selecting works and the caution in using only abstracts due to their low quality, (4) evaluating the quality of
the selected works and participation of several evaluators and how to reach a consensus on the criteria, (5)
synthesizing the data, and obtaining statistical results from quantitative and qualitative data and using tables
and databases to facilitate their analysis, and finally (6) publishing the results and reporting the reviews and
decisions taken in the process.

- RQ2.1: How many protocol proposals to develop SLRs in SE or changes to these have been published?
From the evidence collected, altogether 11 reviewed works propose a protocol, or changes to it, to conduct
an SLR in SE; they cover the period between 2004 and 2011.
These 11 works contain 46 proposals, 25 of which were published in 2007, which means that 54% of the
proposals are concentrated in this year.
- RQ2.2: At what stages and activities are the proposals of changes to the protocol concentrated?
From the point of view of the stages of the process to conduct an SLR, the stage with the greatest number
of proposals is implementation, which concentrates 37, or 80% of all the proposals. With respect
to the activities, three were identified with the greatest number of proposals: Identification of relevant
works (A2), Selection of relevant works (A3) and Data Extraction (A5) with 8 proposals each, which is
equivalent to 17% in each case. The documentation/report stage presents the smallest number of proposals, 3,
or 7% of the total. Finally, it is worth noting that some works not only present changes in some
activities of the protocol, but also define different stages that are executed in an order different
from the other proposals, as is the case with (Grimán and Juristo 2007). For details see Figures 6-8 and
Tables 3-9.
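
The percentages quoted for RQ2.1 and RQ2.2 follow directly from the proposal counts. The sketch below recomputes them from the figures stated in the text (6 proposals for planning, 37 for implementation and 3 for documentation/report); the helper name `share` is introduced here for illustration.

```python
# Proposal counts per stage, as stated in the text.
stage_counts = {"planning": 6, "implementation": 37, "documentation/report": 3}
total = sum(stage_counts.values())  # 46


def share(count, total):
    """Percentage of the total, rounded to the nearest integer."""
    return round(100 * count / total)


for stage, count in stage_counts.items():
    print(f"{stage}: {count} proposals, {share(count, total)}%")
print(f"published in 2007: 25 proposals, {share(25, total)}%")
print(f"A2, A3 and A5: 8 proposals each, {share(8, total)}% each")
```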

3.8 Meaning of the findings and results


From the data collected and the results obtained in this work we can state that SLRs are a subject that has
gained relevance in the SE community, which translates into an increasing number of articles in specialty
journals and conferences, as well as an increasing number of experiences of application/adoption in the
industry. Nevertheless, we detected some relevant aspects where there is a considerable lack of both
theoretical and empirical contributions and some areas where it is possible to make contributions to the
community, such as the implementation of: (1) tertiary studies that allow the real state of the quality of SLRs
conducted in SE to be visualized, (2) studies that make it possible to verify whether there is indeed a
stabilization of the protocol for conducting SLRs, (3) use of empirical evidence to establish how this protocol
is used and adapted and (4) studies to establish the level of adoption and adaptation of SLRs in the industry.


The reviewed data indicate that, parallel to the significant increase in SLRs in SE in 2007-2008, whose growth
rate has been maintained until today, the number of proposals or changes to the protocol to perform SLRs in
SE has fallen drastically. This makes us think that the protocol to implement
SLRs in SE has generally attained a certain acceptance and stability within the SE community and the
emphasis of the community is migrating toward improving the quality of primary studies. We do not have the
evidence, and it is beyond the scope of this work, to verify whether this hypothesis is true or false, but we
believe that it can give rise to a new type of research on SLRs in SE with respect to the quality of
primary works and the need to establish more tertiary studies dedicated to reviewing the quality of
secondary studies.
In addition, if the authors and co-authors of each of the 11 articles with proposals for the SLR protocol in SE
are reviewed, we observe that a set of six researchers is involved in 50% of them; we can therefore say that
there is a group concerned with improving the processes involved in performing SLRs in SE. Among these
authors, Barbara Kitchenham stands out: in addition to having defined the protocol to develop SLRs in SE,
she is present in four of the 11 works, in three as author and in one as co-author.

4. Related Work
In this section we present a compilation of works with observations and analyses of the SLRs performed in
SE. This compilation orders the works chronologically.
The literature review on research in SE carried out by (Glass 2002) suggested that it is broad in the topics
treated and narrow in the approaches and research methods used; in addition, the study shows a range of
research methods used in SE, and it is worthy of note that only 1.1% of investigations in SE use the method
called literature review/analysis.

In 2004 (Glass, Ramesh et al. 2004) mentioned that one of the criticisms of the research conducted is that the
investigators in SE and computer science, particularly in contrast to those in information systems, make little
or no use of the methods and experiences available from other disciplines of reference.

In 2009 (Kitchenham, Brereton et al. 2009) published an evaluation of the impact of SLRs between 2004 and
2007, concluding that the thematic areas covered until that time were limited and that European
investigators, in particular those from the Simula Laboratory, seemed to be the main exponents of
SLRs.

Then, in 2010, (Kitchenham, Pretorius et al. 2010) published the results of tertiary studies between 2007 and
2010 with the aim of providing a set of comments available to the investigators developing SLRs in SE, and
they concluded that the works had improved in quality, but that the SLR could not yet be considered a
principal research method in SE.
In 2011 (da Silva, Santos et al. 2011) analyzed the quality, topics covered and potential impact of the SLRs
published in 2008 and 2009, both for education and for the practice of the discipline, concluding that although
the quality and number of investigations had improved, most SLRs did not appraise the quality of the primary
studies and did not provide directives for professionals, thereby reducing the potential impact on the
practice of SE. In the same year, (Ramey and Rao 2011) discussed the SLR as a methodology, its
“import” from medicine and the changes made to adapt it to other disciplines, and finally offered an
evaluation with a set of strengths and weaknesses of the method as applied in SE.
Finally, (Zhang, Babar et al. 2011) conducted an empirical investigation into the use, adoption and advantages
of SLRs in the scientific area for SE, helping SE researchers and professionals to understand the perceived
value and the current or potential impact of SLRs.
The works mentioned in this section present a review that includes a set of comments, observations and
critiques of the SLRs conducted in SE. None of these, however, shows a study that presents the adaptation of
SLRs as an applied research methodology in SE from a perspective of the proposed changes to the initial
protocol used to develop them. This article also provides an account of the origin, development and current
state of the SLR in SE.

5. Conclusions

The work presented covers aspects of the origin, development, use and adaptation of the SLR as a research
methodology in SE, providing a chronological frame of reference that includes its current status in the field.
In addition, the answers and evidence for the RQs posed at the beginning of the work have been reviewed. We
believe that this work may be of interest to industry professionals and new researchers who wish to approach
the SLR as a relevant source of information, as well as to researchers planning to conduct additional studies
on SLRs and their application in SE.

Although there are other works that present both a review and a set of observations and critiques of SLRs
conducted in SE, we found no works that specifically report results on the adaptation of SLRs as a research
methodology applied in SE from the perspective of the changes proposed to the initial protocol used to develop
them. From this, we understand that more tertiary studies are required in this area to make it possible to
examine the subject in greater detail.
As future work, we suggest extending this study by adding and refining the RQs and including more data
sources in order to corroborate the ideas put forward here. Furthermore, we propose the development of a
prototype that indexes and retrieves only SLRs in SE, thus addressing the deficiency indicated by the
literature for our discipline.

Acknowledgements
This work was conducted with the support of Vicerrectoría de Investigación y Postgrado at the Universidad
de La Frontera, through Research Project # DI14-0065. Special thanks to Mauricio Bustamante for his useful
comments, reviews and technical advice on this work.

References

Basili, V., F. Shull, et al. (1999). "Building knowledge through families of experiments." IEEE Transactions
on Software Engineering 25(4): 456-473.
Biolchini, J., P. G. Mian, et al. (2005). "Systematic Review in Software Engineering." System Engineering
and Computer Science Department COPPE/UFRJ, Technical Report ES 679(05).
Biolchini, J. C., P. G. Mian, et al. (2006). "Scientific research ontology to support systematic review in
software engineering." Advanced Engineering Informatics 21(2): 133-151.
Brereton, P., B. A. Kitchenham, et al. (2007). "Lessons from applying the systematic literature review process
within the software engineering domain." Journal of Systems and Software 80(4): 571-583.
Budgen, D., B. A. Kitchenham, et al. (2008). "Presenting software engineering results using structured
abstracts: a randomised experiment." Empirical Software Engineering 13(4): 433-458.
Caro, M. A., A. R. Ríos, et al. (2005). "Análisis y revisión de la literatura en el contexto de proyectos de fin
de carrera: Una propuesta." Revista Sociedad Chilena de Ciencia de la Computación 6(1).
Chen, L., M. Ali Babar, et al. (2009). Variability Management in Software Product Lines: A Systematic
Review. 13th International Software Product Line Conference, Carnegie Mellon University.
Chrissis, M. B., M. Konrad, et al. (2003). CMMI: Guidelines for process integration and product
improvement, Addison-Wesley Professional.
Clark, T., P. Sammut, et al. (2004). "Applied metamodelling: a foundation for language driven development."
Cruzes, D. S. and T. Dybå (2011). "Research synthesis in software engineering: A tertiary study." Information
and Software Technology 53(5): 440-455.
da Silva, F. Q. B., A. L. M. Santos, et al. (2011). "Six years of systematic literature reviews in software
engineering: An updated tertiary study." Information and Software Technology 53(9): 899-913.
Dybå, T., T. Dingsøyr, et al. (2007). Applying systematic reviews to diverse study types: An experience
report. First International Symposium on Empirical Software Engineering and Measurement, ESEM
2007, IEEE.
Dybå, T., B. A. Kitchenham, et al. (2005). "Evidence-based software engineering for practitioners." Software,
IEEE 22(1): 58-65.
Glass, R. L., I. Vessey, et al. (2002). "Research in software engineering: an analysis of the literature."
Information and Software Technology 44(8): 491-506.
Glass, R. L., V. Ramesh, et al. (2004). "An Analysis of Research in Computing Disciplines."
Communications of the ACM 47(6): 89-94.
Grimán, A. and N. Juristo (2007). Proposal of a Review Process of Empirical Studies in Software
Engineering. International Doctoral Symposium on Empirical Software Engineering
(IDoESE2007): 25-32.
Gwet, K. (2002). "Inter-rater reliability: dependency on trait prevalence and marginal homogeneity."
Statistical methods for inter-rater reliability assessment 2: 1-9.
Jedlitschka, A. and D. Pfahl (2005). Reporting Guidelines for Controlled Experiments in Software
Engineering. International Symposium on Empirical Software Engineering, IEEE.
Jorgensen, M., T. Dyba, et al. (2005). Teaching evidence-based software engineering to university students.
Software Metrics, 11th IEEE International Symposium (METRICS’05).
Jørgensen, M. and M. Shepperd (2007). A systematic review of software development cost estimation studies.
IEEE Transactions on Software Engineering.
Kitchenham, B. (2004). Procedures for performing systematic reviews. Technical Report TR/SE-0401. S. E.
Group, Department of Computer Science, Keele University.
Kitchenham, B., P. Brereton, et al. (2009). "Systematic literature reviews in software engineering – A
systematic literature review." Information and Software Technology 51(1): 7-15.
Kitchenham, B., P. Brereton, et al. (2009). The Impact of Limited Search Procedures for Systematic
Literature Reviews – A Participant-Observer Case Study. Third International Symposium on
Empirical Software Engineering and Measurement.
Kitchenham, B., S. Charters, et al. (2007). Guidelines for performing Systematic Literature Reviews in
Software Engineering. EBSE Technical Report, EBSE-2007-01 Software Engineering Group, School
of Computer Science and Mathematics Keele University and Department of Computer Science
University of Durham.
Kitchenham, B., E. Mendes, et al. (2007). "Cross versus within-Company Cost Estimation Studies: A
Systematic Review." IEEE Transactions on Software Engineering 33: 316-329.
Kitchenham, B., R. Pretorius, et al. (2010). "Systematic literature reviews in software engineering – A tertiary
study." Information and Software Technology 52(8): 792-805.
Kitchenham, B. A., P. Brereton, et al. (2010). "Refining the systematic literature review process—two
participant-observer case studies." Empirical Software Engineering 15(6): 618-653.
Kitchenham, B. A., T. Dyba, et al. (2004). Evidence-based software engineering. 26th International
Conference on Software Engineering, IEEE.
Mian, P., T. Conte, et al. (2005). A systematic review process to software engineering. 2nd Experimental
Software Engineering Latin American Workshop (ESELAW'05), Brazil.
Moher, D., A. Liberati, et al. (2010). "Preferred reporting items for systematic reviews and meta-analyses:
The PRISMA statement." The PRISMA Group International Journal of Surgery 8(5): 336–341.
Petersen, K., R. Feldt, et al. (2007). Systematic Mapping Studies in Software Engineering. 12th International
Conference on Evaluation and Assessment in Software Engineering.
Ramey, J. and P. G. Rao (2011). The systematic literature review as a research genre. Professional
Communication Conference (IPCC), Cincinnati, OH, USA, IEEE.
Rodriguez, D. (2005). Empirical software engineering research: epistemological and ontological foundations.
First Workshop on Ontology, Conceptualizations and Epistemology for Software and Systems
Engineering (ONTOSE).
Sackett, D. L., W. Rosenberg, et al. (1996). "Evidence based medicine: what it is and what it isn't." British
Medical Journal (BMJ) 312(7023): 71-72.
Shull, F., J. Carver, et al. (2001). An empirical methodology for introducing software processes. Joint 8th
European Software Engineering Conference (ESEC) and 9th ACM SIGSOFT Foundations of
Software Engineering (FSE-9), Vienna, Austria.
Sjøberg, D. I. K., T. Dybå, et al. (2007). The Future of Empirical Methods in Software Engineering Research.
Future of Software Engineering, FOSE'07, IEEE CS.
Staples, M. and M. Niazi (2007). "Experiences using systematic review guidelines." Journal of Systems and
Software 80(9): 1425-1437.
Wohlin, C., M. Höst, et al. (2006). "Empirical research methods in Web and software Engineering." Web
Engineering: 409-429.
Wohlin, C., P. Runeson, et al. (2000). Experimentation in software engineering: an introduction, Kluwer
Academic Publisher.
Zelkowitz, M. V., D. R. Wallace, et al. (2003). "Experimental validation of new software technology."
Lecture Notes on Empirical Software Engineering 12: 229-263.
Zhang, H., M. A. Babar, et al. (2011). "Identifying relevant studies in software engineering." Information and
Software Technology 53(6): 625-637.

Appendix 1 – Summary of selected papers

Id | Title | Authors | Summary | Year
T1 | Procedures for performing systematic reviews | Kitchenham, B. | Presents the initial SLR protocol in SE; it is the first work to deal with the subject of SLRs and to propose a protocol for performing them in SE | 2004
T2 | Análisis y revisión de la literatura en el contexto de proyectos de fin de carrera: Una propuesta | Caro, M.A., Ríos, A.R. et al. | The authors propose an adaptation of the SLR protocol for undergraduate students; unlike the rest of the works, it is published in Spanish | 2005
T3 | A systematic review process to software engineering | Mian, P., Conte, T. et al. | Suggests the use of templates to make the SLR easier | 2005
T4 | Systematic Mapping Studies in Software Engineering | Petersen, K., Feldt, R. et al. | Although it lays the foundations for developing systematic mapping processes, it also compares systematic mapping and SLRs, establishing criteria and comments that positively influence the protocol for carrying out an SLR | 2007
T5 | Guidelines for performing Systematic Literature Reviews in Software Engineering | Kitchenham, B., Charters, S. et al. | Proposes the use of a predefined structure to guide RQ construction and a set of checklists to evaluate the quality of the selected works | 2007
T6 | Proposal of a Review Process of Empirical Studies in Software Engineering | Grimán, A. and Juristo, N. | A change to the initial SLR protocol, proposing alternative stages and activities | 2007
T7 | Lessons from applying the systematic literature review process within the software engineering domain | Brereton, P., Kitchenham, B. et al. | Proposes reviewing the RQs during protocol development and working in pairs | 2007
T8 | Applying systematic reviews to diverse study types: An experience report | Dybå, T., Dingsøyr, T. et al. | A quality framework is proposed to evaluate the quality of the selected works | 2007
T9 | Experiences using systematic review guidelines | Staples, M. and Niazi, M. | Proposes a different way to pose the RQ and suggests using complementary RQs to clarify the topic and scope of the research | 2007
T10 | Refining the systematic literature review process—two participant-observer case studies | Kitchenham, B., Brereton, P. et al. | Proposes the use of automated searches and suggests that quality evaluations be based on the participation of three independent evaluators, including at least two rounds of discussion | 2010
T11 | Identifying relevant studies in software engineering | Zhang, H., Babar, M.A. et al. | A systematic, evidence-based strategy to implement and evaluate the search for relevant works, including a rigorous development | 2011
