Detectors
Anastasios Tsimakis
Diploma Thesis
February, 2018
Abstract
Refactoring tools have had a significant presence in software engineering for many
years. However, they are often not utilized to their fullest potential, and studies have
outlined that one of their issues is the difficulty of freely extending their capabilities
with new refactorings. In this work, we propose a pattern for refactoring detectors
in order to overcome that obstacle. In addition, the pattern was applied to an already
existing refactoring tool to showcase its properties.
Chapter 1. Introduction
However, refactoring large amounts of code can be a difficult process, taking up a lot
of time and effort. For that reason, several automated tools have been developed,
allowing quick and efficient identification of code blocks in need of refactoring.
Nevertheless, studies [MHPB11] have found that developers most often do not
use such tools, for various reasons. At the University of Ioannina,
Theofanis Vartziotis developed one such tool, Refactoring
TripAdvisor [V16], which aims to help developers understand refactorings
better and to detect opportunities for some refactorings automatically.
Along with the tool, a generalized algorithm for refactoring detectors was
developed, with the main intent to make the tool extensible for future additions.
However, this algorithm was neither fully clarified nor implemented. The goal of
this thesis is twofold: First, to fully develop and refine the algorithm in the form of a
pattern, and secondly to implement it in the code of Refactoring TripAdvisor, to
showcase the quality increase of the existing code after application of the proposed
pattern.
Chapter 2. Related Work
Figure 2.1: Color-coding of refactoring categories
Figure 2.3: Feature Movement Between Objects relationship map
Succession relations are indicated by a straight line with an arrow, “Part Of”
relations are indicated by a dashed line with an arrow, and finally “Instead of”
relations are indicated by a dashed line with arrows on both ends.
For each of the refactorings, additional details are provided, such as the motivation
for applying that refactoring, as well as a short code snippet before and after
application. An example of this is provided in Figures 2.4 and 2.5.
Figure 2.4: Motivation for using the Replace Temp with Query refactoring
Figure 2.5: Example of applying the Replace Temp with Query refactoring
Our main focus, however, lies with the design and implementation of the automated
refactoring opportunity detectors. Not all the refactorings have implemented
detectors; of the 68 included in the map, only 11 have automatic opportunity
detection capabilities. They are listed in Table 2.1 below, along with their categories.
Refactoring Category
identifyRefactoringOpportunities(EclipseProject p)
    t = getASTTree(file)
    suggestedEntity = identificationForSpecificRefactoring(Method method)
    return(suggestedEntity)
This algorithm was later expanded and refined in a conference paper [VZV15],
where it took the form given in Figure 2.7:
Ensure: ∀ o ∈ O, o ∈ AST
1: scope ← identifyScope(AST);
2: S ← identifySubjects(scope);
3: for all subject ∈ S do
4:     O ← O ∪ identifyOpportunities(subject);
5: end for
6: visualizeOpportunities(O);
7: return;
“[...] a refactoring detector takes as input an abstract syntax tree AST that represents a
software project; the nodes of the tree correspond to specific program structures,
while the edges denote structural relations between these structures. The detection of
refactoring opportunities is a four steps process. The first step of the process
(Algorithm 1 line 1) identifies in the given AST the scope of the refactoring, i.e., the
part of the project that the developer is working with (e.g., a selected package, class,
method). The second step of the process (Algorithm 1 line 2) identifies, within the
refactoring scope, the refactoring subjects S, i.e., the specific structures within the
refactoring scope that can be refactored (e.g. the methods of the class that can be
simplified with Extract Method). The third step (Algorithm 1 lines 2-3) identifies a set
of refactoring opportunities O for each subject (e.g., the computational slices of a
method that can become new methods). Finally, the last step of the process (Algorithm
1 line 6) visualizes the identified refactoring opportunities.” (p. 6)
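As a rough illustration of the four-step process quoted above, the sketch below expresses Algorithm 1 as a generic Java method. Everything in it is an assumption made for the example: the class name DetectionAlgorithmSketch, the functional parameters, and the toy "AST" (a plain list of method names) are hypothetical stand-ins and not part of the RTA code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical, simplified rendering of Algorithm 1; not the RTA implementation.
final class DetectionAlgorithmSketch {

    // A = AST type, C = scope type, S = subject type, O = opportunity type.
    static <A, C, S, O> List<O> detect(
            A ast,
            Function<A, C> identifyScope,
            Function<C, List<S>> identifySubjects,
            Function<S, List<O>> identifyOpportunities,
            Consumer<List<O>> visualize) {
        C scope = identifyScope.apply(ast);                 // step 1: scope
        List<S> subjects = identifySubjects.apply(scope);   // step 2: subjects
        List<O> opportunities = new ArrayList<>();
        for (S subject : subjects) {                        // step 3: opportunities per subject
            opportunities.addAll(identifyOpportunities.apply(subject));
        }
        visualize.accept(opportunities);                    // step 4: visualization
        return opportunities;
    }

    // Toy usage: the "AST" is a list of method names and long names count as opportunities.
    static List<String> demo() {
        List<String> ast = List.of("run", "computeEverythingAtOnce", "get");
        Function<List<String>, List<String>> wholeTree = tree -> tree;     // everything is in scope
        Function<List<String>, List<String>> allMethods = scope -> scope;  // every method is a subject
        Function<String, List<String>> longNames =
                name -> name.length() > 10 ? List.of(name) : List.of();   // toy criterion
        Consumer<List<String>> print = found -> System.out.println("Opportunities: " + found);
        return detect(ast, wholeTree, allMethods, longNames, print);
    }
}
```

Passing the four steps as functions keeps the control flow of the algorithm in one place, which is the property the pattern proposed in Chapter 3 builds on.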
The purpose of this algorithm was to make it easier to continue work on the
tool, extending it with further refactoring detectors, either
developed in-house or by third parties. This is especially important, since 4 of the 11
implemented detectors rely on opportunity identification algorithms by
third-party developers. In particular, Extract Class [FTSC12], Extract Method
[TC09], Move Method, and Replace Method with Method Object utilise algorithms
used in the JDeodorant plugin by N. Tsantalis, A. Chatzigeorgiou, et al. [TC14].
The main issue with the algorithm is that, while it is applicable to every detector
regardless of refactoring, it is quite vague in its definitions of scope, subject, and
opportunity. There is no formal specification of how to identify subjects, or why
they are needed. In addition, while the algorithm was implemented in the RTA
source code, it was not in an easily recognisable form, and was repeated in every
single detector class, rather than in one superclass, leading to very large amounts of
duplicated code. Not only that, but RTA also had other quality issues, such as very
long methods, and code duplication unrelated to the above algorithm. These
problems are illustrated in detail in Chapter 4.
2.2 Patterns
Design patterns were first introduced in 1977 by an architect by the name of
Christopher Alexander [A77] and were focused on the disciplines of architecture and
civil engineering, but later found widespread application in the field of software
engineering as well as other sciences [BC87]. In its most basic form, a pattern is a set of
rules describing the relations between a context, a problem (or a set of forces) that
occurs repeatedly in that context, and a solution for the problem. We will be using
this concept to propose a generalized form for refactoring detectors.
In 1997, Gerard Meszaros and Jim Doble laid the foundations for formalized pattern
writing, devising a pattern language for pattern writing in the book “Pattern
Languages of Program Design” [MD97]. Later, in 2004, Neil Harrison expanded on
this basis with a paper submitted to the European Conference on Pattern Languages
of Programs (EuroPLoP) titled “Advanced Pattern Writing” [H04]. There are several
different forms of patterns, and while each author can create their own or adjust
existing ones at will, there are a few that have been studied and refined over the
years. Two of the more notable ones are the Alexandrian Form (the original one
proposed by Christopher Alexander), and Gang of Four Form, appearing in their
well-known book “Design Patterns: Elements of Reusable Object-Oriented Software”
[GOF94] in 1994. In this thesis, we will focus on a form by James Coplien, based on
the Alexandrian Form. In 2011, Wellhausen and Fießer [WF11] published a paper at
EuroPLoP with a step-by-step guide on how to write a design pattern in this form.
Below is a brief overview and explanation of each of the pattern components
of the Coplien Form.
The Problem is the specific problem that the pattern attempts to solve.
The Forces are a set of conditions that magnify the problem or inhibit its solution.
The Solution describes the proposed solution to the problem, taking into account
the given context and forces.
The Consequences describe what happens after application of the solution. They
are usually divided into positive consequences, or Benefits, and negative
consequences, or Liabilities.
Chapter 3. Refactoring Detector Pattern
This chapter contains the formalized pattern for the refactoring detectors, as well as
some further explanations.
Context:
You want to create a tool to assist with the refactoring process of a software project.
Problem:
Given the large number of different refactorings, it would be tedious and impractical
to design a detector from the ground up for each one of them.
Forces:
Solution:
1. Identify Refactoring Scope
The first step is identifying the refactoring scope, which is the region of the software
project that the user wants to search for code that may require
refactoring. The scope can be divided into clearly defined tiers, each one broader than
the next.
The second step is identifying the refactoring subjects within the given scope. Each
refactoring has a subject type, which is the smallest self-contained block of code that
refactoring opportunities can be detected in, or, more abstractly, where the
refactoring is applied. As such, in object-oriented languages, there are only
two subject types: Methods and Classes. Table 3.1 contains the refactoring subject
and opportunity types of all the refactorings in RTA's map, divided according to
their category.
The final step is identifying the actual refactoring opportunities. These are code
blocks or structures that exist within the refactoring subjects that fulfil all criteria
for the refactoring in question to be applied. While some refactorings may have
similar opportunities in terms of the code structures that will be affected by the
different refactorings, there is simply too much variance to have practically usable
groups. The table with subject types mentioned above also contains opportunity
types for the refactorings included, to illustrate this issue.
Consequences
Benefits:
Simplicity: The 3 steps presented in the solution are clear, concise, and
straightforward.
Liabilities:
Implementation
Figure 3.1
The RefactoringDetector superclass contains the code for scope identification that
remains the same across all refactorings. The two subject subclasses implement the
abstract method for subject identification, and also store all subjects found in the
subjectList field. Finally, a number of individual refactoring detectors, each with
their own opportunity type, extend their appropriate subject class and implement
the refactoring opportunity detection algorithm. (The “...” in the parameter list of
identifyOpportunities signifies that it takes a variable number of parameters, as
some detectors may rely on thresholds.)
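Since Figure 3.1 is a class diagram, a compact Java skeleton may help when reading it. The sketch below is a hypothetical reduction, not the RTA code: subjects are plain strings, the scope is a list of class names, subjectList is assumed to live in the superclass, and ClassSubjectDetector and LongNameDetector are invented for the example. Only RefactoringDetector, subjectList, identifySubjects and identifyOpportunities are names taken from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Superclass: owns the steps shared by all detectors (here, a trivial scope step).
abstract class RefactoringDetector<S, O> {
    protected final List<S> subjectList = new ArrayList<>();

    // Scope identification is the same for every refactoring, so it is written once.
    protected List<String> identifyScope(List<String> selection) {
        return selection;
    }

    // Implemented once per subject type.
    protected abstract void identifySubjects(List<String> scope);

    // The "..." of Figure 3.1: detectors may take a variable number of thresholds.
    protected abstract List<O> identifyOpportunities(int... thresholds);

    // Template method that fixes the order of the three steps.
    public final List<O> detect(List<String> selection, int... thresholds) {
        subjectList.clear();
        identifySubjects(identifyScope(selection));
        return identifyOpportunities(thresholds);
    }
}

// Subject subclass: every class in scope is a subject (a method-subject sibling is analogous).
abstract class ClassSubjectDetector<O> extends RefactoringDetector<String, O> {
    @Override
    protected void identifySubjects(List<String> scope) {
        subjectList.addAll(scope);
    }
}

// Toy concrete detector: flags classes whose name exceeds a length threshold.
final class LongNameDetector extends ClassSubjectDetector<String> {
    @Override
    protected List<String> identifyOpportunities(int... thresholds) {
        int limit = thresholds.length > 0 ? thresholds[0] : 20;
        List<String> found = new ArrayList<>();
        for (String className : subjectList) {
            if (className.length() > limit) {
                found.add(className);
            }
        }
        return found;
    }
}
```

With this layout, adding a detector means writing only the opportunity step; for instance, `new LongNameDetector().detect(List.of("Order", "CustomerAccountManager"), 10)` runs all three steps and returns only the second name.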
While the steps of the pattern algorithm might make sense intrinsically, it is not
immediately obvious why they are needed. Concerning the use of the concept of
scope, scanning an entire project for refactorings might return hundreds if not
thousands of results, which would be cumbersome for the user to go through. Not
only that, but for some parts of the project, changes might be unwelcome or even
prohibited, so showing possible refactorings on those would only hinder the user.
As for subjects, they assist greatly in categorising refactorings and reducing code
duplication. As mentioned in the solution, the two subject types are Methods and
Classes. While there is no absolute rule for assigning subject types to refactorings,
especially since some of them are vaguely defined (for example Substitute
Algorithm), a good predictor is the refactoring's “range”, or the area of code it will
affect. For example, refactorings that deal with local variables, parameters,
conditionals, and generally the internal workings of methods, have mostly (with
very few exceptions: for example, Introduce Parameter Object has a Class subject
even though it concerns parameters) Method subjects. On the other hand,
refactorings that have to do with fields and inter-class relations (such as inheritance
or association) mostly have Class subjects. A full list of 69 refactorings, together
with their subject types and opportunity types, is given below in Table 3.1, divided
according to category.
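The "range" heuristic described above can be written down as a small lookup. This is only a sketch under stated assumptions: the enum Range, its constants, and the class SubjectTypeHeuristic are hypothetical names, and the result is a default guess rather than a rule, since the text itself notes exceptions such as Introduce Parameter Object.

```java
import java.util.EnumSet;
import java.util.Set;

enum SubjectType { METHOD, CLASS }

// Hypothetical classification of the code area a refactoring mainly affects (its "range").
enum Range { LOCAL_VARIABLE, PARAMETER, CONDITIONAL, METHOD_BODY, FIELD, INHERITANCE, ASSOCIATION }

final class SubjectTypeHeuristic {
    // Ranges that concern the internal workings of a method.
    private static final Set<Range> METHOD_LEVEL =
            EnumSet.of(Range.LOCAL_VARIABLE, Range.PARAMETER, Range.CONDITIONAL, Range.METHOD_BODY);

    // Default guess from the range alone.
    static SubjectType guess(Range range) {
        return METHOD_LEVEL.contains(range) ? SubjectType.METHOD : SubjectType.CLASS;
    }

    // Applies the one exception named in the text before falling back to the guess:
    // Introduce Parameter Object concerns parameters but has a Class subject.
    static SubjectType subjectOf(String refactoring, Range range) {
        if (refactoring.equals("Introduce Parameter Object")) {
            return SubjectType.CLASS;
        }
        return guess(range);
    }
}
```

Keeping the exceptions separate from the default guess mirrors the wording of the heuristic: the range is a predictor, and known deviations are listed explicitly.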
[Table 3.1: columns Refactoring, Subject Type, Opportunity Type; categories include Method Composition, Data Organization, and Generalization Improvement. Table body not preserved.]
There are a couple of concerns about how the pattern handles some conflicts of
scope, but there are multiple ways of resolving them, all depending on how the
developer prefers to handle them. Firstly, there is the question of what to do when
the subject of a refactoring is “larger” than its scope; for example, with a Class-subject
refactoring, a single method is not a valid scope. Possible
solutions would be to disallow selection of invalid scopes (either re-prompting the
user or displaying an error), or automatically selecting the smallest or largest valid
scope. Secondly, there is the question of handling refactorings that affect classes
beyond the selected scope; for example Move Method. In this case, opportunities
that would affect code beyond the selected scope could be ignored, or presented
with a special warning.
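Two of the strategies mentioned for an invalid scope selection can be sketched as follows. The names ScopeTier, ScopeConflictSketch, rejectIfInvalid and widenToSmallestValid are assumptions made for this illustration; the tier names follow the scopeType values used in the case study, with the whole-project tier (NONE) left out for brevity.

```java
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

// Tiers declared from broadest to narrowest.
enum ScopeTier { PACKAGE_FRAGMENT_ROOT, PACKAGE_FRAGMENT, COMPILATION_UNIT, TYPE, METHOD }

final class ScopeConflictSketch {
    // Valid scopes for a Class-subject refactoring: every tier except METHOD.
    static final Set<ScopeTier> CLASS_SUBJECT_SCOPES =
            EnumSet.complementOf(EnumSet.of(ScopeTier.METHOD));

    // Strategy 1: reject an invalid selection so the caller can re-prompt or show an error.
    static Optional<ScopeTier> rejectIfInvalid(ScopeTier requested, Set<ScopeTier> valid) {
        return valid.contains(requested) ? Optional.of(requested) : Optional.empty();
    }

    // Strategy 2: widen an invalid selection to the smallest valid tier that encloses it.
    static ScopeTier widenToSmallestValid(ScopeTier requested, Set<ScopeTier> valid) {
        ScopeTier[] tiers = ScopeTier.values();
        for (int i = requested.ordinal(); i >= 0; i--) {   // walk towards broader tiers
            if (valid.contains(tiers[i])) {
                return tiers[i];
            }
        }
        throw new IllegalArgumentException("No valid scope encloses " + requested);
    }
}
```

For a Class-subject refactoring, a METHOD selection is rejected by the first strategy and widened to TYPE by the second; which behaviour is preferable remains a decision for the tool developer.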
Chapter 4. Case Study
First, we show the state of RTA's code before application of the pattern. Figure 4.1
contains the code of the RefactoringDetector interface that all detector classes
implement.
boolean opportunitiesFound();
It is clear that the interface has almost zero functionality, with no references to any
of the pattern solution steps. Instead, the entire algorithm is implemented in each
individual detector class again and again.
In the following figures, we take a look at the classes responsible for the detection of
Inline Method and Extract Class refactoring opportunities. For the sake of clarity, we
split them into different segments; the first for class declaration and fields, and one
for each method. We will then compare them with each other to showcase the issues
present.
public InlineMethodIdentification() {
    mainFrame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
    mainFrame.setResizable(false);
    mainFrame.setIconImage(java.awt.Toolkit.getDefaultToolkit().getImage(
            getClass().getResource("/images/repair.png")));
    mainFrame.setContentPane(mainPanel);
    mainPanel.setLayout(null);
    opportunitiesFound = true;
}
public class ExtractClassIdentification implements RefactoringDetector {
public ExtractClassIdentification() {
    mainFrame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
    mainFrame.setResizable(false);
    mainFrame.setIconImage(java.awt.Toolkit.getDefaultToolkit().getImage(
            getClass().getResource("/images/repair.png")));
    mainFrame.setContentPane(mainPanel);
    mainPanel.setLayout(null);
    opportunitiesFound = true;
}
Comparing Figures 4.2 and 4.3 to Figures 4.4 and 4.5 respectively, we can see that
there is a significant amount of duplication; in fact, the entire constructor is
identical, except for one method call in the second line of the constructor body
that takes a different parameter.
We will ignore some of the methods dedicated to GUI construction and other work
irrelevant to the pattern, and focus on a specific method that contains the algorithm
for scope, subject, and opportunity identification. Due to its very large size (113
lines for Inline Method, 88 lines for Extract Class), we only present the part relevant
to the pattern. Figures 4.6 and 4.7 contain the code roughly corresponding to scope
identification and subject identification respectively in the Inline Method Detector,
while Figure 4.8 corresponds to scope and subject identification for the Extract
Class Detector. We picked these two specific detectors in order to show the
differences between a method subject detector and a class subject detector.
final SystemObject systemObject = ASTReader.getSystemObject();
final Set<ClassObject> classObjectsToBeExamined = new LinkedHashSet<ClassObject>();
final Set<AbstractMethodDeclaration> methodObjectsToBeExamined =
        new LinkedHashSet<AbstractMethodDeclaration>();
if(selectionInfo.getSelectedPackageFragmentRoot() != null) {
    classObjectsToBeExamined.addAll(
            systemObject.getClassObjects(selectionInfo.getSelectedPackageFragmentRoot()));
}
else if(selectionInfo.getSelectedPackageFragment() != null) {
    classObjectsToBeExamined.addAll(
            systemObject.getClassObjects(selectionInfo.getSelectedPackageFragment()));
}
else if(selectionInfo.getSelectedCompilationUnit() != null) {
    classObjectsToBeExamined.addAll(
            systemObject.getClassObjects(selectionInfo.getSelectedCompilationUnit()));
}
else if(selectionInfo.getSelectedType() != null) {
    classObjectsToBeExamined.addAll(
            systemObject.getClassObjects(selectionInfo.getSelectedType()));
}
else if(selectionInfo.getSelectedMethod() != null) {
    AbstractMethodDeclaration methodObject =
            systemObject.getMethodObject(selectionInfo.getSelectedMethod());
    if(methodObject != null) {
        ClassObject declaringClass =
                systemObject.getClassObject(methodObject.getClassName());
        if(declaringClass != null && !declaringClass.isEnum()
                && !declaringClass.isInterface()
                && methodObject.getMethodBody() != null)
            methodObjectsToBeExamined.add(methodObject);
    }
}
else {
    classObjectsToBeExamined.addAll(systemObject.getClassObjects());
}
Figure 4.6: Method fragment corresponding to Scope identification for Inline Method
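if(!classObjectsToBeExamined.isEmpty()) {
    // Loop header missing from the extracted fragment; assumed from the use of classObject.
    for(ClassObject classObject : classObjectsToBeExamined) {
        ListIterator<MethodObject> methodIterator = classObject.getMethodIterator();
        while(methodIterator.hasNext())
            methodObjectsToBeExamined.add(methodIterator.next());
    }
}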
Figure 4.7: Method fragment corresponding to Subject Identification for Inline Method
The two code fragments shown in Figures 4.6 and 4.7 are repeated identically in all
detectors for method subject refactorings, placed in a single method along with the
code for opportunity detection. Besides the code duplication, this also leads to very
large, confusing methods that make it hard to figure out which part of the code is
responsible for each part of the algorithm, hindering the extensibility of the project.
The same holds for Figure 4.8, which is repeated in all class subject refactoring detectors.
SystemObject systemObject = ASTReader.getSystemObject();
// The else-if guards and the loop header are missing from the extracted fragment;
// they are assumed here by analogy with Figure 4.6.
if(selectionInfo.getSelectedPackageFragmentRoot() != null) {
    classObjectsToBeExamined.addAll(systemObject.getClassObjects(selectionInfo.getSelectedPackageFragmentRoot()));
}
else if(selectionInfo.getSelectedPackageFragment() != null) {
    classObjectsToBeExamined.addAll(systemObject.getClassObjects(selectionInfo.getSelectedPackageFragment()));
}
else if(selectionInfo.getSelectedCompilationUnit() != null) {
    classObjectsToBeExamined.addAll(systemObject.getClassObjects(selectionInfo.getSelectedCompilationUnit()));
}
else if(selectionInfo.getSelectedType() != null) {
    classObjectsToBeExamined.addAll(systemObject.getClassObjects(selectionInfo.getSelectedType()));
}
else {
    classObjectsToBeExamined.addAll(systemObject.getClassObjects());
}
for(ClassObject classObject : classObjectsToBeExamined) {
    if(!classObject.isEnum())
        classNamesToBeExamined.add(classObject.getName());
}
Figure 4.8: Method fragment corresponding to scope and subject identification for Extract Class
To solve these issues, we apply the pattern, as seen in Figure 3.1. We turn the
RefactoringDetector interface into an abstract superclass, and move up all common
fields, as well as the constructor, as seen in Figures 4.9 and 4.10:
public abstract class RefactoringDetector {
Besides moving up the common fields, we have also added some new ones. The
listOfValidScopes field, as the name implies, contains all valid scopes for a given
subject type; as mentioned at the end of Chapter 3, the chosen scope must always be
at least as large as the refactoring's subject type. It is initialized in the constructor of
RefactoringDetector, but populated in the constructors of the subject-specific
subclasses. The reflectionMap field links each scope tier with a getter
method that returns the actual scope object. This is due to already-existing code that
could not be changed, both in RTA and in JDeodorant. It is populated in the
method populateReflectionMap, shown in Figure 4.11, and used in the scope
identification method (Figure 4.12).
public RefactoringDetector(String title) {
    mainFrame.setTitle(title);
    mainFrame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
    mainFrame.setResizable(false);
    mainFrame.setIconImage(java.awt.Toolkit.getDefaultToolkit().
            getImage(getClass().getResource("/images/repair.png")));
    mainFrame.setContentPane(mainPanel);
    mainPanel.setLayout(null);
    opportunitiesFound = true;
    populateReflectionMap();
}
private void populateReflectionMap()
{
    reflectionMap = new HashMap<scopeType, Method>();
    Class<?> packageSelection;
    try {
        packageSelection =
                Class.forName("DataHandling.PackageExplorerSelection");
        Method selectionGetPackageFragmentRoot =
                packageSelection.getDeclaredMethod("getSelectedPackageFragmentRoot");
        Method selectionGetPackageFragment =
                packageSelection.getDeclaredMethod("getSelectedPackageFragment");
        Method selectionGetCompilationUnit =
                packageSelection.getDeclaredMethod("getSelectedCompilationUnit");
        Method selectionGetType =
                packageSelection.getDeclaredMethod("getSelectedType");
        Method selectionGetMethod =
                packageSelection.getDeclaredMethod("getSelectedMethod");
        reflectionMap.put(scopeType.PACKAGE_FRAGMENT_ROOT,
                selectionGetPackageFragmentRoot);
        reflectionMap.put(scopeType.PACKAGE_FRAGMENT,
                selectionGetPackageFragment);
        reflectionMap.put(scopeType.COMPILATION_UNIT,
                selectionGetCompilationUnit);
        reflectionMap.put(scopeType.TYPE, selectionGetType);
        reflectionMap.put(scopeType.METHOD, selectionGetMethod);
        reflectionMap.put(scopeType.NONE, null);
    } catch (ClassNotFoundException e) {
        e.printStackTrace();
    } catch (NoSuchMethodException e) {
        e.printStackTrace();
    } catch (SecurityException e) {
        e.printStackTrace();
    }
}
In Figure 4.12, we see the updated code for scope identification. We replaced the
if-else-if chain with a lookup based on reflection and generic types.
scope = selectionInfo.getSelectedScope();
else
scope = scopeType.NONE;
return null;
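To illustrate how a map of getter methods can stand in for an if-else-if chain, here is a self-contained sketch. It is not the RTA code: FakeSelection is a hypothetical replacement for the selection object, the scope regions are plain strings, and ScopeType lists only a subset of the tiers. The reflection calls used (Class.getMethod, Method.invoke) are standard Java.

```java
import java.lang.reflect.Method;
import java.util.EnumMap;
import java.util.Map;

enum ScopeType { PACKAGE_FRAGMENT, TYPE, METHOD, NONE }

// Hypothetical selection object; a getter returns null when that tier is not selected.
class FakeSelection {
    private final String pkg;
    private final String type;
    private final String method;

    FakeSelection(String pkg, String type, String method) {
        this.pkg = pkg;
        this.type = type;
        this.method = method;
    }

    public String getSelectedPackageFragment() { return pkg; }
    public String getSelectedType() { return type; }
    public String getSelectedMethod() { return method; }
}

final class ReflectiveScopeLookup {
    // One getter per tier; EnumMap iterates in declaration order (broadest first).
    private final Map<ScopeType, Method> getters = new EnumMap<>(ScopeType.class);

    ReflectiveScopeLookup() {
        register(ScopeType.PACKAGE_FRAGMENT, "getSelectedPackageFragment");
        register(ScopeType.TYPE, "getSelectedType");
        register(ScopeType.METHOD, "getSelectedMethod");
    }

    private void register(ScopeType scope, String getterName) {
        try {
            getters.put(scope, FakeSelection.class.getMethod(getterName));
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("Missing getter " + getterName, e);
        }
    }

    // Replaces the if-else-if chain: the first tier with a non-null selection wins.
    ScopeType selectedScope(FakeSelection selection) {
        for (Map.Entry<ScopeType, Method> entry : getters.entrySet()) {
            if (invoke(entry.getValue(), selection) != null) {
                return entry.getKey();
            }
        }
        return ScopeType.NONE;
    }

    // Returns the selected region for a tier, or null for NONE (whole project).
    Object regionFor(ScopeType scope, FakeSelection selection) {
        Method getter = getters.get(scope);
        return getter == null ? null : invoke(getter, selection);
    }

    private static Object invoke(Method getter, FakeSelection selection) {
        try {
            return getter.invoke(selection);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off of this design is that supporting a new tier only requires one more map entry, at the cost of losing compile-time checking of the getter names.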
Finally, there are three abstract methods declared: one of them is for subject
identification, implemented in the subject-dependent subclasses, and the other
two are auxiliary ones implemented in each individual refactoring detector
class.
Moving on to the newly created subject subclasses, we have two of them, one for
method subjects and one for class subjects, shown in Figures 4.14 to 4.17.
protected abstract <T> void identifySubjects(T scopeRegion, SystemObject systemObject);
super(title);
listOfValidScopes.add(scopeType.PACKAGE_FRAGMENT_ROOT);
listOfValidScopes.add(scopeType.PACKAGE_FRAGMENT);
listOfValidScopes.add(scopeType.COMPILATION_UNIT);
listOfValidScopes.add(scopeType.TYPE);
listOfValidScopes.add(scopeType.NONE);
As you can see, we populate the listOfValidScopes in the constructor, omitting the
invalid Method scope tier. Figure 4.15 contains the code for subject identification,
using the generic scope object that will be returned from identifyScope and casting
it to the appropriate class.
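protected <T> void identifySubjects(T scopeRegion, SystemObject systemObject) {
    // The else-if guards are missing from the extracted fragment; they are assumed
    // here from the type each scopeRegion is cast to.
    if (scope != scopeType.NONE) {
        if (scope == scopeType.PACKAGE_FRAGMENT_ROOT) {
            subjectList.addAll(systemObject.getClassObjects((IPackageFragmentRoot) scopeRegion));
        } else if (scope == scopeType.PACKAGE_FRAGMENT) {
            subjectList.addAll(systemObject.getClassObjects((IPackageFragment) scopeRegion));
        } else if (scope == scopeType.COMPILATION_UNIT) {
            subjectList.addAll(systemObject.getClassObjects((ICompilationUnit) scopeRegion));
        }
    } else {
        subjectList.addAll(systemObject.getClassObjects());
    }
}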
Similarly for the method subjects, Figure 4.16 contains the fields and constructor,
while Figures 4.17 and 4.18 contain the subject identification.
super(title);
listOfValidScopes.add(scopeType.PACKAGE_FRAGMENT_ROOT);
listOfValidScopes.add(scopeType.PACKAGE_FRAGMENT);
listOfValidScopes.add(scopeType.COMPILATION_UNIT);
listOfValidScopes.add(scopeType.TYPE);
listOfValidScopes.add(scopeType.METHOD);
listOfValidScopes.add(scopeType.NONE);
protected <T> void identifySubjects(T scopeRegion, SystemObject systemObject)
if (scope != scopeType.NONE){
if (scope == scopeType.PACKAGE_FRAGMENT_ROOT){
classObjectsToBeExamined.addAll(systemObject.getClassObjects((IPackageFragmentRoot) scopeRegion));
classObjectsToBeExamined.addAll(systemObject.getClassObjects((IType) scopeRegion));
AbstractMethodDeclaration methodObject = systemObject.getMethodObject((IMethod) scopeRegion);
ClassObject declaringClass = systemObject.getClassObject(methodObject.getClassName());
&& !declaringClass.isInterface()
subjectList.add(methodObject);
}
}
else{
classObjectsToBeExamined.addAll(systemObject.getClassObjects());
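if(!classObjectsToBeExamined.isEmpty()){
    // Loop header missing from the extracted fragment; assumed from the use of classObject.
    for(ClassObject classObject : classObjectsToBeExamined) {
        ListIterator<MethodObject> methodIterator = classObject.getMethodIterator();
        while(methodIterator.hasNext())
            subjectList.add(methodIterator.next());
    }
}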
identifySubjects(identifyScope(), systemObject);
identifyOpportunities(systemObject, threshold);
Figure 4.19
For a more quantitative analysis, Table 4.1 contains a full breakdown of the number of
lines before and after application of the pattern in each of the individual detector
classes, as well as the numerical and percentage reduction of lines of code in those
classes.
Class                              Lines Before   Lines After   # Reduction   % Reduction
RefactoringDetector                15             101
RefactoringDetectorClassSubject    -              57
RefactoringDetectorMethodSubject   -              86
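For clarity, the two reduction columns are presumably computed in the usual way, as sketched below; the class and method names are hypothetical and the formula is an assumption about how the table was derived, not something stated in the text.

```java
// Hypothetical helper showing how the reduction columns could be derived.
final class LineReduction {

    // Absolute reduction in lines of code; negative when a class grew.
    static int absolute(int before, int after) {
        return before - after;
    }

    // Percentage reduction relative to the original size.
    static double percent(int before, int after) {
        if (before <= 0) {
            throw new IllegalArgumentException("before must be positive");
        }
        return 100.0 * (before - after) / before;
    }
}
```

Under this reading, a class that shrinks from 200 to 150 lines has a reduction of 50 lines, or 25%, while the RefactoringDetector row (15 lines before, 101 after) yields a negative value, reflecting that the shared code was moved into the superclass.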
Chapter 5. Conclusions
The problem this thesis deals with relates to the issue of refactoring tool
extensibility. It has been highlighted in several studies regarding refactoring tools,
but, to the best of our knowledge, not much research has been carried out regarding
it. Our solution uses groundwork laid by the tool Refactoring TripAdvisor. We
improved and refined the ideas presented in the original work and subsequent
publications, and formalized them in the form of a pattern for refactoring detectors.
From our study, this pattern is applicable to any of the 68 refactorings devised by
Martin Fowler, and can be further applied to new ones without issue. It is also
language-independent within the family of object-oriented languages.
Chapter 6. References
[BC87] Kent Beck, Ward Cunningham. Using Pattern Languages for Object-Oriented Programs. OOPSLA 87 Workshop on Specification and Design for Object-Oriented Programming, 1987.
[GOF94] Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison Wesley, 1994.
[LT12] Huiqing Li, Simon Thompson. Let’s Make Refactoring Tools User-Extensible! Fifth Workshop on Refactoring Tools, 2012.