

Industrial Tools for the Feature Location Problem: A Case Study in Tool Adaptation

Sharon Simmons, Dennis Edwards, Norman Wilde, Josh Homan (Department of Computer Science, University of West Florida); Michael Groble (Motorola, Inc.)

Abstract

Software engineers who maintain and enhance large systems often encounter the feature location problem: where in the many thousands of lines of code is a particular user feature implemented? Several methods of addressing the problem have been proposed, most of which involve tracing execution of the system and analyzing the traces. Some supporting academic tools are available. However, companies that depend on the successful evolution of large systems are more likely to use new methods if they are supported by industrial strength tools of known reliability. This report describes a study performed with Motorola, Inc. to see if Metrowerks CodeTEST and Klocwork inSight could be used for feature location on message passing software, similar to many systems that Motorola maintains. Both tools are currently in use in Motorola for other purposes and are known to be robust and effective. These two tools were combined with TraceGraph, an academic trace comparison tool, in a case study of four features in a large open-source system. About 180 hours of experimentation and some "glue" code were needed to adapt the two tools for this new task. Once this work was complete the tool combination operated quite effectively. CodeTEST provided an efficient way of collecting traces from a large, multi-process system, TraceGraph compared those traces and identified code areas for study, and inSight assisted in browsing those code areas and in documenting the resulting understanding of the feature. Case study participants completed these steps in typically 3 to 4 hours per feature, studying only a few hundred lines out of a 200,000 line system. An ongoing project with Motorola is focused on improving tool integration with the hope of making feature location with this tool combination into a common part of Motorola practice.

Support for this study was provided in part by Motorola, Inc. through the Software Engineering Research Center, an NSF Industry/University Cooperative Research Center (http://www.serc.net). Additionally, Klocwork, Inc. provided a free academic license for the inSight tool used in the study.

Address correspondence to: Norman Wilde, Department of Computer Science, University of West Florida, Pensacola, FL 32514, tel: 850-474-2548, fax: 850-857-6056, email: nwilde@uwf.edu


Key Words: software maintenance, program comprehension, feature location, tool adaptation

Introduction

Industrial software systems are so large that they often are no longer completely understood, especially after initial development is complete. Further evolution and servicing may then become the responsibility of a team which has lost some of the knowledge generated in system development [2].

A common problem facing the members of such a team is to understand how a specific feature of the software has been implemented. Large systems do not do just one thing, but instead offer many interrelated features to their users. A word processor, for example, will offer features for editing text, changing fonts and text styles, inserting tables and images, saving in several different file formats, etc. A change request for such a system may mention a bug or an enhancement related to a particular feature, e.g. "Setting the text style within a table heading does not work".

A software engineer assigned to this change request would have to track down the code involved in table headings and text style changes. Traditionally such feature exploration is done using a combination of text search utilities such as Unix's grep, a debugger (if a suspicious variable or code segment can be identified) and lengthy perusal of the source code. Such methods obviously become more and more difficult to use as the size of the system increases.

Several researchers have studied dynamic analysis methods for feature location. These involve running the program using different sets of test data, some with the feature and some without. The program is instrumented to keep track of the software components that are executed in each test (modules, subroutines, or lines of code depending on the study). For example, Wong et al. [19] use a metrics approach to quantify the disparity between a feature and a component, the concentration of a feature in a component, and the dedication of a component to a feature. An extended version of this approach characterizes the distance between features using both a static method and a dynamic one which takes into account a system operational profile [20]. A somewhat different method has been described by Eisenbarth et al. [4]. In this method the program is executed under several scenarios, each of which has been tagged with the features it involves. A trace of subroutine calls is taken, and these traces are analyzed to categorize the subroutines according to their degree of specificity to a given feature. The analysis also automatically produces a set of concepts for the program. The concepts are subsets of the subroutines that tend to execute together and are presented in a lattice.

The simplest, and perhaps the oldest, dynamic analysis approach to feature location has been called software reconnaissance [15]. In this method the program is executed for a few test cases with the feature, and for similar cases without the feature. The marker code for the feature is defined as the set difference:

marker code components = (code components executed in tests with the feature) - (code components executed in tests without the feature)
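To make the set difference concrete, here is a minimal Python sketch, assuming each trace has already been exported as a plain text file with one executed component identifier per line; the file names and identifier format are illustrative only.

```python
# Minimal sketch of the software reconnaissance set difference, assuming each
# trace is a text file with one executed component identifier (e.g. "file.c:123")
# per line. File names are illustrative.

def read_components(path):
    """Return the set of component identifiers recorded in one trace file."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

with_feature = read_components("trace_with_feature.txt")
without_feature = read_components("trace_without_feature.txt")

# Marker code: components executed only when the feature was present.
markers = with_feature - without_feature
for component in sorted(markers):
    print(component)
```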
The strengths and weaknesses of this method have been fairly well established in a series of case studies [16, 17, 18]. The method tends to be very good at focusing study on a small fraction of the code. The marker code provides "good places to start looking" for code browsing. The method often provides surprising insights, even to programmers who have experience with the system. "I didn't know it was doing that!" is a fairly common reaction.

However the method is certainly not a panacea. It requires the system to be instrumented and test cases to be run. Poorly chosen test cases that exercise too much or too little of the system may cause problems. As well, the marker code is usually not enough by itself to provide understanding of the system. The markers need to be studied in context, to see how they fit into the control flow and the data flow of the system. The task of putting the pieces together ranges from trivial through very difficult, depending on the structure and clarity of the code [18].

Industrial Tools and Feature Location

Most of the published studies of the reconnaissance method have been done in an academic context. It is well known that academic tools often have difficulty dealing with real industrial software due to problems of scale and to the complexity of the heterogeneous environments found in the real world. A company may well prefer to work with industrial tools, which may be expensive but have been proven in use at the same company.

Motorola Labs was interested in knowing if tools currently in use at Motorola Inc. could be adapted for feature location in telecommunications systems. Motorola products include cellular phones, two-way radios, telecommunications switches, etc. that run on highly specialized platforms with many processors. Many of these systems are message oriented; a software component listens for incoming messages, processes them, and produces messages in response.

Motorola asked the University of West Florida and the Software Engineering Research Center (SERC) to perform the pilot case study reported in this paper to see if feature location could be performed effectively on this kind of software, using tools already in use in the company. The specific tools suggested were CodeTEST from Metrowerks and inSight from Klocwork.

CodeTEST instruments software so that, when the compiled program runs, information can be collected on performance, test coverage, and memory usage [7]. It can be used either in native mode, with all functions performed in software, or in a hardware-in-circuit mode in which data collection is performed in real time using specialized hardware. Motorola has used the hardware-in-circuit mode to collect code coverage and performance data from a number of different embedded systems. The hardware collection mechanism has proved invaluable when analyzing device driver code and other packet processing software routines that are called thousands of times a second and have processing times in the range of 10 to 0.1 microseconds.

The inSight tool is part of a broader Klocwork suite that provides static analysis of large systems [5]. Klocwork parses the source code and builds a data repository that has several different uses, including metrics generation, identification of potential defects and security issues, and enforcement of architectural constraints. inSight provides a visual interface to view software dependencies, construct architectural models, and explore possible refactorings.

It was immediately evident that these tools provide support for important parts of the feature understanding problem, but do not cover it completely. CodeTEST instruments the target program and produces traces, but it does not include a mechanism for comparing execution from different tests. inSight provides very powerful code browsing and abstraction capabilities, but has no direct way of importing or using trace data.

Metrowerks and CodeTEST are trademarks of Metrowerks Corp. Klocwork and inSight are trademarks of Klocwork, Inc.


We decided to use the TraceGraph tool from our Recon3 toolset as an intermediary to do the comparison step. The Recon3 toolset is a free academic product for tracing software [10]. TraceGraph, the trace comparison tool of Recon3, provides a visual image of a set of traces that lets the eye pick out differences between them [6].

The complete case study setup is thus as shown in Figure 1. To generate traces of execution, subjects used both CodeTEST and the Recon3 tracing tool. While Recon3 is an academic tool with all the limitations that implies, it was designed specifically to support feature location so we thought it would be interesting to compare its results with CodeTEST.

Figure 1 - Tools Used in the Case Study (the software engineer instruments the system and collects traces with CodeTEST or Recon3, which produce trace files; compares traces and locates the feature with TraceGraph; and explores the code and understands the feature with inSight)

Subjects in the case study ran test cases and used CodeTEST or Recon3 to produce traces, both with and without the feature being sought. Then they fed these traces to TraceGraph to visually identify differences between them, thus giving the marker code for the feature. Finally, they used inSight to explore the source code functions containing marker code and build a model of the feature, consisting of a diagram of the most important code elements and supporting text describing the way the feature works.


Case Study Goals and Organization

The case study had several goals. First, we wanted to see if feature location was possible using tools currently in use in Motorola and on a system comparable with Motorola's own software. We thus wished to evaluate two tool combinations:

1. CodeTEST + TraceGraph + inSight
2. Recon3 + TraceGraph + inSight

Our doubts concerned the usability of each tool in a context that its designers probably had not contemplated. As well, we wished to identify and, if possible, overcome expected difficulties in getting different tools to work together. A third question concerned the role of inSight. As far as we know, this study represents the first attempt to combine dynamic feature location with a powerful static analysis tool to achieve feature understanding. How well would that combination work?

A secondary goal of the case study was to explore the process of adopting feature location technology in an industrial setting. Software managers at other companies may not have the same tools used at Motorola, but could reasonably be interested in the process of technology adaptation, and especially in its cost. Thus the case study tried to record accurately the steps involved in adopting this new technology, as well as the person-hours expended and the efficiency of the resulting feature location process.

The case study required a realistic target system with a range of features to explore. Unfortunately both system complexity and hardware constraints made it impossible to study an actual Motorola system. Instead Motorola representatives and the researchers selected the Apache web server as an acceptable proxy [1]. This open-source system is large (about 227 KLOC raw line count), multi-process and in widespread use. Its architecture is similar to many Motorola systems in that it is based on message passing; as described in the HTTP protocol, Apache waits for an incoming message, processes it, and returns a response message. Apache is written in standard C so all of the case study tools were applicable. For the case study we ran Apache under Solaris on a dual processor (2 x 360 MHz) Sun Sparc Ultra 60 that was available at the University of West Florida.

The case study followed a three phase technology adoption process similar to that expected in industry:


1. Installation: New tools are installed along with existing tools. If necessary some "glue" code is written to tie them together appropriately. Trials are made using small programs.
2. Trial: The toolset is tried on a production system and any kinks are worked out.
3. Use: The toolset is used "live" as part of normal software change activity. Change requests are received, the relevant code is located, and it is analyzed before making the change.

To facilitate objective collection of data, work on the "use" phase of the case study was divided. One experimenter acted as controller. He prepared four plausible scenarios derived from the HTTP/1.1 protocol description [11], each with a change request that would affect a particular feature of Apache. Two other experimenters acted as subjects, playing the role of a software engineer attempting to resolve the change request. Each subject worked on all four scenarios, two with the Recon3 - TraceGraph - inSight combination and two with the CodeTEST - TraceGraph - inSight combination. Assignment and sequencing of scenarios was randomized to minimize any individual bias or learning effects.

Tracing Long-Running Systems

A particular issue arises in tracing systems such as those produced by Motorola that take some time to start up and shut down. For feature location it is obviously more convenient to start up, then run the tests with the feature and without, and only shut down at the end of the sequence. A feature test case thus corresponds to an interval of time (Figure 2).

The broad arrow in Figure 2 represents the target system (Apache in the case study) executing in time. Trace events occur continually as different software components are executed (small arrows). Trace event collection usually has to take place in a separate process, typically called something like a "trace manager". To run a test, the software engineer: (1) informs the trace manager that a test is starting; (2) sends a message to the target to start the feature; (3) receives back the response message; (4) tells the trace manager that the feature has completed.
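A rough sketch of this four-step protocol is shown below, assuming both the trace manager and the target accept simple TCP text commands; the host names, ports, and command strings are invented for illustration and do not correspond to any particular tool.

```python
# Illustrative four-step test protocol. Hosts, ports, and command strings are
# hypothetical; the point is the ordering of the interactions.
import socket

def send_line(host, port, message):
    """Send one message over TCP and return whatever the peer sends back."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(message.encode())
        return sock.recv(4096).decode()

send_line("tracemgr.example", 9000, "BEGIN test-1\n")        # (1) a test is starting
reply = send_line("target.example", 8080,                    # (2) trigger the feature
                  "GET /feature HTTP/1.1\r\nHost: target.example\r\n\r\n")
print(reply)                                                 # (3) response received
send_line("tracemgr.example", 9000, "END test-1\n")          # (4) feature has completed
```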


Figure 2 - Tracing a Long-Running System (the software engineer sends start and end messages to the target program and issues commands to the "trace manager", which collects trace events and writes the trace file)

Ideally the final trace file should contain just the events from the interval in which the feature test was executing, with as little extraneous data as possible. However a difficulty may arise in synchronizing the trace manager and the target system. Both processes can be quite machine intensive, and if either is allowed to get ahead of or behind the other the captured events may not accurately represent a true trace of the feature.

Using CodeTEST for Trace Collection

Considerable difficulties were encountered in using CodeTEST to generate the traces, mainly due to the problem of timing described in the previous section. It should be remembered that CodeTEST is intended as a test coverage and performance evaluation tool, not a feature location tool. Thus its design is not tuned for the production of traces tightly delimited in time.


Figure 3 - CodeTEST Architecture (on the target machine, the instrumented target processes pass trace events through a pipe to ctserver, with commands passed via shared memory; ctmanager communicates with ctserver over a network connection, consults the IDB, and is driven from Python; the user supplies the test data, and the Python control code was written for the case study)

The architecture of CodeTEST is roughly as shown in Figure 3. The target program (Apache in this study) is compiled with CodeTEST instrumentation. The instrumentation identifies each code element with a unique integer, and creates the instrumentation database (IDB) which maps these integers to source file, line number, etc.

As the target program executes, trace events are recorded. Approximately every 10 ms events are sent to the CodeTEST server (ctserver) through a pipe. The ctserver component can collect traces from several target processes simultaneously. Trace events are buffered until ctmanager stops the collection process and requests the data.

The CodeTEST manager (ctmanager) communicates with ctserver via a socket network connection to start tracing, to stop tracing, and to extract trace data. ctmanager can look up the identifying integers in the IDB and display the trace, coverage data, or performance data. It also has an application programmer interface (API) for the Python language [8] so that the user can write a Python program to control the whole process.

Our first design looked like Figure 3 and used a Python program that provided several commands to let the user control tracing manually. The user would:


1. instruct ctmanager to start tracing
2. run a test of the target program
3. instruct ctmanager to stop tracing
4. extract the trace data from ctmanager and save it to a file.
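The control loop looked roughly like the following sketch. The ctmanager Python API calls are represented here by placeholder wrapper methods (start_trace, stop_trace, dump_trace); the real CodeTEST API names are not reproduced, so everything in this block is illustrative.

```python
# A sketch of the first, manual control loop. The methods on the ctmgr object
# stand in for whatever calls the ctmanager Python API actually provides.

def run_manual_test(ctmgr, run_target_test, out_path):
    ctmgr.start_trace()            # 1. instruct ctmanager to start tracing
    run_target_test()              # 2. run one test of the target program
    ctmgr.stop_trace()             # 3. instruct ctmanager to stop tracing
    events = ctmgr.dump_trace()    # 4. extract the trace data ...
    with open(out_path, "w") as f: #    ... and save it to a file
        for event in events:
            f.write(f"{event}\n")
```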

While this worked acceptably on a small program in the installation phase of our study, it exhibited shortcomings as soon as we moved to programs that generate events very rapidly. With a higher volume of events, we found that the process of extracting the trace data for a single test generally took two minutes or more to complete.

The situation shown in Figure 2 thus has widely differing time scales. The target program is working at computer time scale and may be generating events at a rate of many thousands per second. The user is running tests on a human scale and comfortably takes a few seconds to a few minutes to set up and run a test. Standing between the two we have ctmanager, with delays of more than two minutes to collect events and turn them into Python/Java objects for analysis and display.

Using our first design, it was very hard to get consistent trace results. One run of a test would give us many more events than the next, probably because the user's commands to stop and start tracing were taking variable times to reach ctserver. It is possible that trace data was sometimes being lost in the stop and start process. Such variations can be overcome by repeating the feature many times and applying statistical methods to analyze the trace [3]. However the extra work required by that method is considerable so it would be desirable to come up with something better for routine software maintenance work.


Figure 4 - Trace Capture Using a Tag Process (a test driver, written for the case study, sends test data to the instrumented target programs and a "ping" to an instrumented tag process; ctserver collects trace events from both; ctmanager, with its Python interface and the IDB, produces the trace only after all testing is complete)

Our second design addressed these problems, as shown in Figure 4. Testing was automated using a simple test driver program to step through a sequence of tests, some with and some without the feature. To get around the ctmanager delays, its processing was deferred until the end of the test sequence by using tags inserted into the trace to separate the tests. The additional "tag" process shown in Figure 4 simply waited while listening on a socket. It was instrumented with CodeTEST just like the target program so that when it received a "ping" it generated a trace event.

The driver cycled through the following process:

1. send the test data for one test to the target program and wait for a response
2. wait for 5 seconds
3. send a "ping" to the tag program and wait for a reply
4. wait for 5 seconds
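A minimal sketch of such a driver is shown below, assuming the instrumented Apache and the tag process both listen on TCP sockets; the host names, ports, and request list are illustrative.

```python
# Minimal sketch of the automated test driver. Addresses and requests are
# placeholders; the structure mirrors the four-step cycle described above.
import socket
import time

TARGET = ("localhost", 8080)   # instrumented Apache
TAG = ("localhost", 9999)      # instrumented tag process

def exchange(addr, payload):
    with socket.create_connection(addr) as sock:
        sock.sendall(payload)
        return sock.recv(65536)

tests = [b"GET /with_feature.html HTTP/1.0\r\n\r\n",
         b"GET /without_feature.html HTTP/1.0\r\n\r\n"]

for request in tests:
    exchange(TARGET, request)   # 1. send the test data and wait for a response
    time.sleep(5)               # 2. let buffered trace events drain
    exchange(TAG, b"ping\n")    # 3. ping the tag process -> marker event in the trace
    time.sleep(5)               # 4. wait again before the next test
```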

The ctserver component is thus receiving trace events from both the target program and the tag program so there are known events in the trace to mark the beginning of each test. The ctmanager/Python code ran only after all testing was complete and generated a single long trace, which was then broken apart at the tags to get the trace segment for each test.
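The splitting step can be illustrated by the following sketch, which assumes the exported trace is one event per line and that events generated by the tag process can be recognized by the tag program's source file name; the file names and format details are assumptions, not the actual CodeTEST export format.

```python
# Split the single long trace at the tag events, one output file per test.
# The trace format and the "tag.c" marker are assumptions for illustration.

def split_at_tags(lines, tag_marker="tag.c"):
    """Yield one list of trace lines per test, using tag events as separators."""
    segment = []
    for line in lines:
        if tag_marker in line:
            yield segment          # end of one test's events
            segment = []
        else:
            segment.append(line)
    if segment:
        yield segment              # events after the last tag, if any

with open("full_trace.txt") as f:
    for i, segment in enumerate(split_at_tags(f)):
        with open(f"test_{i:02d}.trace", "w") as out:
            out.writelines(segment)
```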


Using Recon3 for Trace Collection

The Recon3 instrumentor and trace manager were designed with the goal of producing clear traces from a long running program, so it is not surprising that they showed none of the problems encountered with CodeTEST trace collection. On the other hand, Recon3 exhibited some of the robustness problems common in academic tools when faced with the large base of Apache code.

The Recon3 instrumentor for C does not do a complete parse, but instead simply scans the code for surface features indicating where instrumentation should be inserted. Instrumentor flags allow the user to specify function entry/return instrumentation or decision instrumentation.

Many files in the current version of Apache make very heavy use of macros and conditional compilation which are handled by the C pre-processor. These structures confused the instrumentor, especially when it was trying to identify function entry and return points. We found that we could only use decision instrumentation (of if, switch, while, for, etc.) in the study. That does not seem to have seriously hampered feature location, since all of the features we sought were large enough to contain at least one decision that was marked by TraceGraph. However inSight organizes data by C function, so the lack of function entry and return data required extra hand work by the case study subjects. They had to locate each marked decision in the code, find which C function it was in, and then navigate to that C function in inSight.
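As an illustration of that hand work, the following sketch maps a marker's (file, line) location to the enclosing C function using a naive regular expression for function definitions; it is only a heuristic, not a real C parser, and the example file name is illustrative.

```python
# Heuristic: find the last C function definition that precedes a marker line.
# The regex is deliberately simple and will miss unusual definition styles.
import re

FUNC_DEF = re.compile(r"^\s*[A-Za-z_][\w\s\*]*?\b([A-Za-z_]\w*)\s*\([^;]*\)\s*\{?\s*$")

def enclosing_function(source_path, marker_line):
    """Return the name of the last function definition seen before marker_line."""
    current = None
    with open(source_path) as f:
        for lineno, text in enumerate(f, start=1):
            if lineno > marker_line:
                break
            match = FUNC_DEF.match(text)
            if match:
                current = match.group(1)
    return current

# Example (illustrative path and line): enclosing_function("apr_date.c", 250)
```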

One dramatic difference between CodeTEST and Recon3 is in performance. We ran a few small performance tests to compare the two tools, with the test driver on the same node as the instrumented Apache to minimize network time delays. Table 1 shows a representative result for a series of tests in which Apache served up 100 different web pages.

Table 1 - The Performance Impact of Instrumentation - Execution Time to Serve a Web Page
                                          Uninstrumented   Instrumented      Instrumented
                                          Apache           with CodeTEST     with Recon3
Average time per page in microseconds     1709             2135              173,399
Relative time                             100%             125%              10,146%


As can be seen from the table, while CodeTEST was not really designed for feature location, Recon3 was not really designed to be fast. The wall-clock time to run the test increased by a factor of 100. This difference would not be significant in analyzing a conventional program in a laboratory setting using small test data sets; after all, the time to serve a single page is still less than a fifth of a second. However it could obviously be very important for a system with real-time constraints where missed deadlines could modify behavior.
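The performance comparison itself requires nothing more elaborate than timing a batch of page requests against each server build, along the lines of the following sketch; the URLs and page list are placeholders.

```python
# Time a batch of page requests against two builds of the server and report
# the relative cost, roughly in the spirit of Table 1. URLs are placeholders.
import time
import urllib.request

def average_page_time(base_url, pages):
    start = time.perf_counter()
    for page in pages:
        urllib.request.urlopen(base_url + page).read()
    return (time.perf_counter() - start) / len(pages)

pages = [f"/page{i}.html" for i in range(100)]
baseline = average_page_time("http://localhost:8080", pages)      # uninstrumented
instrumented = average_page_time("http://localhost:8081", pages)  # instrumented build
print(f"relative time: {100 * instrumented / baseline:.0f}%")
```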

Using TraceGraph for Trace Comparison

The TraceGraph tool, part of the free Recon3 tool suite, provides an intuitive and visual way for a software engineer to compare traces and spot the marker code for a feature.

The software engineer feeds TraceGraph first the traces without the feature, then a trace with the feature. TraceGraph displays each trace as a compact column of rectangles, each only a few pixels in size (Figure 5). Each horizontal row represents a specific trace event, such as the execution of a particular block of code. (An on-line demo of the TraceGraph tool can be viewed at http://www.cs.uwf.edu/~recon/recon3/r3wDemo.htm.)

Figure 5 - TraceGraph Screen Shots (the last column of rectangles is the trace with the feature)

Since the last column is the first trace with the feature, rectangles that only appear in that column are the "marker" code that was executed only when the feature was present. The software engineer can get a quick impression of the amount of marker code and of how it is distributed. He can click on a rectangle to get more information about the type of event, the source file, line number, etc.

It proved to be easy to convert the traces from both CodeTEST and Recon3 into a format that TraceGraph could read. However one annoying problem was that TraceGraph had no way of easily exporting the information about the marker code. The case study subjects had to click on each rectangle, copy the data from the pop-up window (Figure 5), click on the next rectangle, etc. They then opened up inSight to study each bit of marker code.
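The conversion was essentially a matter of rewriting each event as one line of text. The sketch below shows the general shape of such a converter, but the input fields and output layout are invented for illustration; the actual CodeTEST export and TraceGraph input formats are not reproduced here.

```python
# Purely illustrative converter: read exported trace events (assumed here to be
# comma-separated "file,line,event" records) and write one component identifier
# per line for the comparison tool. Neither format shown is the real one.
import csv

def convert(in_path, out_path):
    with open(in_path, newline="") as src, open(out_path, "w") as dst:
        for source_file, line, _event in csv.reader(src):
            dst.write(f"{source_file}:{line}\n")   # one component id per line

# Example (hypothetical file names): convert("test_01.events.csv", "test_01.trace")
```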

The rather tedious process of clicking on rectangles and copying information from the pop-up window took a significant fraction of the time required to study each feature. While the marker code was always a small fraction of Apache, there were often dozens of rectangles to be investigated. TraceGraph clearly needs a better way of exporting a list of marker code to another tool.

Using inSight for Feature Understanding

As mentioned previously, inSight is part of the Klocwork static analysis tool suite. It provides a graphic environment for browsing and analyzing large software systems (Figure 6). One application of inSight is architectural modeling of such a system, in which one or more high level diagrams are prepared to illustrate the architectural relationships between the many software components.

Since our purpose was to understand and document a specific system feature, we decided to create an architectural model showing just those code components that are relevant for one feature. While most program comprehension studies indicate that programmers tend to combine a top-down and a bottom-up approach (e.g. [14]), we found that the "custom diagram" facility of inSight let us use a nearly pure bottom-up method. (We would like to thank Neil Lillemark of Klocwork, Inc. for his assistance in defining the best way to use the inSight tool.) First the case study subjects created a new empty custom diagram with the right-hand panel of the display blank. The list of marker code from TraceGraph was then reduced to a list of the C functions containing the markers (since inSight works at the function level). Each of these functions was then located in the hierarchy of the left panel of inSight and dragged into the right panel. Then in the right panel, inSight shows each function as a rectangle and adds arrows to show the relationships between them (Figure 6).

Figure 6 - inSight Screen Shot

The case study subjects then used inSight to explore Apache's code until they felt they had a sufficient understanding of the feature. inSight allowed them to browse the code or a flowchart representation of it, to add functions, for example those that call the marker code, or to subtract functions if examination showed them to be irrelevant. Finally, the subject saved the resulting diagram and wrote a brief feature report containing the figure and an explanation of the feature. (See Appendix 1 for an example.)

Feature Location Results

The case study simulated the process of feature location in a company as part of normal software change activity. Table 2 lists the four scenarios of software change that were used. The scenarios are in the form of a problem report or change request related to a specific feature of Apache. The two case study subjects were not previously familiar with Apache code, though they were experienced in C programming and in networking.

Table 2 - Scenarios Used in the Case Study

Scenario 1: The OPTIONS * request currently provides several pieces of information about the http server. We want an additional piece of information to be added to the returned information. Specifically, we want the date of the next scheduled shutdown to be read from a file and returned.

Scenario 2: We assume that a security problem has been discovered in the code used to handle the ? syntax in a GET request. We want to add security measures to the code that will provide additional checks on the URI before allowing processing to continue.

Scenario 3: We assume that an error has been reported in the processing of certain date formats used in the If-Modified-Since: request structure. The error occurs only in the RFC 850 form but does not occur with the other two forms. We are looking for the code specific to the RFC 850 date form.

Scenario 4: Currently, the http server only accepts the 100-continue parameter to the Expect header as specified in RFC 2616. We want to add the ability to service an Expect: PRIORITY = HIGH header as well. Location of the code responsible for implementing the currently accepted header is needed in order to facilitate the addition of the new header.

As previously described, case study participants went through all the steps of analyzing a scenario: designing and running test cases, loading the resulting traces into TraceGraph, copying the identified functions into inSight, studying these functions and their context, and finally preparing a feature report to document their understanding. Appendix 1 provides an example of the whole process and its end result.

Subjects were asked to record approximately how much time it took to locate the feature up to the point where they would be fairly confident that they could successfully make the code change. They were also asked to estimate how much Apache code they had to scan and how much they studied in detail to complete their analysis.

The general results of the study are summarized in Table 3 and are consistent with the experience of earlier feature location case studies. The amount of code studied for each feature was quite small considering that the Apache system is over 200 KLOC. The time required to locate and understand a feature was variable, with about 3 to 4 hours being typical.

Table 3 - Feature Location Results


Scenario / Tool          Subject   Approx. hours    C files with   C functions with   Approx. LOC   Approx. LOC
                                   to complete      marker code    marker code        scanned       studied
Scenario 1 - CodeTEST       A          4.25               15              29              1000           500
Scenario 1 - Recon3         B          4.00                8              10               500           100
Scenario 2 - CodeTEST       A          3.00               24              81              4000           500
Scenario 2 - Recon3         B          4.00                9              18               500           200
Scenario 3 - CodeTEST       B          3.25                1               2                20             5
Scenario 3 - Recon3         A          1.00                1               2               200           100
Scenario 4 - CodeTEST       B          4.00                3               4               400           100
Scenario 4 - Recon3         A          2.00                2               2               200           100
Median                                 3.63             5.50            7.00               450           100

Only in one case, scenario 2, did the two subjects come to a different understanding of the feature. In the other scenarios similar diagrams were constructed in inSight with pointers to consistent locations in the code.

The problem with scenario 2 was not really in the tools, but rather the test cases. The scenario concerns a URI that specifies a cgi script and contains the query "?" character. One subject compared cgi URI's to non-cgi URI's and thus identified a large body of code related to cgi processing. The other subject compared cgi URI's containing a "?" to cgi URI's with no "?" and found a much smaller amount of code. The second approach was much more effective. This difference shows the importance of choice of test cases in using dynamic methods of feature location. When a first set of tests identifies a lot of marker code it is probably best to go back and refine the test cases to see what additional code can be subtracted.
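For instance, the refined test pair for this scenario might look something like the sketch below (the URIs and server address are illustrative); both requests exercise the cgi path, so only the query-string handling remains in the set difference.

```python
# Illustrative test requests for scenario 2. The coarse pair differs in whether
# cgi is used at all; the refined pair differs only in the presence of "?", so
# diffing their traces isolates the query-string handling code. All URIs and
# the host/port are placeholders.
import urllib.request

def run_test(uri):
    return urllib.request.urlopen("http://localhost:8080" + uri).read()

coarse_pair  = ("/index.html",       "/cgi-bin/test.cgi?name=value")
refined_pair = ("/cgi-bin/test.cgi", "/cgi-bin/test.cgi?name=value")

for uri in refined_pair:
    run_test(uri)   # run each request while tracing, then diff the two traces
```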

Conclusions

This paper addresses a practical problem in the software industry: how can new technology be introduced without the overhead and uncertainty of introducing a new tool for each case? It is obviously attractive to leverage existing and robust tools that are known to work in a particular industrial environment, but that requires adapting them to a new application. We hypothesized a three stage process of tool adaptation: installation, trial and use. As is often the case, a design that worked on small examples in our installation phase failed to scale up when applied to real software in the trial phase. In adopting a technology, it is important to allow time for both kinds of experimentation.

Table 4 - Approximate Person-Hours for the Installation and Trial Phases of the Case Study
                     CodeTEST   Klocwork   Recon3   Other   Total
Installation hours       77         18        8        4     107
Trial hours              67          0        2        7      76

Table 4 shows the approximate time required for the technology adaptation process to get the two tool combinations working. The effort required by another company or using different tools would obviously be different since the time is strongly affected by the depth of previous experience with each tool. However an investment of around 180 person hours would seem to be reasonable to bring a new technology into a large project.

The case study showed that both the tool combinations (CodeTEST plus TraceGraph plus inSight and Recon3 plus TraceGraph plus inSight) were effective in locating features in code typical of the Motorola domain. The main differences between the commercial CodeTEST instrumentor and the free Recon3 were in the much greater efficiency of CodeTEST instrumentation and its better robustness in dealing with the C preprocessor. For systems in which timing behavior is important, CodeTEST is clearly the better alternative but for other systems being tested manually the delays introduced by Recon3 may not be significant.

As shown in Table 3, the time required to locate and understand a feature was typically around 3 to 4 hours, which would seem to be quite fast for a software engineer dealing with an unfamiliar system of over 200 KLOC. Better tool integration could improve this time still further, since we observed informally that around half the time was spent hand-transferring the TraceGraph markers into inSight.

As far as we know, this is the first study to combine dynamic analysis for feature location with a high-end static analysis tool for feature understanding. The combination seems to work quite well. Dynamic analysis is a powerful tool for locating a few feature markers in the code, but it only shows what happened on the specific test cases that were run.


For a software engineer, there is obviously some danger in making a code change based only on the dynamic picture; the code to be changed may be called from other locations in other contexts so a fix in one place might break code in another. Static analysis, using a tool such as inSight, allows the software engineer to see the context of each of these marker fragments, and also shows code relationships from all possible executions, not just those from the specific test cases.

Another advantage of a tool such as inSight is the ability to document a feature graphically. Since documentation is costly and often becomes obsolete, it is frequently missing in legacy systems. One approach is progressive redocumentation, in which new documentation is written as each change is made, thus codifying knowledge about those parts of the system that change most frequently [9]. The diagrams produced by inSight would make it easy to generate such documentation on a feature-by-feature basis.

Based on our initial results, Motorola, SERC and the University of West Florida have started a follow-on project to improve the integration of these tools and the over-all user-friendliness of the feature location process. The long-term goal is to see if feature location can become a useful part of the software evolution process at Motorola.

It is interesting to speculate on what might be achieved by a tool that integrated the dynamic and static views more completely. It would be interesting to see, for example, the execution of one specific trace as an overlay on the static view, perhaps by turning on and off color highlighting of the graph. For multi-process systems such as those developed at Motorola, the trace data could also allow an overlay showing a specific process or thread.

In a recent article, Spinellis has pointed out that the profession may have forgotten the art of writing tools that do one job well but can also interact with each other [13]. The rich graphic user interfaces provided by modern tools undoubtedly have value, but attention to interaction with human users may have come at the expense of attention to interaction with other tools. This study certainly encountered some difficulties of this nature, notably the lack of any easy export facility from TraceGraph that would allow it to interface with inSight. We hope that future developers of commercial tools will give some thought to improving the ways that dynamic and static analysis can work together.


References

[1] "The Apache Software Foundation", http://www.apache.org/ URL current as of August, 2006.

[2] Bennett K, Rajlich V, Wilde N, "Software Evolution and the Staged Model of the Software Lifecycle", in Advances in Computers, ed. Marvin Zelkowitz, Vol. 56, pp. 1 - 54 (2002).

[3] D. Edwards, S. Simmons, N. Wilde, "An approach to feature location in distributed systems", in press, to appear in the Journal of Systems and Software.

[4] Eisenbarth, T., Koschke, R., Simon, D., "Incremental location of combined features for large-scale programs." In: Proceedings of the International Conference on Software Maintenance - ICSM 2002, Montreal, Canada, pp. 273-282. (2002)

[5] Klocwork: Automated solutions for understanding and perfecting software, http://www.klocwork.com/products/ URL current as of August, 2006

[6] Lukoit, K, Wilde, N., Stowell, S., Hennessey, T., "TraceGraph: Immediate Visual Location of Software Features." In Proceedings of the International Conference on Software Maintenance ICSM 2000, pp. 33 - 39, October, 2000.

[7] Metrowerks CodeTEST Software Analysis Tools, http://www.metrowerks.com/MW/Develop/AMC/CodeTEST/default.htm URL current as of August, 2006.

[8] What is Python? http://www.python.org/doc/Summary.html, URL current as of August, 2006.

[9] Rajlich, V, "Incremental Documentation Using the Web", IEEE Software, Vol. 17, No. 5, September/October 2000, pp. 102 - 106.

[10] Recon Tools for Software Engineers, http://www.cs.uwf.edu/~recon/, URL current as of August, 2006.

[11] RFC 2616, Hypertext Transfer Protocol - HTTP/1.1, http://www.w3.org/Protocols/rfc2616/rfc2616.html, URL current as of August, 2006


[12] Simmons S, Edwards D, Wilde N, Homan J, Groble M, Using Industrial Tools for Software Feature Location and Understanding, SERC Technical Report SERC-TR-275, August, 2005, http://www.serc.net/report/tr275.pdf, URL current as of August, 2006.

[13] Spinellis D. "Tool Writing: A Forgotten Art?", IEEE Software, Vol. 22, No. 4, July/August 2005, pp. 9-11.

[14] Von Mayrhauser, Annelise and Vans, A. Marie, "Program Comprehension During Software Maintenance and Evolution", IEEE Computer, Vol. 28, No. 8, August, 1995, pp. 44 - 55.

[15] Wilde, N., Scully, M., "Software reconnaissance: Mapping program features to code." Journal of Software Maintenance: Research and Practice, Vol. 7, pp. 49-62 (1995)

[16] Wilde, N., Casey, C., "Early field experience with the software reconnaissance technique for program comprehension." In: Proceedings of the International Conference on Software Maintenance - ICSM-96, Monterey, California, pp. 312-318. (1996)

[17] Wilde, N., Casey, C., Vandeville, J., Trio, G., Hotz, D., "Reverse engineering of software threads: A design recovery technique for large multi-process systems." Journal of Systems and Software, Vol. 43, pp. 11-17 (1998)

[18] Wilde, N., Buckellew, M., Page, H., Rajlich, V., "A case study of feature location in unstructured legacy Fortran code." In: Proceedings of the Fifth European Conference on Software Maintenance and Reengineering - CSMR'01, IEEE Computer Society, pp. 68-76. (2001)

[19] Wong W.E., Gokhale, S.S., Horgan, J.R., "Metrics for quantifying the disparity, concentration, and dedication between program components and features." In: Sixth IEEE International Symposium on Software Metrics, p. 189. (1999)

[20] Wong, W. E., and Gokhale, S., "Static and dynamic distance metrics for feature-based code analysis", Journal of Systems and Software, vol. 74 (2005), pp. 283-295.


Appendix 1 Sample Scenario and Feature Report from the Case Study
Scenario 3 - Date

Section 3.3.1 of RFC 2616 (HTTP 1.1 Protocol) allows several different formats for dates:

Sun, 06 Nov 1994 08:49:37 GMT     ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT    ; RFC 850, obsoleted by RFC 1036
Sun Nov  6 08:49:37 1994          ; ANSI C's asctime() format

See http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.3.1 for details.

Suppose a bug has been encountered in processing the RFC 850 form that does not seem to occur with the other two forms. The bug has specifically been seen when the If-Modified-Since header (section 14.25 of RFC 2616) is used, but may also be occurring in other headers. It may have something to do with the processing of two-digit years, which occur in this date form but not in the others.

Where in Apache is the code that is specific to the RFC 850 date form? Alternatively, if there is no code specific to this form, where is the code for the If-Modified-Since header and how does it work?

Current test case for this feature:

Open socket to wahoo.cs.uwf.edu port 7348 (or 2548 for CodeTest)

Then send:

GET /~dswg/simple_page1.html HTTP/1.1
Host: wahoo.cs.uwf.edu
If-Modified-Since: Sunday, 06-Nov-94 08:49:37 GMT

Current result is: The specified page is returned.


Feature Report

Scenario: 3 (DATE)

We suppose that an error has been reported in the processing of certain date formats used in the "If-Modified-Since:" request structure. The error occurs in only one format, specifically the one described in RFC 850 (obsoleted by RFC 1036). We hypothesize that the error may be occurring in the code using the two-digit year provided in RFC 850 compliant dates. We need to locate the code which handles the RFC 850 requests.

By, date: Dennis Edwards, 03/31/2005

Times:

Task                                        No. of Software Engs   Elapsed Time (hours)   Person-Hours
Running tests and collecting traces                   1                    0.25                0.25
Analyzing traces and studying the feature             1                    0.75                0.75

Tools used: RECON instrumentor, TraceGraph, and Klocwork

Test cases used: I ran four test cases using the driver program in ~dswg/StudyDocuments/Dennis/Scenario3/Driver/driver.c as follows.

1. GET ... If-Modified-Since: Fri, 11 Mar 2005 10:24:42 GMT
2. GET ... If-Modified-Since: Fri, 11 Mar 2005 10:24:42 GMT
3. GET ... If-Modified-Since: Friday, 11-Mar-05 10:24:42 GMT
4. GET ... If-Modified-Since: Friday, 11-Mar-05 10:24:42 GMT

The execution created five partial trace files which were stored in the ~dswg/StudyDocuments/Dennis/Scenario3/Traces directory.

1. tr00000.r3t : Fri, 11 Mar 2005
2. tr00001.r3t : Fri, 11 Mar 2005
3. tr00002.r3t : Friday, 11-Mar-05
4. tr00003.r3t : Friday, 11-Mar-05
5. tr00004.r3t : termination code after last test case

Marker code identified: 6 decisions in 2 functions (apr_date_parse_http() and apr_date_checkmask()) in a single source file. Marker code functions are tagged with A in the Klocwork diagram below.

Summary description of the feature and the intended change: As shown in the diagram, the default_handler() calls ap_meets_conditions() to determine if the file meets the conditions specified. That function in turn calls apr_date_parse_http() which calls apr_date_checkmask() to obtain the answer. The date parameter is parsed in the apr_date_parse_http() function so our investigation should begin at that point. KlocWork identified apr_date_parse_rfc() as the only other function which calls apr_date_checkmask(). While the test cases didn't identify the apr_date_parse_rfc() function as being used in the feature, it should be examined before any changes are made that could alter the functionality of apr_date_checkmask().

Code visited: Estimate of LOC scanned: 200 Estimate of LOC studied in some detail: 100

Notes: This function didn't use any utility functions so the simple test cases were sufficient. A small number of functions were identified as important and the identification turned out to be accurate.


Klocwork feature diagram:
