Software Reverse Engineering Education

http://www.reversingproject.info

Teodoro Cipresso, tcipress@hotmail.com
San José State University, Spring 2009
Advisor: Dr. Mark Stamp
Committee: Dr. Robert Chun, Dr. David Taylor

Background Information
Introduction to Software Reverse Engineering

Software Reverse Engineering (SRE) can be described as the practice of analyzing a software system to create abstractions that identify the individual components and their dependencies, and, if possible, the overall system architecture [1]. Once the components and design of an existing system have been recovered, it becomes possible to repair and even enhance them. Reverse engineering skills are also used to detect and neutralize viruses, worms and other malware, as well as to protect intellectual property [1].

Importance of SRE Education

More emphasis is needed in SE [and CS] undergraduate and graduate programs on the issue of software evolution and change. Students need to be educated on the theory and practice of software comprehension, maintenance and reengineering. They need to learn how to live with the monsters from the past and tame them [2].

Most of the time, students are trained in developing very small programs starting from scratch. This approach is really misleading since most students learn to believe that software engineering is just about developing brand new software. In fact many students will be involved in evolution-related activities after completion of their studies [3].

Student Feedback on SRE Education

Incorporation of software reverse engineering techniques and methodologies into regular course work was tried at the University of Missouri-Rolla [1]. The results of this experiment were quite positive:

77% of students thought that the incorporation of SRE techniques and methodologies reinforced concepts taught during lectures.
82% of students wanted SRE to be included in future courses, especially those that deal with software design.

Development-Related Reversing Scenarios

Figure 1. Development-related software reverse engineering scenarios.

Security-Related Reversing Scenarios

Figure 2. Security-related software reverse engineering scenarios.

Legacy Software Development Process

Figure 3. Software development process in a typical enterprise software system.

Project Overview
Baseline Education in Software Reverse Engineering
Computer programmers with an improved ability to understand, evolve, and secure software.

Educate programmers on software reversing, antireversing, and patching

Educate programmers on software reengineering and reuse

Educate programmers on software security and malware detection

Figure 4. Activities related to providing a baseline SRE education.

Materials and Methods

More than ten peer-reviewed articles on the topics of software reverse engineering, re-engineering, maintenance, reuse, and security were selected and used to address the research questions. Of the articles selected, three were chosen for their specific coverage of experiences with teaching courses in software reversing, reengineering, and maintenance. I also drew upon my experience, just shy of a decade, designing and developing legacy software modernization tools at IBM.

Results
Overview of Developed SRE Course Modules

Reversing and Patching Wintel Machine Code
Reversing and Patching Java Bytecode
Applying Anti-Reversing Techniques to Machine Code
Applying Anti-Reversing Techniques to Java Bytecode
Reengineering and Reuse of Legacy Software
Identifying, Monitoring, and Reporting Malware

Reversing and Patching Wintel Machine Code

An introduction to the compilation of high-level languages to machine code is provided. Assembly is contrasted as having a one-to-one mapping to machine code. The negative results of experimentation with two decompilers (Boomerang and REC) for machine code are documented. Given the current state of decompiler technology, it was concluded that working with disassembly is the most feasible approach. A Wintel machine code reversing and patching exercise was developed against Password Vault, a non-trivial application that is provided with the exercise to avoid any legal concerns with reversing software written by others.

The machine code reversing and patching exercise asks the learner to create a new executable version of the application that no longer has a trial limitation of five password records per user. A reliable and repeatable reversing strategy is used: place a breakpoint on a memory artifact and trace back through the stack frames to locate the relevant section in the disassembly. For instructional purposes, an animated solution that demonstrates the application of this reversing strategy using OllyDbg, an interactive debugger-disassembler, was developed using Qarbon Viewlet Builder.

Figures 5 to 14. Animated solution to the Wintel reversing and patching exercise.

Idea for an advanced Wintel machine code (**) exercise:

It should be feasible to patch additional functionality into the Password Vault machine code:

The GCC compiler can generate assembly language instead of machine code, so the programmer can work in a high-level language. Patching in the generated assembly code would require a significant amount of time in the program understanding phase. Final integration of the new code would require modifying the Windows PE header to increase the size of the code section (typically .text), as well as the .rdata and .data sections if new variables and constants are added.

Reversing and Patching Java Bytecode

An introduction to interpreted/intermediate executable formats such as Java bytecode is provided. These formats are contrasted with machine code and assembly language. Java bytecode disassembly using javap is covered for help with analysis of bytecode generated by javac. The positive results of experimentation with the Jad Java bytecode decompiler are documented; it is concluded that direct reading/writing of bytecode is not necessary. A Java bytecode reversing and patching exercise was developed against a Java version of Password Vault.

The Java bytecode reversing and patching exercise asks the learner to create a new executable version of the application that no longer has a trial limitation of five password records per user. Since the Password Vault application consists of a small number of classes in a single package, a simple reversing strategy of unpacking the Jar archive, batch decompiling the classes, modifying the generated Java source, and recompiling is used. For instructional purposes, an animated solution that demonstrates the application of this reversing strategy using FrontEnd Plus, a graphical interface to Jad, was developed using Qarbon Viewlet Builder.

Figures 15 to 22. Animated solution to the Java bytecode reversing and patching exercise.

Idea for an advanced Java bytecode (**) exercise:

Use available Java class libraries, such as jclasslib, to directly read and write Java bytecode.

Write a Java program that scans through the bytecode for the Java Password Vault application and locates the instructions for the trial limitation. Once the instructions are located, overwrite them with a sequence that disables the trial limitation. This can be good practice for getting a feel for writing code that patches an executable.
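
The scan-and-overwrite step can be sketched on a raw byte array. This is a minimal illustration rather than a jclasslib program: the class name `BytePatcher` and the byte sequences in `main` are hypothetical, chosen on the assumption that the limit check compiles to `iconst_5` followed by `if_icmpge`. The replacement has the same length as the pattern, so offsets and attribute lengths in the class file stay unchanged; whether a patched class still passes bytecode verification has to be checked separately.

```java
/** Sketch: find a byte pattern in a buffer and overwrite it in place. */
final class BytePatcher {

    /** Returns the index of the first occurrence of pattern in data, or -1. */
    static int indexOf(byte[] data, byte[] pattern) {
        if (pattern.length == 0) {
            return -1;
        }
        for (int i = 0; i + pattern.length <= data.length; i++) {
            int j = 0;
            while (j < pattern.length && data[i + j] == pattern[j]) {
                j++;
            }
            if (j == pattern.length) {
                return i;
            }
        }
        return -1;
    }

    /** Overwrites the first match with a same-length replacement. */
    static boolean patch(byte[] data, byte[] pattern, byte[] replacement) {
        if (pattern.length != replacement.length) {
            throw new IllegalArgumentException("replacement must keep the length");
        }
        int at = indexOf(data, pattern);
        if (at < 0) {
            return false;
        }
        System.arraycopy(replacement, 0, data, at, replacement.length);
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical method body: iload_1, iconst_5, if_icmpge +7, return.
        byte[] code = {0x1b, 0x08, (byte) 0xa2, 0x00, 0x07, (byte) 0xb1};
        // Assumed limit check: iconst_5 followed by if_icmpge.
        byte[] check = {0x08, (byte) 0xa2, 0x00, 0x07};
        // Same length: iconst_5, pop2 (discard both operands), nop, nop.
        byte[] bypass = {0x08, 0x58, 0x00, 0x00};
        System.out.println("patched: " + patch(code, check, bypass));
    }
}
```

In an actual exercise the buffer would be the contents of a class file, and the pattern would come from inspecting the javap disassembly of the method that enforces the limit.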

Applying Anti-Reversing Techniques to Machine Code

A brief introduction to basic anti-reversing techniques is provided: Eliminating Symbolic Information, Obfuscating the Program, and Embedding Anti-Debugger Code. Machine code typically has very little symbolic information that can be eliminated altogether; therefore, a discussion illustrates how debuggers insert quite a bit of information that makes machine code easier to reverse. The technique Obfuscating the Program is demonstrated in a Wintel machine code anti-reversing exercise where data, computation, and control flow obfuscations are applied to the C++ source code for Password Vault.

Commercial tools such as EXECryptor (www.strongbit.com) fully obfuscate and pack Windows executables using advanced algorithms that are based on the elementary techniques described in this module. It is difficult to provide a before-and-after illustration of machine code that is obfuscated using EXECryptor, so the examples and exercise in this module are implemented first at the source code level and then confirmed in the machine code using live and static analysis.

In the case of control-flow obfuscation, only static analysis is used, where subsequent run traces are compared using an edit-distance measurement.

The Wintel machine code anti-reversing exercise asks the learner to create a new executable version of the Password Vault application where the following transformations are applied:

Encryption of string literals (data obfuscation).
Obfuscation of the numeric representation of the password record limit (computation obfuscation).
Obfuscation of the method that performs the record limit check (control flow obfuscation).

Encryption of String Literals (data obfuscation):

Figure 23. Strings are decrypted each time they are used using a bundled cipher.
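
The idea of keeping literals encrypted and decrypting them at the point of use can be sketched as follows. The exercise applies this to the C++ source of Password Vault; the sketch is written in Java only to illustrate the technique. The XOR routine stands in for the bundled cipher, which is not specified here, and the class name `StringVault`, the key value, and the sample literal are all hypothetical.

```java
/** Sketch: keep a literal XOR-encrypted and decrypt it only when it is used. */
final class StringVault {

    // Hypothetical key; a real obfuscator would derive or hide it.
    private static final char KEY = 0x5A;

    // "OK" XOR-ed with KEY ahead of time, so the plain text is not stored.
    static final char[] ENC_OK = {0x15, 0x11};

    /** XOR is its own inverse, so one routine both conceals and reveals. */
    private static char[] xor(char[] in) {
        char[] out = new char[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = (char) (in[i] ^ KEY);
        }
        return out;
    }

    /** Used at build time to produce the encrypted form of a literal. */
    static char[] conceal(String plain) {
        return xor(plain.toCharArray());
    }

    /** Used at run time, each time the text is needed. */
    static String reveal(char[] encrypted) {
        return new String(xor(encrypted));
    }

    public static void main(String[] args) {
        System.out.println(reveal(ENC_OK));
    }
}
```

A single-byte XOR is trivial to break once the routine is found; it only keeps the text out of a plain string dump of the executable.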

Obfuscation of the numeric representation of the password record limit (computation obfuscation):

Figure 24. Complex evaluations obscure the actual condition.

Figure 25. Testing for a function of a number can slow a reverser down.
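
A minimal sketch of computation obfuscation for the record limit is shown here, under the assumption that the plain check is `count >= 5`. The class `LimitCheck`, the constants `0x2F` and `0x2A`, and the function `3x + 2` are hypothetical choices for illustration, not the ones used in the module. The limit is never written as a literal, and both sides of the comparison are passed through the same strictly increasing function, so the result is unchanged.

```java
/** Sketch: the same record-limit test, written plainly and in obfuscated form. */
final class LimitCheck {

    static boolean plain(int count) {
        return count >= 5;
    }

    static boolean obfuscated(int count) {
        int a = 0x2F;
        int b = 0x2A;
        int hidden = a ^ b;          // evaluates to 5 at run time
        // 3x + 2 is strictly increasing, so comparing the images of both
        // sides gives the same answer as comparing count with the limit.
        long lhs = 3L * count + 2;
        long rhs = 3L * hidden + 2;
        return lhs >= rhs;
    }

    public static void main(String[] args) {
        for (int count = 3; count <= 6; count++) {
            System.out.println(count + " -> " + obfuscated(count));
        }
    }
}
```

An optimizing compiler may fold such expressions back into a constant, so the effect has to be confirmed in the generated code, as the module does for its own examples.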

Obfuscation of the method that performs the record limit check (control flow obfuscation):

We introduce some non-essential, recursive, and randomized logic to the password limit check to make it more difficult for a reverser to perform static and/or live analysis. Since no standards exist for control flow obfuscation, a custom algorithm was designed to hinder live and static analysis through use of recursive and randomized procedure calls.
Recursion grows the stack considerably, making stepping through the code difficult, while randomization makes execution unpredictable (breakpoints may not trigger and run traces differ).
Depth of the recursion is randomized on each check of the limit.

Random procedure call targets generate and return a number that is added to an instance variable, preventing the procedures from being identified as NOOPs by a code optimizer.
Figure 26. A control flow obfuscation algorithm for the record limit check.
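
The ingredients described above (randomized recursion depth, randomly chosen decoy calls, and decoy results folded into an instance variable) can be combined as in the following sketch. It is not the algorithm of Figure 26, which is not reproduced in this text, and it is written in Java for illustration although the exercise targets C++. The class name `ObfuscatedLimit`, the decoy methods, and the depth bound of 64 are hypothetical.

```java
/** Sketch: a limit check wrapped in randomized, recursive decoy logic. */
final class ObfuscatedLimit {

    private static final java.util.Random RNG = new java.util.Random();

    // Absorbs decoy results so the decoy calls are not dead code.
    private long noise;

    private int decoyA() {
        return RNG.nextInt(100);
    }

    private int decoyB() {
        return 2 * RNG.nextInt(100);
    }

    /** Same answer as count >= 5, reached by a different path on every call. */
    boolean atLimit(int count) {
        int depth = 1 + RNG.nextInt(64);   // recursion depth varies per check
        return descend(count, depth);
    }

    private boolean descend(int count, int depth) {
        // A randomly selected decoy runs at every level of the recursion.
        noise += RNG.nextBoolean() ? decoyA() : decoyB();
        if (depth == 0) {
            return count >= 5;             // the only essential statement
        }
        return descend(count, depth - 1);
    }

    long noise() {
        return noise;
    }

    public static void main(String[] args) {
        ObfuscatedLimit check = new ObfuscatedLimit();
        System.out.println(check.atLimit(4));
        System.out.println(check.atLimit(5));
    }
}
```

Because the depth and the decoy targets change on every call, two run traces of the same check on identical input are unlikely to match line for line.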

To measure the effectiveness of the control flow algorithm in hindering analysis, three execution traces of the section of the code containing the record limit check were compared. The Levenshtein Distance (LD) was computed between the three traces where each instruction in the trace was compared. LD was modified to consider each line as opposed to each character. The execution traces were collected using OllyDbg and had to be cleaned of disassembly artifacts such as line numbers, base addresses, and comments in order to ensure that the analysis was fair.
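
A line-level Levenshtein distance of the kind described can be sketched as follows: the standard dynamic-programming recurrence is kept, but the unit of comparison is a whole trace line rather than a character. The class name `TraceDistance` and the sample instructions are hypothetical, and the traces are assumed to have been cleaned of addresses and comments beforehand.

```java
/** Sketch: Levenshtein distance where each trace line is one symbol. */
final class TraceDistance {

    static int distance(String[] a, String[] b) {
        int[] prev = new int[b.length + 1];
        int[] curr = new int[b.length + 1];
        for (int j = 0; j <= b.length; j++) {
            prev[j] = j;                       // build b's prefix from nothing
        }
        for (int i = 1; i <= a.length; i++) {
            curr[0] = i;                       // delete all of a's prefix
            for (int j = 1; j <= b.length; j++) {
                int substitution = a[i - 1].equals(b[j - 1]) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + substitution);
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length];
    }

    public static void main(String[] args) {
        String[] first = {"push ebp", "mov ebp, esp", "ret"};
        String[] second = {"push ebp", "call helper", "mov ebp, esp", "ret"};
        System.out.println(distance(first, second));
    }
}
```

A distance of zero between two traces means the runs executed the same instruction sequence; larger values indicate how far the obfuscation made the runs diverge.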

Figure 27. Comparison of executions of record limit check on identical program input.

The Wintel anti-reversing module also demonstrates source code obfuscation, which is a useful anti-reversing technique when source code must be distributed. There may be a requirement to ship the source code of an application so that the machine code can be generated on the end user's computer.

If the source code contains intellectual property that is worth protecting, one can perform transformations to the source code which make it difficult to read, but have no impact on the machine code that would ultimately be generated when the program is compiled.

Demonstration of the COBF source code obfuscator:


VerifyPassword.cpp:

    #include <iostream>
    #include <string>
    using namespace std;

    int main(int argc, char *argv[])
    {
        const char *password = "jup!ter";
        string specified;
        cout << "Enter password: ";
        getline(cin, specified);
        if (specified.compare(password) == 0)
        {
            cout << "[OK] Access granted." << endl;
        } else
        {
            cout << "[Error] Access denied." << endl;
        }
    }

COBF invocation:

    C:\cobf_1.06\src\win32\release\cobf.exe @C:\cobf_1.06\src\setup_cpp_tokens.inv -o cobfoutput -b -p C:\cobf_1.06\etc\pp_eng_msvc.bat VerifyPassword.cpp

COBF obfuscated source for VerifyPassword.cpp:

    #include"cobf.h"
    ls lp lk;lf lo(lf ln,ld*lj[]){ll ld*lc="\x6a\x75\x70\x21\x74\x65\x72";lh la;
    lb<<"\x45\x6e\x74\x65\x72\x20\x70\x61\x73\x73\x77\x6f\x72\x64""\x3a\x20";li(lq,la);
    lm(la.lg(lc)==0){lb<<"\x5b\x4f\x4b\x5d\x20\x41" "\x63\x63\x65\x73\x73\x20\x67\x72\x61\x6e\x74\x65\x64\x2e"<<le;}
    lr{lb<<"\x5b\x45\x72\x72\x6f\x72\x5d\x20\x41\x63\x63\x65\x73\x73\x20\x64" "\x65\x6e\x69\x65\x64\x2e"<<le;}}

COBF generated header (cobf.h):

    #define ls using
    #define lp namespace
    #define lk std
    #define lf int
    #define lo main
    #define ld char
    #define ll const
    #define lh string
    #define lb cout
    #define li getline
    #define lq cin
    #define lm if
    #define lg compare
    #define le endl
    #define lr else

Applying Anti-Reversing Techniques to Java Bytecode

While experiments with decompiling machine code were not successful, decompilation of Java bytecode to Java source code yielded acceptable results. Given these results, one does need to be concerned with protecting Java bytecode from decompilation if there is significant intellectual property in the program.

Obfuscating bytecode is inherently easier than obfuscating source code because bytecode has a significantly stricter and more organized representation than source code.

Variable, class, and method names are all left intact when compiling Java source code to Java bytecode. This is a stark difference from machine code, where variable and local method names are not preserved. A high level of protection can be achieved for Java bytecode by applying three transformations: Name Obfuscation, String Encryption, and Control Flow Obfuscation. Zelix Klassmaster, a commercial product, is capable of performing all three. Unfortunately, no open-source or free tool exists that can perform all three.

The trial version of Zelix Klassmaster is restricted to 30 days, and the company will only e-mail a trial version to non-free e-mail addresses. Not much is learned by having everything done for us, so this module sees how far one can get with open-source and free software.

ProGuard and RetroGuard are free Java bytecode obfuscators capable of Name Obfuscation.
SandMark, a Java bytecode watermarking and obfuscation tool from the University of Arizona, is capable of String Encryption and some weak control flow obfuscations.

A Java bytecode anti-reversing exercise was developed against the Java version of Password Vault.

Since the learner will have already experienced manually applying obfuscations in the Wintel machine code anti-reversing exercise, this exercise focuses on the use of tools.
In the exercise, it is expected that the Java bytecode for the Password Vault application will be incrementally obfuscated using two or more tools. For instructional purposes, an animated solution that demonstrates obfuscating the Password Vault Java bytecode to the point of inhibiting decompilation was developed using Qarbon Viewlet Builder.

Figures 28 to 50. Animated solution to the Java bytecode anti-reversing exercise.

Reengineering and Reuse of Legacy Software

The question of whether to reengineer or reuse components of a software system most often arises in the context of large business or government organizations. Over time the processes and procedures of a business or organization will inevitably be reflected in the software systems that enable efficient, day-to-day operations [5].

While reverse engineering of legacy software is inherently intractable, some of us will inevitably find ourselves in a situation where no other option is available because the cost of rewriting a large, complex software system is prohibitive [6].

If good development practices were followed, legacy software is typically composed of three layers [5]:

Figure 51. Layers of a well-structured legacy software application.

Legacy applications that are not sufficiently componentized, such that their general organization resembles the three layers, are not good candidates for reengineering and reuse. The most widely accepted technique to reuse legacy application components is that of Wrappering [5], where a new piece of code provides an interface to a legacy application component or layer without requiring code changes to it. Typically, candidate applications should be well-structured such that the business logic can be isolated, encapsulated, and made into reusable components.

Unless enough of an application's source code remains such that it's possible to identify the names of reusable entry points (procedures) and their I/O data structures, attempting to reuse the application may be difficult. While it is possible to learn the names of entry points that have been explicitly exported by an application in the case of a DLL, the names don't indicate the layout of the expected I/O data structures. One way to discover the entry points and I/O data structures in legacy machine code is to read the source code of other applications which depend on it.

The COBOL programming language is most often associated with legacy software applications.

Normally, COBOL programs have a single entry point; additional alternate entry points are rare.
Legacy COBOL programs often include functional discriminators in their I/O data structures.

Figure 52. Mapping legacy functional discriminators to an object-oriented design.
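
One way to read this mapping is that each value of the discriminator field becomes its own operation object, so callers no longer pass a code into a single entry point. The sketch below assumes a one-character discriminator with the values '+', '-', and '*', as in the calculator listing of this module; the names `CalcOperation` and `CalcOperations` are hypothetical and the design shown is only one possible interpretation of the figure.

```java
/** Sketch: one operation type per value of the legacy discriminator. */
interface CalcOperation {
    long apply(int left, int right);
}

final class CalcOperations {

    /** Translates a legacy discriminator value into an operation object. */
    static CalcOperation forDiscriminator(char code) {
        switch (code) {
            case '+':
                return (left, right) -> (long) left + right;
            case '-':
                return (left, right) -> (long) left - right;
            case '*':
                return (left, right) -> (long) left * right;
            default:
                throw new IllegalArgumentException("unknown discriminator: " + code);
        }
    }

    public static void main(String[] args) {
        System.out.println(forDiscriminator('+').apply(2, 3));
    }
}
```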

In a real-world situation, we would be looking to reuse legacy components whose machine code is the result of thousands of lines of high-level language statements (COBOL) that implement a particular business process. Since our focus is more on reuse and reengineering of legacy code at a basic level, it's not necessary to encumber ourselves with a very large program in order to learn strategies for reuse and reengineering. Included with this module is a small COBOL calculator that we wish to make reusable from Java. This program is assumed to be something from the business logic layer.

      ******************************************************************
      ** Simple COBOL program that performs integer arithmetic       **
      ******************************************************************
       IDENTIFICATION DIVISION.
       PROGRAM-ID. 'SMPLCALC'.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77 MSG-NUMERIC-OVERFLOW PIC X(25)
                               VALUE 'Numeric overflow occurred'.
       77 MSG-SUCCESSFUL       PIC X(22)
                               VALUE 'Completed successfully'.
       LINKAGE SECTION.
      * Input/Output data structure
       01 SMPLCALC-INTERFACE.
          02 SI-OPERAND-1      PIC S9(9) COMP-5.
          02 SI-OPERAND-2      PIC S9(9) COMP-5.
          02 SI-OPERATION      PIC X.
             88 DO-ADD         VALUE '+'.
             88 DO-SUB         VALUE '-'.
             88 DO-MUL         VALUE '*'.
          02 SI-RESULT         PIC S9(18) COMP-3.
          02 SI-RESULT-MESSAGE PIC X(128).
       PROCEDURE DIVISION USING
                          BY REFERENCE SMPLCALC-INTERFACE.
       MAINLINE SECTION.
      * Perform requested arithmetic

27: INITIALIZE SI-RESULT SI-RESULT-MESSAGE 28: EVALUATE TRUE 29: WHEN DO-ADD 30: COMPUTE SI-RESULT = SI-OPERAND-1 + SI-OPERAND-2 31: ON SIZE ERROR 32: PERFORM HANDLE-SIZE-ERROR 33: END-COMPUTE 34: WHEN DO-SUB 35: COMPUTE SI-RESULT = SI-OPERAND-1 - SI-OPERAND-2 36: ON SIZE ERROR 37: PERFORM HANDLE-SIZE-ERROR 38: END-COMPUTE 39: WHEN DO-MUL 40: COMPUTE SI-RESULT = SI-OPERAND-1 * SI-OPERAND-2 41: ON SIZE ERROR 42: PERFORM HANDLE-SIZE-ERROR 43: END-COMPUTE 44: END-EVALUATE 45: * Successful return 46: MOVE MSG-SUCCESSFUL TO SI-RESULT-MESSAGE 47: MOVE 2 TO RETURN-CODE 48: GOBACK 49: .
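For readers less familiar with COBOL, the arithmetic in the listing above can be approximated in Java. This is an illustrative sketch, not code from the module: the class name SmplCalcSketch is hypothetical, the operands are assumed to fit a Java int (on the assumption that PIC S9(9) COMP-5 is stored as a native 32-bit binary integer), and the body of HANDLE-SIZE-ERROR, which the listing does not show, is represented by throwing an exception.

```java
/** Java sketch of the arithmetic that SMPLCALC performs (illustrative only). */
class SmplCalcSketch {
    // Largest magnitude that fits SI-RESULT, declared PIC S9(18): eighteen decimal digits.
    static final long MAX_RESULT = 999_999_999_999_999_999L;

    /** Approximates the EVALUATE block: the operation character selects the computation. */
    static long compute(char operation, int operand1, int operand2) {
        long result;
        switch (operation) {
            case '+': result = (long) operand1 + operand2; break;
            case '-': result = (long) operand1 - operand2; break;
            case '*': result = (long) operand1 * operand2; break;
            // The COBOL listing has no WHEN OTHER branch; rejecting other
            // characters is a choice made for this sketch.
            default:  throw new IllegalArgumentException("Unsupported operation: " + operation);
        }
        // Counterpart of ON SIZE ERROR: the result must fit in eighteen digits.
        if (Math.abs(result) > MAX_RESULT) {
            throw new ArithmeticException("Numeric overflow occurred");
        }
        return result;
    }
}
```

Under these assumptions only multiplication can overflow: the sum or difference of two 32-bit operands always fits in eighteen digits, while a product such as 1,000,000,000 * 1,000,000,000 needs nineteen.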

Many commercial tools support importing a COBOL data structure and generating Java marshalling classes.

These marshalling classes are intended to be used with the J2EE Connector Architecture (JCA), in which a Java application wraps a legacy software application.

Figure 53. Example JCA implementation for accessing a legacy application.

A popular alternative to using JCA to reengineer and reuse legacy applications is to implement a Service Oriented Architecture (SOA). SOA components can communicate without the tight and fragile coupling of traditional binary interfaces because they are wrapped with a platform-neutral interface such as XML and Web services. When XML is used as envisioned, all data, both character and numeric, is represented as printable text completely divorced from any platform-specific representation or encoding.

The net effect is that two entities or programs can interact without having to know the data structures that comprise each other's binary interface. Of course, the XML that is exchanged cannot be arbitrary, so industry standards such as XML Schema (XSD) and Web Services Description Language (WSDL) fill this gap.

A Web service is considered to be WS-I compliant, or generally interoperable, if it meets many criteria, one of which is the use of XML for the input and output of each operation exposed by the service.

This WS-I requirement, where XML is the interoperable interface of choice, sets the stage for a meaningful exercise. A Legacy Software Reengineering and Reuse Exercise was developed for this module where the focus is on wrapping a COBOL program so that it is reusable from Java using XML in a local environment. The learner is asked to create a language-neutral XML interface to the COBOL calculator program and invoke it from a Java program, which incidentally makes it reusable from other Java programs.
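What a language-neutral XML view of the calculator's input might look like can be sketched with the DOM and transformer APIs that ship with the JDK. This is only an illustration under assumptions: the element names are taken from the COBOL data structure, but the root element name and the class CalcXml are hypothetical, and the exercise itself may marshal the data differently.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical helper that moves calculator input to and from printable XML text. */
class CalcXml {
    /** Builds a request document; every value, numeric or character, becomes text. */
    static String toXml(int operand1, int operand2, char operation) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element root = doc.createElement("SMPLCALC-INTERFACE"); // assumed root name
            doc.appendChild(root);
            append(doc, root, "SI-OPERAND-1", Integer.toString(operand1));
            append(doc, root, "SI-OPERAND-2", Integer.toString(operand2));
            append(doc, root, "SI-OPERATION", String.valueOf(operation));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build XML", e);
        }
    }

    /** Reads the text of the first element with the given name. */
    static String read(String xml, String elementName) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(elementName).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    private static void append(Document doc, Element parent, String name, String text) {
        Element child = doc.createElement(name);
        child.setTextContent(text);
        parent.appendChild(child);
    }
}
```

Because the document is plain text, the receiving side needs no knowledge of how Java lays out an int or a char in memory; it only needs to agree on the element names and their allowed values.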

Overview of the architecture for the exercise:

Figure 54. Architecture for legacy application reengineering and reuse from Java.

Steps in the reengineering and reuse exercise:

1. Create an XML Schema which represents all of the data in the SMPLCALC-INTERFACE COBOL data structure.
2. Write a Java interface ISimpleCalculator.java for the three computation types supported by SMPLCALC.cbl.
3. Write a Java class JSimpleCalculator.java that implements the interface defined in ISimpleCalculator.java and provides a user interface.
4. Use the Java command-line utility xjc, in combination with the XML Schema, to generate Java-to-XML marshalling code (JAXB).
5. Write a small C/C++ JNI program Java2CblXmlBridge.cpp which exports a method Java2SmplCalc that:
   - Invokes XML2CALC.cbl, passing the XML document received from JSimpleCalculator.java.
   - Returns the XML generated by XML2CALC.cbl to JSimpleCalculator.java.
6. Write a COBOL program XML2CALC.cbl that:
   - Marshals XML from Java2CblXmlBridge.cpp into SMPLCALC-INTERFACE.
   - Invokes SMPLCALC.cbl, passing SMPLCALC-INTERFACE by reference.
   - Marshals SMPLCALC-INTERFACE back to XML before returning to Java2CblXmlBridge.cpp.
7. Compile XML2CALC.cbl and link it with the object code for SMPLCALC.cbl (SMPLCALC.obj).
8. Create a DLL to be loaded by JSimpleCalculator.java by compiling and linking Java2CblXmlBridge.cpp with the object code for XML2CALC.cbl.
9. Update JSimpleCalculator.java to use the JAXB marshalling code to send/receive XML through the JNI layer and display the results.
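The Java side of the JNI layer in these steps could look roughly like the sketch below. The method name Java2SmplCalc comes from the exercise text; the class name BridgeLoader, the String-in/String-out signature, and the library name passed to load are assumptions made for illustration.

```java
/** Hypothetical Java side of the JNI bridge described in the exercise steps. */
class BridgeLoader {
    /**
     * Implemented in native code by the bridge DLL. The assumed contract is
     * that it takes the request XML and returns the response XML. Calling it
     * before the library is loaded raises UnsatisfiedLinkError.
     */
    static native String Java2SmplCalc(String requestXml);

    /** Tries to load the bridge library and reports whether it is available. */
    static boolean load(String libraryName) {
        try {
            // Looks for a platform-specific file (a .dll on Windows) on java.library.path.
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // The DLL has not been built yet or is not on java.library.path.
            return false;
        }
    }
}
```

With a class in the default package, JNI naming conventions mean the C/C++ function exported by the DLL would need to be called Java_BridgeLoader_Java2SmplCalc for the JVM to bind it to the native method above.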

Highlights of the solution code:

SimpleCalculator.xsd
<element name="SI-OPERAND-1">
  <simpleType>
    <restriction base="integer">
      <totalDigits value="9" />
    </restriction>
  </simpleType>
</element>
. . .
<element name="SI-OPERATION">
  <simpleType>
    <restriction base="string">
      <enumeration value="+" />
      <enumeration value="-" />
      <enumeration value="*" />
    </restriction>
  </simpleType>
</element>
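The effect of these two restrictions can be tried out with the schema validator bundled with the JDK. The sketch below is an assumption-laden illustration: it wraps the two element declarations from the excerpt in a hypothetical root element named SMPLCALC-INTERFACE, omits the elided parts of the schema, and uses the made-up class name SchemaCheck.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Validates small request documents against a cut-down version of the schema. */
class SchemaCheck {
    // The two restrictions from the excerpt, wrapped in an assumed root element.
    static final String XSD =
          "<schema xmlns='http://www.w3.org/2001/XMLSchema'>"
        + "<element name='SMPLCALC-INTERFACE'><complexType><sequence>"
        + "<element name='SI-OPERAND-1'><simpleType><restriction base='integer'>"
        + "<totalDigits value='9'/></restriction></simpleType></element>"
        + "<element name='SI-OPERATION'><simpleType><restriction base='string'>"
        + "<enumeration value='+'/><enumeration value='-'/><enumeration value='*'/>"
        + "</restriction></simpleType></element>"
        + "</sequence></complexType></element></schema>";

    /** Builds a request document from raw text values. */
    static String request(String operand1, String operation) {
        return "<SMPLCALC-INTERFACE><SI-OPERAND-1>" + operand1 + "</SI-OPERAND-1>"
             + "<SI-OPERATION>" + operation + "</SI-OPERATION></SMPLCALC-INTERFACE>";
    }

    /** Returns true when the document satisfies the schema, false otherwise. */
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema violation or malformed XML
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A nine-digit operand with one of the three listed operators validates; a ten-digit operand or an operator such as "/" is rejected before any COBOL code would be invoked.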

Solution listings shown on the slides: ISimpleCalculator.java, JSimpleCalculator.java, Java2CblXmlBridge.cpp, and XML2CALC.cbl.

Sample run of solution code:

Figure 55. Reuse of COBOL from Java using JAXB, JNI, and COBOL XML Support.

Figure 56. Reuse of COBOL from Java using JAXB, JNI, and COBOL XML Support.

Overview of Developed SRE Course Modules

Reversing and Patching Wintel Machine Code
Reversing and Patching Java Bytecode
Applying Anti-Reversing Techniques to Machine Code
Applying Anti-Reversing Techniques to Java Bytecode
Reengineering and Reuse of Legacy Software
Identifying, Monitoring, and Reporting Malware

Identifying, Monitoring, and Reporting Malware

Malware describes a category of software that does not always operate in a way that benefits the user.

Of course, those of us who have ever used software might contend that this definition of malware will cause programs that we use every day to be categorized as malware. So let's qualify it a bit: the malicious or annoying behaviors of malware are intentional, not the result of one or more bugs.

There are currently five types of malware that affect computer systems [6] [7]:

Viruses: require some deliberate action to help them spread.
Worms: similar to viruses but can spread by themselves over computer networks.
Trojan Horses: functional software that performs hidden malicious or annoying operations.
Backdoor: a vulnerability purposely embedded in software.
Rabbit: a program that exhausts system resources.

Malware usually isn't of just one type; for example, 3 of the top 10 malicious code families reported in 2008 were Trojans with a backdoor component [8]. Using the machine code and bytecode reversing experience gained from the previous modules, one could try reversing malware.

Using virtualization tools such as VMware to create secondary operating system images on which to analyze malware can still result in infection of the primary operating system.

The goal of this module is to help the learner become familiar with using tools to identify, monitor, and report software that might be malicious. Since it's not practical to ask a learner to install a virus, worm, backdoor, or rabbit, we are left with the possibility of a benign software Trojan (discussed later).

In 1996, Mark Russinovich founded a company called Winternals Software where he was the chief software architect on a comprehensive suite of tools for diagnosing, debugging, and repairing Windows systems and applications [9].

Mark's company has since been purchased by Microsoft, and his suite of tools has been rebranded Windows Sysinternals and is offered for free on Microsoft TechNet. Mark's story is an interesting one because he is recognized as an expert on the internals of Windows even though he did not participate in its development, a true testament to what can be learned about software through reverse engineering. The Sysinternals suite contains 66 different utilities, but we'll focus on the one most useful for analyzing the behavior of malware: Process Monitor.

Process Monitor can capture detailed information about any running process on a Windows system, including file system, registry, and network activity.

Figure 57. Process Monitor session for the Password Vault application.

Of course, Process Monitor itself doesn't identify malware; it simply reports what a process is doing.

With a little bit of ingenuity, one can identify Trojan Horses by looking for activities that don't seem to fit with the advertised functionality of a program.
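The idea of looking for activity that does not fit a program's advertised function can be sketched as a simple filter over captured events. Everything in this sketch is an assumption made for illustration: the class ActivityFilter, the reduction of each event to a process name, an operation name, and a path, and the idea of keeping a set of expected operations per program. Process Monitor offers its own filtering; this only shows the reasoning.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Flags captured events that fall outside what a program is expected to do. */
class ActivityFilter {
    /** One captured event, reduced to the fields this sketch needs. */
    static final class Event {
        final String process;
        final String operation;
        final String path;

        Event(String process, String operation, String path) {
            this.process = process;
            this.operation = operation;
            this.path = path;
        }
    }

    /** Returns the events of one process whose operation is not in the expected set. */
    static List<Event> unexpected(List<Event> events, String process,
                                  Set<String> expectedOperations) {
        List<Event> flagged = new ArrayList<>();
        for (Event event : events) {
            boolean sameProcess = event.process.equalsIgnoreCase(process);
            if (sameProcess && !expectedOperations.contains(event.operation)) {
                flagged.add(event); // activity that does not fit the advertised function
            }
        }
        return flagged;
    }
}
```

For a program advertised as a simple desktop utility, for instance, the expected set might contain only a few file operations, so any network or registry event attributed to it would be returned for a closer look.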
It's common practice to download free software from the Internet, and because we've been convinced that open-source software, which is sometimes confused with free software, should have the fewest vulnerabilities, we do it without much thought.

Incidentally, the data on the number of vulnerabilities found in popular Internet browsers does not support this belief. Mozilla browsers were affected by 99 new vulnerabilities in 2008, more than any other browser; there were 47 new vulnerabilities identified in Internet Explorer, 40 in Apple Safari, 35 in Opera, and 11 in Google Chrome [8]. It seems counter-intuitive that an open-source browser would have roughly twice as many security holes as a closed-source browser like Internet Explorer.

Becoming familiar with the Windows Sysinternals suite can help you evaluate whether the software on your Windows machine is acting in your best interest. If you suspect a particular program to be malware, it can be submitted online to a service called ThreatExpert. ThreatExpert is a Web-based tool that supports submission of software executables that are to be evaluated against an on-line malware database. Matching against existing malware is just one part of ThreatExpert's automated engine; the service tries to execute suspected malware in an isolated environment in order to perform heuristic analysis of its actions.
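The simplest form of matching a submission against a database of known malware is a digest lookup, sketched below. This is not a description of how ThreatExpert works internally, which the source does not detail; the class SignatureLookup and the idea of storing SHA-256 digests as lowercase hex strings are assumptions made for illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Set;

/** Looks up an executable's digest in a set of known-malware digests. */
class SignatureLookup {
    /** Returns the SHA-256 digest of the data as a lowercase hex string. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required of every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** True when the executable's digest appears in the known-malware set. */
    static boolean isKnown(byte[] executable, Set<String> knownDigests) {
        return knownDigests.contains(sha256Hex(executable));
    }
}
```

An exact-digest lookup only recognizes files that are byte-for-byte identical to a known sample, which is one reason a service would also run a submission in an isolated environment and examine its behavior.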

Figure 59. Example ThreatExpert report summary for submitted malware.

A Malware Identification and Monitoring Exercise was developed against a Java Alarm Clock application. This program was written to be a benign software Trojan. The exercise asks the learner to identify the behaviors of the Alarm Clock application that make it a software Trojan using the Windows Sysinternals tool suite.

The Alarm Clock application bytecode has been aggressively obfuscated to discourage the use of decompilation as a strategy for learning the program's behavior.

The Alarm Clock application is a benign software Trojan that, in addition to being a rudimentary alarm clock, performs unadvertised functions on background threads:

Logs information from the Windows registry.
Logs locations of office documents in the file system.
Scans for computers that respond to an ICMP ping.

Paced background threads are used.

Figure 60. Background threads log information about the user's system.

Figure 61. Process Monitor session for the Alarm Clock application.

Conclusions

Since programmers would benefit from reverse engineering education, instructors need to be able to teach it to them. At the present time, computer science instructors will be hard pressed to find materials for teaching a course that are compatible with classroom delivery.

Several books exist on reverse engineering that cater to industry professionals or those interested in self-study.
However, in a university setting, instructors engage students in ordered learning through exercises, quizzes, and exams.

Universities should continue to work toward establishing standard content for software reverse engineering and software maintenance courses. Software Reverse Engineering is an activity that relies heavily on tools. Better tools can only make this activity more feasible and reliable.

The market for reverse engineering tools does not seem saturated; there appear to be some opportunities for either new open-source projects or commercial products.

Thank you!

References
[1] M. R. Ali, Why teach reverse engineering? ACM SIGSOFT SEN, v.30, n.4, pp.1-4, Jul 2005.
[2] M. El-Ramly, Experience in teaching a software reengineering course, in Proceedings of the 28th International Conference on Software Engineering (ICSE). Shanghai, China, 2006, pp. 699-702.
[3] A. V. Deursen, J. Favre, R. Koschke, and J. Rilling, Experiences in Teaching Software Evolution and Program Comprehension, in Proceedings of the 11th IEEE International Workshop on Program Comprehension, Washington, DC, 2003, pp. 2834-284.
[4] B. W. Weide, W. D. Heym, J. E. Hollingsworth, Reverse engineering of legacy code exposed, in Proceedings of the 17th International Conference on Software Engineering, Seattle, Washington, WA, 1995, pp. 327-331.
[5] H. M. Sneed, Encapsulation of legacy software: A technique for reusing legacy software components, in Annals of Software Engineering, v.9, n.4, pp.293-313, 2000.

[6] B. W. Weide, W. D. Heym, J. E. Hollingsworth, Reverse engineering of legacy code exposed, in Proceedings of the 17th International Conference on Software Engineering, Seattle, Washington, WA, 1995, pp. 327-331.
[7] E. Eilam, Reversing: Secrets of Reverse Engineering, Indianapolis, IN: Wiley, 2005. M. Stamp, Information Security: Principles and Practice, Hoboken, NJ: John Wiley & Sons, 2006.
[8] Symantec Corp. (2009, Apr.). Symantec Global Internet Security Threat Report. [Online]. Available: http://eval.symantec.com/mktginfo/enterprise/white_papers/bwhitepaper_internet_security_threat_report_xiv_04-2009.en-us.pdf. (Accessed April 26th, 2009).
[9] Microsoft Corporation, Windows Sysinternals: utilities to help manage, troubleshoot and diagnose Windows systems and applications. [Online]. Available: http://technet.microsoft.com/en-us/sysinternals/default.aspx. (Accessed April 30th, 2009).
