Corsello RE Paper Spring 2009

Corsello Research Foundation
Software Tampering
The purpose, methods and potential safeguards to prevent the reverse
engineering of software
michael.corsello
1/24/2009
Abstract
Computer security is implemented at many levels, from the physical network to the physical machines
to the software in any device. Today we place most of our emphasis on preventing malicious logic from ever
getting into a device where it can do harm. Far less effort goes into protecting software and
systems from being directly hacked in the first place. Current operating system and software
architectures are extremely vulnerable to exploitation via the manipulation of executable code. One
main reason that actual exploits remain limited is a lack of understanding of how these exploits
can be performed.
Michael Corsello Term Paper CSci 287 Computer Network Defense
Introduction
Software is arguably the most complex thing man has ever invented. Modern software applications can
be composed of many million lines of source code that are executing on processors running at several
gigahertz. This software performs the operations specified by the developers of the software, nothing
more, nothing less. Given this basic premise, it would seem that software should be perfectly “safe” in
that it should only be capable of operating as programmed. However, the underlying system, the
hardware and specifically the CPU only understands a primitive, basic set of instructions. This set of
instructions forms the instruction set architecture (ISA) of the CPU. These ISA instructions are quite
primitive operations such as add, subtract, multiply, divide, read, write, compare and jump. All
applications are developed as aggregations of this simpler form of instruction to form a set of
abstractions to perform what we know as an application.
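As a rough illustration of how a higher-level operation reduces to primitive instructions, the following Python sketch interprets a made-up four-instruction register machine. The opcode names, the `run` function and the `MULTIPLY` program are all hypothetical and do not correspond to any real ISA; the point is only that "multiply 4 by 3" can be built from load, add, subtract and a conditional jump.

```python
def run(program, registers=None):
    """Interpret a list of (opcode, operand, operand) tuples on a toy machine."""
    regs = dict(registers or {})
    pc = 0  # program counter: index of the next instruction to execute
    while pc < len(program):
        op, a, b = program[pc]
        if op == "load":        # load reg, constant
            regs[a] = b
        elif op == "add":       # add dst, src   (dst = dst + src)
            regs[a] += regs[b]
        elif op == "sub":       # sub dst, src   (dst = dst - src)
            regs[a] -= regs[b]
        elif op == "jnz":       # jnz reg, target (jump if reg is not zero)
            if regs[a] != 0:
                pc = b
                continue
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return regs


# "total = 4 * 3" expressed as repeated addition
MULTIPLY = [
    ("load", "total", 0),
    ("load", "a", 4),
    ("load", "count", 3),
    ("load", "one", 1),
    ("add", "total", "a"),    # index 4: start of the loop body
    ("sub", "count", "one"),
    ("jnz", "count", 4),      # repeat until count reaches zero
]

print(run(MULTIPLY)["total"])
```

Even this small program shows why low-level code is easy to redirect: changing the single jump target or the loop counter changes the result without touching anything else.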
Software applications are generally written today using “general purpose” programming languages that
are already highly abstracted from the underlying ISA of the machine. This abstraction provides a great
many benefits in that developers do not need to understand what the machine is actually doing at the
ISA level when they write this high-level code. Unfortunately, this also means that few developers ever
learn what a line of code in these high-level, general purpose languages is actually compiled into at the
lower level. As a result, most developers will never understand what vulnerabilities they are
actually creating in their code.
Software Architecture
The architecture of a software application is based upon levels of architectures for the underlying
components an application will use. In this manner, any application is subject to any benefits and
limitations of the underlying architectures it will reside upon. At the lowest level, this is the ISA and
overall hardware architecture of the platform the software will run upon. This is largely static and will
not be addressed in this paper. Even so, there are many places within the hardware architectures of
both computers and networks that could be re-designed to enhance capabilities, performance and
security.
Operating Systems
Above the hardware, all software is hosted by an operating system that directly runs upon the hardware
platform. This operating system provides a hardware abstraction layer (HAL) and core software based
services that all applications need. This is generally in the form of libraries and a primary “process” that
can initiate other “user mode” application processes (our applications). The operating system abstracts
the interaction with hardware devices through the use of software drivers that the operating system
loads and manages. Interaction between the hardware devices and software (generally at the driver
level) is performed via “interrupts” that manage the synchronization of hardware operations and data
flows within the system.
The operating system “kernel” is the portion of the operating system that manages the memory and
interrupts and coordinates the operation of the system as a whole. The most important aspect
of the kernel with respect to applications is that the kernel initiates and manages the creation of
application processes and their memory allocations in coordination with the CPU.
Within an operating system, processes are started and managed to perform work. Each process can be
started on behalf of a user (user mode) or of some level of the operating system itself (system mode or
kernel mode). In general, privilege levels can be divided into “rings” numbered from 0 to some
upper bound. Ring 0 is the operating system kernel itself and must be the area most secured
against interference. Any exploit at this ring can be catastrophic to the system, as code running
here is subject to no further security checks. Higher-numbered rings are progressively less trusted and
are granted fewer privileges. On the x86 platform the hardware defines four rings (0 to 3): drivers were
intended to run in rings 1 and 2 and user applications in ring 3, although most mainstream operating
systems use only ring 0 and ring 3 in practice (other architectures define different numbers of rings).
It is the operating system and these “rings of trust” that eventually open up into the user mode
applications. Any poorly written or vulnerable code at the lower numbered rings will affect every
application above that level even if it does not directly use the vulnerable code. It is for this very reason
that system mode code bases must be evaluated and should always be signed to prevent or at least limit
tampering.
Programming Languages and Libraries

The user mode applications we use to perform our work are still subject to any underlying vulnerabilities
in the operating system. Additionally, our applications generally use third party libraries that provide
some set of abstracted functionality. Each of these libraries may contain vulnerabilities that may be
exploited. Further, our applications written in a high-level language must be compiled into some
executable format that can be run within the operating system. This compilation process may produce
vulnerable code.
Each high-level language has a core set of keywords and operators that are recognized in textual form
and can then be mapped into a lower-level set of instructions. In native code languages (such as
assembly, C and C++) the source code is compiled directly into a “machine language” that can be run on
the host hardware and leverage the services provided by the operating system. In byte-code compiled
languages (such as Java, the .NET languages and Python), the source code is compiled into an intermediate
format that cannot be run directly on the hardware, but is instead either compiled to native machine
code at run time by a just-in-time (JIT) compiler or executed by a byte-code interpreter. These languages
all provide a form of protection to the underlying system in that their code cannot be run directly on the
hardware platform without the intercession of a virtual machine or interpreter. Due to the high-level
nature and inherent safety of byte-code languages, many consider native code languages “dangerous”
and recommend that they be used only for the lower-level rings, such as operating system and driver
development.
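To make the intermediate format concrete, the sketch below uses Python's standard `dis` module to list the byte-code instructions that CPython generates for a small function. The function `add` and the helper `opcode_names` are made up for this example, and the exact opcode names differ between CPython versions, so no particular listing is assumed here.

```python
import dis


def add(a, b):
    return a + b


def opcode_names(func):
    """Return the names of the byte-code instructions compiled for func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


# Print one opcode name per line; none of these are native CPU instructions.
for name in opcode_names(add):
    print(name)
```

The listing is what the interpreter actually executes; the CPU never sees the source line `return a + b`, only the native code of the interpreter that walks these opcodes.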
Applications
Applications are in general loaded from some permanent storage device (e.g., a disk). A user's access
to the storage medium on which applications are held is the primary vehicle for tampering with
applications, both directly by users and by malicious applications that mutate the application executable
files. The code that is run on a computer is stored in a file with a specific format that the runtime (for
byte-code languages) or the operating system (for native-code languages) can execute. As an example, the
Microsoft Windows operating system on the Intel-based 32-bit x86 hardware platform uses the “portable
executable” (PE) file format.
Portable Executable Format

The PE file format is used for executables, object code, type libraries, dynamic link libraries (DLLs), static
libraries and drivers in both 32-bit and 64-bit versions of Microsoft Windows operating systems. The PE
format was originally derived from the UNIX COFF file format and is occasionally still known as the
PE/COFF format.
The file is laid out with multiple headers and sections that define the memory mapping strategy
taken by the operating system when loading the file. The PE file is loaded into memory with code
addresses at static offsets from a base address. This means execution requires fewer
dynamic address resolution steps, but it also makes hacking easier, as all function entry points are at
fixed offsets from this base address. For a DLL, the load process is based upon a preferred base address
which, if available, allows a single loading of the library to serve all processes using it. If the preferred
address is unavailable, the library is loaded at another available base address; it then becomes process
specific and cannot be shared between processes. While sharing can increase overall memory efficiency
and make inter-process communication easier, it also adds complexity to the overall architecture and
provides additional points for external attack.
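The fixed layout described above can be seen by reading a few header fields directly. The following Python sketch is a minimal reader, not a full PE parser: it assumes the field offsets given in the PE/COFF specification (the `e_lfanew` pointer at file offset 0x3C, a 20-byte COFF header after the "PE\0\0" signature, and `AddressOfEntryPoint` and `ImageBase` in the optional header). The function name `parse_pe_header` and the returned dictionary keys are chosen for this example only.

```python
import struct


def parse_pe_header(data: bytes) -> dict:
    """Read a few fixed-offset fields from the headers of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' signature")
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The COFF file header starts right after the 4-byte signature
    machine, section_count = struct.unpack_from("<HH", data, pe_offset + 4)
    optional = pe_offset + 24  # 4-byte signature + 20-byte COFF header
    (magic,) = struct.unpack_from("<H", data, optional)
    (entry_rva,) = struct.unpack_from("<I", data, optional + 16)
    if magic == 0x10B:    # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, optional + 28)
    elif magic == 0x20B:  # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, optional + 24)
    else:
        raise ValueError(f"unexpected optional header magic: {magic:#x}")
    return {
        "machine": machine,
        "sections": section_count,
        "image_base": image_base,
        "entry_point_rva": entry_rva,
        # Preferred virtual address = preferred base + relative offset
        "entry_point_va": image_base + entry_rva,
    }
```

Given the bytes of an executable (for example `open(path, "rb").read()`, where `path` is whatever file is being inspected), the returned `entry_point_va` is the address at which execution would begin if the image loads at its preferred base, which is exactly the predictability the paragraph above describes.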
For the newer dynamic runtime provided by .NET (and Mono under Linux), Microsoft wanted executable
compatibility. Therefore, all .NET executables are actually PE files with an additional common language
runtime (CLR) header and data sections. When run, the .NET PE file will bootstrap and hand over
execution to the CLR libraries which read these sections and transition to execution of the contained CLR
managed byte-code.
Due to the nature of any executable, there are sections of code that can be bypassed or altered without
compromising the overall functionality of the application. These forms of manipulation are
collectively known as tampering and are generally performed via the process of reverse
engineering.
Details of the PE file format can be downloaded directly from several sites, including directly from
Microsoft at:
http://download.microsoft.com/download/e/b/a/eba1050f-a31d-436b-9281-92cdfeae4b45/pecoff.doc
Reverse Engineering
Reverse engineering is the process of analyzing an existing “product” and working backwards to
determine how it functions, what components it is made of, and ultimately how to circumvent or
replicate the product in question. In software, reverse engineering comes in several forms and is a
specialized subset of the larger category of “hacking”. In many cases, the act of reverse engineering
software is actually legal and is protected in a limited scope under the “fair use” doctrine of U.S.
copyright law. Unfortunately, this provides enough freedom that reverse engineering software in order to
create malicious products such as spyware, viruses and worms is generally legal. It is generally only the
use of such malicious software that is illegal.
Cracking
Cracking is a primary reverse engineering activity in software; its aim is to bypass protection
mechanisms. Examples of cracking include altering installers so that they do not ask for a serial
number or key code, altering software applications to bypass “activation” (primarily Microsoft
products, such as Office 2003/2007), removing trial expiration and checking routines, and
removing nag screens.
The “art” of cracking software is quite widespread, and many active crackerz (note the use of the “z”)
take a great deal of pride in their ability to defeat any software protection or licensing scheme.
In many cases, a new software protection scheme will be cracked and an active “patch” will be posted to
file sharing sites within days of the protected software's release. In the U.S., cracking is illegal under
the Digital Millennium Copyright Act (DMCA), which prohibits circumventing technological measures
that control access to copyrighted works.
Tamper Resistance
The addition of mechanisms to a product to increase the difficulty of reverse engineering is known as
tamper resistance. In software, tamper resistance mechanisms come in many forms, the most common
of which is the code obfuscator. Unfortunately, for compiled native code, these tools often obfuscate
only the source code and object code, and still allow a disassembler to generate quite tractable
assembly code which can be tampered with and re-assembled into a new executable.
The discipline of trusted computing is the set of activities that provide protection from tampering with
software (and hardware) at various levels, from high-level user mode application code (via mechanisms
such as digital signatures of code by third party code reviewers) to the operating system architecture
itself, which can limit access to resources and thus prevent tampering with applications.
Mechanisms used in tamper resistance for software include:
Digital signatures (digests) of executable files
Increasing cyclomatic complexity of code
Obfuscation of source code
Dynamic code injection (polymorphic executables)
Installer encryption (executables stored as encrypted and decrypted into memory only)
Address randomization (virtual addresses and lookups)
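The first mechanism in the list, a digest of the executable file, can be sketched with Python's standard `hashlib` module. This is a minimal illustration only: the function names are invented for the example, and a bare digest detects changes but does not by itself prove who produced the file. A real digital signature would additionally sign the digest with a private key, which is not shown here.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_untampered(data: bytes, expected_digest: str) -> bool:
    """Compare the current digest with a reference recorded at release time."""
    return hmac.compare_digest(sha256_digest(data), expected_digest)


# Stand-in bytes for an executable; a real check would read the file from disk.
original = b"\x4d\x5a" + b"example program bytes"
reference = sha256_digest(original)

# Changing a single byte anywhere in the file changes the digest.
patched = original[:-1] + b"!"
print(is_untampered(original, reference), is_untampered(patched, reference))
```

The check is only as trustworthy as the place the reference digest is stored; if an attacker can rewrite both the file and the recorded digest, the comparison still succeeds.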
Disassemblers
A primary means of tampering with an executable is to disassemble that executable into an assembly
language file. Assembly language is a low-level programming language that has a one-to-one
correspondence with the public ISA of the hardware architecture. Due to this correspondence, there is
no way to prevent the disassembly of an executable file if it is accessible.
There are several free and commercial disassembler applications available; the two used in this paper are:
OllyDbg (http://www.ollydbg.de/)
IDA Pro (http://www.hex-rays.com/idapro/)
OllyDbg is a free application available for download, whereas IDA Pro is a commercial product that
also has a freely available version.
Tampering With a PE File

In order to tamper with a PE file, the file is disassembled in a disassembler, in this case, OllyDbg. The
executable we are tampering with in this example is available at the “Hacker Challenge” website at:
http://www.dareyourmind.net/Challenge/Variable.rar. Once unpacked and executed, the program
presents a challenge to the user:
We are to crack this program and change the value of some variable “vari” to the integer value 1302.
Once we believe we have succeeded in this task, we should be able to run the program, type “check”
and be greeted with success.
If we make no changes to the program and enter “check”, we see the following:
As a first action, we will open the program in OllyDbg and see the program as the CPU does.
OllyDbg shows all aspects of the program and allows us to step through it instruction by
instruction, watching how the registers and memory change and which line of assembly will be
executed next.
We can also examine the executable file itself in a hex format:

There are multiple means of solving this problem: we can search for the value presented (31333031),
which we will not find (it may be computed and never be mentioned in code); we can search for the
label “vari” (which we also will not find); or we can search for the solution.
In this case, we know that there will be a check comparison that will verify our value is equal to 1302.
We can look for the message we see in the original program and backtrack from there:
Now, we can track to where this (004015D4) is called from:
We can now clearly see that we are close by noticing the ASCII text “You got it!! The password is” two
lines below our call to the existing “wrong” answer.
Looking at the call to our output message, we can see the JNZ instruction, which is “jump if not zero”.
Looking one line above this instruction we see a CMP (compare) instruction which compares the variable
at location 441000 to the static value 516 (HEX), which is our goal value of 1302 (DEC). Therefore, we
can crack this application in one of two ways: either change the static value to the current value (change
516 to 1DE1AA7, HEX for 31333031), or change the value stored at 441000 to 516.
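The two base conversions used in this step are easy to verify. The short Python check below uses only built-in functions; the variable names are chosen for the example.

```python
goal = int("516", 16)          # the constant in the CMP instruction, read as hexadecimal
displayed = 31333031           # the value the program displays, read as decimal

print(goal)                    # 1302
print(format(displayed, "X"))  # 1DE1AA7
```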
Looking at the current value in address 441000 we see:
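Therefore, the current value of “vari” is actually 1301, which suggests that the 31333031 shown earlier
is the string “1301” printed as hexadecimal ASCII codes (31 33 30 31). So, we change the stored value to
1302, re-assemble the file into an executable and run it.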
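The edit itself amounts to overwriting one 32-bit little-endian integer. The following Python sketch shows that operation on a synthetic byte buffer rather than on the challenge file: the function name `patch_u32` and the offset 16 are hypothetical, and the real file offset that corresponds to virtual address 441000 depends on the section table of the executable, which is not reproduced here.

```python
import struct


def patch_u32(image: bytes, offset: int, old: int, new: int) -> bytes:
    """Replace one little-endian 32-bit value, verifying the old value first."""
    (current,) = struct.unpack_from("<I", image, offset)
    if current != old:
        raise ValueError(f"expected {old} at offset {offset:#x}, found {current}")
    patched = bytearray(image)
    struct.pack_into("<I", patched, offset, new)
    return bytes(patched)


# Synthetic stand-in for a data section: padding, the stored value 1301, padding.
image = bytes(16) + struct.pack("<I", 1301) + bytes(4)
patched = patch_u32(image, 16, old=1301, new=1302)
print(struct.unpack_from("<I", patched, 16)[0])
```

Checking the old value before writing guards against patching the wrong offset, and because the replacement has the same size as the original, no other address in the file moves.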

Then, after re-assembly and running, we see the same prompt:
But now when we enter “check” we see a new result:
This indicates that we successfully cracked this application.
General Concepts
When looking at a PE file there are several aspects that can be tampered with without impacting any
addresses. First, any string variables can be edited directly, with no impact of any kind, if their
lengths are not changed. Second, numeric constants can be edited directly, again with no impact, if
their length is maintained. Third, embedded icons can be used as “free space” within the file.
In general, many PE files will have multiple icons embedded within the file. The first icon can be used
as free space in which to place executable code, as long as the code is smaller than the total space of
the icon. All references to the location of that first icon can then be re-mapped to a different icon
within the file with no impact on the PE. This is a means of embedding malicious code in the icon area
of the file at a known address (the same address the icon had). The end of that malicious code can
then transfer control back to the origin point to continue execution.
Figure 1. Original Program Structure

Figure 2. Altered Program Structure
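The first rule above, that a string can be edited freely as long as its length does not change, can be expressed as a small helper. This Python sketch works on an in-memory byte string; the function name `replace_same_length`, the padding byte and the sample data are assumptions made for the example and are not taken from any particular executable.

```python
def replace_same_length(image: bytes, old: bytes, new: bytes, pad: bytes = b" ") -> bytes:
    """Swap one embedded string for another without changing the file size."""
    if len(new) > len(old):
        raise ValueError("replacement is longer than the original string")
    if image.count(old) != 1:
        raise ValueError("original string must occur exactly once")
    # Pad the shorter replacement so every later byte keeps its offset
    padded = new.ljust(len(old), pad)
    return image.replace(old, padded)


# Synthetic stand-in for a resource section holding two menu labels.
image = b"\x00\x01Game\x00Help\x00"
edited = replace_same_length(image, b"Game", b"Fun")
print(edited)
```

Rejecting a longer replacement is the important part: a longer string would shift every byte after it and invalidate the fixed offsets the loader relies on.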
Example
The winmine.exe PE file is the “Minesweeper” game in Windows operating systems. Opening the PE in
OllyDbg, the text labels for menu items and related content are easy to see.
Making edits to these text entries within this PE file will be directly displayed in the user interface of the
application as shown below:

Further, the winmine.exe file contains several icons for use in the game. These icons can also be clearly
seen in OllyDbg:
Altering these icon areas can yield graphical “updates” to the application as shown below:
Or, more maliciously, some icons can be replaced with malicious code and simply be remapped to other
icons that are graphically similar.
Abstract Architecture For a Secure OS

A mechanism for creating a more secure operating system involves several aspects:
Increasing isolation from the hardware
Increasing isolation from the operating system
Reducing application access to resources
Reducing access to application files
Virtualization of resources
Use of virtual machines for all process hosting
High process isolation
Virtualization of operating environment
A basic architecture for accomplishing this builds upon a concept already present within UNIX operating
systems: there are resources that are unavailable to user mode operations. A prime
example of this is the swap partition. The operating system handles all I/O to it directly, with no
application access to the partition; to applications it in general appears not to exist. The same concept
can be used to secure an operating system: make all unneeded resources invisible to the application.
Memory
In this OS architecture, memory is abstracted to a pool much like in existing OSes. Each application gets
an isolated view of memory and therefore “owns” that memory image. However, this memory image is
divided into frames, each with its own base address, and no access may span frames. In this manner,
OS level resources and shared libraries are loaded into fixed, read-only frames that are thereby safe for
use. Note that these are read-only, and therefore do not contain application editable data. In the context
of .NET, this is similar to the use of “Application Domains”.
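One way to picture the frame model is as a small simulation. The Python sketch below is an interpretation of the paragraph above, not a design taken from the paper: the class names `Frame` and `AddressSpace`, the choice of exceptions and the example addresses are all hypothetical. It enforces the two stated properties, that an access may not span frames and that read-only frames reject writes.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    base: int
    size: int
    writable: bool
    data: bytearray = field(init=False)

    def __post_init__(self):
        self.data = bytearray(self.size)


class AddressSpace:
    """A process view of memory made of separate, non-spannable frames."""

    def __init__(self, frames):
        self.frames = sorted(frames, key=lambda frame: frame.base)

    def _locate(self, address, length):
        for frame in self.frames:
            if frame.base <= address < frame.base + frame.size:
                if address + length > frame.base + frame.size:
                    raise MemoryError("access spans a frame boundary")
                return frame, address - frame.base
        raise MemoryError("address is not mapped")

    def read(self, address, length):
        frame, offset = self._locate(address, length)
        return bytes(frame.data[offset:offset + length])

    def write(self, address, payload):
        frame, offset = self._locate(address, len(payload))
        if not frame.writable:
            raise PermissionError("frame is read-only")
        frame.data[offset:offset + len(payload)] = payload


# A read-only frame for a shared library and a writable frame for application data.
space = AddressSpace([Frame(0x1000, 0x100, writable=False),
                      Frame(0x2000, 0x100, writable=True)])
space.write(0x2000, b"config")
print(space.read(0x2000, 6))
```

In this model an overflow that starts in the writable frame cannot run on into library code, because the write is rejected as soon as it would cross the frame boundary.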
Disk
The security of mass storage devices will continue to be an issue even in this architecture. The primary
additional security in this architecture is the isolation of executable code from the user mode accessible
storage devices. In this architecture, the physical non-removable storage is virtualized to “slices” that
are characterized by basic performance metrics indicating relative performance for read, write,
overwrite operations of the underlying physical hardware. A virtual slice may span any number of
physical devices in a manner similar to that of current RAID solutions. The underlying physical media
may still be on a RAID array or SAN/NAS devices.
The storage slices are used to form mount points that are of one of the following classes:
System (Swap and OS are such volumes)
Protected (Drivers and applications are on such a volume)
Limited (Configuration data is on such a volume)
User (Data is such a volume)
A system volume is completely invisible to the system outside of the ring 0 operating system code base.
The volume is fully accessible to the OS alone.
A protected volume is read-only accessible at a virtual mount point. A protected volume is essentially
hierarchical, with the root level of access being defined by the “owner” of the virtual root. For example,
company ABC would have a virtual ABC root that is defined securely. Within that root, all programs
produced by company ABC would reside. ABC has sole control over the structure, content and
accessibility within that virtual root. All files are read-only except to the OS, which performs single
writes (installs) and single deletes (uninstalls) on this volume.
The limited volumes are read/write accessible and built upon the same structure as the protected
volume. This is the primary type of volume in which a program may store configuration parameters for
use by the application. Each user also has a mirrored structure for user-level profile parameters. This
volume is read/write to the program that owns the root and is not visible in user mode to any user.
Overall, the system, protected and limited class volumes within a system are not visible to the user at all
in any form or fashion. For a user to be able to execute an application that resides in these volumes, the
application is registered with the OS and placed into the volume by the OS. The OS exposes a virtual link
that permits the executable to be run, but it is only runnable and in no way readable as a file.
The final class of volume, the user volume, is a semi-traditional full access user mode storage location.
Since the volume is data only, there are no files in these volumes that the OS will recognize as runnable;
downloaded software therefore cannot be run until it is installed into the system. Consequently,
tampering with files within the user volumes will not affect applications in the traditional sense,
although scripting or macro files that an installed program interprets from a data file remain a
vulnerable vector.
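The four volume classes can be summarized as an access table. The Python sketch below is one possible reading of the descriptions above, not a specification from the paper: the enum names, the three kinds of actor and the exact permission sets are assumptions made for illustration (for instance, it assumes that programs may read and write user data on behalf of the user).

```python
from enum import Enum


class VolumeClass(Enum):
    SYSTEM = "system"
    PROTECTED = "protected"
    LIMITED = "limited"
    USER = "user"


class Actor(Enum):
    KERNEL = "kernel"      # ring 0 operating system code
    PROGRAM = "program"    # an installed program acting within its own root
    USER = "user"          # a person working in user mode


# Operations each actor may perform on each class of volume.
ALLOWED = {
    VolumeClass.SYSTEM: {Actor.KERNEL: {"read", "write"}},
    VolumeClass.PROTECTED: {
        Actor.KERNEL: {"read", "write"},   # installs and uninstalls only
        Actor.PROGRAM: {"read"},
        Actor.USER: {"execute"},           # via the OS-provided virtual link
    },
    VolumeClass.LIMITED: {
        Actor.KERNEL: {"read", "write"},
        Actor.PROGRAM: {"read", "write"},  # configuration parameters
    },
    VolumeClass.USER: {
        Actor.KERNEL: {"read", "write"},
        Actor.PROGRAM: {"read", "write"},
        Actor.USER: {"read", "write"},     # data only, nothing is runnable
    },
}


def is_allowed(volume: VolumeClass, actor: Actor, operation: str) -> bool:
    """Return True if the actor may perform the operation on the volume class."""
    return operation in ALLOWED.get(volume, {}).get(actor, set())


print(is_allowed(VolumeClass.PROTECTED, Actor.USER, "execute"))
print(is_allowed(VolumeClass.PROTECTED, Actor.USER, "read"))
```

Anything not listed is denied by default, which matches the principle that unneeded resources should be invisible rather than merely protected.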
The primary difference between the user volume in this OS architecture and a traditional data volume is
that the user volume is further sliced by user to be virtualized as multiple overlapping volumes. Each
user gets a custom perspective view of the logically virtualized volumes based upon their access rights.
The security in this volume is a combination of role-based (RBAC), policy-based (PBAC), mandatory
(MAC) and user-discretionary controls. The algorithms used to dynamically merge the volumes for
performance are a key distinction from existing OSes.
Devices
All hardware devices that are installed on a system must be made available to users. The control of
these devices, their drivers and their access constraints is a key area for both security and system attack.
This OS architecture protects against external vectors via a combination of device locking (at the
kernel level) and disk-level access controls (which prevent driver installation), which together control
the addition of new devices. Additionally, keyed devices can optionally be allowed or disallowed by user,
by device type and by device id via a security device labeling engine (a form of MAC). Because of the
disk access controls, installing any driver requires at minimum a soft restart of the OS.
Applications
All software must be installed into the protected and limited volumes (executables/libraries and
configurations respectively) to be made available to users. The installation process for all programs and
drivers requires a physical acknowledgement from a user with appropriate access to execute an
installation package. The installer itself only places the content in a virtual volume, to be copied and
installed upon restart of the OS. In this manner, the installation process is two-phase and requires full
acknowledgement. Upon restart, the OS identifies this virtual volume, installs the verified content to
the secured volumes and makes the application available to the users. An OS soft restart does not
require a full system (hardware) reboot, only an environment unload and reload, much like a current
virtual machine restart on virtualized systems.
This installation process permits several actions during installation:
Each file can be checked against a digital signature and verified online if OS policy is enabled to do so
Files are copied into virtual space where they can be scanned and checked by a registered security application granted access to do so
The multi-stage process thwarts many malicious code practices by not permitting automatic start of the malicious code
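The two-phase installation described in this section can be sketched as a small simulation. The Python code below is illustrative only: the class `Installer`, its method names and the use of a SHA-256 digest in place of a full digital signature are assumptions made for the example. Content is first staged into a virtual volume after user acknowledgement, and only content that passes verification is committed to the protected volume at the next soft restart.

```python
import hashlib


class Installer:
    """Simulates staging to a virtual volume and committing on soft restart."""

    def __init__(self):
        self.staging = {}    # virtual volume: name -> (content, expected digest)
        self.protected = {}  # secured volume: name -> content

    def stage(self, name, content, expected_sha256, user_acknowledged):
        """Phase one: place content in the virtual volume; nothing runs yet."""
        if not user_acknowledged:
            raise PermissionError("installation requires user acknowledgement")
        self.staging[name] = (content, expected_sha256)

    def soft_restart(self):
        """Phase two: verify staged content and install only what passes."""
        installed = []
        for name, (content, expected) in self.staging.items():
            if hashlib.sha256(content).hexdigest() == expected:
                self.protected[name] = content
                installed.append(name)
        self.staging.clear()  # rejected content is discarded, never executed
        return installed


installer = Installer()
good = b"program bytes"
installer.stage("editor", good, hashlib.sha256(good).hexdigest(), True)
installer.stage("dropper", b"altered bytes", hashlib.sha256(b"other").hexdigest(), True)
print(installer.soft_restart())
```

Because staged content is inert until the restart, a scanner registered with the OS would have a window in which to inspect it, and code that relies on starting itself automatically never gets the chance to run.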
Conclusions
Current software architectures are inherently insecure by the nature of the applications themselves and
their underlying system architectures. To build truly secure applications that perform well, a new
approach must be taken. The operating system itself is a key target for redesign that can ameliorate
many problems, including network level exploitations, by reducing runtime access to resources.
