Auditing Binaries For Security Vulnerabilities

Auditing binaries for security
vulnerabilities
Speech outline (I)
• Legal considerations concerning reverse
engineering
• Introduction to the topic: The different
approaches to auditing binaries
• Review of C/C++ programming mistakes
• Spotting these mistakes in the binary
• Demonstration of finding a vulnerability
• --- Break ---
© 2001 Halvar Flake

Auditing binaries for security
vulnerabilities
Speech outline (II)
• Patching the problem away
• Dealing with Run-time-encrypted binaries
• Automated scanning for suspicious constructs
• Automating the process of reconstructing
structures
• Extending structure reconstruction to
automate OOP class reconstruction
• Free time to answer questions and discuss
the topic

Legal considerations
Technically, the reverse engineer breaks the license
agreement with the software vendor, because accepting
a no-reverse-engineering clause is a condition of
installation.
The vendor could theoretically sue the reverse engineer
and revoke the license.
Depending on your local law, there are different ways
to defend your position:

Legal considerations (EU)
EU Law:
1991 EC Directive on the Legal Protection of
Computer Programs
• Section 6 grants the right to decompilation for
interoperability purposes
• Section 5.3 grants the right to decompilation for
error correction purposes
Under EU law, these rights cannot be contracted away

Legal considerations (USA)
US Law:
The final form of the DMCA includes exceptions to
copyright for:
• Reverse engineering for interoperability
• Encryption research
• Security testing
Ask your lawyer whether these rights can be
contracted away.

Approach A: Stress testing
Overly long (or malformed) strings are automatically
generated and supplied to the program.
Pros:
• The process is largely automatic
• No specially skilled personnel are needed
• The stress-testing tool is re-usable
Cons:
• The protocol has to be known
• Complex conditions will be missed
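
The string-generation step of approach A can be sketched as follows. This is only an illustration: the helper name `make_long_string()` is hypothetical, and how the string is delivered to the target depends on the protocol, which is not shown.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated, NUL-terminated string
 * consisting of `len` copies of `fill`, or NULL if allocation fails.
 * The caller frees the result. */
char *make_long_string(size_t len, char fill)
{
    char *s = malloc(len + 1);
    if (s == NULL)
        return NULL;
    memset(s, fill, len);
    s[len] = '\0';
    return s;
}
```

A stress-testing tool would typically call such a helper with growing lengths (for example 64, 128, 256, ...) and wrap each string in whatever request format the target expects.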

Approach B: Tracing Input
A reverse engineer reads the program from the
point where it receives input onward and analyzes the
code to find possible weaknesses
Pros:
• Even very complex conditions are found
Cons:
• Auditor needs to be highly skilled
• Nearly infeasible for large applications
• Very time consuming, since one will be
reading a lot of irrelevant `tentacles`

Approach C: Finding suspicious
constructs and reading backwards
Certain constructs which appear suspicious are
detected, and a reverse engineer then manually
analyzes the threat they pose
Pros:
• A lot less time consuming than approach B
• The process of detecting suspicious
constructs can be automated
• Fairly complex conditions can be found
Cons:
• Some vulnerabilities will be missed
• Needs a highly specialized auditor

Blackhat vs Whitehat auditing
Blackhat:
• Wants the fastest way to find an unknown
vulnerability
• Doesn't care if he misses some problems
• Only needs to repeat the process if the
vulnerability was fixed
Whitehat:
• Wants security, so he needs to read all code
• Has to repeat the process with every upgrade
• Has to continue after he has found something
The Blackhat is at an advantage here

Tools the auditor needs
IDA Pro by Ilfak Guilfanov
www.datarescue.com
• Can disassemble x86, SPARC, IA64, MIPS and much more ...
• Includes a powerful scripting language
• Can recognize statically linked library calls
• Features a powerful plug-in interface
• Features CPU Module SDK for self-developed CPU modules
• Automatically reconstructs arguments to standard calls via
type libraries, allows parsing of C-headers for adding new
standard calls & types
• Great technical support
• ... much more ...
C/C++ auditing recap
strcpy() and strcat()
Old news:
strcpy() and strcat() copying dynamic data
into any kind of fixed-size buffer are inherently
suspicious

sprintf() and vsprintf()
Old news:
Since sprintf() can expand an arbitrary string
using the `%s` format character, any call to
sprintf()/vsprintf() which expands dynamic data
into a fixed-size buffer has to be considered
suspicious.

The *scanf() function family
As *scanf() parses data of dynamic origin
into fixed buffers by using the `%s` format
character, any *scanf() call which targets a fixed-
size buffer with a `%s` format character is
suspicious

The strncpy()-pitfall (I)
While strncpy() supports size checking, it does not
guarantee NUL-termination of the destination buffer.
So in cases where the code includes something like
strncpy(destbuff, srcbuff, sizeof(destbuff));
problems will arise whenever the source is at least
as long as the destination.
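
A minimal sketch of the pitfall and the usual repair. The function names are made up for illustration; only strncpy() and memchr() are standard library calls.

```c
#include <string.h>

/* Returns 1 if the first `size` bytes of `buf` contain a NUL terminator. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Copies `src` into `dest` with the pattern strncpy(dest, src, sizeof(dest))
 * and reports whether the result is terminated.
 * `dest` must be at least `size` bytes. */
int copy_unchecked(char *dest, size_t size, const char *src)
{
    strncpy(dest, src, size);   /* writes no NUL if strlen(src) >= size */
    return is_terminated(dest, size);
}

/* The usual repair: reserve the last byte and terminate explicitly. */
void copy_terminated(char *dest, size_t size, const char *src)
{
    if (size == 0)
        return;
    strncpy(dest, src, size - 1);
    dest[size - 1] = '\0';
}
```

With an 8-byte destination, `copy_unchecked()` leaves the buffer unterminated for any source of 8 or more characters, while `copy_terminated()` truncates to 7 characters plus the NUL.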

The strncpy()-pitfall (II)
Source: [ string ][ \x0 ][ data ]
After copying the source into a smaller buffer, the
destination string is not properly terminated any more.
Destination: [ string ][ data with a \x0 somewhere ]
Any subsequent operations which expect the string to
be terminated will work on the data behind our original
string as well.

The strncat()-pitfall (I)
As with strncpy(), strncat() supports size checking,
but it always writes a terminating NUL after the last
byte it appends. If the len parameter does not leave
room for that NUL, one zero byte is written past the
end of the buffer.
If the targeted buffer is the first one declared in
the offending function, that byte overwrites the lowest
byte of the saved frame pointer, which makes it possible
to gain control one function layer outwards.
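
The extra byte can be observed without corrupting anything by appending into a backing array that is larger than the size the code believes it has. This is a sketch under that assumption; `LOGICAL_SIZE` and `append_off_by_one()` are hypothetical names.

```c
#include <string.h>

#define LOGICAL_SIZE 8   /* size the programmer believes the buffer has */

/* Appends `src` using a length that forgets the terminator, and returns the
 * index at which strncat() wrote its NUL. `backing` must hold a string
 * shorter than LOGICAL_SIZE and be larger than LOGICAL_SIZE bytes, so the
 * stray byte lands in memory we own. */
size_t append_off_by_one(char *backing, const char *src)
{
    size_t used = strlen(backing);
    /* Mistake being illustrated: the "- 1" for the terminator is missing. */
    strncat(backing, src, LOGICAL_SIZE - used);
    return strlen(backing);
}
```

If the buffer holds "abc" and a long string is appended, five bytes are copied to indices 3 to 7 and the NUL is written at index 8, one past the logical end. On a real stack frame, that index would be the lowest byte of the saved frame pointer.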

The strncat()-pitfall (II)
Stack (low to high addresses):
[ buffer to which we append ][ saved_EBP ][ saved_EIP ]
• The lowest byte of saved_EBP is set to 0x00
• Function epilogue: mov esp, ebp

The strncat()-pitfall (III)
Stack: [ saved_EBP ][ saved_EIP ]
• Function epilogue: pop ebp

The strncat()-pitfall (IV)
The value in EBP (the frame pointer) is now our modified value !
Stack: [ saved_EIP ]
• Function epilogue: ret

The strncat()-pitfall (V)
Next function epilogue: mov esp, ebp
ESP slides upwards (as its lowest-order byte was
overwritten) into the user-supplied data. We can
now supply a new return address to gain control.
Stack: [ user-supplied data ][ saved_EBP ][ saved_EIP ]
• ESP should point at saved_EBP ...
• ... but it lands inside the user-supplied data

The strncat()-pitfall (VI)
Furthermore, the fact that strncat() has to deal with
dynamic values for the len parameter increases the
danger of signedness misconceptions:
strncpy(buff, userdata, sizeof(buff));
/* fills buff so that strlen(buff) = sizeof(buff) */
strncat(buff, userdata2, sizeof(buff)-strlen(buff)-1);
/* len is pushed to -1, which is 0xFFFFFFFF */
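
The wrap-around of the len expression can be reproduced in isolation. The sketch below only performs the arithmetic; both function names are hypothetical, and the checked variant is one possible repair rather than anything prescribed by the slides.

```c
#include <stddef.h>

/* Computes the length argument the way
 * sizeof(buff) - strlen(buff) - 1 does: in unsigned size_t arithmetic. */
size_t remaining_len(size_t buf_size, size_t current_len)
{
    return buf_size - current_len - 1;   /* wraps if current_len >= buf_size */
}

/* Hypothetical checked variant: returns 0 when no room is left. */
size_t remaining_len_checked(size_t buf_size, size_t current_len)
{
    if (buf_size == 0 || current_len >= buf_size - 1)
        return 0;
    return buf_size - current_len - 1;
}
```

With a 16-byte buffer that already holds 16 unterminated bytes, `remaining_len()` yields the largest value a size_t can hold (0xFFFFFFFF on a 32-bit target), so strncat() is effectively unbounded.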

Cast screwups (I)
void func(char *dnslabel)
{
    char buffer[256];
    char *indx = dnslabel;   /* first byte at *dnslabel is 0x80 = -128 */
    int count;

    count = *indx;           /* gets sign-extended to 0xFFFFFF80 */
    buffer[0] = '\x00';
    /* signed comparison passes */
    while (count != 0 && (count + strlen (buffer)) < sizeof (buffer) - 1)
    {
        strncat (buffer, indx, count);   /* arbitrary length string is appended */
        indx += count;
        count = *indx;
    }
}
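
The sign extension itself can be shown in a few lines. This sketch assumes a two's-complement target with 32-bit int, and uses `signed char` explicitly because the signedness of plain `char` depends on the compiler; the function names are hypothetical.

```c
#include <stddef.h>

/* Widens one length byte the way `count = *indx;` does when char is signed.
 * On two's-complement targets 0x80 becomes -128. */
int widen_signed(unsigned char raw)
{
    signed char c = (signed char)raw;
    return c;   /* sign-extended: 0x80 -> 0xFFFFFF80 with 32-bit int */
}

/* Treats the length byte as unsigned, which is what a length field
 * usually calls for. */
int widen_unsigned(unsigned char raw)
{
    return raw;   /* zero-extended: 0x80 -> 128 */
}
```

Passing the sign-extended value where a size_t is expected turns -128 into a very large length, which is the condition the movsx check in the disassembly looks for.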
Format string vulnerabilities
Any call that passes user-supplied input directly to a
*printf()-family function as the format string is dangerous.
These calls can also be identified by their argument deficiency.
Consider this code:
printf("%s", userdata);   /* static format string */
printf(userdata);         /* argument deficiency */

-- x86 assembly recap --
void *memcpy(void *dest, void *src, size_t n);
Assembly representation (arguments are pushed right to left):
push 4                     ; n
mov eax, unkn_40D278
push eax                   ; src
lea eax, [ebp+var_458]
push eax                   ; dest
call _memcpy

Disassembly: strcpy()/strcat()
The source is variable, not a static string
This call targets a stack buffer

Disassembly: sprintf()/vsprintf()
Target buffer is a stack buffer

Format string containing „%s“
Expanded strings are not static and not fixed in length

Disassembly: The *scanf()
function family
Format string contains „%s“
Data is parsed into stack buffers

Disassembly: The
strncpy()/strncat() pitfall (I)
Copying data into a stack buffer again ...
If the source is larger than n (4000 bytes),
no NUL will be appended

Disassembly: The
strncpy()/strncat() pitfall (II)
The target buffer is only n bytes long

Disassembly: The strncat()
pitfall
Dangerous handling of len parameter

Disassembly: Cast screwups
• Does the function accept a size_t parameter for
copying data into a buffer ? (e.g. strncpy(), strncat(),
fgets())
• Is the size_t parameter a dynamic value and not
hardcoded ?
• Is the size_t parameter at any point loaded using a
movsx instruction (move with sign extend) ?
• Is anything subtracted from the size_t parameter
before it gets passed to the function ?

Disassembly: Format String
vulnerabilities
Argument deficiency
Format string is a dynamic variable


Demonstration of finding
vulnerabilities by manually
auditing binaries

-- BREAK --

Patching the problem away (I)
PE file layout:
• PE File Header
• .text section containing code
• so-called `cave`: zero padding up to the file alignment
(usually 0x200)
• other sections containing data
• so-called `cave`
• ...

Patching the problem away (II)
• .text section containing code: jmp'ing into our code
• `Cave` where we have put our new code: passing control
back to the .text section
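
Locating a cave can be sketched as a scan for the trailing run of zero bytes in a section's raw data. The helper names are hypothetical, and the sketch assumes the raw section bytes have already been read into memory. Trailing zero bytes may also be genuine data, so the result is a candidate, not a guarantee.

```c
#include <stddef.h>

/* Returns the offset at which the trailing run of zero bytes starts in a
 * section's raw data; equals `len` when there is no zero padding. */
size_t cave_start(const unsigned char *raw, size_t len)
{
    size_t start = len;
    while (start > 0 && raw[start - 1] == 0)
        start--;
    return start;
}

/* Number of bytes available in the cave at the end of the section. */
size_t cave_size(const unsigned char *raw, size_t len)
{
    return len - cave_start(raw, len);
}
```

A patch would then place the new code at `cave_start()` and redirect the vulnerable call site to it, provided `cave_size()` is large enough.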

Dealing with runtime
encryption (I)
File layout after scrambling:
• PE File Header (holds the entry point)
• .text section containing code
• .data section containing data
• .rsrc section containing resources
• descrambling code (new entry point)
How the scrambler works:
1. The de-scrambling code is added to the end of the executable
2. The entry point is moved to the descrambler
3. The contents of the file are scrambled

Dealing with runtime
encryption (II)
Steps to undertake:
• Trace through the descrambler until it passes control
back to the application
• Repair the damage done to the executable structure by
the scrambler/descrambler/executable loader
• Dump the memory to disk
• Very time consuming !
• Automated tools exist to do this for many scramblers
(e.g. IceDump)

Automating the scanning for
suspicious sprintf()-calls
Criteria for suspicious sprintf() calls:
• Does the call expand data using a `%s` format
character without size checking ?
• Does the call expand a non-static string
through the `%s` ?
• Does the call suffer from an argument
deficiency ?
• If so, is the format string dynamic or static ?
Demonstration script: sprintf.idc
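
The first criterion can be sketched in C as a check on a format string that has already been recovered from the binary. This is not the sprintf.idc script; `has_unbounded_s()` is a hypothetical helper that treats a `%s` without a precision (such as `%.20s`) as lacking size checking.

```c
#include <ctype.h>

/* Returns 1 if `fmt` contains a %s conversion without a precision,
 * i.e. one that expands a string of arbitrary length. */
int has_unbounded_s(const char *fmt)
{
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')                 /* literal percent sign */
            continue;
        int has_precision = 0;
        /* skip flags and field width (a width does not limit the output) */
        while (*p == '-' || *p == '+' || *p == ' ' || *p == '#' ||
               *p == '*' || isdigit((unsigned char)*p))
            p++;
        if (*p == '.') {               /* precision limits how much %s copies */
            has_precision = 1;
            p++;
            while (*p == '*' || isdigit((unsigned char)*p))
                p++;
        }
        while (*p == 'h' || *p == 'l') /* common length modifiers */
            p++;
        if (*p == '\0')
            break;
        if (*p == 's' && !has_precision)
            return 1;
    }
    return 0;
}
```

A field width such as `%-10s` is deliberately not counted as a limit, because it only pads short strings and never truncates long ones.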

Automating the scanning for
suspicious strncpy()-calls
Criteria for suspicious strncpy() calls:
• Is the size_t parameter larger than or equal to the
size of the target buffer ?
• Does the call copy dynamic data into a stack
buffer ?
Demonstration script: strncpy.idc

Automating the scanning for
format string vulnerabilities (I)
As we will frequently encounter wrapper functions that
implement printf()-like functionality using either
vsprintf() or vsnprintf(), it is desirable to have a script that
can be used for all functions. The data it needs to get from
the auditor is:
1. The address of the function that gets analyzed
2. The proper minimum stack correction of that function
3. The argument number of the format string

Automating the scanning for
format string vulnerabilities (II)
The criteria the script should then apply are:
• Is the stack correction smaller than our supplied
minimum value ?
• Is the format string dynamic or static ?
Demonstration script: format.idc
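
The two criteria can be expressed as a small classification function. This is a sketch, not the format.idc script. It assumes 32-bit cdecl code where the caller cleans up with `add esp, N` and every argument occupies 4 bytes; all names are hypothetical. Compilers that merge the cleanup of several calls would need extra handling.

```c
enum verdict { CALL_OK, CALL_DEFICIENT_STATIC, CALL_DEFICIENT_DYNAMIC };

/* Smallest stack correction of a call that passes at least one argument
 * after the format string. `fmt_arg_index` is 1-based: 1 for printf(),
 * 2 for sprintf(), and so on. Assumes 4-byte arguments. */
unsigned min_correction_for(unsigned fmt_arg_index)
{
    return (fmt_arg_index + 1) * 4;
}

/* Applies the two criteria: is the stack correction below the minimum,
 * and if so, is the format string static or dynamic? */
enum verdict classify_call(unsigned stack_correction,
                           unsigned min_correction,
                           int format_is_static)
{
    if (stack_correction >= min_correction)
        return CALL_OK;
    return format_is_static ? CALL_DEFICIENT_STATIC : CALL_DEFICIENT_DYNAMIC;
}
```

A call such as printf(userdata) has a stack correction of 4 against a minimum of 8 and a dynamic format string, so it is reported as `CALL_DEFICIENT_DYNAMIC`. A deficient call with a static format string is usually harmless and can be ranked lower.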

Reasons why we need to
reconstruct structures
Many applications store data in large structures which are
passed around between functions. The information about
the layout of these structures is lost during the compilation.
This is bad for the reverse engineer for a variety of reasons:
• Without knowing how large target/source buffers are,
it becomes very hard to evaluate the danger posed by a
suspicious construct
• Many overflows happen within structures. Without
knowing what we're overwriting, it becomes hard to see
if a condition is exploitable at all

Demonstration of manual
structure reconstruction
While the manual reconstruction of structures using IDA's
built-in capabilities is great for `real` reverse engineering,
it takes too much time when only looking for suspicious
constructs.
Automated ways to at least reconstruct the structure
member sizes are desirable.

Automated structure
reconstruction
Frequently, we have a pointer to a structure as a local variable
in a function. What we want the script to do is:
• Trace through the entire function and find all places
where this pointer is loaded into a register
• Each time the pointer is loaded, trace the code until the
register is overwritten. Each time anything is referenced
relative to the register, retrieve that value
• Use the retrieved values to add members to a structure,
thus reconstructing accesses to it
Demonstration script: bas_objrec.idc
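
The last step, turning retrieved offsets into structure members, can be sketched as follows. This is not the bas_objrec.idc script: it assumes the register-relative offsets have already been collected, and the names `struct member` and `reconstruct()` are hypothetical. The size of each member is estimated as the gap to the next observed offset, which is only an upper bound.

```c
#include <stddef.h>
#include <stdlib.h>

struct member {
    unsigned offset;   /* offset relative to the structure pointer */
    unsigned size;     /* estimated size; 0 means unknown (last member) */
};

static int cmp_unsigned(const void *a, const void *b)
{
    unsigned x = *(const unsigned *)a, y = *(const unsigned *)b;
    return (x > y) - (x < y);
}

/* Sorts `offsets` in place and writes one member per distinct offset to
 * `out`, which must have room for `n` entries. Returns the member count. */
size_t reconstruct(unsigned *offsets, size_t n, struct member *out)
{
    size_t count = 0;
    qsort(offsets, n, sizeof offsets[0], cmp_unsigned);
    for (size_t i = 0; i < n; i++) {
        if (count > 0 && out[count - 1].offset == offsets[i])
            continue;                     /* same member accessed again */
        out[count].offset = offsets[i];
        out[count].size = 0;
        if (count > 0)                    /* previous member ends here at the latest */
            out[count - 1].size = offsets[i] - out[count - 1].offset;
        count++;
    }
    return count;
}
```

Accesses at offsets 0x0, 0x4, 0x10 and 0x114 would produce four members; the third one spans up to 0x104 bytes, which hints at an embedded buffer worth a closer look.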

Why is this interesting when
auditing IIS ?
Because it consists mostly of OOP code, and OOP code is
notoriously annoying to read in the disassembly.
Now, automated structure reconstruction can be of great
interest when auditing OOP code:
• The more functions we can analyze which access the
same structure, the more exact our reconstruction of that
structure will be
• A class is nothing but a collection of functions which all
work with the same structure

Considerations concerning
class reconstruction
vTable: Method1(...), Method2(...), Method3(...), Method4(...), Method5(...)
Every vTable entry points to a function which accesses the
same structure via the this-pointer. The vTable therefore
gives us a list of functions we can use to reconstruct the
class data layout.

Any Questions ?
