
Analysis and Visualization of Common Packers

HITBSecConf2008 - Kuala Lumpur
Ero Carrera - ero.carrera@gmail.com
Reverse Engineer at zynamics GmbH
Chief Research Officer at VirusTotal

Introduction

An historical perspective
Originally meant to save space by reducing the redundancy in executable file formats
Simply compressed parts or the whole of the executable
Created a new "envelope" around it that restored the original executable and then passed control to it
The decompressing envelope did little more than restore the executable

Evolution of the techniques

Compression provided a trivial degree of obfuscation, but obfuscation nonetheless
It was easy to add additional measures to the decompressing envelope

Overview of the techniques

Destruction of informational components


Import address table
  Simple late reconstruction into an original form
  Construction of new connectivity artifacts between the original code and imported modules
Strings

Anti-debug
Aimed at making tracing hard
Using SEHs triggered by hard-to-handle exceptions
Confusing debuggers by throwing the INTs they use
Calling hard-to-hook low-level APIs/syscalls
Checking for hooks

Anti-environment
VM detection: VMware, VirtualPC, etc.
Techniques aimed against specific tools: OllyDBG, IDA, Softice, etc.

Breaking tools
Tricks detecting, confusing or aimed at crashing some of the most common tools
  IDA
  OllyDBG
  Procdump
  Softice

Anti-analysis
Code obfuscation
  Adding junk code, using opaque predicates
Code transformation
  Virtual machines
Flow obfuscation (SEH, Nanomites)

Tools
Bochs
  Provides a high-level view
  No need to worry about most of the anti-* techniques
Windbg
  Can do kernel-mode debugging, hook syscalls, look deeper than user-mode tools
Inspection of physical/virtual memory
  Memoryze, for the real hardcore

Obfuscation & Anti-Analysis

Basic trickery against analysis algorithms


Most tools will linearly disassemble a chunk of code
Introduce a non-terminal flow-branching instruction (not a ret or jmp)
Make it point later in the code, to the middle of what would be an instruction if disassembling linearly
Result => confusion (the latest IDA, 5.3, has some workarounds against this)
Also: indirect obfuscation through heavy optimization

Example: ASPack (original)

0101F001  60          pusha
0101F002  E803000000  call  near ptr loc_101F007+1
0101F007  E9EB045D45  jmp   near ptr 465EF4F7h

Example: ASPack (fixed)

0101F001  60          pusha
0101F002  E803000000  call  loc_101F008
0101F007  E9          db 0E9h
0101F008  EB04        jmp   short loc_101F00E

Example: Linear Disassembly


0DFE000000  or    eax, 0FEh
3D02000000  cmp   eax, 2
E901000000  jmp   0x1
75B8        jnz   short near ptr 0FFFFFFC9h
F3C001C0    rep rol byte ptr [ecx], 0C0h
C3          ret

Example: Linear Disassembly


0DFE000000  or    eax, 0FEh
3D02000000  cmp   eax, 2
E901000000  jmp   0x1            // HERE
75          db 0x75
HERE:
B8F3C001C0  mov   eax, 0xc001c0f3
C3          ret

Example: Opaque Predicate


0DFE000000  and   eax, 2
3D02000000  cmp   eax, 2
7401        jz    0x01
75B8        jnz   short near ptr 0FFFFFFC6h
F3C001C0    rep rol byte ptr [ecx], 0C0h
C3          ret

Example: Opaque Predicate


0DFE000000  and   eax, 2
3D02000000  cmp   eax, 2
7401        jz    HERE
75          db 0x75
HERE:
B8F3C001C0  mov   eax, 0xc001c0f3
C3          ret
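The difference between the two views of the linear disassembly example can be reproduced with a small sketch. It is not a real disassembler: the `TABLE` below is a hand-written assumption that only knows the handful of opcodes appearing in the example bytes, and the names `decode`, `linear_sweep` and `recursive_traversal` are illustrative. The linear sweep decodes byte after byte and swallows the stray 0x75, while the flow-following traversal honours the `jmp` and lands on the real `mov`.

```python
import struct

# Bytes from the linear disassembly example above.
CODE = bytes.fromhex("0DFE000000" "3D02000000" "E901000000" "75" "B8F3C001C0" "C3")

# opcode -> (mnemonic, length in bytes, kind). Assumption: covers only the
# opcodes in CODE; 0xC0 is sized for a ModRM byte without displacement.
TABLE = {
    0x0D: ("or eax, imm32", 5, "plain"),
    0x3D: ("cmp eax, imm32", 5, "plain"),
    0xB8: ("mov eax, imm32", 5, "plain"),
    0xC0: ("rol byte ptr [ecx], imm8", 3, "plain"),
    0xF3: ("rep", 1, "prefix"),
    0xE9: ("jmp rel32", 5, "jmp"),
    0x75: ("jnz rel8", 2, "jcc"),
    0xC3: ("ret", 1, "ret"),
}

def decode(code, off):
    """Return (mnemonic, length, kind, branch target or None) at offset off."""
    name, length, kind = TABLE[code[off]]
    if kind == "prefix":
        inner_name, inner_len, inner_kind, target = decode(code, off + 1)
        return name + " " + inner_name, 1 + inner_len, inner_kind, target
    target = None
    if kind == "jmp":
        (rel,) = struct.unpack_from("<i", code, off + 1)
        target = off + length + rel
    elif kind == "jcc":
        (rel,) = struct.unpack_from("<b", code, off + 1)
        target = off + length + rel
    return name, length, kind, target

def linear_sweep(code):
    """Decode one instruction after another, ignoring control flow."""
    out, off = [], 0
    while off < len(code):
        name, length, _, _ = decode(code, off)
        out.append((off, name))
        off += length
    return out

def recursive_traversal(code, entry=0):
    """Decode only what is reachable by following branches from entry."""
    seen, work = {}, [entry]
    while work:
        off = work.pop()
        while 0 <= off < len(code) and off not in seen:
            name, length, kind, target = decode(code, off)
            seen[off] = name
            if kind == "ret":
                break
            if kind == "jmp":
                off = target
                continue
            if kind == "jcc":
                work.append(target)
            off += length
    return sorted(seen.items())

for off, name in linear_sweep(CODE):
    print("linear    %2d  %s" % (off, name))
for off, name in recursive_traversal(CODE):
    print("recursive %2d  %s" % (off, name))
```

In the linear listing the byte at offset 15 is decoded as `jnz` and the `mov` never appears; in the recursive listing offset 15 is skipped and offset 16 decodes as `mov eax, imm32`.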

[Figure: an Executable Image is made up of Memory Pages; each Memory Page holds Function Chunks; each chunk is a list of (address, instruction, operands) entries. A Function groups Function Chunks that may sit in different Memory Pages.]

[Figure: Function A and Function B, each drawn as a set of blocks of (address, instruction, operands) entries, with some Shared Blocks belonging to both functions.]

Junk. Polymorphic and static

Junk Code
pusha / popa
Non-standard branching
Junk JMP insertion

Example: Junk Code I (Themida)

018900CE  48          dec    eax
018900CF  60          pusha
018900D0  B9FBDEF000  mov    ecx, 0F0DEFBh
018900D5  50          push   eax
018900D6  9C          pushf
018900D7  E912000000  jmp    loc_18900EE
018900D7  [junk data]

Example: Junk Code II (Themida)

018900EE  loc_18900EE:
018900EE  E90E000000    jmp    loc_1890101
018900EE  [junk data]
01890101  loc_1890101:
01890101  9D            popf
01890102  5E            pop    esi
01890103  61            popa
01890104  0F844E06FA7A  jz     loc_7C830758

Virtual Machines
Visual Basic, Java, Python, Ruby, Perl, .NET
Starforce, VMProtect, x86 Virtualizer, Themida/CodeVirtualizer
At a high level it's a fetch, decode, handle algorithm

[Figure: Standard Code runs directly on the Real CPU; Virtualized Code runs on a Virtual CPU, which in turn runs on the Real CPU.]

[Figure: the Virtual CPU loop. (1) Fetch the virtual CPU opcode at the instruction pointer (e.g. opA reg1, reg2; opB reg2; branchA XYZ). (2) Decode: the decoder looks up the operand in a table and calls the handler. (3) Execute the handler (handler for opA, opB, opC, branchA, ...), which is made of real CPU opcodes. (4) Update the virtual CPU registers: general purpose, instruction pointer, stack pointer.]
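The fetch, decode, handle loop can be illustrated with a toy interpreter. This is a minimal sketch and not the design of any of the protectors named above: the opcode numbers, the four-register file and the names `VM`, `HANDLERS` and `PROGRAM` are all invented for the example.

```python
HANDLERS = {}  # virtual opcode -> (handler function, number of operand bytes)

def handler(opcode, n_operands):
    """Register a function as the handler for one virtual opcode."""
    def register(fn):
        HANDLERS[opcode] = (fn, n_operands)
        return fn
    return register

@handler(0x01, 2)
def op_load(vm, reg, imm):          # reg = imm
    vm.regs[reg] = imm

@handler(0x02, 2)
def op_add(vm, dst, src):           # dst += src
    vm.regs[dst] = (vm.regs[dst] + vm.regs[src]) & 0xFFFFFFFF

@handler(0x03, 1)
def op_dec(vm, reg):                # reg -= 1
    vm.regs[reg] = (vm.regs[reg] - 1) & 0xFFFFFFFF

@handler(0x04, 2)
def op_jnz(vm, reg, target):        # branch if reg != 0
    if vm.regs[reg] != 0:
        vm.vip = target

@handler(0xFF, 0)
def op_halt(vm):
    vm.running = False

class VM:
    def __init__(self, code):
        self.code = code
        self.regs = [0, 0, 0, 0]    # general purpose virtual registers
        self.vip = 0                # virtual instruction pointer
        self.running = True

    def step(self):
        opcode = self.code[self.vip]                        # 1. fetch
        fn, n = HANDLERS[opcode]                            # 2. decode
        operands = self.code[self.vip + 1:self.vip + 1 + n]
        self.vip += 1 + n           # advance first so a branch can override it
        fn(self, *operands)         # 3. execute handler, 4. registers updated

    def run(self):
        while self.running:
            self.step()
        return self.regs

# r0 = 0; r1 = 5; r2 = 3; loop: r0 += r1; r2 -= 1; jnz r2, loop; halt
PROGRAM = bytes([0x01, 0, 0,  0x01, 1, 5,  0x01, 2, 3,
                 0x02, 0, 1,  0x03, 2,  0x04, 2, 9,  0xFF])

print(VM(PROGRAM).run())
```

An analyst looking at such a binary sees only the dispatcher and the handlers as native code; the program logic (here a multiplication by repeated addition) lives in the bytecode.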

Virtual Machine Countermeasures


Rolf Rolles and Boris Lau have already shown that optimization/reduction techniques can help
Translating to an intermediate representation and performing optimization on the code leads to reduced forms
You could use a tool like Peter's Find executable code to discover instruction handlers from a memory dump of the VM

Advanced Packers
Some of the hardest current packers are VMProtect, Themida and Armadillo
They incorporate some complex, custom techniques
Usually commercial product protectors

Armadillo

Armadillo
Double process debugging, debug blocker
Nanomites
Strategic Code Splicing
Armadillo's invalid instructions
  LOCK prefix
  Invalid MOV

[Figure: the parent process debugs the child process. When the child's code reaches an INT 3, the parent catches it and transfers control: look up address, find target, set target in child context, resume child.]

Child's code:

push ebp
mov  ebp, esp
push 0
push 0
call XYZ
cmp  eax, 0
INT 3
. . . .
mov  [UWZ], 0xff
pop  ebp
ret
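The parent's side of this exchange can be modelled without touching a real debugging API. The sketch below is a simulation under stated assumptions: the `Context` class, the `NANOMITES` table layout (jump type, target, size of the replaced jump) and all addresses are hypothetical, and `eip` is taken to hold the address of the INT 3 itself, which glosses over how a real debugger reports breakpoint addresses.

```python
from dataclasses import dataclass

@dataclass
class Context:
    """Simplified stand-in for the child's thread context."""
    eip: int
    zf: bool  # zero flag left behind by the preceding cmp

# Hypothetical table kept by the parent:
# address of the INT 3 -> (original jump type, jump target, size of original jump)
NANOMITES = {
    0x00401010: ("jz", 0x00401040, 2),
    0x00401020: ("jnz", 0x00401080, 6),
    0x00401030: ("jmp", 0x00401000, 5),
}

def parent_handle_breakpoint(ctx, table):
    """Emulate the replaced jump and return the context to resume with."""
    entry = table.get(ctx.eip)                 # look up address
    if entry is None:
        raise LookupError("breakpoint at %#x is not a known nanomite" % ctx.eip)
    kind, target, size = entry                 # find target
    taken = {"jmp": True, "jz": ctx.zf, "jnz": not ctx.zf}[kind]
    ctx.eip = target if taken else ctx.eip + size   # set target in child context
    return ctx                                 # caller resumes the child

# The child executed "cmp eax, 0" with eax == 0, then hit the INT 3 at 0x401010.
resumed = parent_handle_breakpoint(Context(eip=0x00401010, zf=True), NANOMITES)
print(hex(resumed.eip))
```

Without the table the child's code cannot be followed statically, because each INT 3 hides both the condition and the destination of the original jump.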

Themida

Themida's API obfuscation


The general algorithm can be summarized as:
  Retrieve the API's function body
  Perform a basic analysis and disassembly
  Reconstruct the API's function body, inserting junk between each of the real instructions
  Re-assemble functionality: keep the semantics, change the syntax
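The "keep the semantics, change the syntax" step can be shown on a toy instruction set. This is only an illustration of the idea, not Themida's algorithm: the tuple-based instructions, the `JUNK` snippets and the `execute` interpreter are assumptions made for the example, and CPU flags are ignored entirely.

```python
import random

def execute(program, regs):
    """Run a toy program on a copy of regs and return the final registers."""
    regs, stack = dict(regs), []
    for op, *args in program:
        if op == "mov":
            regs[args[0]] = args[1]
        elif op == "add":
            regs[args[0]] = (regs[args[0]] + args[1]) & 0xFFFFFFFF
        elif op == "sub":
            regs[args[0]] = (regs[args[0]] - args[1]) & 0xFFFFFFFF
        elif op == "push":
            stack.append(regs[args[0]])
        elif op == "pop":
            regs[args[0]] = stack.pop()
        elif op == "nop":
            pass
        else:
            raise ValueError("unknown instruction %r" % op)
    return regs

# Junk snippets that leave every register unchanged once they finish.
JUNK = [
    [("push", "eax"), ("pop", "eax")],
    [("add", "ebx", 0x1234), ("sub", "ebx", 0x1234)],
    [("push", "ecx"), ("mov", "ecx", 0xF0DEFB), ("pop", "ecx")],
    [("nop",)],
]

def obfuscate(program, rng):
    """Insert a random junk snippet before every real instruction and at the end."""
    out = []
    for instruction in program:
        out.extend(rng.choice(JUNK))
        out.append(instruction)
    out.extend(rng.choice(JUNK))
    return out

ORIGINAL = [("mov", "eax", 5), ("add", "eax", 7), ("mov", "ebx", 1), ("sub", "eax", 2)]
START = {"eax": 0, "ebx": 0, "ecx": 0}

obfuscated = obfuscate(ORIGINAL, random.Random(2008))
print(len(ORIGINAL), "->", len(obfuscated), "instructions")
print(execute(ORIGINAL, START) == execute(obfuscated, START))
```

The obfuscated listing is longer and looks different on every run with a different seed, yet it leaves the registers in the same final state as the original.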

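Standard Imports
[Figure: an Executable calling into an Imported DLL.]

A function references other code
[Figure: the Executable calls an Exported DLL Function in the Imported DLL, which in turn references an Internal DLL Function.]

Some of the references are kept
[Figure: the Themida protected executable contains an obfuscated copy of the DLL function, which still references the Internal DLL Function inside the Imported DLL.]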

Reconstruction
The algorithm has limitations
  References to other functions within the DLL are kept
  Same for true branches of conditional branches
Those two points can allow us to do API discovery by studying their connectivity
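One way to picture API discovery through connectivity: if an obfuscated copy still calls the same internal DLL functions as the export it was built from, the set of surviving references works as a fingerprint. The sketch below assumes those reference sets have already been extracted; the export names, the addresses and the use of Jaccard similarity as the score are all choices made for this example, not something taken from the slides.

```python
def jaccard(a, b):
    """Similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def identify_api(obfuscated_refs, export_refs):
    """Return (best matching export name, score) for a set of kept references."""
    scores = {name: jaccard(obfuscated_refs, refs) for name, refs in export_refs.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical data: internal DLL addresses referenced by each exported function.
EXPORT_REFS = {
    "ExportA": {0x7C801A28, 0x7C80E2A4},
    "ExportB": {0x7C809BD7},
    "ExportC": {0x7C801A28, 0x7C810C1E, 0x7C80B6A1},
}

# References into the DLL found inside one obfuscated function body.
kept_refs = {0x7C801A28, 0x7C80E2A4}

print(identify_api(kept_refs, EXPORT_REFS))
```

Exports that reference no internal code, or that share the same references, cannot be told apart this way, so a score like this would only narrow down the candidates.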

Themida's obfuscation
Adds lots of branching and junk
Keeps few "real" instructions per obfuscated block
IDA can easily deal with the branching
  Although bogus calls break IDA's analysis and lead to broken obfuscated functions
Some scripting can make this look better

Current state
Packing vs unpacking
  Packing is not always a symmetric process; sometimes it can't be undone perfectly
  You won't get the original process back
Can it be done generically?
  In some cases the answer is "mostly" yes
  You will almost always be able to obtain code close to its original form

Recent techniques
skape documented an elegant trick in Uninformed vol. 10 a few weeks ago
Attacks a basic heuristic used by most generic unpackers
  Tracking execution transfer to dirty memory
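The heuristic under attack can be written down in a few lines. This sketch assumes a tracker that is told about every write and every execution transfer and that keys its bookkeeping on virtual page numbers; the class name, the page size constant and the two addresses are illustrative, with 0x00900000 standing in for a second virtual view of the same physical memory.

```python
PAGE_SIZE = 0x1000

class DirtyPageTracker:
    """Flags execution inside virtual pages that were written to earlier."""

    def __init__(self):
        self.dirty_pages = set()

    def on_write(self, virtual_address):
        self.dirty_pages.add(virtual_address // PAGE_SIZE)

    def on_execute(self, virtual_address):
        # True means "execution reached dirty memory": probably unpacked code.
        return virtual_address // PAGE_SIZE in self.dirty_pages

tracker = DirtyPageTracker()

# The unpacking stub writes the decompressed code through one virtual range.
for offset in range(0, 64, 4):
    tracker.on_write(0x00400000 + offset)

hit_same_view = tracker.on_execute(0x00400000)   # same range: detected
hit_alias_view = tracker.on_execute(0x00900000)  # hypothetical alias: missed

print(hit_same_view, hit_alias_view)
```

Because the tracker only knows virtual addresses, code written through one range and run through another never looks dirty to it.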

[Figure: a timeline (TIME) of WRITE operations to a virtual address range, followed by two diagrams in which the MMU maps the same Physical Memory into two virtual address ranges: the code is written (WRITE) through one range and run (EXECUTE) through the other.]

Countermeasures
Windbg can see the mappings from virtual to physical
  Not too hard to spot double-mapped regions
Bochs and other low-level emulators can easily do it as well
Requires kernel-mode access or higher
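Once the virtual-to-physical mappings are visible, spotting double-mapped regions amounts to grouping virtual pages by the physical frame behind them. The sketch assumes the mappings have already been dumped into a dictionary by some kernel-level tool; the `page_table` contents and the function name are made up for the example.

```python
from collections import defaultdict

def find_double_mappings(page_table):
    """Return {physical frame: [virtual pages]} for frames mapped more than once.

    page_table maps virtual page number -> physical frame number for one process.
    """
    by_frame = defaultdict(list)
    for virtual_page, frame in page_table.items():
        by_frame[frame].append(virtual_page)
    return {frame: sorted(pages) for frame, pages in by_frame.items() if len(pages) > 1}

# Hypothetical dump: virtual pages 0x00400 and 0x00900 share physical frame 0x1A2.
page_table = {
    0x00400: 0x1A2,
    0x00401: 0x1A3,
    0x00900: 0x1A2,
    0x7C800: 0x2F0,
}

for frame, pages in find_double_mappings(page_table).items():
    print("frame %#x mapped at pages %s" % (frame, [hex(p) for p in pages]))
```

A hit is only a lead: a process can have legitimate reasons to map the same memory twice, so each reported frame still needs a look at what is written and executed through it.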

References
Reversing: Secrets of Reverse Engineering, Eldad Eilam
Déprotection semi-automatique de binaire, Yoann Guillot & Alexandre Gazet
http://metasm.cr0.org/SSTIC08-article-Guillot_GazetDeprotection_Semi_Automatique_Binaire.pdf

Virtual Machine Threats, Peter Ferrie
http://www.symantec.com/avcenter/reference/Virtual_Machine_Threats.pdf

References II
A Quick Survey on Automatic Unpacking Techniques, Daniel Reynaud
http://indefinitestudies.wordpress.com/2008/09/25/automatic-unpacking/

Using dual-mappings to evade automated unpackers, skape
http://www.uninformed.org/?v=10&a=1

Dealing with Virtualization packer, Boris Lau
http://www.datasecurity-event.com/uploads/boris_lau_virtualization_obfs.pdf

References III
Rolf Rolles' blog on OpenRCE
https://www.openrce.org/blog/browse/RolfRolles

Oreans Themida/CodeVirtualizer
http://www.oreans.com

ReWolf's x86 Virtualizer
http://www.openrce.org/blog/view/847/x86_Virtualizer_-_source_code

References IV
VMProtect
http://www.vmprotect.ru/

Deroko's Nanomites write-up
http://www.phearless.org/i3/Nanomites_And_Misc_Stuff.txt

Memoryze, Mandiant
http://www.mandiant.com/software/memoryze.htm

Thanks Dhillon & the HitB crew! Q&A
