
The syscall number is stored in EAX. The arguments go in these registers:

EBX - first argument
ECX - second argument
EDX - third argument
ESI - fourth argument
EDI - fifth argument
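
The register assignment above can be sketched as a small lookup. This is only an illustration: the helper name assign_registers is hypothetical, nothing is actually executed as a system call, and the example assumes the 32-bit Linux convention in which write has syscall number 4. The buffer address is made up.

```python
# Hypothetical helper: map a system call onto the registers listed above.
# It only builds a dictionary; no system call is made.
ARG_REGISTERS = ["ebx", "ecx", "edx", "esi", "edi"]


def assign_registers(syscall_number, *args):
    """Return {register: value} for a syscall with up to five arguments."""
    if len(args) > len(ARG_REGISTERS):
        raise ValueError("this sketch covers at most five arguments")
    registers = {"eax": syscall_number}
    # Arguments are assigned in order: first -> ebx, second -> ecx, ...
    registers.update(zip(ARG_REGISTERS, args))
    return registers


# write(fd=1, buf=0x804a000, count=13); 4 is assumed to be the number of
# write in the 32-bit Linux syscall table, and the address is invented.
print(assign_registers(4, 1, 0x804A000, 13))
```

Registers that a call does not need (here ESI and EDI) are simply left out of the result.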

Sometimes an assembly program fails to link. In that case, name the
entry point explicitly by running "ld -e _start -o abc abc.o" in the
terminal.

STACK CFI records


STACK CFI ("Call Frame Information") records describe how to walk
the stack when execution is at a given machine instruction. These
records take one of three forms:
STACK CFI INIT address size register1: expression1 register2: expression2 ...
STACK CFI address register1: expression1 register2: expression2 ...
STACK CFI END address
For example:
STACK CFI INIT 804c4b0 40 .cfa: $esp 4 + $eip: .cfa 4 - ^
STACK CFI 804c4b1 .cfa: $esp 8 + $ebp: .cfa 8 - ^
The address and size fields are hexadecimal numbers. Each registerN is
the name of a register or pseudoregister. Each expression is a
Breakpad postfix expression, which may contain spaces, but never
ends with a colon. (The appropriate register names for a given
architecture are determined when STACK CFI records are first enabled
for that architecture, and should be documented in the
appropriate stackwalker_architecture.cc source file.)
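
As a rough illustration of the field layout just described, the sketch below splits a record into its address, optional size, and register rules. It is not Breakpad's parser: the function name parse_stack_cfi and the dictionary layout are invented for this example, and it covers only the INIT and plain forms.

```python
def parse_stack_cfi(line):
    """Split a STACK CFI record into address, size and register rules."""
    tokens = line.split()
    if tokens[:2] != ["STACK", "CFI"]:
        raise ValueError("not a STACK CFI record")
    tokens = tokens[2:]
    record = {"kind": "delta", "size": None}
    if tokens[0] == "INIT":
        record["kind"] = "init"
        record["address"] = int(tokens[1], 16)  # hexadecimal field
        record["size"] = int(tokens[2], 16)     # hexadecimal field
        tokens = tokens[3:]
    else:
        record["address"] = int(tokens[0], 16)
        tokens = tokens[1:]
    rules = {}
    name = None
    for token in tokens:
        if token.endswith(":"):
            # A token ending in a colon starts a new register rule.
            name = token[:-1]
            rules[name] = []
        elif name is None:
            raise ValueError("expression token before any register name")
        else:
            rules[name].append(token)
    record["rules"] = {reg: " ".join(expr) for reg, expr in rules.items()}
    return record


print(parse_stack_cfi("STACK CFI INIT 804c4b0 40 .cfa: $esp 4 + $eip: .cfa 4 - ^"))
print(parse_stack_cfi("STACK CFI 804c4b1 .cfa: $esp 8 + $ebp: .cfa 8 - ^"))
```

Because expression tokens never end with a colon, a trailing colon is enough to tell a register name from the expression that follows it.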
STACK CFI records describe, at each machine instruction in a given
function, how to recover the values the machine registers had in the
function's caller. Naturally, some registers' values are simply lost, but
there are three cases in which they can be recovered:
• You can always recover the program counter, because that's the
function's return address. If the function is ever going to return,
the PC must be saved somewhere.
• You can always recover the stack pointer. The function is
responsible for popping its stack frame before it returns to the
caller, so it must be able to restore this, as well.
• You should be able to recover the values of callee-saves
registers. These are registers whose values the callee must
preserve, either by saving them in its own stack frame before
using them and re-loading them before returning, or by not using
them at all.
(As an exception, note that functions which never return may not save
any of this data. It may not be possible to walk the stack past such
functions' stack frames.)
Given rules for recovering the values of a function's caller's registers,
we can walk up the stack. Starting with the current set of registers ---
the PC of the instruction we're currently executing, the current stack
pointer, etc. --- we use CFI to recover the values those registers had in
the caller of the current frame. This gives us a PC in the caller whose
CFI we can look up; we apply the process again to find that function's
caller; and so on.
Concretely, CFI records represent a table with a row for each machine
instruction address and a column for each register. The table entry for a
given address and register contains a rule describing how, when the PC
is at that address, to restore the value that register had in the caller.
There are some special columns:
• A column named .cfa, for "Canonical Frame Address", tells how
to compute the base address of the frame; other entries can refer
to the CFA in their rules.
• A column named .ra represents the return address.
For example, suppose we have a machine with 32-bit registers, one-byte
instructions, a stack that grows downwards, and an assembly language
that resembles C. Suppose further that we have a function whose machine
code looks like this:
func:                          ; entry point; return address at sp
func+0:  sp -= 16              ; allocate space for stack frame
func+1:  sp[12] = r0           ; save 4-byte r0 at sp+12
...                            ; stuff that doesn't affect stack
func+10: sp -= 4; *sp = x      ; push some 4-byte x on the stack
...                            ; stuff that doesn't affect stack
func+20: r0 = sp[16]           ; restore saved r0
func+21: sp += 20              ; pop whole stack frame
func+22: pc = *sp; sp += 4     ; pop return address and jump to it
The following table would describe the function above:
code address   .cfa     r0        r1   ...   .ra
func+0         sp                            cfa[0]
func+1         sp+16                         cfa[0]
func+2         sp+16    cfa[-4]              cfa[0]
func+11        sp+20    cfa[-4]              cfa[0]
func+21        sp+20                         cfa[0]
func+22        sp                            cfa[0]
Some things to note here:
• Each row describes the state of affairs before executing the
instruction at the given address. Thus, the row for func+0
describes the state before we execute the first instruction, which
allocates the stack frame. In the next row, the formula for
computing the CFA has changed, reflecting the allocation.
• The other entries are written in terms of the CFA; this allows them
to remain unchanged as the stack pointer gets bumped around.
For example, to find the caller's value for r0 at func+2, we would
first compute the CFA by adding 16 to the sp, and then subtract
four from that to find the address at which r0 was saved.
• Although the example doesn't show this, most calling conventions
designate "callee-saves" and "caller-saves" registers. The callee
must restore the values of "callee-saves" registers before
returning (if it uses them at all), whereas the callee is free to use
"caller-saves" registers without restoring their values. A function
that uses caller-saves registers typically does not save their
original values at all; in this case, the CFI marks such registers'
values as "unrecoverable".
• Exactly where the CFA points in the frame --- at the return
address? below it? At some fixed point within the frame? --- is a
question of definition that depends on the architecture and ABI in
use. But by definition, the CFA remains constant throughout the
lifetime of the frame. It's up to architecture-specific code to know
what significance to assign the CFA, if any.
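
The notes above can be checked with a small simulation of the toy machine. This is a sketch under invented values: the starting stack pointer 0x1000, the return address 0xAAAA, the caller's r0 value 0x1234, and the helper name recover are all assumptions made for illustration.

```python
# Toy machine state: memory is a dict from address to 4-byte value.
memory = {}
sp = 0x1000
memory[sp] = 0xAAAA        # return address, at sp on entry to func
r0 = 0x1234                # the caller's value of r0

sp -= 16                   # func+0: allocate the stack frame
memory[sp + 12] = r0       # func+1: save r0 at sp+12
r0 = 99                    # the callee is now free to clobber r0


def recover(sp, cfa_offset):
    """Apply the table rules: .cfa = sp+offset, r0 = cfa[-4], .ra = cfa[0]."""
    cfa = sp + cfa_offset
    return {"cfa": cfa, "r0": memory[cfa - 4], "ra": memory[cfa]}


at_func_2 = recover(sp, 16)    # row for func+2: .cfa is sp+16

sp -= 4                    # func+10: push some 4-byte x
memory[sp] = 7
at_func_11 = recover(sp, 20)   # row for func+11: .cfa is sp+20

print(at_func_2)
print(at_func_11)
```

Both rows yield the same CFA and the same recovered values even though the stack pointer moved in between, which is the reason the r0 and .ra rules are written in terms of the CFA.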
To save space, the most common type of CFI record only mentions the
table entries at which changes take place. So for the above, the CFI
data would only actually mention the non-blank entries here:
insn      cfa      r0        r1   ...   ra
func+0    sp                            cfa[0]
func+1    sp+16
func+2             cfa[-4]
func+11   sp+20
func+21            r0
func+22   sp
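
To see how the sparse form carries the same information as the full table, the sketch below replays the changes in address order. The names changes, expand, and rules_at are hypothetical; the rule strings are copied from the table, with the special columns written as .cfa and .ra.

```python
# Only the non-blank entries, keyed by offset from func.
changes = [
    (0,  {".cfa": "sp", ".ra": "cfa[0]"}),
    (1,  {".cfa": "sp+16"}),
    (2,  {"r0": "cfa[-4]"}),
    (11, {".cfa": "sp+20"}),
    (21, {"r0": "r0"}),       # r0 holds the caller's value again
    (22, {".cfa": "sp"}),
]


def expand(changes):
    """Rebuild one full row per listed offset by carrying rules forward."""
    current = {}
    rows = []
    for offset, delta in changes:
        current.update(delta)
        rows.append((offset, dict(current)))
    return rows


def rules_at(rows, offset):
    """Rules in force at a code offset: the last row at or before it."""
    in_force = None
    for row_offset, rules in rows:
        if row_offset <= offset:
            in_force = rules
    return in_force


rows = expand(changes)
for offset, rules in rows:
    print(f"func+{offset}", rules)
```

An offset that falls between two listed rows, such as func+5, inherits the rules of the nearest earlier row.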
A STACK CFI INIT record indicates that, at the machine instruction
at address, belonging to some function, the value that registerN had in
that function's caller can be recovered by evaluating expressionN. The
values of any callee-saves registers not mentioned are assumed to be
unchanged. (STACK CFI records never mention caller-saves registers.)
These rules apply starting at address and continue up to, but not
including, the address given in the next STACK CFI record.
The size field is the total number of bytes of machine code covered by
this record and any subsequent STACK CFI records (until the
next STACK CFI INIT record). The address field is relative to the
module's load address.
A STACK CFI record (no INIT) is the same, except that it mentions
only those registers whose recovery rules have changed from the
previous CFI record. There must be a prior STACK CFI
INIT or STACK CFI record in the symbol file. The address field of this
record must be greater than that of the previous record, and it must not
be at or beyond the end of the range given by the most recent STACK
CFI INIT record. The address is relative to the module's load address.
Each expression is a breakpad-style postfix expression. Expressions
may contain spaces, but their tokens may not end with colons. When an
expression mentions a register, it refers to the value of that register in
the callee, even if a prior name/expression pair gives that register's
value in the caller. The exception is .cfa, which refers to the canonical
frame address computed by the .cfa rule in force at the current
instruction.
The special expression .undef indicates that the given register's value
cannot be recovered.
The register names preceding the expressions are always followed by
colons. The expressions themselves never contain tokens ending with
colons.
There are two special register names:
• .cfa ("Canonical Frame Address") is the base address of the
stack frame. Other registers' rules may refer to this. If no rule is
provided for the stack pointer, the value of .cfa is the caller's
stack pointer.
• .ra is the return address. This is the value of the restored
program counter. We use .ra instead of the architecture-specific
name for the program counter.
The Breakpad stack walker requires that there be rules in force
for .cfa and .ra at every code address from which it unwinds. If those
rules are not present, the stack walker will ignore the STACK CFI data,
and try to use a different strategy.
So the CFI for the example function above would be as follows,
if func were at address 0x1000 (relative to the module's load address):
STACK CFI INIT 1000 17 .cfa: $sp .ra: .cfa ^
STACK CFI 1001 .cfa: $sp 16 +
STACK CFI 1002 $r0: .cfa 4 - ^
STACK CFI 100b .cfa: $sp 20 +
STACK CFI 1015 $r0: $r0
STACK CFI 1016 .cfa: $sp
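
A minimal evaluator for rules of this shape might look like the sketch below. It is not Breakpad's implementation. It assumes integer literals are decimal (consistent with `$sp 16 +` standing for sp+16 in the example), supports only the +, - and ^ (dereference) operators that appear in this document, and reuses invented memory contents. The names evaluate and unwind are hypothetical.

```python
def evaluate(expression, registers, memory, cfa=None):
    """Evaluate a postfix expression against the callee's registers."""
    stack = []
    for token in expression.split():
        if token == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif token == "-":
            b = stack.pop()
            a = stack.pop()
            stack.append(a - b)
        elif token == "^":
            # Dereference: replace an address with the value stored there.
            stack.append(memory[stack.pop()])
        elif token == ".cfa":
            if cfa is None:
                raise ValueError(".cfa used before it was computed")
            stack.append(cfa)
        elif token.startswith("$"):
            stack.append(registers[token])   # value in the callee
        else:
            stack.append(int(token))         # assumed decimal literal
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


def unwind(rules, registers, memory):
    """Compute .cfa first, then every other rule in terms of it."""
    cfa = evaluate(rules[".cfa"], registers, memory)
    caller = {".cfa": cfa}
    for name, expression in rules.items():
        if name != ".cfa":
            caller[name] = evaluate(expression, registers, memory, cfa)
    return caller


# Rules in force from 0x1002 up to 0x100b, with made-up machine state.
rules = {".cfa": "$sp 16 +", ".ra": ".cfa ^", "$r0": ".cfa 4 - ^"}
registers = {"$sp": 0xFF0, "$r0": 99}
memory = {0x1000: 0xAAAA, 0xFFC: 0x1234}
print(unwind(rules, registers, memory))
```

The .cfa rule is evaluated before the others because every other rule may refer to it, while plain register names always read the callee's values.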

Tell the linker to put this into the executable's .text section:
.text

Export main as an external symbol:


.globl _main

Define the main function itself:


_main:
LFB2:

Save the previous frame pointer:


pushq %rbp
LCFI0:

Set up a new frame pointer:


movq %rsp, %rbp
LCFI1:

Restore the old frame pointer and return to caller:


leave
ret

The following directives set up an .eh_frame section, containing
information required by the C++ runtime for exception handling:
LFE2:
.section __TEXT,__eh_frame,coalesced,no_toc+strip_static_syms+live_support
This is the Common Information Entry table:
EH_frame1:

It starts with a length, calculated from the difference of
the LSCIE1 and LECIE1 labels:
.set L$set$0,LECIE1-LSCIE1
.long L$set$0

(The .long, .byte, .ascii and .quad directives cause a value of the
appropriate type to be emitted directly by the assembler.) Then
follows the CIE table itself:
LSCIE1:
.long 0x0
.byte 0x1
.ascii "zR\0"
.byte 0x1
.byte 0x78
.byte 0x10
.byte 0x1
.byte 0x10
.byte 0xc
.byte 0x7
.byte 0x8
.byte 0x90
.byte 0x1
.align 3
LECIE1:

Another external symbol, this one for the main function's Frame
Description Entry (still part of the exception handling information):
.globl _main.eh
_main.eh:

Again, the FDE starts with a length:


LSFDE1:
.set L$set$1,LEFDE1-LASFDE1
.long L$set$1
...and continues with the rest of the FDE table:
LASFDE1:
.long LASFDE1-EH_frame1
.quad LFB2-.
.set L$set$2,LFE2-LFB2
.quad L$set$2
.byte 0x0
.byte 0x4
.set L$set$3,LCFI0-LFB2
.long L$set$3
.byte 0xe
.byte 0x10
.byte 0x86
.byte 0x2
.byte 0x4
.set L$set$4,LCFI1-LCFI0
.long L$set$4
.byte 0xd
.byte 0x6
.align 3
LEFDE1:
.subsections_via_symbols
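
The trailing bytes of the CIE and FDE are call frame instructions. The sketch below decodes them using the DWARF opcode values as I understand them; the listing itself does not explain the bytes, so treat the interpretation as an assumption. It also assumes every operand fits in one byte, a data alignment factor of -8 (the 0x78 byte read as a signed LEB128 value), and advance deltas of 1 and 3 bytes for the pushq and movq instructions. The function name decode_cfa is hypothetical.

```python
def decode_cfa(data, data_align=-8):
    """Decode the few DW_CFA opcodes that appear in the listing above."""
    out = []
    i = 0
    while i < len(data):
        op = data[i]
        i += 1
        if op & 0xC0 == 0x80:
            # DW_CFA_offset: register number is in the low six bits.
            reg = op & 0x3F
            out.append(f"offset r{reg} at cfa{data[i] * data_align:+d}")
            i += 1
        elif op == 0x00:
            out.append("nop")
        elif op == 0x04:
            # DW_CFA_advance_loc4: four-byte little-endian code delta.
            delta = int.from_bytes(data[i:i + 4], "little")
            out.append(f"advance_loc4 {delta}")
            i += 4
        elif op == 0x0C:
            out.append(f"def_cfa r{data[i]} offset {data[i + 1]}")
            i += 2
        elif op == 0x0D:
            out.append(f"def_cfa_register r{data[i]}")
            i += 1
        elif op == 0x0E:
            out.append(f"def_cfa_offset {data[i]}")
            i += 1
        else:
            raise ValueError(f"opcode {op:#x} is not covered by this sketch")
    return out


# Last five bytes of the CIE: 0xc 0x7 0x8 0x90 0x1
cie = bytes([0x0C, 0x07, 0x08, 0x90, 0x01])
# FDE instructions, with the symbolic deltas replaced by assumed sizes.
fde = bytes([0x04, 1, 0, 0, 0, 0x0E, 0x10, 0x86, 0x02,
             0x04, 3, 0, 0, 0, 0x0D, 0x06])
print(decode_cfa(cie))
print(decode_cfa(fde))
```

In the x86-64 DWARF register numbering, r6 is %rbp, r7 is %rsp and r16 is the return address, so the decoded FDE tracks the pushq %rbp and movq %rsp, %rbp steps of the prologue.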

A data segment is one of the sections of a program in an object file or
in memory. It contains the global variables and static variables that
are initialized by the programmer. It has a fixed size, since all of the
data in this section is set by the programmer before the program is
loaded. However, it is not read-only, since the values of the variables
can be altered at runtime. This is in contrast to the rodata (constant,
read-only data) section, as well as the code segment (also known as
the text segment).
In the PC architecture there are four basic read-write memory regions in
a program: Stack, Data, BSS, and Heap. Sometimes the data, BSS,
and heap areas are collectively referred to as the "data segment".
To summarize, the RAM area contains:
• Data Segment (Data + BSS + Heap)
• Stack
In detail:
• The Data area contains global and static variables used by the
program that are initialized. This segment can be further classified
into an initialized read-only area and an initialized read-write area.
For instance, the string defined by char s[] = "hello world"; in C and
a C statement like int debug=1; outside main would be stored in the
initialized read-write area. A C statement like char *string =
"hello world"; causes the string literal "hello world" to be stored in
the initialized read-only area and the character pointer variable
string in the initialized read-write area.
Ex: static int i = 10; and a global int i = 10; are both stored in the
data segment.
• The BSS segment, also known as uninitialized data, starts at the
end of the data segment and contains all uninitialized global
variables and static variables, which are initialized to zero by
default. For instance, a variable declared static int i; would be
contained in the BSS segment.
• The heap area begins at the end of the BSS segment and grows
to larger addresses from there. The heap area is managed
by malloc, realloc, and free, which may use the brk and sbrk system
calls to adjust its size (although the use of brk/sbrk and a
single "heap area" is not required to fulfil the contract of
malloc/realloc/free; they may also be implemented using mmap to
reserve potentially non-contiguous regions of virtual memory in the
process's virtual address space). The heap area is shared by all
shared libraries and dynamically loaded modules in a process.
• The stack is a LIFO structure, typically located in the higher parts
of memory. It usually "grows down" with every register, immediate
value or stack frame being added to it. A stack frame consists at
minimum of a return address.

Examining memory
You can use the command x (for "examine") to examine memory in any of
several formats, independently of your program's data types.
x/nfu addr
x addr
x
Use the x command to examine memory.
n, f, and u are all optional parameters that specify how much memory to
display and how to format it; addr is an expression giving the address where
you want to start displaying memory. If you use defaults for nfu, you need
not type the slash `/'. Several commands set convenient defaults for addr.
n, the repeat count
The repeat count is a decimal integer; the default is 1.
It specifies how much memory (counting by units u) to
display.
f, the display format
The display format is one of the formats used
by print, `s' (null-terminated string), or `i' (machine
instruction). The default is `x' (hexadecimal) initially.
The default changes each time you use either x or print.
u, the unit size
The unit size is any of
b   Bytes.
h   Halfwords (two bytes).
w   Words (four bytes). This is the initial default.
g   Giant words (eight bytes).
Each time you specify a unit size with x, that size becomes the default
unit the next time you use x. (For the `s' and `i' formats, the unit
size is ignored and is normally not written.)
addr, starting display address
addr is the address where you want GDB to begin
displaying memory. The expression need not have a
pointer value (though it may); it is always interpreted
as an integer address of a byte of memory. See
section Expressions, for more information on
expressions. The default for addr is usually just after
the last address examined--but several other
commands also set the default address: info
breakpoints (to the address of the last breakpoint
listed), info line (to the starting address of a line),
and print (if you use it to display a value from memory).
For example, `x/3uh 0x54320' is a request to display three halfwords
(h) of memory, formatted as unsigned decimal integers (`u'), starting at
address 0x54320. `x/4xw $sp' prints the four words (`w') of memory
above the stack pointer (here, `$sp'; see section Registers) in hexadecimal
(`x').
Since the letters indicating unit sizes are all distinct from the letters
specifying output formats, you do not have to remember whether unit size or
format comes first; either order works. The output
specifications `4xw' and `4wx' mean exactly the same thing. (However,
the count n must come first; `wx4' does not work.)
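
The ordering rule can be illustrated with a small parser for the text after the slash. This is a sketch, not GDB's own parsing code: the name parse_x_spec is hypothetical, the format letters are limited to a set assumed here (x, d, u, o, t, a, c, f, s, i), and the initial defaults x and w are used when a letter is omitted.

```python
import re

FORMATS = set("xduotacfsi")   # assumed subset of format letters
UNITS = set("bhwg")           # unit sizes listed above


def parse_x_spec(spec, default_format="x", default_unit="w"):
    """Parse the 'nfu' part of x/nfu into (count, format, unit)."""
    # The count, if any, must come first; letters follow in either order.
    match = re.fullmatch(r"(\d*)([a-z]*)", spec)
    if match is None:
        raise ValueError(f"bad specification: {spec!r}")
    count = int(match.group(1)) if match.group(1) else 1
    fmt, unit = default_format, default_unit
    for letter in match.group(2):
        if letter in UNITS:
            unit = letter
        elif letter in FORMATS:
            fmt = letter
        else:
            raise ValueError(f"unknown letter {letter!r} in {spec!r}")
    return count, fmt, unit


print(parse_x_spec("3uh"))
print(parse_x_spec("4xw") == parse_x_spec("4wx"))
```

Because the unit letters and format letters do not overlap, each letter can be classified on its own, which is why their relative order does not matter.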
Even though the unit size u is ignored for the formats `s' and `i', you
might still want to use a count n; for example, `3i' specifies that you want
to see three machine instructions, including any operands. The
command disassemble gives an alternative way of inspecting machine
instructions; see Source and machine code.
All the defaults for the arguments to x are designed to make it easy to
continue scanning memory with minimal specifications each time you use x.
For example, after you have inspected three machine instructions
with `x/3i addr', you can inspect the next seven with just `x/7'. If
you use RET to repeat the x command, the repeat count n is used again; the
other arguments default as for successive uses of x.
The addresses and contents printed by the x command are not saved in the
value history because there is often too much of them and they would get in
the way. Instead, GDB makes these values available for subsequent use in
expressions as values of the convenience variables $_ and $__. After
an x command, the last address examined is available for use in expressions
in the convenience variable $_. The contents of that address, as examined,
are available in the convenience variable $__.
If the x command has a repeat count, the address and contents saved are
from the last memory unit printed; this is not the same as the last address
printed if several units were printed on the last line of output.
