Binary Exploitation 2a
Return Oriented Programming & ASLR
Milan Patnaik
Indian Institute of Technology Madras
Where Are We ?
2
Where Are We ?
• Executable, No Canaries and No ASLR.
• Overwrite return address.
• Shellcode in stack.
3
Where Are We ?
• Executable, No Canaries and No ASLR.
• Overwrite return address.
• Shellcode in stack.
• Non Executable, No Canaries and No ASLR.
• Overwrite the return address.
• Return to libc restricted by system().
4
Where Are We ?
• Executable, No Canaries and No ASLR.
• Overwrite return address.
• Shellcode in stack.
• Non Executable, No Canaries and No ASLR.
• Overwrite the return address.
• Return to libc restricted by system().
• Non Executable, No Canaries and No ASLR.
• Overwrite return address.
• Return Oriented Programming.
• Execute arbitrary code.
5
Return Oriented Programming
(ROP)
6
Return Oriented Programming
7
Return Oriented Programming Attacks
• Discovered by Hovav Shacham of Stanford University
• Subverts execution.
– As with the regular ret-2-libc, can be used with non executable stacks since the
instructions can be legally executed.
– Unlike ret-2-libc does not require to execute functions in libc (can execute any
arbitrary code).
8
The Geometry of Innocent Flesh on the Bone: Return-into-libc without
Function Calls on the x86
Stack : Function Call
EBP
Parameters
for function
return Address
Locals of function
prev frame pointer
push $3
push $2
push $1
Stack
push %ebp
movl %esp, %ebp
sub $20, %esp
%ebp : Frame Pointer
In main In function
ESP
ESP
ESP
ESP
ESP
ESP
%esp : Stack Pointer
9
Call instruction has 2 steps:
 Push the contents pointed to by EIP.
 Decrease ESP by 4 (32bit machine)
Stack : Function Return
EBP
Parameters
for function
return Address
Locals of function
prev frame pointer
Stack
%ebp : Frame Pointer
ESP
ESP
ESP
ESP
ESP
ESP
%esp : Stack Pointer
10
Ret instruction has 2 steps:
 Pops the contents pointed to by
ESP into EIP
 Increment ESP by 4 (32bit machine)
movl %ebp, %esp
In function
push $3
push $2
push $1
In main
Action by leave instruction
Target Payload
Lets say this is the payload needed to be executed by an attacker.
Suppose there is a function in libc, which has exactly this sequence of
instructions … then we are done.. we just need to subvert execution
to the function
11
Target Payload
Lets say this is the payload needed to be executed by an attacker.
Suppose there is a function in libc, which has exactly this sequence of
instructions … then we are done.. we just need to subvert execution
to the function
What if such a function does not exist?
If you can’t find it then build it
12
Step 1: Find Gadgets
• Find gadgets.
• A gadget is a short sequence of instructions followed by a return.
• Useful instructions : should not transfer control outside the gadget.
• This is a pre-processing step by statically analysing the libc library.
useful instruction(s)
ret
13
Step 2: Stitching
• Stitch gadgets so that the payload is built
Program Binary
movl %esi, 0x8(%esi)
ret
G1
movb $0x0, 0x7(%esi)
ret G2
movb $0x0, 0xc(%esi)
ret G3
movl $0xb, %eax
ret
G4
14
Step 3: Construct the Stack
15
xxx
xxx
xxx
AG1
AG2
AG3
AG4
xxx
buffer
xxx
Return Address
Program Binary
movl %esi, 0x8(%esi)
ret
G1
movb $0x0, 0x7(%esi)
ret G2
movb $0x0, 0xc(%esi)
ret G3
movl $0xb, %eax
ret
G4
Program Stack
AGi: Address of
Finding Gadgets
• Static analysis of libc
• To find
1. A set of instructions that end in a ret (0xc3).
The instructions can be intended (put in by the compiler) or
unintended.
2. Besides ret, none of the instructions transfer control
out of the gadget.
16
Intended vs Unintended Instructions
• Intended : machine code intentionally put in by the
compiler
• Unintended : interpret machine code differently in order
to build new instructions
17
F7 C7 07 00 00 00 0F 95 45 C3Machine Code :
What the compiler intended..
What was not intended
Highly likely to find many diverse instructions of this form in x86.
Not so likely to have such diverse instructions in RISC processors.
Geometry
• Given an arbitrary string of machine code, what is
the probability that the code can be interpreted
as useful instructions.
– x86 code is highly dense.
– RISC processors like (SPARC, ARM, etc.) have low
geometry.
• Thus finding gadgets in x86 code is considerably
more easier than that of ARM or SPARC.
• Fixed length instruction set reduces geometry.
18
Finding Gadgets
• Static analysis of libc.
• Find any memory location with 0xc3 (RET instruction).
• Build a trie data structure with 0xc3 as a root.
• Every path from leaf to the root is a possible gadget.
19
C
3
C
3
0
0
0
0
2
4
2
4
3
7
3
7
2
4
2
4
4
6
4
6
4
3
4
3
1
6
1
6
8
9
8
9
9
4
9
4
child of
Finding Gadgets
• Scan libc from the beginning toward the end
• If 0xc3 is found
– Start scanning backward
– With each byte, ask the question if the subsequence
forms a valid instruction
– If yes, add as child
– If no, go backwards until we reach the maximum
instruction length (20 bytes)
– Repeat this till (a predefined) length W, which is the max
instructions in the gadget
20
33 b2 23 12 a0 31 a5 67 22 ab ba 4a 3c c3 ff ee ab 31 11 09
Gadgets : Constant into
Register
Loading a constant into a register (edx = deadbeef)
21
deadbeefdeadbeef
GadgetAddGadgetAdd
stack
pop %edx
ret
pop %edx
ret
esp
• A previous return will pop the gadget address into
%eip
• %esp will also be incremented to point to
deadbeef
(4 bytes on 32 bit platform)
• The pop %edx will pop deadbeef from the stack
and increment %esp to point to the next 4 bytes
on the stack
Gadgets : Arbitrary Data into eax
22
pop %edx
ret
pop %edx
ret
G1
mov 64(%edx), %eax
ret
mov 64(%edx), %eax
ret
G2
G2G2
addraddr
G1G1
stack
esp
deadbeefdeadbeef
+64
Load arbitrary data into %eax register using
Gadgets G1 and G2
Gadgets: Store Constants
• Store the contents of a register to a memory
location in the stack
23
GadgetAddr 2GadgetAddr 2
00
GadgetAddr 1GadgetAddr 1
stack
pop %edx
ret
pop %edx
ret
esp
mov %eax, 24(%edx)
ret
mov %eax, 24(%edx)
ret
24
Gadget: Addition
24
addl (%edx), %eax
push %edi
ret
addl (%edx), %eax
push %edi
ret
Add the memory pointed
to by %edx to %eax.
The result is stored in %eax
pushes %edi.. onto the stack
why is this present?
…. This is unnecessary, but
this is best gadget that we can
find for addition
But can create problems!!
We need work arounds!
GadgetAddr2GadgetAddr2
GadgetAddrGadgetAddr
stack
esp
Some gadgetSome gadget
Gadget: Addition
25
addl (%edx), %eax
push %edi
ret
addl (%edx), %eax
push %edi
ret
Add the memory pointed
to by %edx to %eax.
The result is stored in %eax
pushes %edi.. onto the stack
why is this present?
…. This is unnecessary, but
this is best gadget that we can
find for addition
But can create problems!!
We need work arounds!
GadgetAddrGadgetAddr
stack
esp
ModifiedModified
Some gadgetSome gadget
Gadgets: Addition with NOP
26
addl (%edx), %eax
push %edi
ret
addl (%edx), %eax
push %edi
ret
1. First put gadget ptr for 0xC3 into
%edi
2. 0xC3 corresponds to NOP in
ROP
3. push %edi in gadget 2 just pushes
0xc3 back into the stack
Therefore not disturbing the stack
contents
4. Gadget 3 executes as planned
GadgetAddr3GadgetAddr3
Gadget_RETGadget_RET
GadgetAddr2GadgetAddr2
Gadget_RETGadget_RET
GadgetAddr1GadgetAddr1
stack
esp 0xc30xc3
0xc3 is ret in ROP and ret is equivalent to NOP instruction
pop %edi
ret
pop %edi
ret
Unconditional Branches
• Changing the %esp
27
GAGA
stack
esp
pop %esp
ret
Conditional Branches
28
In x86 instructions conditional branches have 2 parts.
1. An instruction which modifies a condition flag (eg CF, OF, ZF).
eg. CMP %eax, %ebx (will set ZF if %eax = %ebx)
2. A branch instruction (eg. JZ, JCC, JNZ, etc).
which internally checks the conditional flag and
changes the EIP accordingly.
In ROP conditional branches have 3 parts.
1. An ROP which modifies a condition flag (eg CF, OF, ZF).
eg. CMP %eax, %ebx (will set ZF if %eax = %ebx)
2. Transfer flags to a register or memory.
3. Perturb %esp based on flags stored in memory.
In ROP, we need flags to modify %esp register instead of EIP
Needs to be explicitly handled
Step 1 : Set the flags
Find suitable ROPs that set appropriate flags
29
CMP %eax,
%ebx
RET
subtraction
Affects flags CF, OF, SF, ZF,
AF, PF
NEG %eax
RET
2s complement negation
Affects flags CF
Step 2: Transfer flags to
memory or register
• Using lahf instruction
stores 5 flags (ZF, SF, AF, PF, CF) in the %ah register
• Using pushf instruction
pushes the eflags into the stack
ROPs for these two not easily found.
A third way – perform an operation whose result depends on
the flag contents.
30
where would
one use this
instruction?
Step 2: Indirect way to transfer flags
to memory
Several instructions operate using the contents of
the flags
31
ADC %eax, %ebx : add with carry that performs eax <- eax + ebx + CF.
(if eax and ebx are 0 initially, then the result will be either 1 or 0 depending on the CF)
RCL : rotate left with carry.
RCL %eax, 1
(if eax = 0. then the result is either 0 or 1 depending on CF)
Gadgets: Transfer Flags to
Memory
32
%edx will have value A
%ecx will contain 0x0
A
Step 3: Perturb %esp depending
on flag
33
If (CF is set){
perturb %esp
}else{
leave %esp as it is
}
What we hope to achieve
* CF stored in a memory
location (say X).
* Current %esp.
* Delta, how much to perturb
%esp.
What we have
negate X
offset = Delta & X
%esp = %esp+offset
One way of achieving …
1. Negate X (eg. Using instruction negl)
finds the 2’s complement of X
if (X = 1) 2’s complement is 111111111…
if (X = 0) 2’s complement is 000000000...
2. offset = Delta if X = 1
offset = 0 if X = 0
3. %esp = %esp + offset if X = 1
%esp = %esp if X = 0
Turing Complete
• Gadgets can do much more…
invoke libc functions,
invoke system calls, ...
• For x86, gadgets are said to be turning complete.
– Can program just about anything with gadgets.
• For RISC processors, more difficult to find gadgets.
– Instructions are fixed width.
– Therefore can’t find unintentional instructions.
• Tools available to find gadgets automatically.
Eg. ROPGadget (https
://github.com/JonathanSalwan/ROPgadget)
Ropper (https://github.com/sashs/Ropper)
34
Address Space Layout Randomization
(ASLR)
35
Address Space Randomization
• Address space layout
randomization (ASLR)
randomizes the address space
layout of the process.
• Each execution would have a
different memory map, thus
making it difficult for the attacker
to run exploits.
• Initiated by Linux PaX project in
2001.
• Now a default in many operating
systems.
36
Memory layout across boots for a Windows box
Non Executable Stack Attack Prevention
ASLR in the Linux Kernel
• Locations of the base, libraries, heap, and stack can be
randomized in a process’ address space.
• Built into the Linux kernel and controlled by
/proc/sys/kernel/randomize_va_space
• randomize_va_space can take 3 values:
0 : disable ASLR.
1 : positions of stack, VDSO, shared memory regions are
randomized the data segment is immediately after the
executable code.
2 : (default setting) setting 1 as well as the data segment
location is randomized.
37
Non Executable Stack Attack Prevention
ASLR in Action
38
First Run
Another Run
Non Executable Stack Attack prevention
ASLR in the Linux Kernel
• Permanent changes can be made by editing the
/etc/sysctl.conf file.
Two requirements:-
Make the code relocatable.
- Generate random address.
39
/etc/sysctl.conf, for example:
kernel.randomize_va_space = value
sysctl -p
Non Executable Stack Attack Prevention
Internals : Making code
relocatable
• Load time relocatable.
– where the loader modifies a program executable so
that all addresses are adjusted properly.
– Relocatable code.
• Slow load time since executable code needs to be modified.
• Requires a writeable code segment, which could pose
problems.
• PIE : position independent executable.
– a.k.a PIC (position independent code).
– code that executes properly irrespective of its absolute address.
– Used extensively in shared libraries.
• Easy to find a location where to load them without overlapping with other
modules.
40
Non Executable Stack Attack Prevention
Load Time Relocatable
41
11
Non Executable Stack Attack Prevention
Load Time Relocatable
42
note the 0x0 here…
the actual address of mylib_int is not filled in
22
Non Executable Stack Attack Prevention
Load Time Relocatable
43
Relocatable table present in the executable
that contains all references of mylib_int33
Offset in memory where the fix needs to be made
Store binary value in the symbol memory location
Non Executable Stack Attack Prevention
Load Time Relocatable
44
The loader fills in the actual address of mylib_in
at run time.
44
Non Executable Stack Attack Prevention
Load Time Relocatable
45
Limitations
 Slow load time since executable code needs to be
modified.
 Requires a writeable code segment, which could pose
problems.
 Since executable code of each program needs to be
customized, it would prevent sharing of code sections.
Non Executable Stack Attack Prevention
Position Independent
Executable
• An additional level of indirection for all global
data and function references.
• Uses a lot of relative addressing schemes and a
global offset table (GOT).
• For relative addressing,
– data loads and stores should not be at absolute
addresses but must be relative.
46
http://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-
shared-libraries/
Non Executable Stack Attack Prevention
Global Offset Table (GOT)
• Table at a fixed (known) location in
memory space and known to the linker.
• Has the location of the absolute address
of variables and functions.
47
Without GOT
With GOT
Non Executable Stack Attack Prevention
Enforcing Relative Addressing
(example)
48
With load time relocatable
With PIC
Non Executable Stack Attack Prevention
Enforcing Relative Addressing
(example)
49
With load time relocatable
With PIC
Get address of next instruction
to achieve relativeness
Index into GOT and get the
actual address of mylib_int into
eax
Now work with the actual
address.
Non Executable Stack Attack Prevention
Advantage of the GOT
• With load time relocatable code, every variable reference
would need to be changed.
– Requires writeable code segments.
– Huge overheads during load time.
– Code pages cannot be shared.
• With GOT, the GOT table needs to be constructed just once
during the execution.
– GOT is in the data segment, which is writeable.
– Data pages are not shared anyway.
– Drawback : runtime overheads due to multiple loads.
50
Non Executable Stack Attack Prevention
An Example of working with
GOT
51
$gcc –m32 –shared –fpic –S got.c
Besides a.out, this compilation also generates got.s
The assembly code for the program.
52
Data section
Text section
Fills %ecx with the eip of the next
instruction.
Why do we need this indirect way of doing
this?
In this case what will %ecx contain?
The macro for the GOT is known by the linker.
%ecx will now contain the offset to GOT
Load the absolute address of myglob from the
GOT into %eax
Non Executable Stack Attack Prevention
More
53
offset of myglob
in GOT
GOT it!
Non Executable Stack Attack Prevention
Internals: Randomizing the
data section
54
loading the executable
Check if
randomize_va_space
is > 1 (it can be 1 or 2)
Compute the end of the
data segment (m->brk
+ 0x20)
Finally Randomize
Non Executable Stack Attack Prevention
Deep Within the Kernel
(randomizing the data section)
55
Non Executable Stack Attack Prevention
Deep Within the Kernel
(randomizing the data section)
56
• The address of the first element of the ‘hash[0]’
array.
• The currently executing process ID for the
processor that handles this.
• The system’s jiffies value.
• CPU cycles number.
Non Executable Stack Attack Prevention
Function Calls in PIC
• Theoretically could be done similar with the data.
– call instruction gets location from GOT entry that is filled
in during load time (this process is called binding).
– In practice, this is time consuming. Much more functions
than global variables. Most functions in libraries are
unused.
• Lazy binding scheme.
– Delay binding till invocation of the function.
– Uses a double indirection – PLT – procedure linkage table
in addition to GOT.
57
Non Executable Stack Attack Prevention
The PLT
58
11
• Instead of directly calling func, invoke an
offset in the PLT instead.
• PLT is part of the executable text section,
and consists of one entry for each external
function the shared library calls.
• Each PLT entry has

A jump location to a specific GOT entry
• Preparation of arguments for a ‘resolver’
• Call to resolver function
Non Executable Stack Attack Prevention
First Invocation of Func
First Invocation of func
59
11
22
(steps 2 and 3)
On first invocation of func, PLT[n]
jumps to GOT[n], which simply jumps
back to PLT[n].
33
Non Executable Stack Attack Prevention
First Invocation of Func
60
11
22
(step 4)
Invoke resolver, which resolves
the actual of func, places this
actual address into GOT and then
invokes func.
The arguments passed to resolver,
that helps to do symbol resolution.
Note that the contents of GOT is
now changed to point to the actual
address of func.
33
44
Non Executable Stack Attack Prevention
Example of PLT
61
Compiler converts the call to set_mylib_int
into set_mylib_int@plt
Non Executable Stack Attack Prevention
Example of PLT
62
ebx points to the GOT
table
ebx + 0x10 is the
offset corresponding
to set_mylib_int
Offset of set_mylib_int in
the GOT (+0x10).
It contains the address of
the next instruction (ie.
0x3c2)
Non Executable Stack Attack Prevention
Example of PLT
63
Push arguments for
the resolver.
Jump to the first entry of
the PLT
Ie. PLT[0]
Jump to the resolver,
which resolves the
actual address of
set_mylib_int and fills
it into the GOT
Non Executable Stack Attack Prevention
Subsequent invocations of
Func
64
11
22
33
Non Executable Stack Attack Prevention
Advantages
• Functions are relocatable, therefore good
for ASLR.
• Functions resolved only on need, therefore
saves time during the load phase.
65
Non Executable Stack Attack Prevention
Bypassing ASLR
• Brute force.
• Return-to-PLT.
• Overwriting the GOT.
• Timing Attacks.
66
Bypassing ASLR
• Brute force.
67
Bypassing ASLR
• Brute force.
68
That’s for the classes