CSC 495/583 Topics of Software Security Return-oriented programming

1 CSC 495/583 Topics of Software Security Return-oriented programming
Class 8: CSC 495/583 Topics of Software Security, Return-oriented programming (ROP), Dr. Si Chen

2 Review

3 Format String Bug

4 Format String Bug
What is a format string? A format string is an ASCII string that contains text and format parameters:
printf("%s %d\n", str, a);
fprintf(stderr, "%s %d\n", str, a);
sprintf(buffer, "%s %d\n", str, a);
E.g. My name is Chen

5 Format String Bug The wrong way…

6 Example: fmt_wrong.c

7 Example: fmt_wrong.c %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x.%08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. %08x. the argument is passed directly to the “printf” function. the function didn’t find a corresponding variable or value on stack so it will start popping values off the stack
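A toy model of the behavior described above, written in Python rather than C. It is not how printf is implemented: `toy_printf` and the sample stack words are hypothetical names and values, chosen only to show that each %08x consumes the next word on the stack whether or not the caller supplied an argument.

```python
import re

def toy_printf(fmt, stack_words):
    """Replace each %08x in fmt with the next word from a fake stack."""
    words = iter(stack_words)

    def next_word(_match):
        # A real printf would read the next vararg slot here; the caller
        # passed none, so it ends up printing whatever is on the stack.
        return "%08x" % next(words)

    return re.sub(r"%08x", next_word, fmt)

# Hypothetical stack contents (a saved pointer, some "AAAA" bytes, a return address).
fake_stack = [0xBFFFF7A0, 0x41414141, 0x080484D6]
leaked = toy_printf("%08x.%08x.%08x", fake_stack)
print(leaked)
```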

8 Advanced Usage: Format String Direct Access

9 What is this BUG used for?
Disclose sensitive information:
▫ Variable(s)
▫ EBP value
▫ The correct location for putting shellcode

10 What is this BUG used for?
Disclose the StackGuard canary: bypass stack checking

11 What is this BUG used for?
▫ Read data at any memory address: use %s to read data at an arbitrary memory address
▫ Write data to any memory address: printf lets you not only read but also write, using %n

12 What is this BUG used for?
Disclose library addresses
▫ When ASLR is enabled, the library's load address changes on each run
▫ Shellcode therefore cannot call library functions (e.g. system()) at hardcoded addresses
▫ Use this bug to disclose the address of one function in a given library; from it you can deduce the addresses of the library's other functions
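The deduction step is plain arithmetic: functions keep fixed offsets from the library's load base, so one leaked address gives the base and the base gives every other function. A minimal sketch follows. The offsets and the leaked value are made-up numbers for illustration; real offsets depend on the exact libc build.

```python
# Hypothetical symbol offsets inside one particular libc build (assumed values).
OFFSETS = {"printf": 0x49670, "system": 0x3ADA0}

def libc_base_from_leak(leaked_addr, symbol_offset):
    """Load base = leaked runtime address minus the symbol's file offset."""
    base = leaked_addr - symbol_offset
    if base & 0xFFF:
        # Libraries are mapped on page boundaries; a misaligned base suggests
        # the wrong offset table (e.g. a different libc version).
        raise ValueError("computed base is not page aligned")
    return base

def resolve(base, symbol_offset):
    """Runtime address of any other symbol in the same library."""
    return base + symbol_offset

leaked_printf = 0xF7E2A670  # assumed value disclosed through the format string bug
base = libc_base_from_leak(leaked_printf, OFFSETS["printf"])
system_addr = resolve(base, OFFSETS["system"])
print(hex(base), hex(system_addr))
```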

14 ELF executable

15 ELF executable for Linux
Executable and Linkable Format (ELF)
Linux                                  Windows
ELF file                               .exe (PE)
.so (shared object file)               .dll (dynamic link library)
.a                                     .lib (static linking library)
.o (object file: intermediate file     .obj
   between compilation and linking)

16 ELF executable for Linux
ELF 32-bit LSB, dynamically linked

17 Shared library
The ELF is loaded by ld-linux.so.2, which is in charge of memory mapping, loading shared libraries, etc. You can call functions in libc.so.6.

18 Return-oriented programming (ROP)

19 Bad code versus bad behavior
[Diagram: attacker code paired with “bad” behavior, application code paired with “good” behavior]
Problem: this implication is false!

20 Return-oriented programming thesis
Any sufficiently large program codebase allows arbitrary attacker computation and behavior, without code injection (in the absence of control-flow integrity).

21 Traditional Stack Overflow
[Payload layout: NOP sled | payload | saved EIP]

22 Traditional Stack Overflow
The simplest stack overflow exploit operates as follows:
▫ Send a payload with a NOP sled, shellcode, and a pointer to the NOP sled
▫ The pointer to the NOP sled overwrites the saved return address and thereby takes over the stored EIP
▫ EIP now points to the machine code and the program executes arbitrary code
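The payload layout can be sketched as a byte string. This is only an illustration of the ordering, under stated assumptions: the offset to the saved EIP, the sled address, and the `build_overflow_payload` helper are hypothetical, and the "shellcode" is a two-byte int3 placeholder rather than real code.

```python
import struct

NOP = b"\x90"  # x86 one-byte no-op

def build_overflow_payload(offset_to_saved_eip, shellcode, sled_addr):
    """NOP sled + shellcode fill the buffer; a pointer into the sled lands on saved EIP."""
    sled_len = offset_to_saved_eip - len(shellcode)
    if sled_len < 0:
        raise ValueError("shellcode does not fit before the saved EIP")
    # 32-bit little-endian pointer, as on x86.
    return NOP * sled_len + shellcode + struct.pack("<I", sled_addr)

placeholder_shellcode = b"\xcc" * 2  # int3 placeholders, not a real payload
payload = build_overflow_payload(8, placeholder_shellcode, 0xBFFFF710)
print(payload.hex())
```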

23 Industry response to code injection exploits
▫ Mark all writable locations in a process's address space as nonexecutable
▫ Deployment: Linux (via PaX patches); OpenBSD; Windows (since XP SP2); OS X (since 10.5); …
▫ Hardware support: Intel “XD” bit, AMD “NX” bit (and many RISC processors)

24 Traditional Stack Overflow
Pros
▫ Very easy to trigger
▫ Simple to understand
▫ Being able to inject code means our payloads are powerful and flexible
Cons
▫ Defeated by just making the stack non-executable
▫ Lots of problems with bad characters, buffer sizes, payload detection, etc.

25 Return-to-libc
[Payload layout: padding | system() | exit() | “/bin/sh”]

26 Return-to-libc
▫ Used primarily to streamline exploitation and to bypass mitigations and situational limitations
▫ We want to spawn a shell.
▫ Send a payload that overwrites the saved EIP with the address of system(), followed by the address of exit() and a pointer to “/bin/sh”
▫ The call to system() returns directly into exit(), which then shuts down the program cleanly
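The ordering of the three words matters because of the 32-bit cdecl stack layout: when the vulnerable function returns into system(), the next word on the stack is treated as system()'s return address and the word after it as its first argument. A minimal sketch, assuming a 32-bit little-endian target. All addresses and the padding length are placeholders, and `ret2libc_payload` is a hypothetical helper name.

```python
import struct

def p32(value):
    """Pack a 32-bit little-endian word, as stored on an x86 stack."""
    return struct.pack("<I", value)

def ret2libc_payload(pad_len, system_addr, exit_addr, binsh_addr):
    return (
        b"A" * pad_len       # padding up to the saved EIP
        + p32(system_addr)   # overwrites saved EIP: "return" into system()
        + p32(exit_addr)     # system()'s return address: exit()
        + p32(binsh_addr)    # system()'s first argument: pointer to "/bin/sh"
    )

# Placeholder addresses, chosen only so the byte order is easy to read.
payload = ret2libc_payload(4, 0x11223344, 0x55667788, 0x99AABBCC)
print(payload.hex())
```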

27 Return-to-libc
▫ Divert control flow of the exploited program into libc code: system(), printf(), …
▫ No code injection required
▫ Perception of return-into-libc: limited, easy to defeat
▫ Attacker cannot execute arbitrary code
▫ Attacker relies on contents of libc: remove system()?

28 Return-to-libc
Pros
▫ Does not need executable stack
▫ Also pretty easy to understand and implement
Cons
▫ Relies on access to library functions
▫ Can only execute sequential instructions, no branching or fancy stuff
▫ Can only use code in .text and loaded libraries

29 Mitigation against these classical attacks
▫ Address Space Layout Randomization (ASLR)
▫ No execute bit

30 Address Space Layout Randomization (ASLR)
▫ Map your heap and stack randomly
▫ At each execution, your heap and stack will be mapped at different places
▫ It's the same for shared libraries
▫ So now you cannot jump to a hardcoded address like in a classical attack

31 Address Space Layout Randomization (ASLR)
Three executions of the same binary:

32 Data Execution Prevention (DEP): No eXecute bit (NX)
▫ The NX bit is a CPU feature
▫ On Intel CPUs, it works only on x86_64 or with Physical Address Extension (PAE) enabled
▫ When enabled, the CPU raises an exception if it tries to execute code from a page that has the NX bit set
▫ The NX bit is located and set up in the Page Table Entry

33 Page Table
Each process in a multi-tasking OS runs in its own memory sandbox. This sandbox is the virtual address space, which in 32-bit mode is always a 4GB block of memory addresses. These virtual addresses are mapped to physical memory by page tables, which are maintained by the operating system kernel and consulted by the processor. Each process has its own set of page tables.

34 Page Table
To each virtual page there corresponds one page table entry (PTE) in the page tables, which in regular x86 paging is a simple 4-byte record shown below:

35 Data Execution Prevention (DEP): No eXecute bit (NX)
The last bit of the page table entry is the NX bit (exb): 0 = disabled (the page may be executed), 1 = enabled (the page may not be executed)
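A small sketch of reading that bit, assuming the 8-byte entry format used with PAE and on x86_64, where NX is bit 63 (the 4-byte entries of regular 32-bit paging have no NX bit). The flag positions for present, writable and user are the standard low bits; `describe_pte` and the sample entry value are hypothetical.

```python
PRESENT = 1 << 0    # page is mapped
WRITABLE = 1 << 1   # writes allowed
USER = 1 << 2       # accessible from user mode
NX = 1 << 63        # no-execute (Intel calls it XD), highest bit of the 64-bit entry

def describe_pte(pte):
    """Decode a few flag bits of a 64-bit page table entry."""
    return {
        "present": bool(pte & PRESENT),
        "writable": bool(pte & WRITABLE),
        "user": bool(pte & USER),
        "no_execute": bool(pte & NX),
    }

# Made-up entry: a writable user page (e.g. stack) with NX set.
stack_page = describe_pte(0x8000000012345067)
print(stack_page)
```

A page that is writable and also has `no_execute` set is the situation DEP aims for: injected bytes can be stored there but not run.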

36 Return-Oriented Programming: Exploits Without Code Injection
ROP Introduction
[1] When Good Instructions Go Bad: Generalizing Return-Oriented Programming to RISC. Buchanan, E.; Roemer, R.; Shacham, H.; Savage, S. (October 2008)
[2] Return-Oriented Programming: Exploits Without Code Injection. Shacham, Hovav; Buchanan, Erik; Roemer, Ryan; Savage, Stefan. Retrieved

37 Ordinary programming: the machine level
[Diagram: a row of instructions (insn), with the instruction pointer marking the current one]
▫ Instruction pointer (%eip) determines which instruction to fetch & execute
▫ Once the processor has executed the instruction, it automatically increments %eip to the next instruction
▫ Control flow by changing the value of %eip

38 Return-oriented programming: the machine level
[Diagram: the stack pointer selects among several “insns … ret” sequences in the C library]
▫ Stack pointer (%esp) determines which instruction sequence to fetch & execute
▫ The processor doesn’t automatically increment %esp, but the “ret” at the end of each instruction sequence does

39 ROP: The Main Idea

40 ROP Gadget “The Gadget”: July 1945

41 Attack Process on x86
▫ Gadget1 is executed and returns
▫ Gadget2 is executed and returns
▫ Gadget3 is executed and returns
So, the real execution is:
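The gadget-by-gadget flow can be mimicked with a toy interpreter, in which the attacker-controlled stack holds gadget addresses interleaved with data and each "ret" pops the next address. This is a simulation only, not exploit code. The gadget addresses, the two-register machine and `run_chain` are all hypothetical.

```python
MASK32 = 0xFFFFFFFF

# Each gadget body receives the registers, the stack and the stack index,
# and returns the new stack index (it may pop extra words as data).
def pop_eax(regs, stack, sp):
    regs["eax"] = stack[sp]
    return sp + 1

def pop_ebx(regs, stack, sp):
    regs["ebx"] = stack[sp]
    return sp + 1

def add_eax_ebx(regs, stack, sp):
    regs["eax"] = (regs["eax"] + regs["ebx"]) & MASK32
    return sp

# Made-up addresses standing in for gadgets found in a binary.
GADGETS = {
    0x1000: ("pop eax; ret", pop_eax),
    0x2000: ("pop ebx; ret", pop_ebx),
    0x3000: ("add eax, ebx; ret", add_eax_ebx),
}

def run_chain(gadgets, stack):
    regs = {"eax": 0, "ebx": 0}
    trace = []
    sp = 0
    while sp < len(stack):
        addr = stack[sp]            # "ret": pop the next gadget address
        sp += 1
        name, body = gadgets[addr]
        sp = body(regs, stack, sp)  # run the gadget's instructions
        trace.append(name)
    return regs, trace

chain = [0x1000, 40, 0x2000, 2, 0x3000]
regs, trace = run_chain(GADGETS, chain)
print(regs, trace)
```

The effective program is "eax = 40; ebx = 2; eax += ebx", although no new code was written to memory: only stack contents were supplied.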

42 Several ways to find gadgets
How can we find gadgets?
▫ Old school method: objdump and grep
▫ Some gadgets will not be found: objdump disassembles only at aligned instruction boundaries, so sequences that start in the middle of an instruction are missed
▫ Make your own tool which scans an executable segment
▫ Use an existing tool

43 Finding instruction sequences
▫ Any instruction sequence ending in “ret” is useful: it could be part of a gadget
▫ Algorithmic problem: recover all sequences of valid instructions from libc that end in a “ret” insn
▫ Idea: at each ret (c3 byte) look back:
  ▫ are the preceding i bytes a valid length-i insn?
  ▫ recurse from found instructions
▫ Collect instruction sequences in a trie
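The look-back search can be sketched as follows. This is a simplified illustration, not the authors' tool: in place of a real disassembler it uses a tiny hand-written table of 32-bit x86 encodings (`TOY_INSNS`), so it only recognizes those few instructions. `find_gadgets` is a hypothetical name, and the trie is a nested dict rooted at "ret".

```python
# Toy "decoder": a handful of real 32-bit x86 encodings. A real tool would
# ask a full disassembler whether the preceding i bytes form one instruction.
TOY_INSNS = {
    b"\x58": "pop eax",
    b"\x59": "pop ecx",
    b"\x5a": "pop edx",
    b"\x5b": "pop ebx",
    b"\x40": "inc eax",
    b"\x31\xc0": "xor eax, eax",
    b"\xcd\x80": "int 0x80",
}
MAX_INSN_LEN = 2  # longest encoding in the toy table

def find_gadgets(code, max_insns=3):
    trie = {}   # children of a node are the instructions that can precede it
    found = []  # (start_offset, "insn; ...; ret") pairs

    def look_back(end, node, tail, depth):
        if depth == 0:
            return
        for length in range(1, MAX_INSN_LEN + 1):
            start = end - length
            if start < 0:
                break
            insn = TOY_INSNS.get(bytes(code[start:end]))
            if insn is None:
                continue  # preceding `length` bytes are not a valid insn
            child = node.setdefault(insn, {})
            seq = [insn] + tail
            found.append((start, "; ".join(seq)))
            look_back(start, child, seq, depth - 1)  # recurse from found insn

    for offset, byte in enumerate(code):
        if byte == 0xC3:  # ret
            look_back(offset, trie, ["ret"], max_insns)
    return trie, sorted(found)

sample = bytes.fromhex("585bc331c040c3")
trie, gadgets = find_gadgets(sample)
for start, text in gadgets:
    print(hex(start), text)
```

Because the scan starts from every c3 byte and not from aligned instruction boundaries, it also finds sequences that begin in the middle of an intended instruction, which is what the objdump-and-grep method misses.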

44 ROPgadget

45 Q & A