Intro to Exploitation: Stack Overflows
James McFadyen
UTD Computer Security Group
10/20/2011
Intro to Exploitation Only an intro to stack overflow Basic theory and application One of many types of exploitation
Outline
What is a buffer overflow?
Tools
Vulnerable C Functions
Remember the memory
Learn to love assembly
Stack overflow
Protection Mechanisms
ret2libc in Linux
Buffer Overflow “In computer security and programming, a buffer overflow, or buffer overrun, is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. This is a special case of violation of memory safety.” - Wikipedia
Buffer Overflow
In our examples:
Give the program too much input and hijack the instruction pointer (EIP)
Once we control EIP, we can execute arbitrary code locally or remotely
Achieve what we want as an elevated user
Tools
Linux: GDB, gcc, vi, perl/python/ruby, readelf, objdump, ltrace, strace, ropeme
Windows: WinDBG, OllyDBG, ImmunityDBG, IDA, Python, Mona (ImmunityDBG plugin)
Vulnerable C Code strcpy(), strncpy() strcat(), strncat() sprintf(), snprintf() gets() sscanf() Many others...
Vulnerable C Code
strcpy() doesn't check size
If we have
    char buf[128];
    strcpy(buf, userSuppliedString);
This makes it too easy...
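Vulnerable C Code
    char *strncpy(char *dest, const char *src, size_t n);
We have a size, but what if it is the wrong one? With str supplied by the user:
    strncpy(somebuffer, str, strlen(str));
The limit comes from the source, not the destination, so a long str still overflows somebuffer.
    strncpy(somebuffer, str, sizeof(somebuffer));
No overflow here, but if str has sizeof(somebuffer) or more characters, strncpy() writes no terminating NUL and later string operations read past the end of somebuffer.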
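Vulnerable C Code
Common bug, proper fix: copy at most sizeof(somebuffer)-1 characters and terminate explicitly
    strncpy(somebuffer, str, sizeof(somebuffer)-1);
    somebuffer[sizeof(somebuffer)-1] = '\0';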
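The three strncpy() calls above can be compared with a small model of the function's documented behavior. This is only a sketch in Python 3: strncpy_sim is a hypothetical helper written for illustration, the 8-byte buffer size is an assumption, and 0xAA stands in for whatever happened to be in uninitialized memory.

```python
def strncpy_sim(dest_size: int, src: bytes, n: int) -> bytearray:
    """Model C strncpy(dest, src, n) for a dest buffer of dest_size bytes."""
    if n > dest_size:
        # In C nothing stops this write; it is the overflow itself.
        raise OverflowError("strncpy would write past the end of dest")
    dest = bytearray(b"\xaa" * dest_size)   # stand-in for uninitialized memory
    copied = src[:n]
    dest[:len(copied)] = copied
    # strncpy pads with NUL bytes only when src is shorter than n.
    dest[len(copied):n] = b"\x00" * (n - len(copied))
    return dest


user_input = b"B" * 20                      # longer than the 8-byte buffer

full = strncpy_sim(8, user_input, 8)        # n = sizeof(somebuffer)
print(b"\x00" in full)                      # False: the buffer is not terminated

safe = strncpy_sim(8, user_input, 8 - 1)    # n = sizeof(somebuffer) - 1
safe[-1] = 0                                # explicit terminator
print(safe)                                 # bytearray(b'BBBBBBB\x00')
```

Passing len(user_input) as n, the strlen(str) case, raises OverflowError in this model, which corresponds to the out-of-bounds write in C.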
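Vulnerable C Code
    char *strncat(char *dest, const char *src, size_t n);
Ex:
    int vulnerable(char *str1, char *str2)
    {
        char buf[256];
        /* leaves buf unterminated if str1 has 100 or more characters */
        strncpy(buf, str1, 100);
        /* BUG: n limits the characters appended, not the total size of buf */
        strncat(buf, str2, sizeof(buf)-1);
        return 0;
    }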
Vulnerable C Code Fix: strncat(buf, str2, sizeof(buf) - strlen(buf) -1);
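The difference between the buggy and the fixed strncat() limit is easiest to see as arithmetic. A minimal sketch in Python 3, assuming the 256-byte buf from the example and up to 100 characters already copied into it; worst_case_bytes is a hypothetical helper name.

```python
BUF_SIZE = 256  # char buf[256] from the example


def worst_case_bytes(already_in_buf: int, n: int) -> int:
    """Bytes buf must hold after strncat(buf, src, n):
    the existing characters, up to n appended characters, and the NUL."""
    return already_in_buf + n + 1


# strncpy(buf, str1, 100) can leave up to 100 characters in buf.
buggy = worst_case_bytes(100, BUF_SIZE - 1)          # n = sizeof(buf) - 1
fixed = worst_case_bytes(100, BUF_SIZE - 100 - 1)    # n = sizeof(buf) - strlen(buf) - 1

print(buggy, fixed)   # 356 256
```

The buggy call can need 356 bytes in a 256-byte buffer, while the fixed call needs exactly 256 in the worst case.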
Remember the Memory
Segments, from low to high addresses:
Text: code segment, machine instructions
Data: initialized global and static variables
BSS: uninitialized global and static variables
Heap: dynamic space. malloc(...) / free(...), new(...) / ~
Stack: program scratch space. Local variables, passed arguments, etc.
* Taken from Mitchell Adair's “Stack Overflows”
Remember the Memory: The Stack
Low addresses
ESP -> local variables...      (EBP - x)
EBP -> saved EBP
       RET
       arguments...            (EBP + x)
       previous stack frame
High addresses
* Taken from Mitchell Adair's “Stack Overflows”
Love the Assembly
EIP – Extended Instruction Pointer: next instruction executed
ESP – Extended Stack Pointer: top of stack
EBP – Extended Base Pointer: base pointer
EAX: accumulator register
EBX: base register
ECX: counter register
EDX: data register
ESI: source index
EDI: destination index
* Taken from Mitchell Adair's “Stack Overflows”
Stack Overflow
ESP -> char buf[100]    (100 bytes)
EBP -> saved EBP        (4 bytes)
       RET
       argc
       *argv[]
* Taken from Mitchell Adair's “Stack Overflows”
Stack Overflow
ESP -> 0x41 * 100       (100 bytes, was buf)
EBP -> 0x41 * 4         (4 bytes, was saved EBP)
       0x41414141       (RET overwritten)
       argc
       *argv[]
108 bytes (0x41 * 108) fill buf, the saved EBP, and RET
The ret instruction will pop the instruction pointer off of the stack
EIP will now point to 0x41414141
Ex: $ ./program $(python -c 'print "A" * 108')
* Taken from Mitchell Adair's “Stack Overflows”
Stack Overflow
ESP -> 0x41 * 100       (100 bytes, was buf)
EBP -> 0x41 * 4         (4 bytes, was saved EBP)
       0xdeadbeef       (RET)
       argc
       *argv[]
104 bytes (0x41 * 104), then 0xdeadbeef in little-endian byte order
EIP will now point to 0xdeadbeef
We can now point EIP where we want
Ex: $ ./program $(python -c 'print "A" * 104 + "\xef\xbe\xad\xde"')
* Taken from Mitchell Adair's “Stack Overflows”
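The byte order of the return address is a common stumbling block: x86 is little-endian, so 0xdeadbeef is written to the stack as ef be ad de. A short Python 3 sketch of the same argument built as bytes, assuming the 104-byte offset from the slide; the names OFFSET_TO_RET and target are illustrative.

```python
import struct

OFFSET_TO_RET = 104      # 100-byte buf + 4-byte saved EBP, as on the slide
target = 0xdeadbeef      # value we want to end up in EIP

# "<I" packs an unsigned 32-bit integer in little-endian byte order.
ret_bytes = struct.pack("<I", target)
argument = b"A" * OFFSET_TO_RET + ret_bytes

print(ret_bytes.hex())   # efbeadde
print(len(argument))     # 108
```

In Python 3, print() would encode the string as text and alter bytes above 0x7f, so a script that emits this argument would write the raw bytes with sys.stdout.buffer.write(argument) instead.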
Stack Overflow
$ ./program $(python -c 'print "A" * 104 + "\xef\xbe\xad\xde"')
We have 104 bytes for a payload
The payload can be anything, but for our purpose we would spawn a shell
The payload has a fixed size, so when we insert it, we must reduce the number of A's by the size of the payload
Stack Overflow
$ ./program $(python -c 'print "A" * 104 + "\xef\xbe\xad\xde"')
If we had a 32 byte payload (a real payload will not be a bunch of \xff):
$ ./program $(python -c 'print "A" * 72 + "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" + "\xef\xbe\xad\xde"')
We have reduced the A's from 104 to 72 so the 32 byte payload fits
We will then have to point EIP (\xef\xbe\xad\xde) to our payload on the stack
Stack Overflow
$ ./program $(python -c 'print "A" * 72 + "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" + "\xef\xbe\xad\xde"')
"\xef\xbe\xad\xde" would be replaced with the address of our payload
EIP will now point to the address of our payload, which will spawn a shell
NOPs (0x90) in front of the payload create a bigger “landing area”: the address only has to hit somewhere in the NOPs
This technique is not very effective anymore... why?
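The size bookkeeping on this slide (filler shrinks as the payload grows, total offset stays at 104) can be written down once. A minimal Python 3 sketch, assuming the slide's 104-byte offset and its 32-byte placeholder of \xff bytes; build_argument and its parameters are hypothetical names, and 0xdeadbeef is still only a placeholder address.

```python
import struct

NOP = b"\x90"  # x86 single-byte no-op instruction


def build_argument(offset_to_ret: int, payload: bytes, payload_addr: int,
                   filler: bytes = b"A") -> bytes:
    """Lay out: filler | payload | little-endian return address."""
    filler_len = offset_to_ret - len(payload)
    if filler_len < 0:
        raise ValueError("payload does not fit before the return address")
    return filler * filler_len + payload + struct.pack("<I", payload_addr)


placeholder = b"\xff" * 32                      # not a real payload
arg = build_argument(104, placeholder, 0xdeadbeef)
print(len(arg), arg.count(b"A"))                # 108 72

# Same layout with the filler replaced by NOPs as a landing area.
sled = build_argument(104, placeholder, 0xdeadbeef, filler=NOP)
print(sled.count(NOP))                          # 72
```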
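Protection Mechanisms (Windows)
DEP – Data Execution Prevention: data pages such as the stack are marked non-executable, so code placed there can't run
/GS flag – compiler-inserted cookie / canary, checked before a function returns to detect whether the stack has been altered
SafeSEH – Safe Structured Exception Handling: only exception handlers registered in the image may be used (SEH is the mechanism behind try / except)
ASLR – Address Space Layout Randomization: randomizes addresses in memory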
Protection Mechanisms (Linux)
NX – No-eXecute bit: processor feature. Like DEP, can't execute on the stack
Stack Smashing Protection – cookie / canary, generally enabled by default
ASLR – Address Space Layout Randomization
Many other compiler protections...
ret2libc
Bypasses NX: nothing is executed on the stack
Point EIP to a function in libc: system(), the exec() family, etc.
    system("/bin/sh");
We will get a shell by using the system() function in libc
ret2libc
$ ./program $(python -c 'print "A" * 104 + "\xef\xbe\xad\xde"')
We don't need the payload where the A's are anymore
We now point EIP to the address of system(). The next 4 bytes are the return address system() will use, followed by the argument to system(): the address of a "/bin/sh" string in memory
$ ./program $(python -c 'print "A" * 104 + address_of_system + return_address + address_of_bin_sh')
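The stack layout described above can be sketched as bytes. This is only an illustration in Python 3: the three addresses are made-up placeholders (the real ones are found in GDB during the demo), p32 is a hypothetical helper, and the 104-byte offset is carried over from the earlier slides.

```python
import struct


def p32(value: int) -> bytes:
    """Pack a 32-bit address in little-endian byte order."""
    return struct.pack("<I", value)


OFFSET_TO_RET = 104          # filler up to the saved return address

# Placeholder values only; none of these are real addresses.
SYSTEM_ADDR = 0x11111111     # address_of_system
RETURN_ADDR = 0x22222222     # where system() returns when it finishes
BIN_SH_ADDR = 0x33333333     # address of a "/bin/sh" string

chain = (
    b"A" * OFFSET_TO_RET     # fills buf and the saved EBP
    + p32(SYSTEM_ADDR)       # overwrites RET: EIP goes to system()
    + p32(RETURN_ADDR)       # system() sees this as its return address
    + p32(BIN_SH_ADDR)       # system() sees this as its first argument
)

print(len(chain))            # 116
```

Because the bytes are passed as a command-line argument, an address containing a 0x00 byte would cut the string short, which is one reason the placeholders here avoid zero bytes.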
Demo!
How to use GDB for exploitation
Exploring the stack
Finding important memory addresses (ret2libc)
Breakpoints
Using Perl/Python/Ruby for arguments in GDB
Basic Stack Overflow
Ret2libc
Sources
“Source Code Auditing” - Jared Demott
“Smashing the stack in 2010” - Andrea Cugliari + Mariano Graziano
“Stack Overflows” - Mitchell Adair