Buffer Overflow.


Memory

A program has four basic read-write memory regions:

Stack: located in the higher part of memory and grows down (toward lower addresses). Used when a function is called.
Data segment: global variables that are initialized to a non-zero value.
BSS (Block Started by Symbol) segment: global variables that are initialized to zero.
Heap: grows from the end of BSS toward larger addresses. Managed by malloc, realloc, and free.

Background of the stack

Stack organization

Return address: the address of the instruction to be executed after the function returns. Before entering a function, the program needs to remember where to return. The return address is the address of the instruction right after the function call.

Frame pointer (FP): stored in a register, it is used to reference the local variables and the function parameters at fixed offsets. In the example frame, variable_a is referred to as ($FP-16), buffer as ($FP-12), and str as ($FP+8).

The buffer-overflow problem
The function strcpy(buffer, str) copies the contents of str into buffer[]. The string pointed to by str has more than 12 characters, while the size of buffer[] is only 12. The function strcpy() does not check whether the boundary of buffer[] has been reached. It only stops when it sees the end-of-string character '\0'. Therefore, the memory above buffer[] will be overwritten by the characters at the end of str.
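The size arithmetic behind this problem can be sketched in a few lines. The helper overflow_bytes() below is a hypothetical name introduced only for illustration; it reports how many bytes strcpy() would write past a buffer of a given size, without performing the unsafe copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of bytes that strcpy(buffer, str) would write beyond a buffer of
 * buf_size bytes. strcpy() copies every character of str plus the
 * terminating '\0', so it needs strlen(str) + 1 bytes. */
size_t overflow_bytes(const char *str, size_t buf_size) {
    assert(str != NULL);
    size_t needed = strlen(str) + 1;
    return needed > buf_size ? needed - buf_size : 0;
}
```

With the 12-byte buffer[] of the example, an 11-character string still fits, while a 12-character string already spills one byte (the '\0') into the memory above the buffer.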

Calling a function: pushing the argument

    #include <stdio.h>
    void foo(int x) { printf("Hello world: %d\n", x); }
    int main() { foo(1); return 0; }

main() pushes the value 1, i.e. the argument to foo(), onto the stack. This operation decrements %esp by 4, because the stack grows down.

Calling a function: pushing the return address

The call instruction pushes the address of the instruction that immediately follows the call (i.e. the return address) onto the stack, and then jumps to the code of foo().

Function prologue: saving the frame pointer

The first line of the function foo() pushes %ebp onto the stack, to save the previous frame pointer. The second line lets %ebp point to the current frame.

Function prologue: allocating stack space

The stack pointer is modified to allocate space (8 bytes) for local variables and the two arguments passed to printf(). Since there is no local variable in the function foo(), the 8 bytes are for the arguments only.

Function epilogue: restoring the frame pointer

The leave instruction implicitly performs two instructions (it was a macro in earlier x86 releases, but was made into an instruction later):

    mov %ebp, %esp
    pop %ebp

Function epilogue: returning to the caller

The ret instruction simply pops the return address off the stack, and then jumps to the return address.

Back in main(): releasing the argument

Back in main(), the stack is further restored by releasing the memory that was allocated for the call to foo(). The stack is now in exactly the same state as it was before entering the function foo().
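The whole call sequence can be replayed on a small simulated stack to check that every push is matched by a release. This is only a sketch: the Machine struct, the helper names and the 256-byte stack size are assumptions made for illustration, and esp/ebp are modelled as byte offsets into an array rather than real registers.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 256u  /* assumed size of the simulated stack */

typedef struct {
    uint32_t mem[STACK_BYTES / 4]; /* simulated stack memory, 4-byte slots */
    uint32_t esp;                  /* simulated stack pointer (byte offset) */
    uint32_t ebp;                  /* simulated frame pointer (byte offset) */
} Machine;

void machine_init(Machine *m) {
    m->esp = STACK_BYTES;  /* empty stack: esp sits at the highest address */
    m->ebp = STACK_BYTES;
}

static void push32(Machine *m, uint32_t value) {
    assert(m->esp >= 4);
    m->esp -= 4;                       /* the stack grows down */
    m->mem[m->esp / 4] = value;
}

static uint32_t pop32(Machine *m) {
    assert(m->esp + 4 <= STACK_BYTES);
    uint32_t value = m->mem[m->esp / 4];
    m->esp += 4;
    return value;
}

/* Replays the steps of calling foo(1) and returns the address that the
 * final ret would jump to. */
uint32_t simulate_foo_call(Machine *m, uint32_t return_address) {
    push32(m, 1);                  /* main: push the argument           */
    push32(m, return_address);     /* call: push the return address     */
    push32(m, m->ebp);             /* foo: save previous frame pointer  */
    m->ebp = m->esp;               /* foo: ebp points to current frame  */
    m->esp -= 8;                   /* foo: reserve 8 bytes              */
    m->esp = m->ebp;               /* leave: mov %ebp, %esp             */
    m->ebp = pop32(m);             /* leave: pop %ebp                   */
    uint32_t target = pop32(m);    /* ret: pop the return address       */
    m->esp += 4;                   /* main: release the argument        */
    return target;
}
```

After simulate_foo_call() returns, esp and ebp hold the values they had before the call, which is the "same state as before entering foo" property described above.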

Exploit the Buffer overflow Vulnerability
Exploiting the vulnerability involves three tasks:

Injecting the malicious code
Jumping to the malicious code
Writing the malicious code

Injecting Malicious code
Let us assume that the malicious code is already written. We can simply store the malicious code (in binary form) in badfile. The vulnerable program will then copy the malicious code into the buffer on the stack.

Jumping to the malicious code

The challenge is to know the absolute address of the malicious code.

If the target program is a Set-UID program, you can make a copy of it and run the copy in a debugger to figure out the address of buffer[], and then calculate the starting point of the malicious code.

If the target program is running remotely, you can guess. The stack usually starts at the same address, and it is usually not very deep: most programs do not push more than a few hundred or a few thousand bytes onto the stack at any one time. Therefore the range of addresses that we need to guess is actually quite small.

To improve the chance of success, add many NOP operations to the beginning of the malicious code. A jump that lands anywhere in the NOPs slides forward into the code.
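The effect of the NOP padding on the guessing problem is plain interval arithmetic, sketched below. The function names and the idea of passing addresses as 32-bit integers are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when a guessed address lands on one of the NOPs or exactly on the
 * first byte of the code that follows them. sled_start is the address of
 * the first NOP and sled_len is the number of NOP bytes. */
bool guess_reaches_code(uint32_t sled_start, uint32_t sled_len, uint32_t guess) {
    return guess >= sled_start && guess - sled_start <= sled_len;
}

/* Number of distinct addresses that count as a correct guess. */
uint32_t acceptable_guesses(uint32_t sled_len) {
    assert(sled_len < UINT32_MAX);
    return sled_len + 1;
}
```

Without padding there is exactly one correct address. With 100 NOPs there are 101, so the same guess range is covered with far fewer attempts.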

Shell code

To invoke the system call execve(), we need to know the address of the string "/bin/sh". Where to store this string and how to derive its location are not trivial problems.

There are also several NULL (i.e., 0) bytes in the code. If the vulnerability is caused by strcpy, this is a problem, because strcpy stops copying at the first zero byte.
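The zero-byte problem can be checked mechanically. The helper below is a hypothetical sketch that reports how many bytes of a binary blob a strcpy()-style copy would carry over before it stops.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of bytes a strcpy()-style copy would transfer from payload before
 * stopping at the first zero byte (the zero itself is not counted). If the
 * payload contains no zero byte, all len bytes would be copied. */
size_t bytes_before_first_zero(const unsigned char *payload, size_t len) {
    assert(payload != NULL);
    const unsigned char *zero = memchr(payload, 0, len);
    return zero != NULL ? (size_t)(zero - payload) : len;
}
```

If the result is smaller than the length of the code, the bytes after the first zero never reach the buffer.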

Countermeasures

Apply secure engineering principles.

Use a strongly typed language, e.g. Java, C#, etc. With these languages, buffer overflows will be detected.

Use safe library functions. Don't use functions that could have a buffer-overflow problem: gets, strcpy, strcat, sprintf, scanf, etc. These functions are safer: fgets, strncpy, strncat, and snprintf. Note that strncpy does not add a terminating '\0' when the source fills the whole destination.

StackGuard: mark the boundary of the buffer.

StackShield: separate control (return address) from data.
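As a minimal sketch of the "safe library functions" advice, the wrapper below uses snprintf() to copy a string into a fixed-size buffer. The name copy_bounded() is an assumption for illustration; snprintf() itself is standard C and always terminates the destination when the size is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Copies src into dst, which has room for dst_size bytes. The result is
 * always '\0'-terminated. Returns false when src did not fit and was
 * truncated, so the caller can treat oversized input as an error. */
bool copy_bounded(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    int written = snprintf(dst, dst_size, "%s", src);
    return written >= 0 && (size_t)written < dst_size;
}
```

Checking the return value matters: a silently truncated string is no longer an overflow, but it can still be a logic bug.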

StackGuard vs StackShield
StackGuard is based on a "canary" value that is put on the stack with each function call. At the end of the function, the canary is checked. If an overflow has occurred, it will have corrupted the canary and will be detected.

StackShield is based on copying the return address to a safe area. It checks the return address at the end of the function. If the return address has been overwritten, the attack will be detected.
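The canary idea can be modelled without touching a real stack. In the sketch below, the Frame struct stands in for one stack frame (buffer, then canary, then return address); the struct layout, the names and the canary value used in the usage note are all assumptions for illustration, not StackGuard's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 12

/* Simplified model of a frame: the canary sits between the local buffer
 * and the return address. */
typedef struct {
    unsigned char buffer[BUF_SIZE];
    uint32_t canary;
    uint32_t return_address;
} Frame;

/* Function entry: place the canary and the return address. */
void frame_enter(Frame *f, uint32_t canary, uint32_t return_address) {
    assert(offsetof(Frame, canary) == BUF_SIZE);
    memset(f->buffer, 0, BUF_SIZE);
    f->canary = canary;
    f->return_address = return_address;
}

/* Models an unchecked copy that starts at buffer[0] and keeps going.
 * The assert only keeps the simulation inside the Frame object. */
void unchecked_copy(Frame *f, const unsigned char *data, size_t len) {
    assert(len <= sizeof *f);
    memcpy(f, data, len);
}

/* Function exit: the check performed before the return address is used. */
bool canary_intact(const Frame *f, uint32_t expected) {
    return f->canary == expected;
}
```

A copy of 8 bytes leaves the canary alone. A copy of sizeof(Frame) bytes reaches the return address, but only by first running over the canary, which is what the exit check detects.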

StackGuard implementation
StackGuard is implemented as a modification to gcc. The new code adds some assembler directives to the output file. The modified function prologue first pushes a canary onto the stack (for StackGuard v2.0.1 it is the constant 0x000aff0d; the terminator canary described later explains why), and then continues with the standard prologue.

Three kinds of canaries
Terminator canary: if we try to write 0x000aff0d over the former canary (effectively not changing it), the 0x00 will stop strcpy() cold, and we won't be able to alter the return address.

NULL canary: a constant value.

XOR random canary: a random number, generated at run time, that is not only stored on the stack but also XORed with the return address.
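The terminator canary argument can be tried out on a simulated frame. The sketch below assumes a little-endian machine, so 0x000aff0d is stored as the bytes 0d ff 0a 00, and it assumes a simplified layout of buffer, canary, return address; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BUF_SIZE 12
#define FRAME_SIZE 20  /* 12-byte buffer + 4-byte canary + 4-byte return address */

/* 0x000aff0d as it lies in memory on a little-endian machine. */
static const unsigned char TERMINATOR_CANARY[4] = {0x0d, 0xff, 0x0a, 0x00};

void frame_init(unsigned char frame[FRAME_SIZE], const unsigned char ret[4]) {
    memset(frame, 0, BUF_SIZE);
    memcpy(frame + BUF_SIZE, TERMINATOR_CANARY, 4);
    memcpy(frame + BUF_SIZE + 4, ret, 4);
}

/* The overflow itself: strcpy() from the start of the buffer. The assert
 * only keeps the copy inside the simulated frame. */
void overflow_with_strcpy(unsigned char frame[FRAME_SIZE], const char *attacker_string) {
    assert(strlen(attacker_string) + 1 <= FRAME_SIZE);
    strcpy((char *)frame, attacker_string);
}

bool terminator_canary_intact(const unsigned char frame[FRAME_SIZE]) {
    return memcmp(frame + BUF_SIZE, TERMINATOR_CANARY, 4) == 0;
}
```

An input that reproduces the canary bytes keeps the check happy, but strcpy() stops at the canary's 0x00 and the return address bytes that follow are never written. An input without that zero byte gets further, at the price of destroying the canary.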

Weakness of StackGuard
Local variables located after buf (such as var1) are not protected at all.

StackGuard's check only detects the attack after the function finishes, giving the attacker a window of code to play with.

When we try a frame pointer overwrite attack, we may successfully alter the saved frame pointer without changing the canary.

StackShield

An implementation difference is that StackShield takes assembler files (.s) as input and produces assembler files as output, whereas StackGuard is implemented as a modification to gcc and, as a result, takes C source files as input and produces binary objects.

The basic idea is to save return addresses in an alternate memory space named retarray. Two other global variables are used: rettop, initialized on startup and never changed, is the address in memory where retarray ends; retptr is the address where the next clone is to be saved.

On entry to a protected function, the return address is copied from the stack to retarray and retptr is incremented. If there is no more space for clones, the return address is not saved, but retptr is incremented anyway.

Depending on command-line switches, the cloned return address can be checked against the one on the stack to detect possible attacks, or can be silently used instead of it.

The resulting effect is that return addresses saved on the stack are not used. Instead, the cloned return addresses stored in retarray are honored. This effectively stops standard stack-based buffer overflow attacks, but opens the door to other possibilities. For example, if we manage to alter retarray's contents, execution flow would surrender to us.

When the -d command-line switch is used, the epilogue is a little different: the cloned return address is compared to the one present on the stack, and if they are different, a SYS_exit system call is issued, abruptly terminating the program.
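The retarray bookkeeping can be modelled in C as follows. This is a sketch of the idea, not StackShield's generated assembly: retptr is kept as an index instead of an address, RETARRAY_SLOTS plays the role of rettop, and falling back to the stack's return address when no clone was saved is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RETARRAY_SLOTS 4  /* assumed capacity; stands in for rettop */

typedef struct {
    uint32_t retarray[RETARRAY_SLOTS]; /* cloned return addresses */
    size_t retptr;                     /* index of the next clone slot */
} Shield;

void shield_init(Shield *s) {
    s->retptr = 0;
}

/* Function entry: clone the return address if there is room. retptr is
 * incremented even when the clone could not be saved. */
void shield_on_entry(Shield *s, uint32_t stack_return_address) {
    if (s->retptr < RETARRAY_SLOTS) {
        s->retarray[s->retptr] = stack_return_address;
    }
    s->retptr++;
}

/* Function exit: returns the address to return through and reports in
 * *tampered whether the stack copy differs from the clone. A caller that
 * models the -d switch would terminate when *tampered is true. */
uint32_t shield_on_exit(Shield *s, uint32_t stack_return_address, bool *tampered) {
    assert(s->retptr > 0 && tampered != NULL);
    s->retptr--;
    if (s->retptr >= RETARRAY_SLOTS) {   /* no clone was saved for this call */
        *tampered = false;
        return stack_return_address;
    }
    uint32_t clone = s->retarray[s->retptr];
    *tampered = (clone != stack_return_address);
    return clone;
}
```

In the default mode the caller simply uses the returned clone, so an overwritten stack value has no effect. In the checking mode the caller inspects the tampered flag first.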

Return Range Checking

The command-line options -r and -g enable Ret Range Checking, which detects and stops attempts to return into addresses higher than that of the variable shielddatabase. This variable is assumed to mark the base of the program's data where, we may say for simplicity, the heap and stack are located.

This option is a little better in terms of security than just blindly going on, as a change in the return address will abort the program's execution. However, if for some reason we need the program to keep going after we overwrote the return address, we just need to overwrite it with its original value.

The difference between -r and -g is that the former enables both protection methods (address cloning and ret range checking), while the latter only enables ret range checking.

Function Call Target Checking
One more protection is available and enabled by default. It adds a check to see whether indirect function calls (made through a function pointer) target an address below shielddatabase or not.

Its main problem is that the function pointers usually abused when coding exploits are part of libc (atexit's, __malloc_hook, __free_hook, .dtors, etc.) or part of the dynamic linking mechanism (GOT), and are not protected when using StackShield unless you recompile everything.

Even with this protection turned on and everything recompiled, some other techniques, such as return into libc[3] (or jumping into libc) or, as we control the stack, just returning or jumping to a ret (the ret2ret technique), may be used to bypass both range checking protections.

Microsoft GS Protection
GS protection does protect the frame pointer, placing a random canary between it and the local variables. Like StackGuard, it is embedded in the C/C++ compiler.

When using this protection, if a buffer overflow is used to change the frame pointer or the return address, the random canary will inevitably be altered. If the canaries' randomness is good, this effectively stops the attacks described in Sections 3.2, 3.3 and 3.4. If the canaries can be predicted, not only can all the attacks described here be carried out, but a standard return address overwrite attack can also be used.

SSP (formerly ProPolice)
Most of the protection code is implemented in a file named protector.c. The entry point for the protection code is the function prepare_stack_protection(), which is called once for every function to be compiled.

SSP aims to protect the saved frame pointer, the local variables and the function's arguments, as well as the return address.

Random Canary

When the canary is modified, the function __stack_smash_handler() from libgcc2.c prints a message to stderr and logs it to syslog using /dev/log. After this, the program is terminated by calling abort(3). Remember that these modifications are not introduced at the assembly level, but rather in gcc's syntax trees or intermediate language.

Variable Reordering

The function arrange_var_order() rearranges variables, moving buffers (and structs containing buffers) to a position on the stack where, if overflown, only other local buffers may be altered.

By placing all non-buffer variables (var1, var3, var5 and p) at lower addresses and buffers (buf2, buf4 and s) at higher addresses, SSP effectively protects all non-buffers from being altered by a local buffer overflow. The saved frame pointer or the return address cannot be modified by a stack-based buffer overflow without changing the random canary, which would trigger the protection mechanism on function exit, before the altered addresses are used.

When a function defines more than a single buffer, all of them are moved to higher memory addresses, and while non-buffer local variables are protected, some local buffers may still be altered by overflowing other local buffers.

Non-Executable Stack and Return-to-libc Attack
To exploit the vulnerability as described so far, adversaries need to inject a piece of code into the user stack, and then execute the code from the stack. Making the stack non-executable prevents this. However, many operating systems, such as Linux, do save code on the stack, and thus need the stack to be executable.

The return-to-libc attack is an attack that does not need an executable stack. Instead of injected code, we can use the library functions of the operating system to achieve our goal. In Unix-like operating systems, the shared library called libc provides the C runtime. The code of libc is already in memory as a shared runtime library, and it can be accessed by all applications. The function system is one of the functions in libc. If we can call this function with the argument "/bin/sh", we can invoke a shell.

Return-to-libc attack
The first part of the return-to-libc attack is overflowing the buffer and modifying the return address on the stack.

The second part: the return address does not point to any injected code. It points to the entry point of the function system in libc.

Challenges:
How to find the location of the function system?
How to find the address of the string "/bin/sh"?
How to pass the address of the string "/bin/sh" to the system function?

Finding the location of the system function
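In most Unix operating systems that do not randomize the address space layout, the libc library is always loaded at a fixed memory address, so the address of the function system is the same on every run.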

Finding the address of "/bin/sh"
One option is to insert the string directly into the stack using the buffer overflow, and then guess its address.

Another option is to create, before running the vulnerable program, an environment variable with the value "/bin/sh". When a C program is executed from a shell, it inherits all the environment variables from the shell. In the following, we define a new shell variable MYSHELL and let its value be /bin/sh:

    $ export MYSHELL=/bin/sh
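Once MYSHELL is exported, a small helper program can report where the value ends up in its own memory. The sketch below uses the standard getenv() call; the function names are hypothetical, and the address seen by one program is only an estimate for another, since it can shift with the program name and the rest of the environment.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns a pointer to the value of the named environment variable inside
 * this process, or NULL when the variable is not set. */
const char *env_string_address(const char *name) {
    assert(name != NULL);
    return getenv(name);
}

/* Prints the address in the form a debugger would show it. */
void print_env_string_address(const char *name) {
    const char *value = env_string_address(name);
    if (value != NULL) {
        printf("%s=%s is stored at %p\n", name, value, (void *)value);
    } else {
        printf("%s is not set\n", name);
    }
}
```

Calling print_env_string_address("MYSHELL") from main() after the export above prints the address of the "/bin/sh" string for that run.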

Finding the address of "/bin/sh"
We also know that the function system uses "/bin/sh" in its own code. Therefore, this string must exist in libc. If we can find out the location of the string, we can use it directly. You can search the libc library file (/lib/libc.so.6) for the .rodata section, where read-only data such as string constants is stored:

    $ readelf -S /lib/libc.so.6 | egrep 'rodata'
    [15] .rodata PROGBITS e

Protection in /bin/bash
If "/bin/sh" points to "/bin/bash", then even if we can invoke a shell within a Set-UID program that is running with root privilege, we will not get root privilege, because bash drops the privilege when the effective user ID differs from the real user ID.

If we can turn the current Set-UID process into a real root process before invoking /bin/bash, we can bypass that restriction of bash. The setuid(0) system call can help you achieve that. Therefore, we need to first invoke setuid(0), and then invoke system("/bin/sh").

This means we need to "return to libc" twice. The first return goes to the setuid function in libc. When setuid finishes, it uses the return address in its own frame. If we can let this return address point to system, we can force the function setuid to return to the entry point of system.

Heap/BSS Buffer Overflow
Contents in Heap/BSS:
Constant strings
Global variables
Static variables
Dynamically allocated memory

Overwriting File Pointers
The (Set-UID) program's file pointer points to /tmp/vulprog.tmp. The program needs to write to this file during execution using the user's inputs. If we can cause the file pointer to point to /etc/shadow, we can cause the program to write to /etc/shadow.

We can use the buffer overflow to change the content of the variable tmpfile. Originally, it points to the "/tmp/vulprog.tmp" string. Using the buffer overflow vulnerability, we can change the content of tmpfile to 0x903040, which is the address of the string "/etc/shadow". After that, when the program uses the tmpfile variable to open the file for writing, it actually opens the shadow file.

How to find the address of /etc/shadow? We can pass the string as an argument to the program, so that the string /etc/shadow is stored in memory. We then need to guess where it is.
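The adjacency that makes this possible can be modelled with a struct in which a 16-byte input buffer is followed by the tmpfile pointer. The Globals struct, the buffer size and both function names are assumptions for illustration; real BSS layout depends on the compiler and linker. The unchecked variant shows the pointer being replaced, and the bounded variant shows the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 16

/* Model of two adjacent variables: an input buffer followed by the
 * pointer to the file name. */
typedef struct {
    char buf[BUF_SIZE];
    const char *tmpfile;
} Globals;

/* Models an unchecked copy of user input that starts at buf[0]. The
 * assert only keeps the simulation inside the Globals object. */
void unchecked_input(Globals *g, const unsigned char *input, size_t len) {
    assert(len <= sizeof *g);
    memcpy(g, input, len);
}

/* Bounded alternative: never writes past the end of buf. */
void checked_input(Globals *g, const unsigned char *input, size_t len) {
    assert(g != NULL && input != NULL);
    size_t n = len < BUF_SIZE ? len : BUF_SIZE;
    memcpy(g->buf, input, n);
}
```

Feeding both functions the same oversized input shows the difference: after checked_input() the tmpfile pointer still refers to the original name, while after unchecked_input() it holds whatever pointer bytes the input carried at that offset.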

Overwriting Function Pointers
A function pointer (e.g., "int (*funcptr)(char *str)") allows a programmer to dynamically change which function is called. We can overwrite a function pointer by overwriting the address it holds, so that when it is invoked, it calls the function we point it to instead.

Executable code from outside
argv[] method: store the shellcode in an argument to the program. This causes the shellcode to be stored on the stack. Then we need to guess the address of the shellcode (just like what we did in the stack buffer overflow). This method requires an executable stack.

Heap method: store the shellcode in the heap/BSS (by using the overflow). Then we need to guess the address of the shellcode, and assign this estimated address to the function pointer. This method requires an executable heap (which is more likely than an executable stack).

Function Pointers

Function pointers can be stored in heap/BSS through many different means. They do not need to be defined by the programmer. If a program calls atexit(), a function pointer will be stored in the heap by atexit(), and will be invoked when the program terminates. The svc/rpc registration functions (librpc, libnsl, etc.) keep callback functions stored on the heap.
