Download presentation
Presentation is loading. Please wait.
Published byMerilyn Caroline Oliver Modified over 9 years ago
1
CS252: Systems Programming Ninghui Li Based on Slides by Prof. Gustavo Rodriguez- Rivera Topic 2: Program Structure and Using GDB
2
What Happens From a C Source Program, to Program Execution Building a program, i.e, generating an executable file from source code What are the steps? What does an executable file look like? Loading a program Each time when you execute a program, a process is created. In Unix/Linux, use “ps” command to show processes in the system
3
Building a Program The programmer writes a program hello.c The preprocessor expands #define, #include, #ifdef etc preprocessor statements and generates a hello.i file. The compiler compiles hello.i, optimizes it and generates an assembly instruction listing hello.s The assembler (as) assembles hello.s and generates an object file hello.o The compiler (cc or gcc) by default hides all these intermediate steps. You can use compiler options to run each step independently.
4
Original file hello.c #include main() { printf("Hello\n"); }
5
After preprocessor gcc -E hello.c > hello.i (-E stops compiler after running preprocessor) hello.i: /* Expanded /usr/include/stdio.h */ typedef void *__va_list; typedef struct __FILE __FILE; typedef int ssize_t; struct FILE {…}; extern int fprintf(FILE *, const char *,...); extern int fscanf(FILE *, const char *,...); extern int printf(const char *,...); /* and more */ main() { printf("Hello\n"); }
6
After assembler gcc -S hello.c (-S stops compiler after generating assembly code) Resulting file is hello.s Actual code depends on the system LC0:.ascii "Hello\0".text.globl _main.def _main;.scl 2;.type 32;.endef _main: pushl %ebp movl %esp, %ebp andl $-16, %esp subl $16, %esp call ___main movl $LC0, (%esp) call _puts leave ret
7
After compiling & assembling “gcc -c hello.c” generates hello.o The main function already has a value in the object file hello.o hello.o has undefined symbols, like the _puts function call that we don’t know where it is placed. The command “nm” can lists the symbols from object files
8
Output of “nm hello.o” 0000000000000000 b.bss// uninitilized data 0000000000000000 d.data// Global and static vars 0000000000000000 t.text U __main// entry point of program 0000000000000000 T main// main function defined in code U puts __main and puts are undefined in “hello.o” They are provided by the libraries
9
Building a program (continued) The linker puts together all object files as well as the object files in static libraries. The linker also takes the definitions in shared libraries and verifies that the symbols (functions and variables) needed by the program are completely satisfied. If there is symbol that is not defined in either the executable or shared libraries, the linker will give an error. Static libraries (.a files) are added to the executable. shared libraries (.so files) are not added to the executable file.
10
Static and Shared Libraries Shared libraries are shared across different processes. There is only one instance of each shared library for the entire system. Static libraries are not shared. There is an instance of an static library for each process.
11
After linking “gcc –o hello hello.c” generates the hello executable The hello.o object code is statically linked with libraries to include code of library functions In linking, “static = compilation/building time” Sometimes, not all functions’ code are included, some code are stored in shared libraries and dynamically linked. In linking, “dynamic = loading/execution time”
12
Building a Program Programmer C Preprocessor Compiler (cc) Optimizer Assembler (as) (static) Linker (ld) Editor hello.c hello.i hello.s hello.o Executable File (hello) Other.o files Static libraries (.a files) They add to the size of the executable.
13
What is a program? A program is a file in a special format that contains all the necessary information to load an application into memory and make it run. A program file includes: machine instructions initialized data List of library dependencies List of memory sections that the program will use List of undefined values in the executable that will be known until the program is loaded into memory.
14
Executable File Formats There are different executable file formats ELF – Executable Link File It is used in most UNIX systems (Solaris, Linux) Can use elfdump to see information in binary file COFF – Common Object File Format It is used in Windows systems a.out – Used in BSD (Berkeley Standard Distribution) and early UNIX It was very restrictive. It is not used anymore. Note: BSD UNIX and AT&T UNIX are the predecessors of the modern UNIX flavors like Solaris and Linux.
15
Loading a Program After one types hello in a shell, the shell creates a new process and load the file hello. The loader is a program that is used to run an executable file in a process. Before the program starts running, the loader allocates space for all the sections of the executable file (text, data, bss etc) It loads into memory the executable and shared libraries (if not loaded yet)
16
Loading a Program It also writes (resolves) any values in the executable to point to the functions/variables in the shared libraries.(E.g. calls to printf in hello.c) Once memory image is ready, the loader jumps to the _start entry point that calls init() of all libraries and initializes static constructors. Then it calls main() and the program begins. _start also calls exit() when main() returns. The loader is also called “runtime linker”.
17
Loading a Program Loader (runtime linker) (/usr/lib/ld.so.1) Executable File Executable in memory Shared libraries (.so,.dll)
18
Memory Structure of a Process
19
Memory of a Process A 32-bit process sees memory as an array of bytes that goes from address 0 to 2 32 -1 (0 to 4GB-1) 0 (4GB-1) 2 32 -1
20
Memory Sections The memory is organized into sections called “memory mappings”. Stack Text Data Bss Heap Shared Libs 0 2 32 -1
21
Memory Sections Each section has different permissions: read/write/execute or a combination of them. Text- Instructions that the program runs Data – Initialized global variables. Bss – Uninitialized global variables. They are initialized to zeroes. Heap – Memory returned when calling malloc/new. It grows upwards. Stack – It stores local variables and return addresses. It grows downwards.
22
Memory Sections Dynamic libraries – They are libraries shared with other processes. Each dynamic library has its own text, data, and bss. Each program has its own view of the memory that is independent of each other. Virtual memory, mapped by OS to physical memory This view is called the “Address Space” of the program. If a process modifies a byte in its own address space, it will not modify the address space of another process.
23
Memory Sections data 61 $ cat /proc/22649/maps 08048000-08049000 r-xp 00000000 00:18 270792 /u/u2/ninghui/252/t01/fac2 08049000-0804a000 r--p 00000000 00:18 270792 /u/u2/ninghui/252/t01/fac2 0804a000-0804b000 rw-p 00001000 00:18 270792 /u/u2/ninghui/252/t01/fac2 f7519000-f751a000 rw-p 00000000 00:00 0 f751a000-f76bc000 r-xp 00000000 08:02 657 /lib32/libc-2.16.so f76bc000-f76be000 r--p 001a2000 08:02 657 /lib32/libc-2.16.so f76be000-f76bf000 rw-p 001a4000 08:02 657 /lib32/libc-2.16.so f76bf000-f76c2000 rw-p 00000000 00:00 0 f76e6000-f76e8000 rw-p 00000000 00:00 0 f76e8000-f76e9000 r-xp 00000000 00:00 0 [vdso] f76e9000-f7709000 r-xp 00000000 08:02 609 /lib32/ld-2.16.so f7709000-f770a000 r--p 0001f000 08:02 609 /lib32/ld-2.16.so f770a000-f770b000 rw-p 00020000 08:02 609 /lib32/ld-2.16.so ffb39000-ffb5a000 rw-p 00000000 00:00 0 [stack]
24
Memory Sections Each row in /proc/$PID/maps describes a region of contiguous virtual memory in a process or thread. Each row has the following fields: address - This is the starting and ending address of the region in the process's address space permissions - This describes how pages in the region can be accessed. There are four different permissions: read, write, execute, and shared. If read/write/execute are disabled, a '- ' will appear instead of the 'r'/'w'/'x'. If a region is not shared, it is private, so a 'p' will appear instead of an 's'. If the process attempts to access memory in a way that is not permitted, a segmentation fault is generated. Permissions can be changed using the mprotect system call. offset - If the region was mapped from a file (using mmap), this is the offset in the file where the mapping begins. If the memory was not mapped from a file, it's just 0. device - If the region was mapped from a file, this is the major and minor device number (in hex) where the file lives. inode - If the region was mapped from a file, this is the file number. pathname - If the region was mapped from a file, this is the name of the file. This field is blank for anonymous mapped regions. There are also special regions with names like [heap], [stack], or [vdso]. [vdso] stands for virtual dynamic shared object. It's used by system calls to switch to kernel mode. Here's a good article about it.Here's a good article about it.
25
Where things are located Program hello.c int a = 5; // Stored in data section int b[20]; // Stored in bss int main() { // Stored in text int x; // Stored in stack int *p =(int*) malloc(sizeof(int)); //In heap }
26
Memory Gaps Between each memory section there may be gaps that do not have any memory mapping. If the program tries to access a memory gap, the OS will send a SEGV signal that by default kills the program and dumps a core file. The core file contains the value of the variables global and local at the time of the SEGV. The core file can be used for “post mortem” debugging. gdb program-name core gdb> where
27
Using a Debugger
28
What is GDB GDB is a debugger that helps you debug your program. The time you spend now learning gdb will save you days of debugging time. A debugger will make a good programmer a better programmer.
29
Compiling a program for gdb You need to compile with the “-g” option to be able to debug a program with gdb. The “-g” option adds debugging information to your program gcc – g – o hello hello.c
30
Running a Program with gdb To run a program with gdb type gdb progname (gdb) Then set a breakpoint in the main function. (gdb) break main A breakpoint is a marker in your program that will make the program stop and return control back to gdb. Now run your program. ( gdb) run If your program has arguments, you can pass them after run.
31
Stepping Through your Program Your program will start running and when it reaches “main()” it will stop. gdb> Now you have the following commands to run your program step by step: (gdb) step It will run the next line of code and stop. If it is a function call, it will enter into it (gdb) next It will run the next line of code and stop. If it is a function call, it will not enter the function and it will go through it. Example: ( gdb) step (gdb) next
32
Setting breakpoints You can set breakpoints in a program in multiple ways: (gdb) break function Set a breakpoint in a function E.g. (gdb) break main ( gdb) break line Set a break point at a line in the current file. E.g. ( gdb) break 66 It will set a break point in line 66 of the current file. ( gdb) break file:line It will set a break point at a line in a specific file. E.g. ( gdb) break hello.c:78
33
Regaining the Control When you type (gdb) run the program will start running and it will stop at a break point. If the program is running without stopping, you can regain control again typing ctrl-c.
34
Where is your Program The command (gdb)where Will print the current function being executed and the chain of functions that are calling that fuction. This is also called the backtrace. Example: (gdb) where #0 main () at test_mystring.c:22 (gdb)
35
Printing the Value of a Variable The command (gdb) print var Prints the value of a variable. E.g. (gdb) print i $1 = 5 (gdb) print s1 $1 = 0x10740 "Hello" (gdb) print stack[2] $1 = 56 (gdb) print stack $2 = {0, 0, 56, 0, 0, 0, 0, 0, 0, 0} (gdb)
36
Exiting gdb The command “quit” exits gdb. (gdb) quit The program is running. Exit anyway? (y or n) y
37
Debugging a Crashed Program This is also called “postmortem debugging” It has nothing to do with CSI When a program crashes, it writes a core file. bash-4.1$./hello Segmentation Fault (core dumped) bash-4.1$ The core is a file that contains a snapshot of the program at the time of the crash. That includes what function the program was running.
38
Debugging a Crashed Program To run gdb in a crashed program type gdb program core E.g. bash-4.1$ gdb hello core GNU gdb 6.6 Program terminated with signal 11, Segmentation fault. #0 0x000106cc in main () at hello.c:11 11 *s2 = 9; (gdb) Now you can type where to find out where the program crashed and the value of the variables at the time of the crash. (gdb) where #0 0x000106cc in main () at hello.c:11 (gdb) print s2 $1 = 0x0 (gdb) This tells you why your program crashed. Isn’t that great?
39
Now Try gdb in Your Own Program Make sure that your program is compiled with the –g option. Remember: One hour you spend learning gdb will save you days of debugging. Faster development, less stress, better results
40
Stack Buffer Overflow
41
CS5 26 Topic 9: Software Vulnerabilities 41 Call Stack Aka. Execution stack, control stack, run-time stack, machine stack Why do we need to use stacks in processes? To support function calls, and especially recursive function calls. What are stored on the stack? Functional call parameters Local Return address Saved state information
42
Stack Frame Parameters Return address Saved Stack Frame Pointer Local variables SP Stack Growth High Address Low Address
43
Code Fragment for Printing Stack Frame (from prstack.c) int fac(int a, int p) { char f[8] = " "; int b = 0; // print stack frame gets(f);// buffer may overflow if (a == 1) { b = 1; } else { b = a * fac(a-1,p); } // print stack frame again } return b; } int main(int argc, char*argv[]) { int n; int r; if (argc == 2) { n = atoi(argv[1]); r = fac(n, 0); } else if (argc == 3) { n = atoi(argv[2]); r = fac(n, 1); } return 0; }
44
Code Fragment for Printing Stack Frame (from prstack.c) int fac(int a, int p) { char f[8] = " "; int b = 0; printf("Address %p: argument int p: 0x%.8x\n", &p, p); printf("Address %p: argument int a: 0x%.8x\n", &a, a); printf("Address %p: return address : 0x%.8x\n", &a-1, *(&a-1)); printf("Address %p: saved stack frame p: 0x%.8x\n", &a-2, *(&a-2)); printf("Address %p: local var f[4-7] : 0x%.8x\n", (char *)(&f)+4, *((int *)(&f[4]))); printf("Address %p: local var f[0-3] : 0x%.8x\n", &f, *((int *)f)); printf("Address %p: local var int b: 0x%.8x\n", &b, b); printf("Address %p: gap : 0x%.8x\n", &b-1, *(&b-1)); … }
45
Printed Stack Frame Entering function call fac(a=2), code at 0x080484a5 Address 0xff98942c: argument int p: 0x00000001 Address 0xff989428: argument int a: 0x00000002 Address 0xff989424: return address : 0x0804860e Address 0xff989420: saved stack frame p: 0xff989440 Address 0xff98941c: local var f[4-7] : 0x00202020 Address 0xff989418: local var f[0-3] : 0x20202020 Address 0xff989414: local var int b: 0x00000000 Address 0xff989410: gap : 0x00000000 Entering function call fac(a=1), code at 0x080484a5 Address 0xff98940c: argument int p: 0x00000001 Address 0xff989408: argument int a: 0x00000001 Address 0xff989404: return address : 0x0804860e Address 0xff989400: saved stack frame p: 0xff989420 Address 0xff9893fc: local var f[4-7] : 0x00202020 Address 0xff9893f8: local var f[0-3] : 0x20202020 Address 0xff9893f4: local var int b: 0x00000000 Address 0xff9893f0: gap : 0x00000000
46
Stack Frame with Overflowed Buffer Entering function call fac(a=1), code at 0x080484a5 Address 0xffd5724c: argument int p: 0x00000001 Address 0xffd57248: argument int a: 0x00000001 Address 0xffd57244: return address : 0x0804860e Address 0xffd57240: saved stack frame p: 0xffd57260 Address 0xffd5723c: local var f[4-7] : 0x00202020 Address 0xffd57238: local var f[0-3] : 0x20202020 Address 0xffd57234: local var int b: 0x00000000 Address 0xffd57230: gap : 0x00000000 123456789012345 Leaving function call fac(a=1), code at 0x80484a5 Address 0xffd5724c: argument int p: 0x00000001 Address 0xffd57248: argument int a: 0x00000001 Address 0xffd57244: return address : 0x00353433 Address 0xffd57240: saved stack frame p: 0x32313039 Address 0xffd5723c: local var f[4-7] : 0x38373635 Address 0xffd57238: local var f[0-3] : 0x34333231 Address 0xffd57234: local var int b: 0x00000001 Address 0xffd57230: gap : 0x00000001 Segmentation fault (core dumped) Overflowing f to overwrite saved sfp and return address. Input 15 bytes.
47
What does a function do? fac 0x080484a5 : push %ebp save stack frame pointer (fp) 0x080484a6 : mov %esp,%ebp set current stack fp 0x080484a8 : sub $0x18,%esp allocate space for local var 0x080484ab : movl $0x20202020,-0x8(%ebp) initialize f[0-3] 0x080484b2 : movl $0x202020,-0x4(%ebp) initialize f[4-7] 0x080484b9 : movl $0x0,-0xc(%ebp) initialize b 0x080484c0 : mov 0xc(%ebp),%eax load value of p to eax 0x080484c3 : test %eax,%eax check if eax is 0 0x080484c5 : je 0x80485e8 if so, skip printing frame.... 0x080485e8 : mov 0x8(%ebp),%eax load value of a to eax 0x080485eb : cmp $0x1,%eax check if a==1 0x080485ee : jne 0x80485f9 if not, call fac 0x080485f0 : movl $0x1,-0xc(%ebp) otherwise, assigns 1 to b 0x080485f7 : jmp 0x8048617 …. 0x08048609 : call 0x80484a5 0x0804860e : mov 0x8(%ebp),%edx 0x08048611 : imul %edx,%eax
48
GDB commands for examining stack frames backtracebtprint all frames framef print brief current frame info info frameinfo f print detailed current frame info See http://web.mit.edu/gnu/doc/html/gdb_8.html for more
49
What is Buffer Overflow? A buffer overflow, or buffer overrun, is an anomalous condition where a process attempts to store data beyond the boundaries of a fixed-length buffer. The result is that the extra data overwrites adjacent memory locations. The overwritten data may include other buffers, variables and program flow data, and may result in erratic program behavior, a memory access exception, program termination (a crash), incorrect results or ― especially if deliberately caused by a malicious user ― a possible breach of system security. Most common with C/C++ programs
50
History Used in 1988’s Morris Internet Worm Alphe One’s “Smashing The Stack For Fun And Profit” in Phrack Issue 49 in 1996 popularizes stack buffer overflows Still extremely common today
51
What are buffer overflows? Suppose a web server contains a function: void func(char *str) { char buf[128]; strcpy(buf, str); do-something(buf); } When the function is invoked the stack looks like: What if *str is 136 bytes long? After strcpy : strret-addrsfpbuf str *str ret
52
Basic stack exploit Main problem: no range checking in strcpy(). Suppose *str is such that after strcpy stack looks like: When func() exits, the user will be given a shell !! Note: attack code runs in stack. top of stack *str ret Code for P Program P: exec( “/bin/sh” ) (exact shell code by Aleph One)
53
Carrying out this attack requires Determine the location of injected code position on stack when func() is called. So as to change stored return address on stack to point to it Location of injected code is fixed relative to the location of the stack frame Program P should not contain the ‘\0’ character. Easy to achieve Overflow should not crash program before func() exits. 53
54
Summary of Stack-based Buffer Overflow Local variables occur before (in lower address) than stored return address If overflow occurs when writing to local variable buffers (e.g., character arrays), the return address may be overwritten. When the current function returns, it will go to address
55
Some unsafe C lib functions strcpy (char *dest, const char *src) strcat (char *dest, const char *src) gets (char *s) scanf ( const char *format, … ) sprintf (conts char *format, … ) Many others exist
56
Review Steps of building a program Static vs. shared library Static vs. dynamic linking Memory structure of a process: text, data, stack, heap Concept and structure of stack frames Concept of buffer overflow; able to identify code that include buffer overflow
57
Coming Attractions Looking at even more detail how a program is compiled Topic 3: Programming in FIZ
Similar presentations
© 2025 SlidePlayer.com Inc.
All rights reserved.