COM850 Computer Hacking and Security

Presentation on theme: "COM850 Computer Hacking and Security"— Presentation transcript:

1 COM850 Computer Hacking and Security
Lecture 1. x86 Prof. Taeweon Suh Computer Science & Engineering Korea University

2 x86? What is x86?
Generic term referring to processors from Intel, AMD, and VIA. Derived from the model numbers of the first few generations of processors: 8086, 80286, 80386 → x86. Now it generally refers to processors from Intel, AMD, and VIA.
x86-16: 16-bit processor
x86-32 (aka IA32): 32-bit processor * IA: Intel Architecture
x86-64: 64-bit processor
Intel takes about 80% of the PC market and AMD takes about 20%. Apple has also been shipping Intel-based Macs since 2006. * aka: also known as

3 x86 History (as of 2008)

4 x86 History (Cont.)
[Figure: processor timeline from 4-bit and 8-bit parts through the 8086 (1978), 32-bit (i386, i586, i686), and 64-bit (x86_64) generations, followed by Core i7: 1st Gen. Nehalem (2009), 2nd Gen. Sandy Bridge (2011), 3rd Gen. Ivy Bridge (2012), 4th Gen. Haswell (2013)]

5 Moore’s Law
Transistor count doubles about every 18 months (the period is also often quoted as two years).
[Figure: exponential growth in transistor count, from 2,250 to 42 million to 1.7 billion (Montecito)]

6 Feature Size (Technology) Trend

7 Power Dissipation: P ≈ C·VDD²·f
Until the early 2000s, Intel and AMD made every effort to increase clock frequency to improve the performance of their CPUs. But power consumption became the problem: P ≈ C·VDD²·f, where C: capacitance, VDD: supply voltage, f: clock frequency.
* Prescott: 3.8 GHz, 31 pipeline stages
* Tejas: was slated to operate at 7 GHz or higher with 40 to 50 pipeline stages. Canceled in May 2004.

8 Power Density Trend Source: Intel Corp.

9 Watch this! Click the chip Slide from Prof H.H. Lee in Georgia Tech

10 How to Reduce Power Consumption?
Reduce supply voltage with new technologies, i.e., by reducing transistor size. Keep the clock frequency in a modest range: no longer increase the clock frequency. Then… what would be the problem? Performance. So, the strategy is to integrate many simple CPUs in a chip: Dual Core, Quad Core….

11 Multi-core Processor Gala
Prof. Sean Lee’s Slide in Georgia Tech

12 Intel’s Core 2 Duo 2 cores on one chip
Two levels of caches (L1, L2) on chip. 291 million transistors in 143 mm² with 65 nm technology. [Figure: die diagram labeling Core0, Core1, DL1, IL1, and the L2 cache] Source:

13 Intel’s Core i7 4 cores on one chip
Three levels of caches (L1, L2, L3) on chip. 731 million transistors in 263 mm² with 45 nm technology

14 Intel’s Core i7 (2nd Gen.)
2nd Generation Core i7 (Sandy Bridge): L1 32 KB, L2 256 KB, L3 8 MB. 995 million transistors in 216 mm² with 32 nm technology

15 Intel’s Core i7 (3rd Gen.)
3rd Generation Core i7: L1 64 KB, L2 256 KB, L3 8 MB. 1.4 billion transistors in 160 mm² with 22 nm technology

16 AMD’s Opteron – Barcelona (2007)
4 cores on one chip 1.9GHz clock 65nm technology Three levels of caches (L1, L2, L3) on chip Integrated North Bridge

17 Intel Teraflops Research Chip
80 CPU cores Deliver more than 1 trillion floating-point operations per second (1 Teraflops) of performance Introduced in September 2006

18 Intel’s 48 Core Processor
48 x86 cores manufactured with 45nm technology Nicknamed “single-chip cloud computer” Debuted in December 2009

19 Model of Memory Hierarchy
[Figure: Reg File → L1 (Data cache, Inst cache) → L2 Cache → Main Memory → DISK; caches are built from SRAM and main memory from DRAM] Slide from Prof. Sean Lee, Georgia Tech

20 x86 Operation Modes
Real Mode (= real address mode): programming environment of the 8086 processor. The 8086 is a 16-bit processor from Intel.
Protected Mode: native state of the 32-bit Intel processor (32-bit mode).
IA-32e mode (Intel) or Long mode (AMD): has 2 sub-modes, compatibility mode and 64-bit mode. Compatibility mode is enabled by the operating system on a per-code-segment basis. This means that a single 64-bit OS can support both 64-bit applications running in 64-bit mode and legacy 32-bit applications running in compatibility mode.

21 Registers in x86 Registers in 8086
4 segment registers (16-bit) CS, DS, SS, ES 8 general-purpose registers (16-bit) AX, BX, CX, DX, SP, BP, SI, DI Registers in x86-32 (Protected Mode) 6 segment registers (16-bit) CS, DS, SS, ES, FS, GS 8 general-purpose registers (32-bit) EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI

22 Registers in x86 Registers in IA-32e (Long mode)
6 segment registers (16-bit) CS, DS, SS, ES, FS, GS 16 general-purpose registers (64-bit) RAX, RBX, RCX, RDX, RSP, RBP, RSI, RDI, R8 ~ R15

23 EFLAGS in x86

24 EFLAGS in x86

25 Software Compatibility
Compatibility mode allows system software to implement binary compatibility with existing 16-bit and 32-bit x86 applications. Real mode is not supported when the processor is operating in long mode, because long mode requires that paged protected mode be enabled. Virtual 8086 mode is not supported when the processor is operating in long mode either.
My thoughts: applications in protected mode can manipulate either 32-bit or 16-bit data. According to Table 1-1, 16-bit code can be written without entering Virtual 8086 mode or real mode.
Reference: AMD64 Architecture Programmer’s Manual, Vol. 2: System Programming

26 Segmentation and Paging in Protected Mode

27 TLB in Processor
Translation Lookaside Buffer (TLB): the TLB is there for virtual memory. [Figure: Intel Pentium Processor (1993); inside the processor, the CPU core sends a virtual address to the TLB, which sends the physical address to main memory; data returns to the CPU core]

28 Real Mode Addressing
In real mode (8086), general-purpose registers are all 16 bits wide. Segment registers specify the base address of each segment.
Segment registers
  CS: Code Segment for instructions
  DS: Data Segment for data
  SS: Stack Segment for stack
  ES: Extra Segment, which can be used to store more data
Addressing method: (Segment << 4) + offset = physical address
Example (main memory is 1 MB, 0x0 to 0xFFFFF):
  mov ax, 2000h
  mov ds, ax
  mov al, [100h]
DS = 2000h, so the segment base is 20000h (= 2000h << 4); with offset 100h, the physical address is 20100h.

29 Protected Mode Addressing
[Figure: in the CPU, the segment selector (Index, TI, RPL) is the part of a segment register visible to software. The Index picks a segment descriptor from the GDT (TI = 0) or the LDT (TI = 1) in main memory, and that descriptor's Base, Limit, and Access info are loaded into the part of the segment register that is invisible to software.] TI: Table Indicator RPL: Requested Privilege Level

30 Segment Descriptor Format
Software (OS) creates descriptor tables (GDT, LDT) When S == 1

31 Address Translation in Protected Mode

32 Segmentation in Linux (Protected Mode)
All Linux processes running in User mode use the same pair of segments to address instructions and data, and the same holds for processes running in Kernel mode. CS, DS bases: 0x0. Limit: 0xfffff in 4 KB units (4 GB). Thus, the logical address (the offset) is the same as the linear address.

33 Paging

34 Page Translation in Protected Mode (4K Page, Non-PAE)

35 Page Translation in Protected Mode (4KB, PAE)
32-bit linear address 52-bit physical address PAE: Physical Address Extension

36 Address Translation in 64-bit Mode
System descriptors in the GDT and LDT (for example, LDT and TSS descriptors) are 16 B wide in IA-32e mode; code and data segment descriptors remain 8 B wide. Segmentation is disabled in 64-bit mode. Thus, switching a logical processor into 64-bit mode causes it to enforce the Flat Memory Model by largely disabling the segmented memory logic. However, anytime the 64-bit OS kernel causes the logical processor to jump to a 16- or 32-bit legacy code segment, the segmentation logic is immediately re-enabled in order to maintain backward compatibility. Reference: x86 Instruction Set Architecture, Tom Shanley, MindShare, 2009

37 Code Segment Descriptor
Segmentation is disabled in 64-bit mode Compatibility mode is enabled by the operating system on a code segment basis L (Long) bit 1: 64-bit mode 0: Compatibility mode AMD64 Architecture Programmer’s Manual. Vol 2 System Programming

38 Page Translation in 64-bit Mode
48-bit linear address 52-bit physical address

39 Linear Space Segmentation
A compiled program’s memory is divided into 5 segments:
  Text segment (code segment): where the program (assembled machine instructions) is located
  Data segment: holds the initialized global and static variables
  bss (Block Started by Symbol) segment: holds the uninitialized global and static variables
  Heap segment: for dynamic allocation and deallocation of memory using malloc() and free()
  Stack segment: scratchpad that stores local variables and the context saved during function calls

40 Stack Frame
EBP (aka Frame Pointer (FP) or Local Base (LB) Pointer) is used for referencing function parameters and local variables in the current stack frame.
Each stack frame contains
  Parameters to the function
  Local variables
  2 pointers: Saved Frame Pointer (SFP) and return address
    SFP for restoring EBP to its previous value
    Return address for restoring EIP to its previous value

41 Stack Layout with x86 Source: Reversing, Secrets of Reverse Engineering, Eldad Eilam, 2005

42 Stack Frame Example

43 Stack Frame Example memory
[Figure: stack from high address to low address, with ESP marked after each step: the caller's saved ebp (push ebp; mov ebp, esp), its locals (sub esp, 0x20) holding the values 1 to 6, and the return address eip pushed by the call] Compilation outcome could be different depending on compiler version and optimization flags

44 Stack Frame Example LEAVE instruction memory
[Figure: stack from high address to low address before LEAVE: the caller's frame (saved ebp, the values 1 to 6, return address eip) sits above the callee's frame, which holds the saved ebp (push ebp; mov ebp, esp), Result (a+b+c+d), and the bytes A to J in the space reserved by sub esp, 0x10] Compilation outcome could be different depending on compiler version and optimization flags

45 Stack Frame Example after RET instruction memory
[Figure: stack from high address to low address after RET: ESP and EBP point back into the caller's frame (saved ebp, the values 1 to 6, Result (a+b+c+d)); the callee's old eip, ebp, Result, and the bytes A to J remain in memory below ESP] Compilation outcome could be different depending on compiler version and optimization flags

46 Backup Slides

47 Segment Selector

48 Floating Point Formats

49 Debugging Tools
GDB, the GNU Project Debugger
DDD, the Data Display Debugger: GUI front end to GDB
Eclipse Integrated Development Environment (IDE): Eclipse CDT (C/C++ Development Toolkit), via “Install New Software” Name: Galileo URL:
IDA Pro, the Interactive Disassembler Professional: audits binaries with no source code; supports more than 50 families of processors; IDA 5.0 is free for non-commercial use

50 Just in case… Compile your code with gcc
  gcc -g float-d.c -o float-d    // compiled with debugging info
Disassemble the binary with objdump
  objdump -M intel -Stx float-d > float-d.dump

51 GDB
$ echo "set disassembly-flavor intel" > ~/.gdbinit
Shows disassembly in Intel format (rather than AT&T format): operation <destination>, <source>
  mov ebp, esp ; ebp <- esp
GDB command summary
  (gdb) help
  (gdb) help disass
  (gdb) list
  (gdb) list 1,20
  (gdb) disass main
  (gdb) disass /mr main
  (gdb) info registers (or i r) ; display x86 registers
    Examples: (gdb) i r
              (gdb) i r $eip
  (gdb) x ; examine memory
    (gdb) x/10i $eip ; display 10 instructions from eip
    (gdb) x/2x $eip ; display 2 words (4 bytes each) in hex. Unit sizes: b (byte), h (halfword, 2B), w (word, 4B), g (giant, 8B)
  (gdb) nexti ; execute 1 machine instruction. Will not enter subfunctions (steps over calls)
  (gdb) stepi ; execute 1 machine instruction. Will step into subfunctions
  (gdb) next ; step program to the next source line, proceeding through subroutine calls
  (gdb) step ; step program until it reaches a different source line, entering called functions

52 x86 Instructions
CALL – call procedure
  In a 32-bit near call, push the EIP of the instruction following the CALL instruction, then branch to the target specified in the operand
LEAVE – high-level procedure exit
  Release the stack frame set up by an earlier ENTER instruction
  In 32-bit, ESP ← EBP; EBP ← pop();
RET – return from procedure
  In a 32-bit near return, EIP ← pop();

53 x86 Instructions
PUSH – push word, double-word or quadword onto the stack
  Decrement the stack pointer and then store the source operand on the top of the stack
POP – pop a value from the stack
  Load the value from the top of the stack and increment the stack pointer
LEA – load effective address
  For instance, LEA ecx, dword ptr [edx+edx] performs ECX ← EDX + EDX;
  Note that even though most disassemblers add the words DWORD PTR before the operands, LEA really can’t distinguish between a pointer and an integer. LEA never performs any actual memory accesses.
  From the book “Reversing” (p512): Starting with Pentium 4, the situation has reversed and most compilers will use ADD and SUB when generating code. However, when surrounded by several other ADD or SUB instructions, the Intel compiler still seems to use LEA. This is probably because the execution unit employed by LEA is separate from the ones used by ADD and SUB. Using LEA makes sense when the main ALUs are busy – it improves the chances of achieving parallelism in runtime

54 x86 Instructions
TEST
  Compute the bit-wise logical AND of the first and second operands
  Set flags (SF, ZF, and PF) according to the result; CF and OF are cleared
  Then, discard the result
  Example: test eax, eax

55 Buffer Overflow Protection
  gcc -fstack-protector-all
  gcc -fno-stack-protector
  objdump -SD --disassembler-options=intel stack_example

