1
COM850 Computer Hacking and Security
Lecture 1. x86
Prof. Taeweon Suh
Computer Science & Engineering
Korea University
2
x86?
What is x86?
  Generic term referring to processors from Intel, AMD, and VIA
  Derived from the model numbers of the first few generations of processors: 8086, 80286, 80386, 80486
  x86-16: 16-bit processor
  x86-32 (aka IA-32): 32-bit processor
  x86-64: 64-bit processor
Intel takes about 80% of the PC market and AMD takes about 20%
Apple has also been shipping Intel-based Macs since 2006
* IA: Intel Architecture
* aka: also known as
3
x86 History (as of 2008)
4
x86 History (Cont.)
[Figure: processor timeline with the 8086 in 1978; labels: 4-bit, 8-bit, 32-bit (i386), 32-bit (i586), 32-bit (i686), 64-bit (x86_64)]
  2009: 1st Gen. Core i7 (Nehalem)
  2011: 2nd Gen. Core i7 (Sandy Bridge)
  2012: 3rd Gen. Core i7 (Ivy Bridge)
  2013: 4th Gen. Core i7 (Haswell)
5
Moore’s Law
Transistor count doubles roughly every 18 to 24 months
[Figure: exponential growth in transistor count, from 2,250 to 42 million to 1.7 billion (Montecito)]
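To put the chart's numbers in perspective, the short sketch below counts how many doublings separate 2,250 transistors from 1.7 billion (between 19 and 20) and how many years that would take at a fixed doubling period. The function names are made up for this illustration, and a constant doubling period is an idealization rather than a measured trend.

```python
import math


def doublings(start_count: float, end_count: float) -> float:
    """Number of doublings needed to grow from start_count to end_count."""
    return math.log2(end_count / start_count)


def years_needed(start_count: float, end_count: float,
                 months_per_doubling: float = 18.0) -> float:
    """Years of growth implied by a fixed doubling period (an idealization)."""
    return doublings(start_count, end_count) * months_per_doubling / 12.0


if __name__ == "__main__":
    # Transistor counts taken from the slide: 2,250 and 1.7 billion.
    print(f"doublings: {doublings(2_250, 1.7e9):.1f}")
    print(f"years at 18 months/doubling: {years_needed(2_250, 1.7e9):.1f}")
    print(f"years at 24 months/doubling: {years_needed(2_250, 1.7e9, 24.0):.1f}")
```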
6
Feature Size (Technology) Trend
7
Power Dissipation
Until the early 2000s, Intel and AMD made every effort to increase clock frequency to enhance the performance of their CPUs
But power consumption became the problem
  $P \approx C \cdot V_{DD}^{2} \cdot f$
  C: capacitance
  V_DD: supply voltage
  f: clock frequency
* Prescott: 3.8 GHz, 31 pipeline stages
* Tejas: was slated to operate at 7 GHz or higher with 40 to 50 pipeline stages; cancelled in May 2004
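As a rough worked example of the power relation: because voltage enters squared, lowering the supply voltage saves far more power than lowering the frequency by the same factor. The function names and input values below are hypothetical and chosen only for illustration.

```python
def dynamic_power(capacitance: float, vdd: float, frequency: float) -> float:
    """Approximate dynamic power: P ~ C * VDD^2 * f."""
    return capacitance * vdd ** 2 * frequency


def relative_power(vdd_scale: float, freq_scale: float) -> float:
    """Power relative to a baseline after scaling VDD and f (C unchanged)."""
    return vdd_scale ** 2 * freq_scale


if __name__ == "__main__":
    # Doubling the clock alone doubles power.
    print(relative_power(1.0, 2.0))
    # Halving the voltage alone cuts power to one quarter.
    print(relative_power(0.5, 1.0))
    # Hypothetical case: 20% lower voltage and 20% lower frequency.
    print(relative_power(0.8, 0.8))
```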
8
Power Density Trend Source: Intel Corp.
9
Watch this! Click the chip Slide from Prof H.H. Lee in Georgia Tech
10
How to Reduce Power Consumption?
Reduce the supply voltage with new process technologies, i.e., by reducing transistor size
Keep the clock frequency in a modest range
  No longer increase the clock frequency
Then… what would be the problem?
  Performance
So, the strategy is to integrate many simple CPUs on a chip
  Dual core, quad core, …
11
Multi-core Processor Gala
Prof. Sean Lee’s Slide in Georgia Tech
12
Intel’s Core 2 Duo
2 cores on one chip
Two levels of caches (L1, L2) on chip
291 million transistors in 143 mm² with 65 nm technology
[Figure: die photo labeling Core0, Core1, the L1 data and instruction caches (DL1, IL1), and the L2 cache]
13
Intel’s Core i7
4 cores on one chip
Three levels of caches (L1, L2, L3) on chip
731 million transistors in 263 mm² with 45 nm technology
14
Intel’s Core i7 (2nd Gen.)
2nd Generation Core i7 (Sandy Bridge)
L1: 32 KB, L2: 256 KB, L3: 8 MB
995 million transistors in 216 mm² with 32 nm technology
15
Intel’s Core i7 (3rd Gen.)
3rd Generation Core i7
L1: 64 KB, L2: 256 KB, L3: 8 MB
1.4 billion transistors in 160 mm² with 22 nm technology
16
AMD’s Opteron – Barcelona (2007)
4 cores on one chip
1.9 GHz clock
65 nm technology
Three levels of caches (L1, L2, L3) on chip
Integrated North Bridge
17
Intel Teraflops Research Chip
80 CPU cores
Delivers more than 1 trillion floating-point operations per second (1 teraflops)
Introduced in September 2006
18
Intel’s 48 Core Processor
48 x86 cores manufactured with 45 nm technology
Nicknamed “single-chip cloud computer”
Debuted in December 2009
19
Model of Memory Hierarchy
[Figure: memory hierarchy from the register file, through the L1 data and instruction caches and the L2 cache (SRAM), to main memory (DRAM) and disk]
Slide from Prof. Sean Lee at Georgia Tech
20
x86 Operation Modes
Real Mode (= real address mode)
  Programming environment of the 8086 processor
  8086 is a 16-bit processor from Intel
Protected Mode
  Native state of the 32-bit Intel processor
  32-bit mode
IA-32e mode (Intel) or Long mode (AMD)
  2 sub-modes: Compatibility mode and 64-bit mode
  Compatibility mode is enabled by the operating system on a per-code-segment basis. This means that a single 64-bit OS can support both 64-bit applications running in 64-bit mode and legacy 32-bit applications running in compatibility mode.
21
Registers in x86
Registers in 8086
  4 segment registers (16-bit): CS, DS, SS, ES
  8 general-purpose registers (16-bit): AX, BX, CX, DX, SP, BP, SI, DI
Registers in x86-32 (Protected Mode)
  6 segment registers (16-bit): CS, DS, SS, ES, FS, GS
  8 general-purpose registers (32-bit): EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI
22
Registers in x86
Registers in IA-32e (Long mode)
  6 segment registers (16-bit): CS, DS, SS, ES, FS, GS
  16 general-purpose registers (64-bit): RAX, RBX, RCX, RDX, RSP, RBP, RSI, RDI, R8 ~ R15
23
EFLAGS in x86
24
EFLAGS in x86
25
Software Compatibility
Compatibility mode allows system software to implement binary compatibility with existing 16-bit and 32-bit x86 applications.
Real mode is not supported when the processor is operating in long mode, because long mode requires that paged protected mode be enabled.
Virtual 8086 mode is not supported when the processor is operating in long mode.
My thoughts: applications in protected mode can manipulate either 32-bit or 16-bit data. According to Table 1-1, 16-bit code can be written without entering Virtual 8086 mode or real mode.
Source: AMD64 Architecture Programmer’s Manual, Vol. 2: System Programming
26
Segmentation and Paging in Protected Mode
27
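TLB in Processor
Translation Lookaside Buffer (TLB)
  The TLB is there for virtual memory: it sits between the CPU core and main memory and turns virtual addresses into physical addresses
[Figure: the CPU core sends a virtual address to the TLB, the TLB sends the physical address to main memory, and data returns to the core. Die photo: Intel Pentium Processor (1993)]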
28
Real Mode Addressing
In real mode (8086), general-purpose registers are all 16 bits wide
Segment registers specify the base address of each segment
  CS: Code Segment for instructions
  DS: Data Segment for data
  SS: Stack Segment for the stack
  ES: Extra Segment, which can be used to store more data
Addressing method
  (Segment << 4) + offset = physical address
Example (main memory is 1 MB, from 0x0 to 0xFFFFF)
  mov ax, 2000h
  mov ds, ax
  mov al, [100h]
  DS = 2000h, so the segment base is 2000h << 4 = 20000h
  With offset 100h, the physical address is 20100h
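The addressing rule can be checked with a few lines of Python. This is a minimal sketch: the function name is made up for illustration, and the final mask models the 20-bit address bus of the 8086, on which addresses past 1 MB wrap around.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address in real mode: (segment << 4) + offset."""
    if not 0 <= segment <= 0xFFFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("segment and offset must fit in 16 bits")
    # The 8086 has 20 address lines, so the sum wraps at 1 MB.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # Same values as the slide: DS = 2000h, offset = 100h.
    print(hex(real_mode_address(0x2000, 0x100)))
```

With DS = 2000h and offset 100h this gives 20100h, the address shown on the slide.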
29
Protected Mode Addressing
[Figure: a segment register holds a segment selector (Index, TI, RPL) that is visible to software. TI = 0 selects the GDT and TI = 1 selects the LDT in main memory, and Index picks one segment descriptor from that table. The Base, Limit, and Access info of the chosen descriptor are held in the part of the segment register that is invisible to software]
TI: Table Indicator
RPL: Requested Privilege Level
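The selector fields can be pulled apart with shifts and masks: the lowest 2 bits are the RPL, bit 2 is the TI flag, and the upper 13 bits are the index into the descriptor table. The sketch below uses hypothetical names (`Selector`, `decode_selector`) and arbitrary example selector values.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int   # entry number in the descriptor table
    table: str   # "GDT" when TI = 0, "LDT" when TI = 1
    rpl: int     # requested privilege level, 0 to 3


def decode_selector(selector: int) -> Selector:
    """Split a 16-bit segment selector into Index, TI, and RPL."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    index = selector >> 3
    ti = (selector >> 2) & 1
    rpl = selector & 0b11
    return Selector(index, "LDT" if ti else "GDT", rpl)


if __name__ == "__main__":
    print(decode_selector(0x23))
    print(decode_selector(0x0F))
```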
30
Segment Descriptor Format
Software (OS) creates the descriptor tables (GDT, LDT)
[Figure: segment descriptor layout when S == 1 (code or data segment)]
31
Address Translation in Protected Mode
32
Segmentation in Linux (Protected Mode)
All Linux processes running in User mode or Kernel mode use the same pair of segments to address instructions and data
  CS, DS bases: 0x0
  Limit: 0xfffff (4 GB, with the limit counted in 4 KB units)
Thus, the logical address is the same as the linear address
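A quick check of why a 20-bit limit of 0xfffff covers 4 GB, and why a zero base makes the logical offset equal to the linear address. This is a sketch with hypothetical function names; it assumes the descriptor's granularity flag is set, so the limit is counted in 4 KB units.

```python
def segment_size_bytes(limit: int, granularity_4k: bool) -> int:
    """Size of a segment given its 20-bit limit field."""
    unit = 4096 if granularity_4k else 1
    return (limit + 1) * unit


def logical_to_linear(base: int, offset: int) -> int:
    """Linear address = segment base + offset (32-bit wraparound)."""
    return (base + offset) & 0xFFFFFFFF


if __name__ == "__main__":
    print(segment_size_bytes(0xFFFFF, granularity_4k=True) == 4 * 1024 ** 3)
    # With base 0x0, the offset passes through unchanged.
    print(hex(logical_to_linear(0x0, 0x08048000)))
```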
33
Paging
34
Page Translation in Protected Mode (4K Page, Non-PAE)
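With 4 KB pages and no PAE, a 32-bit linear address splits into a 10-bit page-directory index, a 10-bit page-table index, and a 12-bit byte offset. The sketch below only performs the split; the function name and the sample address are chosen for illustration.

```python
def split_linear_4k(linear: int) -> tuple[int, int, int]:
    """Return (directory index, table index, offset) for a 32-bit address."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("linear address must fit in 32 bits")
    directory = (linear >> 22) & 0x3FF   # bits 31..22
    table = (linear >> 12) & 0x3FF       # bits 21..12
    offset = linear & 0xFFF              # bits 11..0
    return directory, table, offset


if __name__ == "__main__":
    d, t, o = split_linear_4k(0x08048123)
    print(d, t, hex(o))
```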
35
Page Translation in Protected Mode (4KB, PAE)
32-bit linear address
52-bit physical address
PAE: Physical Address Extension
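With PAE and 4 KB pages, the 32-bit linear address is split 2 + 9 + 9 + 12: a 2-bit index into the page-directory-pointer table, 9-bit indexes into the page directory and page table, and a 12-bit offset. A minimal sketch, with a hypothetical function name:

```python
def split_linear_pae(linear: int) -> tuple[int, int, int, int]:
    """Return (PDPT index, directory index, table index, offset)."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("linear address must fit in 32 bits")
    pdpt = (linear >> 30) & 0x3          # bits 31..30
    directory = (linear >> 21) & 0x1FF   # bits 29..21
    table = (linear >> 12) & 0x1FF       # bits 20..12
    offset = linear & 0xFFF              # bits 11..0
    return pdpt, directory, table, offset


if __name__ == "__main__":
    print(split_linear_pae(0x08048123))
```

The indexes shrink from 10 to 9 bits because PAE table entries are 8 bytes instead of 4, so a 4 KB table holds 512 entries instead of 1,024.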
36
Address Translation in 64-bit Mode
In 64-bit mode, system descriptors in the GDT and LDT (e.g., LDT and TSS descriptors) are 16 bytes wide; code and data segment descriptors remain 8 bytes wide
Segmentation is largely disabled in 64-bit mode
  Switching a logical processor into 64-bit mode causes it to enforce the flat memory model by largely disabling the segmented memory logic
  However, whenever the 64-bit OS kernel causes the logical processor to jump to a 16- or 32-bit legacy code segment, the segmentation logic is immediately re-enabled to maintain backward compatibility
Reference: x86 Instruction Set Architecture, Tom Shanley, MindShare, 2009
37
Code Segment Descriptor
Segmentation is disabled in 64-bit mode
Compatibility mode is enabled by the operating system on a per-code-segment basis
L (Long) bit in the code segment descriptor
  1: 64-bit mode
  0: Compatibility mode
Source: AMD64 Architecture Programmer’s Manual, Vol. 2: System Programming
38
Page Translation in 64-bit Mode
48-bit linear address
52-bit physical address
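With 4 KB pages, a 48-bit linear address is split 9 + 9 + 9 + 9 + 12 across four table levels (PML4, page-directory-pointer table, page directory, page table) plus the byte offset. The sketch below only extracts the indexes; the function name is an assumption made for this example.

```python
def split_linear_64(linear: int) -> tuple[int, int, int, int, int]:
    """Return (PML4, PDPT, directory, table, offset) for a 48-bit address."""
    if not 0 <= linear < (1 << 48):
        raise ValueError("expected the low 48 bits of a linear address")
    pml4 = (linear >> 39) & 0x1FF        # bits 47..39
    pdpt = (linear >> 30) & 0x1FF        # bits 38..30
    directory = (linear >> 21) & 0x1FF   # bits 29..21
    table = (linear >> 12) & 0x1FF       # bits 20..12
    offset = linear & 0xFFF              # bits 11..0
    return pml4, pdpt, directory, table, offset


if __name__ == "__main__":
    print(split_linear_64(0x00007FFFFFFFF000))
```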
39
Linear Space Segmentation
A compiled program’s memory is divided into 5 segments:
Text segment (code segment)
  Where the program (assembled machine instructions) is located
Data and bss segments
  Data segment is filled with the initialized global and static variables
  bss (Block Started by Symbol) is filled with the uninitialized global and static variables
Heap segment
  For dynamic allocation and deallocation of memory using malloc() and free()
Stack segment
  Scratchpad for storing local variables and context during function calls
40
Stack Frame
EBP (aka Frame Pointer (FP) or Local Base (LB) pointer) is used to reference function parameters and local variables in the current stack frame
Each stack frame contains
  Parameters to the function
  Local variables
  2 pointers: Saved Frame Pointer (SFP) and return address
    SFP for restoring EBP to its previous value
    Return address for restoring EIP to its previous value
41
Stack Layout with x86 Source: Reversing, Secrets of Reverse Engineering, Eldad Eilam, 2005
42
Stack Frame Example
43
Stack Frame Example
[Figure: stack memory from high address to low address. The caller runs push ebp and mov ebp, esp, reserves space with sub esp, 0x20, and stores the values 1 to 6; the call instruction then pushes eip. ESP moves toward the low address at each step]
Compilation outcome could be different depending on compiler version and optimization flags
44
Stack Frame Example: LEAVE instruction
[Figure: stack memory from high address to low address. The caller's frame (saved ebp, the values 1 to 6 at offsets 0x04 to 0x14, and the eip pushed by call) sits above the callee's frame, which is set up by push ebp, mov ebp, esp, and sub esp, 0x10 and holds the result (a+b+c+d) and the characters A to J at offsets 0x04 to 0x10]
Compilation outcome could be different depending on compiler version and optimization flags
45
Stack Frame Example: after RET instruction
[Figure: stack memory from high address to low address after RET. The caller's frame holds the values 1 to 6 and the result (a+b+c+d); the old eip, ebp, result, and characters A to J of the callee's frame are still shown below it]
Compilation outcome could be different depending on compiler version and optimization flags
46
Backup Slides
47
Segment Selector
48
Floating Point Formats
49
Debugging Tools
GDB, the GNU Project Debugger
DDD, the Data Display Debugger
  GUI front end to GDB
Eclipse Integrated Development Environment (IDE)
  Eclipse CDT (C/C++ Development Toolkit)
  “Install New Software”
    Name: Galileo
    URL:
IDA Pro, the Interactive Disassembler Professional
  Audits binaries with no source code
  Supports more than 50 families of processors
  IDA 5.0 is free for non-commercial use
50
Just in case…
Compile your code with gcc
  gcc -g float-d.c -o float-d    // compiled with debugging info
Disassemble the binary with objdump
  objdump -M intel -Stx float-d > float-d.dump
51
GDB
$ echo "set disassembly-flavor intel" > ~/.gdbinit
  Shows disassembly in Intel format (rather than AT&T format)
  Operation <destination>, <source>
    mov ebp, esp ; ebp <- esp
GDB command summary
  (gdb) help
  (gdb) help disass
  (gdb) list
  (gdb) list 1,20
  (gdb) disass main
  (gdb) disass /mr main
  (gdb) info registers (or i r) ; display x86 registers
    Examples:
    (gdb) i r
    (gdb) i r $eip
  (gdb) x ; examine memory
    (gdb) x/10i $eip ; display 10 instructions starting at eip
    (gdb) x/2x $eip ; display 2 words (4 bytes each) in hex. Sizes: b (byte), h (halfword), w (word, 4 B), g (giant, 8 B)
  (gdb) stepi ; execute 1 machine instruction. Will step into subfunctions
  (gdb) nexti ; execute 1 machine instruction. Will not enter subfunctions
  (gdb) step ; step program until it reaches a different source line. Will step into subfunctions
  (gdb) next ; step program, proceeding through subfunction calls
52
x86 Instructions
CALL: call procedure
  In a 32-bit near call, push the EIP of the instruction following the CALL instruction
  Then, branch to the target specified in the operand
LEAVE: high-level procedure exit
  Release the stack frame set up by an earlier ENTER instruction
  In 32-bit mode, ESP ← EBP; EBP ← pop();
RET: return from procedure
  In a 32-bit near return, EIP ← pop();
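The effect of CALL, LEAVE, and RET on ESP, EBP, and EIP can be traced with a toy model. This is only a sketch of the register-transfer descriptions above, not an emulator: the `Cpu` class, the addresses, and the 0x10-byte local area are all hypothetical.

```python
class Cpu:
    """Toy 32-bit machine with just ESP, EBP, EIP, and a sparse stack."""

    def __init__(self, esp: int, ebp: int, eip: int = 0) -> None:
        self.esp, self.ebp, self.eip = esp, ebp, eip
        self.mem: dict[int, int] = {}

    def push(self, value: int) -> None:
        self.esp -= 4                 # the stack grows toward low addresses
        self.mem[self.esp] = value

    def pop(self) -> int:
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target: int, next_eip: int) -> None:
        self.push(next_eip)           # return address
        self.eip = target

    def leave(self) -> None:
        self.esp = self.ebp           # ESP <- EBP
        self.ebp = self.pop()         # EBP <- pop()

    def ret(self) -> None:
        self.eip = self.pop()         # EIP <- pop()


def run_example() -> Cpu:
    cpu = Cpu(esp=0x1000, ebp=0x2000)
    cpu.call(target=0x400, next_eip=0x105)
    cpu.push(cpu.ebp)                 # push ebp
    cpu.ebp = cpu.esp                 # mov ebp, esp
    cpu.esp -= 0x10                   # sub esp, 0x10 (space for locals)
    cpu.leave()
    cpu.ret()
    return cpu


if __name__ == "__main__":
    cpu = run_example()
    print(hex(cpu.esp), hex(cpu.ebp), hex(cpu.eip))
```

After the call and return, ESP and EBP are back at their starting values and EIP holds the address of the instruction after the CALL.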
53
x86 Instructions
PUSH: push word, double-word, or quadword onto the stack
  Decrement the stack pointer and then store the source operand on the top of the stack
POP: pop a value from the stack
  Load the value from the top of the stack and increment the stack pointer
LEA: load effective address
  For instance, LEA ecx, dword ptr [edx+edx]
    ECX ← EDX + EDX;
  Note that even though most disassemblers add the words DWORD PTR before the operands, LEA really can’t distinguish between a pointer and an integer. LEA never performs any actual memory accesses.
From the book “Reversing” (p512): Starting with Pentium 4, the situation has reversed and most compilers will use ADD and SUB when generating code. However, when surrounded by several other ADD or SUB instructions, the Intel compiler still seems to use LEA. This is probably because the execution unit employed by LEA is separate from the ones used by ADD and SUB. Using LEA makes sense when the main ALUs are busy, as it improves the chances of achieving parallelism at runtime.
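LEA is pure address arithmetic: it evaluates base + index × scale + displacement and writes the result to the destination register without touching memory. The helper below is a hypothetical model of that computation for 32-bit operands, not part of any real tool.

```python
def lea32(base: int = 0, index: int = 0, scale: int = 1,
          displacement: int = 0) -> int:
    """Value LEA would load: base + index*scale + displacement, mod 2^32."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factors are 1, 2, 4, or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


if __name__ == "__main__":
    edx = 21
    # lea ecx, [edx+edx]  ->  ecx = edx + edx
    print(lea32(base=edx, index=edx))
    # lea eax, [edx+edx*4+3]  ->  eax = edx * 5 + 3, in one instruction
    print(lea32(base=edx, index=edx, scale=4, displacement=3))
```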
54
x86 Instructions
TEST
  Compute the bit-wise logical AND of the first and second operands
  Set flags (SF, ZF, and PF) according to the result
  Then, discard the result
  Example: test eax, eax
55
Buffer Overflow Protection
gcc -fstack-protector-all
gcc -fno-stack-protector
objdump -SD --disassembler-options=intel stack_example