1
A Crash Course on x86 Disassembly
Chapter 4
2
HW3 Due Date Postponed More practice in HW3
HW3 Due Date: Sept. 24th, Next Sunday, 23:59:59 EST
3
Equifax Data Breach 150M user data is stolen (including my SSN too)
Hackers use Apache Struts exploit (not zero-day exploit – known threats but not fixed by Equifax) Think: SSN – Address – Phone. etc -> Third Party = Credit ? Safe ? Data leak happens in May, reported just now Management of Equifax has been selling shares during the past few months (insider selling) – moral hazard
4
Apple iPhone X Face ID This week Apple’s release of iPhone X FaceID
3D face recognitions TrueDepth camera/sensing/dot projector – infrared sensors – 30,000 dots for 3D facial construction 3D graphics processing - Fast Processor On-device Deep Learning Verification done in 1 second A11 Bionic Chip (ASICs) – similar to Google’s TPU (Tensor Processing Unit) – Huge # of FP; Energy-conservation Preserve user’s privacy – no data is uploaded to the cloud Norman P. Jouppi, et. al, In-Datacenter Performance Analysis of a Tensor Processing Unit, ISCA 2017
5
Attack FaceID Data Training: Less labeled images -> generate more images Expect attacks from all the famous hackers/security researchers worldwide 3D reconstruction attack Black-box design Gather user data from side-channels (different facet of faces/video recordings) Use 3D printing to reconstruct a facial model of the user to spoof FaceID Learning from Simulated and Unsupervised Images through Adversarial Training, CVPR 2017, Best Paper Award
6
Levels of Abstraction
Computer systems are built from several levels of abstraction, each hiding the implementation details of the level below.
Six levels of abstraction:
Hardware: digital logic gates (AND, OR, XOR, NOT)
Microcode: firmware that interfaces with the hardware; specific to the hardware it was written for
Machine code: opcodes, hex digits (produced by compiling); tells the processor what to do
Low-level languages: human-readable version of the instruction set (assembly)
High-level languages: C/C++, compiled into machine code
Interpreted languages: C#, Java, translated into bytecode, which executes within an interpreter that translates it into machine code
7
How software works
The gcc compiler driver pre-processes, compiles, assembles and links to generate an executable
Links together object code (e.g. game.o) and static libraries (e.g. libc.a) to form the final executable
Links in references to dynamic libraries for code loaded at load time (e.g. libc.so.1)
The executable may still load additional dynamic libraries at run time
hello.c (program source) -> Pre-processor -> hello.i (modified source) -> Compiler -> hello.s (assembly code) -> Assembler -> hello.o (object code) -> Linker -> hello (executable code)
8
PC Architectures Focus: x86 32-bit, Intel IA-32
Other architectures: x64, MIPS, ARM
CPU: executes the code
RAM: stores data and code
I/O: interfaces with hardware (keyboard, monitor, mouse)
Control unit: fetches instructions to execute from RAM, using a register (the instruction pointer) to hold the address of the next instruction
Register: the CPU's basic data storage unit
ALU: executes an instruction fetched from RAM and places the result in a register or in memory
9
Main Memory
Data: contains values that are put in place when a program is initially loaded (static/global values).
Code: instructions fetched by the CPU to execute the program.
Heap: dynamic memory; used to allocate new values and free values that are no longer needed.
Stack: local variables and parameters.
10
Distinguish Data from Code
Can we distinguish data from code ? Incorrect specification will lead to errors, and the program is most likely to crash. *Sometimes, IDAPro may have difficulties as well (described later in the course).
11
Common Data Types
Byte: 8 bits. Examples: AL, BL, CL
Word: 16 bits. Examples: AX, BX, CX
Double word: 32 bits. Examples: EAX, EBX, ECX
Quad word: 64 bits. 32-bit x86 has no 64-bit general-purpose registers, so a 64-bit value usually spans two registers (e.g. EDX:EAX)
12
Instruction
Move 0x42 into register ECX: mov ecx, 0x42
Opcode 0xB9 -> mov ecx, followed by the 32-bit immediate value
Memory address: denoted by value, register or equation between brackets, e.g. [eax] 0x > immediate number (like constant) [0x e] -> immediate hard-coded address
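As a rough illustration of how the instruction above turns into machine code, the sketch below builds the bytes for `mov ecx, 0x42` by hand. The helper name `encode_mov_ecx_imm32` is made up for this example; it assumes the standard IA-32 encoding of opcode 0xB9 followed by a 4-byte little-endian immediate.

```python
import struct


def encode_mov_ecx_imm32(value: int) -> bytes:
    """Encode `mov ecx, imm32` (hypothetical helper, not a real assembler).

    Opcode 0xB9 selects "mov ecx, imm32"; the immediate follows as
    four bytes in little-endian order.
    """
    return bytes([0xB9]) + struct.pack("<I", value & 0xFFFFFFFF)


encoded = encode_mov_ecx_imm32(0x42)
print(encoded.hex(" "))  # b9 42 00 00 00
```

The little-endian byte order is why the immediate 0x42 shows up as 42 00 00 00 rather than 00 00 00 42 in a hex dump.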
13
Registers
Small amount of data storage available to the CPU
Accessed more quickly than storage elsewhere
EAX/EBX/ECX/EDX: general-purpose 32-bit registers
AX: references the lower 16 bits of EAX
AL: the lower 8 bits of AX
AH: the higher 8 bits of AX
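The relationship between EAX, AX, AH and AL can be sketched with bit masks. This is only a model in Python, not how the CPU is programmed; the function name `sub_registers` and the sample value are chosen for illustration.

```python
def sub_registers(eax: int) -> dict:
    """Split a 32-bit EAX value into the views AX, AH and AL."""
    eax &= 0xFFFFFFFF          # keep only 32 bits
    ax = eax & 0xFFFF          # lower 16 bits of EAX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # higher 8 bits of AX
        "AL": ax & 0xFF,         # lower 8 bits of AX
    }


for name, value in sub_registers(0xA9DC81F5).items():
    print(f"{name} = {value:#x}")
```

Writing to AL or AX on a real processor changes only those bits of EAX; the upper bits keep their previous value.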
14
Registers
Use of registers follows certain conventions
EAX and EDX are used for multiplication and division: mul multiplies the unsigned operand by EAX and stores the 64-bit result in EDX:EAX
EDX:EAX means that the low (least significant) 32 bits are stored in EAX and the high (most significant) 32 bits are stored in EDX
EAX generally contains the return value of a function call
Knowing these conventions lets a malware analyst examine code quickly
15
Flags
EFLAGS register: status register (32 bits, each bit is a flag).
Some important flags:
ZF: zero flag, set if the result of an operation is zero
CF: carry flag, set if the result of an operation is too large or too small for the destination operand
SF: sign flag, set if the result of an operation is negative
TF: trap flag, used for debugging (if set, the x86 processor executes one instruction at a time)
16
Extended Instruction Pointer (EIP)
EIP: a register that contains the memory address of the next instruction to be executed
Tells the processor what to do next
If EIP is corrupted and points to a memory address that does not hold legitimate code, the program will most likely crash
Attackers gain control of EIP through exploitation: they place attack code in memory, then change EIP to point to that code
Example: buffer overflow attacks
17
Instructions
mov: mov destination, source - moves data into registers or RAM
lea: load effective address - puts a memory address into the destination
lea eax, [ebx+8] -> puts the value EBX+8 into EAX
mov eax, [ebx+8] -> loads the data at the memory address EBX+8 into EAX
lea eax, [ebx+8] has the effect of the pseudo-instruction "mov eax, ebx+8", which is not a valid x86 instruction
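The difference between lea and mov with a bracketed operand can be modeled in a few lines. The sketch below assumes a toy memory (a Python dict with one made-up entry) and hypothetical helper names `lea` and `mov_load`; it is not an emulator.

```python
# Toy memory: address -> 32-bit value (contents are made up for the example)
memory = {0x1008: 0xDEADBEEF}
ebx = 0x1000


def lea(base: int, offset: int) -> int:
    """lea dest, [base+offset]: compute the address, do not touch memory."""
    return (base + offset) & 0xFFFFFFFF


def mov_load(mem: dict, base: int, offset: int) -> int:
    """mov dest, [base+offset]: read the value stored at that address."""
    return mem[(base + offset) & 0xFFFFFFFF]


print(hex(lea(ebx, 8)))               # 0x1008 (the address itself)
print(hex(mov_load(memory, ebx, 8)))  # 0xdeadbeef (the data at that address)
```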
18
Arithmetic
Addition: add destination, value
Subtraction: sub destination, value (ZF is set if the result is zero; CF is set if destination < value)
inc/dec: increment or decrement a register by one
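How sub sets ZF and CF can be sketched for 32-bit operands as follows. The function name `sub32` is made up for this example, and only the two flags mentioned on the slide are modeled.

```python
MASK32 = 0xFFFFFFFF


def sub32(dest: int, value: int):
    """Model `sub dest, value` on 32-bit operands.

    Returns (result, ZF, CF): ZF is set when the result is zero,
    CF when an unsigned borrow occurs (dest < value).
    """
    dest &= MASK32
    value &= MASK32
    result = (dest - value) & MASK32  # wrap around like a 32-bit register
    zf = result == 0
    cf = dest < value
    return result, zf, cf


print(sub32(5, 5))  # (0, True, False)
print(sub32(3, 5))  # (4294967294, False, True)
```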
19
Multiplication and division: act on predefined registers
mul value: multiplies EAX by value. The result is stored as a 64-bit value across EDX and EAX: EDX holds the most significant 32 bits, EAX the least significant 32 bits
div value: divides the 64-bit value held across EDX and EAX by value. The quotient is stored in EAX, the remainder in EDX
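A minimal sketch of the EDX:EAX convention for unsigned mul and div, using made-up helper names `mul32` and `div32`. On real hardware a zero divisor or a quotient that does not fit in 32 bits raises a divide error; the sketch models that with Python exceptions.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax: int, value: int):
    """Model `mul value`: returns (EDX, EAX) holding the 64-bit product."""
    product = (eax & MASK32) * (value & MASK32)
    return (product >> 32) & MASK32, product & MASK32


def div32(edx: int, eax: int, value: int):
    """Model `div value`: divides EDX:EAX, returns (EAX=quotient, EDX=remainder)."""
    value &= MASK32
    if value == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, value)
    if quotient > MASK32:
        raise OverflowError("divide error: quotient does not fit in EAX")
    return quotient, remainder


edx, eax = mul32(0xFFFFFFFF, 2)
print(hex(edx), hex(eax))      # 0x1 0xfffffffe
print(div32(edx, eax, 2))      # (4294967295, 0)
```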
20
Logical Operations
or, and, xor: bitwise logic on the destination operand
xor eax, eax -> sets EAX to zero (an optimization for clearing a register): xor eax, eax encodes as 33 C0 (2 bytes), whereas mov eax, imm32 (opcode B8 plus a 4-byte immediate) takes 5 bytes
shr/shl: shift a register right/left: shr destination, count. Bits shifted beyond the boundary are first shifted into CF
ror/rol: rotate; no bits fall off, they wrap around to the other side
Shifting is an optimization of multiplication: each shift left multiplies by two; n bits -> ?
Examples:
xor eax, eax : clears the EAX register
or eax, 0x7575 : performs the logical or operation on EAX with 0x7575
mov eax, 0xA / shl eax, 2 : shifts the EAX register to the left 2 bits; these two instructions result in EAX = 0x28, because 1010 (0xA in binary) shifted 2 bits left is 101000 (0x28)
mov bl, 0xA / ror bl, 2 : rotates the BL register to the right 2 bits; these two instructions result in BL = 10000010, because 1010 rotated 2 bits right is 10000010
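The shift and rotate examples can be checked with a small model. The helpers `shl` and `ror` below are illustrative names; they take the register width as a parameter and ignore the flags and the count masking that a real processor applies.

```python
def shl(value: int, count: int, bits: int = 32) -> int:
    """Shift left inside a register of `bits` width; overflowing bits are dropped."""
    mask = (1 << bits) - 1
    return (value << count) & mask


def ror(value: int, count: int, bits: int = 8) -> int:
    """Rotate right: bits leaving the low end re-enter at the high end."""
    mask = (1 << bits) - 1
    value &= mask
    count %= bits
    return ((value >> count) | (value << (bits - count))) & mask


print(hex(shl(0xA, 2)))               # 0x28
print(format(ror(0xA, 2, 8), "08b"))  # 10000010
```

Shifting left by n positions multiplies an unsigned value by 2 to the power n, as long as no set bits are shifted out of the register.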
21
Stack
Last In First Out (LIFO)
What is stored in a stack?
Memory for functions, local variables and flow control (return addresses)
Short-term storage
ESP and EBP registers
ESP -> stack pointer (memory address of the top of the stack)
EBP -> base pointer (stays constant within a given function; used to keep track of local variables and parameters)
Addresses "grow" from high to low: each push moves ESP to a lower address
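To make the "grows from high to low" behavior concrete, here is a toy model of push and pop with 4-byte slots. The class name `Stack` and the starting address 0x1000 are assumptions made for the example.

```python
class Stack:
    """Toy 32-bit stack: ESP decreases on push and increases on pop."""

    def __init__(self, top: int = 0x1000):
        self.esp = top      # address of the current top of the stack
        self.memory = {}    # address -> 32-bit value

    def push(self, value: int) -> None:
        self.esp -= 4                       # stack grows toward lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self) -> int:
        value = self.memory.pop(self.esp)   # read the value at the top
        self.esp += 4                       # release the slot
        return value


stack = Stack()
stack.push(1)
stack.push(2)
print(hex(stack.esp))  # 0xff8
print(stack.pop())     # 2 (last in, first out)
```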
22
Function Calls
Prologue: prepares the stack and registers
Epilogue: restores the stack and registers
1. Arguments are pushed onto the stack
2. The function is called using call memory_location: the return address (the current contents of the EIP register) is pushed onto the stack, then EIP is set to memory_location (the start of the function)
3. Prologue: EBP is pushed onto the stack and space is allocated for local variables
4. Finish (epilogue): ESP is adjusted to free the local variables; EBP is restored; ret pops the return address off the stack into EIP, so execution continues with the caller's next instruction
pusha: pushes the 16-bit registers in order: AX, CX, DX, BX, SP, BP, SI, DI
pushad: pushes the 32-bit registers in order: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI
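The four steps above can be traced with a small register model. Everything here is illustrative: the class `Cpu`, the addresses, and the 0x28 bytes of locals are assumptions, and the epilogue models the common `leave` / `ret` pair with the caller cleaning up its own arguments.

```python
class Cpu:
    """Toy model of ESP, EBP and EIP during a function call."""

    def __init__(self):
        self.esp = 0x2000
        self.ebp = 0x2100   # pretend this is the caller's frame base
        self.eip = 0
        self.mem = {}

    def push(self, value: int) -> None:
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self) -> int:
        value = self.mem.pop(self.esp)
        self.esp += 4
        return value

    def call(self, target: int, return_address: int) -> None:
        self.push(return_address)   # step 2: save where to come back to
        self.eip = target           # continue at the start of the function

    def prologue(self, local_bytes: int) -> None:
        self.push(self.ebp)         # step 3: save the caller's EBP
        self.ebp = self.esp         # new frame base
        self.esp -= local_bytes     # reserve space for local variables

    def epilogue(self) -> None:
        self.esp = self.ebp         # step 4: free the local variables
        self.ebp = self.pop()       # restore the caller's EBP
        self.eip = self.pop()       # ret: pop the return address into EIP


cpu = Cpu()
cpu.push(42)                        # step 1: push one argument
cpu.call(0x401000, 0x400105)
cpu.prologue(0x28)
cpu.epilogue()
print(hex(cpu.eip), hex(cpu.ebp), hex(cpu.esp))  # 0x400105 0x2100 0x1ffc
```

After the return, ESP still points at the pushed argument; in this sketch the caller would be responsible for removing it.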
23
Example (function call)
Push the base pointer (push ebp)
Copy the stack pointer into the base pointer (mov ebp, esp)
Reduce the stack pointer by 28h (Why?)
Perform the add, call printf
leave, then ret
24
Conditionals
test: identical to and, except that the operands are not modified; only the flags are set
A test of a register against itself checks for NULL values: test eax, eax -> compares EAX to zero, like cmp eax, 0, but requires fewer bytes and CPU cycles
cmp: identical to sub, except that the operands are not modified; only the flags are set
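A small sketch of what test and cmp compute: both discard the result and keep only the flags. The function names `flags_after_test` and `flags_after_cmp` are invented for the example, and only ZF, CF and SF are modeled.

```python
MASK32 = 0xFFFFFFFF
SIGN_BIT = 0x80000000


def flags_after_test(a: int, b: int) -> dict:
    """Model `test a, b`: a bitwise AND whose result is thrown away."""
    result = a & b & MASK32
    return {"ZF": result == 0, "SF": bool(result & SIGN_BIT)}


def flags_after_cmp(a: int, b: int) -> dict:
    """Model `cmp a, b`: a subtraction whose result is thrown away."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {"ZF": result == 0, "CF": a < b, "SF": bool(result & SIGN_BIT)}


print(flags_after_test(0, 0))  # ZF is True: the register holds zero (NULL)
print(flags_after_cmp(2, 3))   # CF is True: 2 < 3 as unsigned values
```

A conditional jump placed right after either instruction reads these flags to decide whether to branch.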
25
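Branching
Jump instructions: change which instruction is executed next. jmp location is unconditional: the next instruction executed is the one at location. Conditional jumps (e.g. jz, jnz) are taken or not depending on the flags.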
26
Rep Instructions
Manipulate data buffers, usually in the form of an array of bytes (also single or double words).
movsx, cmpsx, stosx, scasx, where x = b, w, d (byte, word, double word)
(Here movsx stands for the movsb/movsw/movsd family, not the sign-extending movsx instruction.)
movsb -> moves a single byte from the address in ESI to the address in EDI; the rep prefix is commonly used with movsb to copy a sequence of bytes, with the size defined by ECX.
Read about cmpsx, stosx, scasx in the book.
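The effect of rep movsb can be sketched as a loop over a flat byte buffer. This model assumes the direction flag is clear (ESI and EDI count upward) and uses a made-up function name `rep_movsb`; addresses are simply indexes into a bytearray.

```python
def rep_movsb(memory: bytearray, esi: int, edi: int, ecx: int):
    """Copy ECX bytes from [ESI] to [EDI], one byte per iteration.

    Returns the final (ESI, EDI, ECX), mirroring how the registers
    are left after the instruction finishes.
    """
    while ecx != 0:
        memory[edi] = memory[esi]  # movsb: move one byte
        esi += 1                   # direction flag assumed clear
        edi += 1
        ecx -= 1                   # rep: repeat until ECX reaches zero
    return esi, edi, ecx


buffer = bytearray(b"HELLO" + bytes(5))
print(rep_movsb(buffer, 0, 5, 5))  # (5, 10, 0)
print(buffer)                      # bytearray(b'HELLOHELLO')
```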
27
IDAPro
28
Download IDAPro – Free
29
Work can be saved in idb (ida pro database)
30
Reset Desktop
31
Navigate – Use Search -> Text
IDA auto comments – very useful
32
Navigate (Links) Double Click here
It will redirect you to the sub links Could be functions Navigate through history
33
Navigation Band
Colors represent different address spaces of the binary
See the default colors in Options->Color->Navigation Band
See other colors in the options
Dark blue -> user-written code
Light blue -> library code
We should perform our analysis in user-written code -> the dark blue area
34
Jump to Location
Press the “G” key to jump to a virtual memory address or named location. Try the previous one, sub_401790: it brings us back here
35
Using Cross-references
Cross-reference: tells where a function is called or where a string is used. Press “X”: a window lists all locations from which loc_4012D1 is referenced.
36
Analyzing Functions
IDAPro recognizes functions, local variables and parameters and labels them.
IDA discovers local variables for you
Local variables: prefix var_
Parameters: prefix arg_
Dummy names: renaming them makes more sense during analysis, but you need to understand the code first.
37
Graph Views Play with one of these buttons to see graph views
Cross-reference Graph Can change to decimal, octal, binary – right click
38
Redefining Code and Data
Bytes are occasionally categorized incorrectly: code may be defined as data, or data defined as code
Press the ‘U’ key to undefine functions, code or data -> they become raw bytes; press ‘C’ to define raw bytes as code
39
In-class homeworks