Analyzing Memory Accesses in Obfuscated x86 Executables Michael Venable Mohamed R. Choucane Md. Enamul Karim Arun Lakhotia (Presenter) DIMVA 2005 Wien.

Slides:



Advertisements
Similar presentations
Practical Malware Analysis
Advertisements

Fabián E. Bustamante, Spring 2007 Machine-Level Programming II: Control Flow Today Condition codes Control flow structures Next time Procedures.
ByteWeight: Learning to Recognize Functions in Binary Code
© 2006 Nathan RosenblumMarch 2006Unconventional Code Constructs The New Dyninst Code Parser: Binary Code Isn't as Simple as it Used to Be Nathan Rosenblum.
Assembly Language for x86 Processors 6th Edition Chapter 5: Procedures (c) Pearson Education, All rights reserved. You may modify and copy this slide.
Native x86 Decompilation Using Semantics-Preserving Structural Analysis and Iterative Control-Flow Structuring Edward J. Schwartz *, JongHyup Lee ✝, Maverick.
C Programming and Assembly Language Janakiraman V – NITK Surathkal 2 nd August 2014.
Assembly Code Verification Using Model Checking Hao XIAO Singapore University of Technology and Design.
University of Washington Last Time For loops  for loop → while loop → do-while loop → goto version  for loop → while loop → goto “jump to middle” version.
Machine-Level Programming III: Procedures Apr. 17, 2006 Topics IA32 stack discipline Register saving conventions Creating pointers to local variables CS213.
1 ICS 51 Introductory Computer Organization Fall 2006 updated: Oct. 2, 2006.
1 Function Calls Professor Jennifer Rexford COS 217 Reading: Chapter 4 of “Programming From the Ground Up” (available online from the course Web site)
Accessing parameters from the stack and calling functions.
– 1 – , F’02 ICS05 Instructor: Peter A. Dinda TA: Bin Lin Recitation 4.
Assembly תרגול 8 פונקציות והתקפת buffer.. Procedures (Functions) A procedure call involves passing both data and control from one part of the code to.
Stack Activation Records Topics IA32 stack discipline Register saving conventions Creating pointers to local variables February 6, 2003 CSCE 212H Computer.
Stacks and Frames Demystified CSCI 3753 Operating Systems Spring 2005 Prof. Rick Han.
David Evans CS201j: Engineering Software University of Virginia Computer Science Lecture 18: 0xCAFEBABE (Java Byte Codes)
CEG 320/520: Computer Organization and Assembly Language ProgrammingIntel Assembly 1 Intel IA-32 vs Motorola
Machine-Level Programming 3 Control Flow Topics Control Flow Switch Statements Jump Tables.
Fabián E. Bustamante, Spring 2007 Machine-Level Programming III - Procedures Today IA32 stack discipline Register saving conventions Creating pointers.
EECS 354 Network Security Reverse Engineering. Introduction Preventing Reverse Engineering Reversing High Level Languages Reversing an ELF Executable.
RIVERSIDE RESEARCH INSTITUTE Deobfuscator: An Automated Approach to the Identification and Removal of Code Obfuscation Eric Laspe, Reverse Engineer Jason.
Machine-Level Programming 3 Control Flow Topics Control Flow Switch Statements Jump Tables.
CS216: Program and Data Representation University of Virginia Computer Science Spring 2006 David Evans Lecture 22: Unconventional.
Microprocessors The ia32 User Instruction Set Jan 31st, 2002.
Gogul Balakrishnan, Radu Gruian and Thomas Reps Computer Science Dept., Univ. of Wisconsin GrammaTech, Inc. April, 2005 CodeSurfer / x86 A Platform for.
Assembly Language. Symbol Table Variables.DATA var DW 0 sum DD 0 array TIMES 10 DW 0 message DB ’ Welcome ’,0 char1 DB ? Symbol Table Name Offset var.
Machine-level Programming III: Procedures Topics –IA32 stack discipline –Register saving conventions –Creating pointers to local variables.
Introduction to Assembly II Abed Asi Extended System Programming Laboratory (ESPL) CS BGU Fall 2013/2014.
Buffer Overflow Attack- proofing of Code Binaries Ramya Reguramalingam Gopal Gupta Gopal Gupta Department of Computer Science University of Texas at Dallas.
October 1, 2003Serguei A. Mokhov, 1 SOEN228, Winter 2003 Revision 1.2 Date: October 25, 2003.
University of Amsterdam Computer Systems – the instruction set architecture Arnoud Visser 1 Computer Systems The instruction set architecture.
Introduction to Assembly II Abed Asi Extended System Programming Laboratory (ESPL) CS BGU Fall 2014/2015.
1 Assembly Language: Function Calls Jennifer Rexford.
Calling Procedures C calling conventions. Outline Procedures Procedure call mechanism Passing parameters Local variable storage C-Style procedures Recursion.
Overview of Back-end for CComp Zhaopeng Li Software Security Lab. June 8, 2009.
Analyzing Memory Accesses in x86 Executables Gogul Balakrishnan Thomas Reps University of Wisconsin.
Gogul Balakrishnan Thomas Reps University of Wisconsin Analyzing Memory Accesses in x86 Executables.
Correct RelocationMarch 20, 2016 Correct Relocation: Do You Trust a Mutated Binary? Drew Bernat
Paradyn Project Paradyn / Dyninst Week Madison, Wisconsin April 12-14, 2010 Paradyn Project Safe and Efficient Instrumentation Andrew Bernat.
ICS51 Introductory Computer Organization Accessing parameters from the stack and calling functions.
Chapter 8 String Operations. 8.1 Using String Instructions.
Introduction to Information Security
Recitation 3: Procedures and the Stack
Reading Condition Codes (Cont.)
Conditional Branch Example
Techniques, Tools, and Research Issues
Computer Architecture
Chapter 3 Machine-Level Representation of Programs
Computer Architecture and Assembly Language
Y86 Processor State Program Registers
Ramblr Making Reassembly Great Again
Machine-Level Programming 4 Procedures
Condition Codes Single Bit Registers
C Prog. To Object Code text text binary binary Code in files p1.c p2.c
Procedures – Overview Lecture 19 Mon, Mar 28, 2005.
Assembly Language Programming II: C Compiler Calling Sequences
Machine-Level Programming III: Procedures Sept 18, 2001
EECE.3170 Microprocessor Systems Design I
EECE.3170 Microprocessor Systems Design I
Machine-Level Programming: Introduction
Multi-modules programming
Chapter 3 Machine-Level Representation of Programs
X86 Assembly Review.
CSC 497/583 Advanced Topics in Computer Security
Computer Architecture and System Programming Laboratory
Presentation transcript:

Analyzing Memory Accesses in Obfuscated x86 Executables Michael Venable Mohamed R. Choucane Md. Enamul Karim Arun Lakhotia (Presenter) DIMVA 2005 Wien

Talk Contents Motivation Previous work VSA (Balakrishnan and Reps, 2004) Abstract Stack Graph (Lakhotia and Kumar 2004) VSA+ASG Domain Interpretation example Application Detect Call obfuscations Implementation Future and Conclusions

Motivation Past project: detecting malicious behavior –Match models of malicious behavior models typically involved system calls Typical application of accepted techniques –implemented prototype using IDA Pro –standard application of formal theory (model checking) –tried against common virus (Win32.Evol)

Result: FAILURE Our original tool didn’t stand a chance –our technology wasn’t even tested –virus writer fouled up the entire analysis pipeline Success out of failure: –showed us where a major battle lies –refocus on hardening analysis techniques

Code analysis: benign/adversarial Half a century of program analysis –compilers, optimizers, checkers, refactoring tools –read in programs, analyze, visualize, transform Mostly for friendly programs –yet these are often used to battle malware –not prepared to face adversarial code soft targets fragile infrastructures –would we send Mouseketeers to battlefield?

certify / reject disassemble Typical analysis pipelines extract procedures extract control & data flow verify property VIRUS DATABASE

certify / reject disassemble Problem: Not hardened extract procedures extract control & data flow verify property DATABASE SILENT FAILURE! D I S A B L E D !

Attack: Disassembly decode machine instructions (byte seq) disassemble extract procedures extract control & data flow verify property : 5d pop %ebp : c3 ret : 55 push %ebp : 89 e5 mov %esp,%ebp : 83 ec 08 sub $0x8,%esp 40106b: eb 05 jmp 0x d: e8 ee ff ff ff call 0x : e8 e9 ff ff ff call 0x : c7 45 fc movl $0x0,0xfffffffc(%ebp) 40107e: 81 7d fc e cmpl $0x3e7,0xfffffffc(%ebp) ORIG BYTESASSEMBLY : 5d pop %ebp : c3 ret : 55 push %ebp : 89 e5 mov %esp,%ebp : 83 ec 08 sub $0x8,%esp 40106b: eb 05 jmp 0x d: c7 ee ff ff ff e8 mov $0xe8ffffff,%esi : e9 ff ff ff c7 jmp 0xc : 45 inc %ebp : fc cld malicious func jump over junk bad disassembly (no jump target)

Attack: Extract procedures determine procedure boundaries (entry & exit) disassemble extract procedures extract control & data flow verify property : 5d pop %ebp : c3 ret : : 55 push %ebp : 89 e5 mov %esp,%ebp : 83 ec 08 sub $0x8,%esp 40106b: eb 05 jmp d: e8 ee ff ff ff call : e8 e9 ff ff ff call : c7 45 fc movl $0x0,0xfffffffc(%ebp) malicious func

401063: 5d pop %ebp : c3 ret : : 55 push %ebp : 89 e5 mov %esp,%ebp : 83 ec 08 sub $0x8,%esp 40106b: ff pushl : ff pushl : c3 ret : c7 45 fc movl $0x0,0xfffffffc(%ebp) Attack: Extract CF & DF trace call structure (control flow) disassemble extract procedures extract control & data flow verify property : 5d pop %ebp : c3 ret : : 55 push %ebp : 89 e5 mov %esp,%ebp : 83 ec 08 sub $0x8,%esp 40106b: eb 05 jmp d: e8 ee ff ff ff call : e8 e9 ff ff ff call : c7 45 fc movl $0x0,0xfffffffc(%ebp) L0: call FL0: push L1 L1:  push F L1: ret instr. substitution no call found

40106b: ff pushl : ff pushl : c3 ret : c7 45 fc movl $0x0,0xfffffffc(%ebp) Attack: Verify property verify security or match pattern/signature Transformations destroy signature/pattern match –eg metamorphic viruses: self-transforming –instruction substitution, nop insertion, etc. disassemble extract procedures push x push y  push z ret pop push y ret extract control & data flow verify property

Motivation summary Soft target attacks –disassembler disruption –obfuscated calls –metamorphic transformations Pipeline disruptions –silent failures from every stage

A shift in research focus Question How to make program analysis battle-ready?

Advertisement: Portfolio of results Create malware phylogeny Deobfuscate Calls Reverse self transformations All results are “Patent pending”

Talk Contents Motivation Previous work VSA (Balakrishnan and Reps, 2004) Abstract Stack Graph (Lakhotia and Kumar 2004) VSA+ASG Domain Interpretation example Application Detect Call obfuscations Implementation Future and Conclusions

Value-Set Analysis Determines values of program variables statically RIC: Reduced Interval Congruence –Interval: [0,4] = {0, 1, 2, 3, 4} –RIC: 4[0,4] + 10 = {10, 14, 18, 22, 26} Can ensure memory accesses align to memory boundaries

Example MOV eax, 4 CMP eax, ebx JE Label SUB eax, 2ADD eax, 2 MOV [ADDR], eax eax = 0[0,0]+4 eax = 0[0,0]+6 eax = 0[0,0]+2 eax = 4[0,1]+2

Operations Easily supported operations –ADD –SUB –MOV –others Unsupported Operations –DIV –OR –AND –…

VSA Limitations Assumes executable conforms to standard conventions Depends on control-flow graph We would like to remove these two requirements

Abstract Stack Each node stores an instruction address Example: L1: PUSH 20 L2: PUSH 10 L3: POP ebx L4: PUSH 5 20L110L2 5L4 Actual StackAbstract Stack

TOP Abstract Stack Graph Contains all possible abstract stack Example: L1: PUSH 20 L2: PUSH 10 L3: POP ebx L4: PUSH 5 L1 L2L4 Abstract Stack Graph

Apply set of rules to detect obfuscations Limitations –Does not know register/memory contents –Can’t handle instructions like: sub esp, eax mov esp, eax –Can’t handle indirect jumps/calls

Talk Contents Motivation Previous work VSA (Balakrishnan and Reps, 2004) Abstract Stack Graph (Lakhotia and Kumar 2004) VSA+ASG Domain Interpretation example Application Detect Call obfuscations Implementation Future and Conclusions

Our Goal VSA+ASG Achieve: –Support more instructions (ex. sub esp, eax ) –Support indirect jumps/calls –Not rely on CFG –Do not assume binary obeys common standards

Domain RIC STACK-LOCATION VALUE := 2-tuple –RIC ⊤ –P(STACK-LOCATION) ⊤ STATE := 3-tuple –REGISTER  VALUE –STACK-LOCATION  VALUE –STACK-LOCATION  STACK-LOCATION

Operation: Addition +: VALUE × VALUE × STATE  VALUE Pairwise addition of both components of VALUE +: RIC × RIC  RIC +: STACK-LOCATION × STACK-LOCATION = NULL +: RIC × STACK-LOCATION  P(STACK-LOCATION) RIC+RIC –Equals approximation of the sum of each value in the two RICs Stack-location + Stack-location –Unable to do, since requires knowing actual addresses

Operation: Addition Stack-location + RIC –Travel down the stack N0 N1 N2 N0 N1 N2 esp esp+4 BEFORE AFTER

Operation: Subtraction RIC – RIC, Stack-location – Stack-location –Similar to addition Stack-location – RIC –Travel up the stack or add new nodes N0 N1 N0 N1 N2 esp esp – 4 BEFOREAFTER

Operation: Memory Operations Push –Creates a new stack-location –Stores pushed value at stack-location Pop –Retrieves value at top of stack –Increments top of stack

Operation: Memory Operations Store –Places a value in memory Load –Retrieves a value from memory Other operations defined in paper

Interpretation Example Main –Pushes two values onto stack –Calls Max Max –Places max of parameters in eax VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE N0:  eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N0})

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N1}) N0:  N1: (4, ⊤ )

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N2}) N0:  N1: (4, ⊤ ) N2: (2, ⊤ )

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N3}) N0:  N1: (4, ⊤ ) N2: (2, ⊤ ) N3: (103, ⊤ )

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE eax = (2, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N3}) N0:  N1: (4, ⊤ ) N2: (2, ⊤ ) N3: (103, ⊤ )

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE eax = (2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N0:  N1: (4, ⊤ ) N2: (2, ⊤ ) N3: (103, ⊤ )

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE eax = (2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N0:  N1: (4, ⊤ ) N2: (2, ⊤ ) N3: (103, ⊤ )

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE eax = (2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N0:  N1: (4, ⊤ ) N2: (2, ⊤ ) N3: (103, ⊤ )

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE eax = (4, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N0:  N1: (4, ⊤ ) N2: (2, ⊤ ) N3: (103, ⊤ )

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE eax = (2[0,1]+2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N0}) N0:  N1: (4, ⊤ ) N2: (2, ⊤ ) N3: (103, ⊤ )

Interpretation Example VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 RET 8 STATE eax = (2[0,1]+2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N0}) N0:  N1: (4, ⊤ ) N2: (2, ⊤ ) N3: (103, ⊤ )

Talk Contents Motivation Previous work VSA (Balakrishnan and Reps, 2004) Abstract Stack Graph (Lakhotia and Kumar 2004) VSA+ASG Domain Interpretation example Application Detect Call obfuscations Implementation Future and Conclusions

Application Detect –Call replaced with equivalent instruction(s) –Return address manually removed from stack Flag as obfuscation if –instruction  retn AND topOfStack.creator  call –instruction  pop AND topOfStack.creator  call Rules are applied to each instruction

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N0})

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N1}) N1: (4, ⊤ ) 100

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N2}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (104, ⊤ ) 102

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (104, ⊤ ) 102

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = (2, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (104, ⊤ ) 102

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = (2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (104, ⊤ ) 102

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = (2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (104, ⊤ ) 102

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = (2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (104, ⊤ ) 102

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = (4, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (104, ⊤ ) 102

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = (2[0,1]+2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N0}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (104, ⊤ ) 102

Replace CALL with JMP VAR1 EQU 4 VAR2 EQU 2 Main: 100 PUSH VAR1 101 PUSH VAR2 102 PUSH JMP Max 104 … Max: 105 MOV eax, [esp+4] 106 MOV ebx, [esp+8] 107 CMP eax, ebx 108 JG Done 109 MOV eax, ebx Done: 110 RET 8 STATE N0:  eax = (2[0,1]+2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N0}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (104, ⊤ ) 102 Return address placed on stack by push

Removal of Return Address Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N0})

Removal of Return Address Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N1}) N1: (4, ⊤ ) 100

Removal of Return Address Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N2}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101

Removal of Return Address Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = ( ⊤, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (103, ⊤ ) 102

Removal of Return Address Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = (2, ⊤ ) ebx = ( ⊤, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (103, ⊤ ) 102

Removal of Return Address Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = (2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (103, ⊤ ) 102

Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = (2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (103, ⊤ ) 102 Removal of Return Address

Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = (2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (103, ⊤ ) 102 Removal of Return Address

Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = (4, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N3}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (103, ⊤ ) 102 Removal of Return Address

Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = (2[0,1]+2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N2}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (103, ⊤ ) 102 Removal of Return Address

Main: 100 PUSH PUSH CALL Max 103 … Max: 104 MOV eax, [esp+4] 105 MOV ebx, [esp+8] 106 CMP eax, ebx 107 JG Done 108 MOV eax, ebx Done: 109 POP ebx 110 ADD esp, JMP ebx STATE N0:  eax = (2[0,1]+2, ⊤ ) ebx = (4, ⊤ ) esp = ( ⊤, {N2}) N1: (4, ⊤ ) 100 N2: (2, ⊤ ) 101 N3: (103, ⊤ ) 102 Return address is popped off stack Removal of Return Address

Talk Contents Motivation Previous work VSA (Balakrishnan and Reps, 2004) Abstract Stack Graph (Lakhotia and Kumar 2004) VSA+ASG Domain Interpretation example Application Detect Call obfuscations Implementation Future and Conclusions

OllyDbg EclipsePrototype Implemented on Eclipse platform Disassembler Parser Interpreter Analyzer Binary image Assembly instructions ASTAST annotated with state Set of obfuscations

Obfuscations shown as list Obfuscations shown on ruler Register and stack contents

Pushes return address and function address

Talk Contents Motivation Previous work VSA (Balakrishnan and Reps, 2004) Abstract Stack Graph (Lakhotia and Kumar 2004) VSA+ASG Domain Interpretation example Application Detect Call obfuscations Implementation Future and Conclusions

Prototype Limitations More Memory Support –mov [eax], ebx –GetModuleHandle/GetProcAddress PUSH offset DLL_NAME; “kernel32.dll” CALL GetModuleHandleA PUSH offset PROC_NAME; “ExitProcess” PUSH eax CALL GetProcAddress PUSH 0 CALL eax

Prototype Limitations Efficiency –Many paths to each instruction –More efficient algorithms needed CFG with branches

General Limitations More RIC operations –Too many undefined RICs What about obfuscated call–obfuscated return pairs –Not a problem with system calls

General Limitations Structure Exception Handling Main: PUSH offset Handler ; set up exception handler PUSH dword ptr FS:[0] MOV FS:[0], esp MOV edx, 0 ; force an exception MOV eax, 0 DIV eax ; (edx / eax) PUSH 0 CALL ExitProcess Handler:... ; real code goes here

Conclusion Method –Combines VSA+ASG –Interpret using abstract values –Apply rules to abstract stack graph to detect obfuscations Results –Can handle more instructions ( mov esp, eax ) –Can handle indirect jumps/calls –Anticipate support for GetModuleHandle/GetProcAddress Prototype performed as expected, though more work is needed