Julio Auto [julio {funny a} julioauto com]. The Problem, The Solution, Demo, Solution Details, What's Next?, Greetings & References.


We will be talking about analyzing closed-source software here. Absolutely no debugging information is needed. However, depending on the complexity of the bug, even people with the source might opt for this analysis, e.g. vendors receiving crash reports.

Sometimes people just have to analyze bugs in closed-source software. These bugs may come from a fuzzing session, contributor-sent proof-of-concept code, in-the-wild exploit code, etc. The reasons for analyzing them vary as much as their sources, but that is irrelevant here. The fact is...

ANALYZING BUGS CAN BE HARD! A seasoned reverse engineer may take weeks to get somewhere if the target software is too big, if the data consumed is in a very complex and/or undisclosed format, or if bugs in this target are so rare that your reversing team has no previous experience with it. But which bugs do we mostly care for?

"Analyzing bugs" is very broad. There is no ./write-me-a-very-detailed-advisory. We will concentrate on answering one question: what exact part of my data made the program crash? Understanding that, and how such data is transformed, is essential.

Dynamic Dataflow Analysis: watching data and its ramifications as the doomed program executes. What we really do is Taint Analysis. We start with a subset of the program's data, the attacker's input, and assume it's evil. Its ramifications are tainted memory, tainted registers... but we do it backwards.

[Diagram] TAINT ANALYSIS: this is the Evil Input; is any of these of interest? BACKWARDS TAINT ANALYSIS: this is of interest; is any of these from the Evil Input?

So we really don't care about every tainted piece of data in the process space; most of it is legitimate anyway. Thus, we avoid the explosion of watched data. Plus we can do stuff like this. Bug: mov eax, [esi] (where esi = DEADBEEFh). The analysis runs and reports: esi = user[4] + var_unk * 8

This is all done in two steps: tracing and analysis. First we trace the program from a good point until it crashes. The trace is incrementally dumped to a file: not just the disassembly, but also some extra info. E.g., in the earlier slide's example, effective address ([esi]) == DEADBEEFh. Then the trace file goes under analysis.

[Timeline] Target starts -> Evil Input enters (the good starting point, where we start tracing) -> Target crashes!

So we feed the trace file to the analyzer and tell it: address ranges from ABCDh to ACCDh and from DCBAh to DCCAh held Evil Input, and I wanna know if esi was tainted by Evil Input. And magic happens!
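The "held Evil Input" part of that query boils down to a range lookup. Here is a minimal Python sketch of it; the names `EVIL_RANGES`, `is_evil_address` and `evil_offset` are hypothetical, and treating the ranges as half-open is an assumption, since the slides do not say whether the end address is included.

```python
# Hypothetical model of the address ranges handed to the analyzer.
# Assumption: each range is half-open, [start, end).
EVIL_RANGES = [(0xABCD, 0xACCD), (0xDCBA, 0xDCCA)]


def is_evil_address(addr, ranges=EVIL_RANGES):
    """Return True if addr lies inside any range that held Evil Input."""
    return any(start <= addr < end for start, end in ranges)


def evil_offset(addr, ranges=EVIL_RANGES):
    """Return (range_index, byte_offset) for an evil address, else None."""
    for index, (start, end) in enumerate(ranges):
        if start <= addr < end:
            return index, addr - start
    return None
```

Returning the offset as well as the yes/no answer is what would allow a report in the style of user[4]: the analyst learns which input byte is involved, not only that some input byte is.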

Considerations: tracing is very time-consuming. For the bug I'll analyze as an example, it takes about 2 hours to dump the 650,000+ instructions it executes. The analysis... not so much: 1 to 2 minutes. That may sound like a lot, but how long would it take to do manually? Plus, you can always use this time to do something else while the computer is working for you.

Introducing... Visual Data Tracer!

The VDT Tracer is implemented as a WinDbg extension, because WinDbg is free and it's a great debugger. The VDT Analyzer is a stand-alone C++ app. The tracer needs to understand some simple instruction semantics, e.g. the source and destination operands. Currently only the basic x86 subset is implemented (no x87, MMX, etc.).

The semantic rules are simplified to avoid dumping useless info to the trace file. E.g., a push does not meaningfully change esp (same for inc, dec, and their destination operands). They are also written to fit the very simplistic format of the trace file entries. All of this makes the analysis easier, and thus faster, yet still useful.

Trace file entry: mnemonic, destination operand, source operand, and up to three source operand dependences. Dependences are, for example, the elements of an indirectly addressed memory operand. This effectively exposes the dataflow relations as a tree (rooted at the crash instruction). Performing the backwards taint analysis then becomes a matter of searching the tree, which VDT does with a BFS algorithm.
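To make the entry layout concrete, here is a small Python sketch of such a record. The slides do not show VDT's on-disk format, so the `TraceEntry` class and the pipe-separated text record read by `parse_entry` are purely illustrative assumptions; only the four kinds of field come from the slide.

```python
from dataclasses import dataclass, field
from typing import List

MAX_SRCDEPS = 3  # "up to three source operand dependences"


@dataclass
class TraceEntry:
    mnemonic: str
    dst: str
    src: str
    srcdeps: List[str] = field(default_factory=list)

    def __post_init__(self):
        if len(self.srcdeps) > MAX_SRCDEPS:
            raise ValueError("at most three source dependences per entry")

    def inputs(self):
        """Everything that can influence dst: the source plus its dependences."""
        return [self.src] + list(self.srcdeps)


def parse_entry(line):
    """Parse a hypothetical 'mnemonic|dst|src|dep1|dep2|dep3' text record."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) < 3:
        raise ValueError("need at least mnemonic, dst and src")
    mnemonic, dst, src = parts[:3]
    deps = [part for part in parts[3:] if part]
    return TraceEntry(mnemonic, dst, src, deps)
```

Each entry's `inputs()` are the children of its destination in the dataflow tree, which is why a fixed, flat record is enough for the analyzer to search.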

Putting it together so far:

    mov edi, 0x1234        ; dst=edi, src=0x1234
    mov eax, [0xABCD]      ; dst=eax, src=ptr 0xABCD
                           ; Note: 0xABCD is an evil address
    lea ebx, [eax+ecx*8]   ; dst=ebx, src=eax, srcdep1=ecx
    mov [edi], ebx         ; dst=ptr 0x1234, src=ebx
    mov esi, [edi]         ; dst=esi, src=ptr 0x1234, srcdep1=edi
    mov edx, [esi]         ; Crash!!!
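The backwards search over this example can be sketched in a few lines of Python. This is not VDT's code (the analyzer is a C++ app whose internals the slides do not show); the tuple layout, the operand strings, the half-open evil range and all function names are assumptions made for illustration.

```python
from collections import deque

# Assumption: one half-open range that held Evil Input.
EVIL_RANGES = [(0xABCD, 0xACCD)]

# (mnemonic, dst, src, srcdeps), transcribed from the comments in the listing.
TRACE = [
    ("mov", "edi", "0x1234", []),
    ("mov", "eax", "ptr 0xABCD", []),
    ("lea", "ebx", "eax", ["ecx"]),
    ("mov", "ptr 0x1234", "ebx", []),
    ("mov", "esi", "ptr 0x1234", ["edi"]),
]


def is_evil(operand):
    """True for a memory operand whose address lies in an evil range."""
    if not operand.startswith("ptr "):
        return False
    addr = int(operand[4:], 16)
    return any(lo <= addr < hi for lo, hi in EVIL_RANGES)


def last_definition(operand, before):
    """Index of the latest entry before `before` that writes `operand`."""
    for i in range(before - 1, -1, -1):
        if TRACE[i][1] == operand:
            return i
    return None


def backward_taint(operand):
    """BFS backwards from `operand` at the crash.

    Returns (evil_leaves, unknown_leaves); immediates are ignored.
    """
    evil, unknown = [], []
    queue = deque([(operand, len(TRACE))])
    seen = set()
    while queue:
        op, before = queue.popleft()
        if (op, before) in seen:
            continue
        seen.add((op, before))
        idx = last_definition(op, before)
        if idx is None:
            if is_evil(op):
                evil.append(op)
            elif not op.startswith("0x"):
                unknown.append(op)
            continue
        _, _, src, deps = TRACE[idx]
        for child in [src] + deps:
            queue.append((child, idx))
    return evil, unknown
```

Calling `backward_taint("esi")` walks esi back through ptr 0x1234, ebx and eax to the evil address 0xABCD, and reports ecx as a value the trace never defines, which mirrors the earlier "user input plus unknown variable times 8" style of report.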

Simplifying semantic rules to fit that format is not always easy. Take CMPXCHG r/m32, r32: compare EAX with r/m32; if equal, ZF is set and r32 is loaded into r/m32; else, ZF is cleared and r/m32 is loaded into EAX. The aftermath: the need for conditional taints, i.e. one of the possibilities of controlling r/m32 is controlling r32 AND eax. Note that alternative taints also exist, implemented in the form of srcdep{1,2,3}.
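One way to picture conditional (AND) and alternative (OR) taints together is as a list of alternatives, each of which is a set of operands that must all be tainted. The Python sketch below is an illustration under that assumption, with hypothetical function names. Only the "r32 AND eax" alternative is stated on the slide; the other two rules are my reading of the CMPXCHG semantics quoted above and may not match what VDT does.

```python
def cmpxchg_taint_rules(rm, reg):
    """Taint rules after 'CMPXCHG rm, reg'.

    Maps each destination to a list of alternatives (OR); every operand in
    an alternative must be tainted (AND).
    """
    return {
        rm: [
            frozenset({reg, "eax"}),  # from the slide: r32 AND eax
            frozenset({rm}),          # assumption: old value kept on mismatch
        ],
        "eax": [
            frozenset({rm}),          # assumption: EAX ends up equal to old r/m32
        ],
    }


def is_tainted(dst, rules, tainted):
    """True if at least one alternative for dst is fully tainted."""
    return any(alt <= tainted for alt in rules.get(dst, []))
```

For example, with `rules = cmpxchg_taint_rules("ptr 0x1234", "ebx")`, tainting ebx alone is not enough to taint ptr 0x1234, but tainting both ebx and eax is.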

Other subtleties to watch for: AH defines EAX; EAX defines AL; AL does not define AH. A similar problem exists for 1-byte and 2-byte memory accesses. [Diagram: EAX (32 bits) is an unnamed upper half (16) plus AX (16); AX is AH (8) plus AL (8)]
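These three relations fall out naturally if each register name is modeled as a bit range inside EAX and "defines" is read as "the ranges overlap". The following Python sketch takes that view; the names `REG_BITS`, `defines` and `mem_defines` are hypothetical, and the slides do not say how VDT actually implements this check.

```python
# Bit ranges inside EAX, half-open: (low_bit, high_bit).
REG_BITS = {
    "eax": (0, 32),
    "ax": (0, 16),
    "al": (0, 8),
    "ah": (8, 16),
}


def defines(written, read):
    """True if writing `written` changes at least one bit of `read`."""
    w_lo, w_hi = REG_BITS[written]
    r_lo, r_hi = REG_BITS[read]
    return w_lo < r_hi and r_lo < w_hi


def mem_defines(write_addr, write_size, read_addr, read_size):
    """Same overlap test for memory accesses of different sizes (in bytes)."""
    return (write_addr < read_addr + read_size
            and read_addr < write_addr + write_size)
```

The memory variant covers the 1-byte and 2-byte case: a 1-byte write to 0x1235 defines part of a 4-byte read at 0x1234, so a backwards search looking for the last definition of that dword must not skip it.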

Extending the coverage of x86. Enhancing speed (God knows how...). Heuristically detecting user input, e.g. by making the tracer understand CreateFile(). Automatic exploit generation. What else? Any ideas, let me know...

SpiderPig Project: very similar ideas, different approach. !exploitable: a more superficial (but much faster) tool for bug triaging. If you have many bugs to triage, you can first run !exploitable on them and then use VDT on those that seem really interesting.

Julien Vanegue, for all the lecturing, motivating and supporting. Piotr Bania, for discussing DDF analysis and much more. People from PSV ( for letting me idle on IRC, leeching their knowledge. Everyone else who talks to me about security and similarly cool stuff.
