BinHunt: Automatically Finding Semantic Differences in Binary Programs Debin Gao Michael K. Reiter Dawn Song ICICS 2008: 10th International Conference.

BinHunt: Automatically Finding Semantic Differences in Binary Programs Debin Gao, Michael K. Reiter, Dawn Song ICICS 2008: 10th International Conference on Information and Communications Security

Conference ICICS: International Conference on Information and Communications Security. ICICS 2008 was the 10th edition of the conference.

Papers Session V: Software security
• BinHunt: Automatically Finding Semantic Differences in Binary Programs. Debin Gao (a), Mike Reiter (b) and Dawn Song (c)
• Enhancing Java ME Security Support with Resource Usage Monitoring. Paolo Mori, Fabio Martinelli, Alessandro Castrucci and Francesco Roperti, IIT-CNR, Italy
• Pseudo-randomness Inside Web Browsers. Guan Zhi, Zhang Long, Zhong Chen and Nan Xianghao, Peking University, China

Authors: Debin Gao, Michael K. Reiter, Dawn Song

Debin Gao Automatically Adapting a Trained Anomaly Detector to Software Patches Peng Li, Debin Gao and Michael K. Reiter In Proceedings of the 12th International Symposium on Recent Advances in Intrusion Detection (RAID 2009) Bridging the Gap between Data-flow and Control-flow Analysis for Anomaly Detection Peng Li, Hyundo Park, Debin Gao and Jianming Fu In Proceedings of the 24th Annual Computer Security Applications Conference (ACSAC 2008) Gray-Box Extraction of Execution Graphs for Anomaly Detection Debin Gao, Michael K. Reiter and Dawn Song In Proceedings of the 11th ACM Conference on Computer and Communications Security (CCS 2004) On Gray-Box Program Tracking for Anomaly Detection Debin Gao, Michael K. Reiter and Dawn Song In Proceedings of the 13th USENIX Security Symposium (USENIX Security 2004) Assistant Professor School of Information Systems Singapore Management University

Michael K. Reiter Automatically adapting a trained anomaly detector to software patches P. Li, D. Gao and M. K. Reiter In Recent Advances in Intrusion Detection, 12th International Symposium, RAID 2009 Fast and black-box exploit detection and signature generation for commodity software X. Wang, Z. Li, J. Y. Choi, J. Xu, M. K. Reiter and C. Kil ACM Transactions on Information and System Security 12(2) On gray-box program tracking for anomaly detection D. Gao, M. K. Reiter and D. Song In Proceedings of the 13th USENIX Security Symposium Lawrence M. Slifkin Distinguished Professor Department of Computer Science University of North Carolina at Chapel Hill

Dawn Song Research Projects BitBlaze: Binary analysis for COTS protection and malicious code defense Binary Code Extraction and Interface Identification for Security Applications. Juan Caballero, Noah M. Johnson, Stephen McCamant, and Dawn Song. In Proceedings of the 17th Annual Network and Distributed System Security Symposium, February Loop-Extended Symbolic Execution on Binary Programs. Prateek Saxena, Pongsin Poosankam, Stephen McCamant, and Dawn Song. In Proceedings of the ACM/SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), July BitBlaze: A New Approach to Computer Security via Binary Analysis. Dawn Song, David Brumley, Heng Yin, Juan Caballero, Ivan Jager, Min Gyung Kang, Zhenkai Liang, James Newsome, Pongsin Poosankam, and Prateek Saxena. In Proceedings of the 4th International Conference on Information Systems Security Associate Professor Computer Science Division University of California, Berkeley

Introduction BinHunt: finds semantic differences in binary programs. It bases its analysis on the control flow of the programs, using a new graph isomorphism technique, symbolic execution, and theorem proving. Semantic differences: changes in the program functionality. Syntactic differences: e.g., different register allocation and basic block re-ordering.

Challenge
• A small change in the source code may cause the compiler to use a different register allocation in other parts of the program in which the corresponding source code remains the same.
• A small change in the source code may change the size of a small number of basic blocks, which further triggers the compiler to re-order many other basic blocks in the binary file.

Idea: The control flow of a program is much more resistant to "superficial" changes like different register allocations and basic block re-ordering, and therefore is a more attractive feature for finding semantic differences.

Assumptions: The source code of the binary files is not available. Function names extracted from these binary files are unreliable for the purpose of binary difference analysis, since they can be changed easily.

System Overview (1) Input: two binary files. Output:
• a matching between functions in the two binary files
• a matching between basic blocks in two matched functions
• a matching strength for each match of functions or basic blocks

System Overview (2) Decision: The matchings together with the matching strengths tell us where the semantic differences are. Unmatched functions and unmatched basic blocks, as well as matched functions and matched basic blocks with low matching strengths, constitute the semantic differences found between the two binary files.
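The decision rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `semantic_differences`, the dictionary layout, and the cutoff `threshold=0.9` for what counts as a "low" matching strength are assumptions, not values taken from the slides.

```python
def semantic_differences(nodes_a, nodes_b, matching, threshold=0.9):
    """Report unmatched nodes and weakly matched pairs.

    nodes_a, nodes_b: functions (or basic blocks) of the two binaries.
    matching: dict mapping (node_a, node_b) -> matching strength in [0, 1].
    threshold: hypothetical cutoff; pairs strictly below it count as "low".
    """
    matched_a = {a for a, _ in matching}
    matched_b = {b for _, b in matching}
    unmatched_a = sorted(set(nodes_a) - matched_a)
    unmatched_b = sorted(set(nodes_b) - matched_b)
    weak = sorted(pair for pair, s in matching.items() if s < threshold)
    return unmatched_a, unmatched_b, weak


# Made-up example: "h" and "k" have no partner, and ("g", "g2") is a weak match.
result = semantic_differences(
    ["f", "g", "h"], ["f2", "g2", "k"],
    {("f", "f2"): 1.0, ("g", "g2"): 0.5},
)
print(result)
```

The same helper applies at both levels of the output: once over functions, and once over the basic blocks of each matched function pair.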

Disassembler: parses each binary file and locates the code segment. Realization: implemented as a plug-in to IDA Pro.

IR Converter IR: a dozen different statements, which are type-checked and free of side effects. Easy: our symbolic execution and theorem proving are applied to a much simpler set of instructions. Reliable: reduces the language variation among instructions that perform the same functionality.

CFG Constructor CFG (control flow graph): a set of nodes, each representing a basic block, and a set of directed edges representing the control flow among the basic blocks. CG (call graph): the set of nodes corresponding to the functions in the file and the set of directed edges representing calls among the functions.
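Both graphs share the same shape: a node set plus a set of directed edges. A minimal Python sketch, assuming a hypothetical `Graph` class and made-up node names (none of these identifiers come from BinHunt itself):

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Directed graph used for both the CFG and the CG in this sketch."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # directed (src, dst) pairs

    def add_edge(self, src, dst):
        self.nodes.update((src, dst))
        self.edges.add((src, dst))

    def successors(self, node):
        return sorted(d for s, d in self.edges if s == node)


# CFG of one function with an if/else: nodes are basic blocks.
cfg = Graph()
for src, dst in [("entry", "then"), ("entry", "else"),
                 ("then", "exit"), ("else", "exit")]:
    cfg.add_edge(src, dst)

# CG of the whole file: nodes are functions, edges are calls.
cg = Graph()
cg.add_edge("main", "parse")
cg.add_edge("main", "emit")

print(cfg.successors("entry"))
print(cg.successors("main"))
```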

Graph Isomorphism Engine Basic block comparison: symbolic execution and theorem proving. Maximum common subgraph isomorphism problem: backtracking algorithm.

Symbolic Execution Definition: represent the values of program variables with symbolic values instead of concrete (initialized) data, and manipulate expressions involving symbolic values. Procedure:
Step 1: find all the input and output registers and variables.
Step 2: use symbolic execution to represent the final values of the output registers and variables.
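The two steps can be illustrated on a toy straight-line block. The sketch below assumes a made-up three-address instruction format `(dst, op, a, b)` and a `_0` suffix for initial symbolic values; it is not BinHunt's IR.

```python
def symbolic_execute(block):
    """Symbolically execute a straight-line basic block.

    block: list of (dst, op, a, b), where a and b are register names or ints.
    Returns (inputs, state): registers read before being written, and a map
    from each written (output) register to its final symbolic expression.
    """
    state = {}    # output register -> symbolic expression (nested tuples)
    inputs = []   # Step 1: registers whose initial value the block depends on

    def value(x):
        if isinstance(x, int):
            return x                  # constant operand
        if x not in state:
            if x not in inputs:
                inputs.append(x)
            return x + "_0"           # symbolic initial value of the register
        return state[x]

    for dst, op, a, b in block:
        # Step 2: build the expression from the operands' current values.
        state[dst] = (op, value(a), value(b))
    return inputs, state


# eax = ebx + 1; eax = eax * 2
inputs, state = symbolic_execute([("eax", "add", "ebx", 1),
                                  ("eax", "mul", "eax", 2)])
print(inputs)        # ['ebx']
print(state["eax"])  # ('mul', ('add', 'ebx_0', 1), 2)
```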

Theorem Proving Realization: STP, a decision procedure for the satisfiability of quantifier-free formulas in the theory of bit-vectors and arrays. Procedure: pick the symbolic representation of one register/variable from each basic block and use STP to test if they are equivalent, assuming that the inputs to the basic blocks share the same values. Assurance: if two basic blocks are found to be different by our technique of symbolic execution and theorem proving, then they must not be functionally equivalent. This property holds even if the two binary files are compiled using different compilers or compiler options.
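To show what the equivalence query asks, the sketch below replaces STP with a brute-force check over all input values at a small bit width. That substitution is deliberate and is not what BinHunt does: agreement at 4 bits illustrates the idea but proves nothing about 32-bit registers, which is why a real decision procedure is needed. The expression format (nested tuples) and all names are assumptions.

```python
from itertools import product

# Supported bit-vector operations (results are truncated by the caller).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "shl": lambda a, b: a << b,
    "xor": lambda a, b: a ^ b,
}


def evaluate(expr, env, bits):
    """Evaluate a nested-tuple expression modulo 2**bits."""
    mask = (1 << bits) - 1
    if isinstance(expr, int):
        return expr & mask
    if isinstance(expr, str):
        return env[expr] & mask
    op, a, b = expr
    return OPS[op](evaluate(a, env, bits), evaluate(b, env, bits)) & mask


def variables(expr):
    """Collect the symbolic input names used in an expression."""
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, int):
        return set()
    return variables(expr[1]) | variables(expr[2])


def equivalent(e1, e2, bits=4):
    """True if e1 and e2 agree for every assignment of the shared inputs."""
    names = sorted(variables(e1) | variables(e2))
    for vals in product(range(1 << bits), repeat=len(names)):
        env = dict(zip(names, vals))
        if evaluate(e1, env, bits) != evaluate(e2, env, bits):
            return False
    return True


print(equivalent(("mul", "x", 2), ("shl", "x", 1)))  # True
print(equivalent(("add", "x", 1), ("add", "x", 2)))  # False
```

Using the same variable name on both sides models the slide's assumption that the inputs to the two basic blocks share the same values.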

Matching Strength
Basic block:
• 1.0: functionally equivalent and registers used are the same
• 0.9: functionally equivalent while registers used are different
• lower: scored on how functionally equivalent they are
Function:
• 1.0: instructions (x86 or IR) of the two functions are the same
• others: subgraph measurement divided by the number of nodes in the CFG that has fewer nodes, where subgraph measurement is defined as the summation of matching strengths of matched nodes (basic blocks)
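The function-level score can be written out directly from the definition above. A small sketch, with a hypothetical function name and made-up numbers:

```python
def function_matching_strength(block_strengths, n_nodes_a, n_nodes_b,
                               identical_instructions=False):
    """Matching strength of two functions, following the slide's definition.

    block_strengths: matching strengths of the matched basic-block pairs.
    n_nodes_a, n_nodes_b: number of basic blocks in each function's CFG.
    identical_instructions: True if the two functions' instructions are the same.
    """
    if identical_instructions:
        return 1.0
    if n_nodes_a <= 0 or n_nodes_b <= 0:
        raise ValueError("each CFG needs at least one node")
    subgraph_measurement = sum(block_strengths)
    return subgraph_measurement / min(n_nodes_a, n_nodes_b)


# Three matched blocks (1.0, 0.9, 0.9) between CFGs of 4 and 5 nodes:
# subgraph measurement 2.8, divided by 4, gives roughly 0.7.
print(function_matching_strength([1.0, 0.9, 0.9], 4, 5))
```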

Backtracking Algorithm D: contains all possible pairs of nodes that might still be matched (initially V X M). M: contains matched node pairs (initially empty).
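The roles of D and M can be seen in a generic textbook backtracking search for a maximum common induced subgraph. The sketch below is not BinHunt's exact algorithm: it ignores matching strengths and self-loops, maximizes only the number of matched nodes, and assumes D starts as all node pairs of the two graphs. All names are hypothetical.

```python
def max_common_subgraph(nodes1, edges1, nodes2, edges2):
    """Return a largest list of node pairs (a, b) that preserves directed edges."""
    e1, e2 = set(edges1), set(edges2)

    def compatible(a, b, x, y):
        # (a, b) and (x, y) can coexist only if edges agree in both directions.
        return (((a, x) in e1) == ((b, y) in e2)
                and ((x, a) in e1) == ((y, b) in e2))

    best = []

    def search(D, M):
        nonlocal best
        # Bound: each graph-1 node still present in D can add at most one pair.
        if len(M) + len({x for x, _ in D}) <= len(best):
            return
        if not D:
            best = list(M)
            return
        a, b = D[0]
        # Branch 1: match a with b; drop pairs reusing a or b, or conflicting.
        search([(x, y) for x, y in D
                if x != a and y != b and compatible(a, b, x, y)],
               M + [(a, b)])
        # Branch 2: leave this candidate pair unmatched.
        search(D[1:], M)

    # D: candidate pairs that might still be matched; M: matched pairs so far.
    search([(a, b) for a in nodes1 for b in nodes2], [])
    return best


# Path 1 -> 2 -> 3 against a graph that contains the path a -> b -> c.
pairs = max_common_subgraph(
    [1, 2, 3], [(1, 2), (2, 3)],
    ["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("a", "d")],
)
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```

The search is exponential in the worst case, so the bound check matters: it abandons any branch that cannot beat the best matching found so far.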

Case Study: gzip

Case Study: tar (1)

Case Study: tar (2)

Case Study: tar (3)

Related Work & Conclusion
BinDiff/BindView: construct a maximal subgraph isomorphism between the sets of functions in two versions of the same executable file.
BinHunt:
• contributes a more thorough technique (backtracking) for identifying the maximum common subgraph isomorphism
• uses a novel technique for basic block comparison based on symbolic execution and theorem proving

Thank you!