LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks Feng Qin, Cheng Wang, Zhenmin Li, Ho-seop Kim, Yuanyuan.

Presentation transcript:

LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks. Feng Qin, Cheng Wang, Zhenmin Li, Ho-seop Kim, Yuanyuan Zhou, Youfeng Wu. University of Illinois at Urbana-Champaign, Intel Corporation, The Ohio State University.

Information Flow Tracking (Taint Analysis): Used to detect or prevent security attacks, specifically attacks that corrupt control data. General: not tied to specific types of software vulnerabilities, and applicable even to unknown attacks.

Approach: 1. Tag (label) the input data from unsafe channels, such as the network. 2. Propagate the data tags through the computation: any data derived from unsafe data is also tagged as unsafe. 3. Detect unexpected uses of the unsafe data, such as switching the program control to unsafe data.

A Simple Example: a is unsafe. Information flows from a to b, so b is unsafe. If c is unsafe, jumping to the location pointed to by c fails.
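
The three steps and the example above can be sketched in Python. This is only an illustration, not LIFT's implementation: the `Tainted` wrapper, `read_network`, and `indirect_jump` are hypothetical names, and the real system tracks machine instructions rather than Python values.

```python
class TaintError(Exception):
    """Raised when unsafe data is about to be used as a jump target."""


class Tainted:
    """A value paired with a one-bit tag (hypothetical helper)."""

    def __init__(self, value, unsafe=False):
        self.value = value
        self.unsafe = unsafe

    def __add__(self, other):
        # Step 2: the result is unsafe if either operand is unsafe.
        other_value = other.value if isinstance(other, Tainted) else other
        other_unsafe = other.unsafe if isinstance(other, Tainted) else False
        return Tainted(self.value + other_value, self.unsafe or other_unsafe)


def read_network(value):
    # Step 1: data from an unsafe channel is tagged on arrival.
    return Tainted(value, unsafe=True)


def indirect_jump(target):
    # Step 3: refuse to transfer control to an unsafe target.
    if target.unsafe:
        raise TaintError("jump target derived from unsafe input")
    return target.value


a = read_network(0x41414141)   # a is unsafe
b = a + 4                      # information flows from a to b
c = Tainted(0x8048000)         # a constant inside the program: safe
```

Here `indirect_jump(c)` succeeds, while `indirect_jump(b)` raises `TaintError` because b inherited its tag from a.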

Three Ways: Language-based. For programs written in special type-safe programming languages; tracks information flow at compile time. Good: no runtime overhead. Bad: only for specific programming languages, so not practical.

Three Ways: Instrumentation. Tracks the information flow and detects exploits at runtime. Source code instrumentation: lower overhead, but cannot track in third-party library code and requires a specification of library calls (complex, error-prone, side effects). Binary code instrumentation: runtime overhead of 37 times.

Three Ways: Hardware-based. Example: RIFLE. Good: low overhead. Bad: non-trivial hardware extensions.

Overview of LIFT: Dynamically instruments the binary code to (1) track information flow and (2) detect security exploits. Advantages: low overhead, software-only, no source code needed. Built on top of StarDBT, a binary translator by Intel.

Design of LIFT: Basic design (tag management, information flow tracking, exploit detection, protection of the tag space) and optimizations.

Tag Management: Design. Associate a one-bit tag with each byte of data in memory and in the general data registers (0: safe; 1: unsafe). At the beginning, all tags are cleared to zero. Data may be tagged with 1 when it is read from the network or standard input, or when information flows to it from other unsafe data. Unsafe data can become safe again if it is reassigned from safe data.

Tag Management: Storage. For memory data: Storage: a special memory region (tag space). Look-up: one-to-one mapping between a tag bit and a memory byte in the virtual address space. Overhead: 12.5% (one tag bit per 8-bit byte). Compression: memory data near each other usually have similar tag values. For general registers: store the tags in a dedicated extra register (64-bit) to reduce overhead. If there are no spare registers, use a special memory area; the overhead is not significant, since that area is likely to stay in the L1 cache. Hardware ??
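
The one-bit-per-byte mapping and the 12.5% figure can be illustrated with a small bitmap. This is a sketch under assumptions: the `TagSpace` class and the 4096-byte simulated memory are hypothetical, and the compression mentioned above is not modeled.

```python
class TagSpace:
    """One tag bit per byte of a small, simulated address space."""

    def __init__(self, memory_size):
        # 1 bit of tag per 8 bits of data: 1/8 = 12.5% space overhead.
        self.bits = bytearray((memory_size + 7) // 8)

    def get(self, addr):
        # Tag byte index is addr / 8; bit position is addr % 8.
        return (self.bits[addr >> 3] >> (addr & 7)) & 1

    def set(self, addr, unsafe):
        mask = 1 << (addr & 7)
        if unsafe:
            self.bits[addr >> 3] |= mask
        else:
            self.bits[addr >> 3] &= ~mask & 0xFF


tags = TagSpace(memory_size=4096)
tags.set(0x123, True)                 # mark one byte as unsafe
overhead = len(tags.bits) / 4096      # tag bytes per data byte
```

With this layout the tag of any byte is found by a shift and a mask on its address, which is what makes the look-up cheap.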

Information Flow Tracking: Dynamically instruments instructions. Each instruction is instrumented once at runtime and executed multiple times. The instrumentation is placed before the corresponding instruction of the original program. Tracks information flow based on data dependencies, not control dependencies.

Information Flow Tracking: For data movement instructions (e.g., MOV, PUSH, POP), tag propagation: source operand -> destination. For arithmetic instructions (e.g., ADD, OR), tag propagation: both source operands -> destination. For instructions that involve only one operand (e.g., INC), the tag does not change.
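
These propagation rules can be written as a small function. It is a simplified sketch that assumes register-only operands and a hypothetical `propagate` helper; it is not the instrumentation code LIFT actually emits.

```python
# Register tags: True = unsafe, False = safe.
def propagate(opcode, dst, srcs, tags):
    """Update tags[dst] for one instruction (simplified, register-only)."""
    if opcode in ("MOV", "PUSH", "POP"):
        # Data movement: the destination takes the tag of the source.
        tags[dst] = tags[srcs[0]]
    elif opcode in ("ADD", "OR"):
        # Arithmetic: the destination is unsafe if any source is unsafe.
        tags[dst] = any(tags[s] for s in srcs)
    elif opcode == "INC":
        # Single operand: the tag is left unchanged.
        pass
    else:
        raise ValueError(f"unhandled opcode: {opcode}")


tags = {"eax": True, "ebx": False, "ecx": False}
propagate("MOV", "ecx", ["ebx"], tags)         # ecx takes the safe tag of ebx
propagate("ADD", "ebx", ["ebx", "eax"], tags)  # ebx becomes unsafe
propagate("INC", "eax", ["eax"], tags)         # eax stays unsafe
```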

Information Flow Tracking: Special cases. XOR reg, reg and SUB reg, reg reset reg to zero, so the corresponding tag is cleared.
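
The special case matters because the general two-operand rule would otherwise keep an unsafe register unsafe even though its new value is the constant zero. A minimal sketch, with a hypothetical `propagate_with_idioms` helper and register-only operands:

```python
def propagate_with_idioms(opcode, dst, src, tags):
    """Tag update for a two-operand instruction, with the zeroing idioms."""
    if opcode in ("XOR", "SUB") and dst == src:
        # The result is always 0 regardless of the old value, so it is safe.
        tags[dst] = False
    else:
        # General rule: unsafe if either operand is unsafe.
        tags[dst] = tags[dst] or tags[src]


tags = {"eax": True, "ebx": False}
propagate_with_idioms("XOR", "ebx", "eax", tags)  # ebx becomes unsafe
propagate_with_idioms("XOR", "eax", "eax", tags)  # eax is cleared
```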

Exploit Detection: Instructions are also instrumented to detect exploits. Unsafe data cannot be used as a return address or as the destination of an indirect jump instruction.
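
One way to picture this instrumentation is a pass that inserts a tag check before every return and every indirect jump. The sketch below is an assumption-laden toy: the tuple instruction format, the `CHECK_TAG` pseudo-instruction, and the `is_indirect` convention are all hypothetical.

```python
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}


def is_indirect(operand):
    # Hypothetical convention: register or memory operands are indirect,
    # anything else is treated as a direct label.
    return operand in REGISTERS or operand.startswith("[")


def instrument(instructions):
    """Return a new list with a tag check before each risky control transfer."""
    out = []
    for opcode, operand in instructions:
        if opcode == "RET":
            # The return address lives on top of the stack.
            out.append(("CHECK_TAG", "[esp]"))
        elif opcode == "JMP" and is_indirect(operand):
            out.append(("CHECK_TAG", operand))
        out.append((opcode, operand))
    return out


original = [("MOV", "eax"), ("JMP", "eax"), ("JMP", "label"), ("RET", None)]
instrumented = instrument(original)
```

The direct jump to `label` is left alone because its target is fixed in the code and cannot be derived from input.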

Protection of Tag Space and Code: Both must be protected. To protect the LIFT code: make the memory pages that store the LIFT code read-only. To protect the tag space: turn off the access permission of the pages that store the tag values of the tag space itself. Any access to the tag space by the original program or by hijacked code then results in an access to the corresponding tag, which triggers a fault.

Optimizations: 47 times runtime overhead; three binary optimizations.

Fast Path (FP): Motivation. Observation: for most server applications, the majority of tag propagations are zero-to-zero, that is, from safe data sources to a safe destination.

FP: Approach. Before a code segment, insert a check of whether all its live-in and live-out registers and memory data are safe. If so, there is no need to do tracking inside the code segment: run the fast binary version (check version). If not, run the slow version (track version).

FP: Approach (continued). Live-in: source operands. Live-out: may change to safe after the execution if they were unsafe before the execution. Others: (a) not used in the code segment, or (b) dead at the beginning or end of the code segment.
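
The choice between the two versions can be sketched as follows. This is an illustration only: `run_segment` is a hypothetical helper, a segment is modeled as a list of (destination, sources) pairs, and the live-in and live-out sets are given by hand rather than computed.

```python
def run_segment(segment, tags, live_in, live_out):
    """Pick the check (fast) or track (slow) version of a code segment."""
    if not any(tags[loc] for loc in live_in | live_out):
        # Everything the segment reads or overwrites is safe, so every
        # propagation inside would be zero-to-zero: skip tracking.
        return "check"
    for dst, srcs in segment:
        # Track version: propagate tags instruction by instruction.
        tags[dst] = any(tags[s] for s in srcs)
    return "track"


segment = [("ecx", ["eax", "ebx"]), ("edx", ["ecx"])]
live_in, live_out = {"eax", "ebx"}, {"ecx", "edx"}

tags = {"eax": False, "ebx": False, "ecx": False, "edx": False}
first = run_segment(segment, tags, live_in, live_out)

tags["eax"] = True  # unsafe input arrives in eax
second = run_segment(segment, tags, live_in, live_out)
```

Live-outs are part of the check because an unsafe live-out that the segment overwrites with safe data would keep a stale unsafe tag if tracking were skipped.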

FP: More Technical Details. It is difficult to know the addresses of all memory units at the beginning of a code segment: run the check version first, postpone each check until the memory location is known, and jump to the track version when the check fails. Granularity of code segments: basic blocks, hot traces. Remove unnecessary checks. Network processing component.

Merged Check (MC): Motivation. Temporal / spatial locality: recently accessed data is likely to be accessed again in the near future, and after an access to a location, nearby memory locations are also likely to be accessed in the near future. Idea: combine multiple checks into one by merging the temporally and spatially nearby checks.

Merged Check: Approach. Cluster the memory references into groups: scan all the instructions and build a data dependency graph for each memory reference; introduce a version number to represent the timing attribute; cluster based on spatial / temporal distance.
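
The spatial side of merging can be illustrated with the one-bit-per-byte tag layout: the tags of eight neighboring bytes share one tag byte, so several checks collapse into one load and one mask test. The sketch below assumes that layout and uses hypothetical helpers `merge_checks` and `all_safe`; it does not model the dependency graph or the version numbers.

```python
from collections import defaultdict


def merge_checks(addresses):
    """Group byte addresses by the tag byte that holds their tag bits."""
    masks = defaultdict(int)
    for addr in addresses:
        masks[addr >> 3] |= 1 << (addr & 7)
    return dict(masks)


def all_safe(tag_bits, merged):
    # One load and one mask test per tag byte instead of one per address.
    return all((tag_bits[index] & mask) == 0 for index, mask in merged.items())


tag_bits = bytearray(16)  # tags for 128 bytes of simulated memory, all safe
merged = merge_checks([0x10, 0x11, 0x12, 0x13, 0x20])
safe_before = all_safe(tag_bits, merged)

tag_bits[2] |= 0b0100     # mark address 0x12 as unsafe
safe_after = all_safe(tag_bits, merged)
```

Five per-address checks become two mask tests here, one for the four neighboring bytes and one for the byte at 0x20.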

Fast Switch (FS): When program execution switches between the original binary code and the instrumented code, the context must be saved and restored. These saves and restores introduce large runtime overhead because they are inserted at many locations. Use cheaper instructions and remove unnecessary saves / restores.

Evaluation: Effectiveness, Performance

Evaluation: Effectiveness

Evaluation: Performance. Throughput and response time of Apache. Throughput: 6.2% (StarDBT: 3.4%). Time: 90.9%.

Evaluation: Performance. SPEC2000: 3.6 times on average.

Conclusion: A “practical” information flow tracking system: low overhead, no hardware extension required, no source code required.

Discussions. Source-code instrumentation: 81% on average for CPU-intensive C programs; 5% on average for IO-intensive (server) programs. If similar optimization techniques can be applied to source-code instrumentation, the performance could be “practical”. Binary-code instrumentation: CPU-bound: 24 times; Apache server: worst case 25 times, most cases 5~10 times.

More Discussions. The paper focuses on the basic design and three optimizations, with few details about the taint analysis itself. Evaluation: effectiveness (false positives / false negatives); performance (IO-intensive vs. CPU-intensive); more benchmarks. A formal model to analyze taint analysis.