Comprehensive Kernel Instrumentation via Dynamic Binary Translation Peter Feiner, Angela Demke Brown, Ashvin Goel University of Toronto Presenter: Chuong.

Presentation transcript:

Comprehensive Kernel Instrumentation via Dynamic Binary Translation Peter Feiner, Angela Demke Brown, Ashvin Goel University of Toronto Presenter: Chuong Ngo

THE ORIGIN STORY STARTING IN MEDIAS RES No parents, uncles, or girlfriends were killed during the creation of this presentation

DBT is the Answer! Emulation of one instruction set by another through translation of binary code during execution. More practical than static binary translation. ◦ Simplifies identification of executable code. ◦ Amortization of translation overhead costs over time.

…and I Remember Everything!

The Answer to What? Ports ◦ Abandonware Analysis Bug finding Security

Assemble! User Level JIFL PinOS Pin DynamoRIO Valgrind Power Level < 9K

IT’S A BIRD! IT’S A PLANE! IT’S DRK! All the way from Earth-1610 via Cataclysm

But Who Hides Behind the Mask? 4 Goals for kernel DBT framework: ◦ Full coverage of kernel code. ◦ No direct overhead for user level code. ◦ Preserve original concurrency and execution interleaving. ◦ Be transparent. DynamoRIO for the kernel.

DynamoRIO Flashback! Code cache. CTIs return control to dispatcher. Direct branching patches. Next Executing Tail. Client callbacks.

Well Victor…I’ve been thinking. All kernel entry points point to dispatcher. ◦ Shadow descriptor table. Self-contained dispatcher. ◦ Custom heap allocator. ◦ “Pull” I/O model. CPU-private data. Interrupts delayed in code cache, disabled in dispatcher. Exceptions use restored native states.

A Carbonadium Skeleton

DRK Initialization Individual CPU initialization ◦ Allocate CPU resources ◦ All kernel entry points to dispatcher ◦ All interrupts redirected Allocates memory for heap ◦ Checks all processors for successful memory mapping. ◦ Must be within 2GB of text and data segments.
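The 2GB limit on heap placement is presumably tied to x86-64 direct branches and RIP-relative operands, which carry a signed 32-bit displacement. A minimal sketch of such a reachability check follows; the function names and the example addresses in the comments are hypothetical and are not taken from DRK.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def rel32_reachable(next_insn_addr: int, target: int) -> bool:
    """True if `target` fits in a signed 32-bit displacement measured
    from the address of the instruction following the branch."""
    disp = target - next_insn_addr
    return INT32_MIN <= disp <= INT32_MAX


def heap_placement_ok(heap_start: int, heap_size: int,
                      seg_start: int, seg_size: int) -> bool:
    """Conservative check (an assumption, not DRK's actual test):
    the span covering both the heap and the segment must be small
    enough that any byte in one can reach any byte in the other."""
    lo = min(heap_start, seg_start)
    hi = max(heap_start + heap_size, seg_start + seg_size)
    return hi - lo <= INT32_MAX


# Hypothetical layout: kernel text at 0xffffffff80000000 (32 MB),
# heap candidate 512 MB above it (16 MB).
# heap_placement_ok(0xffffffffa0000000, 0x1000000,
#                   0xffffffff80000000, 0x2000000)
```

A candidate region that fails this check on any processor would have to be rejected, which matches the slide's note that all processors are checked for a successful mapping.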

DRK Normal Operations Determine target of control transfer instruction and dispatch. Kernel exit points executed via native instructions. Dispatcher creates and caches code fragment. Context switches to the code fragment.
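The dispatch cycle on this slide can be sketched as a toy model. This is an illustration only: basic blocks are Python callables, "translation" is an identity copy, and the class and method names are hypothetical rather than DRK or DynamoRIO APIs.

```python
from typing import Callable, Dict, Optional

# A block runs on a state dict and returns the target of its
# control transfer instruction (None models a kernel exit point).
Block = Callable[[dict], Optional[int]]


class Dispatcher:
    def __init__(self, original_code: Dict[int, Block]):
        self.original_code = original_code
        self.code_cache: Dict[int, Block] = {}
        self.translations = 0

    def translate(self, target: int) -> Block:
        # A real DBT would copy and rewrite machine instructions here;
        # this sketch just reuses the original block.
        self.translations += 1
        return self.original_code[target]

    def run(self, entry: int, state: dict) -> dict:
        target: Optional[int] = entry
        while target is not None:
            fragment = self.code_cache.get(target)
            if fragment is None:              # miss: create and cache fragment
                fragment = self.translate(target)
                self.code_cache[target] = fragment
            target = fragment(state)          # "context switch" into the fragment
        return state


def loop_body(state: dict) -> Optional[int]:
    state["n"] -= 1
    state["iters"] += 1
    return 0x10 if state["n"] > 0 else 0x20


def exit_block(state: dict) -> Optional[int]:
    return None


dispatcher = Dispatcher({0x10: loop_body, 0x20: exit_block})
final_state = dispatcher.run(0x10, {"n": 5, "iters": 0})
```

The loop body executes five times but is translated once, which is the amortization argument made for DBT earlier in the talk.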

You Can’t Escape This Timeline! Exceptions run native ◦ Native state must be restored. Interrupts are delayed and emulated. ◦ Other interrupts are disabled. ◦ Captured interrupt executed between block dispatches.
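A minimal sketch of the delayed-interrupt idea, assuming a single pending slot per CPU and re-enabling interrupts at delivery time; both are simplifications made for illustration, and the class and method names are hypothetical.

```python
class CpuInterruptState:
    """Toy model of interrupt handling while running from the code cache."""

    def __init__(self):
        self.pending = None             # captured interrupt vector, if any
        self.interrupts_enabled = True
        self.delivered = []             # vectors emulated so far

    def hardware_interrupt(self, vector: int) -> bool:
        """An interrupt arrives while a fragment is executing.
        Returns True if it was captured, False if it stayed masked."""
        if not self.interrupts_enabled:
            return False
        self.pending = vector
        self.interrupts_enabled = False  # other interrupts are disabled
        return True

    def dispatch_boundary(self) -> None:
        """Called between block dispatches: emulate the captured interrupt."""
        if self.pending is not None:
            vector, self.pending = self.pending, None
            self.delivered.append(vector)
            self.interrupts_enabled = True


cpu = CpuInterruptState()
cpu.hardware_interrupt(32)   # captured, not yet delivered
cpu.hardware_interrupt(33)   # masked while one is pending
cpu.dispatch_boundary()      # vector 32 is emulated here
```

Delivering only at block boundaries means the handler always sees a consistent native state, at the cost of some added interrupt latency.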

HOW DOES IT STACK UP? How did--? This… you… What are you?

I’ve always found hardware to be more reliable Test System: Dell Optiplex 980 ◦ 8 GB RAM ◦ 4x Intel Core i7s at 2.8 GHz, no hyperthreading 2 Clients: ◦ Null Client ◦ Instruction Count Filebench

I’m the best at what I do?

There’s a whole new master of magnetism in town!

I know everything. I can’t help it.

With great power… 4 Goals for kernel DBT framework: ◦ Full coverage of kernel code. ◦ No direct overhead for user level code. ◦ Preserve original concurrency and execution interleaving. ◦ Be transparent.

I’ll be there…around every corner Full coverage of kernel code. Preserve original concurrency and execution interleaving.

Fastest man alive with a limp No direct overhead for user level code. ◦ Increased cache and TLB misses.

The cosmic rays…what did they do to us? Be transparent. ◦ No code cache consistency. ◦ Shadow descriptor tables readable via hardware registers. ◦ Page table inconsistencies. ◦ CPU-private data.

…comes great responsibility. 4 Goals for kernel DBT framework: ◦ Full coverage of kernel code. ◦ No direct overhead for user level code. ◦ Preserve original concurrency and execution interleaving. ◦ Be transparent.

DRK APPLICATIONS This was the world that I had created.

DRK’s Shadow Memory Storing metadata about memory used. Ported UMBRA. ◦ Simple indirect mapping. ◦ Copy-on-write. ◦ 10x overhead vs. native.
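The indirect mapping with copy-on-write can be sketched as follows. The one-shadow-byte-per-application-byte ratio, the 4 KB page size, and all names are assumptions made for illustration; this is not the Umbra or DRK implementation.

```python
PAGE_SIZE = 4096


class ShadowMemory:
    """Maps application addresses to per-byte metadata through a page table.
    Untouched pages all share one immutable default page; a private copy
    is made only on the first write (copy-on-write)."""

    def __init__(self, default: int = 0):
        self._default_page = bytes([default]) * PAGE_SIZE
        self._pages = {}  # page number -> bytearray (private shadow page)

    def read(self, addr: int) -> int:
        page = self._pages.get(addr // PAGE_SIZE, self._default_page)
        return page[addr % PAGE_SIZE]

    def write(self, addr: int, value: int) -> None:
        page_number = addr // PAGE_SIZE
        page = self._pages.get(page_number)
        if page is None:
            page = bytearray(self._default_page)  # copy on first write
            self._pages[page_number] = page
        page[addr % PAGE_SIZE] = value

    def private_pages(self) -> int:
        return len(self._pages)


shadow = ShadowMemory()
shadow.write(0x1234, 1)      # allocates exactly one private shadow page
value = shadow.read(0x1234)
```

Reads of never-written regions cost no extra memory, which keeps the shadow footprint proportional to the metadata actually stored.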

KAddrcheck Memory addressability checking tool. Scans slab allocator’s data structures to locate all pages and freelists. ◦ Triggers shadow memory allocations. Addressability checks run on every memory access.
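An addressability check of this kind can be sketched with a per-byte set standing in for the shadow metadata. The callback names (`on_alloc`, `on_free`, `check_access`) are hypothetical and do not correspond to real DRK client hooks.

```python
class AddrChecker:
    """Tracks which bytes are currently addressable and checks each access."""

    def __init__(self):
        self._addressable = set()
        self.errors = []  # (addr, size) of each faulty access

    def on_alloc(self, addr: int, size: int) -> None:
        # Allocation makes the object's bytes addressable.
        self._addressable.update(range(addr, addr + size))

    def on_free(self, addr: int, size: int) -> None:
        # Freed bytes become unaddressable again.
        self._addressable.difference_update(range(addr, addr + size))

    def check_access(self, addr: int, size: int) -> bool:
        """Run on every instrumented load or store."""
        ok = all(a in self._addressable for a in range(addr, addr + size))
        if not ok:
            self.errors.append((addr, size))
        return ok


checker = AddrChecker()
checker.on_alloc(0x1000, 64)
checker.check_access(0x1000, 8)   # inside the object
checker.check_access(0x103c, 8)   # runs 4 bytes past the end
```

In a real tool the set would be replaced by shadow memory so that the per-access check stays cheap.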

Stackcheck Stack overflow guard. Modified KAddrcheck. ◦ Checks for addressability errors. ◦ Kills calling thread and continues. Resolves overflow without system crash.

Triumph! DRK is a kernel-level DBT. DynamoRIO “port”. Heavy implementation. Missing a number of features.