Practical Timing Side Channel Attacks Against Kernel Space ASLR

Slides:

Advertisements

Similar presentations

1 Lecture 13: Cache and Virtual Memroy Review Cache optimization approaches, cache miss classification, Adapted from UCB CS252 S01.

Advertisements

Computer Organization and Architecture

Computer Organization CS224 Fall 2012 Lesson 44. Virtual Memory  Use main memory as a “cache” for secondary (disk) storage l Managed jointly by CPU hardware.

Lecture 34: Chapter 5 Today’s topic –Virtual Memories 1.

CSIE30300 Computer Architecture Unit 10: Virtual Memory Hsin-Chou Chi [Adapted from material by and

1 COVERT CHANNELS Ravi Sandhu. 2 COVERT CHANNELS A covert channel is a communication channel based on the use of system resources not normally intended.

The Memory Hierarchy (Lectures #24) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer Organization.

Memory Management (II)

S.1 Review: The Memory Hierarchy Increasing distance from the processor in access time L1$ L2$ Main Memory Secondary Memory Processor (Relative) size of.

OS Fall ’ 02 Introduction Operating Systems Fall 2002.

Memory Management 2010.

OS Spring’03 Introduction Operating Systems Spring 2003.

Translation Buffers (TLB’s)

Virtual Memory and Paging J. Nelson Amaral. Large Data Sets Size of address space: – 32-bit machines: 2 32 = 4 GB – 64-bit machines: 2 64 = a huge number.

Computer Organization and Architecture

Memory: Virtual MemoryCSCE430/830 Memory Hierarchy: Virtual Memory CSCE430/830 Computer Architecture Lecturer: Prof. Hong Jiang Courtesy of Yifeng Zhu.

Mem. Hier. CSE 471 Aut 011 Evolution in Memory Management Techniques In early days, single program run on the whole machine –Used all the memory available.

1  1998 Morgan Kaufmann Publishers Chapter Seven Large and Fast: Exploiting Memory Hierarchy (Part II)

OS Spring’04 Introduction Operating Systems Spring 2004.

Virtual Memory By: Dinouje Fahih. Definition of Virtual Memory Virtual memory is a concept that, allows a computer and its operating system, to use a.

Basics of Operating Systems March 4, 2001 Adapted from Operating Systems Lecture Notes, Copyright 1997 Martin C. Rinard.

Tanenbaum 8.3 See references

Review of Memory Management, Virtual Memory CS448.

Chapter 8 Memory Management Dr. Yingwu Zhu. Outline Background Basic Concepts Memory Allocation.

Computer Architecture Lecture 28 Fasih ur Rehman.

Lecture 19: Virtual Memory

Lecture 9: Memory Hierarchy Virtual Memory Kai Bu

Virtual Memory Expanding Memory Multiple Concurrent Processes.

Virtual Memory Review Goal: give illusion of a large memory Allow many processes to share single memory Strategy Break physical memory up into blocks (pages)

Caltech CS184b Winter DeHon 1 CS184b: Computer Architecture [Single Threaded Architecture: abstractions, quantification, and optimizations] Day14:

Fundamentals of Programming Languages-II Subject Code: Teaching SchemeExamination Scheme Theory: 1 Hr./WeekOnline Examination: 50 Marks Practical:

Virtual Memory Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University.

Multilevel Caches Microprocessors are getting faster and including a small high speed cache on the same chip.

Virtual Memory.  Next in memory hierarchy  Motivations:  to remove programming burdens of a small, limited amount of main memory  to allow efficient.

Full and Para Virtualization

CS2100 Computer Organisation Virtual Memory – Own reading only (AY2015/6) Semester 1.

Virtual Memory Ch. 8 & 9 Silberschatz Operating Systems Book.

Virtual Memory Review Goal: give illusion of a large memory Allow many processes to share single memory Strategy Break physical memory up into blocks (pages)

Constructive Computer Architecture Virtual Memory: From Address Translation to Demand Paging Arvind Computer Science & Artificial Intelligence Lab. Massachusetts.

Virtual Memory 1 Computer Organization II © McQuain Virtual Memory Use main memory as a “cache” for secondary (disk) storage – Managed jointly.

CS203 – Advanced Computer Architecture Virtual Memory.

Chapter 7: Main Memory CS 170, Fall Program Execution & Memory Management Program execution Swapping Contiguous Memory Allocation Paging Structure.

CS161 – Design and Architecture of Computer

Memory Management.

Memory COMPUTER ARCHITECTURE

CS161 – Design and Architecture of Computer

CS352H: Computer Systems Architecture

From Address Translation to Demand Paging

William Stallings Computer Organization and Architecture 8th Edition

Section 9: Virtual Memory (VM)

Today How was the midterm review? Lab4 due today.

Virtual Memory Use main memory as a “cache” for secondary (disk) storage Managed jointly by CPU hardware and the operating system (OS) Programs share main.

Bruhadeshwar Meltdown Bruhadeshwar

Lecture 14 Virtual Memory and the Alpha Memory Hierarchy

CMSC 611: Advanced Computer Architecture

Morgan Kaufmann Publishers Memory Hierarchy: Virtual Memory

CSE 451: Operating Systems Autumn 2005 Memory Management

Translation Buffers (TLB’s)

CSE 451: Operating Systems Autumn 2003 Lecture 10 Paging & TLBs

CSE451 Virtual Memory Paging Autumn 2002

Virtual Memory Prof. Eric Rotenberg

CSE 451: Operating Systems Autumn 2003 Lecture 9 Memory Management

Translation Buffers (TLB’s)

CSE 451: Operating Systems Autumn 2003 Lecture 10 Paging & TLBs

CSE 451: Operating Systems Autumn 2003 Lecture 9 Memory Management

Virtual Memory Lecture notes from MKP and S. Yalamanchili.

Translation Buffers (TLBs)

Virtual Memory Use main memory as a “cache” for secondary (disk) storage Managed jointly by CPU hardware and the operating system (OS) Programs share main.

Review What are the advantages/disadvantages of pages versus segments?

Virtual Memory 1 1.

Presentation transcript:

Practical Timing Side Channel Attacks Against Kernel Space ASLR Fall 2014 Instructor: Kun Sun, Ph.D. A little background about myself.

Outline Background Side Channel Attack Kernel Space ASLR Use timing-based side channel to defeat kernel space ASLR (A heavy engineering work) How to prevent/mitigate this side channel attack?

Outline Background Side Channel Attack Kernel Space ASLR Use timing-based side channel to defeat kernel space ASLR How to prevent/mitigate this side channel attack?

Side Channel Side channel vs. Covert channel Side channel: incidentally, unintentionally Covert channel: colluding, intentionally A side channel is a communication channel based on the use of system resources not normally intended for communication between the subjects (processes) in the system

Side Channel Types Computer systems abound with side channels Storage Channels use system variables and attributes (other than time) to signal information classic example is resource exhaustion channel Timing Channels vary the amount of time required to complete a task to signal information classic example is load sensing channel Side channels are typically noisy but information theory techniques can be used to achieve error- free communication

Resource Exhaustion Channel Given 5MB pool of dynamically allocated Memory High Process bit = 1  request 5MB of memory bit = 0  request 0MB of memory Low Process request 5MB of memory if allocated then bit = 0 otherwise bit = 1

Load Sensing Channel High Process Low Process bit = 1  enter computation intensive loop bit = 0  go to sleep Low Process perform a task with known computational requirements if completed quickly then bit = 0 otherwise bit =1

Mitigating Side Channel Audit for storage-based Time normalization for timing-based Slow it down, reduce the transmission date rate

Outline Background Side Channel Attack Kernel Space ASLR Use timing-based side channel to defeat kernel space ASLR How to prevent/mitigate this side channel attack?

ASLR Deployment Windows user space and kernel space ASLR since vista. Pax patches in Linux (not in kernel) MacOS since 10.5 Mobile OSes including Android and iOS

User Space ASLR vs. Kernel Space ASLR Shacham et al. showed that user mode ASLR on 32-bit architectures only leaves 16 bit of randomness, which is not enough to defeat brute-force attacks. However, such brute-force attacks are not applicable for kernel space ASLR. An attacker wants to exploit a vulnerability in kernel code, a wrong offset will cause a complete crash of the system and thus an attacker has only one attempt to perform an exploit.

Windows Kernel Space ASLR For both 32-bit and 64 bit Vista, Windows 7, and Windows 8. Obtain the info by reverse-engineering During boot process, Windows loader allocates kernel_region for the kernel image and the hardware abstraction layer (HAL) The entropy is 6 bit, 32*2=64=26

Outline Background Side Channel Attack Kernel Space ASLR Use timing-based side channel to defeat kernel space ASLR How to prevent/mitigate this side channel attack?

Timing-based Side Channel Attack Target at how a local attacker with restricted rights can mount a timing-based side channel attack against the memory management system to deduce information about the privileged kernel space ASLR.

Attack Model System is protected by ASLR and Local adversary can run user-space applications, but cannot execute privileged kernel mode code.

Challenges on Exploiting the Attack Timing attacks against a modern CPU are complicated. Performance optimizations including hardware prefetching, speculative execution, multi-core architectures, or branch prediction. Target on modern CPU and new OS Previous work focused on older processors without such optimization. Instruction Set Architecture: Defined as part of the processor architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external IO. Kernel related registers: Task register (TR), Interrupt Descriptor Table Register (IDTR), Global Descriptor Table Register (GDTR), Local Descriptor Table Register (LDTR)

TLB and Cache Translation lookaside buffer: convert virtual address to physical address Virtually indexed, physically tagged http://www.cs.princeton.edu/courses/archive/fall04/cos471/lectures/19-VirtualMemory.pdf

Intel i7 Memory Architecture plus Clock Latency For Physical address extension (PAE) and 64 bit mode: page map level 4 (PML4) and Page directory point (PDP)

General Approach Two steps: Set the system in a specific state from user mode. Measure the duration of a certain memory access operation. The time span of this operation then (possibly) reveals certain information about the kernel space layout. One example: Cache a system call handler by executing sys call. Access a designated set of user space addresses and execute the system call again. If the system call takes longer than expected, the access of user space addresses evicted the system call handler code and reveals address info of system code.

Handling Noise Circumvent performance optimizations E.g., test memory in non-consecutive order to circumvent memory prefetching. OS Scheduler, shared cache Two steps requires 5000 clock cycles OS time slides several ms Guarantee to find cases with No Interrupts between two steps.

Three Attacks Attacks on L1/L2/L3 caches and TLB/PS caches Cache probing Double page fault Cache preloading

Cache Probing Based on multiple memory addresses mapped into the same cache set and compete for available slots n-way set associative mode. all available slots are grouped into sets of the size n and each memory chunk can be stored in all slots of one particular set. E.g., in 32-bit address, a typical L3 cache of 8 MB is 16-way set associative. It consists of (8, 192∗1, 024)/64 = 131, 072 single slots that are grouped into 131, 072/16 = 8, 192 different sets. Reverse-engineering Sandybridge Hash Function for L3 parallel access among multiple cores

Double Page Fault Based on the observation on Intel CPUs When the page translation was successful, but the access permission check fails (e.g., when kernel space is accessed from user mode), a TLB entry is indeed created. Do not work on AMD CPUs Can be easily fixed by hardware redesign. The double page fault timings yield an estimation for the allocation map of the kernel space, but do not determine at which concrete base addresses the kernel and drivers are loaded to.

Double Page Fault First page fault: for each kernel space page p, access p from user mode and a page fault will be handled by the operating system and forwarded to the exception handler of the process. When p refers to an allocated page, the MMU creates a TLB entry for p although the succeeding permission check fails. When p refers to an unallocated page, since the translation fails, the MMU does not create a TLB entry for p. Second page fault: accessing p again, if p refers to an allocated kernel page, the page fault will be delivered faster due to the inherent TLB hit.

Cache Preloading Double page fault estimates the allocation map of the kernel space, but do not determine at which concrete base addresses the kernel and drivers are loaded to. 3 steps in Cache Preloading: flush all caches (i.e., address translation and instruction/data caches) to start with a clean state. preload the address translation caches by indirectly calling into kernel code, for example by issuing a sysenter operation. Intentionally generate a page fault by jumping to some kernel space address and measure the time. If the faulting address lies in the same memory range as the preloaded kernel memory  a shorter time

Testbed Tested on 32-/64-bit systems running Windows 7/Linux Tested on different CPUs + VM: Intel i7-870 (Nehalem/Bloomfield, Quad-Core) Intel i7-950 (Nehalem/Lynnfield, Quad-Core) Intel i7-2600 (Sandybridge, Quad-Core) AMD Athlon II X3 455 (Triple-Core) VMWare Player 4.0.2 on Intel i7-870 (with VT- x)

Results

Outline Background Side Channel Attack Kernel Space ASLR Use timing-based side channel to defeat kernel space ASLR (A heavy engineering work) How to prevent/mitigate this side channel attack?

Mitigation Approaches Root cause: memory hierarchy is a shared resource Also, no enough randomization: 64 different slots for windows kernel Option one: split caches for user and kernel. Shortcoming: checking overhead, maximum cache size reduced Option two: suppress the creation of TLB entries on successful address translation if an access violation happens Option three: Timing normalization; modify the execution time of the page fault handler

Future works Test MacOS Test Android Obtain the physical address of a certain memory location from user mode.

Questions?