Practical Timing Side Channel Attacks Against Kernel Space ASLR Fall 2014 Instructor: Kun Sun, Ph.D.


1 Practical Timing Side Channel Attacks Against Kernel Space ASLR Fall 2014 Instructor: Kun Sun, Ph.D.

2 Outline
- Background
- Side Channel Attack
- Kernel Space ASLR
- Use a timing-based side channel to defeat kernel space ASLR (heavy engineering work)
- How to prevent/mitigate this side channel attack?

3 Outline
- Background
- Side Channel Attack
- Kernel Space ASLR
- Use a timing-based side channel to defeat kernel space ASLR
- How to prevent/mitigate this side channel attack?

4 Side Channel
- Side channel vs. covert channel
  - Side channel: information leaks incidentally and unintentionally
  - Covert channel: sender and receiver collude and communicate intentionally
- A side channel is a communication channel based on the use of system resources not normally intended for communication between the subjects (processes) in the system

5 Side Channel Types
- Computer systems abound with side channels
- Storage channels
  - Use system variables and attributes (other than time) to signal information
  - Classic example: the resource exhaustion channel
- Timing channels
  - Vary the amount of time required to complete a task to signal information
  - Classic example: the load sensing channel
- Side channels are typically noisy, but information theory techniques can be used to achieve error-free communication

6 Resource Exhaustion Channel
- Given: a 5 MB pool of dynamically allocated memory
- High process
  - bit = 1: request 5 MB of memory
  - bit = 0: request 0 MB of memory
- Low process
  - Request 5 MB of memory
  - If allocated, then bit = 0; otherwise bit = 1
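A minimal Python sketch of the exchange on this slide, using a toy in-process pool instead of a real allocator. The `MemoryPool` class and the `send_bit`, `receive_bit`, and `transmit` helpers are hypothetical names introduced only for illustration.

```python
class MemoryPool:
    """Toy shared pool; all sizes are in MB."""

    def __init__(self, capacity_mb=5):
        self.capacity_mb = capacity_mb
        self.used_mb = 0

    def request(self, size_mb):
        """Reserve the memory and return True if it fits, else return False."""
        if self.used_mb + size_mb <= self.capacity_mb:
            self.used_mb += size_mb
            return True
        return False

    def release(self, size_mb):
        self.used_mb = max(0, self.used_mb - size_mb)


def send_bit(pool, bit):
    """High process: exhaust the pool to signal 1, request nothing to signal 0."""
    size = pool.capacity_mb if bit == 1 else 0
    pool.request(size)
    return size


def receive_bit(pool):
    """Low process: a successful full-size allocation means the sender sent 0."""
    if pool.request(pool.capacity_mb):
        pool.release(pool.capacity_mb)
        return 0
    return 1


def transmit(bits):
    """Send each bit through the pool and collect what the low process reads."""
    pool = MemoryPool()
    received = []
    for bit in bits:
        held = send_bit(pool, bit)
        received.append(receive_bit(pool))
        pool.release(held)  # the sender frees its memory before the next bit
    return received


if __name__ == "__main__":
    print(transmit([1, 0, 1, 1, 0]))
```

The sketch assumes the two processes take strict turns; a real channel would also need some form of synchronization between sender and receiver.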

7 Load Sensing Channel
- High process
  - bit = 1: enter a computation-intensive loop
  - bit = 0: go to sleep
- Low process
  - Perform a task with known computational requirements
  - If completed quickly, then bit = 0; otherwise bit = 1
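The same idea can be sketched deterministically in Python. The model below assumes, purely for illustration, that a busy high process halves the CPU share of the low process; `task_duration`, `receive_bit`, and `transmit` are hypothetical helper names, and the 1.5x threshold is an arbitrary choice between the two modelled durations.

```python
def task_duration(work_units, high_is_busy):
    """Time the low process needs, assuming a fair 50/50 CPU split under load."""
    cpu_share = 0.5 if high_is_busy else 1.0
    return work_units / cpu_share


def receive_bit(duration, work_units):
    """Low process: a task that ran noticeably longer than expected reads as 1."""
    threshold = 1.5 * work_units
    return 1 if duration > threshold else 0


def transmit(bits, work_units=100):
    """High process is busy for 1 and asleep for 0; low process times its task."""
    return [
        receive_bit(task_duration(work_units, bit == 1), work_units)
        for bit in bits
    ]


if __name__ == "__main__":
    print(transmit([0, 1, 1, 0]))
```

On real hardware the durations are noisy, so the threshold would have to be calibrated rather than fixed.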

8 Mitigating Side Channels
- Auditing for storage-based channels
- Time normalization for timing-based channels
- Slow the channel down to reduce the transmission data rate

9 Outline
- Background
- Side Channel Attack
- Kernel Space ASLR
- Use a timing-based side channel to defeat kernel space ASLR
- How to prevent/mitigate this side channel attack?

10 ASLR Deployment
- Windows: user space and kernel space ASLR since Vista
- Linux: PaX patches (not in the mainline kernel)
- Mac OS X since 10.5
- Mobile OSes, including Android and iOS

11 User Space ASLR vs. Kernel Space ASLR
- Shacham et al. showed that user mode ASLR on 32-bit architectures leaves only 16 bits of randomness, which is not enough to withstand brute-force attacks.
- However, such brute-force attacks are not applicable to kernel space ASLR.
- When an attacker tries to exploit a vulnerability in kernel code, a wrong offset causes a complete crash of the system, so the attacker has only one attempt to perform the exploit.

12 Windows Kernel Space ASLR
- Applies to both 32-bit and 64-bit Vista, Windows 7, and Windows 8
- The details were obtained by reverse engineering
- During the boot process, the Windows loader allocates kernel_region for the kernel image and the hardware abstraction layer (HAL)
- The entropy is 6 bits: 32 * 2 = 64 = 2^6
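The entropy arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes all 64 layouts are equally likely, and `entropy_bits` and `single_guess_success` are hypothetical helper names.

```python
import math

KERNEL_LAYOUTS = 32 * 2  # figure from the slide: 64 possible layouts


def entropy_bits(num_layouts):
    """Bits of entropy when every layout is equally likely (assumption)."""
    return math.log2(num_layouts)


def single_guess_success(num_layouts):
    """Chance that one blind guess hits the right layout."""
    return 1 / num_layouts


if __name__ == "__main__":
    print(entropy_bits(KERNEL_LAYOUTS))
    print(single_guess_success(KERNEL_LAYOUTS))
```

Because a wrong guess crashes the system, the single-guess success rate of 1/64 is the relevant figure for a blind attacker, which is why a side channel that narrows the candidates matters.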

13 Outline
- Background
- Side Channel Attack
- Kernel Space ASLR
- Use a timing-based side channel to defeat kernel space ASLR
- How to prevent/mitigate this side channel attack?

14 Timing-based Side Channel Attack
- Goal: show how a local attacker with restricted rights can mount a timing-based side channel attack against the memory management system to deduce information about the layout of the privileged, ASLR-protected kernel space.

15 Attack Model
- The system is protected by ASLR, and
- the local adversary can run user-space applications but cannot execute privileged kernel-mode code.

16 Challenges in Exploiting the Attack
- Timing attacks against a modern CPU are complicated
  - Performance optimizations include hardware prefetching, speculative execution, multi-core architectures, and branch prediction
- The attack targets modern CPUs and recent OSes
  - Previous work focused on older processors without such optimizations

17 TLB and Cache
- Translation lookaside buffer (TLB): caches translations from virtual addresses to physical addresses
- Virtually indexed, physically tagged

18 Intel i7 Memory Architecture plus Clock Latency

19 General Approach
- Two steps:
  1. Set the system in a specific state from user mode.
  2. Measure the duration of a certain memory access operation. The time span of this operation then (possibly) reveals certain information about the kernel space layout.
- One example:
  - Cache a system call handler by executing the system call.
  - Access a designated set of user space addresses, then execute the system call again.
  - If the system call takes longer than expected, the accesses to the user space addresses evicted the system call handler code, which reveals address information about the system code.
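The example on this slide can be mimicked with a toy cache model in Python. This is a simulation sketch, not attack code: the cache is tiny (8 sets, 2 ways, LRU replacement), the set index is a plain modulo, the cycle counts are made up, and `ToyCache` and `find_handler_set` are hypothetical names.

```python
from collections import OrderedDict

LINE_SIZE = 64
NUM_SETS = 8
WAYS = 2
HIT_CYCLES, MISS_CYCLES = 4, 200  # illustrative values, not measurements


class ToyCache:
    """Tiny set-associative cache with LRU replacement."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]

    def access(self, addr):
        """Touch addr and return the simulated access time in cycles."""
        line = addr // LINE_SIZE
        cache_set = self.sets[line % NUM_SETS]
        if line in cache_set:
            cache_set.move_to_end(line)  # mark as most recently used
            return HIT_CYCLES
        if len(cache_set) >= WAYS:
            cache_set.popitem(last=False)  # evict the least recently used line
        cache_set[line] = True
        return MISS_CYCLES


def find_handler_set(handler_addr):
    """Return the cache set whose eviction slows down the simulated system call."""
    cache = ToyCache()
    for candidate in range(NUM_SETS):
        cache.access(handler_addr)  # step 1: "system call" caches the handler
        for way in range(WAYS):  # fill the candidate set with user-space lines
            user_addr = (candidate + (way + 1) * NUM_SETS) * LINE_SIZE
            cache.access(user_addr)
        if cache.access(handler_addr) == MISS_CYCLES:  # step 2: time it again
            return candidate
    return None


if __name__ == "__main__":
    print(find_handler_set(0x80000000 + 5 * LINE_SIZE))
```

The attacker never reads the handler address directly; only the hit or miss timing of the second "system call" reveals which set the handler occupies.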

20 Handling Noise
- Circumvent performance optimizations
  - E.g., test memory in non-consecutive order to circumvent memory prefetching
- OS scheduler, shared cache
  - The two steps require 5,000 clock cycles
  - An OS time slice lasts several ms
  - So some measurements are guaranteed to have no interrupt between the two steps
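One common way to exploit the interrupt-free runs is to repeat the measurement and keep the minimum, and to probe pages in shuffled order. The Python sketch below illustrates both under stated assumptions: the noise model (a fixed penalty with a fixed probability) is invented, taking the minimum is an assumed filtering strategy rather than one stated on the slide, and `make_noisy_timer`, `measure_min`, and `probe_order` are hypothetical names.

```python
import random


def make_noisy_timer(true_cycles, rng, interrupt_prob=0.3, penalty=50_000):
    """Build a fake timer: the true cost plus an occasional interrupt penalty."""

    def measure_once():
        interrupted = rng.random() < interrupt_prob
        return true_cycles + (penalty if interrupted else 0)

    return measure_once


def measure_min(measure_once, repetitions=50):
    """Keep the fastest run; interrupted runs can only be slower."""
    return min(measure_once() for _ in range(repetitions))


def probe_order(num_pages, seed=0):
    """Return page indices in shuffled order to avoid sequential access patterns."""
    order = list(range(num_pages))
    random.Random(seed).shuffle(order)
    return order


if __name__ == "__main__":
    timer = make_noisy_timer(5000, random.Random(1))
    print(measure_min(timer))
    print(probe_order(8))
```

The minimum works here because the simulated noise only ever adds time; noise that can shorten a measurement would call for a different statistic.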

21 Three Attacks
- Attacks on L1/L2/L3 caches and TLB/PS caches
  - Cache probing
  - Double page fault
  - Cache preloading

22 Cache Probing
- Based on multiple memory addresses that map to the same cache set and compete for the available slots
- n-way set associative mode
  - All available slots are grouped into sets of size n, and each memory chunk can be stored in any slot of one particular set
  - E.g., with 32-bit addresses, a typical 8 MB L3 cache is 16-way set associative. With 64-byte cache lines, it consists of (8,192 * 1,024) / 64 = 131,072 single slots, grouped into 131,072 / 16 = 8,192 different sets
- Required reverse-engineering the Sandy Bridge hash function for parallel L3 access among multiple cores
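The cache arithmetic on this slide can be reproduced in Python. The `set_index` helper assumes a plain modulo mapping from cache line to set, which is a simplification: as the slide notes, Sandy Bridge applies a hash function for its L3 cache. Both function names are hypothetical.

```python
def cache_geometry(size_bytes, line_bytes, ways):
    """Return (number of slots, number of sets) for a set-associative cache."""
    slots = size_bytes // line_bytes
    sets = slots // ways
    return slots, sets


def set_index(phys_addr, line_bytes, num_sets):
    """Set an address maps to, assuming simple modulo indexing (no hashing)."""
    return (phys_addr // line_bytes) % num_sets


if __name__ == "__main__":
    slots, sets = cache_geometry(8192 * 1024, 64, 16)
    print(slots, sets)
    # Two addresses that are sets * line_bytes apart compete for the same set.
    print(set_index(0x12345, 64, sets), set_index(0x12345 + sets * 64, 64, sets))
```

Under the modulo assumption, addresses spaced 8,192 * 64 bytes (512 KB) apart fall into the same set, which is what lets an attacker build a group of addresses that evict a chosen set.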

23 Double Page Fault
- Based on an observation about Intel CPUs
  - When the page translation succeeds but the access permission check fails (e.g., when kernel space is accessed from user mode), a TLB entry is still created
- Does not work on AMD CPUs
- Can be easily fixed by a hardware redesign

24 Double Page Fault
- First page fault: for each kernel space page p, access p from user mode. The resulting page fault is handled by the operating system and forwarded to the exception handler of the process.
  - When p refers to an allocated page, the MMU creates a TLB entry for p although the subsequent permission check fails.
  - When p refers to an unallocated page, the translation fails, so the MMU does not create a TLB entry for p.
- Second page fault: access p again. If p refers to an allocated kernel page, the page fault is delivered faster because of the TLB hit.
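The two-fault procedure can be simulated in Python with a toy MMU. This is a model of the behaviour described on the slide, not real attack code: the cycle counts and threshold are invented, and `ToyMMU` and `probe_allocation` are hypothetical names.

```python
TLB_HIT_FAULT_CYCLES = 2000   # illustrative: fault delivered after a TLB hit
TLB_MISS_FAULT_CYCLES = 2300  # illustrative: fault delivered after a page walk


class ToyMMU:
    """Models an MMU that caches translations even when the access is denied."""

    def __init__(self, mapped_pages):
        self.mapped = set(mapped_pages)
        self.tlb = set()

    def user_access_kernel_page(self, page):
        """Every access faults; return the simulated cycles until delivery."""
        if page in self.tlb:
            return TLB_HIT_FAULT_CYCLES
        if page in self.mapped:
            self.tlb.add(page)  # translation succeeded, so an entry is created
        return TLB_MISS_FAULT_CYCLES


def probe_allocation(mmu, pages, threshold):
    """Fault twice on each page; a fast second fault marks the page allocated."""
    allocated = []
    for page in pages:
        mmu.user_access_kernel_page(page)           # first fault fills the TLB
        second = mmu.user_access_kernel_page(page)  # second fault is timed
        if second < threshold:
            allocated.append(page)
    return allocated


if __name__ == "__main__":
    mmu = ToyMMU({2, 3, 7})
    print(probe_allocation(mmu, range(10), threshold=2150))
```

The threshold sits between the two simulated timings; on real hardware it would have to be calibrated from noisy measurements.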

25 Cache Preloading
- The double page fault attack estimates the allocation map of the kernel space but does not determine the concrete base addresses at which the kernel and drivers are loaded.
- Three steps in cache preloading:
  1. Flush all caches (i.e., address translation and instruction/data caches) to start with a clean state.
  2. Preload the address translation caches by indirectly calling into kernel code, for example by issuing a sysenter operation.
  3. Intentionally generate a page fault by jumping to some kernel space address and measure the time. If the faulting address lies in the same memory range as the preloaded kernel memory, the fault takes a shorter time.
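The three steps can be walked through with a small Python simulation. Everything here is a modelling assumption: the 2 MB `REGION` granularity, the cycle counts, and the `ToyTranslationCache` and `locate_handler_region` names are hypothetical and chosen only to make the flush, preload, and timed-fault sequence concrete.

```python
REGION = 2 * 1024 * 1024       # assumed granularity of the translation cache
FAST_FAULT, SLOW_FAULT = 2000, 2300  # illustrative cycle counts


class ToyTranslationCache:
    """Models address translation caches that a system call preloads."""

    def __init__(self, syscall_handler_addr):
        self._handler = syscall_handler_addr  # hidden from the attacker
        self.cached_regions = set()

    def flush(self):
        """Step 1: start from a clean state."""
        self.cached_regions.clear()

    def syscall(self):
        """Step 2: calling into the kernel caches the handler's region."""
        self.cached_regions.add(self._handler // REGION)

    def fault_time(self, addr):
        """Step 3: faults inside a preloaded region are delivered faster."""
        return FAST_FAULT if addr // REGION in self.cached_regions else SLOW_FAULT


def locate_handler_region(cpu, candidates):
    """Return the first candidate address whose fault is fast, else None."""
    for addr in candidates:
        cpu.flush()
        cpu.syscall()
        if cpu.fault_time(addr) == FAST_FAULT:
            return addr
    return None


if __name__ == "__main__":
    cpu = ToyTranslationCache(0x82000000 + 0x1234)
    found = locate_handler_region(cpu, range(0x80000000, 0x84000000, REGION))
    print(hex(found))
```

The attacker only calls `flush`, `syscall`, and `fault_time`; the handler address itself is recovered from timing alone.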

26 Testbed
- Tested on 32-/64-bit systems running Windows 7/Linux
- Tested on different CPUs plus a VM:
  - Intel i7-870 (Nehalem/Bloomfield, Quad-Core)
  - Intel i7-950 (Nehalem/Lynnfield, Quad-Core)
  - Intel i (Sandybridge, Quad-Core)
  - AMD Athlon II X3 455 (Triple-Core)
  - VMware Player on Intel i7-870 (with VT-x)

27 Results

28 Outline
- Background
- Side Channel Attack
- Kernel Space ASLR
- Use a timing-based side channel to defeat kernel space ASLR (heavy engineering work)
- How to prevent/mitigate this side channel attack?

29 Mitigation Approaches
- Root cause: the memory hierarchy is a shared resource
  - Also, not enough randomization: only 64 different slots for the Windows kernel
- Option one: split caches for user and kernel
  - Shortcomings: checking overhead, reduced maximum cache size
- Option two: suppress the creation of TLB entries on successful address translation if an access violation happens
- Option three: timing normalization, i.e., modify the execution time of the page fault handler
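Option three can be illustrated with a short Python sketch. It assumes the simplest possible form of normalization, padding every fault delivery up to a fixed budget; the cycle counts, the budget, and the `normalized_fault_time` and `distinguishable` names are hypothetical.

```python
FAST_FAULT, SLOW_FAULT = 2000, 2300  # illustrative cycle counts, not measured


def normalized_fault_time(raw_cycles, budget_cycles=3000):
    """Pad delivery up to a fixed budget; slower faults pass through unchanged."""
    return max(raw_cycles, budget_cycles)


def distinguishable(times):
    """An attacker learns something only if the observed times differ."""
    return len(set(times)) > 1


if __name__ == "__main__":
    raw = [FAST_FAULT, SLOW_FAULT]
    padded = [normalized_fault_time(t) for t in raw]
    print(distinguishable(raw), distinguishable(padded))
```

The budget has to exceed the slowest legitimate fault path, so this kind of padding trades page fault latency for the removal of the timing difference.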

30 Future Work
- Test MacOS
- Test Android
- Obtain the physical address of a certain memory location from user mode

31 Questions?

