1 Ally: OS-Transparent Packet Inspection Using Sequestered Cores Jen-Cheng Huang 1, Matteo Monchiero 2, Yoshio Turner 3, Hsien-Hsin Lee 1 1 Georgia Tech.

Presentation transcript:

1 Ally: OS-Transparent Packet Inspection Using Sequestered Cores Jen-Cheng Huang (Georgia Tech), Matteo Monchiero (Intel Labs), Yoshio Turner (HP Labs), Hsien-Hsin Lee (Georgia Tech)

2 Deep Packet Inspection (DPI) Data Center Middle Boxes Intrusion Detection Content Insertion Traffic Classification Internet Deployment of Packet Processing Services

3 Problem Internet Data Center Middle Boxes Local traffic is growing in importance, but the traffic within the data center is not inspected!

4 Approach “Co-locate” DPI with the server DPI appliance Server Leverage abundant CPU resources Leverage existing management interfaces on servers, e.g. HP iLO Compatible with heterogeneous architecture, e.g. on-chip accelerators

5 Requirements Transparency –Independent of the server’s software stack Efficiency –Low-overhead packet interception Isolation –Resistant to attacks

6 Related Work Virtualization Support for DPI deployment Virtualized Platform: Hypervisor, DPI VM, Guest VM (HW, SW), Processors Issues: Transparency, Hypervisor Overhead, Hypervisor Vulnerability Reference: ETTM: a scalable fault tolerant network manager. C. Dixon et al. NSDI ‘11

7 NIC Unprivileged Partition Multi-core processor core Privileged Partition Ally Architecture Software Stack (OS + Applications) Software Stack (DPI Application) core NIC Traffic

8 Outline Introduction & Motivation Architecture –Overview –Multicore Partitioning –Packet interception Evaluation Conclusions

9 Northbridge MMU Service Processor NIC Memory Controller Interconnect IOMMU Interrupt Unit BIOS External Network Main Memory Core Interrupt Controller MMU Core Interrupt Controller MMU Last Level Cache Management Network Baseline Architecture

10 Northbridge MMU Service Processor NIC Memory Controller Interconnect IOMMU Interrupt Unit BIOS External Network Main Memory Core Interrupt Controller MMU Core Interrupt Controller MMU Last Level Cache Management Network Ally Architecture Unprivileged partition Privileged partition

11 Outline Introduction & Motivation Architecture –Overview –Multicore Partitioning –Packet interception Evaluation Conclusions

12 Multicore Partitioning NIC Unprivileged Partition Multi-core processor core Privileged Partition Software Stack (OS + Applications) Software Stack (DPI Application) core Invisible

13 Core Sequestration Modify the BIOS to hide privileged core information from the OS. Terms: BSP core (the first core that boots), AP cores (the other cores), IPI (inter-processor interrupt). Ally Booting Procedure (diagram labels): BSP, AP, Core Info Table, Update, Wakeup IPI, OS retrieves core information, AP, Initialize, DPI Engine, DPI core waits for IN/OUT packets
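The core-hiding step on the slide above can be illustrated with a minimal sketch. It assumes the firmware builds a core info table and simply leaves out the sequestered cores before the OS reads it. The names `CoreEntry`, `build_os_core_table` and `sequestered_ids` are hypothetical and do not come from the slides, and the rule that the BSP stays visible is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreEntry:
    apic_id: int   # hypothetical identifier for a core
    is_bsp: bool   # True for the bootstrap processor (first core that boots)


def build_os_core_table(all_cores, sequestered_ids):
    """Return the core info table that the OS is allowed to see.

    Cores listed in sequestered_ids are left out, so the OS never
    learns that they exist and never sends them a wakeup IPI.
    """
    visible = [c for c in all_cores if c.apic_id not in sequestered_ids]
    # Assumption for this sketch: the BSP always stays with the OS.
    if not any(c.is_bsp for c in visible):
        raise ValueError("the BSP must remain visible to the OS")
    return visible


cores = [CoreEntry(0, True), CoreEntry(1, False),
         CoreEntry(2, False), CoreEntry(3, False)]
os_table = build_os_core_table(cores, sequestered_ids={3})
```

In this sketch core 3 plays the role of the DPI core: it is absent from `os_table`, while the firmware still knows about it and can start the DPI engine on it.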

14 Memory Protection TLB TLB Miss Handler CR3 Boundary Register Page Table Range Checking Main Memory Privileged partition Unprivileged partition MMU Unprivileged Core Partition the memory into two physically contiguous regions TLB Miss TLB Fill

15 Outline Introduction & Motivation Architecture –Overview –Multicore Partitioning –Packet interception Evaluation Conclusions

16 NIC Unprivileged Partition Multi-core processor core Privileged Partition Packet Interception Software Stack (OS + Applications) Software Stack (DPI Application) core NIC Traffic

17 Packet Interception Virtualization of the Descriptor Queues NIC OS memory Descriptor queues replicated DPI memory Only one copy of the packet buffers Descriptor queues

18 Packet Interception Virtualization of the Descriptor Queues –Device independent, software independent –No copying of packet buffers Processor and NIC communication –Queue manipulation uses Memory Mapped IO (MMIO) accesses –NIC event notification uses interrupts
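A rough sketch of the descriptor-queue idea from slides 17 and 18 follows. It assumes a transmit path in which the OS posts descriptors to its own queue and the DPI core moves them to a NIC-facing queue after inspection, while the packet buffers exist only once. `Descriptor`, `dpi_forward` and the `allow` policy are hypothetical names, and dropping rejected packets is an assumption of this sketch rather than behavior stated in the slides.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    buf_addr: int  # address of the packet buffer (shared, single copy)
    length: int    # number of valid bytes in the buffer


def dpi_forward(os_queue, nic_queue, buffers, allow):
    """Move descriptors from the OS-side queue to the NIC-side queue.

    Only descriptors travel between the two queues; the packet data
    stays in `buffers` and is read through the descriptor's address.
    Returns the number of packets rejected by the `allow` policy.
    """
    dropped = 0
    while os_queue:
        desc = os_queue.popleft()
        payload = buffers[desc.buf_addr][:desc.length]
        if allow(payload):
            nic_queue.append(desc)   # same buffer address, no data copy
        else:
            dropped += 1             # hypothetical policy: drop the packet
    return dropped


buffers = {0x1000: b"GET /index.html", 0x2000: b"attack payload"}
os_queue = deque([Descriptor(0x1000, 15), Descriptor(0x2000, 14)])
nic_queue = deque()
dropped = dpi_forward(os_queue, nic_queue, buffers,
                      allow=lambda p: b"attack" not in p)
```

The point of the sketch is the data layout: two queues of small descriptors that both refer to one set of buffers, which is why the slides can claim device independence without copying packet data.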

19 MMIO redirection MMU MMU detects specific MMIO addresses MMU redirects RW to a reserved region in DPI memory MMU sends IPI to DPI core DPI memory DPI core OS core IPI R/W redirection Load/store

20 Ally Hardware Properties Simple extensions to existing hardware components No impact expected on critical timing paths Compatible with virtualization support (Intel VT-x/EPT, AMD SVM/NPT)

21 Outline Introduction & Motivation Architecture –Overview –Multicore Partitioning –Packet interception Evaluation Conclusions

22 Evaluation Full system emulation (QEMU): core sequestration, HW changes Real machine prototype Hardware –Intel Core 2 Duo 2.66 GHz with 1 Gbit Intel NIC Benchmarks –Netperf –SPECweb Systems –Ally, Linux and Xen

23 System Configurations Queue Virtualization NIC Driver Kernel Netperf/SPECweb Snort DPI core OS core HW SW NIC Driver Kernel Netperf/SPECweb Snort DPI core OS core HW SW IP queue Ally Linux

24 System Configurations Hypervisor Dom0 Kernel Netperf/SPECweb Snort DPI core OS core HW SW Xen DomU Kernel

25 Netperf CPU Usage

26 SPECweb CPU Usage (cycles/request × 10^6)

27 Outline Introduction & Motivation Architecture –Overview –Multicore Partitioning –Packet interception Evaluation Conclusions

28 Conclusions Ally: a framework for transparent deployment of packet inspection appliances Ally uses a set of simple HW/FW extensions to enable reliable multicore partitioning and efficient packet inspection Ally is fully compatible with new virtualization technology as well as heterogeneous architectures

29 Thanks

30 Throughput

31 DPI using Network Processor

32 NIC Unprivileged Partition Multi-core processor core Conventional Architecture Software Stack (OS + Applications) core cores

33 NIC Unprivileged Partition Multi-core processor core Privileged Partition Transmission Path Software Stack (OS + Applications) Software Stack (DPI Application) core

34 NIC Unprivileged Partition Multi-core processor core Privileged Partition Receive Path Software Stack (OS + Applications) Software Stack (DPI Application) core

35 Integrated Northbridge DPI core Local APIC MMU Interface DPI core Local APIC MMU Interface OS core Local APIC MMU Interface OS core Local APIC MMU Interface Platform Controller Hub NIC Memory Controller On chip interconnect Processor IOMMU PCIe ctrl Interrupt Unit BIOS Network Main Memory DMI Ctrl OS core Local APIC MMU Interface Unprivileged partition Privileged partition DPI core Local APIC MMU Interface Last Level Cache IOAPIC Management NIC Service Processor Management Network Privileged partition Unprivileged partition

36 Integrated Northbridge DPI core Local APIC MMU Interface DPI core Local APIC MMU Interface OS core Local APIC MMU Interface OS core Local APIC MMU Interface Platform Controller Hub NIC Memory Controller On chip interconnect Processor IOMMU PCIe ctrl Interrupt Unit BIOS Network Main Memory DMI Ctrl OS core Local APIC MMU Interface Unprivileged partition Privileged partition DPI core Local APIC MMU Interface Last Level Cache IOAPIC Management NIC Service Processor Management Network Privileged partition Unprivileged partition

37 MMU Modification – Memory Protection TLB TLB Miss Handler CR3 Special_reg Page Table DPI core boundary register phys_addr > special_reg ? Main Memory Privileged partition Unprivileged partition

38 Memory Protection Procedure TLB TLB Miss Handler TLB miss Virtual Address CR3 Special_reg Page Table DPI core boundary register phys_addr > special_reg ? Main Memory Privileged partition Unprivileged partition

39 Memory Protection Procedure TLB TLB Miss Handler TLB miss Virtual Address TLB fill CR3 Special_reg Page Table DPI core boundary register phys_addr > special_reg ? Main Memory Privileged partition Unprivileged partition
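The range check shown on the memory protection slides can be sketched roughly as below. The sketch assumes a single-level page table, a 4 KiB page size, and that the privileged (DPI) partition lies above the boundary held in `special_reg`, which is one reading of the `phys_addr > special_reg ?` test on the slide. `tlb_fill` and `ProtectionFault` are hypothetical names.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class ProtectionFault(Exception):
    """Raised when an OS core tries to map memory in the DPI partition."""


def tlb_fill(tlb, page_table, vpn, special_reg):
    """Handle a TLB miss on an unprivileged (OS) core.

    The page table walk is modeled as a dictionary lookup. Before the
    TLB is filled, the physical address is compared with the boundary
    register; addresses above the boundary are assumed to belong to
    the privileged partition and are refused.
    """
    ppn = page_table[vpn]             # page table walk (single level here)
    phys_addr = ppn * PAGE_SIZE
    if phys_addr > special_reg:       # range check from the slide
        raise ProtectionFault(hex(phys_addr))
    tlb[vpn] = ppn                    # TLB fill
    return ppn


tlb = {}
page_table = {5: 0x100, 6: 0x40000}   # vpn -> ppn
SPECIAL_REG = 0x3FFFFFFF              # last byte of the unprivileged region
tlb_fill(tlb, page_table, 5, SPECIAL_REG)
```

Because the check runs only on a TLB miss, a translation that passes it is cached and later hits pay no extra cost, which is consistent with the slides' claim of no expected impact on critical timing paths.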

40 NIC Unprivileged Partition Multi-core processor core Privileged Partition Memory Protection Software Stack (OS + Applications) Software Stack (DPI Application) core Invisible

41 Integrated Northbridge DPI core Local APIC MMU Interface DPI core Local APIC MMU Interface OS core Local APIC MMU Interface OS core Local APIC MMU Interface Platform Controller Hub NIC Memory Controller On chip interconnect Processor IOMMU PCIe ctrl Interrupt Unit BIOS Network Main Memory DMI Ctrl OS core Local APIC MMU Interface Unprivileged partition Privileged partition DPI core Local APIC MMU Interface Last Level Cache IOAPIC Management NIC Service Processor Management Network Privileged partition Unprivileged partition

42 Integrated Northbridge DPI core Local APIC MMU Interface DPI core Local APIC MMU Interface OS core Local APIC MMU Interface OS core Local APIC MMU Interface Platform Controller Hub NIC Memory Controller On chip interconnect Processor IOMMU PCIe ctrl Interrupt Unit BIOS Network Main Memory DMI Ctrl OS core Local APIC MMU Interface Unprivileged partition Privileged partition DPI core Local APIC MMU Interface Last Level Cache IOAPIC Management NIC Service Processor Management Network Privileged partition Unprivileged partition

43 MMU Modification – MMIO Redirection TLB Redirection Bit Physical Page TLB Miss Handler Check uncacheable address map Redirection Table Physical Address Remapped Address

44 MMIO Redirection – TLB Miss On a TLB miss, the TLB miss handler (TMH) does the page table walk TLB Redirection Bit Physical Page Virtual Address TLB miss TLB Miss Handler Page Table Lookup

45 MMIO Redirection – TLB Miss The TMH checks if the resulting physical address falls in an uncacheable page and hence potentially an MMIO page TLB Redirection Bit Physical Page TLB Miss Handler Physical Address Check uncacheable address map

46 MMIO Redirection – TLB Miss If the page is uncacheable, the TMH looks up the redirection table to check if any address in this page needs to be redirected TLB Redirection Bit Physical Page TLB Miss Handler Check uncacheable address map Redirection Table Physical Address Remapped Address Physical Address

47 MMIO Redirection – TLB Miss If any address in the page needs to be redirected, the TMH sets the redirection bit in addition to filling the TLB TLB Redirection Bit Physical Page TLB Miss Handler Check uncacheable address map TLB fill Redirection Table Physical Address Remapped Address

48 MMIO Redirection – TLB Hit On a TLB hit, if the redirection bit is set, the MMU looks up the Last Level Cache (LLC), which caches translations from the Redirection Table TLB Redirection Bit Physical Page Offset Physical Address Virtual Address LLC Physical Address Remapped Address

49 MMIO Redirection – TLB Hit If a translation is found, the MMU returns the translated address and sends IPI to privileged cores. TLB Redirection Bit Physical Page LLC Physical Address Remapped Address Translated Address Generate IPI Physical Address Hit

50 MMIO Redirection – TLB Hit If the LLC misses, then Redirection Table Lookup is performed TLB Redirection Bit Physical Page LLC Physical Address Remapped Address Redirection Table Lookup Physical Address Miss
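The TLB miss and TLB hit steps of MMIO redirection described on slides 43 to 50 can be put together in a small software model. This is only a sketch: the class `Mmu` and its fields are hypothetical, the LLC is modeled as a plain dictionary, the page size is assumed to be 4 KiB, and passing through an address that sits in a flagged page but has no redirection entry is an assumption of the sketch.

```python
PAGE = 4096  # assumed page size


class Mmu:
    def __init__(self, page_table, uncacheable_pages, redirection_table):
        self.page_table = page_table          # vpn -> ppn
        self.uncacheable = uncacheable_pages  # set of uncacheable ppns
        self.redirect = redirection_table     # phys addr -> remapped addr
        self.tlb = {}                         # vpn -> (ppn, redirection bit)
        self.llc = {}                         # stand-in for cached translations
        self.ipis = []                        # IPIs sent to the DPI core

    def _tlb_miss(self, vpn):
        ppn = self.page_table[vpn]            # page table walk
        # Set the redirection bit only for uncacheable pages that hold
        # at least one address listed in the redirection table.
        bit = ppn in self.uncacheable and any(
            addr // PAGE == ppn for addr in self.redirect)
        self.tlb[vpn] = (ppn, bit)            # TLB fill

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE)
        if vpn not in self.tlb:
            self._tlb_miss(vpn)
        ppn, bit = self.tlb[vpn]
        paddr = ppn * PAGE + offset
        if not bit:
            return paddr                      # ordinary access
        if paddr not in self.llc:             # LLC miss: consult the table
            if paddr not in self.redirect:
                return paddr                  # assumption: pass through
            self.llc[paddr] = self.redirect[paddr]
        self.ipis.append(paddr)               # notify the DPI core
        return self.llc[paddr]


mmu = Mmu(page_table={1: 0x10, 2: 0xFEB00},
          uncacheable_pages={0xFEB00},
          redirection_table={0xFEB00018: 0x80000018})
plain = mmu.translate(1 * PAGE + 4)
redirected = mmu.translate(2 * PAGE + 0x18)
```

Here `plain` is an ordinary translation, while `redirected` lands in the reserved region and records one IPI, matching the sequence of load/store, redirection and IPI shown on slide 19.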

51 Interrupt Unit Modification DPI core OS core Interrupt Unit NIC If Source == NIC, Redirect Interrupt

52 Interrupt Redirection When the NIC raises an interrupt, the Interrupt Unit redirects the interrupt to the DPI core DPI core OS core Interrupt Unit NIC If Source == NIC, Redirect Interrupt Interrupt

53 Interrupt Redirection After the NIC interrupt is handled, the DPI core sends an IPI to the OS core, mimicking the NIC interrupt DPI core OS core Interrupt Unit NIC If Source == NIC, Redirect Interrupt IPI
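The two interrupt redirection steps can be sketched as a pair of small functions. The names `route_interrupt` and `handle_nic_interrupt`, the string identifiers for cores and sources, and the tuple used to represent an IPI are all hypothetical; the sketch only mirrors the `If Source == NIC, Redirect Interrupt` rule and the follow-up IPI from the slides.

```python
def route_interrupt(source, default_core="os_core", dpi_core="dpi_core"):
    """Interrupt Unit rule: NIC interrupts go to the DPI core,
    everything else goes to its usual destination."""
    return dpi_core if source == "NIC" else default_core


def handle_nic_interrupt(pending_packets, inspect):
    """DPI core handler for a redirected NIC interrupt.

    Inspects every pending packet first, then returns the IPI that the
    DPI core sends to the OS core so that the OS sees what looks like
    the original NIC interrupt.
    """
    for packet in pending_packets:
        inspect(packet)
    return ("IPI", "os_core", "NIC")  # (kind, target core, mimicked source)


seen = []
target = route_interrupt("NIC")
ipi = handle_nic_interrupt([b"pkt1", b"pkt2"], inspect=seen.append)
```

The ordering matters in this sketch: the OS core is notified only after `inspect` has run on every pending packet, so the OS never handles traffic that the DPI core has not yet seen.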

54 Summary of Hardware Modifications
Unit | Description | Purpose
OS-core MMU | Prevent memory accesses to DPI memory from OS core | Protection
OS-core MMU | Redirect MMIO accesses to DPI memory from OS core and interrupt DPI core | Packet Interception
IOMMU | Prevent non-authorized DMA to DPI memory | Protection
IOAPIC | Redirect NIC interrupts to DPI core | Packet Interception
All Units | Protected configuration registers | Protection

55 Functional Evaluation Full system emulation QEMU Validate Hardware and Firmware Changes

56 DPI core Usage

57 SPECweb Cache Misses

58 NIC Unprivileged Partition Multi-core processor core Privileged Partition Memory Protection Software Stack (OS + Applications) Software Stack (DPI Application) core Invisible How? Modified MMU

59 Challenges -Make privileged partition protected and invisible from the unprivileged partition -Core Sequestration -Memory Protection -Intercept packets efficiently -Packet Interception

60 Ally System NIC Linux kernel NIC Traffic Queue Virtualization NIC Driver Other Apps Snort DPI Core

61 Linux System NIC Linux kernel NIC Traffic IP queue NIC Driver Other Apps Snort Core

62 Xen System NIC Linux VM #0 NIC Traffic IP queue Hypervisor Other Apps Snort Core VM #1