An Integrated Framework for Dependable and Revivable Architecture Using Multicore Processors. Weidong Shi (Motorola Labs), Hsien-Hsin “Sean” Lee (Georgia Tech).

Presentation transcript:

An Integrated Framework for Dependable and Revivable Architecture Using Multicore Processors
Weidong Shi (Motorola Labs), Hsien-Hsin “Sean” Lee (Georgia Tech), Laura Falk (University of Michigan), Mrinmoy Ghosh (Georgia Tech)

2 Problem Statement
Highly available, reliable, and revivable networked services.
- Explore new programming and usage models for multi-core processors
- Provide “architectural support” for network services to be
  - Autonomic
  - Remote-exploit revivable
  - Self-recoverable
- Achieve high performance


4 Toward Self-recovery Network Services
Causes of network service loss:
- Accidental: transient, Heisenbugs, damage, aging
- Intentional: DoS, buffer overflow
Solutions: replication, rejuvenation, checkpoint, remote exploit self-recovery

5 Multicore: An Ideal Platform
- Exploit insulation: each core of a multicore processor can be programmed to run at a different privilege level with a different OS.
- Example: a dual core (Merom) with a server core and a monitor core sharing the L2 cache.
- Tight coupling of cores compared with SMP
- Fine-grained processor state monitoring
- Concurrent monitoring, efficient state backup and recovery
- Massive multi-core processors will have many idle cores

6 INDRA: A Dependable and Revivable Architecture
Block diagram labels:
- Server core (network apps) with IL1 cache and DL1 cache
- Monitor core with IL1 cache and DL1 cache
- L2 cache
- Trace filter and trace FIFO
- Code origin check and CFG check
- Monitor insulation, issue recovery control, control signals
- Memory interface and watchdog
- Physical memory space (used by service OS and applications)
- Protected memory space (monitor BIOS, OS, and SW)

7 Monitor Core: Insulated Parallel Inspection [Kiriansky et al., USENIX 2002]
Checks: Code Origin Check, Control Flow Graph Check, Exception Handling.
Example spanning code pages and a data page:
    FunctionA() { Vuln_func(); A = 3; }
    Vuln_func() {
        // Attack!!
        // Return address changed
    }
    Malicious_func() { }
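The two checks named on this slide can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the addresses, the page size, the `CODE_PAGES` and `CFG` tables, and the `check_transfer` helper are hypothetical and are not taken from the INDRA design. The idea is only that the monitor first asks whether the target of a control transfer lies in a page known to hold code, and then whether that transfer is an edge in a precomputed control flow graph.

```python
PAGE_SIZE = 0x1000  # assumed page size

# Hypothetical layout: FunctionA and Vuln_func live in code page 0x1000,
# Malicious_func was injected into a data page at 0x8000.
CODE_PAGES = {0x1000}

# Hypothetical control flow graph: source address -> allowed targets.
CFG = {
    0x1004: {0x1100},  # call site in FunctionA -> entry of Vuln_func
    0x1120: {0x1008},  # return in Vuln_func -> instruction after the call
}


def page_of(addr):
    """Return the base address of the page containing addr."""
    return addr & ~(PAGE_SIZE - 1)


def check_transfer(src, dst):
    """Classify one control transfer taken from the trace."""
    if page_of(dst) not in CODE_PAGES:
        return "code origin violation"  # target is not in a known code page
    if dst not in CFG.get(src, set()):
        return "cfg violation"  # target is code, but not an allowed edge
    return "ok"


if __name__ == "__main__":
    print(check_transfer(0x1004, 0x1100))  # legitimate call
    print(check_transfer(0x1120, 0x8000))  # overwritten return into a data page
    print(check_transfer(0x1120, 0x1000))  # return to an unexpected code address
```

The origin check runs first because it needs only page-level information; the graph lookup is reserved for transfers that stay inside code pages.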

8 Server Core: Request Based Recovery
1. Issue state backup request.
2. Read network request (e.g., request for page arch.ece.gatech.edu).
3. Process network request.
4. Monitor signalled error?
   - No: continue with the next request.
   - Yes: restore checkpointed state.
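The request loop on this slide can be sketched in software as follows. This is an illustration under stated assumptions, not the hardware mechanism: a deep copy of a Python dict stands in for the state backup request, and `serve`, `handler`, and `monitor_flags_error` are hypothetical names introduced here.

```python
import copy


def serve(requests, state, handler, monitor_flags_error):
    """Process requests one at a time, rolling back any request the monitor flags."""
    for request in requests:
        checkpoint = copy.deepcopy(state)  # stands in for "issue state backup request"
        handler(state, request)            # process the network request
        if monitor_flags_error(request):   # monitor signalled an error
            state.clear()
            state.update(checkpoint)       # restore the checkpointed state
    return state


if __name__ == "__main__":
    def handler(state, request):
        state["log"].append(request)

    def monitor_flags_error(request):
        return "exploit" in request  # toy detector, for illustration only

    final = serve(["GET /", "GET /exploit", "GET /about"], {"log": []},
                  handler, monitor_flags_error)
    print(final["log"])
```

Taking the checkpoint at request granularity means a detected exploit costs at most the work of the one request that carried it.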

9 Comparison of Backup and Recovery
Approach                | Backup                          | Recovery
Software checkpointing  | Slow                            | Fast, modify page translation
Memory update log       | Fast                            | Log-based undo, slow
Virtual checkpointing   | Copy dirty page on demand, slow | Fast, modify TLB entry
INDRA                   | Fast, no page copy              |

10 INDRA Backup
Diagram: processor with the modified TLB and the Global Timestamp Register (GT), GT=4; memory with the active page, the backup page, and the backup page record.
TLB extension for backup and rollback, labels per entry: Tag, Active Page (Physical Address), Local Timestamp, Rollback Valid, Backup Page (Physical Address), Dirty Block Bitvector, Rollback Bitvector. Value shown: 3.

11 INDRA Backup
Same diagram, GT=4. The backup record is labelled with Backup Page (Physical Address), Dirty Block Bitvector, and Rollback Bitvector; the entry also shows Tag, Active Page (Physical Address), Local Timestamp, and Rollback Valid. Value shown: 3.

12 INDRA Recovery Example
Diagram (shared by slides 12 to 18): active page, backup page, Global Timestamp Register (GT), and the modified TLB with its extension for backup and rollback. Labels: Tag, Active Page (Physical Address), Local Timestamp, Rollback Valid, and Backup Record (Backup Page (Physical Address), Dirty Block Bitvector, Rollback Bitvector).
GT=5, REQUEST n. Current operation: Wr memory line 7. Values shown: 3, 5.

13 INDRA Recovery Example
GT=5, REQUEST n. Current operation: Wr memory line 2. Values shown: 3, 5.

14 INDRA Recovery Example
GT=5, REQUEST n. Failure signal: restore system resource allocation, restore process context. Values shown: 3, 5, 1.

15 INDRA Recovery Example
GT=5, REQUEST n+1, Rollback Valid 1. Current operation: Rd memory line 7. Values shown: 3, 5.

16 INDRA Recovery Example
GT=5, REQUEST n+1, Rollback Valid 1. Current operation: Wr memory line 1. Values shown: 3, 5.

17 INDRA Recovery Example
REQUEST n+1, Rollback Valid 1. Current operation: handle next request. GT goes from 5 to 6; record system resource allocation and record process context. Values shown: 3, 5.

18 INDRA Recovery Example
GT=6, REQUEST n+2, Rollback Valid 1. Current operation: Wr memory line 4. Values shown: 3, 5, 6.
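The sequence in slides 12 to 18 can be mimicked with a small software model. This is one plausible reading of the field names on the slides (local timestamp, dirty block bitvector, rollback bitvector), not the authors' exact protocol: the `PageEntry` class, the lazy-restore rule, the eight-line page, and the way `fail` uses the timestamps are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

LINES_PER_PAGE = 8  # assumed; the slides do not state a page or line size


@dataclass
class PageEntry:
    """Toy model of one TLB entry plus its active and backup pages."""
    active: list
    backup: list = field(default_factory=lambda: [None] * LINES_PER_PAGE)
    local_ts: int = -1  # timestamp of the last checkpoint epoch that wrote this page
    dirty: int = 0      # bit i set: line i was backed up in the current epoch
    rollback: int = 0   # bit i set: line i must be served from the backup page

    def _restore_pending(self):
        # Copy back every line still marked for rollback before reusing the backup.
        for line in range(LINES_PER_PAGE):
            if (self.rollback >> line) & 1:
                self.active[line] = self.backup[line]
        self.rollback = 0

    def write(self, line, value, gt):
        if self.local_ts != gt:  # first write to this page in a new epoch
            self._restore_pending()
            self.dirty = 0
            self.local_ts = gt
        bit = 1 << line
        if self.rollback & bit:  # lazily undo the failed request's write
            self.active[line] = self.backup[line]
            self.rollback &= ~bit
        if not (self.dirty & bit):  # save the old line once per epoch, no page copy
            self.backup[line] = self.active[line]
            self.dirty |= bit
        self.active[line] = value

    def read(self, line):
        if (self.rollback >> line) & 1:
            return self.backup[line]
        return self.active[line]

    def fail(self, gt):
        # Only lines dirtied in the current epoch need to be rolled back.
        if self.local_ts == gt:
            self.rollback |= self.dirty
            self.dirty = 0


if __name__ == "__main__":
    page = PageEntry(active=[f"old{i}" for i in range(LINES_PER_PAGE)])
    page.write(7, "new7", gt=5)  # request n
    page.write(2, "new2", gt=5)
    page.fail(gt=5)              # failure signal
    print(page.read(7))          # request n+1 reads the pre-request value
    page.write(1, "new1", gt=5)
    page.write(4, "new4", gt=6)  # request n+2, after GT advances
    print(page.read(1), page.read(4))
```

In this model recovery touches only bitvectors, and lines are copied one at a time when they are next accessed, which matches the "fast, no page copy" entry in the comparison on slide 9.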

19 Test Bed (Bochs + TAXI [Vlaovic & Davidson, ICCD’02])
- Setup: a monitor (stripped-down OS, security SW, 10MB) and a Linux network server run on Bochs + TAXI on top of the host OS; network requests come in and server responses go out.
- Run a production OS with real service applications: httpd, ftpd, bind, sendmail, etc.
- Recoverability evaluated by applying real x86 remote exploits from security websites.
- Experiment with documented exploits.

20 Inter-Request Interval (# of Instructions)

21 I-Cache Miss Rate
- The code origin check reads traces of the code read from the L2 cache.
- The number of instructions in the trace is proportional to the L1 I-cache miss rate.
- The overhead of monitoring code origin therefore depends on the L1 I-cache miss rate.
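The proportionality stated on this slide can be made concrete with a back-of-the-envelope model. The model and its numbers are assumptions for illustration only: it supposes that every L1 I-cache miss brings one line from the L2 cache into the trace, and `traced_instructions`, the access count, and the line size are hypothetical.

```python
def traced_instructions(l1i_accesses, l1i_miss_rate, instrs_per_line):
    """Estimate instructions entering the trace: one L2 line per L1 I-cache miss."""
    return l1i_accesses * l1i_miss_rate * instrs_per_line


if __name__ == "__main__":
    # Hypothetical workload: 1,000,000 line fetches, 8 instructions per line.
    for miss_rate in (0.01, 0.02, 0.04):
        print(miss_rate, traced_instructions(1_000_000, miss_rate, 8))
```

Under this model, doubling the miss rate doubles the amount of code the monitor has to inspect.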

22 Monitoring Overhead

23 Sensitivity of Monitoring Queue Size (figure: queue size vs. performance slowdown)

24 Backup Overhead of Modified Lines

25 Performance of Recovery + Monitoring

26 Conclusions
- Real-time exploit monitoring with autonomic recovery increases revivability and availability.
- Multicore architectures are an ideal candidate for a new type of revivable system.
- An INDRA-based multicore system can provide improved reliability and availability.
- More research is required to explore the trade-offs among availability, performance, architecture design, and cost.

27 Questions and Answers
Thank you!