

Microreboot

References
1. George Candea, Shinichi Kawamoto, Yuichi Fujiki, Greg Friedman, Armando Fox, “Microreboot – A Technique for Cheap Recovery”, Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI ’04), pp. 31-44.

The Problem
Software has bugs
–Memory leaks, race conditions, environment-dependent failures
–Many bugs that appear in production have no fix at the time of failure
–In enterprise-scale systems, application-level failures are more frequent
Modern operating systems are comparatively more reliable
–Desktop operating systems continue to have problems
Testing eliminates some of the bugs, but not all of them [1]

Recovery - Urgency
Enterprise failures focus operators on recovering from the failure and restoring operations
–Diagnosis is for later; there is no time for real-time diagnosis
–Studies show that rebooting is often adequate even if the cause of the failure is unknown
Server clusters increase reliability through redundancy that withstands failure
–Isolate the failed node
–Reboot the failed node
–Reintegrate the recovered node into the cluster [1]
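
The three cluster steps above can be sketched as a tiny state machine. This is only an illustration: the `Cluster` class, the node names and the state labels are hypothetical and do not come from [1].

```python
from enum import Enum


class NodeState(Enum):
    IN_SERVICE = "in_service"
    ISOLATED = "isolated"
    REBOOTING = "rebooting"


class Cluster:
    """Toy cluster that tracks which nodes may receive traffic (hypothetical)."""

    def __init__(self, node_names):
        self.states = {name: NodeState.IN_SERVICE for name in node_names}
        self.log = []

    def routable_nodes(self):
        return sorted(n for n, s in self.states.items() if s is NodeState.IN_SERVICE)

    def recover(self, node):
        # Step 1: isolate the failed node so that no new traffic reaches it.
        self.states[node] = NodeState.ISOLATED
        self.log.append(("isolate", node, self.routable_nodes()))
        # Step 2: reboot it; no diagnosis is attempted at this point.
        self.states[node] = NodeState.REBOOTING
        self.log.append(("reboot", node, self.routable_nodes()))
        # Step 3: reintegrate the recovered node into the cluster.
        self.states[node] = NodeState.IN_SERVICE
        self.log.append(("reintegrate", node, self.routable_nodes()))


cluster = Cluster(["node-a", "node-b", "node-c"])
cluster.recover("node-b")
for step in cluster.log:
    print(step)
```

While the node is isolated or rebooting, the remaining nodes carry the whole load, which is why the recovery time of a single node matters even in a redundant cluster.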

Recovery using reboots
There is high confidence that a reboot will reclaim stale or leaked resources
Does not require correct functioning of the rebooted system
Easy to implement and automate
Returns the system to a “best” (most tested) state [1]

Recovery using reboots - 2
Unexpected reboots can result in data loss and unpredictable recovery times
–Data recovery and process recovery are not isolated
–Example: write-back buffer caches
  –Data destined for persistent storage is kept in volatile memory
  –An unexpected reboot will lose the buffer contents, i.e. the files would not be updated [1]

Microreboot
Individual rebooting of fine-grain application components
Potentially the same results as a system reboot, but much faster and with less loss of work
Safe microreboots require
–Well-isolated, stateless components
–Application state saved in specialized state stores
–Consequence: isolation of data recovery from process recovery
Rejuvenate the system without shutting it down [1]

Microrebooting: High Availability
Try microrebooting first
–Even if false positives are expected
–Even if the failure is not expected to be fixed by a microreboot
Even if a system reboot turns out to be required, the microreboot adds only a small amount of time to the process
In server clusters, microrebooting should be tried before failover
–Avoids loading the non-failed servers [1]
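
The "try microrebooting first" argument can be checked with a little arithmetic. The sketch below assumes a simple model that is not taken from [1]: the microreboot always costs `t_micro`, and the full reboot costing `t_full` is needed only when the microreboot does not cure the failure, which happens with probability `1 - p_fix`. The numbers in the example call are illustrative, not measurements.

```python
def expected_recovery_time(p_fix, t_micro, t_full):
    """Expected downtime when a microreboot is tried before a full reboot.

    p_fix   -- probability that the microreboot alone cures the failure
    t_micro -- duration of the microreboot, in seconds
    t_full  -- duration of the full reboot, in seconds
    """
    # The microreboot cost is always paid; the full reboot only on failure.
    return t_micro + (1.0 - p_fix) * t_full


def break_even_probability(t_micro, t_full):
    """Success probability above which trying the microreboot first pays off.

    Follows from t_micro + (1 - p) * t_full < t_full, i.e. p > t_micro / t_full.
    """
    return t_micro / t_full


# Illustrative values only: a 1 s microreboot versus a 20 s full reboot.
print(expected_recovery_time(p_fix=0.5, t_micro=1.0, t_full=20.0))
print(break_even_probability(t_micro=1.0, t_full=20.0))
```

Under this model the worst case (the microreboot never helps) costs only `t_micro` more than rebooting directly, which is the point the slide makes.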

Microrebootable Internet Services Software
Many relatively short tasks
–Work lost because of microrebooting is small
–Few requests lost per day
Design goals
–Fast and correct component recovery
–Strongly localized recovery
–Fast and correct reintegration of recovered components [1]

Microrebootable Internet Services Software - 2
Fine-grain components
–Program logic and start-up time
–Software tools can help build such systems
State segregation
–Consistent state across microreboots
–Keep state information in state stores outside the application
  –Transaction databases
  –Session state managers
–Isolate data recovery from application recovery [1]

Session Persistence: Sticky Sessions
[Diagram: Users 1 to N connect through a load balancer (traffic distribution, session persistence, SSL termination) to Servers 1 to N]
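
A minimal sketch of sticky-session routing, assuming the load balancer pins each session to a server chosen from a hash of the session identifier. The `StickyLoadBalancer` class and the server and session names are hypothetical; real load balancers typically use cookies or similar mechanisms.

```python
import hashlib


class StickyLoadBalancer:
    """Routes every request of a session to the same server (illustrative)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.assignments = {}  # session id -> server

    def route(self, session_id):
        # First request of a session: pick a server from a hash of the id.
        if session_id not in self.assignments:
            digest = hashlib.sha256(session_id.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % len(self.servers)
            self.assignments[session_id] = self.servers[index]
        # Later requests: reuse the pinned server.
        return self.assignments[session_id]

    def sessions_on(self, server):
        # Sessions that would be affected if this server failed.
        return sorted(s for s, srv in self.assignments.items() if srv == server)


lb = StickyLoadBalancer(["server-1", "server-2", "server-3"])
for session in ["alice", "bob", "carol", "alice"]:
    print(session, "->", lb.route(session))
```

Because a session lives on one server, a failure of that server affects exactly the sessions returned by `sessions_on`, unless their state is kept somewhere that survives the failure.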

Server Clusters
[Diagram: Users 1 to N connect through a load balancer (traffic distribution, session persistence, SSL termination) to Servers 1 to N]
Persistent db (disk or memory resident)
Multicasting
Shared memory

Virtual Server Implementation: Session Replication
[Diagram: Users 1 to N connect through a load balancer (traffic distribution, session persistence, SSL termination) to Servers 1 to N; Virtual Servers 1 to 3 are also shown]
Virtual servers are short lived.
Persistent db is easy.
Multicasting
–Additional network traffic.
–Reduce traffic through smaller clusters.
Shared memory is not recommended.

Storing Session State
FastS is an in-memory store in the JBoss embedded web server
–If the JVM fails, session state is lost
SSM maintains state on a separate machine
–Slower, but session state is retained even if the JVM fails
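
The trade-off between the two stores can be sketched as follows. This is a toy model, not the FastS or SSM API: `RemoteStore` stands in for a store on a separate machine, `AppServerProcess` for the JVM, and all method names are hypothetical.

```python
class RemoteStore:
    """Hypothetical stand-in for a session store on a separate machine."""

    def __init__(self):
        self._data = {}

    def write(self, key, value):
        self._data[key] = value

    def read(self, key):
        return self._data.get(key)


class AppServerProcess:
    """Stand-in for the JVM; restart_process() models a JVM failure and restart."""

    def __init__(self, remote_store):
        self.in_memory = {}          # fast, but lives inside the process
        self.remote = remote_store   # slower, but outside the process

    def save_session(self, session_id, state, durable):
        if durable:
            self.remote.write(session_id, state)
        else:
            self.in_memory[session_id] = state

    def load_session(self, session_id):
        if session_id in self.in_memory:
            return self.in_memory[session_id]
        return self.remote.read(session_id)

    def restart_process(self):
        # Everything held in the process's own memory is discarded.
        self.in_memory = {}


server = AppServerProcess(RemoteStore())
server.save_session("s-fast", {"cart": ["book"]}, durable=False)
server.save_session("s-durable", {"cart": ["lamp"]}, durable=True)
server.restart_process()
print(server.load_session("s-fast"))     # lost with the process
print(server.load_session("s-durable"))  # still available
```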

Microrebootable Internet Services Software - 3
Decoupling of components helps them microreboot gracefully
–Direct references (e.g. pointers) are stored outside the components
Retryable requests
–Timeouts
Leases
–Persistent state would have longer-term leases
–CPU execution time: a hanging computation fails to renew its lease, and the program is then terminated by a microreboot [1]
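
The lease idea can be illustrated with a small sketch. It assumes each component holds a lease that it must renew periodically, and that a supervisor microreboots any component whose lease has expired; the `Lease` class, the function name and the durations are hypothetical. Time is passed in explicitly so the example is deterministic.

```python
class Lease:
    """A time-limited grant that the holder must keep renewing (illustrative)."""

    def __init__(self, holder, duration, now):
        self.holder = holder
        self.duration = duration
        self.expires_at = now + duration

    def renew(self, now):
        self.expires_at = now + self.duration

    def expired(self, now):
        return now >= self.expires_at


def components_to_microreboot(leases, now):
    # A hung computation never calls renew(), so its lease runs out.
    return sorted(lease.holder for lease in leases if lease.expired(now))


healthy = Lease("healthy-component", duration=5, now=0)
hung = Lease("hung-component", duration=5, now=0)
healthy.renew(now=4)  # the healthy component checks in; the hung one does not
print(components_to_microreboot([healthy, hung], now=6))
```

Longer-lived resources, such as persistent state, would simply be given a larger `duration`.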

Crash Only Design
Programs that can be safely crashed, in whole or in part, and that recover quickly every time
Fine-grain components
–Fast restart and reinitialization
State segregation
–E-commerce applications handle 3 types of state
  –Presentation (http, gif, jsp, etc.; stateless)
  –Session persistence (session related; lost on session end)
  –Long-term data persistence (db; customer info)
Isolation and decoupling of EJBs
–Compiler-enforced interfaces
–EJBs cannot use each other's internal variables
–Microreboot all the “connected” EJBs together [1]
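
One way to read "microreboot all the connected EJBs together" is as a graph problem: treat references between components as edges and reboot the whole connected group containing the failed one. The sketch below assumes that reading; the component names and reference edges are made up for illustration.

```python
from collections import deque


def recovery_group(references, failed):
    """Return every component connected to `failed` through references."""
    # Build an undirected adjacency map from (a, b) reference pairs.
    neighbours = {}
    for a, b in references:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Breadth-first search outward from the failed component.
    group = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for other in neighbours.get(current, ()):
            if other not in group:
                group.add(other)
                queue.append(other)
    return sorted(group)


# Hypothetical reference graph: A-B-C form one group, D-E another.
references = [("A", "B"), ("B", "C"), ("D", "E")]
print(recovery_group(references, "A"))
print(recovery_group(references, "D"))
```

The fewer references components hold to each other, the smaller these groups are and the cheaper each microreboot becomes.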

Evaluation Framework
Test application: EBid
–eBay-style auction
Client emulator
–Human clients are modeled with a 25-state Markov chain
  –Example states: Login, BuyNow or AboutMe
–Between URL clicks, user think time is modeled
  –Exponential with a mean of 7 seconds and a maximum of 70 seconds [1]
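
A scaled-down sketch of such a client emulator follows. It uses only the three example states named above rather than all 25, the transition probabilities are invented for illustration, and the 70-second maximum is applied by capping the sampled value, which is an assumption about how the limit was enforced.

```python
import random

# Hypothetical transition table: state -> [(next state, probability), ...]
TRANSITIONS = {
    "Login": [("BuyNow", 0.5), ("AboutMe", 0.5)],
    "BuyNow": [("AboutMe", 0.7), ("Login", 0.3)],
    "AboutMe": [("BuyNow", 0.6), ("Login", 0.4)],
}
MEAN_THINK_SECONDS = 7.0
MAX_THINK_SECONDS = 70.0


def think_time(rng):
    # Exponential think time with mean 7 s, capped at 70 s.
    return min(rng.expovariate(1.0 / MEAN_THINK_SECONDS), MAX_THINK_SECONDS)


def next_state(state, rng):
    states, weights = zip(*TRANSITIONS[state])
    return rng.choices(states, weights=weights, k=1)[0]


def simulate(start, clicks, rng):
    """Return a list of (state, think time before the next click)."""
    state = start
    trace = []
    for _ in range(clicks):
        trace.append((state, think_time(rng)))
        state = next_state(state, rng)
    return trace


for state, pause in simulate("Login", clicks=5, rng=random.Random(1)):
    print(f"{state:8s} think {pause:5.1f} s")
```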

Evaluation Framework - 2
Failure detection
–Network-level errors (File not found, Service not available, …: HTTP 4xx or 5xx)
–Text analysis that looks for keywords such as “error”, “failed”, “exception”
–Application-induced errors
  –e.g. a logon request when the user is already logged in
–Comparing the system under test with a known-good instance: much more complex [1]
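
The first two detectors are simple enough to sketch. The function below is an illustration, not the detector used in [1]: it flags any HTTP 4xx or 5xx status and otherwise scans the response body for the three keywords listed above.

```python
ERROR_KEYWORDS = ("error", "failed", "exception")


def detect_failure(status_code, body):
    """Return a short reason string if the response looks failed, else None."""
    # Network-level check: any HTTP 4xx or 5xx status counts as a failure.
    if 400 <= status_code <= 599:
        return "http-error"
    # Text analysis: case-insensitive keyword search in the page body.
    lowered = body.lower()
    for keyword in ERROR_KEYWORDS:
        if keyword in lowered:
            return "keyword:" + keyword
    return None


print(detect_failure(503, ""))
print(detect_failure(200, "java.lang.NullPointerException at ..."))
print(detect_failure(200, "Welcome back!"))
```

Plain substring matching will occasionally flag healthy pages, for example an item description that contains the word "error". Such false positives are tolerable only because the recovery action they trigger is cheap.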

Evaluation Framework - 3
Fault diagnosis and Recovery Manager (RM): recursive recovery policy
–RM microreboots or reboots as required
–RM monitors the failure reports
–Try the cheapest reboot first [1]
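
A recursive recovery policy of this kind can be sketched as a loop that escalates through progressively more expensive actions until the system is healthy again. The level names and the `FakeSystem` class are hypothetical and chosen only to make the example runnable.

```python
class FakeSystem:
    """Simulated system whose fault is cured only at one recovery level."""

    def __init__(self, cured_by):
        self.cured_by = cured_by
        self.healthy = False

    def apply(self, level):
        if level == self.cured_by:
            self.healthy = True


def recover(system, levels):
    """Try recovery levels from cheapest to most expensive; return those used."""
    attempted = []
    for level in levels:
        system.apply(level)
        attempted.append(level)
        if system.healthy:
            return attempted
    # Nothing worked: hand the problem to a human.
    attempted.append("escalate to operator")
    return attempted


# Illustrative ordering, cheapest first.
LEVELS = ["microreboot component", "restart application", "restart JVM", "reboot OS"]

print(recover(FakeSystem(cured_by="restart application"), LEVELS))
```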

Evaluation Framework - 4
Application availability is measured
–User session: login to logout
–Each session consists of multiple user actions
–Each action is a sequence of operations (HTTP requests) that ends in a commit point, which indicates successful completion
–All operations must succeed for the action to be marked as a success; any failed operation marks the whole action as failed [1]
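
The all-or-nothing rule for actions can be written down directly. In this sketch an action is represented as a list of booleans, one per HTTP operation; that representation and the function names are assumptions made for illustration.

```python
def action_succeeded(operations):
    """An action succeeds only if every one of its operations succeeded."""
    return all(operations)


def session_summary(actions):
    """Return (successful actions, failed actions) for one user session."""
    good = sum(1 for operations in actions if action_succeeded(operations))
    return good, len(actions) - good


# One session with three actions; the second has a failed operation.
session = [
    [True, True],
    [True, False, True],
    [True],
]
print(session_summary(session))
```

Note that the two successful operations of the failed action do not count for anything: the user's work on that action is lost as a whole.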

Evaluation Questions
Are microreboots successful in recovering from failure?
Are microreboots better than a JVM restart?
Are microreboots useful in clusters?
What is the performance overhead? [1]

What faults to induce?
A key question, but there is very little supporting evidence on critical software faults in production systems
Anecdotal reports of faults
–Deadlocked threads
–Leak-induced resource exhaustion
–Bug-induced corruption of volatile metadata
–Incorrect handling of Java exceptions
Compare to CVE, CWE. Malicious vs internal [1]

Faults injected
Set value to null: NullPointerException on access
Invalid value that passes the type check: UserID of size greater than the maximum UserID size
Wrong value: valid for the application but incorrect
Recover using the recursive recovery policy [1]
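
The three kinds of injected data fault differ in how easily they are noticed, which the sketch below tries to show. It is a Python analogue, not the injector from [1]: the record layout, the `MAX_USER_ID_LENGTH` value and all function names are hypothetical, and Python's `None` plays the role of Java's `null`.

```python
MAX_USER_ID_LENGTH = 8  # hypothetical limit, for illustration only


def inject_null(record, field):
    corrupted = dict(record)
    corrupted[field] = None
    return corrupted


def inject_invalid(record, field):
    # Still a string, so it passes a type check, but it is too long.
    corrupted = dict(record)
    corrupted[field] = "x" * (MAX_USER_ID_LENGTH + 1)
    return corrupted


def inject_wrong(record, field, other_valid_value):
    # Perfectly valid for the application, just not the right value.
    corrupted = dict(record)
    corrupted[field] = other_valid_value
    return corrupted


def check_user_id(user_id):
    """Return how a simple validator would react to this user id."""
    if not isinstance(user_id, str):
        return "type error"
    if len(user_id) > MAX_USER_ID_LENGTH:
        return "too long"
    return "accepted"


record = {"user_id": "alice"}
print(check_user_id(inject_null(record, "user_id")["user_id"]))
print(check_user_id(inject_invalid(record, "user_id")["user_id"]))
print(check_user_id(inject_wrong(record, "user_id", "bob")["user_id"]))
```

The wrong-value case is accepted by the validator, which is why it is the hardest of the three to detect.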

Resuscitation: resume request processing, without fixing db corruptions
Recovery: resume + 100% correct db
Microreboot successful
Microrebooting not successful
Additional effort: a coarser-grained operation or manual intervention

Microrebooting vs Full Reboot
Availability is assessed by the requests denied during downtime
Using microreboots instead of JVM restarts reduced failures by 98%
–The width of the dip estimates the number of failed requests (see Figure 1)
Microreboots recover faster (see Table 3)
Microreboots reduce functional disruption: see the depth of the dip in Figure 1 (a deeper dip means more requests lost) and the analysis in Figure 2
Microreboots reduce lost work [1]

Entity Group: 5 EJBs (Category, Region, User, Item and Bid)
Restarting EBid does not restart all the EJBs
Initialization dominates the time taken


Microrebooting in Clusters
Failover under normal load
–Sticky-session servers
–Number of failed requests depends on the number of failed sessions
–More requests fail with a JVM restart
–More nodes per cluster reduce the microrebooting advantage [1]

Microrebooting in Clusters - 2
Behavior under changing load
–Peak load may be several times the average
–In the experiment, peak = 2 × average
–Consumers are distracted if a response takes more than 8 seconds [1]

Performance Impact
Impact of microrebooting on steady-state, fault-free throughput and latency
Throughput changes are small
Latency increase is more serious
–Human-perceptible delay is 100 ms [1]