Improving the Reliability of Commodity Operating Systems
Michael M. Swift, Brian N. Bershad, Henry M. Levy
Presented by Ya-Yun Lo, EECS 582 – W16


2 Outline
- Introduction
- Nooks
- Implementation
- Evaluating Reliability
- Performance

3 Device Driver
- A module that translates high-level OS requests into device-specific requests
- Programmers writing device drivers are often less experienced in kernel programming
[Diagram: an application above the kernel, which contains Virtual Memory, File Systems, Networking, Scheduling, … and Device Drivers. Device drivers make up 70% of Linux kernel code.]

4 Motivation
- Kernel extensions are a major source of system failures
[Diagram: an application above the kernel, which contains Virtual Memory, File Systems, Networking, Scheduling, … and Device Drivers. Device drivers make up 70% of Linux kernel code.]

6 Goal
Eliminate downtime caused by drivers
- Isolation: prevent system crashes
- Recovery: keep applications running
[Diagram: Kernel, Driver, Application]

8 Nooks
A reliability subsystem that isolates extensions from the kernel
- For fault resistance, not fault tolerance
  - The system must prevent and recover from most extension mistakes
- For mistakes, not abuse
  - Malicious behavior is excluded

9 Nooks
- Isolation
  - Isolate the kernel from extension failures
  - Detect extension failures before they corrupt the kernel
- Backward compatible with existing systems and extensions
- Practical
- Efficient

10 Nooks Isolation Manager (NIM)
- A transparent OS layer inserted between the kernel and kernel extensions

11 Nooks Isolation Manager (NIM)
- Isolation
  - Lightweight kernel protection domains
  - Extension Procedure Call (XPC): communication between the kernel and extensions must go through this new kernel service
- Interposition
  - Control flow: XPC
  - Data transfer: object tracking
  - All interfaces go through wrappers (similar to stubs in RPC)

12 Nooks Isolation Manager (NIM)
- Object Tracking
  - Controls all modifications of kernel data structures by each extension
  - Extensions cannot directly modify kernel data structures
- Recovery
  - Detects and recovers from various extension faults
  - Recovery is aided by the Nooks isolation mechanisms
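The relationship between object tracking and recovery can be illustrated with a small user-space sketch. This is not Nooks code: the `ObjectTracker` class, its method names, and the resource identifiers (`irq-11`, `buf-1`, `timer-3`) are all hypothetical, and the sketch only assumes that recovery needs to know which kernel objects a failed extension still holds so they can be released.

```python
class ObjectTracker:
    """Hypothetical tracker: records which kernel objects each extension holds."""

    def __init__(self):
        self._held = {}  # extension name -> {object id: release callback}

    def track(self, extension, obj_id, release):
        """Record that `extension` now holds `obj_id`."""
        self._held.setdefault(extension, {})[obj_id] = release

    def untrack(self, extension, obj_id):
        """Forget an object the extension gave back normally."""
        self._held.get(extension, {}).pop(obj_id, None)

    def recover(self, extension):
        """Release everything a failed extension still holds; return the ids."""
        held = self._held.pop(extension, {})
        for obj_id, release in held.items():
            release(obj_id)
        return sorted(held)


freed = []  # stands in for the kernel reclaiming resources
tracker = ObjectTracker()
tracker.track("net_driver", "irq-11", freed.append)
tracker.track("net_driver", "buf-1", freed.append)
tracker.untrack("net_driver", "buf-1")        # returned before the failure
tracker.track("net_driver", "timer-3", freed.append)

released = tracker.recover("net_driver")      # the driver has failed
print(released)
```

Only the objects still held at the time of the failure are released; the buffer that was returned earlier is left alone.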

13 Implementation of Nooks
- Implemented inside the Linux 2.4.18 kernel on the Intel x86 architecture
- The Linux kernel exposes
  - over 700 functions callable by extensions
  - over 650 extension-entry functions callable by the kernel
- Most interactions between the kernel and extensions go through function calls

14 Isolation
- Memory management
  - Lightweight protection domains enforced with virtual memory protection
  - An extension has read-only access to kernel memory
  - An extension has read-write access to its own domain
- Extension Procedure Call (XPC)
  - Transfers control safely between extensions and the kernel
  - Similar to Remote Procedure Call (RPC)
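The access rules on this slide can be sketched as a toy model. The real system enforces them with virtual memory protection in hardware; the Python below is only a simulation, and every name in it (`Domain`, `Memory`, `xpc`, `buggy_driver`, the region names) is an assumption made for illustration.

```python
from dataclasses import dataclass, field


class ProtectionFault(Exception):
    """Raised when code writes to memory its domain may not modify."""


@dataclass
class Domain:
    """Hypothetical lightweight protection domain: a name plus writable regions."""
    name: str
    writable: set = field(default_factory=set)


class Memory:
    """Toy memory: every region is readable, but writes are checked per domain."""

    def __init__(self):
        self.regions = {}

    def read(self, domain, region):
        return self.regions[region]

    def write(self, domain, region, value):
        if region not in domain.writable:
            raise ProtectionFault(f"{domain.name} cannot write {region}")
        self.regions[region] = value


def xpc(memory, callee_domain, func, *args):
    """Run func in callee_domain; report a fault instead of propagating it."""
    try:
        return True, func(memory, callee_domain, *args)
    except ProtectionFault as fault:
        return False, str(fault)


def buggy_driver(memory, domain, value):
    memory.write(domain, "driver_heap", value)  # allowed: the driver's own domain
    memory.write(domain, "kernel_data", value)  # bug: kernel memory is read-only
    return "done"


memory = Memory()
memory.regions.update({"kernel_data": 1, "driver_heap": 0})
driver = Domain("driver", writable={"driver_heap"})

ok, result = xpc(memory, driver, buggy_driver, 99)
print(ok, result)
print(memory.regions)
```

The stray write is caught at the domain boundary, so the kernel's data keeps its original value while the driver's own region is updated.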

15 Interposition
- Extensions are bound to wrappers when they are loaded
- Interposition enables an extension to execute within its lightweight protection domain
- A wrapper
  - checks parameters for validity
  - implements call-by-value-result
  - performs an XPC to execute the desired function
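The three wrapper duties listed above can be sketched as follows. This is a user-space illustration under stated assumptions, not the Nooks implementation: `make_wrapper`, `direct_xpc`, `driver_fill_packet`, and `valid_packet` are hypothetical names, arguments are modeled as dictionaries, and the XPC is replaced by a plain function call.

```python
import copy


class InvalidParameter(Exception):
    """Raised by a wrapper when an argument fails validation."""


def make_wrapper(func, validate, xpc):
    """Build a wrapper: validate, copy in, call through xpc, copy out."""
    def wrapper(obj):
        if not validate(obj):
            raise InvalidParameter(f"rejected argument for {func.__name__}")
        private = copy.deepcopy(obj)  # call by value: the callee sees only a copy
        result = xpc(func, private)   # control transfer into the other domain
        obj.clear()                   # call by result: changes are copied back
        obj.update(private)           # only after the call returned normally
        return result
    return wrapper


def direct_xpc(func, arg):
    """Stand-in for a real XPC; here it simply calls the function."""
    return func(arg)


def driver_fill_packet(packet):
    """Hypothetical extension entry point that fills in a length field."""
    packet["length"] = len(packet["payload"])
    return 0


def valid_packet(packet):
    return isinstance(packet, dict) and isinstance(packet.get("payload"), bytes)


send_packet = make_wrapper(driver_fill_packet, valid_packet, direct_xpc)
packet = {"payload": b"abc", "length": 0}
status = send_packet(packet)
print(status, packet["length"])
```

Because the results are copied back only after the call returns, a callee that fails partway through leaves the caller's object untouched, which is the property the copy-in, copy-out style is meant to show.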

16 Implementation Limitations
- Does not provide complete isolation or fault tolerance for all possible extension errors
- The current implementation of recovery assumes that extensions can be killed and restarted safely

17 Evaluating Reliability
- Tested eight extensions
  - Two sound card drivers
  - Four Ethernet drivers
  - A Win95-compatible file system (VFAT)
  - An in-kernel Web server
- Injected 400 faults
  - 317 resulted in extension failures
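As a quick check on the scale of the experiment, the two counts on this slide give the share of injected faults that led to an extension failure. The variable names are illustrative; the numbers are the ones stated above.

```python
trials = 400              # injected faults, from the slide
extension_failures = 317  # faults that resulted in an extension failure

failure_rate = extension_failures / trials
no_failure = trials - extension_failures

print(f"{failure_rate:.2%} of injected faults caused an extension failure")
print(f"{no_failure} injected faults caused no extension failure")
```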

18 Reliability Results
- Nooks eliminated 99% of the crashes observed with native Linux

19 Reliability Results
- Overall, Nooks eliminated 55% of non-fatal extension failures caused by fault injection trials

20 Performance
Test machine: Dell 1.7 GHz Pentium 4 PC running Linux 2.4.18
- 890 MB RAM
- SoundBlaster 16 sound card
- Intel Pro/1000 Gigabit Ethernet adapter
- Single 7200 RPM, 41 GB IDE hard disk drive

21 Performance
Relative performance is determined by comparing
- Latency: Play-mp3, Compile-local
- Throughput: Send-stream, Receive-stream, Serve-simple-web-page, Serve-complex-web-page

22 Conclusion
- Nooks focuses on achieving backward compatibility
- It cannot provide complete isolation and fault tolerance
- With modest engineering effort, isolation and recovery can dramatically improve the system’s reliability
- Performance loss ranges from 0 to 60%

