1 Presented by: Sagnik Bhattacharya. Paper by Kinshuk Govil, Dan Teodosiu, Yongjiang Huang, Mendel Rosenblum.

2 Overview: problems with current shared-memory multiprocessors and our requirements; Cellular Disco as a solution (architecture, prototype, hardware fault containment, CPU management, memory management, statistics); Cellular Disco and ubiquitous environments; conclusion.

3 Problem: extending modern operating systems to run efficiently on shared-memory multiprocessors. Software development has not kept pace with hardware development, and commodity operating systems fail to scale beyond about 12 processors.

4 What we need: the system should be reliable, scalable, and fault-tolerant, and it should not require excessive development time or effort.

5 Traditional approaches. Hardware partitioning: lacks resource sharing and creates static physical clusters. Software-centric approaches (modify an existing OS or develop a new one): significant development time and cost.

6 A scenario (diagram): a smart space whose control unit manages a pool of processors; processors can be added or removed with no rebooting necessary.

7 Solution: Cellular Disco. An extension of previous work (Disco); it uses the concept of a virtual machine monitor and partitions the multiprocessor system into virtual clusters.

8 Virtual Machine Monitor (diagram): VM1 runs Windows NT on virtual CPUs 1, 2, 3; VM2 runs IRIX 6.2 on virtual CPUs 1, 3, 5, 8. The monitor maps both virtual machines onto the underlying hardware.

9 Virtual Machine Monitor (diagram): a guest OS issues an I/O request.

10 Virtual Machine Monitor (diagram): the monitor traps the I/O request and performs the I/O on the guest's behalf.

11 Virtual Machine Monitor (diagram): the monitor completes the I/O and delivers a virtual interrupt to the guest.

12 Virtual Machine Monitor (diagram): the same configuration after the I/O has completed.

13 Issues it addresses: scalability, NUMA awareness, hardware fault containment, and resource management.

14 Basic Cellular Disco Architecture

15 Prototype: runs on a 32-processor SGI Origin 2000 and supports shared-memory systems based on the MIPS R10000 architecture. The prototype runs piggybacked on IRIX 6.4; the host OS is kept dormant and is only used to invoke some device drivers.

16 Hardware virtualization. Physical resources: what is visible to a virtual machine. Machine resources: the actual resources, allocated by Cellular Disco. Cellular Disco operates in the kernel mode of the MIPS processor and intercepts all traps (including system calls) and privileged operations executed by the guests.
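
To illustrate the physical-versus-machine distinction, here is a minimal C sketch of a per-VM map from guest "physical" pages to machine pages. The structure and names (vm_pmap, pfn_to_mfn) are assumptions chosen for illustration, not the actual Cellular Disco data structures.

```c
/* Hypothetical sketch of a physical-to-machine page map, as a VMM
 * such as Cellular Disco might keep per virtual machine.  Names and
 * layout are illustrative, not taken from the actual source. */
#include <stdint.h>
#include <stddef.h>

#define INVALID_MFN ((uint32_t)-1)

struct vm_pmap {
    uint32_t *pfn_to_mfn;   /* indexed by guest physical page number */
    size_t    npages;       /* size of the guest physical address space */
};

/* Translate a guest "physical" page to the machine page backing it.
 * A real monitor would allocate a machine page on first touch and
 * insert the translation into the software-reloaded TLB. */
static uint32_t pmap_translate(const struct vm_pmap *pm, uint32_t pfn)
{
    if (pfn >= pm->npages)
        return INVALID_MFN;         /* fault: out of range */
    return pm->pfn_to_mfn[pfn];     /* INVALID_MFN if not yet backed */
}
```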

17 Resource management. CPU management: each processor maintains its own run queue. Memory management: a memory-borrowing mechanism. Each OS instance is only given as many resources as it can handle; large applications are split across virtual machines, and the parts communicate through shared-memory regions.

18 CPU management. VCPU migration costs: intra-node, 37 µs; inter-node, 520 µs; inter-cell, 1520 µs.

19 VCPU migration (diagram): a VCPU is scheduled on a physical CPU; the machine is organised as nodes grouped into cells and connected by the interconnect.

20 Intra-node migration (diagram): the VCPU moves to another physical CPU on the same node.

21 Inter-node migration (diagram): the VCPU moves to a CPU on a different node within the same cell.

22 Inter-cell migration (diagram): the VCPU moves to a CPU in a different cell.

23 CPU management (contd.). CPU load balancing uses two mechanisms: the idle balancer and the periodic balancer. A load-balancing scenario follows.

24 Idle balancer (diagram): CPU2 is idle; before stealing, it checks whether VC B1, waiting on another CPU's run queue, still has enough cache affinity to stay where it is.

25 Idle balancer (diagram): the cache-affinity check says no.

26 Idle balancer (diagram): CPU2 pulls VC B1 onto its own run queue.
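
A rough C sketch of the idle-balancer idea shown above: an idle CPU scans a neighbour's run queue and steals a waiting VCPU only if that VCPU no longer has cache affinity where it is. The types, the timestamp-based affinity test, and the threshold value are hypothetical, chosen only to illustrate the decision.

```c
/* Hypothetical idle-balancer sketch (not Cellular Disco source).
 * An idle CPU steals a waiting VCPU from a neighbour, but only if
 * the VCPU has not run recently enough to still have a warm cache. */
#include <stdint.h>
#include <stddef.h>

#define AFFINITY_WINDOW 2000   /* ticks; illustrative value */

struct vcpu {
    int      id;
    uint64_t last_ran;         /* timestamp of last execution */
    struct vcpu *next;         /* run-queue link */
};

struct cpu {
    struct vcpu *runq;         /* head of this CPU's run queue */
    uint64_t     now;          /* current local time */
};

/* Steal one VCPU from 'victim' onto 'self' if its cache is cold. */
static struct vcpu *idle_steal(struct cpu *self, struct cpu *victim)
{
    struct vcpu **pp = &victim->runq;
    for (struct vcpu *v = victim->runq; v != NULL; v = v->next) {
        if (victim->now - v->last_ran > AFFINITY_WINDOW) {
            *pp = v->next;          /* unlink from victim's queue */
            v->next = self->runq;   /* push onto the idle CPU's queue */
            self->runq = v;
            return v;
        }
        pp = &v->next;
    }
    return NULL;                    /* nothing worth stealing */
}
```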

27 Periodic balancer (diagram): does a depth-first traversal of the load tree (root 4; children 1 and 3; leaves 1, 0, 2, 1).

28 Periodic balancer (diagram): at each level it checks the load difference between two siblings and ignores pairs whose difference is less than 2 (here the leaf pairs differ by 1).

29 Periodic balancer (diagram): if the difference is 2 or more (here the subtrees with loads 1 and 3), it balances the load, provided the benefit outweighs the cost.
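
A small C sketch of the periodic-balancer traversal described on slides 27-29: walk the load tree depth-first, compare the loads of two sibling subtrees, and migrate only when they differ by at least 2 and the benefit is judged to outweigh the cost. The binary tree layout and the benefit test are illustrative assumptions, not the actual implementation.

```c
/* Illustrative periodic-balancer sketch over a binary load tree
 * (leaves are CPUs, inner nodes aggregate the load of their subtree).
 * Not the actual Cellular Disco code. */
#include <stddef.h>

struct load_node {
    int load;                       /* number of runnable VCPUs below */
    struct load_node *left, *right; /* NULL for a leaf (a single CPU) */
};

/* Placeholder for the benefit-versus-cost check. */
static int worth_migrating(const struct load_node *from,
                           const struct load_node *to)
{
    (void)from; (void)to;
    return 1;   /* assume the benefit outweighs the migration cost */
}

static void periodic_balance(struct load_node *n)
{
    if (n == NULL || n->left == NULL || n->right == NULL)
        return;                             /* leaf: nothing to compare */

    int diff = n->left->load - n->right->load;
    if (diff >= 2 && worth_migrating(n->left, n->right)) {
        /* move one VCPU from the loaded side to the lighter side;
         * a real implementation would pick a concrete VCPU and
         * update the counts along the whole path */
        n->left->load--; n->right->load++;
    } else if (diff <= -2 && worth_migrating(n->right, n->left)) {
        n->right->load--; n->left->load++;
    }
    /* depth-first: balance within each subtree as well */
    periodic_balance(n->left);
    periodic_balance(n->right);
}
```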

30 Gang scheduling: for each physical CPU we select the VCPU that is to run on it. The VCPU selected is the highest-priority gang-runnable VCPU, where a VM is gang-runnable if all of its non-idle VCPUs are either running or waiting on the run queues of processors that are currently running lower-priority VMs.
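
To make the gang-runnable condition concrete, here is a hedged C sketch: a VM is considered gang-runnable if every non-idle VCPU it owns is either already running or waiting on a CPU whose currently running VM has lower priority. All types and field names are assumptions made for illustration.

```c
/* Hypothetical gang-runnable test (illustrative, not the real code). */
#include <stddef.h>

enum vcpu_state { VCPU_IDLE, VCPU_RUNNING, VCPU_WAITING };

struct vm;

struct gang_vcpu {
    enum vcpu_state state;
    const struct vm *current_vm_on_cpu; /* VM now running on this VCPU's CPU */
};

struct vm {
    int priority;                  /* larger value = higher priority */
    int nvcpus;
    struct gang_vcpu *vcpus;
};

static int gang_runnable(const struct vm *vm)
{
    for (int i = 0; i < vm->nvcpus; i++) {
        const struct gang_vcpu *v = &vm->vcpus[i];
        if (v->state == VCPU_IDLE || v->state == VCPU_RUNNING)
            continue;              /* idle or running VCPUs don't block the gang */
        /* waiting: acceptable only if its CPU runs a lower-priority VM */
        if (v->current_vm_on_cpu != NULL &&
            v->current_vm_on_cpu->priority >= vm->priority)
            return 0;
    }
    return 1;
}
```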

31 Example (diagram): three physical CPUs, each with a currently executing VCPU and a wait queue. VM1 owns VCs 1, 3, 8 (8 idle); VM2 owns VCs 2, 4, 6 (idle), 7; VM3 owns VCs 5, 9. The VMs are ordered by priority.

32 Example (diagram): the scheduler identifies which VMs are currently gang-runnable.

33 Example (diagram): the VCPUs of the gang-runnable VM are scheduled together, giving a new set of executing VCPUs and new wait queues.

34 Memory management: each cell maintains its own freelist; a cell that runs low on memory borrows from the cells on its allocation preference list via RPC (about 758 µs to transfer 4 MB). A threshold is set for the minimum amount of local free memory, and paging is avoided as far as possible.

35 Memory borrowing. Freelist: the list of free pages in a cell. Allocation preference list: the list of cells from which borrowing memory is more beneficial than paging.

36 Memory borrowing (diagram): cells 1 to 5 with their freelist sizes; the borrowing threshold is 16 MB and the lending threshold is 32 MB.

37 Memory borrowing (diagram): a cell whose freelist has dropped below the borrowing threshold asks another cell for memory.

38 Memory borrowing (diagram): the request is refused, since the asked cell's freelist is below the lending threshold.

39 Memory borrowing (diagram): another cell cannot be asked at all.

40 Memory borrowing (diagram): the cell asks a cell whose freelist is above the lending threshold.

41 Memory borrowing (diagram): that cell gives it 4 MB.

42 Memory borrowing (diagram): the freelists after the transfer.
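
The borrowing walk-through above can be summarised in a few lines of C. The thresholds (16 MB to start borrowing, 32 MB before a cell will lend) and the 4 MB chunk size are taken from the diagrams; the structure and function names are made up for this sketch.

```c
/* Illustrative memory-borrowing policy between cells (not the real code). */
#include <stddef.h>

#define MB               (1024UL * 1024UL)
#define BORROW_THRESHOLD (16 * MB)   /* ask for memory below this */
#define LEND_THRESHOLD   (32 * MB)   /* lend only above this */
#define CHUNK            (4 * MB)    /* amount transferred per request */

struct cell {
    unsigned long free_bytes;        /* size of this cell's freelist */
    struct cell **pref;              /* allocation preference list */
    size_t        npref;
};

/* Called when 'c' dips below the borrowing threshold.  Walks the
 * allocation preference list and borrows one chunk from the first
 * cell that is comfortably above the lending threshold. */
static int borrow_memory(struct cell *c)
{
    if (c->free_bytes >= BORROW_THRESHOLD)
        return 0;                            /* nothing to do */
    for (size_t i = 0; i < c->npref; i++) {
        struct cell *lender = c->pref[i];
        if (lender->free_bytes >= LEND_THRESHOLD + CHUNK) {
            lender->free_bytes -= CHUNK;     /* in reality: an RPC */
            c->free_bytes      += CHUNK;
            return 1;
        }
    }
    return -1;   /* no willing lender; fall back to paging */
}
```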

43 Memory management (contd.). Paging uses a second-chance FIFO algorithm. Page-sharing information is tracked in a control data structure, and Cellular Disco traps all read and write requests issued by the guest operating systems.

44 Second-chance FIFO: a reference bit is added to each page in the FIFO scheme. Every time the page is accessed, the bit is set to 1. If FIFO selects a page whose reference bit is 1, the bit is cleared to 0 and another page is considered. A page is evicted only if FIFO selects it and its reference bit is 0.

45 Example (diagram): on a page fault, plain FIFO would evict the oldest page even though its reference bit is 1.

46 Example (diagram): second-chance FIFO instead clears the reference bit of the oldest page and moves on to the next oldest page.

47 Example (diagram): the page eventually evicted is the oldest page whose reference bit is 0.
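
The second-chance FIFO policy of slides 44-47 is a standard replacement algorithm; a compact C sketch follows, using a circular queue of pages. The data structures are illustrative, not the Cellular Disco pager.

```c
/* Second-chance FIFO victim selection over a circular page queue.
 * Illustrative sketch; field names are not from Cellular Disco. */
struct page {
    int ref;            /* reference bit, set to 1 on every access */
    /* ... frame number, backing-store location, etc. ... */
};

/* Pick a victim from 'pages[0..n-1]', where '*hand' is the index of
 * the oldest page.  Pages whose reference bit is set get a second
 * chance: the bit is cleared and the scan moves on. */
static int choose_victim(struct page *pages, int n, int *hand)
{
    for (;;) {
        struct page *p = &pages[*hand];
        if (p->ref == 0) {
            int victim = *hand;
            *hand = (*hand + 1) % n;   /* next oldest becomes the head */
            return victim;
        }
        p->ref = 0;                    /* give it a second chance */
        *hand = (*hand + 1) % n;
    }
}
```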

48 Hardware fault containment: the failure rate grows with the number of processors. Cellular Disco is internally structured as a set of semi-independent cells, so a failure in one cell does not impact VMs running in other cells (faults are localized). Assumption: Cellular Disco itself is a trusted software layer.

49 Cellular structure (diagram): a fault in one cell does not affect the others.

50 Hardware fault containment (contd.). Inter-cell communication uses two modes: fast inter-processor RPC and messages. A side benefit is software fault containment: an individual guest OS crash does not bring down the system.

51 Hardware-fault recovery. Liveset: the set of still-functioning nodes. On failure a node is removed from the liveset; on recovery it is inserted back. Virtual machines that depend on the failed cell are terminated, and memory dependencies are updated when a cell fails.
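
A tiny sketch of the liveset bookkeeping described above, using a bitmask of nodes. The bitmask representation is an assumption made only to illustrate removal on failure and re-insertion on recovery.

```c
/* Illustrative liveset handling (not the actual implementation). */
#include <stdint.h>

typedef uint64_t liveset_t;          /* one bit per node, up to 64 nodes */

static void node_failed(liveset_t *live, int node)
{
    *live &= ~(1ULL << node);        /* remove the node from the liveset */
    /* a real monitor would now terminate VMs that depended on the
     * failed cell and update memory dependencies */
}

static void node_recovered(liveset_t *live, int node)
{
    *live |= (1ULL << node);         /* insert the node back */
}
```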

52 Example (diagram): six nodes grouped into cells run VMs 1, 2, and 3; the liveset is {1,2,3,4,5,6}.

53 Example (diagram): a hardware fault hits one of the nodes.

54 Example (diagram): the failed nodes are removed and the liveset shrinks to {5,6}; only VM 2, which did not depend on the failed cells, keeps running.

55 Example (diagram): an interrupt signals that the failed nodes are available again.

56 Example (diagram): the recovered nodes are reinserted and the liveset returns to {1,2,3,4,5,6}.

57 Fault-Recovery overhead

58 Virtualization overheads (chart): the first column shows the execution time on IRIX 6.4 and the second shows the execution time on Cellular Disco.

59 Cellular Disco and ubiquitous environments: it provides the raw computational power for our smart spaces and, more importantly, it keeps running in the face of hardware failures, with fault recovery built in. It is also adaptable to new operating systems.

60 Grey areas: will the source code stay as simple if it is no longer piggybacked on IRIX 6.4? Will it work on non-uniform multiprocessor systems? A probable solution is the development of a hardware virtualization standard.

61 In conclusion: Cellular Disco presents a middle path between hardware-based and software-based approaches. It can be used on the central control unit of our smart spaces because it is scalable and fault-tolerant.

