Towards Elastic Operating Systems Amit Gupta Ehab Ababneh Richard Han Eric Keller University of Colorado, Boulder.

1 Towards Elastic Operating Systems Amit Gupta Ehab Ababneh Richard Han Eric Keller University of Colorado, Boulder

2 OS + Cloud Today
- Resources limited: thrashing; CPUs limited; I/O bottlenecks (network, storage)
- Present workarounds: additional scripting/code changes; extra modules/frameworks; coordination (synchronizing/aggregating state)
- Diagram labels: OS/Process, ELB/Cloud Mgr

3 Advantages
- Expands available memory
- Extends the scope of multithreaded parallelism (more CPUs available)
- Mitigates I/O bottlenecks (network, storage)
- Diagram labels: Stretch Process, OS/Process

4 ElasticOS: Our Vision

5 ElasticOS: Our Goals
- “Elasticity” as an OS service
- Elasticize all resources: memory, CPU, network, …
- Single-machine abstraction
- Apps unaware whether they’re running on 1 machine or 1000 machines
- Simpler parallelism
- Compatible with an existing OS (e.g., Linux, …)

6 “Stretched” Process: Unified Address Space
- Diagram labels: OS/Process; Elastic Page Table (V, R, Location)

7 Movable Execution Context
- OS handles elasticity; apps don’t change
- Partition locality across multiple nodes
- Useful for single (and multiple) threads
- For multiple threads, seamlessly exploit network I/O and CPU parallelism
- Diagram labels: OS/Process

8 Replicate Code, Partition Data
- Unique copy of data (unlike distributed shared memory, DSM)
- Execution context follows data (unlike process migration or single system image, SSI)
- Diagram labels: CODE, Data 1, Data 2, CODE
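The placement rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the ElasticOS implementation: the `place_pages` helper, the node names, and the round-robin choice of owner are all assumptions made for the example. It only demonstrates the invariant that code pages exist on every node while each data page has exactly one owner.

```python
from itertools import cycle


def place_pages(code_pages, data_pages, nodes):
    """Return {node: set(pages)} with code replicated and data partitioned.

    Hypothetical helper: round-robin ownership is an assumption for
    illustration; the slides call for an adaptive placement algorithm.
    """
    # Every node gets its own copy of the (read-only) code pages.
    placement = {node: set(code_pages) for node in nodes}
    owners = cycle(nodes)
    for page in data_pages:
        # Each data page is assigned to exactly one node.
        placement[next(owners)].add(page)
    return placement


placement = place_pages(["CODE"], ["Data 1", "Data 2"], ["node1", "node2"])
for node in sorted(placement):
    print(node, sorted(placement[node]))
```

Because every data page has a single owner, no cross-node consistency protocol is needed for data in this sketch; an access to a page owned elsewhere has to be resolved by moving either the page or the execution context.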

9 Exploiting Elastic Locality
- We need an adaptive page clustering algorithm
- LRU, NSWAP, i.e., “always pull”
- Execution follows data, i.e., “always jump”
- Hybrid (initial): pull pages, then jump
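The three options on this slide can be read as one policy with a single knob. The Python below is a hedged sketch under that reading: the `HybridPolicy` class, its per-node pull counter, and the `threshold` parameter are hypothetical names, not taken from the prototype. A threshold of zero behaves as “always jump”, an unbounded threshold behaves as “always pull”, and anything in between pulls first and then jumps.

```python
class HybridPolicy:
    """Pull remote pages until `threshold` pulls from one node, then jump.

    Hypothetical sketch: the counter layout and reset-on-jump rule are
    assumptions made for illustration.
    """

    def __init__(self, threshold):
        self.threshold = threshold
        self.pulls = {}  # remote node -> pulls since the last jump

    def on_remote_fault(self, node):
        """Decide what to do when a page owned by `node` is touched."""
        if self.pulls.get(node, 0) >= self.threshold:
            self.pulls.clear()  # counters restart after a jump
            return "jump"
        self.pulls[node] = self.pulls.get(node, 0) + 1
        return "pull"


policy = HybridPolicy(threshold=2)
faults = ["node2", "node2", "node2", "node1"]
decisions = [policy.on_remote_fault(node) for node in faults]
print(decisions)

always_jump = HybridPolicy(threshold=0)
always_pull = HybridPolicy(threshold=float("inf"))
```

In this sketch the first two faults on `node2` pull pages, the third triggers a jump, and the counters then start over from the new location.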

10 Status and Future Work
- Complete our initial prototype
- Improve our page placement algorithm
- Improve context jump efficiency
- Investigate fault tolerance issues

11 Thank You. Questions? Contact: amit.gupta@colorado.edu

12 Algorithm Performance (1)

13 Algorithm Performance (2)

14 Page Placement: Multinode Adaptive LRU
- Pull first; when the pull threshold is reached, jump the execution context
- Diagram labels: two nodes, each with CPUs, Mem, Swap

15 Locality in a Single Thread
- Temporal locality
- Diagram labels: two nodes, each with CPUs, Mem, Swap

16 Locality across Multiple Threads
- Diagram labels: two nodes, each with CPUs, Mem, Swap; a further node labeled CPUs, Swap

17 Unlike DSM…

18 Exploiting Elastic Locality
- Assumptions: replicate code pages, place data pages (vs. DSM)
- We need an adaptive page clustering algorithm
- LRU, NSWAP
- Us (initial): pull pages, then jump

19 Replicate Code, Distribute Data
- Unique copy of data (vs. DSM)
- Execution context follows data (vs. process migration)
- Diagram labels: CODE, Data 1, Data 2, CODE
- Animation captions: Accessing Data 1, Accessing Data 2, Accessing Data 1

20 Benefits
- OS handles elasticity; apps don’t change
- Partition locality across multiple nodes
- Useful for single (and multiple) threads
- For multiple threads, seamlessly exploit network I/O and CPU parallelism

21 Benefits (delete)
- OS handles elasticity
- Application ideally runs unmodified
- Application is naturally partitioned …
- By page access locality
- By seamlessly exploiting multithreaded parallelism
- By intelligent page placement

22 How should we place pages?

23 Execution Context Jumping: A Single-Thread Example
- Diagram labels: Address Space Node 1, Address Space Node 2, Process, TIME

24 “Stretch” a Process: Unified Address Space
- Diagram labels: Address Space Node 1, Address Space Node 2, Process, Page Table (V, R, IP Addr)

25 Operating Systems Today
- Resource limit = 1 node
- Diagram labels: OS, CPUs, Mem, Disks, Process

26 Cloud Applications at Scale
- Callouts: More resources? More queries?
- Diagram labels: Cloud Manager, Load Balancer, Process, Process, Framework (e.g., MapReduce), Partitioned Data

27 Our Findings
- Important tradeoff: data page pulls vs. execution context jumps
- Latency cost is realistic
- Our algorithm: worst-case scenario
- “always pull” == NSWAP
- Marginal improvements
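One way to make the pull-versus-jump tradeoff concrete is a break-even count. The sketch below is an assumption-laden illustration, not a result from the talk: `break_even_pulls` is a hypothetical helper, and the costs are abstract integer units rather than measured latencies. It answers one question only: how many page pulls from the same node does it take before a single jump would have been cheaper?

```python
def break_even_pulls(pull_cost, jump_cost):
    """Smallest number of pulls whose total cost exceeds one jump.

    Hypothetical model: each pull costs `pull_cost` units and a jump
    costs `jump_cost` units; both are positive integers chosen by the
    caller, not measurements.
    """
    if pull_cost <= 0 or jump_cost <= 0:
        raise ValueError("costs must be positive")
    # n * pull_cost > jump_cost  <=>  n > jump_cost / pull_cost
    return jump_cost // pull_cost + 1


n = break_even_pulls(pull_cost=3, jump_cost=10)
print(n, n * 3 > 10, (n - 1) * 3 > 10)
```

Under this simple model, a workload that touches fewer remote pages than the break-even count is better served by pulling, and one that touches more is better served by jumping.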

28 Advantages
- Natural groupings: threads and pages
- Align resources with inherent parallelism
- Leverage existing mechanisms for synchronization

29 “Stretch” a Process: Unified Address Space
- A “stretched” process = collection of pages + other resources {across several machines}
- Diagram labels: two nodes, each with CPUs, Mem, Swap; Page Table (V, R, IP Addr)

30 (delete) Exec. Context Follows Data
- Replicate code pages
- Read-only => no consistency burden
- Smartly distribute data pages
- Execution context can jump
- Moves towards data
- *Converse also allowed*

31 Elasticity in Cloud Apps Today
- Diagram labels: Input Data; D1, D2, …, Dx; CPUs, Mem, Disk; Output Data

32 Diagram labels: Input Queries; Load Balancer; D1, D2, …, Dx, Dy; CPUs, Mem, Disk; Output Data

33 (delete) Goals: Elasticity Dimensions
- Extend elasticity to:
- Memory
- CPU
- I/O
- Network
- Storage

34 Thank You

35 Bang Head Here!

36 Stretching a Thread

37 Overlapping Elastic Processes

38 *Code Follows Data*

39 Application Locality

40 Possible Animation?

41 Multinode Adaptive LRU

42 Possible Animation?

43 Open Topics
- Fault tolerance
- Stack handling
- Dynamically linked libraries
- Locking

44 Elastic Page Table
Virtual Addr | Phy. Addr | Valid | Node (IP addr)
A            | B         | 1     | Localhost
C            | D         | 0     |
E            | F         | 1     | 128.138.60.1
G            | H         | 0     |
Diagram labels, in slide order: Local Mem, Swap space, Remote Mem, Remote Swap
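A lookup against the table on this slide might look like the following Python sketch. The rows are copied from the slide, including its placeholder letters for addresses. Everything else is an assumption for illustration: the `Entry` and `resolve` names, the reading of Valid = 0 as "not resident in memory", and the three outcome labels are hypothetical and do not come from the ElasticOS prototype.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    phys_addr: str
    valid: bool
    node: Optional[str]  # None where the slide leaves the node column blank


# Rows as shown on the slide; A..H are the slide's placeholder addresses.
TABLE = {
    "A": Entry("B", True, "Localhost"),
    "C": Entry("D", False, None),
    "E": Entry("F", True, "128.138.60.1"),
    "G": Entry("H", False, None),
}


def resolve(vaddr):
    """Classify an access as 'local', 'remote', or 'fault' (assumed labels)."""
    entry = TABLE[vaddr]
    if not entry.valid:
        # Assumption: an invalid entry means the page must be faulted in.
        return ("fault", None)
    if entry.node == "Localhost":
        return ("local", entry.phys_addr)
    # Valid but owned by another machine: report which node holds it.
    return ("remote", entry.node)


for vaddr in TABLE:
    print(vaddr, resolve(vaddr))
```

The extra node column is what distinguishes this from a conventional page table in the sketch: a valid entry no longer implies local memory, so the lookup has to report where the page lives as well as whether it is resident.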

45 “Stretch” a Process
- Move beyond the resource boundaries of ONE machine
- CPU
- Memory
- Network, I/O

46 Diagram labels: Input Data; D1, D2; two nodes, each with CPUs, Mem, Disk; Output Data

47 Diagram labels: Data; D1 with CPUs, Mem, Disk; D2 with CPUs, Mem, Disk

48 Reinventing Elasticity Wheel

