Published by Jalynn Darke; modified over 9 years ago.
Slide 1: Towards Elastic Operating Systems
Amit Gupta, Ehab Ababneh, Richard Han, Eric Keller
University of Colorado, Boulder
Slide 2: OS + Cloud Today
OS/process resources are limited: thrashing (memory pressure), limited CPUs, and I/O bottlenecks (network, storage). Present workarounds: additional scripting and code changes, extra modules/frameworks, and coordination (synchronizing/aggregating state) via an ELB or cloud manager.
Slide 3: Stretch Process: Advantages
Expands available memory; extends the scope of multithreaded parallelism (more CPUs available); mitigates I/O bottlenecks (network, storage).
Slide 4: ElasticOS: Our Vision
Slide 5: ElasticOS: Our Goals
"Elasticity" as an OS service: elasticize all resources (memory, CPU, network, ...). A single-machine abstraction: apps are unaware whether they're running on 1 machine or 1,000 machines. Simpler parallelism. Compatible with an existing OS (e.g., Linux).
Slide 6: "Stretched" Process: Unified Address Space
Diagram: the OS/process spans nodes through an elastic page table whose entries carry the usual V/R bits plus a location field.
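As a sketch of the idea (the names, fields, and addresses here are hypothetical, not ElasticOS's actual structures): each page-table entry records which node holds the page in addition to the usual bits, so a lookup can tell whether a virtual page is in local memory, local swap, remote memory, or remote swap.

```python
# Hypothetical sketch of an elastic page-table entry: a node location
# is stored alongside the valid bit, so one unified address space can
# span several machines. Field names and IPs are invented.
from dataclasses import dataclass

LOCAL_NODE = "10.0.0.1"  # assumed address of the node doing the lookup

@dataclass
class ElasticPTE:
    frame: int    # physical frame number on the owning node
    valid: bool   # is the page resident in that node's RAM?
    node: str     # IP address of the node holding the page

def locate(pte: ElasticPTE) -> str:
    """Classify where an access to this page must be serviced."""
    if pte.node == LOCAL_NODE:
        return "local-mem" if pte.valid else "local-swap"
    return "remote-mem" if pte.valid else "remote-swap"
```

For example, `locate(ElasticPTE(7, True, "10.0.0.2"))` classifies the page as remote memory, so the OS must either pull the page or move execution to that node.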
Slide 7: Movable Execution Context
The OS handles elasticity; apps don't change. Partition locality across multiple nodes. Useful for single (and multiple) threads. For multiple threads, seamlessly exploit network I/O and CPU parallelism.
Slide 8: Replicate Code, Partition Data
Code pages are replicated on every node; each data partition (Data 1, Data 2) exists as a unique copy (unlike DSM). The execution context follows the data (unlike process migration or SSI).
Slide 9: Exploiting Elastic Locality
We need an adaptive page-clustering algorithm. LRU/NSWAP is "always pull"; "execution follows data" is "always jump". Our initial hybrid: pull pages, then jump.
Slide 10: Status and Future Work
Complete our initial prototype; improve our page-placement algorithm; improve context-jump efficiency; investigate fault-tolerance issues.
Slide 11: Thank You
Questions? Contact: amit.gupta@colorado.edu
Slide 12: Algorithm Performance (1)
Slide 13: Algorithm Performance (2)
Slide 14: Page Placement: Multinode Adaptive LRU
Diagram: pages are pulled between the nodes' memory and swap until the pull threshold is reached; then the execution context jumps ("pull first, jump execution context").
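A minimal sketch of the "pull first, then jump" policy above (the counter, threshold value, and interface are assumptions for illustration, not the actual ElasticOS algorithm): remote faults pull individual pages until the pull count for some remote node crosses a threshold, at which point the execution context jumps to that node and the counters reset.

```python
# Sketch of a "pull first, then jump" placement policy. The threshold
# and method names are hypothetical; the real algorithm may differ.
from collections import defaultdict

class PullThenJump:
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.pulls = defaultdict(int)  # remote node -> recent pull count

    def on_remote_fault(self, node: str) -> str:
        """Called when the running thread faults on a page held by
        `node`; returns the action the kernel should take."""
        self.pulls[node] += 1
        if self.pulls[node] >= self.threshold:
            self.pulls.clear()          # locality has shifted: move execution
            return f"jump-to:{node}"
        return f"pull-from:{node}"      # fetch just this one page
```

With a threshold of 3, two faults on node B pull pages individually; the third triggers a jump, after which the counts start over on the new node.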
Slide 15: Locality in a Single Thread
Diagram: temporal locality across two nodes' CPUs, memory, and swap.
Slide 16: Locality across Multiple Threads
Diagram: threads spread across several nodes' CPUs, memory, and swap.
Slide 17: Unlike DSM...
Slide 18: Exploiting Elastic Locality
Assumptions: replicate code pages, place data pages (vs. DSM). We need an adaptive page-clustering algorithm: LRU/NSWAP is "always pull"; ours (initial) pulls pages, then jumps.
Slide 19: Replicate Code, Distribute Data
Each node holds a full copy of the code but a unique copy of its data partition (vs. DSM); the execution context follows the data (vs. process migration). Animation sequence: accessing Data 1, then Data 2, then Data 1 again.
Slide 20: Benefits
The OS handles elasticity; apps don't change. Partition locality across multiple nodes. Useful for single (and multiple) threads. For multiple threads, seamlessly exploit network I/O and CPU parallelism.
Slide 21: Benefits (marked "delete")
The OS handles elasticity; the application ideally runs unmodified. The application is naturally partitioned: by page-access locality, by seamlessly exploiting multithreaded parallelism, and by intelligent page placement.
Slide 22: How should we place pages?
Slide 23: Execution Context Jumping: A Single-Thread Example
Diagram: over time, one process's execution context jumps back and forth between the address spaces on Node 1 and Node 2.
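The timeline on this slide can be mimicked with a toy model (entirely hypothetical; a real kernel would transfer registers, stack, and kernel state, not a dict): the thread's context moves to whichever node is home to the page it is touching, an "always jump" policy for illustration.

```python
# Toy model of execution-context jumping for a single thread: the
# context (program counter + current node) moves between nodes while
# the address space stays partitioned. Purely illustrative.
def run(trace, home_of):
    """trace: sequence of virtual pages touched.
    home_of: mapping from page to the node holding it.
    Returns the sequence of nodes the context occupied, jumping on
    every access to a page homed elsewhere ("always jump")."""
    context = {"pc": 0, "node": home_of[trace[0]]}
    visited = [context["node"]]
    for page in trace:
        if home_of[page] != context["node"]:
            context["node"] = home_of[page]   # jump: move context to the data
            visited.append(context["node"])
        context["pc"] += 1
    return visited
```

With pages 0-1 homed on Node 1 and pages 2-3 on Node 2, the trace [0, 1, 2, 3, 0] produces two jumps, matching the back-and-forth timeline in the diagram.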
Slide 24: "Stretch" a Process: Unified Address Space
Diagram: the process's address space spans Node 1 and Node 2; the page table's V and R bits are joined by an IP-address field.
Slide 25: Operating Systems Today
Resource limit = 1 node: a process is bounded by one OS's CPUs, memory, and disks.
Slide 26: Cloud Applications at Scale
A cloud manager and load balancer spawn more processes when more resources or more queries are needed; frameworks (e.g., MapReduce) handle the partitioned data.
Slide 27: Our Findings
There is an important tradeoff between data-page pulls and execution-context jumps, and the latency cost is realistic. In the worst case our algorithm degenerates to "always pull" (i.e., NSWAP), still with marginal improvements.
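One way to read that tradeoff (illustrative arithmetic only; the latency constants below are invented, not measurements from this work): pulling pays one page transfer per remote fault, while a jump pays a one-time context-transfer cost after which those accesses become local.

```python
# Illustrative pull-vs-jump cost comparison. All latencies are made-up
# constants for the sketch, not measured ElasticOS numbers.
PAGE_PULL_US = 150   # assumed cost to pull one remote page (microseconds)
JUMP_US = 400        # assumed one-time cost to ship the execution context

def pull_cost(remote_faults: int) -> int:
    """Stay put and pull every faulting page."""
    return remote_faults * PAGE_PULL_US

def jump_cost(remote_faults: int) -> int:
    """Jump once; the formerly remote pages are then local (~0 cost here)."""
    return JUMP_US

def better(remote_faults: int) -> str:
    return "jump" if jump_cost(remote_faults) < pull_cost(remote_faults) else "pull"
```

Under these assumed numbers, 2 remote faults favor pulling (300 vs. 400) while 3 or more favor jumping (450 vs. 400), which is why a threshold-based hybrid can beat either pure policy.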
Slide 28: Advantages
Natural groupings of threads and pages; align resources with the application's inherent parallelism; leverage existing mechanisms for synchronization.
Slide 29: "Stretch" a Process: Unified Address Space
A "stretched" process = a collection of pages + other resources, across several machines. The page table (V/R bits plus an IP-address field) maps pages onto each node's CPUs, memory, and swap.
Slide 30: Execution Context Follows Data (marked "delete")
Replicate code pages: they are read-only, so there is no consistency burden. Smartly distribute data pages; the execution context can jump, moving toward the data (the converse is also allowed).
Slide 31: Elasticity in Cloud Apps Today
Diagram: input data is partitioned (D1, D2, ..., Dx) across nodes (CPUs, memory, disk), producing output data.
Slide 32: Elasticity in Cloud Apps Today (continued)
Diagram: a load balancer spreads input queries across the partitions (D1, D2, ..., Dx, Dy).
Slide 33: Goals: Elasticity Dimensions (marked "delete")
Extend elasticity to memory, CPU, and I/O (network, storage).
Slide 34: Thank You
Slide 35: Bang Head Here!
Slide 36: Stretching a Thread
Slide 37: Overlapping Elastic Processes
Slide 38: *Code Follows Data*
Slide 39: Application Locality
Slide 40: Possible Animation?
Slide 41: Multinode Adaptive LRU
Slide 42: Possible Animation?
Slide 43: Open Topics
Fault tolerance; stack handling; dynamically linked libraries; locking.
Slide 44: Elastic Page Table

Virtual Addr  Phys. Addr  Valid  Node (IP addr)   Backing
A             B           1      localhost        local memory
C             D           0                       local swap space
E             F           1      128.138.60.1     remote memory
G             H           0                       remote swap
Slide 45: "Stretch" a Process
Move beyond the resource boundaries of ONE machine: CPU, memory, network, I/O.
Slide 46: Diagram: input data split into D1 and D2, processed on two nodes (CPUs, memory, disk) to produce output data.
Slide 47: Diagram: data partitions D1 and D2, each resident on its own node (CPUs, memory, disk).
Slide 48: Reinventing the Elasticity Wheel