Asaf Cidon, Tomer M. London

MARS: Adaptive Remote Execution Scheduler for Multithreaded Mobile Devices
Asaf Cidon*, Tomer M. London*, Sachin Katti, Christos Kozyrakis, Mendel Rosenblum
Stanford University
*Equal contributors
Hi everyone. Today I’m going to present MARS, a system that allows mobile devices to use the cloud as a system computing resource. My collaborator on this project is Tomer London, and we are advised by Sachin Katti, Christos Kozyrakis and Mendel Rosenblum, all of Stanford University.

New Class of Mobile Applications
Computer Vision, Augmented Reality, Motion Sensing
The computer of the future is the mobile device: it is personal, social and highly connected. Applications are the driver; the problem is the limited resources on the device. There is a growing demand for higher computing performance on mobile devices to support applications like computer vision, augmented reality and 3D games. The problem is that device CPUs are hitting the energy wall, so there is an inherent limit to the compute performance we can have on mobile devices.
The solution to this problem is to use the cloud. More specifically, we think the mobile device should view the cloud as a system resource (“cloud-on-chip”): a new type of CPU with a slower, less reliable interconnect. We designed MARS, a simple remote execution scheduler that resides natively on the device and can use this “cloud-on-chip” interface. We built a simulator for our system and measured network, power and performance for three applications. This research is meant as a proof of concept for dynamic remote execution; we intend to implement it.
October 23, 2011

Mobile Client Trends
Mobile CPU performance is increasing, but it is hitting the ‘energy wall’.
Can we improve performance and reduce energy consumption?
Opportunity: network bandwidth is increasing, so utilize the cloud.
[Figure: Maximum Bandwidth (Mb/s)]
Why CPUs are hitting the energy wall. There is an almost infinite amount of computing resources around us, almost for free, coupled with wireless network bandwidth that is increasing exponentially, if you take a look at Wi-Fi… The question is whether we can exploit the exponentially increasing network resources to offload tasks from the mobile phone. Consumer demand for performance on mobile devices: vision, graphics, games. Multicore devices, new architectures from QLCM, Nvidia, Intel, GPUs. Inherent limitation: power density and battery life. Client-server may help.

Static Client-Server Partitioning Doesn’t Work
Dynamic resources: network bandwidth and latency; available CPU and memory.
Same code, different platforms: smartphones (single-core, multi-core) and tablets.
Start off by saying this isn’t a new idea (for example, Google). The key point is that the partitioning is static, and that doesn’t work for remote execution in our context. Maybe show a static client-server partition that is in use now. It’s tough to statically partition your code between a mobile device and a server, because network and hardware resources vary and because mobile device capabilities span a wide range. Code written for an Android device can run on a three-year-old G1 and also on one of Samsung’s new tablets.

MARS: Adaptive Remote Execution
Opportunistically offload computations to a remote server: enhance computational capabilities, decrease energy consumption.
Make dynamic decisions: adapt to network and CPU variability.
MARS works because it’s dynamic.
[Figure: Mobile Device and Data Center]

Agenda
Design of MARS
Simulator
Results and Analysis
Conclusions

Existing Remote Execution Systems
[Figure: existing systems plotted by the unit of remote execution (VM, RPC) against the target of performance optimization (single-thread application, multi-threaded application, system). Systems shown: Cloudlets [Satyanarayanan et al. ‘09], CloneCloud [Chun et al. ‘11], MAUI [Cuervo et al. ‘10], Odessa [Ra et al. ‘11], Chroma [Balan et al. ‘03], and MARS “Cloud-on-Chip”.]
Say more about why VM is bad. Transition: the difference is that we do system-level performance optimizations.

Previous Systems: Application Partitioning. MARS “Cloud-on-Chip”: System Scheduling
[Figure: left, Process 1 with RPC 1 to RPC 5 partitioned between local execution and remote execution; right, RPCs from Process 1, Process 2 and Process 3 in one RPC queue feeding the local cores and the remote server.]
Emphasize “cloud on chip” both in speech and graphically. Communicate the complexity of optimization solvers and their lack of scalability: you have to run an optimization program for each application, and the results change if the device is multicore or if the resources change, which increases complexity and doesn’t scale.

Greedy Algorithm
Higher POR: better performance gain from offloading.
Higher EOR: better energy saving from offloading.
[Flowchart decision nodes: EOR ≥ 1/G ? and EOR < G ?]

Controller Algorithm
Priority queue, sorted by Performance Offload Rank (POR):
RPC 3 (POR 2.5)
RPC 5 (POR 1.9)
RPC 6 (POR 1.8)
RPC 4 (POR 1.3)
RPC 2 (POR 0.4)
Remote server available or local core available: check the EOR threshold.
[Figure: EOR axis with thresholds 1/G and G marking the Local, Remote and Both regions]
G (Greediness) trades off utilization and energy efficiency.
Greediness: as G → ∞, we don’t use EOR at all.
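The controller slide can be read as the following minimal Python sketch. It is one interpretation, not the authors' implementation: the slides do not say which end of the queue serves a free local core (the sketch assumes the lowest-POR end) or what happens when the chosen RPC fails its EOR check (the sketch assumes the scheduler moves on to the next candidate). The POR values are the ones on the slide; the EOR values and the names Rpc, pick_remote and pick_local are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rpc:
    name: str
    por: float  # Performance Offload Rank (values taken from the slide)
    eor: float  # Energy Offload Rank (hypothetical values for illustration)


def pick_remote(queue, g):
    """Remote server free: take the highest-POR RPC with EOR >= 1/G."""
    for rpc in sorted(queue, key=lambda r: r.por, reverse=True):
        if rpc.eor >= 1.0 / g:
            return rpc
    return None


def pick_local(queue, g):
    """Local core free: take the lowest-POR RPC with EOR < G (assumed order)."""
    for rpc in sorted(queue, key=lambda r: r.por):
        if rpc.eor < g:
            return rpc
    return None


queue = [
    Rpc("RPC 3", 2.5, 0.3),
    Rpc("RPC 5", 1.9, 1.0),
    Rpc("RPC 6", 1.8, 3.0),
    Rpc("RPC 4", 1.3, 1.2),
    Rpc("RPC 2", 0.4, 2.5),
]

print(pick_remote(queue, 2.0).name)           # RPC 5: RPC 3 fails EOR >= 0.5
print(pick_local(queue, 2.0).name)            # RPC 4: RPC 2 fails EOR < 2
print(pick_remote(queue, float("inf")).name)  # RPC 3: EOR check never fails
print(pick_local(queue, float("inf")).name)   # RPC 2: EOR check never fails
```

With G = 2 the EOR checks veto the first candidate on each side; as G grows, both checks always pass and the choice is driven by POR alone, which matches the note that G → ∞ ignores EOR.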


Remote Execution Applications
[Figure: two application pipelines, Augmented Reality and Face Recognition, processing a stream of pictures through stages that include Barcode Detection, Rendering and Recognition]
Here are two applications. Some of their parts are computationally intensive (like face recognition and detection), and the programmer marked them…

Simulator Methodology
Trace-driven simulation
Clients: Nokia N900 (single core), NVIDIA Tegra 250 (multicore)
Server: Amazon EC2 Opteron 2007
Networks: outdoors Wi-Fi, indoors Wi-Fi, 3G
June 4, 2011

MARS vs. Static Policies
Lower is better

Nokia N900 Power Consumption
                                 Wi-Fi             3G
Idle Network Power               1.31 Watts        0.66 Watts
Upload Network Power             [missing] Watts   2.36 Watts
Download Network Power           1.39 Watts        2.26 Watts
Upload Network Power Overhead    10.51%            72.03%
In Wi-Fi you need to maximize utilization. MARS made the right choice not to fully utilize remote execution.
Wi-Fi: performance and energy are highly correlated.
3G: performance and energy trade off against each other.
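The slide does not define “Upload Network Power Overhead”. One reading that reproduces the 3G column is the share of upload power that exceeds idle power, (upload − idle) / upload. The short Python check below works under that assumption; the function name upload_overhead is hypothetical, and the formula is inferred from the 3G numbers rather than stated in the talk.

```python
def upload_overhead(idle_watts, upload_watts):
    """Fraction of upload power above the idle baseline (assumed definition)."""
    return (upload_watts - idle_watts) / upload_watts


# 3G figures from the table: idle 0.66 W, upload 2.36 W.
print(f"{upload_overhead(0.66, 2.36):.2%}")  # 72.03%
```

Under this reading, most of the 3G upload power is extra cost on top of idle, which fits the note that 3G forces a trade-off between performance and energy.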

Same Application, Different Networks
Talk about the fact that these are realistic traces. MARS decided to be conservative and hardly offloaded any tasks.

Remote Execution with Multicore


Conclusions
Can’t always be greedy: performance and energy trade off.
MARS is optimized for multiple parallel applications and cores.
MARS “Cloud-on-Chip”: validation of system-level remote execution scheduling.
57% performance increase, 33% energy savings.
Takeaways, not conclusions: can’t just be greedy (important even for multicore); performance gains; vision: cloud on chip.
