Improving and Controlling User-Perceived Delays in Mobile Apps Lenin Ravindranath By the time it loads, the church service is over. Too slow!!! Slow responsiveness,

Slides:



Advertisements
Similar presentations
Multi-DNC Data Collection/Monitoring
Advertisements

University of Michigan Electrical Engineering and Computer Science Anatomizing Application Performance Differences on Smartphones Junxian Huang, Qiang.
Trace Analysis Chunxu Tang. The Mystery Machine: End-to-end performance analysis of large-scale Internet services.
AppInsight: Mobile App Performance Monitoring in the Wild
Automatic and Scalable Fault Detection for Mobile Applications MobiSys’ 14 Presented by Haocheng Huang
VanarSena: Automated App Testing. App Testing Test the app for – performance problems – crashes Testing app in the cloud – Upload app to a service – App.
Automatic and Scalable Fault Detection for Mobile Applications Lenin Ravindranath, Suman Nath, Jitu Padhye, Hari Balakrishnan.
Tracking & Login Data persistence User tracking.
Identifying Performance Bottlenecks in CDNs through TCP-Level Monitoring Peng Sun Minlan Yu, Michael J. Freedman, Jennifer Rexford Princeton University.
Scaling Distributed Machine Learning with the BASED ON THE PAPER AND PRESENTATION: SCALING DISTRIBUTED MACHINE LEARNING WITH THE PARAMETER SERVER – GOOGLE,
Web Caching Schemes1 A Survey of Web Caching Schemes for the Internet Jia Wang.
Tools for Investigating Graphics System Performance
Precept 3 COS 461. Concurrency is Useful Multi Processor/Core Multiple Inputs Don’t wait on slow devices.
Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.
A Distributed Proxy Server for Wireless Mobile Web Service Kisup Kim, Hyukjoon Lee, and Kwangsue Chung Information Network 2001, 15 th Conference.
Rensselaer Polytechnic Institute CSC 432 – Operating Systems David Goldschmidt, Ph.D.
Loupe /loop/ noun a magnifying glass used by jewelers to reveal flaws in gems. a logging and error management tool used by.NET teams to reveal flaws in.
What Can You do With BTM? Business Transaction Management touches the following disciplines:  Performance Management  Application Management  Capacity.
IPv6 end client measurement George Michaelson
Junxian Huang 1 Feng Qian 2 Yihua Guo 1 Yuanyuan Zhou 1 Qiang Xu 1 Z. Morley Mao 1 Subhabrata Sen 2 Oliver Spatscheck 2 1 University of Michigan 2 AT&T.
Timecard: Controlling User-Perceived Delays in Server-Based Mobile Applications Lenin Ravindranath, Jitu Padhye, Ratul Mahajan, Hari Balakrishnan.
WebQuilt and Mobile Devices: A Web Usability Testing and Analysis Tool for the Mobile Internet Tara Matthews Seattle University April 5, 2001 Faculty Mentor:
Presented by Tao HUANG Lingzhi XU. Context Mobile devices need exploit variety of connectivity options as they travel. Operating systems manage wireless.
ALBERT PARK EEL 6788: ADVANCED TOPICS IN COMPUTER NETWORKS Energy-Accuracy Trade-off for Continuous Mobile Device Location, In Proc. of the 8th International.
File Systems and N/W attached storage (NAS) | VTU NOTES | QUESTION PAPERS | NEWS | VTU RESULTS | FORUM | BOOKSPAR ANDROID APP.
Xiaoyu Tong and Edith C.-H. Ngai Dept. of Information Technology, Uppsala University, Sweden A UBIQUITOUS PUBLISH/SUBSCRIBE PLATFORM FOR WIRELESS SENSOR.
 Zhichun Li  The Robust and Secure Systems group at NEC Research Labs  Northwestern University  Tsinghua University 2.
Characterizing and Modeling the Impact of Wireless Signal Strength on Smartphone Battery Drain Ning Ding Xiaomeng Chen Abhinav Pathak Y. Charlie Hu 1 Daniel.
Global NetWatch Copyright © 2003 Global NetWatch, Inc. Factors Affecting Web Performance Getting Maximum Performance Out Of Your Web Server.
Maintaining Performance while Saving Energy on Wireless LANs Ronny Krashinsky Term Project
Explain the purpose of an operating system
Ideas to Improve SharePoint Usage 4. What are these 4 Ideas? 1. 7 Steps to check SharePoint Health 2. Avoid common Deployment Mistakes 3. Analyze SharePoint.
©2010 John Wiley and Sons Chapter 12 Research Methods in Human-Computer Interaction Chapter 12- Automated Data Collection.
Turducken: Hierarchical Power Management for Mobile Devices Jacob Sorber, Nilanjan Banerjee, Mark Corner, Sami Rollins University of Massachusetts, Amherst.
Chapter 34 Java Technology for Active Web Documents methods used to provide continuous Web updates to browser – Server push – Active documents.
Procrastinator: Pacing Mobile Apps’ Usage of the Network mobisys 2014.
AjaxScope & Doloto: Towards Optimizing Client-side Web 2.0 App Performance Ben Livshits Microsoft Research (joint work with Emre.
Timecard: Controlling User-Perceived Delays in Server-Based Mobile Applications Lenin Ravindranath, Jitu Padhye, Ratul Mahajan, Hari Balakrishnan.
Rethinking Energy-Performance Trade-Off in Mobile Web Page Loading
OPERETTA: An Optimal Energy Efficient Bandwidth Aggregation System Karim Habak†, Khaled A. Harras‡, and Moustafa Youssef† †Egypt-Japan University of Sc.
Developer TECH REFRESH 15 Junho 2015 #pttechrefres h Understand your end-users and your app with Application Insights.
Scheduling Lecture 6. What is Scheduling? An O/S often has many pending tasks. –Threads, async callbacks, device input. The order may matter. –Policy,
AppInsight: Mobile App Performance Monitoring In The Wild Lenin Ravindranath, Jitu Padhye, Sharad Agarwal, Ratul Mahajan, Ian Obermiller, Shahin Shayandeh.
Suman Nath, Microsoft Research Felix Xiaozhu Lin, Rice University Lenin Ravindranath, MIT Jitu Padhye, Microsoft Research.
End-to-End Performance Analytics For Mobile Apps Lenin Ravindranath, Jitu Padhye, Ratul Mahajan Microsoft Research 1.
Basics of testing mobile apps
© 2015 AT&T Intellectual Property. All rights reserved. AT&T, the AT&T logo and all other marks contained herein are trademarks of AT&T Intellectual Property.
Improving and Controlling User-Perceived Delays in Mobile Apps Lenin Ravindranath By the time it loads, the church service is over. Too slow!!! Slow responsiveness,
FriendFinder Location-aware social networking on mobile phones.
Randy Pagels Sr. Developer Technology Specialist DX Team (Developer Experience and Evangelism) Application Insights Availability, Performance and Usage.
Thread basics. A computer process Every time a program is executed a process is created It is managed via a data structure that keeps all things memory.
Emir Halepovic, Jeffrey Pang, Oliver Spatscheck AT&T Labs - Research
ASP-2-1 SERVER AND CLIENT SIDE SCRITPING Colorado Technical University IT420 Tim Peterson.
Performance Testing Test Complete. Performance testing and its sub categories Performance testing is performed, to determine how fast some aspect of a.
Detecting, Managing, and Diagnosing Failures with FUSE John Dunagan, Juhan Lee (MSN), Alec Wolman WIP.
WHAT'S THE DIFFERENCE BETWEEN A WEB APPLICATION STREAMING NETWORK AND A CDN? INSTART LOGIC.
A Software Energy Analysis Method using Executable UML for Smartphones Kenji Hisazumi System LSI Research Center Kyushu University.
Gorilla: A Fast, Scalable, In-Memory Time Series Database
Threads vs. Events SEDA – An Event Model 5204 – Operating Systems.
SmartAds: Bringing Contextual Ads to Mobile Apps
Week 01 Comp 7780 – Class Overview.
AppInsight: Mobile App Performance Monitoring in the Wild
Short Circuiting Memory Traffic in Handheld Platforms
Serverless CQRS in Azure!
Capriccio – A Thread Model
ECF: an MPTCP Scheduler to Manage Heterogeneous Paths
Chapter 12: Automated data collection methods
International Symposium on Microarchitecture. New York, NY.
PredictRemainingTime
Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing Zaharia, et al (2012)
Presentation transcript:

Improving and Controlling User-Perceived Delays in Mobile Apps Lenin Ravindranath By the time it loads, the church service is over. Too slow!!! Slow responsiveness, long load time. So slow. Did an intern write this app?? Slower than a snail. horrifically slow. Uninstalled. MIT So slow that my screen already dims before it's loaded

~ Two Million Apps > 500,000 Developers

Too slow - killing the usefulness when you really need to go

“So slow. Did an intern write this app??” “Slower than a snail even on 3G.” “Slow and unresponsive like mud” “Sluggish and freezes my HTC phone.” “Very very slow compared to even browsing web.” “Consistently 3 seconds behind where I touch.” “Loading GPS data is *** slow”

Diverse environmental conditions – Network connectivity, GPS signal quality, – Location, Sensor conditions etc. Variety of hardware and OS versions Wide range of user interactions and inputs Hard for developers to test Performance problems are inevitable in the wild Significant diversity in the wild

App store is brutally competitive Need for improving and controlling performance Response time matters – Users are impatient – 100ms delay can cost substantial drop in revenue at Amazon – Similar observations from Google and Bing Help app developers

Find a bar

User InteractionRendering User-perceived delay Monitor in the hands of user Improve and control the user-perceived delay

User InteractionRendering AppInsight OSDI ‘12 Monitor performance in the wild What is the user-perceived delay? Where is the bottleneck? Developer Feedback Developer Feedback Users with slow network – Poor Performance Users with fast network – Not producing the best quality result

User InteractionRendering AppInsight OSDI ‘12 Monitor performance in the wild What is the user-perceived delay? Where is the bottleneck? Developer Feedback Developer Feedback Deal with uncontrolled variability

User InteractionRendering AppInsight OSDI ‘12 Monitor performance in the wild What is the user-perceived delay? Where is the bottleneck? Developer Feedback Developer Feedback Timecard SOSP ‘13 Adapt processing to conditions in the wild How much time has already been spent? How much time will be spent? Runtime Estimation Runtime Estimation Adapt High delays Reduce Tightly control user-perceived delay

User InteractionRendering AppInsight OSDI ‘12 Monitor performance in the wild What is the user-perceived delay? Where is the bottleneck? Developer Feedback Developer Feedback Timecard SOSP ‘13 Adapt processing to conditions in the wild How much time has already been spent? How much time will be spent? Runtime Estimation Runtime Estimation Adapt Tightly control user-perceived delay Low delays Improve result

User InteractionRendering AppInsight OSDI ‘12 Monitor performance in the wild What is the user-perceived delay? Where is the bottleneck? Developer Feedback Developer Feedback Timecard SOSP ‘13 Adapt processing to conditions in the wild How much time will be spent? Runtime Estimation Runtime Estimation How much time has already been spent?

AppInsight Timecard Minimal developer effort Readily deployable - No changes to the OS or runtime Negligible overhead Product Impact at Microsoft, Facebook  Used by dozens of developers

AppInsight Timecard App Instrumented Automatic Binary Instrumentation User Transaction Tracking Monitor app execution across async boundaries in a light-weight manner

AppInsight Monitor performance in the wild What is the user-perceived delay? Where is the bottleneck? Developer Feedback Developer Feedback Significant barrier for most app developers No platform support Only option is to instrument your app – Manage your own logging infrastructure

Users App Instrumenter App Instrumented Analysis Developer Feedback Developer Feedback Server Traces Performance Improved Developer AppInsight App Store

Highly interactive, UI centric – Single UI thread that should not be blocked Most tasks are performed asynchronously – Asynchronous APIs for Network, Sensor, IO etc. – Computation performed on background threads Highly Asynchronous Programming Pattern Monitoring App Performance is Challenging Tracing async code is challenging

Internet Location Mood Around Me Nearby Tweets GPS sad “Just got engaged :)” “My life sucks” “At last its sunny. But I have to sit and prepare my talk :(” “I am sad that there is no more how I met your mother”

ClickHandler() { location = GPS.Sample(); url = GetUrl(location) tweets = Http.Get(url); result = Process(tweets); UI.Display(result); } GetUrl(location) {... } Process(tweets) {... } ClickHandler Start ClickHandler End LogStart(); LogEnd(); Thread User-perceived Delay Hypothetical Synchronous Code User Interaction Mood Around Me sad

ClickHandler() { GPS.AsyncSample(GPSCallback); } GPSCallback(location) { url = GetUrl(location); Http.AsyncGet(url, DownloadCallback); } DownloadCallback(tweets) { result = Process(tweets); UI.Dispatch(Render, result); } Render(result) { UI.Display(result); } GetUrl(location){... } Process(tweets){... } Render UI Thread Background Thread UI Dispatch ClickHandler Start System Process Download Callback User Interaction Asynchronous Code Background Thread GPS Network ClickHandler End Async Http Call Async GPS Call GPS Callback

Render UI Thread UI Dispatch ClickHandler Start System Async GPS Call Download Callback User Interaction Background Thread Network Async Http Call GPS Callback GPS Background Thread User Transaction Transaction timeUser-perceived Delay

UI Thread Background Thread User Transaction

Apps are highly asynchronous Where is the bottleneck? Focus development efforts 30 popular apps 200,000 transactions – On average, 10 asynchronous calls per user transaction Up to 7000 edges – There are apps with 8 parallel threads per transaction

UI Thread Background Thread Twitter Thread WakeupThread Blocked Fire Render Callback Nearby Tweets Nearby Posts Twitter Facebook Process Posts Process Tweets Facebook User Interaction Aggregate GPS GPS Callback Mood Around Me

UI Thread Background Thread Thread WakeupThread Blocked Fire Render User Transaction User Interaction Aggregate GPS GPS Callback Critical Path Optimizing the critical path reduces the user perceived delay Callback Facebook Callback Twitter Facebook Process Posts Process Tweets

Users App Instrumenter App Instrumented Analysis Developer Feedback Developer Feedback Server Traces Developer AppInsight App Store

Low Overhead Capturing User Transactions UI Thread Background Thread User Transaction Instrumentation impacts app performance – They are already slow enough!!! Scarce Resources – Compute, Memory, Network, Battery

User Interactions Capture UI Thread Background Thread User Transaction User Interaction Event Handler

User Interactions Thread Execution Capture UI Thread Background Thread User Transaction ClickHandler Start ClickHandler End Render Start DownloadCallback Start End GPSCallback Start GPSCallback End Render End

User Interactions Thread Execution Async Calls and Callbacks Capture UI Thread Background Thread User Transaction UI Dispatch Call Download Callback Async GPS Call GPS Callback Async Http Call UI Dispatcher Callback

User Interactions Thread Execution Async Calls and Callbacks UI Updates Capture UI Thread Background Thread User Transaction UI Update

UI Thread Background Thread User Interactions Thread Execution Async Calls and Callbacks UI Updates Thread Synchronization Fire Thread Blocked Thread Wakeup Capture User Transaction

User Interactions Thread Execution Async Calls and Callbacks UI Updates Thread Synchronization Capture UI Thread Background Thread User Transaction

Trace every method – Prohibitive overhead Enough to log thread boundaries Log entry and exit of Upcalls Capturing Thread Execution Upcalls System App UI Thread Background Thread ClickHandler Start ClickHandler End Render Start End GPSCallback Start GPSCallback End Render End DownloadCallback Start

ClickHandler() { GPS.AsyncSample(GPSCallback); } GPSCallback(location) { url = GetUrl(location); Http.AsyncGet(url, DownloadCallback); } DownloadCallback(tweets) { result = Process(tweets); UI.Dispatch(Render, result); } Render(result) { UI.Display(result); } Event Handlers are Upcalls Function pointers point to potential Upcalls – Callbacks are passed as function pointers Identify Upcalls

Instrument Upcalls Rewrite app – Trace Upcalls ClickHandler() { Logger.UpcallStart(1); GPS.AsyncSample(GPSCallback); Logger.UpcallEnd(1); } GPSCallback(location) { Logger.UpcallStart(2); url = GetUrl(location); Http.AsyncGet(url, DownloadCallback); Logger.UpcallEnd(2); } DownloadCallback(tweets) { Logger.UpcallStart(3); result = Process(tweets); UI.Dispatch(Render, result); Logger.UpcallEnd(3); } Render(result) { Logger.UpcallStart(4); UI.Display(result); Logger.UpcallEnd(4); } GetUrl(location){... } Process(tweets){... } UI Thread Background Thread ClickHandler Start ClickHandler End Render Start DownloadCallback Start End GPSCallback Start GPSCallback End Render End Low Overhead 7000 times less overhead

UI Thread Background Thread UI Dispatch Call Download Callback Async GPS Call GPS Callback Async Http Call UI Dispatcher Callback Async Calls and Callbacks

Log Callback Start – We capture start of the thread Log Async Call Match Async Call to its Callback GPSCallback(location) { Http.AsyncGet(url, DownloadCallback); Logger.AsyncCall(5); } DownloadCallback(tweets) { Logger.UpcallStart(3);.... } Download Callback Async Http Call Detour Callbacks

System App Http.AsyncGet Async Call DownloadCallback DownloadCallback(tweets) { } Callbacks

Detour Callbacks DetourCallback(tweets) { DownloadCallback(tweets); } class DetourObject { } MatchId = 3 obj.DetourCallback MatchId = 3 Http.AsyncGet(url, DownloadCallback); Http.AsyncGet(url, obj.DetourCallback); obj = new DetourObject(DownloadCallback, MatchId++); System App Http.AsyncGet Async Call DownloadCallback DownloadCallback(tweets) { } while(...) { }

Detour Callbacks DetourCallback(tweets) { DownloadCallback(tweets); } class DetourObject { } MatchId = 4 obj.DetourCallback MatchId = 4 obj = new DetourObject(DownloadCallback, MatchId++); System App Http.AsyncGet Async Call DownloadCallback(tweets) { } while(...) { } Http.AsyncGet(url, obj.DetourCallback);

Detour Callbacks DetourCallback(tweets) { DownloadCallback(tweets); } class DetourObject { } MatchId = 5 obj.DetourCallback MatchId = 5 obj = new DetourObject(DownloadCallback, MatchId++); System App Http.AsyncGet Async Call DownloadCallback(tweets) { } while(...) { } Http.AsyncGet(url, obj.DetourCallback);

Detour Callbacks Delayed Callbacks – Track Object Ids Event Subscriptions Async Calls and Callbacks Low Overhead void foo() { } Thread t = new Thread(foo);... t.Start();

User Interactions Thread Execution Async Calls and Callbacks UI Updates Thread Synchronization Capture UI Thread Background Thread User Transaction

Keeping the overhead low Use app idle time – Garbage collect detour objects during idle time – Log to memory, write to disk during idle time Compress log before sending to server – Delta encoding to achieve high compression Background transfer after app quits

Users App Instrumenter App Instrumented Analysis Developer Feedback Developer Feedback Server Traces Developer AppInsight App Store

Users App Instrumenter App Instrumented Analysis Developer Feedback Developer Feedback Server Traces Developer AppInsight App Store

Aggregate Analysis Group similar transactions – Same transaction graph Outliers – Points to corner cases Highlight common critical paths – Focus development effort Root causes of performance variability – Highlight what really matters in the wild Async GPS CallAsync Http Call Network GPS Device Compute

Users App Instrumenter App Instrumented Analysis Developer Feedback Developer Feedback Server Traces Developer AppInsight App Store

Developer Feedback Web based Interface Long User Transactions Critical Path Aggregate Analysis – Outliers – Common Case – Factors affecting

Users App Instrumenter App Instrumented Analysis Developer Feedback Developer Feedback Server Traces Developer Deployment AppInsight App Store

Deployment 40 apps in the Windows Phone store 30 to few thousand users 6 months to 1 year of data > Million user transactions

AppInsight Overhead Battery <1% Binary Size 1.2% Memory 2% Network 4% Compute 0.02% Low Overhead Impact on app performance is minimal Low resource consumption Based on 6 months of data from 30 apps in the wild

User Transactions and Critical Path 15% of the user transactions took more than 5 seconds! – 30% took more than 2 seconds Top 2 edges responsible for 82% of transaction time Critical path analysis – Focus on less than half the edges for 40% of transactions Aggregate analysis – Pointed to factors creating performance variability – GPS, Device, Network, Sensors Focus development efforts

App Instrumenter App Instrumented Analysis Developer Feedback Developer Feedback Server Traces AppInsight Developer Improved Users App Store

App: TwitReviews Problem: UI hog AppInsight: High performance variability in UI thread Abnormal latencies only at the start of the session System loading DLLs in the critical path – Up to 1 second Fix: Preload the DLL asynchronously when the app starts – Dummy call to DLL Developer Case Study

App: TwitReviews Problem: Radio wakeup delay AppInsight: High performance variability before network call Radio wakeup delay in critical path Dominating when users are on cellular networks – More than 2 seconds Fix: Wakeup radio as the user starts typing before invoking the transactions Developer Case Study

App: Bing App Problem: Slow transactions AppInsight: Custom instrumentation in the critical path – Affecting user perceived delay

App: OneBusAway Problem: Slow transactions AppInsight: Aggregate analysis showed cellular latencies significantly affecting critical paths – Majority of transactions happens when user is in cellular – Known issue but no quantitative data – Current caching policy insufficient Fix: Change caching policy? Developer Case Study

Users App Instrumenter App Instrumented Analysis Developer Feedback Developer Feedback Server Traces AppInsight Developer App Store

Server-based User Transactions Core processing of the user transaction is at the server Core processing Server Input and sensors User Interaction App Send request Network Parse and render Receive response App Network

GPS Start Http Request GPS Callback Click Handler UI Dispatcher Download Callback Server Network + Server Joint app-server instrumentation – Azure services Server-based User Transactions Request Handler Start Request Handler End Network

Server Threads Spawn Workers Send Response Request Handler GPS Start Http Request Click Handler Download Callback GPS Callback UI Dispatcher Server-based User Transactions Identify and fix bottlenecks at the app and the server Network Server Network HTTP[“x-AppInsight-Id”]

Server User Interaction Rendering App Network Uncontrollable and Variable Delays Reading sensors Radio state Device type (3G, 4G, WiFi, LTE,..) RTT, tput, TCP, DNS Device type

Server User Interaction Rendering App Network Uncontrollable and Variable Delays Reading sensors Radio state Device type (3G, 4G, WiFi, LTE,..) RTT, tput, TCP, DNS Device type Significant variability in user-perceived delays

Server User Interaction Rendering App Network Uncontrollable and Variable Delays Reading sensors Radio state Device type (3G, 4G, WiFi, LTE,..) RTT, tput, TCP, DNS Device type Significant variability in user-perceived delays Users with high uncontrollable delays – Poor user-perceived delays Users with low uncontrollable delays – Not producing the best quality result

Server User Interaction Rendering App Network Uncontrollable and Variable Delays Reading sensors Radio state Device type (3G, 4G, WiFi, LTE,..) RTT, tput, TCP, DNS Device type Tightly control user-perceived delays Adapt to conditions in the wild

Server User Interaction Rendering App Network Adapt to conditions in the wild Adapt Do less or more processing Send less or more data Meet end-to-end deadline Trade-off on quality or quantity of the result Tightly control user-perceived delays

Request Response Deadline Server processing Server Controlling the Server Delay

Request Response Deadline Worker Server Controlling the Server Delay

Request Response Deadline Worker Better result Server Controlling the Server Delay

Deadline Server Server processing Controlling the Server Delay Servers keep fixed deadlines – No visibility into external delays

Server User Interaction Rendering App Network Adapt to conditions in the wild Tightly control user-perceived delays Adapt

Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Time elapsed since user interaction Predicted downlink & app processing delay

Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Adapt Processing Time Desired user-perceived delay

Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Adapt Processing Time Trade-off on quality of the result Desired user-perceived delay

Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Adapt Response PredictRemainingTime(10KB); PredictRemainingTime(5KB) PredictRemainingTime(15KB); Desired user-perceived delay

Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Adapt Response Desired user-perceived delay

Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Better quality result Desired user-perceived delay

Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Analysis of 4000 apps – 80% of them are single request-response transactions

Challenges Server App

Challenges UI Thread Background Thread GPS Start Http Request GPS Callback Click Handler Server App UI Dispatcher Download Callback App Highly asynchronous

Server Threads Challenges Spawn Workers Send Response Server Request Handler UI Thread Background Thread GPS Start Http Request GPS Callback Click Handler App UI Dispatcher Download Callback App Highly asynchronous

Challenges Server App GetElapsedTime(); No single reference clock Variable network delays and processing delays PredictRemainingTime (responseSize); Transaction Highly asynchronous

Timecard Server App GetElapsedTime(); PredictRemainingTime (responseSize); Transaction Transaction Tracking TimeSync Delay Predictors

Timecard App Instrumenter Service Developer Timecard.dll GetElapsedTime(); PredictRemainingTime (responseSize); Desired user-perceived delay App Instrumented App Store

Timecard Server App GetElapsedTime(); PredictRemainingTime (responseSize); Transaction Transaction Tracking TimeSync Delay Predictors

Server Threads Spawn Workers Send Response Server Request Handler Transaction UI Thread Background Thread GPS Start Http Request GPS Callback Click Handler App UI Dispatcher Download Callback App Transaction Tracking

Server Threads Spawn Workers Send Response Server Request Handler UI Thread Background Thread GPS Start Http Request GPS Callback Click Handler App UI Dispatcher Download Callback App Transaction Tracking TC Transaction context (TC) attached to every thread – Carry along timestamps, transaction information TC GetElapsedTime();

Timecard Server App GetElapsedTime(); PredictRemainingTime (responseSize); Transaction Transaction Tracking TimeSync Delay Predictors

TimeSync GPS Sync with Cell Tower – Off my several seconds to minutes GPS – Does not work indoors Probing Probe

Typical probing techniques don’t work well – Radio wakeup delays and queuing delays – Errors more than 100ms – Energy inefficient (network radio tail energy) TimeSync Radio-aware, network-aware probing – Probes when radio is active and network is idle

TimeSync in Timecard Server App

TimeSync in Timecard Probe UI Thread Background Thread GPS Start Http Request GPS Callback Click Handler App ~5ms error Energy efficient

Timecard App Server GetElapsedTime(); Time elapsed since user interaction

Predicting Remaining Time App Server PredictRemainingTime (responseSize);  Downlink delay  App processing delay

Predicting Downlink Delay Latency Throughput Loss rate TCP window state Most transfers are short – Median – 3KB – 90% percentile – 37KB Analysis of 4000 apps Cellular networks – High-bandwidth, rare-loss, high-latency, 99% transfers over HTTP – TCP window state matters -> multiple RTTs Response size

Predicting Downlink Delay TCP Window 1 extra RTT

Predicting Downlink Delay Response size Latency TCP parameters Determined by  RTT  Number of round trips

Predicting Downlink Delay Use recent estimate of RTT – Use time sync probes Read TCP window state Read TCP window state at server Middlebox Do not work with middleboxes – Split-TCP connection TCP window state at the middlebox matters – No easy way to estimate TCP window state Build an empirical data-driven model RTT Number of round trips

Predicting Downlink Delay PredictRemainingTime (responseSize); Data-driven model – Response size – Recent RTT – Network provider – Bytes already transferred in the TCP connection o Proxy for TCP window state at the middlebox Middlebox Downlink Delay Predictor Cellular Median error - 17 ms, 90 th percentile - 86 ms Wi-Fi Median error - 12 ms, 90 th percentile - 31ms Learn

Predicting App Processing Delay App Server PredictRemainingTime (responseSize); Parsing and rendering delay – Correlated with response size – Correlated with device type

Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Time elapsed since user interaction Predicted downlink & app processing delay

Mobile Ads Service Contextual ads to mobile apps

Mobile Ads Service Fetch and process ads for keywords Extract keywordsRender ad Send keywords Ad

Mobile Ads Service Extract keywordsRender ad Send keywords Ad Ad provider A Keywords

Mobile Ads Service Extract keywordsRender ad Send keywords Ad

Mobile Ads Service Extract keywordsRender ad Send keywords Ad GetElapsedTime(); PredictRemainingTime (responseSize); Adapt Processing Time

Mobile Ads Service Within 50ms of the target delay 90% of the time With Timecard Without Timecard Target delay User-perceived delay (ms) ~200,000 transactions

Twitter Analysis App and Service Adapt server processing time in steps of 150ms Adapt response from 10KB to 60KB With Timecard Without Timecard Target delay User-perceived delay (ms)

Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Adapt Resources Used Desired user-perceived delay

Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Request Prioritization

AppInsight Timecard  Identify performance bottlenecks o Critical path analysis  Control user-perceived delays o Elapsed time estimation o Remaining time prediction  User transaction tracking o Asynchrony, Time Sync

By the time it loads, the church service is over. Too slow!!! Slow responsiveness, long load time, and crashity crash crash. horrifically slow. Uninstalled. Crashes every time. DO NOT DOWNLOAD. RIP-OFF!! Slow and Crashes often. Fix. I will give 5 stars instead of 1. So slow that my screen already dims before it's loaded Too slow - killing the usefulness when you really need to go Crashes all the time. Should be called Crashing News. Crashed 3 times before I gave up Slow or no network crashes the app immediately.

UI Thread Background Thread Exception at GetSentiment() DownloadCallback() Stack trace GetSentiment() … DownloadCallback() Http.AsyncGet(url) GPSCallback() ClickHandler() GPS.AsyncSample() User Interaction Exception Path App Crashes App Store Slow and Crashes often. DO NOT DOWNLOAD. RIP-OFF!! Need to uncover crashes before the app is released

VanarSena MobiSys ‘13 App testing as a service Upload app to our service After a short time, obtain a detailed report of potential crashes Fix issues before submitting to app store Key techniques: Binary instrumentation and user transaction tracking

1.Top 10% crash buckets cover 90% of the crashes – Crash bucket: Unique exception type and system call Built on insights from real-world crash reports 3.Dominant root causes affect many different execution paths in the app 2.Most crashes due to externally-inducible root causes such as network failures Analysis of 25 million Windows Phone crash reports – from over 100,000 Windows Phone apps reported in 2012 VanarSena tests apps for common faults… … that are externally inducible… … as thoroughly as possible…

Dynamic testing using an army of monkeys Greybox testing – Binary instrumentation of app binaries To obtain detailed insight into app runtime behavior – Significantly reduces testing time VanarSena Fast UI automation – 4 – 40x reduction in testing time Efficient fault injection 38 sec vs. ~1 hour

~3,000 unique crashes in 1,100 apps – Unique exception type + stack trace Note that, these are apps in the App Store – Presumably well tested Compared to Windows Phone crash database 19 out of 20 top exceptions covered Key Results Ongoing Product Impact at Microsoft

Apps OS Sensors Network VanarSena MobiSys ‘14 VanarSena MobiSys ‘14 AppInsight OSDI ‘12 AppInsight OSDI ‘12 Timecard SOSP ‘13 Timecard SOSP ‘13 Procrastinator MobiSys ‘14 Procrastinator MobiSys ‘14 Wireless ESP NSDI ‘11 Wireless ESP NSDI ‘11 VTrack SenSys ‘09 VTrack SenSys ‘09 CTrack NSDI ‘11 CTrack NSDI ‘11 SmartAds MobiSys ‘13 SmartAds MobiSys ‘13 CodeInTheAir HotMobile ‘12 CodeInTheAir HotMobile ‘12 App—aware Operating Systems Nectar OSDI ‘10 Nectar OSDI ‘10 Wearables Mobile Support Best Paper Best Paper Nominee

Backup

0.1% runtime overhead Less than 1% memory overhead Less than 1% network overhead Negligible battery overhead Timecard Overhead

VanarSena MobiSys ‘13 Fast UI automation – Based on transaction tracking – 4x - 40x reduction in testing time App testing as a service Automatic dynamic testing – Using an army of monkeys Uncovered ~3000 crash bugs in 1100 apps in the app store