Improving and Controlling User-Perceived Delays in Mobile Apps
Lenin Ravindranath

Presentation transcript:

1 Improving and Controlling User-Perceived Delays in Mobile Apps
Lenin Ravindranath, MIT
“By the time it loads, the church service is over.” “Too slow!!!” “Slow responsiveness, long load time.” “So slow. Did an intern write this app??” “Slower than a snail.” “horrifically slow. Uninstalled.” “So slow that my screen already dims before it's loaded”

2

3

4 ~ Two Million Apps > 500,000 Developers

5 Too slow - killing the usefulness when you really need to go

6 “So slow. Did an intern write this app??” “Slower than a snail even on 3G.” “Slow and unresponsive like mud” “Sluggish and freezes my HTC phone.” “Very very slow compared to even browsing web.” “Consistently 3 seconds behind where I touch.” “Loading GPS data is *** slow”

7 Significant diversity in the wild
– Diverse environmental conditions: network connectivity, GPS signal quality, location, sensor conditions, etc.
– Variety of hardware and OS versions
– Wide range of user interactions and inputs
Hard for developers to test
Performance problems are inevitable in the wild

8 Need for improving and controlling performance
– The app store is brutally competitive
– Response time matters: users are impatient
– A 100 ms delay can cause a substantial drop in revenue at Amazon
– Similar observations from Google and Bing
Help app developers

9 Find a bar

10 User interaction to rendering: the user-perceived delay
Monitor in the hands of the user
Improve and control the user-perceived delay

11 AppInsight (OSDI ‘12): Monitor performance in the wild
– What is the user-perceived delay?
– Where is the bottleneck?
– Developer feedback
Users with a slow network: poor performance
Users with a fast network: not producing the best-quality result

12 AppInsight (OSDI ‘12): Monitor performance in the wild
– What is the user-perceived delay?
– Where is the bottleneck?
– Developer feedback
Deal with uncontrolled variability

13 AppInsight (OSDI ‘12): Monitor performance in the wild
– What is the user-perceived delay?
– Where is the bottleneck?
– Developer feedback
Timecard (SOSP ‘13): Adapt processing to conditions in the wild
– How much time has already been spent?
– How much time will be spent?
– Runtime estimation
Tightly control user-perceived delay
– High delays: adapt to reduce the delay

14 AppInsight (OSDI ‘12): Monitor performance in the wild
– What is the user-perceived delay?
– Where is the bottleneck?
– Developer feedback
Timecard (SOSP ‘13): Adapt processing to conditions in the wild
– How much time has already been spent?
– How much time will be spent?
– Runtime estimation
Tightly control user-perceived delay
– Low delays: adapt to improve the result

15 AppInsight (OSDI ‘12): Monitor performance in the wild
– What is the user-perceived delay?
– Where is the bottleneck?
– Developer feedback
Timecard (SOSP ‘13): Adapt processing to conditions in the wild
– How much time has already been spent?
– How much time will be spent?
– Runtime estimation

16 AppInsight and Timecard
– Minimal developer effort
– Readily deployable: no changes to the OS or runtime
– Negligible overhead
– Product impact at Microsoft, Facebook
– Used by dozens of developers

17 AppInsight and Timecard
– Automatic binary instrumentation (App to Instrumented App)
– User transaction tracking: monitor app execution across async boundaries in a lightweight manner

18 AppInsight: Monitor performance in the wild
– What is the user-perceived delay?
– Where is the bottleneck?
– Developer feedback
Significant barrier for most app developers
– No platform support
– Only option is to instrument your app and manage your own logging infrastructure

19 AppInsight
[Diagram: Developer → App → Instrumenter → Instrumented App → App Store → Users → Traces → Server → Analysis → Developer Feedback → Developer. Label: Performance Improved.]

20 Monitoring app performance is challenging
– Highly interactive, UI-centric: a single UI thread that should not be blocked
– Most tasks are performed asynchronously: asynchronous APIs for network, sensors, I/O, etc.; computation performed on background threads
– Highly asynchronous programming pattern
Tracing async code is challenging

21 Example app: Mood Around Me
[Diagram: GPS → Location → Nearby Tweets (Internet) → mood shown to the user, e.g. “sad”]
Sample tweets: “Just got engaged :)” “My life sucks” “At last its sunny. But I have to sit and prepare my talk :(” “I am sad that there is no more how I met your mother”

22 Hypothetical synchronous code (Mood Around Me)
    ClickHandler() {
        LogStart();
        location = GPS.Sample();
        url = GetUrl(location);
        tweets = Http.Get(url);
        result = Process(tweets);
        UI.Display(result);
        LogEnd();
    }
    GetUrl(location) { ... }
    Process(tweets) { ... }
[Diagram: after the user interaction, a single thread runs from ClickHandler Start to ClickHandler End; the user-perceived delay spans the handler, bracketed by LogStart() and LogEnd().]

23 Asynchronous code
    ClickHandler() {
        GPS.AsyncSample(GPSCallback);
    }
    GPSCallback(location) {
        url = GetUrl(location);
        Http.AsyncGet(url, DownloadCallback);
    }
    DownloadCallback(tweets) {
        result = Process(tweets);
        UI.Dispatch(Render, result);
    }
    Render(result) {
        UI.Display(result);
    }
    GetUrl(location) { ... }
    Process(tweets) { ... }
[Diagram: User Interaction → ClickHandler Start/End (UI thread) → Async GPS Call → GPS (system) → GPS Callback (background thread) → Async Http Call → Network (system) → Download Callback, Process (background thread) → UI Dispatch → Render (UI thread)]

24 User transaction
[Diagram: the execution from slide 23, from User Interaction through the GPS and HTTP callbacks to Render, marked as one user transaction. Labels: Transaction time, User-perceived delay.]

25 UI Thread Background Thread User Transaction

26 Apps are highly asynchronous
30 popular apps, 200,000 transactions
– On average, 10 asynchronous calls per user transaction
– Up to 7000 edges
– There are apps with 8 parallel threads per transaction
Where is the bottleneck? Focus development efforts

27 Mood Around Me with Twitter and Facebook
[Diagram: User Interaction → GPS → GPS Callback → parallel Twitter and Facebook requests (Nearby Tweets, Nearby Posts) → Twitter and Facebook callbacks → Process Tweets, Process Posts → Thread Blocked, Fire, Thread Wakeup → Aggregate → Render, across the UI thread and background threads]

28 Critical path
Optimizing the critical path reduces the user-perceived delay
[Diagram: the user transaction from slide 27 (GPS Callback, Twitter and Facebook callbacks, Process Tweets, Process Posts, Aggregate, Render) with its critical path highlighted]
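
Editor's sketch (not from the slides): one common way to find a critical path is to treat the transaction as a directed acyclic graph whose edges carry durations and take the longest path from the user interaction to the render. The slides do not say how AppInsight computes it, so the algorithm, the node names, and the durations below are all assumptions for illustration.

```python
from collections import defaultdict

def critical_path(edges, start, end):
    """Return (total_ms, path) for the longest-duration path from start to end.

    edges: list of (src, dst, duration_ms) tuples forming a DAG.
    """
    graph = defaultdict(list)
    for src, dst, ms in edges:
        graph[src].append((dst, ms))

    memo = {}

    def longest_from(node):
        # Longest remaining path from node to end, or None if end is unreachable.
        if node == end:
            return 0, [end]
        if node in memo:
            return memo[node]
        best = None
        for nxt, ms in graph[node]:
            sub = longest_from(nxt)
            if sub is not None and (best is None or ms + sub[0] > best[0]):
                best = (ms + sub[0], [node] + sub[1])
        memo[node] = best
        return best

    return longest_from(start)

# Hypothetical durations (ms) for the two-source transaction on slide 27.
edges = [
    ("click", "gps_callback", 300),
    ("gps_callback", "twitter_callback", 900),
    ("gps_callback", "facebook_callback", 400),
    ("twitter_callback", "aggregate", 50),
    ("facebook_callback", "aggregate", 80),
    ("aggregate", "render", 20),
]
total_ms, path = critical_path(edges, "click", "render")
```

With these made-up numbers the Twitter branch dominates, so speeding up the Facebook branch would not shorten the transaction.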

29 AppInsight
[Diagram: AppInsight pipeline, as on slide 19]

30 Capturing user transactions with low overhead
– Instrumentation impacts app performance: they are already slow enough!!!
– Scarce resources: compute, memory, network, battery

31 Capture
– User interactions
[Diagram: User Interaction → Event Handler on the UI thread]

32 Capture
– User interactions
– Thread execution
[Diagram: ClickHandler Start/End, GPSCallback Start/End, DownloadCallback Start/End, Render Start/End]

33 Capture
– User interactions
– Thread execution
– Async calls and callbacks
[Diagram: Async GPS Call → GPS Callback; Async Http Call → Download Callback; UI Dispatch Call → UI Dispatcher Callback]

34 Capture
– User interactions
– Thread execution
– Async calls and callbacks
– UI updates
[Diagram: UI Update]

35 Capture
– User interactions
– Thread execution
– Async calls and callbacks
– UI updates
– Thread synchronization
[Diagram: Thread Blocked, Fire, Thread Wakeup]

36 Capture
– User interactions
– Thread execution
– Async calls and callbacks
– UI updates
– Thread synchronization

37 Capturing thread execution
– Trace every method: prohibitive overhead
– Enough to log thread boundaries
– Log entry and exit of upcalls (System → App)
[Diagram: ClickHandler Start/End, GPSCallback Start/End, DownloadCallback Start/End, Render Start/End on the UI and background threads]

38 Identify upcalls
– Event handlers are upcalls
– Function pointers point to potential upcalls: callbacks are passed as function pointers
    ClickHandler() {
        GPS.AsyncSample(GPSCallback);
    }
    GPSCallback(location) {
        url = GetUrl(location);
        Http.AsyncGet(url, DownloadCallback);
    }
    DownloadCallback(tweets) {
        result = Process(tweets);
        UI.Dispatch(Render, result);
    }
    Render(result) {
        UI.Display(result);
    }

39 Instrument upcalls
Rewrite app: trace upcalls
    ClickHandler() {
        Logger.UpcallStart(1);
        GPS.AsyncSample(GPSCallback);
        Logger.UpcallEnd(1);
    }
    GPSCallback(location) {
        Logger.UpcallStart(2);
        url = GetUrl(location);
        Http.AsyncGet(url, DownloadCallback);
        Logger.UpcallEnd(2);
    }
    DownloadCallback(tweets) {
        Logger.UpcallStart(3);
        result = Process(tweets);
        UI.Dispatch(Render, result);
        Logger.UpcallEnd(3);
    }
    Render(result) {
        Logger.UpcallStart(4);
        UI.Display(result);
        Logger.UpcallEnd(4);
    }
    GetUrl(location) { ... }
    Process(tweets) { ... }
[Diagram: ClickHandler Start/End, GPSCallback Start/End, DownloadCallback Start/End, Render Start/End]
Low overhead: 7000 times less overhead
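
Editor's sketch (not from the slides): the slides rewrite the app binary to insert Logger.UpcallStart and Logger.UpcallEnd. As a rough stand-in, a Python decorator can wrap each upcall the same way. The decorator, the in-memory `trace_log`, and the record layout are assumptions for illustration; AppInsight itself instruments binaries and requires no source changes.

```python
import functools
import threading
import time

trace_log = []  # hypothetical in-memory stand-in for the trace buffer

def upcall(upcall_id):
    """Log entry and exit of an upcall, mirroring Logger.UpcallStart/UpcallEnd."""
    def wrap(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            tid = threading.get_ident()
            trace_log.append(("UpcallStart", upcall_id, tid, time.monotonic()))
            try:
                return fn(*args, **kwargs)
            finally:
                # Runs even if the upcall raises, so every start has an end.
                trace_log.append(("UpcallEnd", upcall_id, tid, time.monotonic()))
        return wrapper
    return wrap

@upcall(4)
def render(result):
    return f"displayed {result}"

output = render("sad")
```

Only the boundaries of each upcall are recorded, which is the point of the slide: methods called inside an upcall, such as GetUrl or Process, are never logged.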

40 UI Thread Background Thread UI Dispatch Call Download Callback Async GPS Call GPS Callback Async Http Call UI Dispatcher Callback Async Calls and Callbacks

41 Async calls and callbacks
– Log callback start: we capture the start of the thread
– Log async call
– Match async call to its callback: detour callbacks
    GPSCallback(location) {
        Http.AsyncGet(url, DownloadCallback);
        Logger.AsyncCall(5);
    }
    DownloadCallback(tweets) {
        Logger.UpcallStart(3);
        ....
    }
[Diagram: Async Http Call → Download Callback]

42 System App Http.AsyncGet Async Call DownloadCallback DownloadCallback(tweets) { } Callbacks

43 Detour callbacks
    class DetourObject {
        MatchId = 3
        DetourCallback(tweets) {
            DownloadCallback(tweets);
        }
    }
    while(...) {
        obj = new DetourObject(DownloadCallback, MatchId++);
        Http.AsyncGet(url, obj.DetourCallback);    (replaces Http.AsyncGet(url, DownloadCallback);)
    }
    DownloadCallback(tweets) { }
[Diagram: App → Async Call (Http.AsyncGet) → System → obj.DetourCallback (MatchId = 3) → DownloadCallback]

44 Detour callbacks
[Same code and diagram as slide 43 on the next loop iteration: MatchId = 4]

45 Detour callbacks
[Same code and diagram as slide 43 on the following loop iteration: MatchId = 5]
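
Editor's sketch (not from the slides): the detour idea can be tried in a few lines. Each async call gets a fresh wrapper object carrying a match ID, and the system is handed the wrapper's method instead of the app's callback, so the call and its callback can be paired in the trace. `async_get` is a synchronous stand-in for Http.AsyncGet, and the log format and starting ID are assumptions.

```python
import itertools

trace_log = []
_match_ids = itertools.count(3)  # starts at 3 only to mirror the slide's example

class DetourObject:
    """Wraps one callback and remembers which async call it belongs to."""
    def __init__(self, callback, match_id):
        self.callback = callback
        self.match_id = match_id

    def detour_callback(self, *args):
        trace_log.append(("CallbackStart", self.match_id))
        return self.callback(*args)

def async_get(url, callback):
    # Stand-in for Http.AsyncGet: invokes the callback immediately.
    callback(f"tweets from {url}")

received = []

def download_callback(tweets):
    received.append(tweets)

def instrumented_async_get(url, callback):
    # What the rewritten call site does: allocate a detour, log, then call.
    obj = DetourObject(callback, next(_match_ids))
    trace_log.append(("AsyncCall", obj.match_id))
    async_get(url, obj.detour_callback)

for url in ("u1", "u2"):
    instrumented_async_get(url, download_callback)
```

Because every call site invocation allocates its own object, two calls that share the same callback function still receive distinct IDs, which is what the loop on slides 43 to 45 illustrates.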

46 Async calls and callbacks
– Detour callbacks
– Delayed callbacks: track object IDs
– Event subscriptions
Low overhead

47 Capture
– User interactions
– Thread execution
– Async calls and callbacks
– UI updates
– Thread synchronization

48 AppInsight
[Diagram: AppInsight pipeline, as on slide 19]

49 AppInsight
[Diagram: AppInsight pipeline, as on slide 19]

50 Aggregate analysis
– Group similar transactions: same transaction graph
– Outliers: point to corner cases
– Highlight common critical paths: focus development effort
– Root causes of performance variability: highlight what really matters in the wild
[Diagram: Async GPS Call, Async Http Call; factors: Network, GPS, Device, Compute]
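
Editor's sketch (not from the slides): grouping and outlier detection could look roughly like this. Transactions are keyed by a string signature of their graph, and a transaction is flagged when it takes more than three times its group's median. The signature format, the threshold, and the sample durations are all assumptions; the slides do not state AppInsight's outlier criterion.

```python
from collections import defaultdict
from statistics import median

def group_by_graph(transactions):
    """Group (graph_signature, duration_ms) pairs by signature."""
    groups = defaultdict(list)
    for signature, duration_ms in transactions:
        groups[signature].append(duration_ms)
    return groups

def outliers(durations, factor=3.0):
    """Durations exceeding factor times the group median (assumed rule)."""
    m = median(durations)
    return [d for d in durations if d > factor * m]

# Hypothetical traces: four runs of one transaction graph, one of another.
sig = "click>gps>http>render"
transactions = [(sig, 800), (sig, 900), (sig, 850), (sig, 4000), ("click>render", 40)]
groups = group_by_graph(transactions)
slow = outliers(groups[sig])
```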

51 AppInsight
[Diagram: AppInsight pipeline, as on slide 19]

52 Developer feedback
Web-based interface
– Long user transactions
– Critical path
– Aggregate analysis: outliers, common case, factors affecting

53 AppInsight
[Diagram: AppInsight pipeline, as on slide 19. Label: Deployment.]

54 Deployment
– 40 apps in the Windows Phone store
– 30 to a few thousand users
– 6 months to 1 year of data
– More than a million user transactions

55 AppInsight overhead
– Compute: 0.02%
– Memory: 2%
– Network: 4%
– Binary size: 1.2%
– Battery: <1%
Low overhead: impact on app performance is minimal, resource consumption is low
Based on 6 months of data from 30 apps in the wild

56 User transactions and critical path
– 15% of the user transactions took more than 5 seconds; 30% took more than 2 seconds
– Top 2 edges responsible for 82% of transaction time
– Critical path analysis: focus on less than half the edges for 40% of transactions
– Aggregate analysis: pointed to factors creating performance variability (GPS, device, network, sensors)
Focus development efforts

57 AppInsight
[Diagram: AppInsight pipeline, as on slide 19. Label: Improved.]

58 Developer case study
App: My App
Problem: UI hog
AppInsight:
– High performance variability in the UI thread
– Abnormal latencies only at the start of the session
– System loading DLLs in the critical path: up to 1 second
Fix: preload the DLL asynchronously when the app starts (dummy call to the DLL)

59 Developer case study
App: Popular Search App
Problem: Radio wakeup delay
AppInsight:
– High performance variability before the network call
– Radio wakeup delay in the critical path
– Dominating when users are on cellular networks: more than 2 seconds
Fix: wake up the radio as the user starts typing, before invoking the transaction

60 App: Popular Professional App
Problem: Slow transactions
AppInsight: custom instrumentation in the critical path, affecting user-perceived delay

61 AppInsight
[Diagram: AppInsight pipeline, as on slide 19]

62 Server-based user transactions
Core processing of the user transaction is at the server
[Diagram: User Interaction → App (input and sensors, send request) → Network → Server (core processing) → Network → App (receive response, parse and render)]

63 Server-based user transactions
Joint app-server instrumentation (Azure services)
[Diagram: Click Handler → GPS Start → GPS Callback → Http Request → Network → Server (Request Handler Start to Request Handler End) → Network → Download Callback → UI Dispatcher. Label: Network + Server.]

64 Server-based user transactions
Identify and fix bottlenecks at the app and the server
Request header: HTTP[“x-AppInsight-Id”]
[Diagram: Click Handler → GPS Start → GPS Callback → Http Request → Network → Server threads (Request Handler, Spawn Workers, Send Response) → Network → Download Callback → UI Dispatcher]
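
Editor's sketch (not from the slides): joining app and server traces needs a shared identifier, and the slide's HTTP[“x-AppInsight-Id”] label suggests it travels as a request header. The toy below passes a header dictionary straight to a handler function rather than over a real network. The ID format, the trace records, and both function names are assumptions.

```python
import uuid

app_trace, server_trace = [], []

def send_request(url, handler):
    """App side: tag the outgoing request with the transaction's ID."""
    txn_id = uuid.uuid4().hex
    app_trace.append((txn_id, "HttpRequest", url))
    headers = {"x-AppInsight-Id": txn_id}
    response = handler(url, headers)  # stands in for the network round trip
    app_trace.append((txn_id, "DownloadCallback", url))
    return response

def request_handler(url, headers):
    """Server side: log its own events under the ID found in the header."""
    txn_id = headers.get("x-AppInsight-Id")
    server_trace.append((txn_id, "RequestHandlerStart", url))
    body = "ok"
    server_trace.append((txn_id, "RequestHandlerEnd", url))
    return body

reply = send_request("/tweets", request_handler)
```

Records from both logs that share an ID can then be merged into a single transaction graph spanning app, network, and server.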

65 Uncontrollable and variable delays
[Diagram: User Interaction → App → Network → Server → Network → App → Rendering]
– App: reading sensors, radio state, device type
– Network: 3G, 4G, WiFi, LTE, ...; RTT, throughput, TCP, DNS
– App (rendering): device type

66 Uncontrollable and variable delays
[Diagram and delay sources as on slide 65]
Significant variability in user-perceived delays

67 Uncontrollable and variable delays
[Diagram and delay sources as on slide 65]
Significant variability in user-perceived delays
– Users with high uncontrollable delays: poor user-perceived delays
– Users with low uncontrollable delays: not producing the best-quality result

68 Uncontrollable and variable delays
[Diagram and delay sources as on slide 65]
Tightly control user-perceived delays
Adapt to conditions in the wild

69 Adapt to conditions in the wild
– Do less or more processing
– Send less or more data
– Meet the end-to-end deadline
– Trade off on quality or quantity of the result
Tightly control user-perceived delays

70 Controlling the server delay
[Diagram: Request → server processing → Response, with a deadline]
Trade off quality of result for processing time: more time to process, better-quality results

71 Controlling the server delay
[Diagram: Request → Worker → Response, with a deadline]

72 Controlling the server delay
[Diagram: Request → Worker → Response, with a deadline. Label: Better result.]

73 Controlling the server delay
Servers keep fixed deadlines: no visibility into external delays

74 Adapt to conditions in the wild
[Diagram: User Interaction → App → Network → Server → Network → App → Rendering. Label: Adapt.]
Tightly control user-perceived delays

75 Timecard
– GetElapsedTime(): time elapsed since user interaction
– PredictRemainingTime(responseSize): predicted downlink and app processing delay

76 Timecard: adapt processing time
GetElapsedTime(); PredictRemainingTime(responseSize);
Desired user-perceived delay

77 Timecard: adapt processing time
GetElapsedTime(); PredictRemainingTime(responseSize);
Trade off on quality of the result
Desired user-perceived delay

78 Timecard: adapt response
GetElapsedTime(); PredictRemainingTime(responseSize);
PredictRemainingTime(5KB); PredictRemainingTime(10KB); PredictRemainingTime(15KB);
Desired user-perceived delay

79 Timecard: adapt response
GetElapsedTime(); PredictRemainingTime(responseSize);
Desired user-perceived delay

80 Timecard
GetElapsedTime(); PredictRemainingTime(responseSize);
Better-quality result
Desired user-perceived delay
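
Editor's sketch (not from the slides): a server could combine the two Timecard calls in either of two ways. It can pick the largest response that still meets the desired delay, or it can compute how much processing time is left for a given response size. The function names, the linear toy predictor, and all numbers are assumptions; only GetElapsedTime and PredictRemainingTime come from the slides.

```python
def pick_response_size(deadline_ms, elapsed_ms, work_ms, predict_remaining_ms, sizes_kb):
    """Largest candidate size whose predicted end-to-end delay meets the deadline."""
    for size in sorted(sizes_kb, reverse=True):
        if elapsed_ms + work_ms + predict_remaining_ms(size) <= deadline_ms:
            return size
    return min(sizes_kb)  # nothing fits: fall back to the smallest response

def processing_budget_ms(deadline_ms, elapsed_ms, remaining_ms):
    """Time the server may still spend before it must send the response."""
    return max(0, deadline_ms - elapsed_ms - remaining_ms)

def toy_predictor(size_kb):
    # Hypothetical stand-in for PredictRemainingTime: 100 ms plus 20 ms per KB.
    return 100 + 20 * size_kb

fast_user = pick_response_size(1000, 400, 150, toy_predictor, [5, 10, 15])
slow_user = pick_response_size(1000, 600, 150, toy_predictor, [5, 10, 15])
budget = processing_budget_ms(1000, 400, toy_predictor(10))
```

Under these made-up numbers, the user whose request arrived after 400 ms gets the 15 KB response while the one at 600 ms gets 5 KB, and both finish within the same 1000 ms target.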

81 Timecard
GetElapsedTime(); PredictRemainingTime(responseSize);
Analysis of 4000 apps: 80% of them are single request-response transactions

82 Challenges Server App

83 Challenges UI Thread Background Thread GPS Start Http Request GPS Callback Click Handler Server App UI Dispatcher Download Callback App Highly asynchronous

84 Server Threads Challenges Spawn Workers Send Response Server Request Handler UI Thread Background Thread GPS Start Http Request GPS Callback Click Handler App UI Dispatcher Download Callback App Highly asynchronous

85 Challenges
– GetElapsedTime(): transactions are highly asynchronous; no single reference clock
– PredictRemainingTime(responseSize): variable network delays and processing delays

86 Timecard
GetElapsedTime(); PredictRemainingTime(responseSize);
– Transaction tracking
– TimeSync
– Delay predictors

87 Timecard
[Diagram: Developer → App → Instrumenter → Instrumented App → App Store; the service uses Timecard.dll to call GetElapsedTime() and PredictRemainingTime(responseSize) against the desired user-perceived delay]

88 Timecard
GetElapsedTime(); PredictRemainingTime(responseSize);
– Transaction tracking
– TimeSync
– Delay predictors

89 Server Threads Spawn Workers Send Response Server Request Handler Transaction UI Thread Background Thread GPS Start Http Request GPS Callback Click Handler App UI Dispatcher Download Callback App Transaction Tracking

90 Transaction tracking
Transaction context (TC) attached to every thread: carries along timestamps and transaction information
GetElapsedTime();
[Diagram: the TC follows the transaction across app threads (Click Handler, GPS Start, GPS Callback, Http Request), server threads (Request Handler, Spawn Workers, Send Response), and back to the app (Download Callback, UI Dispatcher)]
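
Editor's sketch (not from the slides): a transaction context can be pictured as a small object created at the user interaction and handed to every thread that works on the transaction. The class name, its fields, and the explicit hand-off to the thread are assumptions; Timecard attaches the context through instrumentation rather than by changing call signatures.

```python
import threading
import time

class TransactionContext:
    """Carries the transaction ID and its start timestamp across threads."""
    def __init__(self, txn_id, clock=time.monotonic):
        self.txn_id = txn_id
        self._clock = clock
        self.start = clock()  # taken at the user interaction

    def get_elapsed_time_ms(self):
        return (self._clock() - self.start) * 1000.0

results = {}

def background_work(tc):
    # Any thread holding the context can ask how long the user has waited.
    results["txn_id"] = tc.txn_id
    results["elapsed_ms"] = tc.get_elapsed_time_ms()

tc = TransactionContext("txn-1")
worker = threading.Thread(target=background_work, args=(tc,))
worker.start()
worker.join()
```

This only works while every reader shares one clock. On the server the start timestamp comes from the phone's clock, which is why the deck turns to TimeSync next.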

91 Timecard
GetElapsedTime(); PredictRemainingTime(responseSize);
– Transaction tracking
– TimeSync
– Delay predictors

92 TimeSync
– Sync with cell tower: off by several seconds to minutes
– GPS: does not work indoors
– Probing

93 TimeSync
Typical probing techniques don’t work well
– Radio wakeup delays and queuing delays
– Errors of more than 100 ms
– Energy inefficient (network radio tail energy)
Radio-aware, network-aware probing
– Probes when the radio is active and the network is idle
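
Editor's sketch (not from the slides): the slides do not give Timecard's offset formula. The standard NTP-style estimate from a single probe is shown here as an assumption. It takes four timestamps: app send (t0) and app receive (t3) on the app clock, server receive (t1) and server send (t2) on the server clock. It assumes the uplink and downlink delays are equal, which is why probing on an idle network with the radio already awake matters.

```python
def should_probe(radio_active, network_idle):
    """Probe only when no radio wakeup or queuing delay would skew the result."""
    return radio_active and network_idle

def estimate_offset_ms(t0, t1, t2, t3):
    """Return (offset_ms, rtt_ms); offset is server clock minus app clock."""
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    rtt = (t3 - t0) - (t2 - t1)
    return offset, rtt

def to_server_time(app_time_ms, offset_ms):
    """Translate an app-clock timestamp into the server's clock."""
    return app_time_ms + offset_ms

# Hypothetical probe: server clock 500 ms ahead, 40 ms each way, 10 ms on the server.
offset_ms, rtt_ms = estimate_offset_ms(1000, 1540, 1550, 1090)
```

With the offset known, the server can convert the user-interaction timestamp carried in the transaction context into its own clock and subtract it from the current time.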

94 TimeSync in Timecard Server App

95 TimeSync in Timecard
~5 ms error, energy efficient
[Diagram: probe sent during the transaction (Click Handler, GPS Start, GPS Callback, Http Request)]

96 Timecard App Server GetElapsedTime(); Time elapsed since user interaction

97 Predicting remaining time
PredictRemainingTime(responseSize);
– Downlink delay
– App processing delay

98 Predicting downlink delay
Factors: latency, throughput, loss rate, TCP window state, response size
Analysis of 4000 apps
– Most transfers are short: median 3 KB, 90th percentile 37 KB
– 99% of transfers are over HTTP
Cellular networks
– High bandwidth, rare loss, high latency
– TCP window state matters: multiple RTTs

99 Predicting Downlink Delay TCP Window 1 extra RTT

100 Predicting downlink delay
Inputs: response size, latency, TCP parameters
Determined by
– RTT
– Number of round trips
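
Editor's sketch (not from the slides): a textbook idealization shows why short transfers are dominated by RTT times the number of round trips. It assumes loss-free TCP slow start, with the congestion window doubling every round trip from an initial window of 10 segments of 1460 bytes, and it ignores connection setup. These parameters are assumptions, and later slides replace this kind of direct calculation with a data-driven model.

```python
import math

def round_trips(response_bytes, init_cwnd_segments=10, mss_bytes=1460):
    """Round trips needed to deliver the response under idealized slow start."""
    segments = math.ceil(response_bytes / mss_bytes)
    cwnd, sent, rounds = init_cwnd_segments, 0, 0
    while sent < segments:
        sent += cwnd   # one window's worth of segments per round trip
        cwnd *= 2      # slow start doubles the window each round
        rounds += 1
    return rounds

def downlink_delay_ms(response_bytes, rtt_ms, **tcp):
    """Downlink delay as RTT times the number of round trips."""
    return rtt_ms * round_trips(response_bytes, **tcp)

median_transfer = round_trips(3 * 1024)    # the 3 KB median from slide 98
large_transfer = round_trips(37 * 1024)    # the 37 KB 90th percentile
warm_connection = round_trips(37 * 1024, init_cwnd_segments=40)
```

The 37 KB transfer needs a second round trip on a fresh connection but only one on a connection whose window has already grown, which is the "1 extra RTT" effect of TCP window state.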

101 Predicting downlink delay
– RTT: use a recent estimate of RTT from the time sync probes
– Number of round trips: read TCP window state at the server
Does not work with middleboxes
– Split-TCP connection: TCP window state at the middlebox matters
– No easy way to estimate TCP window state
Build an empirical data-driven model

102 Predicting downlink delay
PredictRemainingTime(responseSize);
Data-driven model (learned downlink delay predictor)
– Response size
– Recent RTT
– Network provider
– Bytes already transferred in the TCP connection (proxy for TCP window state at the middlebox)
Prediction error
– Cellular: median 17 ms, 90th percentile 86 ms
– Wi-Fi: median 12 ms, 90th percentile 31 ms

103 Predicting app processing delay
PredictRemainingTime(responseSize);
Parsing and rendering delay
– Correlated with response size
– Correlated with device type

104 Timecard App Server GetElapsedTime(); PredictRemainingTime (responseSize); Time elapsed since user interaction Predicted downlink & app processing delay

105 Mobile Ads Service Contextual ads to mobile apps

106 Mobile Ads Service
[Diagram: app extracts keywords → sends keywords → service fetches and processes ads for the keywords → ad → app renders ad]

107 Mobile Ads Service
[Diagram: as slide 106; the service passes the keywords to ad provider A]

108 Mobile Ads Service
[Diagram: as slide 106]

109 Mobile Ads Service: adapt processing time
GetElapsedTime(); PredictRemainingTime(responseSize);
[Diagram: as slide 106]

110 Mobile Ads Service
Within 50 ms of the target delay 90% of the time (~200,000 transactions)
[Plot: user-perceived delay (ms), with Timecard vs. without Timecard, against the target delay]

111 Twitter Analysis App and Service
– Adapt server processing time in steps of 150 ms
– Adapt response from 10 KB to 60 KB
[Plot: user-perceived delay (ms), with Timecard vs. without Timecard, against the target delay]

112 Timecard: adapt resources used
GetElapsedTime(); PredictRemainingTime(responseSize);
Desired user-perceived delay

113 Timecard: request prioritization
GetElapsedTime(); PredictRemainingTime(responseSize);

114 AppInsight and Timecard
– Identify performance bottlenecks: critical path analysis
– Control user-perceived delays: elapsed time estimation, remaining time prediction
– User transaction tracking: asynchrony, time sync

115 “By the time it loads, the church service is over.” “Too slow!!!” “Slow responsiveness, long load time, and crashity crash crash.” “horrifically slow. Uninstalled.” “Crashes every time.” “DO NOT DOWNLOAD.” “RIP-OFF!!” “Slow and Crashes often. Fix. I will give 5 stars instead of 1.” “So slow that my screen already dims before it's loaded” “Too slow - killing the usefulness when you really need to go” “Crashes all the time. Should be called Crashing News.” “Crashed 3 times before I gave up” “Slow or no network crashes the app immediately.”

116 App crashes
[Diagram: exception path across the UI and background threads: User Interaction → ClickHandler() → GPS.AsyncSample() → GPSCallback() → Http.AsyncGet(url) → DownloadCallback() → exception at GetSentiment()]
Stack trace:
    GetSentiment()
    …
    DownloadCallback()
App Store reviews: “Slow and Crashes often.” “DO NOT DOWNLOAD.” “RIP-OFF!!”

117 VanarSena (MobiSys ‘14)
Automatic dynamic testing using an army of monkeys
– Fast UI automation based on transaction tracking: 4x to 40x reduction in testing time
– Scalable testing infrastructure
Uncovered ~3000 crash bugs in 1100 apps in the app store

118 [Diagram: projects by area]
Areas: Apps, OS, Sensors, Network, App-aware Operating Systems, Wearables, Mobile Support
Projects: VanarSena (MobiSys ‘14), AppInsight (OSDI ‘12), Timecard (SOSP ‘13), Procrastinator (MobiSys ‘14), Wireless ESP (NSDI ‘11), VTrack (SenSys ‘09), CTrack (NSDI ‘11), SmartAds (MobiSys ‘13), CodeInTheAir (HotMobile ‘12), Combine (MobiSys ‘07), SixthSense (MobiSys ‘08), DAIR (MobiSys ‘06), Nectar (OSDI ‘10)
Legend: Best Paper, Best Paper Nominee

119 Backup

120 Timecard overhead
– 0.1% runtime overhead
– Less than 1% memory overhead
– Less than 1% network overhead
– Negligible battery overhead

121

