AppInsight: Mobile App Performance Monitoring in the Wild

Microsoft Research
Chiraphat Chaiphet, Hsin-Yu Cheng, UCL Computer Science, CS GZ06, 11th March 2016

Hello everyone. Today we are going to present AppInsight, a system that helps developers diagnose performance bottlenecks and failures experienced by their apps in the wild, published by Microsoft Research in 2012. The two of us, Chiraphat Chaiphet and I, Hsin-Yu Cheng, are giving this presentation as a group.

Motivation
Mobile apps face a wide variety of environmental conditions in the wild, different hardware and OS versions, and a wide range of user interactions. These are difficult to simulate in the lab, and they show up as user-perceived delay. So developers need data about how their app is performing in the wild to maintain and improve its quality.

We use many mobile applications nowadays. When we look through their reviews in the app store, we often see someone commenting that the app is slow. These apps were presumably tested well in the lab, so why are they slow in the hands of users? In the wild, you have diverse environmental conditions, for example very different network connectivity or GPS signal quality. There is also a variety of phone hardware and a wide range of user interactions. The combination of these factors is difficult to emulate in the lab. For performance, lab testing is good but not enough. So if we want to understand why users are unhappy, we need to monitor performance in the wild.

AppInsight Design principles
Low overhead: do not slow down app performance.
Zero effort: developers don't need to write any code. Instrumentation is done by rewriting app binaries; no source code is required.
Immediately deployable: no change to the mobile OS or runtime.

The design of AppInsight was guided by three principles. 1. Because performance is already a concern, we do not want to slow the app down any further. To minimize overhead, AppInsight carefully selects which code points to instrument. 2. App developers don't need to write additional code or add code annotations. Instrumentation is done by automatically rewriting app binaries. 3. We do not require changes to the mobile OS or runtime.

AppInsight System Overview
This figure shows the architecture of AppInsight. The developer writes the app and, once it is finished, passes the app binary to a tool we call the instrumenter, which instruments it automatically. The developer only needs to provide the instrumenter with app binaries. The developer then submits the instrumented version to the app store, and users download it. When a user runs the instrumented app, trace data is collected and uploaded to a server. The background transfer service (BTS) uploads the trace data when no foreground apps are running. The server collects the logs and analyses them, and the results are given back to the developer as feedback.

Goals
User transaction: from the user manipulation until all synchronous and async tasks (threads) have completed.

To achieve our goal, AppInsight provides developers with critical paths for user transactions, and with exception paths when apps fail during a transaction. A user transaction starts when the user manipulation occurs and ends when all synchronous and asynchronous threads have completed.

This figure shows the execution trace for a simple code snippet. In the figure, horizontal line segments indicate time spent in thread execution, and arrows between line segments indicate causal relationships between threads. In this example, the user transaction begins when the user clicks a button. The OS invokes the event handler (btn_Fetch_click) in the context of the UI thread. The handler makes an async web request and then ends. Time is then spent downloading. When the web request completes, the OS calls reqCallback, which processes the data. When the processing finishes, the worker thread invokes the UI dispatcher to queue a UI update. Finally, the UI is updated, and the user transaction in this example ends when the UI update method completes. But not every user transaction ends with a UI update.

So, what is user-perceived delay? It is the time from when the user clicks the button until the data shows on the screen. In this figure, the user-perceived delay is the entire transaction time.

Goals
Critical path (CP): the bottleneck path in a user transaction. Optimizing the critical path reduces the user-perceived delay.

For example, this figure illustrates an execution trace of a location-based app. Upon user manipulation (0), the app asks the system to get a GPS fix (1) and supplies a callback to invoke when the fix is obtained. The system gets the fix and invokes the callback at (2). The callback reads the GPS coordinates and makes two parallel web requests to get some data (3). Then the thread waits for two completion signals (4); the dotted line indicates the waiting time. When the two web requests complete, the OS invokes their callbacks at (5) and (7). The first callback signals completion at (6), and the second does so at (8). Because both requests have now completed, the blocked thread wakes up at (9) and updates the UI via the dispatcher. With such complex behavior, it can be difficult for developers to find where the bottlenecks in the code are. In this case the red path is the critical path, because it took longer to complete. Optimizing the critical path therefore reduces the user-perceived delay. We will explain this in detail later.

Goals
Exception path: from the user manipulation to the exception method, spanning asynchronous boundaries.

The exception path begins with the user manipulation and ends with the method that raised the exception, spanning asynchronous boundaries. Here, (0) to (8) is the exception path. Suppose the app crashes here. Normally, we only know the location in the code where the crash happened. The log file contains only the stack trace of that thread, which is not enough for the developer to reproduce the error. AppInsight can walk back through the transaction graph to find the user manipulation that actually triggered the crash.

Instrumentation
Capture: UI manipulation events.

What is the minimal amount of data we need to capture? First, we need to capture when a user manipulation happens and which event handler is called. This is the start of the transaction.

Instrumentation
Capture: UI manipulation events; thread execution.

Next, we need to track when threads execute.

Instrumentation
Capture: UI manipulation events; thread execution; async calls and callbacks.

Next, we need to track when asynchronous calls happen and how they are related to their callbacks.

Instrumentation
Capture: UI manipulation events; thread execution; async calls and callbacks; thread synchronization.

We also need to track when thread synchronization happens. Here there are two requests. The thread blocks until both requests have completed and fired their signals, and then the thread wakes up.

Instrumentation
Capture: UI manipulation events; thread execution; async calls and callbacks; thread synchronization; UI updates.

We also need to track when the app finishes updating the UI. This is important because it is how we measure the user-perceived delay.

Instrumentation
Capture: UI manipulation events; thread execution; async calls and callbacks; thread synchronization; UI updates; unhandled exceptions.
Additional information: URL and network state; GPS state.

Next, we also track unhandled exceptions. When an unhandled exception occurs in the app code, the system terminates the app. Before terminating, the system delivers a special event to the app. The data associated with this event contains the exception type and the stack trace of the thread in which the exception occurred.

Finally, additional information. For certain asynchronous calls, such as web requests and GPS calls, we collect additional information both at the call and at the callback. For example, for web request calls we log the URL and the network state. For GPS calls, we log the state of the GPS. By logging a small amount of additional information, we can give more meaningful feedback to the developer. Now G. is going to explain these in detail.

Thread Execution
[Figure: example Sentiment Analysis app with a query box (example query: Donald Trump), a Submit button, the downloaded tweets and a 50% score.]

This app takes a keyword as input, gets some tweets from Twitter, and then computes a sentiment score for that keyword.

*Mobile cellphone image by Webzoneme, used under CC0 1.0. The Twitter logo is a trademark of Twitter, Inc.

Thread Execution Synchronous Code
Synchronous code, with logging calls added at the start and end of the handler:

    ClickHandler() {
        Logger.start();
        tweets = HttpGet(url + keyword);
        score = analysis(tweets);
        result.Text = score;
        Logger.end();
    }

    analysis(tweets) {
        ……….
        return score;
    }

[Figure: UI thread timeline from ClickHandler start to ClickHandler end; this whole span is the user-perceived delay.]

With synchronous code, this is very simple and everything is handled in the UI thread. To capture this event, we can log at the start and end of the method. However, the synchronous call makes the app freeze during the web query.

Thread Execution Asynchronous Code
Asynchronous code:

    ClickHandler() {
        AsyncHttpGet(url + keyword, DownloadCallback);
    }

    DownloadCallback(tweets) {
        score = analysis(tweets);
        UIDispatch(DisplayScore, score);
    }

    analysis(tweets) {
        ……….
        return score;
    }

    DisplayScore(score) {
        result.Text = score;
    }

[Figure: timeline. UI thread: ClickHandler start, ClickHandler end. Download delay. Background thread: DownloadCallback start, analysis start, analysis end, DownloadCallback end. UI thread: UI update start, UI update end.]

So we modified this code to use an asynchronous call, so that the web request and the score analysis are performed in a background thread. The callback function is called when the tweets have been downloaded. This makes capturing thread execution complicated, because we need to know which call led to this callback. Can we do it the same way, by logging at the start and end of every method? The answer is no.

Thread Execution Asynchronous Code
Log every method call? High overhead.
Only need to log thread boundaries.
Log entry and exit of “upcalls” (system upcalls).

[Figure: the same timeline as on the previous slide, with only the system upcalls marked: ClickHandler, DownloadCallback and the UI update.]

Logging every method call causes too much overhead and makes the app slow. We only want to know the start and end of each thread's execution. The paper shows that logging only system upcalls is enough.
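As an illustration of logging only at upcall boundaries, here is a minimal Python sketch. It is not how AppInsight works internally (AppInsight rewrites Windows Phone app binaries); the names `upcall`, `TRACE`, `analysis` and `download_callback` are hypothetical and exist only for this example.

```python
import threading
import time
from functools import wraps

# Hypothetical in-memory trace buffer: (event, method name, thread id, timestamp).
TRACE = []

def upcall(func):
    """Log entry and exit of a method that the system invokes (an upcall)."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        tid = threading.get_ident()
        TRACE.append(("start", func.__name__, tid, time.perf_counter()))
        try:
            return func(*args, **kwargs)
        finally:
            TRACE.append(("end", func.__name__, tid, time.perf_counter()))
    return wrapper

def analysis(tweets):
    # Ordinary app method called by app code: deliberately NOT instrumented.
    return len(tweets)

@upcall
def download_callback(tweets):
    # Invoked by the system when the download completes: instrumented.
    return analysis(tweets)

score = download_callback(["good app", "slow app"])
logged = [(event, name) for event, name, _, _ in TRACE]
```

Only the upcall produces trace records; the call to `analysis` inside it adds none, which is the source of the overhead saving described above.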

Matching Async Calls to their Callbacks
[Figure: in the app, ClickHandler makes an async call into the system; the system later invokes DownloadCallback(response).]

The next problem is how to match async calls to their callbacks.
Problems:
A callback method can be called from many methods.
The same async call can be used many times in a loop.

Matching Async Calls to their Callbacks
[Figure: before instrumentation, ClickHandler calls AsyncHttpGet(url) with DownloadCallback, and the system invokes DownloadCallback(response). After instrumentation, the call is made with DetourCallback and a matchId; the system invokes DetourCallback(response), which carries the matchId and leads to DownloadCallback(response).]

Problems:
A callback method can be called from many methods.
The same async call can be used many times in a loop.

Solution: detour callbacks
Use a matchId number to identify each async call.
Increment matchId for every call.
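The detour idea can be sketched in Python as follows. This is a simplified illustration under assumptions, not the instrumenter's actual output: `async_http_get` is a stand-in for a system async API, and `instrumented_async_http_get`, `LOG` and the `example.invalid` URL are hypothetical.

```python
import itertools

LOG = []                                  # hypothetical trace log
_next_match_id = itertools.count(1)       # one fresh matchId per async call

def async_http_get(url, callback, pending):
    """Stand-in for a system async API: it only queues the callback."""
    pending.append((callback, "response for " + url))

def instrumented_async_http_get(url, callback, pending):
    """What a rewritten call site could look like: wrap the callback in a detour."""
    match_id = next(_next_match_id)       # incremented for every call
    LOG.append(("async_call", match_id, url))

    def detour_callback(response):
        # The detour logs the same matchId, then forwards to the real callback.
        LOG.append(("callback_start", match_id))
        return callback(response)

    async_http_get(url, detour_callback, pending)

def download_callback(response):
    return len(response)

pending = []
for keyword in ["a", "b"]:                # the same async call issued twice in a loop
    instrumented_async_http_get("http://example.invalid/?q=" + keyword,
                                download_callback, pending)

# The simulated system completes the requests out of order.
for callback, response in reversed(pending):
    callback(response)
```

Even though both calls share one call site and one callback method, each callback record carries the matchId of the call that caused it, so the two can be paired unambiguously.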

Deployment
Analysed 30 popular free Windows Phone apps (1 from an author of the paper)
30 users, 6 hardware models, 4 months of data collection
6,752 app sessions, 33,000 minutes in apps
563,641 total transactions, 69% of which are timer transactions (transactions started by periodic timer or sensor functions)
Analysed only the 167,286 user transactions; 40% of user transactions come from a multiplayer game app

User transactions and critical paths
Asynchronous calls per user transaction vary from 1.2 to 18.6
The average number of parallel threads per user transaction varies from 1 to 7.6
In critical paths, only a few edges are responsible for most of the transaction time

Exception paths
Collected 111 crash logs from 16 apps; 43 involved asynchronous transactions

The results show that apps use many async calls and multiple threads. In critical paths, only a few edges are responsible for most of the transaction time, so developers can focus on improving that code. Of the more than 100 crash logs collected, about 40% involve asynchronous transactions, which shows that AppInsight can be useful for developers tracking down these exceptions.

Is the number of users large enough for the evaluation? Many apps have fewer than 10 users.

Analysis Methodology
Capturing user transactions: build directed acyclic graphs from the traces.

[Figure: transaction graph across the UI thread, background thread, GPS callback thread and web callback thread, from user click to UI update, showing the web call, the blocked thread and the thread wake-up.]
Node types: (M) user manipulation, (S) upcall start, (E) upcall end, (A) async call start, (L) layout updated, (B) thread blocked, (F) semaphore fired, (W) thread wakeup.

From the data collected from the app, we can build a directed acyclic graph to represent each transaction. In this example, we have one UI thread and one background thread. The background thread makes two asynchronous calls, to the web and to GPS. After the web and GPS callbacks have finished, the background thread wakes up to process the data and calls for a UI update. AppInsight builds the graph starting at the user manipulation M.
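A rough Python sketch of turning trace records into such a graph is shown below. The record layout, the node ids and the `build_graph` helper are assumptions made for illustration; the slides do not describe the actual trace format, and thread synchronization nodes (B, F, W) are left out to keep the example short.

```python
from collections import defaultdict

# Hypothetical trace records:
# (node id, node type, thread, timestamp in ms, matchId or None).
trace = [
    ("n0", "M", "ui", 0, None),    # user manipulation
    ("n1", "S", "ui", 1, None),    # upcall start (click handler)
    ("n2", "A", "ui", 2, 1),       # async call tagged with matchId 1
    ("n3", "E", "ui", 3, None),    # upcall end
    ("n4", "S", "bg", 50, 1),      # callback start carrying matchId 1
    ("n5", "E", "bg", 60, None),   # callback end
]

def build_graph(records):
    """Return adjacency lists. Edges follow execution order inside an upcall,
    plus one causal edge from each async call to its matching callback start."""
    edges = defaultdict(list)
    last_on_thread = {}   # most recent node of the upcall running on each thread
    async_calls = {}      # matchId -> node id of the async call
    for node, kind, thread, _ts, match_id in sorted(records, key=lambda r: r[3]):
        if kind == "S" and match_id is not None:
            edges[async_calls[match_id]].append(node)    # A -> S across threads
        elif thread in last_on_thread:
            edges[last_on_thread[thread]].append(node)   # same-thread ordering
        if kind == "A":
            async_calls[match_id] = node
        if kind == "E":
            last_on_thread.pop(thread, None)             # the upcall has finished
        else:
            last_on_thread[thread] = node
    return dict(edges)

graph = build_graph(trace)
```

The async call node `n2` ends up with two outgoing edges: one to the end of its own upcall and one to the callback it triggered on the background thread.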

Analysis Methodology
Capturing user transactions: build directed acyclic graphs from the traces.

[Figure: the same transaction graph with the critical path highlighted from M to L.]

The critical path is found by starting at the last UI update and walking backwards. At W, multiple edges point to W. We follow the edge from the last F, because that is the signal that was holding back the processing.
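The backward walk can be sketched in Python as below. This is one simple way to express "use the last F", by following the predecessor with the latest timestamp; the node names, timestamps and the `critical_path` helper are hypothetical, and the graph is much smaller than the one in the figure.

```python
# Hypothetical graph: node -> (timestamp in ms, list of predecessor nodes).
nodes = {
    "M":  (0,   []),                 # user manipulation
    "S1": (1,   ["M"]),              # upcall start
    "A1": (2,   ["S1"]),             # first web call
    "A2": (3,   ["S1"]),             # second web call
    "B":  (4,   ["A2"]),             # thread blocks waiting for both requests
    "F1": (120, ["A1"]),             # first completion signal
    "F2": (300, ["A2"]),             # second completion signal, arrives last
    "W":  (301, ["B", "F1", "F2"]),  # thread wakes up
    "L":  (310, ["W"]),              # layout updated
}

def critical_path(nodes, end):
    """Walk backwards from `end`, always following the predecessor that
    finished last, since that is the one the current node was waiting for."""
    path = [end]
    while nodes[path[-1]][1]:
        predecessors = nodes[path[-1]][1]
        path.append(max(predecessors, key=lambda n: nodes[n][0]))
    return list(reversed(path))

path = critical_path(nodes, "L")
```

In this toy graph the walk passes through F2 rather than F1 or B, so the second web request is the one worth optimizing.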

Analysis Methodology
Capturing user transactions: build directed acyclic graphs from the traces.

[Figure: the same transaction graph with the exception path highlighted, from M to the failing callback.]

For the exception path, suppose that the web request callback failed. We can trace back from the exception to the user manipulation.
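Walking back from a crash to the triggering user manipulation can be sketched in Python as follows. The `parent` map and node names are hypothetical; in practice the links would come from the transaction graph built from the trace.

```python
# Hypothetical causal links: each node points at the node that caused it.
parent = {
    "S_click": "M",                 # click handler started by the user manipulation
    "A_web": "S_click",             # handler issues an async web request
    "S_webcallback": "A_web",       # callback start: crosses an async boundary
    "EXCEPTION": "S_webcallback",   # unhandled exception inside the callback
}

def exception_path(parent, crash_node):
    """Follow causal links backwards until a node with no parent is reached."""
    path = [crash_node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]

path = exception_path(parent, "EXCEPTION")
```

A stack trace alone would stop at the callback; the causal links are what carry the path back across the asynchronous boundary to M.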

Aggregate Analysis
Group transactions with the same graph (6,606 transaction groups).

Understanding performance variance
Use a statistical technique called Analysis of Variance (ANOVA).
29% of the transaction groups contain multiple distinct critical paths.
Even with a unique critical path, the dominant edge varies in 40% of the cases.
Three factors are analysed: network transfer, location queries and local processing.
If GPS was not initialized, the query took 3–20 seconds.
Performance can be affected by network, GPS state, device model and user.

Outliers
Flag transactions that take longer than usual: duration greater than mean + (k * standard deviation), with k = 3.
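The outlier rule can be written as a short Python sketch. The function name `flag_outliers` and the sample durations are made up for illustration, and the use of the population standard deviation is an assumption, since the slide does not say which variant is used.

```python
from statistics import mean, pstdev

def flag_outliers(durations_ms, k=3):
    """Return the durations greater than mean + k * standard deviation."""
    mu = mean(durations_ms)
    sigma = pstdev(durations_ms)   # assumption: population standard deviation
    threshold = mu + k * sigma
    return [d for d in durations_ms if d > threshold]

# Made-up transaction group: nineteen 100 ms transactions and one 2000 ms one.
durations = [100] * 19 + [2000]
outliers = flag_outliers(durations)
```

With k = 3 the rule needs a reasonably large group before anything can be flagged, because a single extreme value also inflates the mean and the standard deviation it is compared against.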

UI for Developer Feedback
This is a web UI that reports back to the developer. Similar transactions are grouped together and outliers are highlighted. The developer can look at the graph and critical path of each transaction. The analysis also shows a pie chart of the time spent in the transaction. Here you can clearly see that this transaction spends most of its time in network functions.

Case Studies
App 1: developed by an author of the paper
Exceptions: found a problem in a routine that splits a line into words, which incorrectly parsed blank lines.
UI sluggishness: DLL loading during early launch caused problems; the developer modified the code to pre-load DLLs.
Wasted computation: some background threads were not being terminated correctly.
Serial network operations: network loading was slow; the code was modified to use parallel network operations.

App 2: a popular app that has been in the marketplace for over 2 years
Aggregate analysis showed that 3G data latency significantly impacted some transactions.
The app developers were already aware of this problem but did not have good quantitative data.
They were impressed that AppInsight highlighted the problem easily.

App 3: an app under active development
Custom instrumentation code that the developers had added was causing poor performance.
Using AppInsight instead of the custom instrumentation would solve this problem.

Is it really useful for mature apps?

Overheads
App run time (CPU), 0.02%: average 0.57 ms overhead per user transaction (max 30 ms); average 0.21 ms overhead per second (max 5 ms).
Memory, 2%: uses a 1 MB memory buffer (typical apps consume 50 MB).
Network, 4%: average upload is 3.8 KB per app launch. The Background Transfer Service uploads the data and can upload on WiFi only.
App size, 1.2%: app binaries increased by 1.2%.
Battery, <1%: measured with a hardware power meter. Power usage increased from 1193 mW to 1205 mW, which is within experimental noise, so the impact of AppInsight on battery life is negligible.

Negligible overhead.
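Two of the percentages are consistent with simple ratios of the figures quoted next to them, as the Python sketch below shows. The slide does not state how its percentages were derived, so this reading is an assumption; `percent` is a hypothetical helper.

```python
def percent(part, whole):
    """Express `part` as a percentage of `whole`."""
    return 100.0 * part / whole

# 0.21 ms of overhead for every 1000 ms (one second) of app run time.
run_time_pct = percent(0.21, 1000.0)

# A 1 MB trace buffer compared with a typical app footprint of 50 MB.
memory_pct = percent(1, 50)
```

Here `run_time_pct` comes out at about 0.021, which rounds to the 0.02% shown for app run time, and `memory_pct` is 2, matching the memory figure.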

Coverage
Lastly, they perform a coverage test to see whether AppInsight captures enough information.

Scenario
Compare AppInsight with full instrumentation, which logs all method calls instead of only upcalls.
Run in a virtualized Windows Phone environment.
An automated UI framework simulates random user actions.
Ran each app 100 times, simulating between 10 and 30 user transactions.

Result
The paper claims that the “extra” instrumentation did not discover any new user transaction.
Full instrumentation overhead was 7000 times higher than AppInsight's.

Is logging method calls enough to capture all events? Do we need to log every variable reference?
Can full instrumentation enable new analysis techniques to uncover hidden information?

Related Works
Correlating event traces
LagHunter: collects data about user-perceived delays in interactive applications. Focuses on synchronous delays such as rendering time; developers need to supply method lists.
Magpie: a system for monitoring and modeling server workload that builds transactions from event logs. Needs application semantics supplied by the developer. On Windows Phone, system event logs are not accessible to apps.
XTrace and Pinpoint: trace the path using a special identifier attached to each request. They can trace across processes and apps, whereas AppInsight does not trace across apps.
Aguilera: uses timing analysis to correlate trace logs collected from a system of black boxes.

Finding the critical path of a transaction
Yang and Miller, “Critical Path Analysis for the Execution of Parallel and Distributed Programs”: finding the critical path of parallel and distributed programs.
Barford and Crovella, “Critical Path Analysis of TCP Transactions”: critical paths in TCP transactions; similar ideas in building a graph, but a different design.

Mobile application monitoring
Commercial mobile monitoring products, e.g. Flurry and PreEmptive, focus on statistics collection.
Much previous research work focuses on energy consumption.

Limitations and Future Works
Causal relationships between threads
Does not track data dependencies; threads may use shared variables.
Misses implicit causal relationships introduced by resource contention, such as disk reads and writes.
Cannot untangle complex dependencies introduced by counting semaphores. (The Silverlight framework for Windows Phone does not currently support counting semaphores.)
Does not track any state that a user transaction may leave behind, so it misses dependencies resulting from such saved state.

Definition of user transaction and critical path
Some user interactions may involve multiple user inputs; the current implementation breaks them into multiple transactions.

Privacy issues
Currently uses an anonymous hash value instead of the phone ID to protect user privacy.
There is still a risk of collecting user data from URLs.

Applicability to other platforms
The approach applies where apps have a single, dedicated UI thread; byte code can be rewritten; all possible upcalls and thread start events triggered by the UI can be identified; and the system has a set of well-defined thread synchronization primitives.

Conclusion
AppInsight helps developers easily understand the performance problems of their apps as deployed on real user devices (in the wild).
Automatically traces user transactions
Identifies critical paths and exception paths
Aggregate analysis shows the factors that affect performance
Extremely low overhead; does not slow down apps
Zero developer effort; no code change
Readily deployable on the Windows Phone platform
