Beyond Application Profiling to System Aware Analysis Elena Laskavaia, QNX Bill Graham, QNX.


1 Beyond Application Profiling to System Aware Analysis
Elena Laskavaia, QNX
Bill Graham, QNX

2 Agenda
  - Introduction to application profiling
  - Profiling techniques
  - Visualization
  - Application centric profiling
  - System aware analysis
  - Summary

3 Application profiling
Application
  - In QNX terms, this is an OS process with one or more threads
Application profiling
  - Measuring function and, optionally, line-of-code execution time
  - Visualizing the results
Various techniques are available, each with its own strengths and weaknesses
  - Sampling
  - Call count instrumentation
  - Function instrumentation
  - Kernel event tracing

4 Sampling – how it works
Source:
    int func2(int var) {
        int p = sqrt((double)var);
        return var - p*p;
    }

    void test() {
        int var, sum = 0;   /* sum must start at 0 before accumulating */
        func1();
        for (var = 10; var < 15; ++var) {
            sum += func2(var);
        }
        printf("result=%d\n", sum);
    }
Execution path: test(), func1(), func2(), printf()
Diagram labels: Instruction pointer samples, Annotated source histogram

5 Sampling
Technique
  - Sampling gives a rough estimate of where a process is spending its time
  - The target agent periodically records the instruction pointer of the target process (a sample)
  - The IDE gathers all samples, aggregates them, and presents them to the user as a table or as annotated code
Strengths
  - No special compilation required for the binary
  - Very low overhead
  - Per-instruction granularity
Weaknesses
  - Gives reliable results only for long-running applications
  - Can give incorrect results for timer-based applications (because sampling is itself timer based)
  - Cannot show where a function was called from (mitigated by combining with call count instrumentation)
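The aggregation step described above can be sketched in C. This is only an illustration of how recorded instruction pointers might be bucketed into per-function hit counts. The `func_range` table, `aggregate_samples`, and `percent_of_samples` are hypothetical names, not part of any QNX tool, and a real profiler would take the address ranges from the binary's symbol table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol-table entry: a function occupies [start, end). */
struct func_range {
    const char *name;
    uintptr_t start;
    uintptr_t end;
    unsigned long hits;   /* number of samples that landed in this function */
};

/* Attribute each sampled instruction pointer to the function containing it.
 * Returns the number of samples that matched no known function. */
static unsigned long aggregate_samples(struct func_range *funcs, size_t nfuncs,
                                       const uintptr_t *samples, size_t nsamples)
{
    unsigned long unknown = 0;

    assert(funcs != NULL && samples != NULL);
    for (size_t i = 0; i < nsamples; ++i) {
        int matched = 0;
        for (size_t f = 0; f < nfuncs; ++f) {
            if (samples[i] >= funcs[f].start && samples[i] < funcs[f].end) {
                funcs[f].hits++;
                matched = 1;
                break;
            }
        }
        if (!matched)
            unknown++;
    }
    return unknown;
}

/* Estimated share of run time for one function, as a percentage of all samples. */
static double percent_of_samples(const struct func_range *f, size_t nsamples)
{
    return nsamples ? 100.0 * (double)f->hits / (double)nsamples : 0.0;
}
```

Because the estimate is a ratio of sample counts, it only becomes meaningful once many samples have been collected, which matches the weakness listed for short-running applications.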

6 Call counts – how it works
Source:
    int func2(int var) {
        int p = sqrt((double)var);
        return var - p*p;
    }

    void test() {
        int var, sum = 0;   /* sum must start at 0 before accumulating */
        func1();
        for (var = 10; var < 15; ++var) {
            sum += func2(var);
        }
        printf("result=%d\n", sum);
    }
Execution path: test(), func1(), func2(), printf()
Diagram labels: Call tree and graph; counter annotations = 1, = 2, = 3, = 4, = 5, = 1

7 Call counting
Technique
  - Call counting provides a precise call count for all functions and all function pairs in instrumented code
  - The IDE provides visualization of the call graph and call counts
Strengths
  - Precise call count information
  - Provides call pair information, aggregated as a call graph
  - Relatively low overhead
  - Can augment sampled profiling
Weaknesses
  - Requires instrumentation (special compiler and linker options)
  - Provides no information for non-instrumented libraries
  - Provides call pair information but not full stack frames
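As a rough illustration of call pair counting, the sketch below keeps one counter per (caller, callee) pair. The table layout and the names `record_call`, `call_count`, and `total_calls` are assumptions made for this example. In a real instrumented build the compiler inserts the recording call, and functions are identified by address rather than by name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PAIRS 64

/* One edge of the call graph with its call count. */
struct call_pair {
    const char *caller;
    const char *callee;
    unsigned long count;
};

static struct call_pair pairs[MAX_PAIRS];
static size_t npairs;

/* Record one call from caller to callee.
 * Returns 0 on success, -1 if the table is full. */
static int record_call(const char *caller, const char *callee)
{
    assert(caller != NULL && callee != NULL);
    for (size_t i = 0; i < npairs; ++i) {
        if (strcmp(pairs[i].caller, caller) == 0 &&
            strcmp(pairs[i].callee, callee) == 0) {
            pairs[i].count++;
            return 0;
        }
    }
    if (npairs == MAX_PAIRS)
        return -1;
    pairs[npairs].caller = caller;
    pairs[npairs].callee = callee;
    pairs[npairs].count = 1;
    npairs++;
    return 0;
}

/* Number of calls made from caller to callee (0 if the pair was never seen). */
static unsigned long call_count(const char *caller, const char *callee)
{
    for (size_t i = 0; i < npairs; ++i) {
        if (strcmp(pairs[i].caller, caller) == 0 &&
            strcmp(pairs[i].callee, callee) == 0)
            return pairs[i].count;
    }
    return 0;
}

/* Total number of calls to callee, summed over all callers. */
static unsigned long total_calls(const char *callee)
{
    unsigned long total = 0;
    for (size_t i = 0; i < npairs; ++i) {
        if (strcmp(pairs[i].callee, callee) == 0)
            total += pairs[i].count;
    }
    return total;
}
```

Only the immediate caller is stored for each call, which is why this technique yields a call graph but cannot reconstruct full stack frames.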

8 Function instrumentation – how it works
Source:
    int func2(int var) {
        int p = sqrt((double)var);
        return var - p*p;
    }

    void test() {
        int var, sum = 0;   /* sum must start at 0 before accumulating */
        func1();
        for (var = 10; var < 15; ++var) {
            sum += func2(var);
        }
        printf("result=%d\n", sum);
    }
Execution path: test(), func1(), func2(), printf()
Diagram labels: _func_enter, _func_exit, High resolution function timings

9 Function instrumentation
Technique
  - Function instrumentation records precise function execution time and the runtime call graph
  - Requires instrumenting the binary with the -finstrument-functions compiler option, which provides hooks on entry to and exit from each function
  - Supports all visualization modes: function table, threads tree, call graph, call tree, annotated editor
Strengths
  - Complete runtime call graph, including call counts and full-depth stack frames for each call
  - Precise function execution time (aggregated)
Weaknesses
  - Requires instrumentation (special compiler and linker options)
  - Higher overhead (the overhead is removed from the data shown in the IDE)
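The entry and exit hooks can be sketched with the hook names GCC documents for -finstrument-functions, `__cyg_profile_func_enter` and `__cyg_profile_func_exit`. How the QNX tooling implements its own `_func_enter` and `_func_exit` handlers is not shown in the slides, so the bookkeeping below (the `stats` table, the shadow `stack`, `now_ns`, and `stat_for`) is a hypothetical, single-threaded illustration only.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* The hooks themselves must not be instrumented, or they would recurse. */
#define NO_INSTR __attribute__((no_instrument_function))
#define MAX_DEPTH 128
#define MAX_FUNCS 64

/* Aggregated result per function address. */
struct func_stat {
    void *fn;
    unsigned long calls;
    long long total_ns;
};

static struct func_stat stats[MAX_FUNCS];
static size_t nstats;

/* Shadow call stack holding the entry time of each active call. */
static struct { void *fn; long long t0; } stack[MAX_DEPTH];
static size_t depth;

NO_INSTR static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Find or create the statistics slot for a function; NULL if the table is full. */
NO_INSTR static struct func_stat *stat_for(void *fn)
{
    for (size_t i = 0; i < nstats; ++i) {
        if (stats[i].fn == fn)
            return &stats[i];
    }
    if (nstats == MAX_FUNCS)
        return NULL;
    stats[nstats].fn = fn;
    stats[nstats].calls = 0;
    stats[nstats].total_ns = 0;
    return &stats[nstats++];
}

/* Called by instrumented code on entry to every function. */
NO_INSTR void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    if (depth < MAX_DEPTH) {
        stack[depth].fn = this_fn;
        stack[depth].t0 = now_ns();
    }
    depth++;
}

/* Called by instrumented code on exit from every function. */
NO_INSTR void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    assert(depth > 0);
    depth--;
    if (depth < MAX_DEPTH && stack[depth].fn == this_fn) {
        struct func_stat *s = stat_for(this_fn);
        if (s != NULL) {
            s->calls++;
            s->total_ns += now_ns() - stack[depth].t0;
        }
    }
}
```

The recorded time is inclusive of callees and of the hook overhead itself. A real tool would need per-thread stacks and, as the slide notes, would subtract the instrumentation overhead before presenting results.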

10 Kernel event tracing – how it works
Diagram labels: IDE, Execution path, _func_enter, _func_exit, test(), func1(), func2(), printf()

11 Kernel event tracing
Technique
  - Visualization of kernel trace logging
Strengths
  - System-wide perspective of target behaviour
  - Precise information on context switches
Weaknesses
  - Covers only a relatively short time frame
  - Higher overhead when capturing a trace
  - Requires an instrumented kernel running on the target

12 Visualization

13 The key to understanding your system and the problems you are trying to solve is visualization
Providing alternative views on the same data provides insight:
  - Call trees and graphs
  - Comparison of results from different scenarios
  - Filtering, searching, grouping
  - Source traceability and source annotation

14 Function call trees and graphs
  - Call tree – top down
  - Reverse call tree – bottom up
  - Call graph

15 Comparing results
  - Compare previous profiling sessions
  - Snapshot comparison

16 Grouping, search and filtering
  - Search
  - Group
  - Filters

17 Source code annotation

18 Application centric profiling

19 Isolating performance issues
  - We suspect a performance problem with our application
  - Find out per-process CPU usage using a process monitor

20 Quick peek using sampling
  - Get an overview of CPU time consumption
  - Attach to a running process or launch with profiling
  - Sampling is good for an overview but not detailed enough to find the actual problem
  - So we need to instrument the binary and run it again

21 Instrumented run
  - Compile and run the application with function instrumentation
  - The top-down call tree shows how much time the process spent in each node, starting from the thread and going down to individual functions
  - Expand a node to observe what the function calls; expand the nodes with the bigger contribution first
  - Drill down to a particular function to see aggregated results

22 Instrumented run, cont.
  - Traverse the call tree to look for anomalies
  - In this example, most of the time is consumed by memory allocation functions
  - For example, optimize this code by replacing heap memory allocation with static memory
  - Re-run the application and compare results
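The slides do not show the code that was optimized, so the following is only a hypothetical before-and-after sketch of the kind of change described: a per-call heap scratch buffer replaced by a static one. The names `format_heap`, `format_static`, `copy_out`, and `MSG_MAX` are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MSG_MAX 64

/* Copy src into out, truncating to fit; returns the number of bytes copied. */
static size_t copy_out(const char *src, char *out, size_t outlen)
{
    size_t n = strlen(src);

    assert(out != NULL && outlen > 0);
    if (n >= outlen)
        n = outlen - 1;
    memcpy(out, src, n);
    out[n] = '\0';
    return n;
}

/* Before: allocates and frees a scratch buffer on every call. */
static size_t format_heap(int value, char *out, size_t outlen)
{
    char *tmp = malloc(MSG_MAX);
    size_t n;

    if (tmp == NULL)
        return 0;
    snprintf(tmp, MSG_MAX, "value=%d", value);
    n = copy_out(tmp, out, outlen);
    free(tmp);
    return n;
}

/* After: reuses one static scratch buffer, so no allocator calls are made.
 * Trade-off: this version is not reentrant or thread-safe. */
static size_t format_static(int value, char *out, size_t outlen)
{
    static char tmp[MSG_MAX];

    snprintf(tmp, MSG_MAX, "value=%d", value);
    return copy_out(tmp, out, outlen);
}
```

Both versions produce the same output. Whether the static version is actually faster in a given application is exactly what the re-run and session comparison are meant to confirm.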

23 Compare results
  - Recompile the changes and re-run the instrumented application
  - Select the old and new sessions and run “Compare session”
  - The compare tree shows the differences between the old and new sessions
  - Tooltips show the absolute values of the old and new runs
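Conceptually, a session comparison reduces to a per-function difference between two sets of timings. The sketch below illustrates that idea only. It is not how the IDE implements “Compare session”, and `func_time`, `time_in`, and `delta_ms` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Aggregated time for one function in one profiling session. */
struct func_time {
    const char *name;
    double ms;
};

/* Time recorded for a function in a session; 0 if it does not appear. */
static double time_in(const struct func_time *s, size_t n, const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(s[i].name, name) == 0)
            return s[i].ms;
    }
    return 0.0;
}

/* Difference (new minus old) for one function; negative means it got faster. */
static double delta_ms(const struct func_time *old_s, size_t nold,
                       const struct func_time *new_s, size_t nnew,
                       const char *name)
{
    return time_in(new_s, nnew, name) - time_in(old_s, nold, name);
}
```

Treating a missing function as zero time lets the comparison cover functions that were removed by the change as well as functions that were newly introduced.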

24 System aware analysis

25 System and application profiling
  - Traditionally, application profiling looks at processes in isolation
  - The application centric view works for isolating local performance problems
      - Algorithms
      - CPU-intensive code
  - Embedded, real-time applications never work in isolation
      - Need to consider interacting processes
      - Temporary client-server relationships are typical as processes request data and services from each other
Diagram labels: Process 1, Process 2, Device

26 Example - client
  - Investigation first leads to the client process
  - We see that the majority of the time is spent in the getServerValue function
  - Where is the CPU time really being spent?

27 Observe CPU usage

28 Example – client, server and device!
  - Looking at the system profile tells us that the server is using block I/O (EIDE disk driver)
  - We can see as much detail as we need, down to individual interactions with the driver
  - In this example, we can clearly see where time is spent in the application versus the driver

29 System awareness
  - System profiling provides visualization of behavior among processes, including devices and the OS
  - Function instrumentation provides process and function details in the system event trace
  - Combine the function call information with system awareness to understand what is really going on

30 Example - server
  - Investigation leads to the server process, where most of the time is spent in the getProperty() function
  - Eventually, I/O is required to satisfy the requests
Diagram labels: File, Device, Network

31 Summary

32 Summary
  - Various options for application profiling: sampling, call counts, function instrumentation
  - The technique used depends on the need
  - Visualization tools are essential for analysis
  - Problem solving and optimization require a system view
  - Leveraging both system and application profiling is key

33 Thank you! Questions & Answers

