Measuring Where CPU Time Goes


1 Profiling Code
Measuring Where CPU Time Goes

2 My Code Is Too Slow – Why?
Look at the code → what O() is it?
Loading: O(N) is unavoidable and OK; O(N^2) → not good if N can get big
Look-ups: if you'll do something N times, try to keep each one O(1) or O(log N)
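As a rough illustration of the look-up advice, the sketch below contrasts an O(N) scan with an index built once at load time. The `Street` record and the function names are hypothetical, chosen only for this example; they are not from the course code.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record type, used only for this illustration.
struct Street {
    std::string name;
    int id;
};

// O(N) per look-up: scans the whole vector in the worst case.
// Doing this N times costs O(N^2) overall.
int find_id_linear(const std::vector<Street>& streets, const std::string& name) {
    for (const Street& s : streets) {
        if (s.name == name) return s.id;
    }
    return -1;  // not found
}

// Built once while loading, in O(N).
std::unordered_map<std::string, int> build_index(const std::vector<Street>& streets) {
    std::unordered_map<std::string, int> index;
    index.reserve(streets.size());
    for (const Street& s : streets) {
        index.emplace(s.name, s.id);  // keeps the first id if a name repeats
    }
    return index;
}

// O(1) on average per look-up, so N look-ups cost O(N) overall.
int find_id_indexed(const std::unordered_map<std::string, int>& index,
                    const std::string& name) {
    auto it = index.find(name);
    return it == index.end() ? -1 : it->second;
}
```

A `std::map` would give O(log N) look-ups instead and keeps the keys sorted, which can matter if ordered traversal is also needed.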

3 My Code Is Complex!
Can't figure out O()
Or O() looks OK, but the code is still not fast enough
Profile! Measure where the time goes

4 Simple Profiling: Manual Random Sampling
Run the program in the debugger
Stop it with Debug → Pause
Look at the subroutine and line where you paused
Examine the call stack to see how you got there
Continue execution with Debug → Continue
More time in a routine → higher probability of stopping there
Usually stop in the same routine → found the problem

5 Detailed Profiling: gprof Tool
Randomly samples the function your program is in, about every 1 ms
Also records how it got there (call stack / call graph)
Then summarizes the output for you
How is this random sampling done?
Program asks to be interrupted ~1000x / second by the operating system
Each interrupt → record the function you are in

6 gprof Tool: How to Use
Compile your program with the right options
Select the profile configuration, or run make CONF=profile
Adds the -pg option to the compiler → instruments the executable
Turns off function inlining → all function calls exist → easier to interpret

7 gprof Tool: How to Use
Run the program normally
    ./mapper
Collects statistics, stores them in a big file (gmon.out)
Program runs only a little slower (~30%)
Run gprof to summarize / interpret the output
    gprof mapper > outfile.txt
Reads gmon.out, generates readable outfile.txt
Even better: can visualize (graphics)
    gprof mapper gmon.out | gprof2dot.py -s | xdot -
ECE 297 Profiling Quick Start Guide

8 Example: Extract Negatives
#include <vector>
using namespace std;
vector<int> extract_negatives (vector<int>& numbers) {
    vector<int> negatives;
    int i = 0;
    while (i < numbers.size()) {
        if (numbers[i] < 0) {
            negatives.push_back(numbers[i]);
            numbers.erase(numbers.begin() + i);
        } else {
            i++; // Next element
        }
    }
    return negatives;
}
Takes 30 s when given an 800,000 element vector. Too slow → why?

9 Visualized Call Graph
extract_negatives() called once by main
Takes 71% of time
push_back()
Estimated 37% of time
This is an over-estimate: sample-based profiling isn't perfect
erase() called over 640,000 times
Takes 53% of time → biggest problem
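The profile points at erase(): each call shifts every later element down by one, so removing many elements one at a time is O(N^2) overall. One possible fix, sketched below, copies the negatives out in one pass and then removes them with a single erase/remove_if pass, for O(N) total. The name `extract_negatives_fast` is chosen here only to avoid clashing with the slide's function; this is one way to restructure the loop, not necessarily the one used in the course.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Returns the negative values (in their original order) and removes them
// from `numbers`. Two linear passes instead of one erase() per negative.
std::vector<int> extract_negatives_fast(std::vector<int>& numbers) {
    auto is_negative = [](int x) { return x < 0; };

    // Pass 1: copy the negatives out, O(N).
    std::vector<int> negatives;
    std::copy_if(numbers.begin(), numbers.end(),
                 std::back_inserter(negatives), is_negative);

    // Pass 2: erase-remove idiom. remove_if compacts the kept elements
    // to the front in one pass; erase then trims the tail once.
    numbers.erase(std::remove_if(numbers.begin(), numbers.end(), is_negative),
                  numbers.end());

    return negatives;
}
```

The relative order of the remaining non-negative values is preserved, matching the behaviour of the original loop.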

10 Milestone 4: Courier Company

11 Problem Definition
[Figure: map with intersections labeled A, B, C, D]
Given N deliveries (pick up, drop off): N = 4 here, all at intersections
And given M courier truck depots: M = 3 here, all at intersections
Return: a low travel time path
Starting and ending at some depot
And reaching all 2N delivery intersections
And always picking up a package before delivering it

12 Possible Solution?
[Figure: candidate path on the map with intersections labeled A, B, C, D]
No → dropping off packages before they're picked up!

13 Legal Solution
[Figure: legal path on the map with intersections labeled A, B, C, D]
Output: vector of street segment ids

14 Simple Solution: Re-use m3 path-finder
// Go from first depot to first package pick up
path = find_path (depot[0], delivery[0].pickUp);
// Complete deliveries, in order
for (int i = 0; i < N-1; i++) {
    path += find_path (delivery[i].pickUp, delivery[i].dropOff);
    path += find_path (delivery[i].dropOff, delivery[i+1].pickUp);
}
// Drop off last package
path += find_path (delivery[N-1].pickUp, delivery[N-1].dropOff);
// Go back to the first depot to drop off the truck
path += find_path (delivery[N-1].dropOff, depot[0]);

15 Possible Solution
[Figure: resulting path on the map, intersections labeled A, B, C, D, with markers 1 and 2]
Lots of wasted travel!

16 More Logical Solution
[Figure: shorter path on the map with intersections labeled A, B, C, D]
Need to optimize delivery order!

17 Exhaustive Algorithm?
Try all possible delivery orders
Pick the one with lowest travel time
How many combinations?
M truck depots
N deliveries → 2N pick-up + drop-off intersections
Pick one of M starting locations
Then pick one of 2N pick-up/drop-off intersections
Then one of 2N-1 for the second intersection
… (repeat until last delivery)
Then M places to drop off truck
M * 2N * (2N-1) * (2N-2) * … * 1 * M = M^2 (2N)!
Some of these are illegal orders → say algorithm checks legality after generating the solution
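To make the "generate, then check legality" step concrete, here is a small sketch that enumerates every ordering of the 2N stops for a tiny N and counts the legal ones. The stop encoding is an assumption made for this example: stop k (for k < N) is the pick-up of delivery k, and stop k + N is its drop-off. The function names are hypothetical.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// An order is legal if every drop-off comes after its own pick-up.
// Encoding (assumed for this sketch): stop k < n is the pick-up of
// delivery k, stop k + n is its drop-off.
bool is_legal_order(const std::vector<int>& order, int n) {
    std::vector<bool> picked_up(n, false);
    for (int stop : order) {
        if (stop < n) {
            picked_up[stop] = true;
        } else if (!picked_up[stop - n]) {
            return false;  // dropping off a package not yet picked up
        }
    }
    return true;
}

// Brute force over all (2n)! orderings; only feasible for very small n.
long long count_legal_orders(int n) {
    std::vector<int> order(2 * n);
    std::iota(order.begin(), order.end(), 0);  // sorted start: 0, 1, ..., 2n-1
    long long legal = 0;
    do {
        if (is_legal_order(order, n)) ++legal;
    } while (std::next_permutation(order.begin(), order.end()));
    return legal;
}
```

For N = 2 this visits 4! = 24 orderings and for N = 3 it visits 6! = 720. Only a fraction (2N)! / 2^N of them are legal, since each delivery independently has its pick-up before its drop-off in exactly half of the orderings.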

18 Exhaustive Algorithm?
Say M = 10, N = 100
10^2 * (2N)! ≈ 7.9 x 10^376 ≈ 10^377
Invoke find_path () 2N+1 times to get path between each intersection
Say find_path takes 0.1 s (very good!)
7.9 x 10^376 * 201 * 0.1 s ≈ 1.6 x 10^378 s → 5 x 10^370 years!
Lifetime of universe: ~14 x 10^9 years!
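The numbers on this slide are far too large for any built-in type, but the arithmetic can be checked in log space. The sketch below uses `std::lgamma` (where lgamma(n + 1) = ln(n!)) to compute log10 of M^2 (2N)! and of the resulting run time in years. The function names and the 365.25-day year are choices made for this example.

```cpp
#include <cmath>

// log10 of M^2 * (2N)!, the number of candidate orders.
double log10_num_orders(int num_depots, int num_deliveries) {
    // lgamma(n + 1) == ln(n!), so this is ln((2N)!).
    double ln_factorial = std::lgamma(2.0 * num_deliveries + 1.0);
    return 2.0 * std::log10(static_cast<double>(num_depots))
         + ln_factorial / std::log(10.0);
}

// log10 of the total run time in years, assuming each candidate order
// needs 2N + 1 find_path calls of `seconds_per_call` seconds each.
double log10_runtime_years(int num_depots, int num_deliveries,
                           double seconds_per_call) {
    const double seconds_per_year = 365.25 * 24.0 * 3600.0;
    double calls_per_order = 2.0 * num_deliveries + 1.0;
    return log10_num_orders(num_depots, num_deliveries)
         + std::log10(calls_per_order * seconds_per_call)
         - std::log10(seconds_per_year);
}
```

With M = 10, N = 100 and 0.1 s per call, `log10_num_orders` comes out just under 377 and `log10_runtime_years` is about 370.7, i.e. roughly 5 x 10^370 years.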

19 Traveling Salesman Problem
We are solving a variation of the traveling salesman problem
Computationally hard problem
For N deliveries, no known algorithm guarantees the optimal (lowest travel time) solution in polynomial time, i.e. in O(N^k) for any k
Known exact algorithms take exponential time, O(2^N) or worse
Need to use heuristics to solve
Most research problems are computationally hard
Integrated circuit design, drug design, transportation network design, …

20 Stephen Cook: Pioneer of Complexity Theory
U of T professor in Computer Science
NP-complete problems
No polynomial time solution known on conventional computers
Proved: if a method to solve any one of these problems in polynomial time is found, then all these problems can be solved in polynomial time
P vs. NP: most famous open problem in computer science
1970: denied tenure at UC Berkeley
1971: published most famous paper
1982: won Turing award

21 Heuristic Algorithms
Not guaranteed to find best answer, but run in a reasonable time

22 Heuristic 1
Ideas?
Go from any depot to nearest pickup, p          // O(M*N)
while (packages to deliver) {                   // N iterations
    drop off package at delivery[p].dropOff
    p = nearest remaining pickup                // O(N)
}
Go to nearest depot                             // O(M)
Overall: O(M*N) + O(N*N) + O(M) = O(M*N + N^2)
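The greedy heuristic above can be sketched as follows. This is a minimal illustration under stated assumptions, not the course's reference solution: intersections are plain 2D points, and `travel_time` is a hypothetical stand-in (Manhattan distance) for the real travel time that a path-finder would return. The `Point`, `Delivery` and `Route` types are invented for the example.

```cpp
#include <cstdlib>
#include <limits>
#include <vector>

struct Point { int x, y; };
struct Delivery { Point pickUp, dropOff; };

// Assumption: Manhattan distance stands in for real travel time.
int travel_time(Point a, Point b) {
    return std::abs(a.x - b.x) + std::abs(a.y - b.y);
}

struct Route {
    int start_depot;                  // index into depots, -1 if none
    std::vector<int> delivery_order;  // indices into deliveries
    int end_depot;                    // index into depots, -1 if none
};

Route greedy_route(const std::vector<Point>& depots,
                   const std::vector<Delivery>& deliveries) {
    Route route{-1, {}, -1};
    if (depots.empty() || deliveries.empty()) return route;

    const int INF = std::numeric_limits<int>::max();
    const int M = static_cast<int>(depots.size());
    const int N = static_cast<int>(deliveries.size());
    std::vector<bool> done(N, false);

    // O(M*N): closest (depot, pick-up) pair.
    int p = -1, best = INF;
    for (int d = 0; d < M; ++d) {
        for (int i = 0; i < N; ++i) {
            int t = travel_time(depots[d], deliveries[i].pickUp);
            if (t < best) { best = t; route.start_depot = d; p = i; }
        }
    }

    // N iterations, each with an O(N) scan: O(N^2).
    Point here = deliveries[p].pickUp;
    while (p != -1) {
        here = deliveries[p].dropOff;  // deliver package p
        done[p] = true;
        route.delivery_order.push_back(p);
        p = -1;
        best = INF;
        for (int i = 0; i < N; ++i) {  // nearest remaining pick-up
            if (done[i]) continue;
            int t = travel_time(here, deliveries[i].pickUp);
            if (t < best) { best = t; p = i; }
        }
    }

    // O(M): nearest depot to the last drop-off.
    best = INF;
    for (int d = 0; d < M; ++d) {
        int t = travel_time(here, depots[d]);
        if (t < best) { best = t; route.end_depot = d; }
    }
    return route;
}
```

Because each package is dropped off immediately after it is picked up, every route this produces is legal. It only chooses the order of deliveries; the street-segment path between consecutive stops would still come from the path-finder.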

