Presentation is loading. Please wait.

Presentation is loading. Please wait.

Lecture 4 Timing Linux Processes & Threads

Similar presentations


Presentation on theme: "Lecture 4 Timing Linux Processes & Threads"— Presentation transcript:

1 Lecture 4 Timing Linux Processes & Threads
CSCE 713 Advanced Computer Architecture Lecture 4 Timing Linux Processes & Threads Topics Readings January 19, 2012

2 Overview Last Time Readings for today New
Posix Pthreads: create, join, exit, mutexes /class/csce Code and Data Readings for today New Website alive and kicking; dropbox too! From Last time: Gauss-Seidel Method, Barriers, Threads Assignment Next time performance evaluation, barriers and MPI intro

3 Threads programming Assignment
Matrix addition (embarassingly parallel) Versions Sequential Sequential with blocking factor Sequential Read without conversions Multi threaded passing number of threads as command line argument (args.c code should be distributed as an example) Plot of several runs Next time

4 Time in the Computer World
Clock cycle time = ? 1GHz processor has 10-9 = 1 ns clock cycle time Computer Systems: A Programmers Perspective, Bryant and O’Hallaron

5 Times in the Unix World Command
Real Time User Time System Time Waiting Time – time waiting on I/O operations and while other processes execute Clock Time Which time is most important? Latency vs throughput

6 The Time Command TIME(1) TIME(1) NAME time - run programs and summarize system resource usage Example of a sleepy program real 3m10.006s user 0m0.004s sys 0m0.000s

7 MIPS, MFLOPS, Benchmarks SPEC2006 – relevance to this course

8 Multitasking and Interval Interrupts
Recall Multitasking from OS class: multiple processes ready to go scheduler decides who goes next each process given small time slice (1-10ms) The process end either because of an I/O operation, page-fault, even cache-miss, or it uses up time slice System has an “interval timer” to interupt at the end of the time slice

9 Computer Systems: A Programmers Perspective, Bryant and O’Hallaron
CSAPP Fig 9.2 . Computer Systems: A Programmers Perspective, Bryant and O’Hallaron

10 Measuring Time by Interval Counting
. Computer Systems: A Programmers Perspective, Bryant and O’Hallaron

11 Practice Problem 9.4 On a system with a timer interval of 10 ms, some segment of process A is recorded as requiring 70 ms, combining both system and user time. What are the minimum and maximum actual times used by thissegment?

12 Programmer Access to Interval Timers
#include <sys/times.h> struct tms clock_t tms_utime; /* user time * / clock_t tms_s time; /* system time * / clock_t tms_cutime; /* user time of reaped children */ clock_t tms_cstime; /* system time of reaped children */ } ; clock_t times(struct tms *buf); Returns: number of clock ticks elapsed since system started Computer Systems: A Programmers Perspective, Bryant and O’Hallaron

13 Measuring interval counting accuracy. Fig 9.8

14 Cycle Counters on the IA32
Cycle counters increment every clock cycle 32 bit counter with 1GHz clock wraps after 232/109 ~ 4.3 seconds 64 bit counter takes a lot longer 264/109 ~ 570 years IA32 counter accessed with the “rdtsc” (read time stamp counter) instruction It places high order 32 bits in %edx and low 32 in %eax It would be nice if we had a c interface void access counter(unsigned *hi, unsigned *lo);

15 Including Assembly Code in C
void access_counter(unsigned *hi, unsigned *lo) { asm("rdtsc; movl %%edx,%0; movl %%eax,%1" /* Read cycle counter */ : "=r" (*hi), "=r" (*lo) /* and move results to */ : /* No input */ /* the two outputs */ : "%edx", "%eax"); }

16 Closer Look at Extended ASM
void access_counter (unsigned *hi, unsigned *lo) { /* Get cycle counter */ asm("rdtsc; movl %%edx,%0; movl %%eax,%1" : "=r" (*hi), "=r" (*lo) : /* No input */ : "%edx", "%eax"); } asm(“Instruction String" : Output List : Input List : Clobbers List); } Instruction String Series of assembly commands Separated by “;” or “\n” Use “%%” where normally would use “%”

17 Closer Look at Extended ASM
void access_counter (unsigned *hi, unsigned *lo) { /* Get cycle counter */ asm("rdtsc; movl %%edx,%0; movl %%eax,%1" : "=r" (*hi), "=r" (*lo) : /* No input */ : "%edx", "%eax"); } asm(“Instruction String" : Output List : Input List : Clobbers List); } Output List Expressions indicating destinations for values %0, %1, …, %j Enclosed in parentheses Must be lvalue Value that can appear on LHS of assignment Tag "=r" indicates that symbolic value (%0, etc.), should be replaced by register

18 Closer Look at Extended ASM
void access_counter (unsigned *hi, unsigned *lo) { /* Get cycle counter */ asm("rdtsc; movl %%edx,%0; movl %%eax,%1" : "=r" (*hi), "=r" (*lo) : /* No input */ : "%edx", "%eax"); } asm(“Instruction String" : Output List : Input List : Clobbers List); } Input List Series of expressions indicating sources for values %j+1, %j+2, … Enclosed in parentheses Any expression returning value Tag "r" indicates that symbolic value (%0, etc.) will come from register

19 Closer Look at Extended ASM
void access_counter (unsigned *hi, unsigned *lo) { /* Get cycle counter */ asm("rdtsc; movl %%edx,%0; movl %%eax,%1" : "=r" (*hi), "=r" (*lo) : /* No input */ : "%edx", "%eax"); } asm(“Instruction String" : Output List : Input List : Clobbers List); } Clobbers List List of register names that get altered by assembly instruction Compiler will make sure doesn’t store something in one of these registers that must be preserved across asm Value set before & used after

20 Using access_counter #include "clock.h" void start counter(); double get counter();

21 The K-Best Measurement Scheme
Measurements of process time in a loaded system are always overestimates Context switching, cache operations, branch predictions Interference from other processes Even unloaded I/O operations will vary because of locations of disk heads etc. Heisenberg uncertainty principle Heisenbugs – bugs influenced/created by monitoring code

22 The K-Best Measurement Scheme
Warm up the instruction cache Record the K fastest times If these agree with some tolerance say .1% then the fastest of these represents the true execution time.

23 Measuring Using Time of Day Code/CSAPP/perf/tod. c
#include <stdlib.h> #include <stdio.h> #include <time.h> #include <sys/time.h> #include <unistd.h> static struct timeval tstart; /* Record current time */ void start_timer() { gettimeofday(&tstart, NULL); }

24 /. Get number of seconds since last call to start_timer
/* Get number of seconds since last call to start_timer */ double get_timer() { struct timeval tfinish; long sec, usec; gettimeofday(&tfinish, NULL); sec = tfinish.tv_sec - tstart.tv_sec; usec = tfinish.tv_usec - tstart.tv_usec; return sec + 1e-6*usec; }

25 /. Determine how many "seconds" are in tod counter
/* Determine how many "seconds" are in tod counter */ static void callibrate_tod() { double quick, slow; start_timer(); quick = get_timer(); sleep(1); slow = get_timer(); printf("%.2f - %f = %.2f seconds/sleep second\n", slow, quick, slow-quick); }

26 static void run_timer() { int i = 0; double t1, t2, t3, t4; start_timer(); t1 = get_timer(); do { t2 = get_timer(); } while (t2 == t1); t3 = get_timer(); i++; } while (t3 == t2); printf("Time = %f usecs, in %d iterations, %f usec/iteration\n“, 1e6 *(t3-t2), i, 1e6 *(t3-t2)/i);

27 start_timer(); i = 0; do { t4 = get_timer(); i++; } while (t4 < 1
start_timer(); i = 0; do { t4 = get_timer(); i++; } while (t4 < 1.0); printf("%d iterations in %f secs = %f usec/iteration\n", i, t4, 1e6*t4/i); }

28 static void check_time() { long e; int s, m, h, d, y; start_timer(); e = tstart.tv_sec; s = e % 60; e = e / 60; m = e % 60; h = e % 24; e = e / 24; d = e % 365; e = e / 365; y = e; printf("This clock started %d years, %d days, %d hours, %d minutes, %d seconds ago\n“, y, d, h, m, s); }

29 int main(int arg, char *argv[]) { callibrate_tod(); run_timer(); check_time(); return 0; }

30

31

32 Gprof – profiling Linux programs
Links

33 Man gprof GPROF(1) GNU GPROF(1) NAME gprof - display call graph profile data SYNOPSIS gprof [ -[abcDhilLrsTvwxyz] ] [ -[ACeEfFJnNOpPqQZ][name] ] [ -I dirs ] [ -d[num] ] [ -k from/to ] …] [ --[no-]exec-counts[=name] ] … [ --debug[=level] ] [ --function-ordering ] [ --file-info ] [ --help ] [ --line ] [ --min-count=n ] [ --static-call-graph ] [ --sum ] [ --table-length=len ] [ image-file ] [ profile-file ... ]

34 Gprof – GNU profiler -Arnout Engelen $ time event2dot //this took more than 3 minutes on this input: real 3m36.316s user 0m55.590s sys 0m1.070s $ g++ -pg dotgen.cpp readfile.cpp main.cpp graph.cpp config.cpp -o event2dot $ gprof event2dot | less'. gprof now shows us the following functions are important: % cumulative self self total time seconds seconds calls s/call s/call name CompareNodes(…) getNode(…) CompareEdges(…) addAnnotatedEdge(… ) addEdge(…)

35 Valgrind Valgrind is a memory mismanagement detector. It shows you memory leaks, deallocation errors, cache profiling Memcheck can detect: Use of uninitialised memory Reading/writing memory after it has been free'd Reading/writing off the end of malloc'd blocks Reading/writing inappropriate areas on the stack Memory leaks -- where pointers to malloc'd blocks are lost forever Mismatched use of malloc/new/new; vs free/delete/delete; Overlapping src and dst pointers in memcpy() and related functions Some misuses of the POSIX pthreads API

36 General Parallel Computing Wiki-Links

37 Libraries

38


Download ppt "Lecture 4 Timing Linux Processes & Threads"

Similar presentations


Ads by Google