Rassul Ayani 1 Performance of parallel and distributed systems
- What is the purpose of measurement?
  - To evaluate a system (or an architecture)
  - To compare two or more systems
  - To compare different algorithms
- Metric to be used?
  - Speedup

Rassul Ayani 2 Workload-Driven Evaluation
- Approach
  - Run a workload (trace) and measure performance of the system
- Traces
  - Real trace
  - Synthetic trace
- Other issues
  - How representative is the workload?

Rassul Ayani 3 Type of systems
- For existing systems
  - Run workload and evaluate performance of the system
  - Problem: Is the workload representative?
- For future systems (an architectural idea):
  - Develop a simulator of the system
  - Run workload and evaluate the system
  - Problem: Developing a simulator is difficult and expensive
  - How do we define system parameters, such as memory access time and communication cost?

Rassul Ayani 4 Measuring Performance
- Performance metric most important to end user
- Performance = Work / Time unit
- Performance improvement due to parallelism:
  Speedup(p) = Time(1) / Time(p)

Rassul Ayani 5 Performance evaluation of a parallel computer
- Speedup(p) = Time(1) / Time(p)
- What is Time(1)?
  1. Parallel program on one processor of the parallel machine?
  2. A sequential algorithm on one processor of the parallel machine?
  3. “Best” sequential program on one processor of the parallel machine?
  4. “Best” sequential program on an agreed-upon standard machine?
- Which one is reasonable?
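The choice of Time(1) can be illustrated with a small sketch. The timings below are made up for illustration and do not come from the slides; the point is only that the same Time(p) gives different speedups depending on which baseline is used.

```python
def speedup(time_1: float, time_p: float) -> float:
    """Speedup(p) = Time(1) / Time(p)."""
    if time_1 <= 0 or time_p <= 0:
        raise ValueError("execution times must be positive")
    return time_1 / time_p


# Hypothetical timings in seconds (not measured data).
time_p = 5.0  # parallel program on p processors
baselines = {
    "parallel program on 1 processor": 100.0,
    "best sequential program": 60.0,
}

# Same Time(p), different Time(1): the reported speedup changes.
results = {name: speedup(t1, time_p) for name, t1 in baselines.items()}
for name, s in results.items():
    print(f"Time(1) = {name}: speedup = {s:.1f}")
```

With these assumed numbers, the parallel program run on one processor looks like a more favourable baseline than the best sequential program, which is why the definition of Time(1) should be stated with every speedup figure.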

Rassul Ayani 6 Speedup
- What is Time(p)?
  - The time needed by the parallel machine to run the same workload?
  - Is it fair?
  - How does the problem size affect our measurement?

Rassul Ayani 7 Example 1: Our experience
Parallel simulation of a Multistage Interconnection Network (MIN)
- d: number of stages
- n: number of nodes
- n = (d+1) * 2^d
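As a quick worked example of the formula n = (d+1) * 2^d, the sketch below tabulates the node count for a few values of d. The range 11 to 15 is chosen to match the sizes in the speedup table on the next slide, on the assumption (not stated in the slides) that "size" there means the number of stages d.

```python
def min_nodes(d: int) -> int:
    """Number of nodes n = (d + 1) * 2**d in a MIN with d stages."""
    if d < 0:
        raise ValueError("number of stages must be non-negative")
    return (d + 1) * 2 ** d


# Node count grows slightly faster than doubling with each extra stage.
for d in range(11, 16):
    print(f"d = {d:2d}  n = {min_nodes(d)}")
```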

Rassul Ayani 8 Speedup of MIN on CM-2
| Size    | 11 | 12 | 13 | 14   | 15   |
| Speedup | 40 | 45 | 70 | 1600 | 2200 |
Speedup = T(1) / T(p), where
- T(1) = execution time of the sequential simulator on a Sun SPARC
- T(p) = execution time of the parallel simulator on a CM-2 with 8K processors

Rassul Ayani 9 Why is problem size important?
- The problem size is too small:
  - May be appropriate for a small machine, but not for the parallel machine (PM)
  - Not enough work for the PM
  - Parallelism overheads begin to dominate benefits for the PM
    - Load imbalance
    - Communication-to-computation ratio
  - May even achieve slowdowns
  - Doesn’t reflect real usage, and inappropriate for large machines
  - Can exaggerate benefits of architectural improvements, especially when measured as percentage improvement in performance

Rassul Ayani 10 Size is too large
- May not “fit” in a small machine
  - Can’t run
  - Thrashing to disk
  - Working set doesn’t fit in cache
- May lead to superlinear speedup
- What is the right size?
- How do we find the right size?

Rassul Ayani 11 Scaling: Example 2
Small and big equation solvers on SGI Origin2000 (from Parallel Computer Architecture, Culler & Singh)

Rassul Ayani 12 Scaling issues
- Important issues
  - Reasonable problem size
  - Scaling problem size
  - Scaling machine size
- Example
  - Consider a dispatcher-based cluster and compare three load-balancing algorithms
    - Round Robin (RR)
    - Least Connection (LC) first
    - Least Loaded (LL) first
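The three policies can be sketched as follows. This is a minimal illustration, not the simulator behind the results in these slides: the `Server` and `Dispatcher` classes are hypothetical, "load" is assumed to mean outstanding work in milliseconds, and request completion is not modelled, so connection counts and loads only ever grow.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    connections: int = 0  # open connections, used by LC
    load: float = 0.0     # outstanding work in ms, used by LL (assumed load measure)


class Dispatcher:
    """Assigns each incoming request to one server according to a policy."""

    def __init__(self, servers, policy):
        self.servers = list(servers)
        self.policy = policy  # "RR", "LC" or "LL"
        self._next = 0        # round-robin position

    def pick(self):
        if self.policy == "RR":
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            return server
        if self.policy == "LC":
            # Fewest open connections; ties go to the first such server.
            return min(self.servers, key=lambda s: s.connections)
        if self.policy == "LL":
            # Least outstanding work; ties go to the first such server.
            return min(self.servers, key=lambda s: s.load)
        raise ValueError(f"unknown policy: {self.policy}")

    def dispatch(self, work_ms):
        server = self.pick()
        server.connections += 1
        server.load += work_ms
        return server


def run(policy, work_items):
    """Dispatch the requests to two fresh servers and return the chosen names."""
    dispatcher = Dispatcher([Server("s0"), Server("s1")], policy)
    return [dispatcher.dispatch(w).name for w in work_items]


# One long request followed by three short ones (hypothetical workload).
requests = [100.0, 1.0, 1.0, 1.0]
for policy in ("RR", "LC", "LL"):
    print(policy, run(policy, requests))
```

In this toy workload RR and LC make the same choices because no connection ever closes, while LL steers the short requests away from the server holding the long one.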

Rassul Ayani 13 Scaling: example
Dispatcher-based web server

Rassul Ayani 14 Determine the problem size

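Rassul Ayani 15 Scale problem size, but keep machine size fixed
Table 3: Performance of a 4-Server Cluster
| Arrival rate (requests/sec) | Avg. waiting time (ms), Baseline | Avg. waiting time (ms), RR | Avg. waiting time (ms), LC | Avg. response time (ms), Baseline | Avg. response time (ms), RR | Avg. response time (ms), LC | Avg. utilization, Baseline | Avg. utilization, RR | Avg. utilization, LC |
|---|---|---|---|---|---|---|---|---|---|
| 250 | 0.2 | 10.5 | 0.9 | 3.8 | 14.1 | 4.5 | 0.226 ± 0.001 | 0.226 ± 0.002 | 0.23 ± 0.001 |
| 500 | 1.8 | 32.99 | 3.9 | 5.4 | 36.6 | 7.5 | 0.453 ± 0.002 | 0.453 ± 0.003 | 0.453 ± 0.002 |
| 750 | 49.5 | 127.5 | 53.1 | 53.2 | 131.1 | 56.7 | 0.679 ± 0.001 | 0.679 ± 0.004 | 0.680 ± 0.001 |
| 1000 | 849.5 | 1112.3 | 853.0 | 853.1 | 1115.9 | 856.7 | 0.906 ± 0.00 | 0.905 ± 0.006 | 0.905 ± 0.001 |
| 1250 | 70084 | 70118 | 70085 | 70087 | 70121 | 70088 | 0.998 ± 0.001 | 0.991 ± 0.006 | 0.997 ± 0.001 |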
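One way to sanity-check the utilization column of Table 3 is sketched below. The per-request service time is not stated in the slides; the sketch assumes it can be estimated as response time minus waiting time (about 3.6 ms in the baseline rows) and that utilization follows the standard utilization law, rho = arrival rate * service time / number of servers. Only the rows up to 1000 requests/sec are used, because the 1250 row is reported in whole milliseconds and the difference is too coarse.

```python
SERVERS = 4  # Table 3 describes a 4-server cluster

# Baseline columns of Table 3:
# (arrival rate in requests/sec, waiting time ms, response time ms, utilization)
BASELINE_ROWS = [
    (250, 0.2, 3.8, 0.226),
    (500, 1.8, 5.4, 0.453),
    (750, 49.5, 53.2, 0.679),
    (1000, 849.5, 853.1, 0.906),
]


def predicted_utilization(rate_per_sec, service_ms, servers):
    """Utilization law: rho = arrival rate * service time / number of servers."""
    return rate_per_sec * (service_ms / 1000.0) / servers


checks = []
for rate, wait_ms, resp_ms, reported in BASELINE_ROWS:
    service_ms = resp_ms - wait_ms  # assumption: response = waiting + service
    predicted = predicted_utilization(rate, service_ms, SERVERS)
    checks.append((rate, predicted, reported))
    print(f"{rate:5d} req/s: predicted {predicted:.3f}, reported {reported:.3f}")
```

Under the same assumption, 1250 requests/sec with a 3.6 ms service time would give 1250 * 0.0036 / 4, roughly 1.13, that is, more load than four servers can absorb. That would be consistent with the waiting times of about 70 seconds in the last row of the table.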

Rassul Ayani 16 Scaling problem size (cont’d)
- Conclusion: for low arrival rates LC is much better than RR, but for high arrival rates both converge to the baseline (BL) algorithm
- Is it a fair conclusion?

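Rassul Ayani 17 Scaling problem and machine size
| No. of servers | Arrival rate | Avg. response time (ms), Baseline | Avg. response time (ms), RR | Avg. response time (ms), LC | Avg. waiting time (ms), Baseline | Avg. waiting time (ms), RR | Avg. waiting time (ms), LC | Avg. utilization, Baseline | Avg. utilization, RR | Avg. utilization, LC |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 250 | 3467 | | | | | | 0.906 | | |
| 2 | 500 | 1722.0 | 1913.2 | 1724.6 | 1718.4 | 1909.5 | 1721.0 | 0.906 ± 0.000 | 0.905 ± 0.002 | 0.905 ± 0.000 |
| 4 | 1000 | 853.1 | 1115.9 | 856.7 | 849.5 | 1112.3 | 853.0 | 0.906 ± 0.00 | 0.905 ± 0.006 | 0.905 ± 0.001 |
| 8 | 2000 | 419.7 | 741.4 | 421.6 | 416.0 | 737.8 | 418.0 | 0.906 ± 0.001 | 0.905 ± 0.007 | 0.906 ± 0.001 |
| 16 | 4000 | 213.0 | 608.0 | 215.6 | 209.4 | 604.4 | 212.0 | 0.906 ± 0.001 | 0.903 ± 0.017 | 0.905 ± 0.002 |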

Rassul Ayani 18 Scaling problem and machine size (cont’d)
- Conclusion: LC is much better than RR
- Is it a fair conclusion?
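To see how strongly this conclusion depends on machine size, the RR and LC response times from the table on slide 17 can be compared row by row. The sketch below only divides the reported numbers; the ratio RR / LC is an illustrative measure chosen here, not one used in the slides.

```python
# Average response times (ms) from slide 17: servers -> (RR, LC).
# The single-server row is omitted because it has no per-algorithm values.
RESPONSE_MS = {
    2: (1913.2, 1724.6),
    4: (1115.9, 856.7),
    8: (741.4, 421.6),
    16: (608.0, 215.6),
}

# Ratio > 1 means RR is slower than LC by that factor.
ratios = {servers: rr / lc for servers, (rr, lc) in RESPONSE_MS.items()}
for servers, ratio in ratios.items():
    print(f"{servers:2d} servers: RR / LC response time = {ratio:.2f}")
```

At a fixed utilization of about 0.9, the gap between RR and LC widens from roughly 1.1x with 2 servers to roughly 2.8x with 16, so how much better LC looks depends on the machine size at which the comparison is made.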

Rassul Ayani 19 Questions in Scaling
- How should the application be scaled?
  - Look at the web server
- Scaling machine size, e.g., by adding identical nodes, each bringing memory
  - Memory size is increased
  - Locality may be changed
  - Extra work (e.g., overhead for task scheduling) will be increased
- Problem size: scaling problem size may change
  - Locality
  - Working set size
  - Communication cost

Rassul Ayani 20 Why Scaling?
- Two main reasons for scaling:
  - To increase performance, e.g. increase the number of transactions per second
    - Of interest to users
  - To utilize resources (processor and memory) more efficiently
    - More interesting for managers
    - More difficult
- Scaling models:
  - Problem constrained (PC)
  - Memory constrained (MC)
  - Time constrained (TC)

Rassul Ayani 21 Problem Constrained Scaling
- Problem size is kept fixed, but the machine is scaled
- Motivation: User wants to solve the same problem, only faster.
- Some examples:
  - Video compression
  - Computer graphics
  - Message routing in a router (or switch)
- Speedup(p) = Time(1) / Time(p)

Rassul Ayani 22 Machine Constrained Scaling
- Scale problem size, but the machine (memory) remains fixed
- Motivation: It is good to find the limits of a given machine, e.g., what is the maximum problem size that can avoid memory thrashing?
- Performance measurement:
  - The previous definition of speedup, Time(1) / Time(p), is NOT valid
  - New definition:
    - Performance improvement = increase in work / increase in time
  - How to measure work?
    - Work can be defined as the number of instructions, operations, or transactions

Rassul Ayani 23 Time Constrained Scaling
- Time is kept fixed as the machine is scaled
- Motivation: User has a fixed time to use the machine (or to wait for the result, as in real-time systems), but wishes to do more work during this time
- Performance = Work / Time as usual, and time is fixed, so
  Speedup_TC(p) = Work(p) / Work(1)
- How does Work(1) affect the result?
  - Work(1) must be reasonable to avoid thrashing
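The speedup definitions used for the scaling models can be read as special cases of one ratio, "increase in work / increase in time". The sketch below illustrates this; the function name `scaled_speedup` and all work and time figures are hypothetical, and work is counted in transactions as one of the options mentioned on slide 22.

```python
def scaled_speedup(work_1, time_1, work_p, time_p):
    """Performance improvement = (increase in work) / (increase in time)."""
    if min(work_1, time_1, work_p, time_p) <= 0:
        raise ValueError("work and time must be positive")
    return (work_p / work_1) / (time_p / time_1)


# Hypothetical single-processor run: 1000 transactions in 100 s.
work_1, time_1 = 1000, 100.0

# Problem constrained: same work, less time -> reduces to Time(1) / Time(p).
pc = scaled_speedup(work_1, time_1, work_p=1000, time_p=12.5)

# Time constrained: same time, more work -> reduces to Work(p) / Work(1).
tc = scaled_speedup(work_1, time_1, work_p=6000, time_p=100.0)

# Scaled problem size where both work and time change.
mc = scaled_speedup(work_1, time_1, work_p=8000, time_p=200.0)

print(f"PC: {pc:.1f}  TC: {tc:.1f}  scaled work and time: {mc:.1f}")
```

Writing the metric this way makes it explicit which quantity is held fixed in each model, which is the information needed to interpret a reported speedup.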

Rassul Ayani 24 Evaluation using Workload
- Must consider three major factors:
  - Workload characteristics
  - Problem size
  - Machine size

Rassul Ayani 25 Impact of Workload
- Should adequately represent domains of interest
- Easy to mislead with workloads
  - Choose those with features for which the machine is good, avoid others
- Some features of interest:
  - Working set size and spatial locality
  - Fine-grained or coarse-grained tasks
  - Synchronization patterns
  - Contention and communication patterns
- Should have enough work to utilize the processors
  - If load imbalance dominates, there may not be much the machine can do

Rassul Ayani 26 Problem size
- Many critical characteristics depend on problem size
  - Communication pattern (IPC)
  - Synchronization pattern
  - Load imbalance
- Need to choose problem sizes appropriately
- Insufficient to use a single problem size

Rassul Ayani 27 Steps in Choosing Problem Sizes
1. Expert view
   - May know that users care only about a few problem sizes
2. Determine range of useful sizes
   - Below which: bad performance or unrealistic time distribution in phases
   - Above which: execution time or memory usage too large
3. Use understanding of inherent characteristics
   - Communication-to-computation ratio, load balance...

Rassul Ayani 28 Summary
- Performance improvement due to parallelism is often measured by speedup
- Problem size is important
- Scaling is often needed
- Scaling models are fundamental to proper evaluation
- Time constrained scaling is a realistic method for many applications
- Scaling only the problem (data) size can yield misleading results
- Proper scaling requires understanding the workload
