CS 704 Advanced Computer Architecture

1 CS 704 Advanced Computer Architecture
Lecture 3: Quantitative Principles … Cont’d. Design for Performance. Prof. Dr. M. Ashraf Chughtai. Welcome to the third lecture of the series on Advanced Computer Architecture. MAC/VU-Advanced Computer Architecture Lecture 3 - Performance... Cont'd

2 Today’s Topics
Recap; I/O performance; Laws and principles; Performance enhancement; Concluding: quantitative principles; Homework; Summary.

After a quick review of the previous two lectures on computer design, we will continue the discussion of the quantitative principles of computer design. The second lecture introduced processor performance, which is the key to designing computers for performance. Today we will talk about I/O performance measures, the laws and principles of performance measurement, computer performance enhancement and the concluding quantitative principles of computer design, followed by the homework.

3 Recap: Lectures 1-2
Computer architecture versus organization; Technological developments; Computer design cycle; Performance metrics: time versus throughput; Price-performance design; Benchmarks: performance evaluation.

Architecture versus organization: distinguishing between the architecture and the organization of processors, we concluded that the architecture of the members of a processor family is the same, whereas the organization of the same architecture may differ between members of the family.

Technological developments: the progression from vacuum tubes to VLSI circuits, together with dynamic memory and network technology, gave birth to four generations of computers.

Computer design cycle: the decisive factors behind rapid change in computer development have been performance enhancement, price reduction and functional improvement.

Performance metrics: the processor performance of two designs is often compared by the factor n, which states how many times lower the execution time of one machine is than that of the other, or equivalently how many times faster that machine is. Time is the key measurement of performance. However, throughput, the number of tasks completed in a specified time, cannot be ignored. A desktop user may define the performance of a machine in terms of the time it takes to execute a program, whereas a computer center manager running a large server system may define performance in terms of the number of jobs completed in a specified time.

Price-performance design: the relationship between cost and price is a complex one, and computer designers must understand it because it affects the selling of their designs. The cost is the total amount spent to produce a product; the price is the amount for which the finished good is sold, and it is influenced by the die yield and the production volume.

Growth in processor performance: the supercomputers and mainframes that prevailed in the early 1970s, costing millions of dollars and occupying very large spaces, have been replaced by very low-cost microprocessor-based desktop machines in the form of personal computers (PCs) and workstations, and by massively parallel processing machines.

Benchmarks: a benchmark is a program developed to evaluate the performance of a computer. Good products are created when there are proficient benchmarks and expert ways to summarize performance.

4 Computer I/O System
Producer-server model:
Producer: the device that generates requests to be serviced
Queue: the area where tasks accumulate, waiting to be serviced
Server: the device performing the requested service
Response time: the time a task takes from the moment it is placed in the buffer to the moment the server finishes it
[Figure: arrivals from the producer enter the queue; the server (I/O device/controller) takes tasks from the queue; completed tasks depart.]

An I/O system works on the principle of the producer-server model. The producer creates tasks to be processed and places them in a FIFO buffer, the queue. The server takes tasks from the buffer and performs them. The response time is the time a task takes from the moment it arrives in the buffer to the moment the server finishes it.
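The producer-server model above can be illustrated with a minimal sketch of a single server draining a FIFO queue. The function name `simulate_fifo` and the arrival and service times are hypothetical values chosen for illustration; they do not come from the lecture.

```python
from collections import deque


def simulate_fifo(arrivals, service_times):
    """Return the response time of each task for one server and a FIFO queue.

    arrivals: times at which the producer places each task in the queue
              (assumed sorted in ascending order).
    service_times: time the server needs to perform each task.
    """
    queue = deque(zip(arrivals, service_times))  # FIFO buffer
    server_free_at = 0.0
    response_times = []
    while queue:
        arrival, service = queue.popleft()
        # The task starts when it has arrived and the server is idle.
        start = max(arrival, server_free_at)
        finish = start + service
        server_free_at = finish
        # Response time: from entering the buffer to completion.
        response_times.append(finish - arrival)
    return response_times


# Hypothetical workload: three tasks arrive close together, one arrives later.
arrivals = [0.0, 1.0, 2.0, 6.0]
service_times = [2.0, 2.0, 2.0, 1.0]
response_times = simulate_fifo(arrivals, service_times)
print(response_times)
```

Tasks that arrive while the server is busy wait in the queue, so their response time exceeds their service time; the last task finds the server idle and its response time equals its service time.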

5 I/O Performance Parameters
Diversity: which I/O devices can connect to the CPU
Capacity: how many I/O devices can connect to the CPU
Latency: the overall response time to complete a task
Bandwidth: the number of tasks completed in a specified time (throughput)

Diversity and capacity are I/O performance measures with no counterpart in CPU performance metrics. Latency (response time) and bandwidth (throughput) apply to the I/O system as well. An I/O system is said to be in equilibrium when the rate at which I/O requests from the CPU arrive at the input of the I/O queue (buffer) equals the rate at which requests depart the queue after being fulfilled by the I/O device.

6 I/O Transaction Time
The interaction time, or transaction time, of a computer is the sum of three times:
Entry time: the time for the user to enter a command (average sec; from keyboard 4.0 sec)
System response time: the time between when the user enters the command and when the system responds
Think time: the time from the reception of the response until the user enters the next command
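As a small illustration of the sum above, the sketch below adds the three components of one transaction. The `Transaction` class is a hypothetical helper, and apart from the 4.0 sec keyboard entry time quoted on the slide, the numbers are assumed values, not measurements from the lecture.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """One user interaction, with all times in seconds."""
    entry_time: float            # time for the user to enter the command
    system_response_time: float  # time until the system responds
    think_time: float            # time until the user enters the next command

    def total(self) -> float:
        # Transaction time is the sum of the three components.
        return self.entry_time + self.system_response_time + self.think_time


# 4.0 sec keyboard entry is from the slide; the other two values are assumed.
keyboard = Transaction(entry_time=4.0, system_response_time=1.0, think_time=10.0)
print(keyboard.total())
```

Breaking the total into components shows where time goes: with these assumed values, shortening the system response time alone removes only a small share of the transaction time.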

7 Throughput versus Response Time: Performance Measures .. Cont’d
[Figure: response time (latency, in ms) plotted against percentage of maximum throughput (bandwidth), from 0% to 100%.]
The minimum response time achieves only 10% of the throughput.
The response time at 100% throughput is 7-8 times the minimum response time.
The knee of the curve is the region where a little more throughput results in a much longer response time.

8 Response time and throughput calculation
[Figure: a system with arrivals entering and departures leaving.]
If the system is in steady state, the number of tasks entering the system must equal the number of tasks leaving it.
Little’s Law: Mean number of tasks in system = Mean response time x Arrival rate

9 Little’s Law – A Little queuing theory
Assume that we observe a system for a period Time_observe and find that Number_tasks tasks are completed in this period, and that the sum of the times each task spends in the system is Time_accumulated. Then:

$$\text{Mean number of tasks in the system} = \frac{\text{Time}_{\text{accumulated}}}{\text{Time}_{\text{observe}}}$$

$$\text{Mean response time} = \frac{\text{Time}_{\text{accumulated}}}{\text{Number}_{\text{tasks}}}$$

$$\text{Arrival rate } \lambda = \frac{\text{Number}_{\text{tasks}}}{\text{Time}_{\text{observe}}}$$

The arrival rate λ is the average number of tasks arriving per unit of time, and the mean response time is the average time a completed task spent in the system. The expression for the mean number of tasks may be rewritten as:

$$\frac{\text{Time}_{\text{accumulated}}}{\text{Time}_{\text{observe}}} = \frac{\text{Time}_{\text{accumulated}}}{\text{Number}_{\text{tasks}}} \times \frac{\text{Number}_{\text{tasks}}}{\text{Time}_{\text{observe}}}$$

That is: Mean number of tasks = Mean response time x Arrival rate.
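The three ratios can be checked numerically with a short sketch. The function name `littles_law` and the task timings are hypothetical; the sketch assumes every listed task both arrives and completes inside the observation window.

```python
def littles_law(tasks, time_observe):
    """Compute the three quantities related by Little's Law.

    tasks: list of (arrival, departure) times, all inside the window.
    time_observe: length of the observation window (same time unit).
    """
    number_tasks = len(tasks)
    # Sum of the time each task spends in the system.
    time_accumulated = sum(departure - arrival for arrival, departure in tasks)

    mean_tasks_in_system = time_accumulated / time_observe
    mean_response_time = time_accumulated / number_tasks
    arrival_rate = number_tasks / time_observe
    return mean_tasks_in_system, mean_response_time, arrival_rate


# Hypothetical observation: four tasks completed in a window of 10 time units.
tasks = [(0.0, 2.0), (1.0, 4.0), (2.0, 6.0), (6.0, 7.0)]
mean_tasks, mean_response, arrival_rate = littles_law(tasks, time_observe=10.0)

print(mean_tasks, mean_response, arrival_rate)
# Little's Law: the product of the last two equals the first.
print(mean_response * arrival_rate)
```

Because all three quantities are built from the same two measurements, the identity holds for any window in which the listed tasks are fully observed; no assumption about the arrival distribution is needed.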

10 Amdahl's Law
Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.
[Figure labels: original execution time of the task; time for fraction F to be enhanced by factor S; time after fraction F is enhanced by factor S; execution time of the fraction enhanced.]

11 Amdahl's Law
Speedup due to enhancement E:

$$\text{Speedup}(E) = \frac{\text{ExTime without } E}{\text{ExTime with } E} = \frac{\text{Performance with } E}{\text{Performance without } E}$$

12 Amdahl’s Law

$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]$$

13 Amdahl’s Law

$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]$$

$$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$
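The two formulas translate directly into code. The following is a minimal sketch; the function names `new_execution_time` and `amdahl_speedup` and the input values are illustrative assumptions, not part of the lecture.

```python
def new_execution_time(ex_time_old, fraction_enhanced, speedup_enhanced):
    """Execution time after the enhancement, per Amdahl's Law."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie in [0, 1]")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    # The unaffected part keeps its time; the enhanced part shrinks by S.
    return ex_time_old * ((1.0 - fraction_enhanced)
                          + fraction_enhanced / speedup_enhanced)


def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup = ExTime_old / ExTime_new."""
    # With ex_time_old = 1.0 the new time is the relative execution time.
    return 1.0 / new_execution_time(1.0, fraction_enhanced, speedup_enhanced)


# Hypothetical case: half of a 100-second task is made twice as fast.
print(new_execution_time(100.0, 0.5, 2.0))
print(amdahl_speedup(0.5, 2.0))
```

Two boundary cases are a useful sanity check: a fraction of 0 gives an overall speedup of 1 (nothing changed), and a fraction of 1 gives an overall speedup equal to the enhancement factor itself.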

14 Amdahl’s Law
Problem: floating-point (FP) instructions are improved to run 2X faster, but only 10% of the actual instructions are FP.
ExTime_new = ?
Speedup_overall = ?

15 Amdahl’s Law
Floating-point (FP) instructions are improved to run 2X faster, but only 10% of the actual instructions are FP. Solution:

$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left( 0.9 + \frac{0.1}{2} \right) = 0.95 \times \text{ExTime}_{\text{old}}$$

$$\text{Speedup}_{\text{overall}} = \frac{1}{0.95} = 1.053$$
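The arithmetic of this example can be reproduced in a few lines. The sketch below also evaluates the limiting case of an arbitrarily fast FP unit, which follows from the same formula when the FP term goes to zero; the variable names are illustrative.

```python
fraction_fp = 0.10   # share of instructions that are floating point
speedup_fp = 2.0     # FP instructions run twice as fast

# Relative execution time: unaffected 90% plus the FP 10% at half its time.
relative_time = (1.0 - fraction_fp) + fraction_fp / speedup_fp
speedup_overall = 1.0 / relative_time

print(f"ExTime_new = {relative_time:.2f} x ExTime_old")
print(f"Speedup_overall = {speedup_overall:.3f}")

# Limit as the FP speedup grows without bound: only the 90% remains.
upper_bound = 1.0 / (1.0 - fraction_fp)
print(f"Upper bound on overall speedup = {upper_bound:.3f}")
```

Doubling FP speed yields only about a 5% overall gain, and even an infinitely fast FP unit could not push the overall speedup past roughly 1.11, because 90% of the instructions are untouched.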


