Software Architecture in Practice Theoretical Models for Performance.




1 Software Architecture in Practice Theoretical Models for Performance

2 Why consider mathematics? Performance Engineering is to a large extent rooted in
– practical experience
– knowledge of relevant patterns
– a structured and quantitative approach to problem solving
Although you don't have to be strong in mathematics, it is highly relevant as a model for understanding and explanation. In some cases calculations are needed
– typically when you size the hardware platform
In other cases the best approach is to measure actual performance.

3 Different Modelling Techniques You need to make a trade-off between cost and accuracy. IBM "Performance Engineering and Management Method" (PEMM)

4 Queuing Theory A queue is a waiting line
– like customers waiting for service at a bank or supermarket
Queuing theory is the mathematical theory of waiting lines
– it is concerned with the mathematical modeling and analysis of systems that provide service to random demands
Note: What happens inside the Service Facility is irrelevant to this theory!

5 Queuing theory is used in many domains Queuing theory was born in the early 1900s with the work of A. K. Erlang of the Copenhagen Telephone Company (KTAS)
– Erlang derived several important formulas for teletraffic engineering that today bear his name
The range of applications has grown to many domains, such as
– Manufacturing
– Air traffic control
– Logistics
– Design of theme parks
– IT system performance
– … and many more
(Agner Krarup Erlang, 1878-1929)

6 Some queue configurations… Single Server - Single Queue Single Server - Multiple Queues Multiple Servers - Single Queue

7 … and some more Multiple Servers - Multiple Queues Multiple Servers in a series

8 Which factors affect system performance?

9 Basic Concepts
Service Time      S    Duration of servicing a request
Wait Time         W    Duration a request waits for service
Response Time     R    R = S + W
Residence Time    R'   Total response time if the system is visited multiple times for one transaction
Arrival Rate      λ    Rate at which requests arrive for service
Service Rate      μ    Rate at which customers are serviced
Utilization       U    Portion of time the system services requests rather than idling
Queue Length      N    Total number of requests waiting or being serviced
Throughput        X    Rate at which requests are serviced
Servers           m    Number of parallel servers
Number of Visits  V    Number of visits to the server (V = 1 for an open model)
Note: Statistics! All are mean values.

10 Applying probability theory A queuing system has two mutually coupled processes
– the arrival process (α)
– the service process (β)
These two processes are stochastic (random) in nature
– both the number of arrivals and the arrival interval are random, which makes the arrival process stochastic
– since the service process is driven by the arrival process, it is stochastic as well

11 Typical Process Models in Queuing Theory Markov Process (M)
– This is a "memoryless" process
– The number of arrivals follows the Poisson distribution
– The interarrival times and service times follow the exponential distribution
General Process (G)
– Not characterized by any particular distribution – completely arbitrary
Deterministic Process (D)
– Predictable and characterized by various constants
Markov processes are representative of many random processes in reality
– e.g. a web site receiving requests from independent users spread all over the world
– therefore they are most often applied in practical IT work
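The memoryless arrival process can be illustrated with a short simulation: draw interarrival times from an exponential distribution with rate λ and observe that their mean approaches 1/λ. This is a minimal sketch; the function name and seed are illustrative, not from the slides.

```python
import random

def exponential_interarrivals(lam, n, seed=1):
    """Draw n interarrival times from an exponential distribution with
    rate lam, as in a Markovian (memoryless) arrival process."""
    rng = random.Random(seed)
    return [rng.expovariate(lam) for _ in range(n)]

# With lam = 10 arrivals/sec, the mean interarrival time should approach 1/10 s.
gaps = exponential_interarrivals(lam=10.0, n=100_000)
mean_gap = sum(gaps) / len(gaps)
```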

12 Formal notation to classify queues Kendall's notation: α/β/m/K/P/D
– α: Arrival process. Most usual is Markovian (memoryless)
– β: Service time distribution. Most usual is Markovian (memoryless)
– m: Number of servers in the system
– K: Capacity = number of places in the system. Default is ∞ (infinite)
– P: Calling population. Default is ∞ (infinite)
– D: Service discipline. Default is FIFO (first-in-first-out) = FCFS (first-come-first-served)
Usually only the first 3 letters are used to denote the queue: α/β/m
M/M/1 is a reasonable approximation to most single-server queues.

13 Common assumptions to make it manageable Equilibrium (or stable) condition of the system:
– No transactions are lost in the system, or mathematically: throughput (X) = arrival rate (λ)
Open model
– straight through with no feedback, V = 1
In the closed model some customers re-enter the queue, V > 1

14 The equations of queuing theory (subscript i: a single node; subscript 0: the system)
Utilization Law: U_i = X_i * S_i
– Resource i's utilization = resource i's throughput * average service time
– Example: X = 0.5 tps, S = 0.2 s => U = 0.1 = 10%
Service Demand Law: D_i = U_i / X_0
– Service demand at resource i = resource i's utilization / system throughput
Little's Law: N = X * R
– Average # in the node = throughput of the node (tps) * average time in the node (s)
– Example: X = 10 tps, R = 0.3 s => N = 3
Forced Flow Law: X_i = V_i * X_0
– Queue i's throughput = average # of visits to queue i * system throughput
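The four operational laws translate directly into code. A minimal sketch in Python, using the slide's own numbers as checks (the function names are mine):

```python
def utilization(throughput, service_time):
    """Utilization Law: U_i = X_i * S_i."""
    return throughput * service_time

def service_demand(util, system_throughput):
    """Service Demand Law: D_i = U_i / X_0."""
    return util / system_throughput

def population(throughput, response_time):
    """Little's Law: N = X * R."""
    return throughput * response_time

def node_throughput(visits, system_throughput):
    """Forced Flow Law: X_i = V_i * X_0."""
    return visits * system_throughput

u = utilization(0.5, 0.2)   # 0.5 tps * 0.2 s = 0.1, i.e. 10%
n = population(10, 0.3)     # 10 tps * 0.3 s = 3 requests in the node
```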

15 Example: The Utilization Law A network segment transmits 1000 packets/sec. Each packet has an average transmission time of 0.20 msec. What is the utilization of the LAN segment?
Utilization = throughput * service time = 1000 packets/sec * 0.00020 sec/packet = 0.20 = 20%

16 Contention in the queue When considering the response time of a system, contention for resources (queuing) will be a factor. Queuing is waiting for service, caused by:
– Resource utilization: U
– Arrival patterns
In an M/M/1 configuration the average queue waiting time and response time are
– W = S * U / (1 - U)
– R = S / (1 - U)

17 The Maths R = W + S: Response time = Wait + Service
U: Utilization
– U = 80% means 80% servicing while (1 - 80%) idling
U / (1 - U) = 'busyness', a number from 0 → ∞
– U / (1 - U) = W/S, i.e. as the service facility becomes busier, each request spends proportionally longer waiting than being served
Ex.: U = 80% => 0.8/0.2 = 4, and assume S = 1 sec
– W/S = 4 => W = 4 s and R = 5 s
Ex.: U = 10% => 0.1/0.9 = 0.11, and assume S = 1 sec
– W/S = 0.11 => W = 0.11 s and R = 1.11 s
CS@AU, Henrik Bærbak Christensen
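The two worked examples can be reproduced with a small helper. A sketch; the guard against U ≥ 1 (an unstable queue) is my addition:

```python
def mm1_wait_and_response(service_time, util):
    """M/M/1 mean wait and response time: W = S * U / (1 - U), R = W + S."""
    if not 0.0 <= util < 1.0:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    wait = service_time * util / (1.0 - util)
    return wait, wait + service_time

w80, r80 = mm1_wait_and_response(1.0, 0.80)  # W = 4 s, R = 5 s
w10, r10 = mm1_wait_and_response(1.0, 0.10)  # W ~ 0.11 s, R ~ 1.11 s
```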

18 All systems have a limit All systems have a limit beyond which performance degrades rapidly. Around 70% utilization the response time already exceeds 3 times the service time!
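The rapid degradation is easy to see by tabulating the stretch factor R/S = 1/(1 - U) from the M/M/1 formula above (a quick sketch; the name "stretch factor" is a common term, not from the slides):

```python
def stretch_factor(util):
    """R/S = 1/(1 - U): how many times longer than the bare service
    time a request takes on average in an M/M/1 queue."""
    return 1.0 / (1.0 - util)

# The knee of the curve: beyond ~70% utilization the factor takes off.
curve = {u: round(stretch_factor(u), 2) for u in (0.10, 0.50, 0.70, 0.90)}
# e.g. at U = 0.70 the response time is ~3.33x the service time
```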

19 A Central Result! This explains
– the weird queues on the Autobahn in summertime
– why you wait too long for an email reply from me
Around 70% utilization the response time exceeds 3 times the service time!

20 Example of M/M/1 queue calculation Requests arrive at the rate of λ per second. Average service time is S.
Server utilization: U = λ * S
Average queuing time (M/M/1): W = S * U / (1 - U)
Average response time: R = S + W
Average population in the system (Little's Law): N = λ * R
– λ = 10 /s
– S = 0.08 s
– U = 10 * 0.08 = 0.8 (i.e. 80%)
– W = 0.08 * (0.8 / (1 - 0.8)) = 0.08 * 4 = 0.32 s
– R = 0.08 + 0.32 = 0.40 s
– N = 10 * 0.40 = 4.0
In practical applications of the theory you will be summing
– utilization of each resource, based on the total workload and work mix
– end-to-end response times, based on multiple steps in the end-to-end transaction path
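The whole calculation chain fits in one function. A sketch for the open model (V = 1), reproducing the slide's numbers:

```python
def mm1_metrics(arrival_rate, service_time):
    """All four M/M/1 mean-value metrics for an open model (V = 1)."""
    u = arrival_rate * service_time          # Utilization Law
    if u >= 1.0:
        raise ValueError("unstable: utilization >= 1")
    w = service_time * u / (1.0 - u)         # mean queuing time
    r = service_time + w                     # mean response time
    n = arrival_rate * r                     # Little's Law
    return {"U": u, "W": w, "R": r, "N": n}

m = mm1_metrics(arrival_rate=10, service_time=0.08)
# U = 0.8, W = 0.32 s, R = 0.40 s, N = 4.0 (matching the slide)
```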

21 Balanced System and Bottleneck Analysis A queuing system is balanced if all nodes have the same utilization: U_1 = U_2 = U_3
A software system achieves its best possible, scalable performance if it is a balanced system
– Therefore you should generally optimize the node with the highest utilization (this is typically where the bottleneck is)
– And generally stop when the system is balanced
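The bottleneck rule above can be sketched in a few lines; the function names and the balance tolerance are my illustrative choices:

```python
def bottleneck(utils):
    """Index of the node with the highest utilization: the place to
    optimize first, since that is typically where the bottleneck is."""
    return max(range(len(utils)), key=lambda i: utils[i])

def is_balanced(utils, tol=0.01):
    """A system is balanced when all node utilizations are (nearly) equal."""
    return max(utils) - min(utils) <= tol

nodes = [0.35, 0.90, 0.55]
worst = bottleneck(nodes)   # node index 1, at 90% utilization
```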

22 Amdahl's Law Amdahl's law states that the performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used
– Also known as speedup, which can be defined as the maximum expected improvement to an overall system when only part of the system is improved

23 An Example of Amdahl's Law Consider an enhancement to the processor of a web server
– The new CPU is 20 times faster on search queries than the old processor
– The old processor is busy with search queries 70% of the time
What is the speedup gained by integrating the enhanced CPU?
Speedup = 1 / ((1 - 0.70) + 0.70 / 20) = 1 / 0.335 ≈ 3
A CPU that is 20 times faster gives an overall speedup of only 3.
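Amdahl's law is a one-liner in code. A sketch reproducing the slide's example:

```python
def amdahl_speedup(fraction, local_speedup):
    """Amdahl's Law: overall speedup when only a fraction of the
    work is sped up by local_speedup."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

# The slide's numbers: 70% of the time, sped up 20x.
s = amdahl_speedup(0.70, 20)   # ~2.99: only about 3x overall
```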

24 Amdahl's Law and limits of scalability The speedup of a program using multiple processors in parallel computing is limited by the sequential fraction of the program. For example, if 95% of the program can be parallelized, the theoretical maximum speedup using parallel computing is 20x, as shown in the diagram, no matter how many processors are used.
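The 20x ceiling falls directly out of the formula: as the processor count grows, the speedup approaches 1/(1 - p). A sketch under the slide's 95% assumption:

```python
def parallel_speedup(p, n):
    """Amdahl's Law for parallelization: p is the parallelizable
    fraction, n the number of processors."""
    return 1.0 / ((1.0 - p) + p / n)

s16 = parallel_speedup(0.95, 16)   # ~9.1x on 16 processors
ceiling = 1.0 / (1.0 - 0.95)       # ~20x, however many processors are added
```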

25 Amdahl was later beaten by Gustafson
– Amdahl: fixed problem size, variable run time
– Gustafson: fixed run time, variable problem size
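The contrast between the two laws can be sketched side by side. Gustafson's scaled speedup is S = (1 - p) + p * n: because the problem grows with the machine, the speedup keeps growing with the processor count instead of hitting Amdahl's ceiling. The example numbers are illustrative:

```python
def amdahl(p, n):
    """Fixed problem size: speedup capped by the serial fraction, 1/(1 - p)."""
    return 1.0 / ((1.0 - p) + p / n)

def gustafson(p, n):
    """Fixed run time, growing problem: scaled speedup S = (1 - p) + p * n."""
    return (1.0 - p) + p * n

a = amdahl(0.95, 100)      # ~16.8x, approaching the 20x ceiling
g = gustafson(0.95, 100)   # 95.05x, and it keeps growing with n
```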

