# Capacity Setting and Queuing Theory


Capacity Setting and Queuing Theory
BAHC 510 Lecture 6 US Election Nov 6, 2012

Capacity and Resources
A key lever for improving patient flow. How do we measure capacity? What is the capacity of a 20-seat restaurant? A 16-bed ward? Capacity is a RATE: customers/hour, patients/day. We can view a 16-bed ward as a queuing system with 16 servers. What is the capacity of a bed? Does this analogy apply to the restaurant? A system is composed of resources with capacities. Often we use the expressions “resource” and “capacity” interchangeably (hopefully without confusion).

How Much Capacity is Needed? or How Many Resources are Needed?
Surge capacity Base capacity

Capacity tradeoffs when demand is variable
Too much capacity or too many resources = idleness. Not enough capacity = waits. The resource manager must trade these off, taking into account system objectives and available resources. Should we set capacity equal to demand? What does this mean? This is called a balanced system. It works perfectly when there is no variation in the system. It works terribly when there is variation! Why? Once behind, you can never catch up. Queuing theory quantifies these tradeoffs in terms of performance measures.

Queuing Models
(Mathematical) queuing models help us set capacity (or determine the number of resources needed) to meet service level targets, average wait time targets, and average queue length targets. Queuing models provide a more precise alternative to simulation. They provide insights into how to plan, operate and manage a system. Where are there queues in the health care system?

A single server queuing system
Diagram: a buffer feeding one server. A queue forms in a buffer. Servers may be people or physical space. The buffer may have a finite or unlimited capacity. The most basic models assume “customers” are of one type and have common arrival and service rates.

A multiple (N) server queuing system
Diagram: one buffer feeding N servers.

Several parallel single server queues
Diagram: three buffers, each feeding its own server.

Parallel Queues vs. Multiple server Queues
Provide examples of multiple server queues (MSQs). Provide examples of parallel queues (PQs). In what situations would each of these queuing systems be most appropriate? Why?

Networks of queues
Most health care systems are interconnected networks of queues and servers with multiple waiting points and heterogeneous customers. Provide some examples. Often we model these complex systems with simulation, but in some cases we can use formulae to get results.

Queuing Theory background
Developed by Erlang in the early 1900s to analyze telephone systems: how many lines are needed to ensure that a caller who tries to dial obtains a “line”? Depending on the system configuration, an arriving customer may either be blocked or enter a queue and wait for service. Queuing models are now applied to analyze internet traffic, telecommunications systems, call centers, airport security lines, banks and restaurants, rail networks, etc.

Queues and Variability
There are two components of a queuing system subject to variability: the inter-arrival times of “jobs” and the service times or LOS (length of stay). Why are these variable? We describe this variability by the mean, the standard deviation, and the probability distribution. Usually the normal distribution doesn’t fit well. Often an exponential distribution fits well. If we know its rate or mean, we know everything about it.

The exponential distribution
$P(T \le t) = 1 - e^{-\lambda t}$. The quantity $\lambda$ is the rate. The mean and standard deviation of the exponential distribution are both $1/\lambda$. The median is $\ln(2)/\lambda = 0.693/\lambda$. Exponential distributions don’t allow negative times and have a small probability of long service times. Example: Patients arrive at rate 4 per hour. The mean inter-arrival time is 15 minutes. The median inter-arrival time is $0.693/4 = 0.173$ hours, or about 10.4 minutes. What is the probability that the time between two arrivals is less than 10 minutes (1/6 of an hour)? $P(T \le 1/6) = 1 - e^{-4 \cdot (1/6)} = 1 - e^{-2/3} = 0.487$. The exponential distribution underlies queuing theory. A queue with exponential service times, exponential inter-arrival times, N (FCFS) servers and an infinite waiting room is called an M/M/N queue.
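The arithmetic in the example above can be checked with a few lines of Python. This is a minimal sketch; the function name `exp_cdf` and the variable names are illustrative choices, not part of the lecture.

```python
import math

def exp_cdf(rate, t):
    """P(T <= t) for an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * t)

rate = 4.0                                   # arrivals per hour (from the example)
mean_min = 60.0 / rate                       # mean inter-arrival time, in minutes
median_min = 60.0 * math.log(2) / rate       # median = ln(2) / rate, in minutes
p_within_10 = exp_cdf(rate, 10.0 / 60.0)     # 10 minutes expressed in hours

print(f"mean = {mean_min:.1f} min, median = {median_min:.1f} min")
print(f"P(T <= 10 min) = {p_within_10:.3f}")
```

Note that the time argument must be in the same unit as the rate, which is why 10 minutes is passed as 1/6 of an hour.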

Capacity management and queuing systems
Capacity management involves determining the number of servers to use and the size of the waiting rooms. Examples: How many long term care beds are needed? How many porters are needed? How many nurses are needed? How many cubicles are needed in an ED? Some healthcare systems have no buffers; all the waiting is done outside of the system or in upstream resources, e.g. ALC cases waiting for LTC beds.

Analyzing a queuing system
Inputs: arrival rate, service rate, number of servers, buffer size, population size. Queue analyzer: Armann Ingolfsson’s Queuing Calculator. Outputs: capacity utilization, wait time in queue, queue length, blocking probability, service levels.

Some Fundamental Quantities
Inputs: the arrival rate per hour, $\lambda$; the service rate per hour, $\mu$; the number of servers, $s$ (can be 1 or more); the buffer size, $K$ (can be finite or infinite). Derived quantities: The offered load is $R = \lambda/\mu$. Example: $\lambda = 100$ calls/hr and $\mu = 5$ calls/hr. Then the offered load is 20 (this quantity is unitless). This means the system needs at least 20 servers to meet its workload. Another important quantity is the traffic intensity $\rho = R/s$. It gives the offered load per server. In the example, if there are 25 servers ($s = 25$), $\rho = 20/25 = 0.8$, so servers should be busy 80% of the time. If the traffic intensity exceeds 1, the system will be unstable.
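As a small illustration of these derived quantities, the call-center example can be computed as follows. The function names are hypothetical, and the "smallest stable number of servers" line assumes that stability requires a traffic intensity strictly below 1.

```python
import math

def offered_load(arrival_rate, service_rate):
    """Offered load R = lambda / mu (unitless)."""
    return arrival_rate / service_rate

def traffic_intensity(arrival_rate, service_rate, servers):
    """Traffic intensity rho = R / s, the offered load per server."""
    return offered_load(arrival_rate, service_rate) / servers

lam, mu, s = 100.0, 5.0, 25          # calls/hr, calls/hr per server, servers
R = offered_load(lam, mu)
rho = traffic_intensity(lam, mu, s)
min_servers = math.floor(R) + 1      # smallest s with rho strictly below 1

print(f"offered load R = {R}, traffic intensity rho = {rho}")
print(f"smallest number of servers with rho < 1: {min_servers}")
```

With exactly 20 servers the traffic intensity equals 1, which is the balanced (and unstable) case discussed earlier, so the sketch reports 21.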

Examples of Queuing Systems
Walk-in clinic with 6 seats and 2 physicians: s = 2, K = 6. Long term care facility with 100 beds: s = 100, K = ? A finite capacity loss system, a model for an (old-fashioned) phone system: s servers, K = 0. When all servers are busy, the system is blocked and customers are lost. A congestion system: K = ∞. When all servers are busy, customers wait.

Performance Measures
Capacity utilization. Probability the system is empty. Average waiting time (in queue): Wq. Flow time, the average total time in system: W. Average queue length: Lq. Average number of jobs in the system: L. Probability that a customer waits for service. Probability that there are k customers in the system. Service level: the probability that a customer waits less than T time units for service.

An Example - M/M/1 Queue
Assume exponential inter-arrival time and service time distributions, infinite capacity and 1 server (s = 1). Calculations below are based on analytical expressions available in most operations research texts. Customers arrive at rate $\lambda = 4$ per hour; the mean service time is 10 minutes, so the service rate is $\mu = 6$ per hour. System utilization = probability the server is occupied = $\rho = \lambda/\mu = 2/3$. Safety capacity = service rate − arrival rate = $\mu - \lambda = 2$. P(system is empty) = $1 - \rho = 1/3$. P(k in the system) = $\rho^k (1 - \rho) = (1/3)(2/3)^k$. Average time in system = 1/safety capacity = 1/2 hour. Average time in queue = average time in system − average service time = 1/2 − 1/6 = 1/3 hour. Average queue length = $\rho^2/(1 - \rho) = 4/3$. Suppose the arrival rate increases to 5.9 customers per hour. Then $\rho = 5.9/6 = 0.9833$, so P(system is empty) = 0.0167, the average time in system is 10 hours, and the average number of customers in the system, $\rho/(1 - \rho)$, is about 59 customers!
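The M/M/1 example can be reproduced with the closed-form expressions listed on this slide. The following is a minimal sketch; `mm1_metrics` and its dictionary keys are illustrative names only.

```python
def mm1_metrics(lam, mu):
    """Closed-form M/M/1 performance measures (rates per hour, times in hours)."""
    if lam >= mu:
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = lam / mu
    W = 1.0 / (mu - lam)             # average time in system = 1 / safety capacity
    return {
        "rho": rho,                  # utilization
        "p_empty": 1.0 - rho,        # probability the system is empty
        "W": W,
        "Wq": W - 1.0 / mu,          # time in queue = time in system - service time
        "L": rho / (1.0 - rho),      # average number in system
        "Lq": rho**2 / (1.0 - rho),  # average queue length
    }

for lam in (4.0, 5.9):
    m = mm1_metrics(lam, 6.0)
    print(f"lambda={lam}: rho={m['rho']:.4f}, W={m['W']:.2f} h, "
          f"Wq={m['Wq']:.2f} h, L={m['L']:.2f}, Lq={m['Lq']:.2f}")
```

Running it for both arrival rates shows how sharply the time in system grows as the safety capacity shrinks from 2 to 0.1 customers per hour.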

An M/M/s queue is the same as an M/M/1 queue except that there may be more than one server. In this model, there is a single buffer and s servers in the resource pool. Jobs are processed on a FIFO basis. When there are more than s jobs in the system, the buffer is occupied and waiting for service occurs. The Erlang-C formula gives the probability an arriving job has to wait. An M/M/s/K queue is an M/M/s queue with a finite buffer of size K. There are at most K + s customers in the system. When the buffer is filled, the system is blocked and customers are lost. QUEUECALC computes performance measures for M/M/s queues, M/M/s queues with a finite buffer size, M/M/s queues with a finite population size, and M/G/1 queues. In addition, for a fixed T and a specified s, it computes the percentage of jobs waiting less than T time units. It also computes the number of servers needed to achieve a specified service level, e.g. how many servers are needed so that 90% of jobs wait no more than 10 minutes for service?

Problem 1
Patients arrive at rate 5/hr. They require on average 1 hour of treatment. What is the offered load? How many service providers do we need to ensure that the average wait time is 20 minutes or less? Assume a large waiting room. The offered load is 5/1 = 5, so we require more than 5 servers to ensure a stable system. Run “The Waiting Line Analyzer” to find:
For 6 service providers: the average number in queue is 2.94 and the average wait time in queue is 0.59 hours or 35.3 minutes. The probability a customer waits, which equals the probability all 6 are occupied, is 58.75%. The capacity utilization is 83%.
For 7 service providers: the average number in queue is 0.81 and the average wait is 0.162 hours or 9.72 minutes. The probability a customer waits, which equals the probability all 7 are occupied, is 32.41%. The capacity utilization is 71%.
Observe the trade-off between capacity utilization and service!
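For readers without access to the analyzer, Problem 1 can be approximated with the standard Erlang-C formula for an M/M/s queue. This is a sketch under that assumption and is not the analyzer's own code; the function names are hypothetical.

```python
import math

def erlang_c(s, R):
    """Probability an arrival must wait in an M/M/s queue with offered load R < s."""
    if R >= s:
        raise ValueError("unstable: offered load must be below the number of servers")
    head = sum(R**k / math.factorial(k) for k in range(s))
    tail = R**s / math.factorial(s) / (1.0 - R / s)
    return tail / (head + tail)

def mms_metrics(lam, mu, s):
    """Return (P(wait), Lq, Wq in hours, utilization) for an M/M/s queue."""
    R = lam / mu
    p_wait = erlang_c(s, R)
    Wq = p_wait / (s * mu - lam)     # mean wait in queue
    Lq = lam * Wq                    # Little's Law
    return p_wait, Lq, Wq, R / s

for s in (6, 7):
    p_wait, Lq, Wq, util = mms_metrics(5.0, 1.0, s)
    print(f"s={s}: P(wait)={p_wait:.4f}, Lq={Lq:.2f}, "
          f"Wq={Wq * 60:.2f} min, utilization={util:.0%}")
```

Six providers leave the average wait above the 20 minute target, while seven bring it below, at the cost of lower utilization.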

More on Problem 1 Service Levels
Suppose our target service times are 6 and 10 minutes. Fill in the following table for 6, 7, 8, 9 and 10 servers: P(Wq ≤ 6), P(Wq ≤ 10), and capacity utilization.

More on Problem 1

| Servers | P(Wq ≤ 6) | P(Wq ≤ 10) | Capacity Utilization |
|---|---|---|---|
| 6 | .47 | .50 | 83% |
| 7 | .73 | .77 | 71% |
| 8 | .88 | .90 | 63% |
| 9 | .95 | .96 | 56% |
| 10 | .98 | .98 | 50% |
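The service levels in this table can be approximated from the M/M/s waiting time distribution, P(Wq ≤ t) = 1 − C·exp(−(sµ − λ)t), where C is the Erlang-C probability of waiting. The sketch below assumes that formula; `service_level` and `servers_needed` are hypothetical helper names.

```python
import math

def erlang_c(s, R):
    """Probability an arrival must wait in an M/M/s queue with offered load R < s."""
    head = sum(R**k / math.factorial(k) for k in range(s))
    tail = R**s / math.factorial(s) / (1.0 - R / s)
    return tail / (head + tail)

def service_level(lam, mu, s, t):
    """P(Wq <= t) for an M/M/s queue; t is in the same time unit as the rates."""
    return 1.0 - erlang_c(s, lam / mu) * math.exp(-(s * mu - lam) * t)

def servers_needed(lam, mu, t, target):
    """Smallest stable number of servers with P(Wq <= t) >= target."""
    s = math.floor(lam / mu) + 1
    while service_level(lam, mu, s, t) < target:
        s += 1
    return s

lam, mu = 5.0, 1.0                   # Problem 1: 5 patients/hr, 1 hr of treatment
for s in range(6, 11):
    print(s, f"{service_level(lam, mu, s, 6 / 60):.2f}",
          f"{service_level(lam, mu, s, 10 / 60):.2f}", f"{lam / (s * mu):.1%}")
print("servers for 90% within 10 min:", servers_needed(lam, mu, 10 / 60, 0.90))
```

With 8 servers the 10 minute service level is about 0.899, which rounds to .90 in the table but falls just short of a strict 90% target, so the strict search returns 9.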

Still more on Problem 1
Let’s explore the relationship between utilization (traffic intensity), queue lengths and wait times. Assume 5 servers and increase the arrival rate toward 5.

| Arrival Rate | Utilization | Wait time in queue (hrs) | Queue Length |
|---|---|---|---|
| 4 | 80% | 0.55 | 2.22 |
| 4.5 | 90% | 1.52 | 6.86 |
| 4.9 | 98% | 9.50 | 46.56 |
| 4.99 | 99.8% | 99.50 | 496.5 |

Conclusion: as traffic intensity increases to 1, queue lengths and wait times increase rapidly.
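The sweep over arrival rates can be sketched with the same Erlang-C approach, assuming a service rate of 1 patient per hour per server as in Problem 1. The function name `mms_wait` is illustrative.

```python
import math

def mms_wait(lam, mu, s):
    """Return (utilization, Wq in hours, Lq) for a stable M/M/s queue via Erlang-C."""
    R = lam / mu
    if R >= s:
        raise ValueError("unstable: offered load must be below the number of servers")
    head = sum(R**k / math.factorial(k) for k in range(s))
    tail = R**s / math.factorial(s) / (1.0 - R / s)
    p_wait = tail / (head + tail)
    Wq = p_wait / (s * mu - lam)
    return R / s, Wq, lam * Wq       # Lq follows from Little's Law

# 5 servers, service rate 1/hr (assumed from Problem 1), arrival rate approaching 5
rows = [(lam, *mms_wait(lam, 1.0, 5)) for lam in (4.0, 4.5, 4.9, 4.99)]
for lam, util, Wq, Lq in rows:
    print(f"{lam:5.2f}  {util:6.1%}  {Wq:7.2f}  {Lq:8.2f}")
```

Each step toward an arrival rate of 5 multiplies the wait rather than adding to it, which is the nonlinearity the table is meant to show.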

Problem 2: A small walk-in clinic
A walk-in clinic has 3 doctors. Average time spent with a patient is 12 minutes (5/hr). Patients arrive at a rate of 12 per hour. How many chairs should we have in the waiting room so only 5% of patients are turned away? Solution: Assume first an infinite waiting room. This shows the average queue length is 2.59. Now try a model with a finite waiting room. With 3 chairs, 9% balk and 52% wait. With 4 chairs, 7% balk and 55% wait. With 5 chairs, 5% balk and 58% wait. In this last case the average waiting time is 0.038 hours. This seems too fast.
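The fraction of patients turned away can be sketched from the steady-state distribution of an M/M/s/K queue, where an arrival is lost when all s servers and all K chairs are occupied. This is a minimal sketch with hypothetical function names; it computes only the blocking fraction, not the waiting figures quoted on the slide.

```python
def mmsk_state_probs(lam, mu, s, K):
    """Steady-state probabilities for n = 0 .. s+K customers in an M/M/s/K queue."""
    R = lam / mu
    weights = [1.0]
    for n in range(1, s + K + 1):
        # Birth-death balance: the departure rate in state n is min(n, s) * mu
        weights.append(weights[-1] * R / min(n, s))
    total = sum(weights)
    return [w / total for w in weights]

def blocking_probability(lam, mu, s, K):
    """Probability an arrival finds the system full and is turned away."""
    return mmsk_state_probs(lam, mu, s, K)[-1]

for chairs in (3, 4, 5):
    b = blocking_probability(12.0, 5.0, 3, chairs)
    print(f"{chairs} chairs: {b:.1%} turned away")
```

The computed fractions round to the 9%, 7% and 5% quoted for 3, 4 and 5 chairs.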

Problem 3: Blocking in a Hospital Ward
Bed requests arrive at the rate of 3 per day. Patients remain in beds for about 5 days. How many beds are required so that the probability a patient is not admitted on arrival is less than 10%? Service rate = 1/5 = 0.2 patients per day. Offered load = 3/0.2 = 15, so we need at least 15 beds. Model this as a finite capacity queuing system with no waiting room; we want the blocking probability to be less than 0.1. This probability is computed using the Erlang-B formula. With 15 beds, 18% are blocked. With 16 beds, 14% are blocked. With 17 beds, 11% are blocked. With 18 beds, 9% are blocked. In this case (s = 18) the capacity utilization is 76%. Graph gives occupancy distribution or census.
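The blocking percentages can be reproduced with the usual recursive form of the Erlang-B formula, B(0) = 1 and B(n) = R·B(n−1) / (n + R·B(n−1)). The sketch below assumes that recursion; `erlang_b` and `beds_needed` are illustrative names.

```python
def erlang_b(s, R):
    """Erlang-B blocking probability for s servers, offered load R, no waiting room."""
    b = 1.0                          # with zero servers every arrival is blocked
    for n in range(1, s + 1):
        b = R * b / (n + R * b)
    return b

def beds_needed(R, max_blocking):
    """Smallest number of beds with blocking probability below max_blocking."""
    s = 0
    while erlang_b(s, R) >= max_blocking:
        s += 1
    return s

R = 3.0 / 0.2                        # offered load: 3 requests/day, 0.2 discharges/day per bed
for beds in (15, 16, 17, 18):
    b = erlang_b(beds, R)
    util = R * (1.0 - b) / beds      # carried load per bed
    print(f"{beds} beds: {b:.1%} blocked, utilization {util:.0%}")
print("beds needed for under 10% blocking:", beds_needed(R, 0.10))
```

The recursion avoids large factorials, so it stays numerically stable for wards with hundreds of beds.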

How can queuing theory improve porter scheduling?
Assumption: Porters handle 3.3 trips/hour

Implications of queuing formulas
As the safety capacity vanishes, or equivalently, as the traffic intensity increases to 1, waiting time increases without bound and queue lengths become arbitrarily long! In the presence of variability in inter-arrival times and service times, a balanced system will be highly unstable. These formulas enable the manager to derive performance measures on the basis of a few basic descriptors of the queuing system: the arrival rate, the service rate, and the number of servers. When the system has a finite buffer, the percentage of jobs that are blocked can also be computed.

Summary
When the manager knows the arrival rate and service rate, he/she can compute the average number of jobs in the queue, the average time spent in the queue, the probability an arriving patient has to wait, and the system utilization. This can be done without simulation! This information can be used to set capacity or explore the sensitivity of recommendations to assumptions or changes. Thus queuing theory provides a powerful tool to manage capacity.

Don’t Match Capacity with Demand
If the service rate is close to the arrival rate, then there will be long wait times. Recall that for the M/M/1 queue the average queue length = $\rho^2/(1 - \rho)$. For traffic intensity near 1, the queue length will be very large. Matching capacity with demand works only when there is no variability: all procedures take exactly the same time, and patients are scheduled to appear at the completion of the preceding procedure and arrive at that time. Safety capacity = service rate − arrival rate.

Performance measure formulas (M/M/1 queue – no limit on queue size)
System utilization = P(server is occupied) = $\rho$. If traffic intensity increases, the likelihood the server is occupied increases. This occurs if the arrival rate increases or the service rate decreases. P(system is empty) = $1 - \rho$. P(k in system) = $\rho^k (1 - \rho)$. Average time in system = 1/safety capacity. Average time in queue = average time in system − average service time. If safety capacity decreases, time in queue increases! Average number of jobs in the system (including being served) = $\rho/(1 - \rho)$. Average queue length = $\rho^2/(1 - \rho)$. If we know safety capacity, service time and traffic intensity, we can compute all system properties. Little’s Law holds too: number in queue = arrival rate × waiting time in queue.
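As a worked check that these M/M/1 formulas agree with Little’s Law, write $\lambda$ for the arrival rate, $\mu$ for the service rate and $\rho = \lambda/\mu$, with $W$ and $W_q$ the average times in system and in queue:

$$
L = \lambda W = \frac{\lambda}{\mu - \lambda} = \frac{\rho}{1 - \rho},
\qquad
L_q = \lambda W_q = \lambda\left(\frac{1}{\mu - \lambda} - \frac{1}{\mu}\right) = \frac{\lambda^2}{\mu(\mu - \lambda)} = \frac{\rho^2}{1 - \rho}.
$$

With $\lambda = 4$ and $\mu = 6$ this gives $L = 2$ and $L_q = 16/12 = 4/3$, matching the earlier M/M/1 example.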

Idle Capacity And Wait Time Targets
Theoretical, based on the queueing formula.