IFIP Performance 2011 Best Paper


1 IFIP Performance 2011 Best Paper
Join-Idle-Queue: A Novel Load Balancing Algorithm for Dynamically Scalable Web Services
Yi Lu, Qiaomin Xie, Gabriel Kliot, Alan Geller, James R. Larus, Albert Greenberg
Presented by Amir Nahir

2 Agenda
Queuing terminology and background: M/M/N, Processor Sharing
Motivation
The algorithm
Some analysis
Results
Caveat: where has time gone?

3 M/M/N
Single shared queue. Jobs wait in the queue. Whenever a server completes a job, it gets the next one from the queue.

4 M/M/N
Pros: Each job goes to the next server to become available.
Cons: Centralized (single point of failure, bottleneck). Hidden overhead: the time it takes to access the queue to get the job.

5 Service Disciplines
FIFO (a.k.a. FCFS)
Processor sharing (PS)

6 FIFO vs. Processor Sharing
Analysis is very similar, but results are not quite the same. E.g., assume three jobs arrive at the system at time 0; the two disciplines give different averages: Avg. time in system = 2 vs. Avg. time in system = 3. PS is currently seen as the “realistic” model.
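The two averages can be reproduced with a small calculation. This is only a sketch under assumptions the slide does not state: a single server working at unit speed, and three jobs of length 1 that all arrive at time 0. The function names are hypothetical. Under these assumptions FIFO gives an average of 2 and PS gives an average of 3.

```python
def fifo_mean_time(sizes):
    """Mean time in system when jobs (all present at time 0) run one after another."""
    clock = 0.0
    total = 0.0
    for size in sizes:
        clock += size          # this job finishes after all earlier ones
        total += clock
    return total / len(sizes)


def ps_mean_time(sizes):
    """Mean time in system when the server is shared equally among unfinished jobs."""
    ordered = sorted(sizes)    # under PS, shorter jobs finish first
    n = len(ordered)
    clock = 0.0
    served = 0.0               # work each unfinished job has received so far
    total = 0.0
    for i, size in enumerate(ordered):
        # (n - i) jobs share the server, so each unit of work takes (n - i) time units
        clock += (size - served) * (n - i)
        served = size
        total += clock
    return total / n


jobs = [1, 1, 1]               # assumed job lengths
print(fifo_mean_time(jobs), ps_mean_time(jobs))
```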

7 Web-Services: The User’s Experience
No one really tries to model the whole process as a single problem. Common components (unchanged by the research) are often neglected.
[Figure: Scheduler, Server]

8 The Main Motivation for the Paper
Reduce delays on the job’s critical path.
[Figure: Schedulers and Servers]

9 The Join-Idle-Queue Algorithm: System Structure
Two-layer system: dispatchers (front-ends) and processors (back-ends, servers). The ratio of servers to dispatchers is denoted by r. No assumptions are made regarding the processor discipline (can support PS, FIFO). Each dispatcher has an I-queue. The I-queue holds servers (not jobs).

10 The Join-Idle-Queue Algorithm: Dispatcher Behavior
Upon receiving a job from a user:
If there are servers in the I-queue, dequeue the first server and send the job to it.
Otherwise, send the job to a random server. This deteriorates system performance.
This is termed primary load balancing.
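The dispatcher rule can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the class and method names are hypothetical, and servers are represented by plain identifiers.

```python
import random
from collections import deque


class Dispatcher:
    """Hypothetical front-end that keeps an I-queue of idle servers."""

    def __init__(self, servers, rng=None):
        self.servers = list(servers)      # all back-end servers known to this dispatcher
        self.i_queue = deque()            # holds servers, not jobs
        self.rng = rng or random.Random()

    def register_idle(self, server):
        """Called by a server that has become idle and chose this dispatcher."""
        self.i_queue.append(server)

    def dispatch(self, job):
        """Return the server that should receive the job (primary load balancing)."""
        if self.i_queue:
            return self.i_queue.popleft()        # first idle server in the I-queue
        return self.rng.choice(self.servers)     # fallback: uniformly random server


d = Dispatcher(servers=[0, 1, 2, 3])
d.register_idle(2)
print(d.dispatch("job-a"))   # taken from the I-queue
print(d.dispatch("job-b"))   # I-queue is empty, so a random server is picked
```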

11 The Join-Idle-Queue Algorithm: Server Behavior
Upon completing all jobs:
Choose a dispatcher. Two techniques are considered: Random and SQ(d).
Register in its I-queue.
This is termed secondary load balancing.
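The two ways an idle server might choose a dispatcher can be sketched as follows. This assumes SQ(d) means "sample d dispatchers and join the one with the shortest I-queue", and that sampling is done without replacement; the function name and the list-of-lengths representation are hypothetical.

```python
import random


def choose_dispatcher(i_queue_lengths, d=1, rng=random):
    """Return the index of the dispatcher an idle server registers with.

    d == 1 : Random (one dispatcher picked uniformly at random)
    d >= 2 : SQ(d) (sample d dispatchers, pick the shortest I-queue among them)
    """
    candidates = rng.sample(range(len(i_queue_lengths)), d)
    return min(candidates, key=lambda i: i_queue_lengths[i])


lengths = [3, 0, 5, 1]                     # current I-queue length at each dispatcher
print(choose_dispatcher(lengths, d=1))     # Random
print(choose_dispatcher(lengths, d=2))     # SQ(2)
```

With d equal to the number of dispatchers, the rule always picks a globally shortest I-queue, which is the costliest variant in terms of messages.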

12 The Join-Idle-Queue Algorithm at Work
A job completes but the server's queue is not empty: nothing happens.
A job arrives and finds a server in the I-queue.
A job completes and the server registers in an I-queue.
A job arrives to find an empty I-queue: a random server is chosen.

13 The Join-Idle-Queue Algorithm: Corner Case 1
Server 2 is busy processing a job while being registered as “idle” in one of the I-queues, because the job arrived through random assignment.

14 The Join-Idle-Queue Algorithm: Corner Case 2
Server 2 is reported as “idle” in more than one dispatcher. A server is registered as “idle” in one of the I-queues but is in fact working on a job, because the job arrived through random assignment. The server then completes this job and registers with a different dispatcher. The authors suggest overcoming this by not “re-registering” until a job arrives from the dispatcher with which the server last registered.
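The suggested fix can be sketched as a small guard on the server side. This is an assumption-laden illustration, not the authors' code: the class, its fields, and the callback are hypothetical. It relies on the observation that while a server is listed in a dispatcher's I-queue, that I-queue is non-empty, so a job from that dispatcher can only reach the server by dequeuing its entry.

```python
class Server:
    """Hypothetical back-end that avoids appearing in two I-queues at once."""

    def __init__(self, server_id):
        self.server_id = server_id
        self.pending = 0            # jobs queued or in service
        self.registered_at = None   # dispatcher whose I-queue currently lists us

    def on_job_arrival(self, from_dispatcher):
        self.pending += 1
        # A job from the dispatcher we registered with means our entry was dequeued.
        if from_dispatcher == self.registered_at:
            self.registered_at = None

    def on_job_completion(self, choose_dispatcher):
        """Return the dispatcher to register with, or None if no registration happens."""
        self.pending -= 1
        if self.pending == 0 and self.registered_at is None:
            self.registered_at = choose_dispatcher()
            return self.registered_at
        return None   # still busy, or still listed as idle somewhere


s = Server(2)
s.on_job_arrival(from_dispatcher=0)
print(s.on_job_completion(lambda: 1))   # registers with dispatcher 1
s.on_job_arrival(from_dispatcher=0)     # randomly assigned job from dispatcher 0
print(s.on_job_completion(lambda: 3))   # no re-registration: still listed at dispatcher 1
```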

15 JIQ Analysis: Some Notations
r – the ratio of servers to dispatchers.
When is the algorithm expected to perform better: large r or small r?

16 JIQ Analysis: Some Notations
p_i – the probability that a server holds exactly i jobs
p_0 – the probability that a server is idle
λ_i – the arrival rate of jobs to a server which holds exactly i jobs
λ_0 – the arrival rate of jobs to idle processors
ρ = λ/μ – common notation in queuing

17 Load Balancing Assertions
No matter how you balance the load: p_0 = 1 – λ
[Figure: jobs arriving at rate λn, dispatchers, n servers]
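One way to see why this holds is a rate-balance argument. The sketch below assumes, beyond what the slide states, that every busy server works at unit rate (μ = 1) and that the system is stable (λ < 1), so that in steady state jobs leave as fast as they arrive.

```latex
\underbrace{n\,(1 - p_0)\,\mu}_{\text{total departure rate}}
\;=\;
\underbrace{\lambda n}_{\text{total arrival rate}}
\quad\Longrightarrow\quad
p_0 = 1 - \frac{\lambda}{\mu} = 1 - \lambda \qquad (\mu = 1).
```

The dispatching policy does not appear anywhere in this balance, which is why the fraction of idle servers is the same for every load balancer.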

18 Load Balancing: So Where Does the Wisdom Go?
It’s not about increasing the probability that a server is idle. It’s about increasing the arrival rate to idle (and lightly loaded) servers. And from there,

19 Theorem 1: Proportion of Occupied I-Queues
There’s a strong connection between idle servers and occupied I-queues. Jobs arrive at the system at rate λn. The proportion of idle servers is 1-λ, so on average there are (1-λ)n idle servers. These are equally distributed among the m dispatchers, so the average I-queue length is (1-λ)n/m = (1-λ)r.

20 Theorem 1: Proportion of Occupied I-Queues
On the other hand, the authors show that “server arrivals” to the I-queue do behave like a Poisson process (when n→∞). Servers arrive at I-queues at rate ρ. There are ρm occupied I-queues (on average). And so the average I-queue length, under random secondary load balancing, is: [equation not preserved]
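The formula is missing from the transcript. A plausible reconstruction, offered only as an assumption, treats each I-queue as an M/M/1 queue that is occupied with probability ρ and uses the standard geometric queue-length distribution:

```latex
\mathbb{E}[\text{I-queue length}]
= \sum_{k \ge 0} k\,(1-\rho)\,\rho^{k}
= \frac{\rho}{1-\rho},
\qquad\text{so that}\qquad
\frac{\rho}{1-\rho} = r\,(1-\lambda).
```

The second relation equates this mean with the average I-queue length r(1-λ) quoted on the next slide.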

21 Corollary 2: The Arrival Rate at Idle Servers (1)
Job arrival rate at the specific dispatcher is λn/m. A job has probability ρ to find an occupied I-queue. Average I-queue length is r(1-λ).

22 Corollary 2: The Arrival Rate at Idle Servers (2)
Job arrival rate at servers is λ. A job has probability (1-ρ) to find an empty I-queue. Overall arrival rate at idle servers: [equation not preserved]

23 Corollary 2: The Arrival Rate at Non-Idle Servers
Job arrival rate at servers is λ. A job has probability (1-ρ) to find an empty I-queue. Arrival rate at busy servers is λ(1-ρ). The arrival rate at idle servers is (r+1) times the arrival rate at non-idle servers.
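The (r+1) factor can be checked with a short calculation. This is a sketch that assumes the relation ρ/(1-ρ) = r(1-λ) for random secondary load balancing, and that the n(1-λ) idle servers share the jobs routed through I-queues equally; neither assumption is spelled out on this slide.

```latex
\begin{aligned}
\text{via I-queues:}\quad
  & \frac{\lambda n \cdot \rho}{n\,(1-\lambda)}
    = \frac{\lambda\rho}{1-\lambda}
    = \lambda\, r\,(1-\rho), \\
\text{via random assignment:}\quad
  & \lambda\,(1-\rho), \\
\lambda_0 \;=\;
  & \lambda\, r\,(1-\rho) + \lambda\,(1-\rho)
    = (r+1)\,\lambda\,(1-\rho).
\end{aligned}
```

The second equality in the first line uses ρ/(1-λ) = r(1-ρ), which is the assumed relation rearranged.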

24 Proportion of Empty I-queues
An empty I-queue is BAD.
r=10 (n=500, m=50 for the simulation)
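For a feel of the numbers, the proportion of empty I-queues can be evaluated directly. This is a sketch under an assumption that does not appear on this slide: that under random secondary load balancing the occupied proportion ρ satisfies ρ/(1-ρ) = r(1-λ). The function name is hypothetical, and the loads in the loop are arbitrary sample values.

```python
def empty_iqueue_fraction(r, lam):
    """Fraction of empty I-queues (1 - rho), assuming rho / (1 - rho) = r * (1 - lam)."""
    if not 0 <= lam < 1:
        raise ValueError("load must satisfy 0 <= lam < 1")
    mean_len = r * (1 - lam)            # assumed mean I-queue length
    rho = mean_len / (1 + mean_len)     # solve rho / (1 - rho) = mean_len for rho
    return 1 - rho


for lam in (0.5, 0.9, 0.99):
    print(lam, round(empty_iqueue_fraction(10, lam), 3))
```

Under this assumption the fraction of empty I-queues grows as the load approaches 1, and a larger r pushes it down at any fixed load.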

25 Results (Exponential Job Length)

26 Job Length Distributions

27 Sensitivity to Variance (PS)
Load=90%

28 Effect of r on Performance

29 Caveat: Scheduling Still Takes Time…
While the decision for the secondary load balancing is being made, the server is not registered at any I-queue. During this time, performance is expected to degrade.
[Figure: Scheduler, Server, Scheduler]


