Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies Dong Lu* + Peter Dinda* Yi Qiao* Huanyuan Sheng* *Northwestern.

1 Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies
Dong Lu*+, Peter Dinda*, Yi Qiao*, Huanyuan Sheng*
*Northwestern University, +Ask Jeeves, Inc.

2 Outline
Quick review of size-based scheduling
Motivation and approach
Correlation between file size and service time: a measurement study
Performance of SRPT scheduling under real workload
Domain-based scheduling

3 Quick Review of Size-based Scheduling
SRPT
  – Shortest Remaining Processing Time
  – Assuming perfect knowledge of service times
FSP
  – Fair Sojourn Protocol
  – Assuming perfect knowledge of service times
Typical non-size-based scheduling
  – Processor Sharing (PS)
  – First Come First Serve (FCFS)

4 SRPT
Always serve the job with the minimum remaining processing time first; scheduling is preemptive
  – Performance: minimum mean response time [Schrage, Operations Research, 1968]
  – Fairness: the performance gains of SRPT over PS do not usually come at the expense of large jobs; in other words, SRPT is fair under heavy-tailed job size distributions [Bansal and Harchol-Balter, Sigmetrics ’01]
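The SRPT rule on this slide can be illustrated with a small single-server simulation. This is only a sketch under stated assumptions: the function names `srpt_mean_response` and `fcfs_mean_response` and the job format `(arrival_time, service_time)` are hypothetical and are not taken from the authors' C++ simulator.

```python
import heapq

def srpt_mean_response(jobs):
    """Mean response time under preemptive SRPT on one server.

    jobs: list of (arrival_time, service_time); this format is an
    assumption made for the sketch, not the authors' trace format.
    """
    jobs = sorted(jobs)
    n = len(jobs)
    ready = []                      # min-heap of (remaining_time, job_id)
    now, i, total = 0.0, 0, 0.0
    while i < n or ready:
        if not ready:               # server idle: jump to the next arrival
            now = max(now, jobs[i][0])
        while i < n and jobs[i][0] <= now:
            heapq.heappush(ready, (jobs[i][1], i))
            i += 1
        remaining, jid = heapq.heappop(ready)
        next_arrival = jobs[i][0] if i < n else float("inf")
        # Run the shortest job until it finishes or a new job arrives,
        # since an arrival is the only moment a preemption can be needed.
        run = min(remaining, next_arrival - now)
        now += run
        if run < remaining:
            heapq.heappush(ready, (remaining - run, jid))
        else:
            total += now - jobs[jid][0]
    return total / n

def fcfs_mean_response(jobs):
    """Mean response time under First Come First Serve, for comparison."""
    now, total = 0.0, 0.0
    for arrival, service in sorted(jobs):
        now = max(now, arrival) + service
        total += now - arrival
    return total / len(jobs)
```

For the toy workload `[(0, 10), (1, 2), (2, 1)]`, SRPT preempts the long job twice and gives a mean response time of 17/3, while FCFS gives 32/3.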

5 FSP
Combines SRPT with PS; scheduling is preemptive [Friedman, et al, Sigmetrics ’03]
  – SRPT + the longer a job stays in the queue, the higher its priority
  – Performance: mean response time is close to that of SRPT
  – Fairness: fairer than PS

6 Outline
Quick review of size-based scheduling
Motivation and approach
Correlation between file size and service time: a measurement study
Performance of SRPT scheduling under real workload
Domain-based scheduling

7 Motivation
Current implementations of SRPT and FSP
  – Use file size as service time (jobs are sorted by file size)
Is file size a good estimator of service time?
What is the performance of SRPT and FSP when file size is used as service time, and how can it be improved?
Service time: the time needed to send the requested data in the absence of other requests in the system

8 Trace-driven Simulation
Simulator:
  – C++
  – Supports G/G/n/m queuing model
  – Driven by enhanced web server traces
  – Validation
      Little’s law
      Repeat the simulations in the FSP paper [Friedman, et al, Sigmetrics ’03]
      Compare with available theoretical results [Bansal and Harchol-Balter, Sigmetrics ’01]
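The slide lists Little's law (L = λW) as a validation check but does not say how it was applied. One possible form of such a check is sketched below, assuming the simulator can emit one hypothetical `(arrival, departure)` record per job. The time-average number of jobs in the system is computed by sweeping over events, independently of the per-job response times, and the two sides of the law are then compared.

```python
def littles_law_check(records):
    """Return (L, lambda * W) for a list of (arrival, departure) records.

    L is the time-average number of jobs in the system, obtained by
    integrating the job count over time. lambda is the arrival rate and
    W the mean response time. The record format is an assumption.
    """
    events = sorted([(a, +1) for a, _ in records] +
                    [(d, -1) for _, d in records])
    start, end = events[0][0], events[-1][0]
    area, in_system, prev = 0.0, 0, start
    for t, delta in events:
        area += in_system * (t - prev)   # jobs present during [prev, t)
        in_system += delta
        prev = t
    horizon = end - start
    L = area / horizon
    lam = len(records) / horizon
    W = sum(d - a for a, d in records) / len(records)
    return L, lam * W
```

When the observation window starts and ends with an empty system, as it does here, the two values should agree up to floating-point error; a mismatch would point to a bookkeeping bug in the simulator.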

9 Scheduling Policies Studied
SRPT: Ideal SRPT
SRPT-FS: File size as service time
SRPT-D: Domain-estimated service time
FSP: Ideal FSP
FSP-FS: File size as service time
FSP-D: Domain-estimated service time
PS: Processor sharing
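The difference between ideal SRPT and the estimate-driven variants (SRPT-FS, SRPT-D) is that the scheduler orders jobs by an estimate while the true service time still determines when each job finishes. The sketch below illustrates that split. It assumes, without confirmation from the slides, that the estimated remaining size shrinks in proportion to the work already done; the function name and the job tuple layout are hypothetical.

```python
import heapq

def srpt_with_estimates(jobs):
    """Mean response time when SRPT orders jobs by an estimate.

    jobs: list of (arrival, true_service_time, estimate), all positive.
    Priority is the estimated remaining size, taken here to be
    estimate * (fraction of true work left); this is an assumption.
    """
    jobs = sorted(jobs)
    n = len(jobs)
    remaining = [job[1] for job in jobs]   # true remaining service time
    ready = []                             # heap of (estimated_left, job_id)
    now, i, total = 0.0, 0, 0.0
    while i < n or ready:
        if not ready:                      # idle: jump to the next arrival
            now = max(now, jobs[i][0])
        while i < n and jobs[i][0] <= now:
            heapq.heappush(ready, (jobs[i][2], i))
            i += 1
        _, jid = heapq.heappop(ready)
        next_arrival = jobs[i][0] if i < n else float("inf")
        run = min(remaining[jid], next_arrival - now)
        now += run
        remaining[jid] -= run
        if remaining[jid] > 0:
            est_left = jobs[jid][2] * remaining[jid] / jobs[jid][1]
            heapq.heappush(ready, (est_left, jid))
        else:
            total += now - jobs[jid][0]
    return total / n
```

With exact estimates, `[(0, 10, 10), (1, 2, 2), (2, 1, 1)]` gives the ideal SRPT result of 17/3. With misleading estimates, `[(0, 10, 1), (1, 2, 20), (2, 1, 30)]`, the long job is never preempted and the mean response time rises to 32/3.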

10 Outline
Quick review of size-based scheduling
Motivation and approach
Correlation between file size and service time: a measurement study
Performance of SRPT-FS and FSP-FS scheduling under real workload
Domain-based scheduling

11 Correlation is Weak on a Typical Web Server
Measurement on departmental web server; requests from the whole Internet
[Figure: scatter plot of file size versus service time (log-log scale); axes: File Size, Service time; R ≈ 0.14]

12 Correlation is Weak on Web Cache Servers
Measurement on 10 Squid web cache servers:
  – www.ircache.net
[Figure: P[R>x] (0 to 1.0) versus the correlation coefficient R between file size and service time (0 to 0.7)]

13 Main reason for the weak correlation
End-to-end path diversity
[Figure: Web Server and Clients 1 through 4]
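The effect of path diversity on the correlation coefficient R can be illustrated with synthetic data. This is a sketch only: the workload is made up, the bandwidth values are arbitrary, and R is computed on log-transformed values to mirror the log-log scatter plot, although the slides do not state whether the reported R uses logs.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def synthetic_r(bandwidths, n=5000, seed=1):
    """R between log file size and log service time for a fake workload.

    Each request draws a heavy-tailed file size (bytes) and is served at
    a bandwidth (bytes/sec) picked at random from `bandwidths`. All of
    these modelling choices are assumptions made for illustration.
    """
    rng = random.Random(seed)
    sizes = [1000.0 * rng.paretovariate(1.2) for _ in range(n)]
    times = [s / rng.choice(bandwidths) for s in sizes]
    return pearson_r([math.log(s) for s in sizes],
                     [math.log(t) for t in times])

r_single_path = synthetic_r([1e5])                  # every client equally fast
r_diverse_paths = synthetic_r([1e3, 1e4, 1e5, 1e6])  # bandwidths span 3 decades
```

With a single bandwidth, service time is an exact multiple of file size and R is essentially 1. Once clients see widely different bandwidths, the spread in service time is dominated by the path rather than the file, and R drops sharply.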

14 Outline
Quick review of size-based scheduling
Motivation and approach
Correlation between file size and service time: a measurement study
Performance of SRPT-FS and FSP-FS scheduling under real workload
Domain-based scheduling

15 Mean Response Time Much Worse Than Expected
Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load).
[Figure: mean response time (millisec, 100 to 900) versus load on the queue (0 to 2.0); curves: PS, SRPT-FS, FSP-FS, ideal SRPT and FSP]

16 Mean Queue Length Much Worse Than Expected
Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load).
[Figure: mean queue length (1000 to 5000) versus load on the queue (0 to 2.0); curves: FSP-FS, SRPT-FS, PS, ideal SRPT and FSP]

17 Requirements For A Better Service Time Estimator
Low overhead
  – Passive measurement
  – Low computation complexity
  – Low / adjustable memory usage
Effective
  – Approximate the correct ordering of the service times. High correlation.

18 Outline
Quick review of size-based scheduling
Motivation and approach
Correlation between file size and service time: a measurement study
Performance of SRPT-FS and FSP-FS scheduling under real workload
Domain-based scheduling

19 Domain-based estimator
Divide the Internet into smaller “domains” by leveraging CIDR (Classless Inter-domain Routing)
Hosts in the same domain are likely to share the same or similar routes to the web server, and thus to see similar throughput
[Figure: Web Server]

20 Supporting Facts
Statistical Internet stability and locality
  – Routing stability [Paxson, Sigcomm 1996]
  – TCP throughput locality and stability [Balakrishnan, et al, Sigmetrics 1997]; [Seshan, et al, USITS 1997]; [Myers, et al, Infocom 1999]
Classless Inter-domain Routing
  – implies that routes from machines in the domain to a server outside the domain will share many hops.

21 Algorithm
Use the high-order k bits of the client IP address to classify clients into 2^k domains
For each domain, calculate R = F/S
  – R: representative service rate
  – F: sum of file sizes delivered to domain
  – S: sum of corresponding service times
For each request, first extract its domain; the service time can then be estimated as B/R
  – B: requested file size
  – R: representative service rate obtained before
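The steps on this slide can be sketched as a small class. The class and method names are hypothetical, only IPv4 addresses are handled, and returning `None` for a domain with no history is an assumption of this sketch, since the slides do not say how unseen domains are treated.

```python
import ipaddress
from collections import defaultdict

class DomainEstimator:
    """Sketch of the domain-based estimator (names are hypothetical)."""

    def __init__(self, k):
        if not 0 <= k <= 32:
            raise ValueError("k must be between 0 and 32 for IPv4")
        self.k = k
        self.bytes_sent = defaultdict(float)   # F per domain
        self.time_spent = defaultdict(float)   # S per domain

    def domain(self, client_ip):
        """High-order k bits of the IPv4 address, as an integer."""
        addr = int(ipaddress.IPv4Address(client_ip))
        return addr >> (32 - self.k)

    def observe(self, client_ip, file_size, service_time):
        """Record one completed transfer (passive measurement)."""
        d = self.domain(client_ip)
        self.bytes_sent[d] += file_size
        self.time_spent[d] += service_time

    def estimate(self, client_ip, file_size):
        """Estimated service time B/R, or None if the domain is unseen."""
        d = self.domain(client_ip)
        f = self.bytes_sent.get(d, 0.0)
        s = self.time_spent.get(d, 0.0)
        if f <= 0 or s <= 0:
            return None                        # assumption: caller decides
        rate = f / s                           # R = F / S
        return file_size / rate                # B / R

est = DomainEstimator(k=8)
est.observe("10.1.2.3", 1000, 2.0)
est.observe("10.9.9.9", 3000, 2.0)
predicted = est.estimate("10.200.0.1", 500)    # same /8 domain: R = 1000
unknown = est.estimate("192.168.0.1", 500)     # no history for this domain
```

Memory use grows with the number of domains actually seen, at most 2^k, which is one way to read the "low / adjustable memory usage" requirement: a smaller k trades estimation accuracy for a smaller table.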

22 Higher Correlation Can Be Achieved
[Figure: correlation coefficient R (0.1 to 0.7) versus bits used to define a domain (0 to 32)]

23 Much Lower Service Times Can Be Achieved
[Figure: mean response time (millisec, 100 to 900) versus bits used to define a domain (0 to 32); curves: PS, FSP-D, SRPT-FS, FSP-FS, SRPT-D, SRPT and FSP]

24 Much Lower Queue Lengths Can Be Achieved
[Figure: mean queue length (1000 to 3000) versus bits used to define a domain (0 to 35); curves: FSP-D, FSP-FS, SRPT-FS, PS, SRPT-D, SRPT and FSP]

25 Conclusions
File size may not be a good estimator of service time for many regimes
File size-based SRPT and FSP can perform worse than PS in these regimes
Domain-based scheduling brings the benefits of size-based scheduling to these regimes

26 For more information
Prescience Lab at Northwestern University
  – www.presciencelab.org

27 Jeeves’ Invitation …
Have you ever seen the whole Web at once?
Did you ever wonder how to rein the power of thousands of machines?
We are hiring talents for Internet Search
  – Software Engineer
  – Development Manager
Send us your Resume: talentacquisition@askjeeves.com

