Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies Dong Lu* + Peter Dinda* Yi Qiao* Huanyuan Sheng* *Northwestern.


1 Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies
Dong Lu*+, Peter Dinda*, Yi Qiao*, Huanyuan Sheng*
*Northwestern University, +Ask Jeeves, Inc.

2 Outline
Quick review of size-based scheduling
Motivation and approach
Correlation between file size and service time: a measurement study
Performance of SRPT scheduling under real workload
Domain-based scheduling

3 Quick Review of Size-based Scheduling
SRPT
  – Shortest Remaining Processing Time
  – Assuming perfect knowledge of service times
FSP
  – Fair Sojourn Protocol
  – Assuming perfect knowledge of service times
Typical non-size-based scheduling
  – Processor Sharing (PS)
  – First Come First Serve (FCFS)

4 SRPT
Always serve the job with the minimum remaining processing time first; preemptive scheduling
  – Performance: minimum mean response time [Schrage, Operations Research, 1968]
  – Fairness: the performance gains of SRPT over PS do not usually come at the expense of large jobs; in other words, SRPT is fair for heavy-tailed job size distributions [Bansal and Harchol-Balter, Sigmetrics 01]
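The SRPT rule above can be sketched as a small single-server simulation. This is an illustrative sketch only, not the authors' C++ simulator: the function names, the `(arrival_time, service_time)` job format, and the three-job example are assumptions made here, and service times are taken as perfectly known (the "ideal SRPT" case). An FCFS baseline is included for comparison.

```python
import heapq


def srpt_mean_response(jobs):
    """Mean response time under preemptive SRPT on one server.

    jobs: list of (arrival_time, service_time) pairs (hypothetical format).
    """
    jobs = sorted(jobs)
    n = len(jobs)
    heap = []  # entries are (remaining_time, arrival_time)
    t = 0.0
    i = 0
    total = 0.0
    while i < n or heap:
        if not heap:
            # Server is idle: jump to the next arrival.
            t = max(t, jobs[i][0])
        # Admit every job that has arrived by time t.
        while i < n and jobs[i][0] <= t:
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
        # Serve the job with the least remaining processing time.
        remaining, arrival = heapq.heappop(heap)
        next_arrival = jobs[i][0] if i < n else float("inf")
        if remaining <= next_arrival - t:
            # The job completes before anything new arrives.
            t += remaining
            total += t - arrival
        else:
            # A new arrival interrupts: requeue with updated remaining time.
            heapq.heappush(heap, (remaining - (next_arrival - t), arrival))
            t = next_arrival
    return total / n


def fcfs_mean_response(jobs):
    """Mean response time under non-preemptive FCFS, for comparison."""
    t = 0.0
    total = 0.0
    for arrival, service in sorted(jobs):
        t = max(t, arrival) + service
        total += t - arrival
    return total / len(jobs)


# Made-up example: one long job followed by two short ones.
example_jobs = [(0, 10), (1, 2), (2, 1)]
srpt = srpt_mean_response(example_jobs)  # 17/3
fcfs = fcfs_mean_response(example_jobs)  # 32/3
```

In this example the long job still finishes at time 13 under both policies, while the two short jobs no longer wait behind it.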

5 FSP
Combines SRPT with PS; preemptive scheduling [Friedman, et al, Sigmetrics 03]
  – SRPT + the longer a job stays in the queue, the higher its priority
  – Performance: mean response time is close to that of SRPT
  – Fairness: fairer than PS

6 Outline
Quick review of size-based scheduling
Motivation and approach
Correlation between file size and service time: a measurement study
Performance of SRPT scheduling under real workload
Domain-based scheduling

7 Motivation
Current implementations of SRPT and FSP
  – Use file size as service time (sorting jobs by file size)
Is file size a good estimator of service time?
What is the performance of SRPT and FSP when file size is used as service time, and how can it be improved?
Service time: the time needed to send the requested data in the absence of other requests in the system

8 Trace-driven Simulation
Simulator:
  – C++
  – Supports G/G/n/m queuing model
  – Driven by enhanced web server traces
  – Validation
      Little's law
      Repeat the simulations in the FSP paper [Friedman, et al, Sigmetrics 03]
      Compare with available theoretical results [Bansal and Harchol-Balter, Sigmetrics 01]
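As an illustration of the Little's law validation step, the sketch below checks that the time-average number of jobs in the system equals arrival rate times mean response time. It is a minimal sketch, not the authors' validation code: the function name and the `(arrival, departure)` input format are assumptions, and the equality is exact only when the observation window starts and ends with an empty system.

```python
def littles_law_check(intervals):
    """Return (L, lambda * W) for a list of (arrival, departure) pairs.

    L is measured independently by sweeping over arrival and departure
    events and integrating the number of jobs in the system over time.
    """
    start = min(a for a, _ in intervals)
    end = max(d for _, d in intervals)
    horizon = end - start

    # +1 at each arrival, -1 at each departure, in time order.
    events = sorted(
        [(a, 1) for a, _ in intervals] + [(d, -1) for _, d in intervals]
    )
    area = 0.0
    in_system = 0
    prev = start
    for time, delta in events:
        area += in_system * (time - prev)
        in_system += delta
        prev = time

    mean_in_system = area / horizon                      # L
    arrival_rate = len(intervals) / horizon              # lambda
    mean_response = sum(d - a for a, d in intervals) / len(intervals)  # W
    return mean_in_system, arrival_rate * mean_response


# Made-up schedule: three jobs with these arrival and departure times.
L, lam_W = littles_law_check([(0, 13), (1, 3), (2, 4)])  # both equal 17/13
```

A simulator whose queue-length accounting and response-time accounting disagree on this check has a bookkeeping error somewhere.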

9 Scheduling Policies Studied
SRPT: Ideal SRPT
SRPT-FS: File size as service time
SRPT-D: Domain-estimated service time
FSP: Ideal FSP
FSP-FS: File size as service time
FSP-D: Domain-estimated service time
PS: Processor sharing

10 Outline
Quick review of size-based scheduling
Our approach and questions answered
Correlation between file size and service time: a measurement study
Performance of SRPT-FS and FSP-FS scheduling under real workload
Domain-based scheduling

11 Correlation is Weak on a Typical Web Server
Measurement on departmental web server:
[Figure: scatter plot of file size versus service time (log-log scale), requests from the whole Internet; R 0.14]

12 Correlation is Weak on Web Cache Servers
Measurement on 10 Squid web cache servers:
  – www.ircache.net
[Figure: P[R>x] for the correlation coefficient R between file size and service time]

13 Main reason for the weak correlation
End-to-end path diversity
[Figure: Web Server connected to Client 1, Client 2, Client 3, Client 4]
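A toy calculation can show why path diversity weakens the correlation. The sketch below assumes service time is simply file size divided by the client's path throughput, and uses made-up sizes and throughputs; it is an illustration, not the measurement data from the study. The Pearson coefficient is computed on raw values here, which may differ from how the measured R was computed.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def service_times(sizes, throughputs):
    """Assumed model: service time = file size / path throughput."""
    return [size / rate for size, rate in zip(sizes, throughputs)]


# Hypothetical file sizes (bytes) requested by four clients.
sizes = [1000, 2000, 4000, 8000]

# All clients behind the same path: throughput is identical.
r_same = pearson_r(sizes, service_times(sizes, [1000, 1000, 1000, 1000]))

# Clients on very different paths: throughput varies widely.
r_diverse = pearson_r(sizes, service_times(sizes, [1000, 100, 2000, 4000]))
```

With identical throughput the service time is proportional to file size and the coefficient is 1; with diverse throughputs a small file on a slow path can take longer than a large file on a fast path, and the coefficient drops.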

14 Outline
Quick review of size-based scheduling
Our approach and questions answered
Correlation between file size and service time: a measurement study
Performance of SRPT-FS and FSP-FS scheduling under real workload
Domain-based scheduling

15 Mean Response Time Much Worse Than Expected
Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load).
[Figure: mean response time (millisec) versus load on the queue; curves: PS, SRPT-FS, FSP-FS, ideal SRPT and FSP]

16 Mean Queue Length Much Worse Than Expected
Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load).
[Figure: mean queue length versus load on the queue; curves: FSP-FS, SRPT-FS, PS, ideal SRPT and FSP]

17 Outline
Quick review of size-based scheduling
Our approach and questions answered
Correlation between file size and service time: a measurement study
Performance of SRPT-FS and FSP-FS scheduling under real workload
Domain-based scheduling

18 Requirements For A Better Service Time Estimator
Low overhead
  – Passive measurement
  – Low computation complexity
  – Low / adjustable memory usage
Effective
  – Approximate the correct ordering of the service times. High correlation.

19 Domain-based estimator
Divide the Internet into smaller domains by leveraging CIDR (Classless Inter-domain Routing)
Hosts in the same domain are likely to share the same or similar routes to the web server, and thus similar throughput
[Figure: Web Server]

20 Supporting Facts
Statistical Internet stability and locality
  – Routing stability [Paxson, Sigcomm 1996]
  – TCP throughput locality and stability [Balakrishnan, et al, Sigmetrics 1997]; [Seshan, et al, USITS 1997]; [Myers, et al, Infocom 1999]
Classless Inter-domain Routing
  – Implies that routes from machines in the domain to a server outside the domain will share many hops.

21 Algorithm
Use the high-order k bits of the client IP address to classify clients into 2^k domains
For each domain, calculate R = F/S
  – R: representative service rate
  – F: sum of file sizes delivered to the domain
  – S: sum of corresponding service times
For each request, first extract its domain; the service time can then be estimated as B/R
  – B: requested file size
  – R: representative service rate obtained before
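The algorithm above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the class and method names are hypothetical, clients are assumed to use IPv4 addresses, and returning `None` for a domain with no history is a choice made here, since the slides do not say how unseen domains are handled.

```python
import ipaddress
from collections import defaultdict


class DomainEstimator:
    """Estimate service time as B / R, with R = F / S kept per domain."""

    def __init__(self, k):
        if not 0 <= k <= 32:
            raise ValueError("k must be between 0 and 32 for IPv4")
        self.k = k
        self.total_bytes = defaultdict(float)  # F per domain
        self.total_time = defaultdict(float)   # S per domain

    def domain(self, client_ip):
        """High-order k bits of the IPv4 address identify the domain."""
        return int(ipaddress.IPv4Address(client_ip)) >> (32 - self.k)

    def observe(self, client_ip, file_size, service_time):
        """Passively record one completed transfer."""
        d = self.domain(client_ip)
        self.total_bytes[d] += file_size
        self.total_time[d] += service_time

    def estimate(self, client_ip, file_size):
        """Return the estimated service time B / R, or None if no history."""
        d = self.domain(client_ip)
        f = self.total_bytes.get(d, 0.0)
        s = self.total_time.get(d, 0.0)
        if f <= 0 or s <= 0:
            return None  # assumption: caller decides how to rank this job
        rate = f / s  # R: representative service rate of the domain
        return file_size / rate


# Made-up observations for two clients in the same /16 domain.
estimator = DomainEstimator(k=16)
estimator.observe("10.1.2.3", file_size=1000, service_time=2.0)
estimator.observe("10.1.9.9", file_size=3000, service_time=2.0)
known = estimator.estimate("10.1.200.1", file_size=500)   # 0.5
unknown = estimator.estimate("10.2.0.1", file_size=500)   # None
```

Memory use is bounded by the number of domains actually seen, at most 2^k, so k trades estimator resolution against table size.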

22 Higher Correlation Can Be Achieved
[Figure: correlation coefficient R versus bits used to define a domain]

23 Much Lower Service Times Can Be Achieved
[Figure: mean response time (millisec) versus bits used to define a domain; curves: PS, FSP-D, SRPT-FS, FSP-FS, SRPT-D, SRPT and FSP]

24 Much Lower Queue Lengths Can Be Achieved
[Figure: mean queue length versus bits used to define a domain; curves: FSP-D, FSP-FS, SRPT-FS, PS, SRPT-D, SRPT and FSP]

25 Conclusions
File size may not be a good estimator of service time for many regimes
File size-based SRPT and FSP can perform worse than PS in these regimes
Domain-based scheduling brings the benefits of size-based scheduling to these regimes

26 For more information
Prescience Lab at Northwestern University
  – www.presciencelab.org

27 Jeeves Invitation …
Have you ever seen the whole Web at once?
Did you ever wonder how to rein the power of thousands of machines?
We are hiring talents for Internet Search
  – Software Engineer
  – Development Manager
Send us your Resume:


29 Correlation is Weak on a Typical Web Server
Measurement on departmental web server:
[Figure: two scatter plots of file size versus service time (log-log scale). Requests from the whole Internet: R 0.14. Requests from a /16 IP network: R 0.25]

30 Future Work
The back-filling queuing model
[Figure labels: Web Server, Bandwidth, Time, Bottleneck, Web Requests]

