Scheduling in Web Server Clusters CS 260 LECTURE 3 From: IBM Technical Report.


Reference: The State of the Art in Locally Distributed Web-server Systems, by Valeria Cardellini, Emiliano Casalicchio, Michele Colajanni and Philip S. Yu

Concepts
- Web server system: provides web services.
- Trends: (1) an increasing number of clients; (2) growing complexity of web applications.
- Scalable web server system: one able to support large numbers of accesses and resources while still providing adequate performance.

Locally Distributed Web System
- Cluster-based web system: the server nodes mask their IP addresses from clients behind a single virtual IP address that corresponds to one device, the web switch, in front of the set of servers. The web switch receives all packets and then sends them to the server nodes.
- Distributed web system: the IP addresses of the web server nodes are visible to clients. There is no web switch; at most a layer-3 router may be employed to route the requests.

Cluster based Architecture

Distributed Architecture

Two Approaches
The two approaches differ in the OSI protocol layer at which the web switch routes inbound packets.
- Layer-4 switch: determines the target server when the TCP SYN packet is received. Also called content-blind routing, because the server selection policy is not based on HTTP content at the application level.
- Layer-7 switch: first establishes a complete TCP connection with the client, examines the HTTP request at the application level, and then selects a server. It can support sophisticated dispatching policies, but incurs higher latency because each request must be processed up to the application level. Also called a content-aware switch, or a layer-5 switch in the TCP/IP protocol stack.

Layer-4 two-way architecture

Layer-7 two-way architecture

Layer-7 two-way mechanisms
- TCP gateway: an application-level proxy running on the web switch mediates the communication between the client and the server. It makes separate TCP connections to the client and to the server.
- TCP splicing: reduces the overhead of the TCP gateway. For outbound packets, packet forwarding occurs at the network level by rewriting the client IP address. It will be described in more detail in the next class.

Layer-4 Products

Layer 7 products

Dispatching Algorithms
Strategies for selecting the target server in a web cluster.
- Static: the fastest solution and the one least likely to make the web switch a bottleneck, but it does not consider the current state of the servers.
- Dynamic: can outperform static algorithms by making informed decisions, but collecting and analyzing state information adds expensive overhead.
- Requirements: (1) low computational complexity; (2) full compatibility with web standards; (3) state information must be readily available without much overhead.

Content blind approach
Static policies:
- Random: distributes the incoming requests uniformly, with equal probability of reaching any server.
- Round Robin (RR): uses a circular list and a pointer to the last selected server to make the decision.
- Static Weighted RR (for heterogeneous servers): a variation of RR in which each server is assigned a weight Wi depending on its capacity.
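The three static policies above can be sketched in a few lines. This is a minimal illustration, not the mechanism of any particular switch: the function names and server labels are hypothetical, and the weighted variant assumes small positive integer weights and simply repeats each server Wi times in the circular list.

```python
import itertools
import random


def random_policy(servers, rng=random):
    """Random: every server is equally likely to receive the request."""
    return rng.choice(servers)


def round_robin(servers):
    """RR: walk a circular list; the iterator plays the role of the pointer."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Static weighted RR (assumed form): server i appears W_i times per cycle.

    `weights` maps a server name to a positive integer weight.
    """
    expanded = [server for server, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


rr = round_robin(["s1", "s2", "s3"])
first_five = [next(rr) for _ in range(5)]

wrr = weighted_round_robin({"big": 3, "small": 1})
one_cycle = [next(wrr) for _ in range(4)]
```

None of these policies reads any server state, which is why they are cheap enough to run on the switch for every new connection.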

Content blind approach (Cont.)
Dynamic policies:
- Client state aware: statically partition the server nodes and assign each partition a group of clients, identified through client information such as the source IP address.
- Server state aware:
  - Least Loaded: choose the server with the lowest load. Issue: which metric should serve as the server load index?
  - Least Connection: choose the server with the fewest active connections.
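A rough sketch of the two ideas above, under stated assumptions: the client partition is implemented here as the source address modulo the number of servers (one possible mapping, chosen only for illustration), and the connection counts are kept by the dispatcher itself rather than reported by the servers. All names are hypothetical.

```python
import ipaddress


def partition_by_client_ip(client_ip, servers):
    """Client state aware: the same source address always maps to the same server."""
    return servers[int(ipaddress.ip_address(client_ip)) % len(servers)]


class LeastConnections:
    """Server state aware: pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def open(self):
        # Ties are broken in favour of the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def close(self, server):
        self.active[server] -= 1
```

The connection count is attractive as a load index because a layer-4 switch already sees every connection open and close, so no extra reporting from the servers is required.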

Content blind approach (Cont.)
Server state aware (continued):
- Fastest Response: choose the server that is currently responding fastest.
- Weighted Round Robin: a variation of static RR that associates each server with a dynamically evaluated weight that tracks the server's load state.
Client and server state aware:
- Client affinity: instead of assigning each new connection on the basis of server state alone, regardless of any past assignment, consecutive connections from the same client can be assigned to the same server.
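Client affinity can be layered on top of any server-state-aware policy. The sketch below is one hedged way to do it: the first connection from a client goes to the least loaded server, and later connections from the same client reuse that choice. The class name, the per-server request counter used as the load index, and the client identifiers are all assumptions made for illustration.

```python
class AffinityDispatcher:
    """Client and server state aware dispatching (illustrative only)."""

    def __init__(self, servers):
        self.load = {server: 0 for server in servers}  # assumed load index
        self.assigned = {}                              # client id -> server

    def dispatch(self, client_id):
        server = self.assigned.get(client_id)
        if server is None:
            # No past assignment: fall back to the server state.
            server = min(self.load, key=self.load.get)
            self.assigned[client_id] = server
        self.load[server] += 1
        return server
```

A real switch would also expire entries in the assignment table after some idle time; that detail is omitted here.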

Considerations of content blind
- The static approach is the fastest and the easiest to implement, but may make poor assignment decisions.
- The dynamic approach has the potential to make better decisions, but it needs to collect and analyze state information, which may cause high overhead.
- Overall, a simple server-state-aware algorithm is the best choice; the least loaded algorithm is commonly used in commercial products.

Content aware approach
Server state aware:
- Cache Affinity: the file space is partitioned among the server nodes.
- Load Sharing:
  - SITA-E (Size Interval Task Assignment with Equal Load): the switch determines the size of the requested file and selects the target server based on this information.
  - CAP (Client-Aware Policy): web requests are classified by their impact on system resources, for example I/O bound or CPU bound.
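The size-interval idea can be illustrated as follows. This is a sketch under assumptions, not the published algorithm in full: the cutoffs are estimated from a sample of file sizes so that each interval carries roughly the same total number of bytes, and the function names are hypothetical.

```python
import bisect


def equal_load_cutoffs(sample_sizes, n_servers):
    """Choose size cutoffs so each interval holds about 1/n of the total bytes."""
    sizes = sorted(sample_sizes)
    total = sum(sizes)
    cutoffs, running, k = [], 0, 1
    for size in sizes:
        running += size
        # A single large file may cross more than one boundary at once.
        while k < n_servers and running >= k * total / n_servers:
            cutoffs.append(size)
            k += 1
    return cutoffs


def pick_server(file_size, cutoffs):
    """Return the index of the server responsible for this file size."""
    return bisect.bisect_left(cutoffs, file_size)


cutoffs = equal_load_cutoffs([1, 1, 1, 1, 4, 8], n_servers=2)
small_target = pick_server(1, cutoffs)
large_target = pick_server(8, cutoffs)
```

In this toy sample the five smallest files add up to 8 units and the largest file alone is 8 units, so each of the two servers receives the same total amount of work even though the request counts differ.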

Content aware approach (Cont.)
Client state aware:
- Service Partitioning: employ specialized servers for certain types of requests.
- Client Affinity: use a session identifier to assign all web transactions from the same client to the same server.

Content aware approach (Cont.)
Client and server state aware:
- LARD (Locality-Aware Request Distribution): direct all requests for the same web object to the same server node as long as its utilization is below a given threshold.
- Cache Manager: a cache manager that is aware of the cache content of all web servers.
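The LARD rule stated above can be sketched with a single threshold. This is a deliberate simplification: the published LARD policy uses a pair of load thresholds, whereas the version below follows only the one-threshold description on the slide. The class name, the use of active requests as the utilization measure, and the URLs are assumptions.

```python
class SimpleLard:
    """One-threshold locality-aware dispatching (simplified illustration)."""

    def __init__(self, servers, threshold):
        self.load = {server: 0 for server in servers}  # active requests per node
        self.threshold = threshold
        self.owner = {}                                 # url -> server node

    def dispatch(self, url):
        server = self.owner.get(url)
        if server is None or self.load[server] >= self.threshold:
            # First request for this object, or its node is too busy:
            # move the object to the currently least loaded node.
            server = min(self.load, key=self.load.get)
            self.owner[url] = server
        self.load[server] += 1
        return server

    def done(self, server):
        self.load[server] -= 1
```

Keeping each object on one node lets that node serve it from its memory cache; the threshold is what stops a popular object from overloading its node.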

Fair Scheduling in Web Servers CS 213 Lecture 17 L.N. Bhuyan

Objective
- Create an arbitrary number of service quality classes and assign a priority weight to each class.
- Provide service differentiation for different user classes in terms of the allocation of CPU and disk I/O capacities.

Fair Scheduling in a Web Cluster: Objective
- Scheduling: provide service differentiation (or a QoS guarantee) for different user classes in terms of the allocation of CPU and disk I/O capacities.
- Load balancing: balance the load among the nodes in the cluster to ensure maximum utilization and minimum execution time.

Target System

Master/Slave Architecture
Server nodes are divided into two groups:
- Slave group: processes only dynamic requests.
- Master group: can handle both static and dynamic requests.

Performance Guarantees for Internet Services (Gage)
- Environment: web hosting services, with multiple logical web servers (service subscribers) on a single physical web server cluster.
- Gage guarantees each logical web server a pre-specified level of performance, expressed as a distinct number of URL requests to service per second.

Components
- Each service subscriber maintains a queue.
- Request classification: determines the queue for each incoming request.
- Request scheduling: determines which queue to serve next in order to meet the QoS requirement of each subscriber.
- Resource usage accounting: captures the detailed resource usage associated with each subscriber's service requests.

The Gage System
QoS guarantee:
- QoS is expressed as a fixed number of generic URL requests, where a generic request represents an average web site access.
- Currently a generic request is assumed to cost 10 msec of CPU time, 10 msec of disk I/O, and 2000 bytes of network bandwidth.
- Each subscriber is given a fixed number of generic requests.
- Other possible QoS metrics: response time, delay jitter, etc.
- Uses TCP splicing.
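As a worked example of the generic request unit, the sketch below converts the measured resource usage of one request into generic requests. The three constants come from the slide; how the three resources are combined is not stated there, so charging by the most heavily used resource is purely an assumption made for illustration, as are the function and constant names.

```python
# Cost of one generic request, as given on the slide.
GENERIC_CPU_MS = 10.0
GENERIC_DISK_MS = 10.0
GENERIC_NET_BYTES = 2000.0


def generic_cost(cpu_ms, disk_ms, net_bytes):
    """Express one request's usage in generic requests.

    Assumption (not from the source): the charge is set by the bottleneck
    resource, i.e. the largest of the three usage ratios.
    """
    return max(
        cpu_ms / GENERIC_CPU_MS,
        disk_ms / GENERIC_DISK_MS,
        net_bytes / GENERIC_NET_BYTES,
    )


# A CPU-heavy request: 25 ms CPU, 5 ms disk, 1000 bytes sent.
cost = generic_cost(25, 5, 1000)  # 2.5 generic requests under this assumption
```

Under this assumed rule a subscriber entitled to a given number of generic requests per second would use up its share faster with CPU-heavy pages than with average ones.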

Request Scheduling
Two decisions:
- Scheduling: which request should be serviced next, according to each subscriber's static resource reservation and dynamic resource usage.
- Load balancing: which RPN (request processing node) should service this request, according to the load information on each RPN (Least Load First), while also exploiting access locality.
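The two decisions can be sketched as two small functions. This is an illustrative reading of the slide, not Gage's actual scheduler: the next queue is taken to be the backlogged subscriber with the lowest ratio of usage to reservation, each request is charged one unit, and the locality aspect of RPN selection is left out. All class, function, and subscriber names are hypothetical.

```python
from collections import deque


class Subscriber:
    def __init__(self, name, reserved):
        self.name = name
        self.reserved = reserved  # static reservation (generic requests/sec)
        self.used = 0.0           # dynamic usage charged so far
        self.queue = deque()      # this subscriber's pending requests


def next_request(subscribers):
    """Scheduling: serve the backlogged subscriber furthest below its share."""
    backlog = [s for s in subscribers if s.queue]
    if not backlog:
        return None
    chosen = min(backlog, key=lambda s: s.used / s.reserved)
    chosen.used += 1  # assumed charge of one generic request
    return chosen.name, chosen.queue.popleft()


def pick_rpn(load_by_rpn):
    """Load balancing: Least Load First over the reported RPN loads."""
    return min(load_by_rpn, key=load_by_rpn.get)
```

With a subscriber reserved at 2 units and another at 1 unit, both backlogged, this rule serves the first roughly twice as often as the second, which is the proportional behaviour the reservation is meant to express.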