Web Server Software Architectures. Author: Daniel A. Menascé. Presenter: Noshaba Bakht.

Web Site Performance and Scalability
1. Workload characteristics.
2. Security mechanisms.
3. Web cluster architectures.
4. Web server software architecture.

Software Architecture Schemes
Two dimensions generally characterize the architecture:
(1) The processing model, which describes the type of process or threading model used to support the Web server's operation.
(2) Pool-size behavior, which specifies how the size of the pool of processes or threads varies over time with workload intensity.

Processing Models
The main options for a processing model are:
1. Process-based.
2. Thread-based.
3. A hybrid of the two.

Process-Based Servers
Process-based servers consist of multiple single-threaded processes, each of which handles one request at a time. This server model is implemented as the prefork server in the Unix version of Apache 1.3 and as the prefork MPM module in Apache 2.0.

Thread-Based Architecture
The server consists of a single multithreaded process; each thread handles one request at a time. This model is implemented in the Windows NT version of Apache 1.3.

Hybrid Model
The server consists of multiple multithreaded processes, with each thread of any process handling one request at a time. Apache 2.0's worker Multi-Processing Module (MPM) implements such a hybrid multi-process, multi-threaded server. By using threads to serve requests, it can serve a large number of requests with fewer system resources than a process-based server; yet it retains much of the stability of a process-based server by keeping multiple processes available, each with many threads.

Advantages / Drawbacks
The process-based architecture is stable: the crash of one process generally does not affect the others, so the system continues to operate. However, creating and killing processes burdens the Web server because of address-space management.
In a thread-based architecture, a single malfunctioning thread can bring the entire Web server down, since all threads share the same address space. On the other hand, memory requirements are much smaller: it is more efficient to spawn threads of the same process, which share one address space, than to fork a new process, and the threads can easily share data structures such as caches.

Hybrid Architecture
Combines the advantages of both models and reduces their disadvantages.
Example:
Number of processes = p.
Number of threads per process = n.
Total number of requests handled by the server at a time = n * p.
If one process crashes, the number of requests handled at a time = (p - 1) * n.
The system needs less memory to handle the requests and is more stable.
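The capacity arithmetic above can be captured in a tiny helper; the function name and the sample numbers are invented for illustration:

```python
def capacity(p, n, crashed=0):
    """Concurrent requests a hybrid server can serve: p multithreaded
    processes with n threads each, minus any crashed processes."""
    return (p - crashed) * n

print(capacity(p=8, n=25))             # 200 concurrent requests
print(capacity(p=8, n=25, crashed=1))  # 175: a crash costs only n threads
```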

Pool-Size Behavior
Two approaches are possible:
1. Static approach. The Web server creates a fixed number of processes or threads at startup time. At rush hours, if the number of requests exceeds the number of processes/threads, requests wait in a queue; when the server load is low, processes/threads sit idle. An example is Microsoft's Internet Information Server, used by Windows to serve HTML pages.
2. Dynamic approach. The Web server creates processes and threads depending on the load: as the load increases, so does the pool, letting more requests be processed concurrently and reducing the queue size.

Dynamic Pool-Size Implementation
The Web server parent process establishes several configuration parameters:
1. The initial number p of child processes created to handle requests.
2. The minimum number m of idle processes. As the request load increases, processes are used up and the number of idle processes drops below m, telling the parent process to fork more child processes to handle the requests efficiently. When the load decreases, the number of idle processes rises again, telling the parent process to kill the extra processes.
3. The maximum number of requests a process handles before it dies (the process lifetime). Set to zero, the process has an infinite life; set to a positive value, this limit mitigates memory bloat and leaks.
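A single step of such a parent-process control loop might look like the following sketch. The parameter names (min_idle, max_idle) are illustrative, loosely modeled on Apache's MinSpareServers/MaxSpareServers directives rather than taken from the paper:

```python
def adjust_pool(total, busy, min_idle, max_idle):
    """Return the new pool size after one control step of an
    Apache-prefork-style parent process.

    total    -- current number of child processes
    busy     -- processes currently serving a request
    min_idle -- fork more children when the idle count falls below this
    max_idle -- kill children when the idle count rises above this
    """
    idle = total - busy
    if idle < min_idle:
        return total + (min_idle - idle)   # fork enough to restore min_idle
    if idle > max_idle:
        return total - (idle - max_idle)   # reap the surplus children
    return total                           # load is in range; do nothing

print(adjust_pool(total=10, busy=8, min_idle=5, max_idle=10))  # 13: fork 3
print(adjust_pool(total=20, busy=2, min_idle=5, max_idle=10))  # 12: kill 8
```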

Performance Considerations
The total response time of a Web server request is divided into the following components:
1. Service time: the total time a request spends using physical resources, i.e., disk and CPU.
2. Physical contention: the time a request spends waiting for the physical resources of the Web server.
3. Software contention: the total time a request spends waiting in the queue for a process or thread to become available to handle it.
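The decomposition is simply additive; a minimal illustration with invented numbers, in milliseconds:

```python
def response_time(service_ms, physical_wait_ms, software_wait_ms):
    """Total response time = service time + physical contention
    + software contention (the decomposition on this slide)."""
    return service_ms + physical_wait_ms + software_wait_ms

# 20 ms of CPU/disk service, 15 ms waiting for hardware resources,
# 30 ms waiting for a free process/thread:
print(response_time(20, 15, 30))  # 65 ms
```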

Architecture Performance
Two models are combined:
Queuing network model: models the physical resources of the Web server and computes its throughput.
Markov chain model: represents the software architecture.

Analytic Performance Models
Queuing Network (QN) model. A queuing network model represents a system as a set of service centers that process customers (also known as jobs, transactions, or arrivals). A service center can be any active resource, such as a CPU, disk, printer, or network link. Queuing network models can predict the latency, throughput, and utilization of a system and its components. Here, the QN models the Web server's physical resources and computes its throughput. The distributions of the service and interarrival times are key.
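For the simplest case, a single service center with exponential interarrival and service times (an M/M/1 queue), these quantities can be computed directly; the formulas below are standard textbook results, not the paper's full model:

```python
def mm1_metrics(lam, mu):
    """Steady-state M/M/1 metrics: Poisson arrivals at rate lam,
    exponential service at rate mu; requires lam < mu for stability."""
    rho = lam / mu          # utilization
    n = lam / (mu - lam)    # mean number of requests in the system
    r = 1.0 / (mu - lam)    # mean response time (Little's law: n = lam * r)
    return rho, n, r

rho, n, r = mm1_metrics(lam=4.0, mu=5.0)
print(rho, n, r)  # 0.8 4.0 1.0
```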

Markov Model
A Markov model is a random process in which the transition probability to the next state depends solely on the current state. The transition probability is the conditional probability that the system moves to a particular new state, given its current state. It is a memoryless model.

Markov Chain
The characteristic property of this sort of process is that it retains no memory of where it has been in the past: only the current state of the process can influence where it goes next. When the process can assume only a finite or countable set of states, it is usual to refer to it as a Markov chain. "Markov chains are the simplest mathematical models for random phenomena evolving in time. Indeed, the whole of the mathematical study of random processes can be regarded as a generalization in one way or another of the theory of Markov chains." Chains exist in both discrete time and continuous time.
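For a finite chain, the long-run (stationary) state probabilities can be approximated by repeatedly multiplying by the transition matrix (power iteration). This generic sketch is illustrative and not taken from the paper:

```python
def stationary(P, iters=200):
    """Approximate the stationary distribution of a finite Markov chain
    with transition matrix P (row-stochastic) by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# A two-state chain: from state 0 stay with probability 0.9; from
# state 1 return to state 0 with probability 0.5. The stationary
# distribution is (5/6, 1/6).
P = [[0.9, 0.1],
     [0.5, 0.5]]
print([round(x, 4) for x in stationary(P)])  # [0.8333, 0.1667]
```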

Analytic Performance of the Models
Arrival rate (λ): the inverse of the mean interarrival time.
Service rate (µ): the inverse of the mean service time.

State k represents the number of requests in the system (waiting or being handled). A transition from k to k + 1 occurs at the rate at which new requests arrive at the server; a transition from k to k - 1 occurs when a request completes.

... <-> k - 1 <-> k <-> k + 1 <-> ...
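When the pool holds a fixed number c of processes, this birth-death chain is the classic M/M/c queue, and the probability that an arriving request must wait for a free process (pure software contention) is given by the standard Erlang C formula. A sketch with made-up rates, not the paper's exact model:

```python
from math import factorial

def erlang_c(lam, mu, c):
    """Probability that an arriving request finds all c processes busy
    in an M/M/c queue (Erlang C). lam = arrival rate, mu = per-process
    service rate; requires lam < c * mu for stability."""
    a = lam / mu                                   # offered load
    rho = a / c                                    # per-process utilization
    top = a**c / (factorial(c) * (1 - rho))        # queueing term
    bottom = sum(a**k / factorial(k) for k in range(c)) + top
    return top / bottom

# 10 worker processes, 3 req/s arriving, 0.4 req/s served per process:
print(round(erlang_c(lam=3.0, mu=0.4, c=10), 3))  # 0.307
```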

Idle Processes Pool
The average number of idle processes is a function of the average arrival rate of requests, plotted for three different values of the number of active processes p.
For very light loads, the number of idle processes is approximately the total number of processes. As the load increases, the number of idle processes decreases, slightly at first and then precipitously as the arrival rate approaches its maximum possible value.
This maximum value, 5 requests/sec in this case, is the point at which the Web server becomes unstable and the queue for processes grows without bound.
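By Little's law, the average number of busy processes in a stable system equals λ/µ, so a first-order estimate of the idle count falls linearly until saturation at λ = p·µ. The sketch below reproduces the saturation point but not the exact shape of the paper's curve; the rates are invented so that saturation lands at 5 req/s as on the slide:

```python
def avg_idle(p, lam, mu):
    """Rough Little's-law estimate of the average number of idle
    processes: E[busy] = lam/mu in a stable system, so
    E[idle] = p - lam/mu; 0 once the server saturates (lam >= p*mu)."""
    if lam >= p * mu:
        return 0.0  # saturated: the request queue grows without bound
    return p - lam / mu

# p = 25 processes, mu = 0.2 req/s per process -> saturation at 5 req/s,
# matching the maximum arrival rate cited on the slide.
for lam in (0.5, 2.5, 4.9):
    print(round(avg_idle(25, lam, 0.2), 1))  # 22.5, then 12.5, then 0.5
```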

Web Server Dynamic Pool Size
The dynamic policy maintains the average number of idle processes at 10 to 11.

Final Remarks
Software contention and architecture affect Web server performance.
A Web server can dynamically adjust its pool of processes through efficient use of analytic performance models.

Reference Articles
Scaling the Web, Daniel A. Menascé.
Response-Time Analysis of Composite Web Services (submitting a query to several search engines in parallel; a fork-and-join model).
Scaling Web Sites through Caching.
Scaling Your E-Business (traffic spikes and their control via a scalable infrastructure).
Sun Secure Web Server Reference Architecture.
Dynamic Load Balancing on Web-Server Systems (distributed Web server systems; client-based and server-architecture-based approaches; network bandwidth and its effect on Web traffic).
Analysis of Computer Performance Techniques (performance analysis techniques for the models).

Three Critical Questions
1. For the author, the average arrival rate of requests is important because it determines the number of idle processes, which is the control switch for the configuration parameters in the dynamic pool-size behavior of the Web server architecture. What is the impact of the complexity of the software architecture and of the arrival rate of requests?
2. Web-based services are offered to potentially tens of millions of users by hundreds of thousands of servers. Web servers that use analytic performance models as a guide to dynamically adjust configuration parameters, handling the maximum number of requests with minimum delay, will certainly improve the availability and scalability of an e-business, but what is the impact on the cost-effectiveness of the Web services?
3. Some e-business sites may become popular very quickly. How fast can the site architecture be scaled up? Which components of the site should be upgraded: the database server, the Web servers, the application servers, or the network link bandwidth? Capacity planning is the process of predicting when future load levels will saturate the system and determining the most cost-effective way of delaying saturation as long as possible. Prediction is key to capacity planning because an organization needs to determine how an e-business site will react when load levels and customer behavior change, or when new business models are developed. This determination requires predictive models, not experimentation. Are analytic performance models a better fit than simulation models for capacity planning?