Scalability & Availability
Paul Greenfield, CSIRO
Advanced Distributed Software Architectures and Technology group (ADSaT)

Presentation transcript:


Slide 2: Building Real Systems
Scalable
  – Fast enough to handle expected load
  – Grow easily when load grows
Available
  – Available enough of the time
Performance and availability cost
  – Aim for ‘enough’ of each but not more

Slide 3: Scalable
Scale-up
  – Bigger and faster systems
Scale-out
  – Multiple systems working together to handle load
  – Server farms
  – Clusters
Implications for application design

Slide 4: Available
Goal is 100% availability
  – 24x7 operations
Redundancy is the key
  – No single points of failure
  – Spare everything
      – Disks, disk channels, processors, power supplies, fans, memory, ...
Automated fail-over and recovery
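
To illustrate why redundancy is the key, here is a minimal sketch of the usual calculation for spare components. It assumes component failures are independent and that any one working copy is enough, which real hardware only approximates. The function name and the 99% figure are illustrative, not from the slides.

```python
def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components when any one suffices.

    Assumption: failures are independent. Shared causes (power, software
    bugs, operator error) make real systems worse than this estimate.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    # The group is down only when every copy is down at the same time.
    return 1.0 - (1.0 - component_availability) ** copies
```

Under these assumptions, `parallel_availability(0.99, 2)` is about 0.9999: a second copy of a 99%-available component removes most of the remaining downtime.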

Slide 5: Performance
How fast is this system?
  – Not the same as scalability but related
      – Scalability is concerned with the limits to possible performance
  – Measured by response time and throughput
  – Aim for enough performance
      – Have a performance target
      – Tune and add hardware until target hit
      – Then worry about tomorrow…

Slide 6: Performance Measures
Response time
  – What delay does the user see?
  – Instantaneous is good but 95% under 2 seconds is acceptable
  – Response time varies with ‘heaviness’ of transactions
      – Fast read-only transactions
      – Slower update transactions
      – Effects of database contention
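
A ‘95% under 2 seconds’ target can be checked against measured response times. The sketch below uses the nearest-rank percentile definition, which is one of several in common use. The function names are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_target(samples, pct=95, limit_seconds=2.0):
    """True when pct percent of response times are within the limit."""
    return percentile(samples, pct) <= limit_seconds
```

With 100 samples where 95 take 0.1 s and 5 take 5 s, `meets_target` returns True. With 6 slow samples out of 100 it returns False, even though the mean response time barely changes.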

Slides 7-9: Response Times [figures]

Slide 10: Throughput
How many transactions can be handled in some period of time
  – Transactions per second (tps), or tpm, tph or tpd
  – A measure of overall capacity
Transaction Processing Performance Council (TPC)
  – Standard benchmarks for TP systems
  – TPC-C for typical transaction system
  – Current record is 227,000 tpmC

Slide 11: Throughput
Throughput increases until some resource limit is hit
  – Adding more clients just increases the response time
  – Run out of processor, disk bandwidth, network bandwidth
  – Some resources overload badly
      – Ethernet network performance degrades
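
The saturation behaviour can be sketched with the standard asymptotic bounds for a closed system of interactive clients. This is a textbook approximation, not something taken from the slides. The parameters (think time, unloaded response time, service demand at the bottleneck resource) are assumptions chosen for illustration, and the model is optimistic because it ignores queueing below saturation and any overload collapse.

```python
def throughput_bound(clients, think_time, base_response, bottleneck_demand):
    """Upper bound on throughput (transactions/second) for `clients` users.

    think_time:        seconds a client waits between requests
    base_response:     response time in seconds with no contention
    bottleneck_demand: seconds of the busiest resource used per transaction
    """
    if clients < 1:
        raise ValueError("need at least one client")
    unsaturated = clients / (think_time + base_response)
    resource_limit = 1.0 / bottleneck_demand
    return min(unsaturated, resource_limit)


def response_time_bound(clients, think_time, base_response, bottleneck_demand):
    """Lower bound on response time, from the interactive response time law
    R = N / X - Z applied to the throughput bound above."""
    x = throughput_bound(clients, think_time, base_response, bottleneck_demand)
    return clients / x - think_time
```

With a 5 s think time, 0.5 s unloaded response time and 0.05 s of bottleneck demand, throughput tops out near 20 transactions/second. At 50 clients the response time stays at 0.5 s. At 200 clients throughput is no higher than at saturation and the response time rises to about 5 s.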

Slide 12: Throughput [figure]

Slide 13: System Capacity
How many clients can you support?
  – Name an acceptable response time
  – Average 95% under 2 secs is common
      – And what is ‘average’?
  – Plot response time vs # of clients
Great if you can run benchmarks
  – Reason for prototyping and proving proposed architectures before leaping into full-scale implementation
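
Given benchmark results, the capacity question reduces to finding the largest client count that still meets the response time target. A minimal sketch, assuming the benchmark produced a 95th-percentile response time for each client count tested. The function name is hypothetical.

```python
def max_supported_clients(p95_by_clients, limit_seconds=2.0):
    """Largest tested client count whose 95th-percentile response time
    is within the limit, stopping at the first count that misses it.

    p95_by_clients: dict mapping number of clients -> p95 seconds
    Returns 0 if even the smallest tested load misses the target.
    """
    supported = 0
    for clients in sorted(p95_by_clients):
        if p95_by_clients[clients] > limit_seconds:
            break  # past the knee of the curve; higher loads only get worse
        supported = clients
    return supported
```

For made-up measurements such as `{50: 0.4, 100: 0.6, 200: 1.1, 400: 1.9, 800: 6.5}`, the function returns 400. The real limit lies somewhere between 400 and 800, so a further benchmark run in that range would narrow it down.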

Slide 14: System Capacity [figure]

Slide 15: Load Balancing I
A few different but related meanings
1. Balancing across server processes
  – CORBA-style where clients use objects that live inside server processes
  – Want all server processes to be busy
  – Client calls have to go to the process containing their object, even if this process is busy and others are idle

Slide 16: Load Balancing I [figure]

Slide 17: Load Balancing I
Client calls on name server to find the location of a suitable server
Name server can spread client objects across multiple servers
  – Often ‘round robin’
Client is bound to server and stays bound forever
  – Can lead to performance problems
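
The performance problem with permanent binding can be shown with a small simulation. This is a sketch with hypothetical names: clients are bound round robin when they arrive and never rebound, so the load becomes uneven as some of them leave.

```python
from itertools import cycle


def bind_round_robin(client_ids, servers):
    """Bind each arriving client to the next server in turn, permanently."""
    next_server = cycle(servers)
    return {client: next(next_server) for client in client_ids}


def clients_per_server(assignment, active_clients, servers):
    """Count the still-active clients bound to each server."""
    counts = {server: 0 for server in servers}
    for client in active_clients:
        counts[assignment[client]] += 1
    return counts
```

With ten clients on servers "A" and "B", the initial split is 5 and 5. If the even-numbered clients then disconnect, every remaining client is bound to "B" while "A" sits idle, and nothing in the scheme corrects this.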

Slide 18: Load Balancing I
[Table, shown twice as ‘Initial’ and ‘Later’ snapshots. Columns: Server Object Reference | Client Numbers | Total Clients per server object. Surviving Client Numbers entries for the five server objects:]
  …, 201, 206, 211, …
  …, 202, 207, 212, …
  …, 208, 213, …
  …, 209, 214, …
  …, 210, 215, …

Slide 19: Load Balancing I
Solution to static allocation problem is for clients to throw away their server objects and get new ones every now and again
Application coding problem
  – And can objects be discarded?
  – What kind of ‘objects’ are they if they can be discarded?

Slide 20: Name Servers
Server processes call name server when they come up
  – Advertising their services
Clients call name server to find the location of a server process
  – Up to the name server to match clients to servers
Client calls server process to create objects
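
The advertise-and-resolve protocol can be sketched as a small in-memory registry. This is an illustration only: the class, method names and addresses are hypothetical and do not correspond to any CORBA naming service API. Round robin is used for matching because it is the policy the slides mention.

```python
class NameServer:
    """Toy name server: servers advertise, clients resolve."""

    def __init__(self):
        self._servers = {}    # service name -> list of advertised addresses
        self._next_index = {}  # service name -> position of the next hand-out

    def advertise(self, service, address):
        """Called by a server process when it comes up."""
        self._servers.setdefault(service, []).append(address)

    def resolve(self, service):
        """Called by a client; hands out addresses in round-robin order."""
        addresses = self._servers.get(service)
        if not addresses:
            raise LookupError(f"no server advertises {service!r}")
        index = self._next_index.get(service, 0) % len(addresses)
        self._next_index[service] = index + 1
        return addresses[index]
```

The name server is consulted once per client. After that the client talks directly to the address it was given, so the name server stays out of the per-call path.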

Slide 21: Load Balancing I
[Diagram: Client, Name Server, Server process. Labels: Advertise service; Request server reference; Return server reference; Call server object’s methods; Get server object reference]
Load balancing across processes within a server

Slide 22: Load Balancing II
What happens when our single system is full?
  – Use faster systems
      – Scale-up
  – Use additional systems
      – Scale-out
Now load-balancing is used to spread load across systems

Slide 23: Load Balancing II
CORBA world…
  – Name server can distribute across server processes running on different systems
  – Scales well…
      – Name server only involved when handing out a reference to a server, not on every method call

Slide 24: Load Balancing II
[Diagram: Client, Name Server, Server process. Labels: Advertise service; Request server reference; Return server reference; Call server object’s methods; Get server object reference]
Load balancing across multiple systems

Slide 25: Load Balancing II
COM+ world…
  – No need for load-balancing within a system
      – Multithreaded server process
      – All objects live in a single process space
  – Component load balancing across systems
      – Client calls router when creating object
      – Router returns reference to an object in a COM+ server process
      – Load balanced at time of object creation

Slide 26: Load Balancing II
[Diagram: Client; App DLL; DCOM/MTS; MTS process; Thread pool; Shared object space; Application code]
COM+/MTS using thread pools rather than load balancing within a single system

Slide 27: COM+ Component Load Balancing
[Diagram: Client; Response time tracker; Router. Labels: Create object; Call object’s methods; Pass request to server; Create object and pass back reference]
COM+ CLB balancing load across multiple systems

Slide 28: Load Balancing II
COM+ scales well…
  – Router only involved when object is created
      – May change in later release to support dynamic re-balancing as server load changes
  – Method calls direct from client to server
  – Allocation based on response time rather than round-robin
      – Allocate to least-loaded server
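
Allocation by response time rather than round robin might look like the sketch below. The slides do not describe how COM+ CLB actually tracks response times, so the exponentially smoothed average, the class name and the smoothing factor are all assumptions made for illustration.

```python
class ResponseTimeRouter:
    """Picks the server with the lowest tracked response time when an
    object is created. Not the COM+ CLB algorithm, only an illustration."""

    def __init__(self, servers, smoothing=0.2):
        self._average = {server: 0.0 for server in servers}
        self._smoothing = smoothing

    def record(self, server, seconds):
        """Fold a new response time sample into the server's running average."""
        old = self._average[server]
        self._average[server] = old + self._smoothing * (seconds - old)

    def choose(self):
        """Return the server that currently looks least loaded."""
        return min(self._average, key=self._average.get)
```

The router is consulted only at object creation. Later method calls go straight to the chosen server, which is why this arrangement keeps the router off the main call path.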

Slide 29: Load Balancing II
No name server in COM world?
  – COM/MTS clients ‘know’ the name of the server
      – Set at client installation time
      – Can change using GUI tools
      – Admin problem if server app is moved
  – COM+ uses Active Directory to find services

Slide 30: Load Balancing II
Some systems involve the router in every method call/request
  – Request goes to router process, which then passes it on to a server process
  – Scales poorly as the router can be a major bottleneck
  – Some availability concerns as well
      – What happens if the router fails?

Slide 31: Load Balancing II
[Diagram: Client; Router; Server process]
Load balancing with router in main call path

Slide 32: Scale-up
No need for load-balancing across systems
Just use a bigger box
  – Add processors, memory, …
  – SMP (symmetric multiprocessing)
Runs into limits eventually
Could be less available

Slide 33: Scale-up
Example from the Web
  – Large auction site
  – Server farm of NT boxes (scale-out)
  – Single database server (scale-up)
      – 64-processor Sun box
  – More capacity needed?
      – Add more NT boxes easily
      – Sun box is full so have to shift some databases to another box

Slide 34: Clusters
A group of independent computers acting like a single system
  – Shared disks
  – Single IP address
  – Single set of services
  – Fail-over to other members of cluster
  – Load sharing within the cluster
  – DEC, IBM, MS, …

Slide 35: Clusters
[Diagram: Client PCs; Server A; Server B; Disk cabinet A; Disk cabinet B; Heartbeat; Cluster management]

Slide 36: Clusters
Address scalability
  – Add more boxes to the cluster
Address availability
  – Fail-over
  – Add & remove boxes from the cluster for upgrades and maintenance
Can be used as one element of a highly-available system
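
Heartbeat-driven fail-over can be sketched as follows. This is a deliberately simplified model with hypothetical names: the member list order is treated as the order of preference, and timestamps are passed in explicitly so the behaviour is easy to follow. Real cluster software also has to deal with quorum, disk ownership and split-brain situations, none of which is modelled here.

```python
class HeartbeatMonitor:
    """Tracks heartbeats and names the member that should own the services."""

    def __init__(self, members, timeout_seconds):
        self._timeout = timeout_seconds
        # Insertion order doubles as preference order; None = never heard from.
        self._last_seen = {member: None for member in members}

    def heartbeat(self, member, now):
        """Record that `member` was alive at time `now` (seconds)."""
        self._last_seen[member] = now

    def alive(self, now):
        """Members whose most recent heartbeat is within the timeout."""
        return [m for m, seen in self._last_seen.items()
                if seen is not None and now - seen <= self._timeout]

    def active(self, now):
        """Most-preferred live member, or None if the whole cluster is silent."""
        live = self.alive(now)
        return live[0] if live else None
```

If Server A stops sending heartbeats, `active` starts returning Server B once the timeout expires. Removing a box for maintenance looks the same to the monitor as a failure.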

Slide 37: Web Server Farms
Web servers are highly scalable
  – Web applications are normally stateless
      – Next request can go to any Web server
      – State comes from client or database
  – Just need to spread incoming requests
      – IP sprayers (hardware, software)
      – >1 Web server looking at same IP address with some coordination (see MS WLB docs)
  – Same technique for other network apps

Slide 38: Available System
[Diagram: Web Clients; Web Servers, load balanced using Convoy; App Servers use COM+ LB; COM+ LBS router node; Database is installed on Wolfpack cluster for high availability]

Slide 39: Availability
How much?
  – 99%: 87.6 hours of downtime a year
  – 99.9%: 8.76 hours of downtime a year
  – 99.99%: 0.876 hours of downtime a year
Need to consider operations as well
  – Maintenance, software upgrades, backups, application changes
  – Not just faults and recovery time
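
The downtime figures follow directly from the availability percentage. A short sketch of the arithmetic, assuming a 365-day year (8,760 hours). The function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent):
    """Hours per year a system may be unavailable at the given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR
```

At 99.99% the allowance is about 0.876 hours, or roughly 53 minutes a year, and planned maintenance counts against the same budget.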

Slide 40: Availability and Scalability
Often a question of application design
  – Stateful vs stateless
      – What happens if a server fails?
      – Can requests go to any server?
  – What language and database API
      – Balance cost vs speed
      – VB/C++, ODBC/ADO
  – Synchronous method calls or asynchronous messaging?
      – Reduce dependency between components
      – Failure tolerant designs
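
The difference asynchronous messaging makes to component dependency can be sketched with an in-process queue standing in for a real message queue product. This is an assumption made for illustration, and `start_worker` is a hypothetical helper. The sender only enqueues and moves on. It does not wait for the receiver, or need it to be running at that moment.

```python
import queue
import threading


def start_worker(inbox, handle):
    """Start a background thread that processes messages from `inbox`.

    A message of None tells the worker to stop.
    """
    def run():
        while True:
            message = inbox.get()
            try:
                if message is None:
                    return
                handle(message)
            finally:
                # Mark the message as done even if the handler raised.
                inbox.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker
```

A caller creates a `queue.Queue()`, starts a worker with a handler function, and calls `put` for each request. A synchronous method call would instead block the caller until the server replied and would fail outright if the server was down.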

Slide 41: Next Week
Distributed application architectures
  – How to design systems that will work, scale and be available
  – Web-based systems
  – Web technology