Advanced Distributed Software Architectures and Technology group ADSaT 1 Scalability & Availability Paul Greenfield CSIRO.


1 Advanced Distributed Software Architectures and Technology group ADSaT 1 Scalability & Availability Paul Greenfield CSIRO

2 Building Real Systems
Scalable
 – Fast enough to handle expected load
 – Grow easily when load grows
Available
 – Available enough of the time
Performance and availability cost
 – Aim for ‘enough’ of each but not more

3 Scalable
Scale-up
 – Bigger and faster systems
Scale-out
 – Multiple systems working together to handle load
 – Server farms
 – Clusters
Implications for application design

4 Available
Goal is 100% availability
 – 24x7 operations
Redundancy is the key
 – No single points of failure
 – Spare everything
      Disks, disk channels, processors, power supplies, fans, memory, ...
Automated fail-over and recovery
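The effect of "spare everything" can be sketched with the standard availability formulas for redundant (parallel) and chained (series) components. This is a minimal illustration that assumes component failures are independent, which real systems only approximate; the function names and the 0.99 figures are hypothetical.

```python
def parallel_availability(component_availability, copies):
    """Availability of `copies` redundant components when any one suffices.

    Assumes failures are independent (an idealisation).
    """
    return 1 - (1 - component_availability) ** copies


def series_availability(availabilities):
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


if __name__ == "__main__":
    # Hypothetical figures: one 99% power supply versus two redundant ones.
    print(f"single supply:  {parallel_availability(0.99, 1):.4%}")
    print(f"two supplies:   {parallel_availability(0.99, 2):.4%}")
    # A chain with no redundancy is weaker than its weakest link.
    print(f"two in series:  {series_availability([0.99, 0.99]):.4%}")
```

Under these assumptions, duplicating a component removes it as a single point of failure, while every unduplicated component in the chain pulls the total down.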

5 Performance
How fast is this system?
 – Not the same as scalability but related
      Scalability is concerned with the limits to possible performance
 – Measured by response time and throughput
 – Aim for enough performance
      Have a performance target
      Tune and add hardware until target hit
      Then worry about tomorrow…

6 Performance Measures
Response time
 – What delay does the user see?
 – Instantaneous is good but 95% under 2 seconds is acceptable
 – Response time varies with ‘heaviness’ of transactions
      Fast read-only transactions
      Slower update transactions
      Effects of database contention
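A target such as "95% under 2 seconds" can be checked against measured response times with a percentile calculation. The sketch below uses the nearest-rank percentile definition; the function names and the sample data are hypothetical, and other percentile definitions would give slightly different values.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_target(samples, pct=95, limit_secs=2.0):
    """True if pct% of the samples are at or under limit_secs."""
    return percentile(samples, pct) <= limit_secs


# Hypothetical measurements in seconds: mostly fast, with one slow outlier.
response_times = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.1, 1.9,
                  0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.2, 6.0]

if __name__ == "__main__":
    print("95th percentile:", percentile(response_times, 95))
    print("meets 95% under 2 s:", meets_target(response_times))
```

A percentile target tolerates the occasional heavy transaction, which a maximum would not, while still exposing a slow tail that a plain mean would hide.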

7 Response Times

8 Response Times

9 Response Times

10 Throughput
How many transactions can be handled in some period of time
 – Transactions/second or tpm, tph or tpd
 – A measure of overall capacity
Transaction Processing Performance Council (TPC)
 – Standard benchmarks for TP systems
 – TPC-C for typical transaction system
 – www.tpc.org
 – Current record is 227,000 tpmC

11 Throughput
Throughput increases until some resource limit is hit
 – Adding more clients just increases the response time
 – Run out of processor, disk bandwidth, network bandwidth
 – Some resources overload badly
      Ethernet network performance degrades
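The saturation behaviour on this slide can be illustrated with a simple bound model. The sketch assumes a closed system in which each client waits for its response and then "thinks" for a fixed time before the next request. The function name and all numbers (capacity, think time, unloaded response time) are hypothetical, and the formulas are the usual asymptotic bounds, not a prediction for any real system.

```python
def estimate(clients, max_tps, think_secs, unloaded_secs):
    """Return (throughput_tps, response_secs) bounds for a closed system."""
    # Below saturation each client completes one request per
    # (response + think) cycle, so throughput grows with the client count.
    throughput = min(clients / (unloaded_secs + think_secs), max_tps)
    # Past saturation throughput is pinned at the resource limit,
    # so extra clients only show up as longer response times.
    response = max(unloaded_secs, clients / max_tps - think_secs)
    return throughput, response


if __name__ == "__main__":
    for n in (500, 1000, 2000, 4000):
        tps, secs = estimate(n, max_tps=100, think_secs=9, unloaded_secs=1)
        print(f"{n:5d} clients: {tps:6.1f} tps, {secs:5.1f} s response")
```

The model deliberately ignores the "overloads badly" case, where throughput actually falls once the limit is passed.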

12 Throughput

13 System Capacity
How many clients can you support?
 – Name an acceptable response time
 – Average 95% under 2 secs is common
      And what is ‘average’?
 – Plot response time vs # of clients
Great if you can run benchmarks
 – Reason for prototyping and proving proposed architectures before leaping into full-scale implementation
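Given benchmark results, the capacity question reduces to reading the largest client count off the response-time curve before it crosses the target. A minimal sketch, assuming each benchmark run yields a (clients, 95th-percentile response time) pair; the function name and the benchmark figures are hypothetical.

```python
def max_supported_clients(measurements, limit_secs=2.0):
    """Largest benchmarked client count whose response time stays within the limit.

    `measurements` is an iterable of (clients, response_secs) pairs.
    Stops at the first run that misses the target, so a later lucky
    run cannot inflate the answer.
    """
    supported = 0
    for clients, response_secs in sorted(measurements):
        if response_secs > limit_secs:
            break
        supported = clients
    return supported


# Hypothetical benchmark runs: (number of clients, 95th percentile in seconds).
benchmark = [(100, 0.4), (200, 0.5), (300, 0.7), (400, 1.2),
             (500, 1.9), (600, 3.5), (700, 7.0)]

if __name__ == "__main__":
    print("capacity at 2 s target:", max_supported_clients(benchmark))
    print("capacity at 1 s target:", max_supported_clients(benchmark, limit_secs=1.0))
```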

14 System Capacity

15 Load Balancing I
A few different but related meanings
1. Balancing across server processes
 – CORBA-style where clients use objects that live inside server processes
 – Want all server processes to be busy
 – Client calls have to go to the process containing their object, even if this process is busy and others are idle

16 Load Balancing I

17 Load Balancing I
Client calls on name server to find the location of a suitable server
Name server can spread client objects across multiple servers
 – Often ‘round robin’
Client is bound to server and stays bound forever
 – Can lead to performance problems

18 Load Balancing I

Initial
Server object reference   Client numbers                     Total clients per server object
1                         1-100                              100
2                         101-200                            100
3                         201-300                            100
4                         301-400                            100
5                         401-500                            100

Later
Server object reference   Client numbers                     Total clients per server object
1                         1-100, 201, 206, 211, …, 496       160
2                         101-200, 202, 207, 212, …, 497     160
3                         203, 208, 213, …, 498              60
4                         204, 209, 214, …, 499              60
5                         205, 210, 215, …, 500              60
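The imbalance in the table can be reproduced with a short simulation. This is one reading of the slide, not something it states outright: clients 1-200 stay bound to their original servers, while clients 201-500 obtain fresh references that are handed out round robin across all five servers. The function names are hypothetical.

```python
from collections import Counter


def initial_binding(num_clients=500, per_server=100):
    """Bind clients to servers in blocks: 1-100 -> server 1, 101-200 -> server 2, ..."""
    return {c: (c - 1) // per_server + 1 for c in range(1, num_clients + 1)}


def rebind_round_robin(binding, clients, num_servers=5):
    """Return a new binding in which `clients` are reassigned round robin."""
    updated = dict(binding)
    for i, client in enumerate(clients):
        updated[client] = i % num_servers + 1
    return updated


def load_per_server(binding):
    """Clients bound to each server, listed in server order."""
    counts = Counter(binding.values())
    return [counts[server] for server in sorted(counts)]


initial = initial_binding()
# Clients 201-500 reconnect; clients 1-200 keep their old references.
later = rebind_round_robin(initial, range(201, 501))

if __name__ == "__main__":
    print("initial:", load_per_server(initial))
    print("later:  ", load_per_server(later))
```

Round robin spreads only the new bindings evenly. It knows nothing about the 200 clients still attached to servers 1 and 2, which is why static binding drifts out of balance.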

19 Load Balancing I
Solution to static allocation problem is for clients to throw away their server objects and get new ones every now and again
Application coding problem
 – And can objects be discarded?
 – What kind of ‘objects’ are they if they can be discarded?

20 Name Servers
Server processes call name server when they come up
 – Advertising their services
Clients call name server to find the location of a server process
 – Up to the name server to match clients to servers
Client calls server process to create objects
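The advertise and lookup steps can be sketched as a small in-memory registry. This is an illustration of the idea only, not the CORBA Naming Service API: the class, its method names, the service name and the addresses are all hypothetical, and matching is plain round robin.

```python
class NameServer:
    """Toy name server: servers advertise, clients resolve round robin."""

    def __init__(self):
        self._servers = {}  # service name -> list of advertised addresses
        self._next = {}     # service name -> index of next address to hand out

    def advertise(self, service, address):
        """Called by a server process when it comes up."""
        self._servers.setdefault(service, []).append(address)

    def resolve(self, service):
        """Called by a client; returns the address of one server process."""
        servers = self._servers.get(service)
        if not servers:
            raise LookupError(f"no server advertises {service!r}")
        index = self._next.get(service, 0) % len(servers)
        self._next[service] = index + 1
        return servers[index]


if __name__ == "__main__":
    names = NameServer()
    names.advertise("orders", "host-a:9001")
    names.advertise("orders", "host-b:9001")
    for client in range(1, 4):
        print(f"client {client} bound to", names.resolve("orders"))
```

The name server is consulted once per binding. After that the client talks to the server process directly, so the registry stays out of the call path.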

21 Load Balancing I
Diagram: Client, Name Server, Server process
 – Advertise service
 – Request server reference
 – Return server reference
 – Call server object’s methods
 – Get server object reference
Load balancing across processes within a server

22 Load Balancing II
What happens when our single system is full?
 – Use faster systems
      Scale-up
 – Use additional systems
      Scale-out
Now load-balancing is used to spread load across systems

23 Load Balancing II
CORBA world…
 – Name server can distribute across server processes running on different systems
 – Scales well…
      Name server only involved when handing out a reference to a server, not on every method call

24 Load Balancing II
Diagram: Client, Name Server, Server process
 – Advertise service
 – Request server reference
 – Return server reference
 – Call server object’s methods
 – Get server object reference
Load balancing across multiple systems

25 Load Balancing II
COM+ world…
 – No need for load-balancing within a system
      Multithreaded server process
      All objects live in a single process space
 – Component load balancing across systems
      Client calls router when creating object
      Router returns reference to an object in a COM+ server process
      Load balanced at time of object creation

26 Load Balancing II
Diagram: Client, App DLL, DCOM/MTS, MTS process, Thread pool, Shared object space, Application code
COM+/MTS using thread pools rather than load balancing within a single system

27 COM+ Component Load Balancing
Diagram: Client, Response time tracker, Router
 – Create object
 – Call object’s methods
 – Pass request to server
 – Create object and pass back reference
COM+ CLB balancing load across multiple systems

28 Load Balancing II
COM+ scales well…
 – Router only involved when object is created
      May change in later release to support dynamic re-balancing as server load changes
 – Method calls direct from client to server
 – Allocation based on response time rather than round-robin
      Allocate to least-loaded server
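Allocation by response time can be sketched as a router that tracks a smoothed response time per server and places each new object on the server that currently looks fastest. This is a generic illustration, not the actual COM+ CLB algorithm: the class, its methods, the smoothing weight and the server names are all hypothetical.

```python
class Router:
    """Toy router: places new objects on the server with the lowest
    smoothed response time. Involved only at object creation."""

    def __init__(self, servers):
        # Smoothed response time per server, in seconds (0.0 until measured).
        self._avg = {server: 0.0 for server in servers}

    def report(self, server, response_secs, weight=0.5):
        """Fold a new measurement into the server's moving average."""
        self._avg[server] = (1 - weight) * self._avg[server] + weight * response_secs

    def create_object(self):
        """Return the server that should host the next object."""
        return min(self._avg, key=self._avg.get)


if __name__ == "__main__":
    router = Router(["app1", "app2", "app3"])
    router.report("app1", 1.0)
    router.report("app2", 0.2)
    router.report("app3", 0.6)
    print("next object goes to", router.create_object())
```

Unlike round robin, this reacts to servers that are slow for reasons the router cannot see. Objects already created still stay where they are, though.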

29 Load Balancing II
No name server in COM world?
 – COM/MTS clients ‘know’ the name of the server
      Set at client installation time
      Can change using GUI tools
      Admin problem if server app is moved
 – COM+ uses Active Directory to find services

30 Load Balancing II
Some systems involve the router in every method call/request
 – Request goes to router process, which then passes it on to a server process
 – Scales poorly as the router can be a major bottleneck
 – Some availability concerns as well
      What happens if the router fails?

31 Load Balancing II
Diagram: Client, Router, Server process
Load balancing with router in main call path

32 Scale-up
No need for load-balancing across systems
Just use a bigger box
 – Add processors, memory, …
 – SMP (symmetric multiprocessing)
Runs into limits eventually
Could be less available

33 Scale-up
Example from the Web
 – Large auction site
 – Server farm of NT boxes (scale-out)
 – Single database server (scale-up)
      64-processor SUN box
 – More capacity needed?
      Add more NT boxes easily
      SUN box is full so have to shift some databases to another box

34 Clusters
A group of independent computers acting like a single system
 – Shared disks
 – Single IP address
 – Single set of services
 – Fail-over to other members of cluster
 – Load sharing within the cluster
 – DEC, IBM, MS, …

35 Clusters
Diagram: Client PCs, Server A, Server B, Disk cabinet A, Disk cabinet B, Heartbeat, Cluster management

36 Clusters
Address scalability
 – Add more boxes to the cluster
Address availability
 – Fail-over
 – Add & remove boxes from the cluster for upgrades and maintenance
Can be used as one element of a highly-available system

37 Web Server Farms
Web servers are highly scalable
 – Web applications are normally stateless
      Next request can go to any Web server
      State comes from client or database
 – Just need to spread incoming requests
      IP sprayers (hardware, software)
      >1 Web server looking at same IP address with some coordination (see MS WLB docs)
 – Same technique for other network apps

38 Available System
Diagram:
 – Web Clients
 – Web Servers: load balanced using Convoy
 – App Servers: use COM+ LB
 – COM+ LBS router node
 – Database: installed on Wolfpack cluster for high availability

39 Availability
How much?
 – 99%      87.6 hours of downtime a year
 – 99.9%    8.76 hours of downtime a year
 – 99.99%   0.876 hours of downtime a year
Need to consider operations as well
 – Maintenance, software upgrades, backups, application changes
 – Not just faults and recovery time
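The downtime figures follow directly from the number of hours in a year (365 × 24 = 8,760). A minimal sketch of the arithmetic, with a hypothetical function name:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability_pct):
    """Hours per year a system is unavailable at the given availability."""
    return (100 - availability_pct) / 100 * HOURS_PER_YEAR


if __name__ == "__main__":
    for pct in (99, 99.9, 99.99):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% available: {hours:.3f} hours ({hours * 60:.1f} minutes) down a year")
```

At 99.99% the whole yearly budget is under an hour, so planned maintenance and upgrades have to fit inside it along with any faults.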

40 Availability and Scalability
Often a question of application design
 – Stateful vs stateless
      What happens if a server fails?
      Can requests go to any server?
 – What language and database API
      Balance cost vs speed: VB/C++, ODBC/ADO
 – Synchronous method calls or asynchronous messaging?
      Reduce dependency between components
      Failure tolerant designs

41 Next Week
Distributed application architectures
 – How to design systems that will work, scale and be available
 – Web-based systems
 – Web technology

