An economic approach for scalable and highly-available distributed applications Nicolas Bonvin, Thanasis G. Papaioannou and Karl Aberer School of Computer.

An economic approach for scalable and highly-available distributed applications Nicolas Bonvin, Thanasis G. Papaioannou and Karl Aberer School of Computer and Communication Sciences Ecole Polytechnique F´ed´erale de Lausanne (EPFL) 1015 Lausanne, Switzerland firstname.lastname@epfl.ch 2010 IEEE 3rd International Conference on Cloud Computing 1

Outline INTRODUCTION MOTIVATION - RUNNING EXAMPLE SCARCE: THE QUEST OF AUTONOMIC APPLICATIONS EVALUATION CONCLUSIONS COMMENT 2

INTRODUCTION A successful online application should – be able to handle traffic spikes and flash crowds efficiently – be resilient to all kinds of failures (software, hardware, rack or even datacenter failures) A naive solution against load variations would be static over- provisioning of resources – result into resource underutilization for most of the time As the size of the cloud increases its administrative overhead becomes unmanageable – The cloud resources for an application should be self-managed and adaptive to load variations or failures 3

INTRODUCTION We propose a middleware (Scattered Autonomic Resources, referred to as Scarce) – to avoid underutilized computational resources – dynamically adapts to changing conditions(failures or load variations) Simplifies the development of online applications composed by multiple independent components (e.g. web services) – following the Service Oriented Architecture (SOA) principles 4

INTRODUCTION Components are treated as individually rational entities that rent computational resources from servers Components migrate, replicate or stop according to their economic fitness – fitness expresses the difference between the utility offered by a specific application component and the cost for retaining it in the cloud Components of a certain application are dynamically replicated to geographically-diverse servers according to the availability requirements of the application 5

INTRODUCTION Our approach combines the following unique characteristics: Adaptive component replication for accommodating load variations Geographically-diverse placement of clone component instances Cost-effective placement of service components Decentralized self-management of the cloud resources for the application 6

RUNNING EXAMPLE Building an application that both provides robust guarantees against failures (hardware, network, etc.) and handles dynamically load spikes is a non-trivial task a simple web application for selling e-tickets composed by 4 independent components 7 l the entry point of the application and serves the HTML pages to end users A user manager for managing the profiles of the customers A ticket manager for managing the amount of available tickets of an event An e-ticket generator that produces e-tickets in PDF format

RUNNING EXAMPLE Each component can be regarded as a standalone and self- contained web service This application – is highly sensitive to traffic spikes – is business-critical (deployed on different geographical regions) 8 A token (or a session ID) is assigned to each customers browser by the web front-end and is passed to each component along with the requests This token is used as a key in the key- value database to store the details of the clients shopping cart (number of tickets ordered)

SCARCE THE QUEST OF AUTONOMIC APPLICATIONS: – A. The approach – B. Server agent – C. Routing table – D. Economic model – E. Maintaining high-availability 9

The approach We consider applications formed by many independent components – interact together to provide a service to the end user A component – is self-managing, self-healing and is hosted by a server(allowed to host many different components) – stop, migrate or replicate to a new server according to its load or availability 10

Server agent The server agent is a special component that resides at each server – responsible for starting and stopping the components of applications – checking the health of the services 1. verifying if the service process is still running 2. firing a test request,checking that the corresponding reply is correct The agent knows the properties of every service that composes the application – the path of the service executable This knowledge is acquired when the agent starts, by contacting another agent (referred to as bootstrap agent) 11

Server agent During the startup phase, the agent also retrieve the current routing table from the bootstrap agent Assume that a server belongs to a rack, a room, a datacenter, a city, a country and a continent. A label of the form continent-country-city-datacenter-room-rack-server – For example, a possible label for a server located in a data center in London could be EU-UK-LON-D1-C03-R11-S07 12

Routing table 13

Choosing the replica 14

Economic model Service replication should be highly adaptive to the processing load and to failures – the server agent as an individual optimizer to each component 1. to ascertain the pre-specified availability guarantees 2. to balance its economic fitness At each epoch, a service : pays a virtual rent to the servers where it is running – virtual rent corresponds to the usage of the server resources (CPU, memory, network, disk) may be replicated, migrated or stopped by the server agent – based on the service demand, the renting cost and the maintenance of high availability 15

Economic model 16

Economic model 17

Migrate or Stop 18

Replicate 19

Availability 20

Availability 21

Candidate server 22 [3]N. Bonvin, T. G. Papaioannou, and K. Aberer, Cost-efficient and differentiated data availability guarantees in data clouds, in Proc. of the ICDE, Long Beach, CA, USA, 2010.

Candidate server The components rank servers according – net benefit (6) – randomly choose the target for replication among the top-k ones ( avoid overloading the same destination server at an epoch ) the availability tends to be increased as much as possible at the minimum cost, while the network latency for the query reply also decreases that the same approach according to (6) is used for choosing the candidate server for component migration 23

Experimental Setup We employ two different testbed settings: – a single application setup consisting of 7 servers – a multi-application setup consisting of 15 servers The hardware specification of each server is Intel Core i7 920 @ 2.67 GHz, 8GB Ram, Linux 2.6.32-trunk-amd64 We run two databases (MySQL 5.1 and Cassandra 0.5.0) One generator of client requests for each application (FunkLoad 1.10, http://funkload.nuxeo.org/) on their own dedicated servers 24

Experimental Setup Performing the following actions: – 1) request the main page that contains the list of entertainment events – 2) request the details of an event A – 3) request the details of an event B – 4) request again the details of the event A – 5) login into the application and view user account – 6) update some personal information – 7) buy a ticket for the event A – 8) download the corresponding ticket in PDF A client continuously performs this list of actions over a period of 1 minute 25

Experimental Setup An epoch is set to 15 seconds An agent sends gossip messages every 5 seconds We consider two different placements of the components: – A static approach each component is assigned to a server by the system administrator – A dynamic approach all components are started on a single server and dynamically migrate / replicate / stop according to the load or the hardware failures 26

Dynamic vs Static Replica Placement 27 the response time is lower bounded by that of the slowest component (in our case, for generating PDF tickets)

Scalability 28 the multi-application experimental setup Assume that all 10 servers reside at 1 datacenter: Increase the number of concurrent users from 150 to 1500 randomly routed among the replicas of a component

High-availability 29 A single application setup consisting of 7 servers Assume that each component has 2 replicas that reside at separate servers 10 concurrent clients continuously send requests for 1 minute After 30 seconds, one random server between those hosting the replicas of a component fails

Adaptation to New Cloud Resources 30 We employ the single-application experimental setup, but the number of available servers in the cloud ranges from 1 to 10

Evaluation of Routing Policies 31 In this case, we employ the single-application setup The 4 servers of the cloud are located in 2 datacenters (2 servers per datacenter). The round-trip time between the datacenters is 50 ms The minimum availability (i.e. number of replica per component) is set to 2

CONCLUSIONS Proposed an economic approach for dynamic accommodation of load spikes for composite web services deployed in clouds Server agents act as individual optimizers and autonomously replicate, migrate or stop based on their economic fitness This approach also offers high availability guarantees by maintaining a certain number of the various components in geographically diverse 32

COMMENT Rents v.s. Resources – Using rents is a more flexible approach Distributed v.s. Centralized 33

An economic approach for scalable and highly-available distributed applications Nicolas Bonvin, Thanasis G. Papaioannou and Karl Aberer School of Computer.

Similar presentations

Presentation on theme: "An economic approach for scalable and highly-available distributed applications Nicolas Bonvin, Thanasis G. Papaioannou and Karl Aberer School of Computer."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

An economic approach for scalable and highly-available distributed applications Nicolas Bonvin, Thanasis G. Papaioannou and Karl Aberer School of Computer.

Similar presentations

Presentation on theme: "An economic approach for scalable and highly-available distributed applications Nicolas Bonvin, Thanasis G. Papaioannou and Karl Aberer School of Computer."— Presentation transcript:

Similar presentations

About project

Feedback