Towards Autonomic Hosting of Multi-tier Internet Services
Swaminathan Sivasubramanian, Guillaume Pierre and Maarten van Steen
Vrije Universiteit, Amsterdam, The Netherlands
Hosting Large-Scale Internet Services
- Large-scale e-commerce enterprises use complex software systems
- Sites are built from numerous applications called services
  - A request to amazon.com leads to requests to hundreds of services [Vogels, ACM Queue, 2006]
- Each site has an SLA (latency and availability targets)
  - Global optimization-based hosting is intractable
  - Instead, convert the global SLA into per-service SLAs and host each service scalably
- Problem in focus: efficient hosting of a single Internet service
Web Services: Background
- Services are multi-tiered applications
  - Perform business logic on data from their own data store and from other services
  - E.g., shopping-cart service, recommender service, page generator
- Exposed and restricted through well-defined interfaces
  - Usually accessible over the network
  - No direct access to the internal database
[Diagram: services X and Y exchange XML service requests and responses; each service's application server (e.g., JBoss, Tomcat/Axis, WebSphere) issues DB queries to its database (e.g., DB2, Oracle, MySQL)]
Scalability techniques applied to service hosting: application-server replication
[Diagram: multiple replicated application servers in front of the DB, serving services X and Y]
- Useful for compute-intensive services (e.g., page generators)
Scalability techniques applied to service hosting: response caching
[Diagram: response caches placed in front of the application server]
- Caches service responses
- Reduces load on the application (if the hit ratio is good)
Scalability techniques applied to service hosting: database caching
[Diagram: DB caches between the application server and the database]
- Caches query results (e.g., IBM's DBCache, GlobeCBC)
- Reduces DB load (if the hit ratio is good)
Scalability techniques applied to service hosting: client-side response caching
[Diagram: response caches at the clients of external services X and Y]
- Useful if the other service is across a WAN or does not meet its SLA
- Reduces response time
Resource provisioning for a service
[Diagram: a service combining response caches, DB caches, and replicated application servers]
- There is a wide variety of techniques at different tiers to consider
- What is the right (set of) technique(s) for a given service?
  - Depends on: locality, update workload, code execution time, query time, external service dependencies
- Too many parameters for an administrator to manage!
- Can we automate it (at least to a large extent)?
Autonomic Hosting: Initial Objective
- "To find the minimum set of resources to host a given service such that its end-to-end latency is maintained between [Lat_min, Lat_max]."
- We pose it as: "To find the minimum number of resources (servers) to provision in each tier for a service to meet its SLA."
Proposed Approach
- Obtain a model of end-to-end latency:
  - Lat = f(hr_server, t_app, hr_cli, t_db, hr_dbcache, ReqRate)
  - hr = hit ratio, t = execution time, f = latency-modeling function
- Candidate models:
  - Little's-law-based network of queues
  - Mean value analysis (MVA) on a network of queues
  - Or other models?
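As a rough sketch of what such a latency function f could look like (not the authors' implementation), the snippet below approximates each tier as an M/M/1 queue, with hit ratios short-circuiting the deeper tiers. The function and parameter names (end_to_end_latency, hr_resp, t_cache, etc.) are illustrative, and this is a simplification of full MVA on a network of queues.

```python
def tier_latency(service_time, arrival_rate, servers=1):
    """Mean response time of one tier, modeled as `servers`
    M/M/1 queues sharing the arrival stream equally."""
    utilization = (arrival_rate / servers) * service_time
    if utilization >= 1.0:
        return float("inf")                      # tier saturated
    return service_time / (1.0 - utilization)

def end_to_end_latency(req_rate, hr_resp, t_app, hr_dbcache, t_db,
                       n_app=1, n_db=1, t_cache=0.001):
    """Sketch of Lat = f(hr, t, ReqRate): hit ratios determine how
    much traffic (and hence queueing delay) each tier sees."""
    lat = tier_latency(t_cache, req_rate)        # every request probes the response cache
    app_rate = req_rate * (1.0 - hr_resp)        # response-cache misses reach the app tier
    lat += (1.0 - hr_resp) * tier_latency(t_app, app_rate, n_app)
    db_rate = app_rate * (1.0 - hr_dbcache)      # DB-cache misses reach the database
    lat += (1.0 - hr_resp) * (1.0 - hr_dbcache) * tier_latency(t_db, db_rate, n_db)
    return lat
```

Saturated tiers report infinite latency, which lets a provisioning loop detect configurations that cannot sustain the request rate at all.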
Proposed Approach (contd.)
- Fit the service to the model
- Parameters such as execution times can be obtained via log analysis and server instrumentation
- Estimating hit ratios at the different tiers is harder
  - Request patterns and update patterns vary
  - Fluid-based cache models assume infinite cache memory
  - We need a technique that predicts the hit ratio for a given cache size
Virtual Caches
- A virtual cache (VC) is a means to predict hit ratios
  - A cache that stores just the metadata [Wong et al., 2002]
  - Takes the original request and update stream to compute the hit ratio
  - Small footprint; can be added at different tiers such as app servers, client stubs, and JDBC drivers
- What will the hit ratio be if another server with memory d is added to a cache pool with memory M?
  - Run a VC with M+d memory
  - A VC with M-d memory gives the hit ratio when a server is removed
- Running VCs for distributed caches
  - N cache servers, each with M memory
  - Run a VC in each server with M + M/N memory => average hit ratio when a new server is added
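A minimal sketch of such a metadata-only virtual cache, assuming an LRU replacement policy (the class name and interface are illustrative, not from the paper): it tracks keys and sizes but never payloads, so several virtual sizes (M-d, M, M+d) can be replayed cheaply against the same request and update stream.

```python
from collections import OrderedDict

class VirtualCache:
    """Metadata-only LRU cache: stores keys and sizes, never payloads,
    so it can predict the hit ratio for an arbitrary cache size."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.entries = OrderedDict()     # key -> size, in LRU order
        self.hits = self.requests = 0

    def access(self, key, size=1):
        """Replay one request from the original request stream."""
        self.requests += 1
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)
            return
        # Miss: admit the entry, evicting LRU entries until it fits.
        while self.used + size > self.capacity and self.entries:
            _, evicted_size = self.entries.popitem(last=False)
            self.used -= evicted_size
        if size <= self.capacity:
            self.entries[key] = size
            self.used += size

    def invalidate(self, key):
        """Replay one invalidation from the update stream."""
        if key in self.entries:
            self.used -= self.entries.pop(key)

    def hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0
```

To answer the "what if we add a server with memory d" question, one would run a second VirtualCache with capacity M+d over the same stream and compare the two hit ratios.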
Resource Provisioning
- To provision a service:
  - Obtain hit-ratio and execution-time values from the different tiers of the service
  - Estimate the latency of different resource configurations
  - Find the best configuration that meets the latency SLA
- For a running service: if the SLA is violated, find the best tier to add a server to
- Switching time?
  - Adding servers takes time (e.g., cache warm-up, reconfiguration)
  - Currently assumed negligible; we still need to investigate prediction algorithms
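The "estimate latency for different configurations, pick the best" step can be sketched as a brute-force search over per-tier server counts; any latency model (such as the queueing model above) can be plugged in. The function name and the bound max_per_tier are illustrative assumptions, not from the talk.

```python
from itertools import product

def provision(latency_fn, sla_latency, max_per_tier=5, tiers=3):
    """Return the configuration (servers per tier) with the fewest
    total servers whose estimated latency meets the SLA, or None
    if no configuration within the bound qualifies."""
    best = None
    for config in product(range(1, max_per_tier + 1), repeat=tiers):
        if latency_fn(config) <= sla_latency:
            if best is None or sum(config) < sum(best):
                best = config
    return best
```

Exhaustive search is tractable here only because the configuration space is tiny (max_per_tier^tiers points); a real controller would prune it or search greedily, one tier at a time.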
Current Status & Limitations
- Goal: to build an autonomic hosting platform for multi-tier Internet applications
- The multi-queue model with online cache simulations has been a good start
  - Prototyped with Apache, Tomcat/Axis, and MySQL
  - Being integrated with our CDN, Globule
  - Experiments with TPC-App are encouraging; we have also experimented with other services
- Current work:
  - Refining the queueing models for accurate latency estimation
  - Investigating availability issues
Discussion Points
- Utilization-based SLAs
- Other prediction models: does cache behavior vary with request rate?
- Failures: how to provision for availability targets?
- Multiple service classes
Availability-aware Provisioning
- To provision for a required uptime, we must consider the MTTF and MTTR of the servers in each tier
  - Caches have a different MTTR than application servers
- How to provision?
  - Strategy 1: perform latency-based provisioning, then add resources to each tier until the target uptime is reached
  - Strategy 2: formulate it as a dual-constrained optimization problem
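The per-tier step of Strategy 1 can be sketched as follows, under the simplifying assumptions (mine, not the authors') that server failures are independent and a tier is up as long as at least one replica is up. Per-server availability is the standard MTTF / (MTTF + MTTR).

```python
def servers_for_uptime(mttf, mttr, target):
    """Minimum number of replicas in a tier so that
    P(at least one replica up) >= target, assuming independent
    failures.  Per-server availability a = MTTF / (MTTF + MTTR)."""
    a = mttf / (mttf + mttr)
    n, tier_availability = 1, a
    while tier_availability < target:
        n += 1
        tier_availability = 1.0 - (1.0 - a) ** n   # all-n-down probability
    return n
```

Because caches and application servers have different MTTR values, this computation would be run separately per tier; capacity constraints (a tier may need k > 1 servers up to carry the load) would shift the formula to a binomial tail.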
Dynamic Provisioning
- Handling dynamic load changes requires predicting workload changes
  - Prediction lets us prepare earlier, since adding and reconfiguring servers takes time
  - The prediction window should be greater than the server-addition time
- Load prediction is relatively well understood
- Prediction of temporal effects?
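As one simple instance of the well-understood load-prediction step (a sketch, not the method used in the talk): fit a linear trend to recent load samples and extrapolate it one prediction window ahead, where the horizon is chosen to exceed the server-addition time.

```python
def predict_load(samples, horizon):
    """Least-squares linear trend over equally spaced load samples,
    extrapolated `horizon` steps past the last sample."""
    n = len(samples)
    mean_x = (n - 1) / 2.0
    mean_y = sum(samples) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in enumerate(samples))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var if var else 0.0
    # Predict at x = (n - 1) + horizon, i.e. `horizon` steps ahead.
    return mean_y + slope * ((n - 1) + horizon - mean_x)
```

If the predicted load exceeds what the current configuration sustains (per the latency model), the controller can start adding servers early enough that they are warm before the load actually arrives.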
Thank You!
More info: http://www.globule.org
Questions?