Budapest University of Technology and Economics, Department of Measurement and Information Systems: Computing as a utility? SLA as Big Data? András Pataricza.

Presentation transcript:

1 Budapest University of Technology and Economics, Department of Measurement and Information Systems. Computing as a utility? SLA as Big Data? András Pataricza, with contributions of Imre Kocsis (1), Ágnes Salánki (1), Zsolt Kocsis (2). (1) Dept. of Measurement and Information Systems, BME; (2) IBM Hungary

2 Cloud computing vs. electricity. Origins: do it yourself (individual production; capacity = peak workload; low utilization). Utility: efficient production; peak power station; redundancy, control; workload mgmt, monitoring, control; interconnection; SLA. Cloud computing: • Electric power → computing power • Power station → server • Power line → Internet • Power distribution, control, protection, measurement

3 Virtualization

4 Reactivity vs. proactivity
• Reactive control
o "acting in response to a situation rather than creating or controlling it"
• Proactive control
o "controlling a situation rather than just responding to it after it has happened"

5 Standard infrastructure for demanding applications? Monaco Grand Prix: COTS infrastructure → economic efficiency; isolated + monitored, high SLA (barriers, road quality) → dependability. Power supply to an ER: COTS energy source, replicated → economic efficiency; warm backup (diesel generator) → HA for critical processes

6 Clouds for demanding applications? Virtual Desktop Infrastructure; telecommunications. For each: extra-functional reqs (throughput, timeliness, availability); "small problems" → high impact?

7 Dangers in a standard cloud for demanding apps? What are the impacts of factors not covered by an SLA? Example: performability attacks

8 Impact of platform faults (performability domain). "I am well-protected" (soft real-time). Protected. Protected?

9 Experimental setup

10 Exploratory data analysis (EDA)
• Approach to analyzing data sets
o Summary of their main characteristics in easy-to-understand, often visual, graphs, without using a statistical model or having formulated a hypothesis
• Dynamic visualization capabilities
o Identification of outliers, trends and patterns
• Two developments in statistics to reduce the sensitivity of inferences to errors:
o Robust statistics: low sensitivity to outliers
o Nonparametric statistics: no a priori assumptions

11 Five number summary of numerical data [Figure: box plot marking Min. (0 %), Q1 (25 %), Median (50 %), Q3 (75 %) and Max. (100 %), with a lower bound at Q1 – 1.5 IQR. Quantitative domains are mapped to qualitative domains: extr. small, small, normal, large, extr. large.]
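The five number summary and the qualitative domains on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function names are hypothetical, the upper bound Q3 + 1.5 IQR is assumed by symmetry with the Q1 – 1.5 IQR bound shown on the slide, and the exact assignment of quartile ranges to labels is an assumption read off the figure residue.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) of a numeric sample."""
    xs = sorted(data)
    q1, med, q3 = quantiles(xs, n=4, method="inclusive")
    return xs[0], q1, med, q3, xs[-1]

def qualitative_label(x, q1, q3):
    """Map a value to a qualitative domain (assumed mapping, see lead-in)."""
    iqr = q3 - q1
    if x < q1 - 1.5 * iqr:
        return "extr. small"
    if x < q1:
        return "small"
    if x <= q3:
        return "normal"
    if x <= q3 + 1.5 * iqr:  # assumed upper bound, symmetric to the lower one
        return "large"
    return "extr. large"

if __name__ == "__main__":
    lo, q1, med, q3, hi = five_number_summary(range(1, 10))
    print(lo, q1, med, q3, hi)
    print(qualitative_label(100, q1, q3))
```

Because the labels depend only on quartiles, a single extreme value changes its own label but barely moves the boundaries for the rest of the sample.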

12 Example: Robust and non-parametric measures
• Sample set
o 1000 points
o U(1, 5) uniform distribution: response time 3 ms ± 2 ms, mean = median = 3 ms
• Add 1 sample of 20 sec
o New median: sort(resp. times)[501] = 3.02 ms (robust)
o New mean: (2 * 10^4 + 3 * 10^3) / 1001 ≈ 23 ms! (non-robust)
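The arithmetic of this example can be checked with a short script. It is a sketch under one stated assumption: instead of drawing 1000 random points from U(1, 5), it uses 1000 evenly spaced values on [1, 5] ms so that the result is deterministic. With the sum of the original samples at 3 * 10^3 ms and one outlier of 2 * 10^4 ms, the new mean is 23000 / 1001, roughly 22.98 ms, while the median stays near 3 ms.

```python
from statistics import mean, median

# 1000 evenly spaced response times in [1, 5] ms
# (deterministic stand-in for a U(1, 5) sample; an assumption of this sketch)
times_ms = [1 + 4 * i / 999 for i in range(1000)]
base_mean = mean(times_ms)
base_median = median(times_ms)

# Add a single 20 s (20000 ms) outlier
with_outlier = times_ms + [20_000.0]
new_mean = mean(with_outlier)      # pulled far away from 3 ms
new_median = median(with_outlier)  # element 501 of the sorted list (1-based)

if __name__ == "__main__":
    print(f"mean:   {base_mean:.3f} ms -> {new_mean:.3f} ms")
    print(f"median: {base_median:.3f} ms -> {new_median:.3f} ms")
```

One sample out of 1001 shifts the mean by roughly a factor of eight but moves the median by only a few microseconds, which is why the median is the safer summary for response-time data with rare transients.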

13 Measurement of output QoS [Figure labels: Platform SL, Time]

14 The noisy neighbour problem [Figure labels: Hypervisor, Tenant, Neighbor]

15 Same, but different? Normal workload; denial-of-service attack; overload; noisy neighbour

16 Short transient faults – long recovery: 8 sec platform overload → 30 sec service outage → 120 sec SLA violation. As if you unplug your desktop for a second...

17 Variance: tolerable by overcapacity. Performance outage: not tolerable by overcapacity. You don't need enemies: deterministic (?!) run-time in the public cloud...

18 Let's try it at user level. Poor observability of the platform necessitates self-*

19 Mystery shopper & service QoS [Figure labels: VM internal fault; noisy neighbour fault; mystery shopper; main application; fast detection; reaction time window; application failure]
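One way to picture the mystery shopper idea is a lightweight probe that runs next to the main application, times a fixed reference operation, and raises a flag when recent timings leave the normal range, which opens a reaction time window before the application itself fails. The sketch below is only an illustration under assumptions: the function names, the reference operation, and the threshold rule (baseline median plus k times the IQR) are hypothetical and are not taken from the presentation.

```python
import time
from statistics import median, quantiles

def probe_once(reference_op):
    """Time one run of a fixed reference operation, in milliseconds."""
    start = time.perf_counter()
    reference_op()
    return (time.perf_counter() - start) * 1000.0

def is_degraded(baseline_ms, recent_ms, k=3.0):
    """Flag degradation when the recent median exceeds a robust threshold.

    The rule (baseline median + k * IQR) is an assumption of this sketch.
    """
    q1, _, q3 = quantiles(baseline_ms, n=4, method="inclusive")
    threshold = median(baseline_ms) + k * (q3 - q1)
    return median(recent_ms) > threshold

if __name__ == "__main__":
    # Hypothetical reference operation: a small, fixed CPU-bound loop
    baseline = [probe_once(lambda: sum(range(10_000))) for _ in range(50)]
    recent = [probe_once(lambda: sum(range(10_000))) for _ in range(5)]
    print("degraded:", is_degraded(baseline, recent))
```

Using the median and IQR rather than the mean and standard deviation keeps a single slow probe from triggering the alarm or inflating the baseline.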

20 Summary
• Technical
o SLA coverage: missing aspects
o Can be (somewhat) compensated: cheap computing power -> redundancy
o "Double" autonomic computing: cloud level (provider) and application level (user)
• Methodology
o Visual exploratory data analysis for insight
o Algorithmic analysis for proofs and evaluation
• Fault-tolerance design patterns revisited
o Cheap redundancy in the cloud
• I. Kocsis, A. Pataricza et al.: Analytics of Resource Transients in Cloud Based Applications, Int. Journal of Cloud Computing