Mihai Neacşu, BSc. Prof.dr.eng. Alexandru Iosup Ir. Laurens Versluis

A Trace-Based Performance Study of Autoscaling Workloads of Workflows in Datacenters. Mihai Neacşu, BSc.; Prof.dr.eng. Alexandru Iosup; Ir. Laurens Versluis. Massivizing Computer Systems, @Large research, https://atlarge-research.com

Cloud popularity and usage at an all-time high. Surveys: >$200B market by 2020; 86% of companies use more than one cloud service. Resource utilization is increasingly important: competitive position for companies; reduced costs for customer and provider. Source: http://business.nasdaq.com/marketinsite/2017/Cloud-Computing-Industry-Report-and-Investment-Case.html

Workflow execution is gaining popularity. Workflows are sets of tasks with precedence constraints, usually represented as a Directed Acyclic Graph (DAG). They are used to model applications in many domains. Today: thousands of applications in use.
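The DAG representation can be sketched in a few lines of Python. This is an illustration only: the task names and the `execution_order` helper are hypothetical and do not come from the presentation. It uses Kahn's algorithm to produce one order that respects the precedence constraints.

```python
from collections import deque

# Hypothetical workflow: each task maps to the tasks that must finish first.
workflow = {
    "fetch": [],
    "clean": ["fetch"],
    "stats": ["clean"],
    "plot": ["clean"],
    "report": ["stats", "plot"],
}

def execution_order(deps):
    """Return one valid execution order (Kahn's algorithm); raise on a cycle."""
    # Number of unfinished predecessors per task.
    remaining = {task: len(parents) for task, parents in deps.items()}
    # Reverse edges: which tasks become closer to runnable when a task finishes.
    children = {task: [] for task in deps}
    for task, parents in deps.items():
        for parent in parents:
            children[parent].append(task)
    ready = deque(task for task, count in remaining.items() if count == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in children[task]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)
    if len(order) != len(deps):
        raise ValueError("precedence constraints contain a cycle")
    return order

order = execution_order(workflow)
```

In this toy workflow, `stats` and `plot` can run in parallel once `clean` finishes, so the number of tasks that are ready at the same time varies while the workflow executes.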

Executing workflows in the cloud. Workflows are submitted to the cloud and executed in datacenters. The resource demand of a workflow changes over time because of its complex structure. How many resources should be acquired, and when, for a workload of workflows?

A problem for cloud providers. Minimize overprovisioning (allocating too many resources): this reduces costs. Adhere to the Quality-of-Service (QoS) requirements of the client: minimize underprovisioning. Automate this process: users estimate resource requirements poorly. Solution: autoscalers, which scale resources on demand, forecast resource demand in the near future, and deal with sudden flash crowds and peaks.
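To make the two terms concrete, the sketch below computes the average number of over- and underprovisioned resources from a demand series and a supply series. It is a simplified illustration with made-up numbers, not the metric definitions used in the study; `provisioning_summary` is a hypothetical helper.

```python
def provisioning_summary(demand, supply):
    """Average over- and underprovisioned resources per time step."""
    if len(demand) != len(supply) or not demand:
        raise ValueError("demand and supply must be non-empty and equally long")
    steps = len(demand)
    # Overprovisioning: supplied resources that were not needed.
    over = sum(max(s - d, 0) for d, s in zip(demand, supply)) / steps
    # Underprovisioning: demanded resources that were not supplied.
    under = sum(max(d - s, 0) for d, s in zip(demand, supply)) / steps
    return over, under

# Made-up series: resources demanded and supplied at four time steps.
demand = [10, 40, 80, 30]
supply = [20, 40, 60, 60]
over, under = provisioning_summary(demand, supply)
```

Here the supply lags behind the peak at the third step (underprovisioning) and stays high after the peak has passed (overprovisioning).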

Autoscaler operations. Monitor the current workload (9) and the available resources (10). Attempt to predict future utilization (8). Scale resources up or down according to the prediction (4). How well do autoscalers perform relative to each other?
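The monitor, predict, scale cycle can be sketched as follows. This is a minimal illustration under stated assumptions: the moving-average predictor and the function names are hypothetical and are not one of the autoscalers compared in this work.

```python
import math

def predict_demand(history, window=3):
    """Forecast next-step demand as the mean of the last `window` observations."""
    if not history:
        raise ValueError("history must contain at least one observation")
    recent = history[-window:]
    return sum(recent) / len(recent)

def autoscale_step(history, supplied, window=3):
    """Return the new supply and the change relative to the current supply."""
    # Predict: estimate the demand for the next interval.
    target = math.ceil(predict_demand(history, window))
    # Scale: a positive delta means scale up, a negative delta means scale down.
    return target, target - supplied

# Monitor: demand observed over the last three intervals (made-up values).
history = [12, 18, 30]
new_supply, delta = autoscale_step(history, supplied=15)
```

A predictor this simple reacts slowly to sudden peaks, which is one reason different autoscalers can behave very differently on the same workload.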

In this work: compare autoscalers in simulation. Four distinct workload traces; a workload is a set of workflows (applications). Use a rich set of metrics: 10+ forms of elasticity. Four experiments: different workload domains (new), bursty workloads (deeper understanding), impact of the allocation policy (new), different resource environments (new). Medium scale. Definitely mention that, for our design, we contacted the authors of a previous work that performed real-world experimentation and obtained their dataset. We have verified that our output matches theirs and added several

Experiment: different workload domains. Investigate the performance of autoscalers while processing workloads of different domains. Three workloads: Scientific (T1), Industrial (T2), Engineering (T3). Use the Chronos (Shell) industrial resource setup: 70 resources per cluster. At start, the number of clusters is scaled to reach 70% resource utilization.
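One possible reading of the initial sizing rule is: pick the smallest number of 70-resource clusters for which the initial demand does not exceed 70% of the total capacity. The sketch below works that arithmetic out. The interpretation, the function name, and the example demand values are assumptions, not details taken from the presentation.

```python
def initial_clusters(demand, resources_per_cluster=70, target_utilization_pct=70):
    """Smallest cluster count that keeps utilization at or below the target."""
    # Usable capacity per cluster, scaled by 100 to stay in integer arithmetic.
    usable = resources_per_cluster * target_utilization_pct
    # Integer ceiling division of (demand * 100) by the scaled usable capacity.
    return -(-demand * 100 // usable)

# Each cluster contributes 70 * 0.70 = 49 usable resources at the target.
clusters_490 = initial_clusters(490)  # 490 / 49 = 10 clusters exactly
clusters_500 = initial_clusters(500)  # 500 / 49 is about 10.2, rounded up to 11
```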

Results: significant differences per domain. For some metrics, such as overprovisioning, ConPaaS and Hist perform worse for the EE workload. Autoscalers underprovision roughly equally on all workloads. Make a backup slide with all formulas of the metrics. Make two bar graphs to show. + Tell a joke: “don’t worry, there are charts”.

Experiment: different allocation policies. Inspect the performance of autoscalers using different allocation policies. One workload: Engineering (T4). Three allocation policies: WorstFit, FillWorstFit, BestFit. Use the Chronos (Shell) industrial setup: 70 resources per cluster. At start, the number of clusters is scaled to reach 70% resource utilization. Metrics: supplied resources. We expect the results to differ: with BestFit the number of allocated resources may be lower, which some autoscalers take into account.
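The difference between the policies can be illustrated with the textbook bin-packing definitions: BestFit picks the cluster with the fewest free resources that still fits the task, and WorstFit picks the cluster with the most free resources. The study's exact definitions may differ, and FillWorstFit is left out because the slides do not define it. `pick_cluster` and the example values are hypothetical.

```python
def pick_cluster(free, needed, policy):
    """Return the index of the cluster chosen for a task, or None if none fits."""
    candidates = [i for i, f in enumerate(free) if f >= needed]
    if not candidates:
        return None
    if policy == "BestFit":
        # Tightest fit: the fullest cluster that can still host the task.
        return min(candidates, key=lambda i: free[i])
    if policy == "WorstFit":
        # Loosest fit: the emptiest cluster.
        return max(candidates, key=lambda i: free[i])
    raise ValueError(f"unknown policy: {policy}")

# Free resources per cluster (made-up values); cluster 1 is completely idle.
free = [5, 70, 20]
best = pick_cluster(free, 4, "BestFit")
worst = pick_cluster(free, 4, "WorstFit")
```

Under these definitions BestFit leaves the idle cluster untouched, while WorstFit places the task on it, so the cluster stays in use.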

Results. Token overestimates the level of parallelism and oversupplies. BestFit performs significantly better than the other two policies. Conclusion: the allocation policy has a direct impact on performance, which suggests co-designing allocation and provisioning policies. Add a callout explaining why FillWorstFit and WorstFit are at 280 (because idle clusters continuously receive 1 task and cannot be deallocated).

Conclusion and future work. We compared seven autoscalers using four distinct traces. In this work, four experiments: the performance differs significantly per application domain; the allocation policy has a direct impact on performance, which suggests co-designing allocation and provisioning policies; all autoscalers perform similarly on bursty workloads in terms of NSL; some autoscalers overprovision more while yielding no better NSL. Future work: study the impact of heterogeneity; apply several cost metrics, e.g. by using cost models; experiment with job and task migrations; improve our simulations by including resource boot-up times. Interested in the other two experiments or the paper? Let me know!
