Improving Consolidation of Virtual Machines with Risk-aware Bandwidth Oversubscription in Compute Clouds. Amir Epstein, joint work with David Breitgand. © 2009 IBM Corporation

Presentation transcript:

Slide 1: Improving Consolidation of Virtual Machines with Risk-aware Bandwidth Oversubscription in Compute Clouds. Amir Epstein. Joint work with David Breitgand.

Slide 2: Motivation
• Network bandwidth is a critical data center resource.
• Network bandwidth may become a bottleneck for consolidation.
• Accurate and efficient estimation of network bandwidth demand is difficult.
• Common practice: fully provision for peak loads.
• Consequence: wasted resources.

Slide 3: Full Provisioning vs. Multiplexing
• The aggregate demand of the VMs may be much smaller than the sum of the maximum demands of the individual VMs: $\sum_i \max_t d_i(t) \gg \max_t \sum_i d_i(t)$
Example (figure): Max(VM1) + Max(VM2) = 110

Slide 4: Full Provisioning vs. Multiplexing
Example (figure): Max(VM1 + VM2) = 71 < Max(VM1) + Max(VM2) = 110
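
The gap between the two quantities can be checked on any pair of traces. A minimal Python sketch, using two made-up demand traces (the values and function names are hypothetical and are not the traces plotted on the slides):

```python
# Hypothetical bandwidth demand traces (arbitrary units), one sample per time step.
vm1 = [10, 60, 20, 15, 30]
vm2 = [40, 5, 30, 35, 10]

def full_provisioning(traces):
    """Capacity needed when every VM is provisioned for its own peak."""
    return sum(max(trace) for trace in traces)

def multiplexed_peak(traces):
    """Peak of the aggregate demand: max over time of the per-step sum."""
    return max(sum(step) for step in zip(*traces))

sum_of_peaks = full_provisioning([vm1, vm2])  # 60 + 40
peak_of_sum = multiplexed_peak([vm1, vm2])    # per-step sums: 50, 65, 50, 50, 40
```

The peak of the sum can never exceed the sum of the peaks; how much smaller it is depends on how rarely the individual peaks coincide.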

Slide 5: Statistical Multiplexing
• Model each VM's dynamic bandwidth demand as a random variable.
• The aggregate bandwidth demand is the sum of the random variables representing the VMs' bandwidth demands.
• As the number of VMs increases, the ratio of the standard deviation of the aggregate bandwidth demand to its mean decreases.
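
For intuition, consider the special case (an assumption made here only for illustration, not stated on the slide) of $n$ independent demands $X_1, \dots, X_n$ with a common mean $\mu > 0$ and standard deviation $\sigma$. The ratio then shrinks like $1/\sqrt{n}$:

```latex
\mathbb{E}\Big[\sum_{i=1}^{n} X_i\Big] = n\mu, \qquad
\operatorname{Std}\Big[\sum_{i=1}^{n} X_i\Big] = \sqrt{n}\,\sigma, \qquad
\frac{\operatorname{Std}\big[\sum_{i} X_i\big]}{\mathbb{E}\big[\sum_{i} X_i\big]}
  = \frac{\sigma}{\mu\sqrt{n}} \xrightarrow[n \to \infty]{} 0 .
```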

Slide 6: Overcommit
• The cloud provider aims to improve cost-efficiency.
• Overcommit resources using statistical multiplexing.
• Our focus is bandwidth.

Slide 7: Stochastic Bin Packing Problem (SBP)
• $S = \{X_1, \dots, X_n\}$: set of items
• $X_i$: random variable representing the size (bandwidth demand) of item $i$
• $p$: overflow probability; $p$ represents a probabilistic SLA / policy
• Goal: partition the set $S$ into the smallest number of subsets (bins) $S_1, \dots, S_k$ such that, with bin capacity normalized to 1, every bin $j$ satisfies $\Pr\big[\sum_{X_i \in S_j} X_i > 1\big] \le p$

Slide 8: SBP with Normal Distribution
• We assume that each item $i$ independently follows a normal distribution $N(\mu_i, \sigma_i^2)$.
• When $\sigma_i = 0$ for all $i$, we have $X_i = \mu_i$ and the problem reduces to the classical bin packing problem.
• The focus of this work is SBP with normal variables.
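
Under the normality assumption, the probabilistic overflow constraint has a deterministic equivalent. A short derivation, assuming unit bin capacity and a bin with positive variance, and writing $\mu(S_j) = \sum_{X_i \in S_j} \mu_i$ and $\sigma^2(S_j) = \sum_{X_i \in S_j} \sigma_i^2$ (notation introduced here for illustration):

```latex
\sum_{X_i \in S_j} X_i \sim N\big(\mu(S_j), \sigma^2(S_j)\big)
\;\Longrightarrow\;
\Pr\Big[\sum_{X_i \in S_j} X_i > 1\Big]
  = 1 - \Phi\!\left(\frac{1 - \mu(S_j)}{\sigma(S_j)}\right) \le p
\;\Longleftrightarrow\;
\frac{1 - \mu(S_j)}{\sigma(S_j)} \ge \Phi^{-1}(1 - p)
\;\Longleftrightarrow\;
\mu(S_j) + \Phi^{-1}(1 - p)\,\sigma(S_j) \le 1 .
```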

Slide 9: Related Work: Bin Packing
• The problem is NP-hard.
• Bin packing is hard to approximate to a factor better than 3/2 unless P=NP.
• First Fit Decreasing (FFD) has an asymptotic approximation ratio of 11/9 and an (absolute) approximation ratio of 3/2.
• The MFFD algorithm has an asymptotic approximation ratio of 71/60.
• An AFPTAS exists.
• Online bin packing
  – First Fit (FF) has a competitive ratio of 17/10.
  – The best known upper and lower bounds are [values missing in the transcript].

Slide 10: Related Work: Stochastic Bin Packing
• [factor missing in the transcript]-approximation for SBP with Bernoulli variables [Kleinberg et al. 1997]
• SBP with Poisson, exponential and Bernoulli variables [Goel and Indyk 1999]
  – A PTAS exists for Poisson and exponential distributions.
  – A quasi-PTAS exists for Bernoulli variables.
  – These results relax the bin capacity and overflow probability constraints by a factor of 1+ε.
• [factor missing in the transcript]-competitive algorithm for SBP with normal variables [Wang et al. 2011]

Slide 11: Our Results
• A 2-approximation algorithm for SBP with normal variables.
• A (2+ε)-competitive algorithm for online SBP with normal variables.
• An observation that a dual PTAS exists for SBP with normal variables.

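Slide 12: Definitions
• The load of bin $j$ is normally distributed with mean $\mu(S_j) = \sum_{X_i \in S_j} \mu_i$ and variance $\sigma^2(S_j) = \sum_{X_i \in S_j} \sigma_i^2$.
• Definition: The effective load of bin $j$ is $\mu(S_j) + \beta\,\sigma(S_j)$, where $\beta = \Phi^{-1}(1-p)$ and the quantile function $\Phi^{-1}$ is the inverse of the CDF $\Phi$ of $N(0,1)$.
• Observation: A packing is feasible for a given overflow probability $p$ iff, for every bin $j$, $\mu(S_j) + \beta\,\sigma(S_j) \le 1$.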
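
The feasibility test translates directly into code. A minimal Python sketch, assuming unit bin capacity and items given as (mean, standard deviation) pairs; the function names and the sample values are illustrative only:

```python
from math import sqrt
from statistics import NormalDist

def beta(p):
    """Quantile Phi^{-1}(1 - p) of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - p)

def effective_load(items, p):
    """Effective load of one bin; items is a list of (mean, std) pairs."""
    total_mean = sum(mu for mu, _ in items)
    total_var = sum(sigma ** 2 for _, sigma in items)
    return total_mean + beta(p) * sqrt(total_var)

def is_feasible(items, p, capacity=1.0):
    """True if the bin overflows with probability at most p."""
    return effective_load(items, p) <= capacity

bin_items = [(0.2, 0.05), (0.3, 0.1), (0.1, 0.02)]  # hypothetical (mean, std) pairs
load = effective_load(bin_items, p=0.01)
```

Variances add across independent items while standard deviations do not, which is why the bin state is tracked as a sum of means and a sum of variances.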

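Slide 13: Simple Solution Approach
• Reduce the problem to the classical bin packing problem with deterministic item sizes $\mu_i + \beta\sigma_i$, thus [formula missing in the transcript].
• A feasible solution to the classical bin packing problem is a feasible solution to SBP, since $\sqrt{\sum_{X_i \in S_j} \sigma_i^2} \le \sum_{X_i \in S_j} \sigma_i$ and therefore $\sum_{X_i \in S_j} \mu_i + \beta\sqrt{\sum_{X_i \in S_j} \sigma_i^2} \le \sum_{X_i \in S_j} (\mu_i + \beta\sigma_i) \le 1$.
• However, the optimum for the classical bin packing instance with the new sizes may be significantly larger than the optimum for SBP.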
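
How conservative the reduction is can be seen on identical items. The Python sketch below counts how many copies of one item fit in a unit-capacity bin under each rule; the item parameters and the function name are hypothetical and chosen only for illustration:

```python
from math import sqrt
from statistics import NormalDist

def max_items_per_bin(mu, sigma, p, stochastic):
    """Largest n such that n identical items fit in one unit-capacity bin.

    Assumes mu > 0 so that the loop terminates.
    """
    b = NormalDist().inv_cdf(1.0 - p)
    n = 0
    while True:
        k = n + 1
        if stochastic:
            load = k * mu + b * sigma * sqrt(k)  # effective load of k items
        else:
            load = k * (mu + b * sigma)          # sum of deterministic sizes
        if load > 1.0:
            return n
        n = k

deterministic_fit = max_items_per_bin(0.05, 0.05, 0.01, stochastic=False)  # 6
stochastic_fit = max_items_per_bin(0.05, 0.05, 0.01, stochastic=True)      # 11
```

In this instance the deterministic sizes charge the safety margin once per item, whereas the effective load charges it once per bin on the combined standard deviation.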

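Slide 14: Effective Size
• With $\beta = \Phi^{-1}(1-p)$ and $\sigma(S_j) = \sqrt{\sum_{X_k \in S_j} \sigma_k^2}$, the effective load of bin $j$ can be written as a sum over its items: $\sum_{X_i \in S_j} \mu_i + \beta\,\sigma(S_j) = \sum_{X_i \in S_j} \Big(\mu_i + \beta\,\frac{\sigma_i^2}{\sigma(S_j)}\Big)$
• Thus, the effective size of item $i$ on bin $j$ can be viewed as $\mu_i + \beta\,\frac{\sigma_i^2}{\sigma(S_j)}$.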

Slide 15: Approximation Algorithm
Algorithm 1: First Fit VMR Decreasing
• Order the items in non-increasing order of VMR.
• Place the next item in the first bin into which it can be feasibly packed.
• If no such bin exists, open a new bin to pack this item.
The variance-to-mean ratio (VMR) of item $i$ is $\sigma_i^2 / \mu_i$.
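
The three steps of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: it assumes unit bin capacity, positive means, the feasibility test "sum of means plus $\Phi^{-1}(1-p)$ times the square root of the sum of variances is at most the capacity", and hypothetical item values.

```python
from math import sqrt
from statistics import NormalDist

def first_fit_vmr_decreasing(items, p, capacity=1.0):
    """Pack (mean, std) items; returns a list of bins (lists of items).

    Assumes every mean is positive and every item fits in an empty bin.
    """
    b = NormalDist().inv_cdf(1.0 - p)
    # Non-increasing variance-to-mean ratio sigma^2 / mu.
    ordered = sorted(items, key=lambda it: it[1] ** 2 / it[0], reverse=True)
    bins = []  # each entry: [sum of means, sum of variances, packed items]
    for mu, sigma in ordered:
        for state in bins:
            new_mean = state[0] + mu
            new_var = state[1] + sigma ** 2
            if new_mean + b * sqrt(new_var) <= capacity:
                state[0], state[1] = new_mean, new_var
                state[2].append((mu, sigma))
                break
        else:
            # No existing bin can take the item: open a new one.
            bins.append([mu, sigma ** 2, [(mu, sigma)]])
    return [state[2] for state in bins]

demo_items = [(0.3, 0.1), (0.2, 0.02), (0.4, 0.05), (0.1, 0.08), (0.25, 0.03)]
packing = first_fit_vmr_decreasing(demo_items, p=0.05)
```

Sorting by VMR first tends to place the burstiest items together, so their variances are pooled under a single square root.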

Slide 16: Approximation Algorithm
Theorem 1: Algorithm 1 is a 2-approximation algorithm for SBP with normal variables.

Slide 17: Integer Program for SBP

Slide 18: Mathematical Program Relaxation

Slide 19: Fractional Algorithm (Algorithm 2)
• Order the items in non-increasing order of VMR.
• Place the next item in the bin that still has remaining capacity. If the item would cause the bin to overflow, assign the maximum feasible fraction of the item to the bin, then open a new bin to pack the remaining part of the item.
The variance-to-mean ratio (VMR) of item $i$ is $\sigma_i^2 / \mu_i$.
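
A possible Python sketch of the fractional procedure follows. It rests on an assumption that the transcript does not confirm: a fraction $x$ of item $i$ is taken to contribute $x\mu_i$ to a bin's mean and $x\sigma_i^2$ to its variance. Under that assumption the bin load grows with $x$ (for $p < 0.5$), so the maximum feasible fraction is found by bisection. Function and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def fractional_pack(items, p, capacity=1.0, tol=1e-12):
    """Return a list of bins; each bin is a list of (item_index, fraction)."""
    b = NormalDist().inv_cdf(1.0 - p)
    order = sorted(range(len(items)),
                   key=lambda i: items[i][1] ** 2 / items[i][0], reverse=True)
    bins, mean, var = [[]], 0.0, 0.0  # only the last bin is open

    def load(x, mu, sigma):
        # Effective load of the open bin after adding fraction x of the item.
        return mean + x * mu + b * sqrt(var + x * sigma ** 2)

    for i in order:
        mu, sigma = items[i]
        remaining = 1.0
        while remaining > tol:
            if load(remaining, mu, sigma) <= capacity:
                x = remaining
            else:
                lo, hi = 0.0, remaining  # invariant: load(lo) <= capacity
                for _ in range(60):
                    mid = (lo + hi) / 2
                    if load(mid, mu, sigma) <= capacity:
                        lo = mid
                    else:
                        hi = mid
                x = lo
            if x > tol:
                bins[-1].append((i, x))
                mean, var = mean + x * mu, var + x * sigma ** 2
                remaining -= x
            if remaining > tol:  # the open bin is full: start a new one
                bins.append([])
                mean, var = 0.0, 0.0
    return [bn for bn in bins if bn]

frac_items = [(0.3, 0.1), (0.2, 0.02), (0.4, 0.05), (0.1, 0.08), (0.25, 0.03)]
fractional_bins = fractional_pack(frac_items, p=0.05)
```

Because at most one item is split at each bin boundary, every bin except the last is filled to capacity up to the bisection tolerance.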

Slide 20: Analysis
Lemma: There exists a feasible solution to the MP with the following property: for any pair of items $k, l$ and any pair of bins $i < j$, if $x_{kj} > 0$ and $x_{li} > 0$, then $d_l \ge d_k$. Here $x_{li}$ denotes the fraction of item $l$ assigned to bin $i$, and $d_i = \sigma_i^2 / \mu_i$ is the variance-to-mean ratio (VMR) of item $i$.
Observation: The fractional algorithm produces a feasible fractional solution to the MP.
• This implies that collocating items with high VMR (bursty items) minimizes the total effective size of the items.

Slide 21: Proof Outline
• Consider a feasible solution to the MP with a lexicographically maximal standard deviation (STD) vector of the bins $S = (S_1, \dots, S_m)$, where $S_j$ is the standard deviation of bin $j$.
• Assume, for contradiction, that the items are not packed into the bins in non-increasing order of VMR.
• Then there exists at least one pair of items that are not placed in this order (i.e., the item with the smaller VMR is packed into a bin with a smaller index than the other item).
• We show that fractions of these items can be exchanged between the bins such that
  – the new solution is feasible, and
  – the STD vector of the bins in the new solution is lexicographically greater than the one in the original solution.
• This is a contradiction.

Slide 22: Online Algorithm
• Items are partitioned by VMR into classes: class 0, classes $1 \le k \le C$, and class $C+1$. [The class boundaries are missing in the transcript.]

Slide 23: Online Algorithm
Algorithm 3:
• Classify the next item according to the VMR classes.
• Place the item in the first bin of its class into which it can be feasibly packed.
• If no such bin exists, open a new bin to pack this item.
Theorem 2: Algorithm 3 is a (2+O(ε))-competitive algorithm for online SBP with normal variables.
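
The class-restricted first fit can be sketched in Python as below. The VMR class boundaries are not preserved in the transcript, so the `thresholds` list used here is a purely hypothetical stand-in, as are the class and method names; only the "first fit within the item's class" structure follows the slide. Unit bin capacity and the normal-quantile feasibility test are assumed.

```python
from bisect import bisect_right
from math import sqrt
from statistics import NormalDist

class ClassBasedFirstFit:
    """Online packer: first fit restricted to bins of the item's VMR class."""

    def __init__(self, p, thresholds, capacity=1.0):
        self.b = NormalDist().inv_cdf(1.0 - p)
        self.thresholds = thresholds  # hypothetical VMR class boundaries
        self.capacity = capacity
        self.bins = {}                # class id -> list of [mean, variance, count]

    def place(self, mu, sigma):
        """Pack one arriving item; returns (class id, bin index within class)."""
        cls = bisect_right(self.thresholds, sigma ** 2 / mu)
        class_bins = self.bins.setdefault(cls, [])
        for idx, state in enumerate(class_bins):
            mean, var = state[0] + mu, state[1] + sigma ** 2
            if mean + self.b * sqrt(var) <= self.capacity:
                state[0], state[1], state[2] = mean, var, state[2] + 1
                return cls, idx
        class_bins.append([mu, sigma ** 2, 1])
        return cls, len(class_bins) - 1

    def bin_count(self):
        return sum(len(v) for v in self.bins.values())

packer = ClassBasedFirstFit(p=0.05, thresholds=[0.005, 0.02])
arrivals = [(0.3, 0.1), (0.2, 0.02), (0.4, 0.05), (0.1, 0.08), (0.25, 0.03)]
placements = [packer.place(mu, sigma) for mu, sigma in arrivals]
```

Items are never revisited after placement, which is what makes the procedure usable online; the price in this small example is one bin per non-empty class even though the total load is low.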

Slide 24: Simulation Study
• Compare our proposed algorithms to previously reported ones.
• Data set
  – A real trace from a production data center, used to compute the mean and standard deviation of the bandwidth consumption of 6000 VMs over a period of a few hours.
  – Synthetic traces with statistical properties similar to those of the real traces.

Slide 25: Algorithms
• Algorithms 1-3
• First Fit (FF) with deterministic item sizes $\mu_i + \beta\sigma_i$
• First Fit Decreasing (FFD) with deterministic item sizes $\mu_i + \beta\sigma_i$
• Group Packing (GP) [Wang et al. 2011]
For the online algorithms (Algorithm 3 and Group Packing), we set ε = 0.1.
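
The two deterministic baselines can be sketched in Python as follows, assuming unit bin capacity and $\beta = \Phi^{-1}(1-p)$; the function name and the sample items are hypothetical:

```python
from statistics import NormalDist

def first_fit_deterministic(items, p, decreasing=False, capacity=1.0):
    """Classical first fit on sizes mu + beta * sigma; returns the bin loads."""
    b = NormalDist().inv_cdf(1.0 - p)
    sizes = [mu + b * sigma for mu, sigma in items]
    if decreasing:  # FFD: consider the largest size first
        sizes.sort(reverse=True)
    loads = []
    for size in sizes:
        for j, used in enumerate(loads):
            if used + size <= capacity:
                loads[j] = used + size
                break
        else:
            # No open bin has room: open a new one.
            loads.append(size)
    return loads

baseline_items = [(0.3, 0.1), (0.2, 0.02), (0.4, 0.05), (0.1, 0.08), (0.25, 0.03)]
ff_loads = first_fit_deterministic(baseline_items, p=0.05)
ffd_loads = first_fit_deterministic(baseline_items, p=0.05, decreasing=True)
```

Both baselines treat each item's safety margin as part of its fixed size, so they never benefit from pooling variances inside a bin.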

Slide 26: Real Instance [figure; series labels: Online, Approx., L.B]

Slide 27: Real Instance [figure; series labels: L.B, Approx., Online]

Slide 28: Real Instance [figure; series labels: L.B, Approx., Online]

Slide 29: Online Algorithms
• Large synthetic instances [figure; annotations: 8%, 9%]

Slide 30: Summary
• We studied SBP under the assumption that the bandwidth demands of virtual machines are normally distributed.
• We showed a 2-approximation algorithm.
• We showed a (2+ε)-competitive algorithm.
• We observed the existence of a dual PTAS for SBP.
• We studied the performance and applicability of our algorithms using synthetic and real data.
• The performance evaluation showed that our proposed algorithms considerably reduce the number of bins compared to the best known algorithms for the problem.