Achieving Elasticity for Cloud MapReduce Jobs Khaled Salah IEEE CloudNet 2013 – San Francisco November 13, 2013.

1 Achieving Elasticity for Cloud MapReduce Jobs
Khaled Salah, khaled.salah@kustar.ac.ae
IEEE CloudNet 2013, San Francisco, November 13, 2013

2 Outline
- Background and motivation
- Use cases of our analytical model
- Analytical model
- Derived performance metrics
- Numerical results
- Conclusions and future work

3 Background and Motivation
- MapReduce (MR) is a popular paradigm for parallelizing large-scale data processing on cloud clusters.
- The MR paradigm is a key enabler for Big Data analytics.
- Example MR jobs: web search engine requests.
- In cloud computing, a critical research problem is how to achieve elasticity for MR jobs as workload conditions change over time.

4 Elasticity
- Elasticity is how fast the cloud responds (or autoscales) to a given workload to reach perfect capacity.
  - Overprovisioning: D(t) < R(t)
  - Underprovisioning: D(t) > R(t)
  - Perfect provisioning: D(t) = R(t)
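The three provisioning cases on this slide can be sketched as a small classifier. This is only an illustration: it assumes D(t) is the demanded capacity and R(t) is the provisioned capacity at time t (the slide does not define them, but the inequalities are consistent with that reading). The function name and the tolerance parameter are hypothetical.

```python
def provisioning_state(demand: float, resources: float, tol: float = 0.0) -> str:
    """Classify capacity at one instant, assuming demand = D(t), resources = R(t)."""
    if resources - demand > tol:
        # D(t) < R(t): more capacity than the workload needs
        return "overprovisioned"
    if demand - resources > tol:
        # D(t) > R(t): the workload needs more capacity than is available
        return "underprovisioned"
    # D(t) = R(t), within the tolerance
    return "perfect"


if __name__ == "__main__":
    for d, r in [(3, 5), (5, 3), (4, 4)]:
        print(d, r, provisioning_state(d, r))
```

Under this reading, an autoscaler would aim to keep the system in the "perfect" state and to return to it quickly after each workload change.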

5 MapReduce Jobs

6 Usefulness of our model (1/2)
- In elasticity and autoscaling: given workload conditions, we can estimate the number of VMs required to meet the SLO delay requirements
  - Not by trial and error
  - CPU utilization can be a misleading indicator
- Determine the number of slave nodes required to execute MR jobs
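The idea of sizing the cluster from a delay model, rather than by trial and error, can be sketched as a search for the smallest VM count whose predicted delay meets the SLO. The delay model below is a toy stand-in (an M/M/1 queue whose service rate grows linearly with the number of VMs), not the model from this talk; all names and default rates are hypothetical.

```python
from typing import Callable, Optional


def min_vms_for_slo(delay_fn: Callable[[int], float],
                    slo_delay: float,
                    max_vms: int = 1000) -> Optional[int]:
    """Return the smallest VM count whose predicted delay meets the SLO, or None."""
    for n_vms in range(1, max_vms + 1):
        if delay_fn(n_vms) <= slo_delay:
            return n_vms
    return None


def toy_delay(n_vms: int, arrival_rate: float = 50.0, per_vm_rate: float = 12.0) -> float:
    """Toy M/M/1 mean response time (seconds) with pooled capacity n_vms * per_vm_rate."""
    capacity = n_vms * per_vm_rate
    if capacity <= arrival_rate:
        return float("inf")  # unstable: the queue grows without bound
    return 1.0 / (capacity - arrival_rate)


if __name__ == "__main__":
    print(min_vms_for_slo(toy_delay, slo_delay=0.2))
```

Any delay predictor with the same signature could be passed in place of `toy_delay`, which is the reason the search takes the model as a callable.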

7 Usefulness of our model (2/2)
- In call admission
  - To accept or deny cloud requests based on whether the SLO delay can be met
  - For example, deny requests when the available compute resources are not enough
- Estimating the end-to-end delay for elastic MR jobs

8 Typical Cloud Datacenter Architecture

9 M/G/1/K Queueing Model

10 M/G/1/K

11 Analysis Approach
- The challenge in analyzing such a queueing system is to compute [...] or the PDF of the generally distributed random variable X representing the service times
- The mean service time E[X] [...]
- Then, the second-stage random service time B for these N parallel workers can be expressed as [...]
- E[B] can be expressed as [...]

12 Analysis Approach
- For the Reducer stage, [...]
- E[R] can be expressed as [...]
- Therefore, the mean service time E[X] [...]
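The slide equations for the parallel stage are not preserved in this transcript. As an illustration only, one common way to model a stage of N parallel workers is to take the stage time as the maximum of N independent, exponentially distributed task times with rate mu, whose mean is (1/mu)(1 + 1/2 + ... + 1/N). Whether the talk uses exactly this form is an assumption. The sketch below computes that mean and checks it against a small simulation; all names are hypothetical.

```python
import random


def mean_max_exponential(n_workers: int, rate: float) -> float:
    """Mean of the max of n i.i.d. Exp(rate) times: harmonic number H_n divided by rate."""
    harmonic = sum(1.0 / i for i in range(1, n_workers + 1))
    return harmonic / rate


def simulate_mean_max(n_workers: int, rate: float,
                      trials: int = 20_000, seed: int = 1) -> float:
    """Monte Carlo estimate of the same quantity, for a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The stage finishes only when the slowest of the parallel workers finishes.
        total += max(rng.expovariate(rate) for _ in range(n_workers))
    return total / trials


if __name__ == "__main__":
    print(mean_max_exponential(4, 1.0), simulate_mean_max(4, 1.0))
```

Under this assumption the stage time grows only logarithmically with N for a fixed per-task rate, because the slowest worker dominates.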

13 Performance
- Given:
  - Incoming load
  - JS and service rates for each mapper and reducer
  - Queue size
- Formulas for:
  - Response time
  - Throughput
  - Loss probability
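To make the three metrics concrete, the sketch below computes them for the simpler M/M/1/K queue (exponential service times), which has well-known closed forms. This is a swapped-in simplification, not the M/G/1/K derivation from the talk, whose formulas are not preserved in this transcript. The function and parameter names are hypothetical.

```python
def mm1k_metrics(arrival_rate: float, service_rate: float, k: int) -> dict:
    """Loss probability, throughput, and mean response time of an M/M/1/K queue.

    k is the system size (requests in service plus requests waiting).
    """
    rho = arrival_rate / service_rate
    if abs(rho - 1.0) < 1e-12:
        # Special case rho = 1: all k + 1 states are equally likely.
        probs = [1.0 / (k + 1)] * (k + 1)
    else:
        norm = (1.0 - rho ** (k + 1)) / (1.0 - rho)
        probs = [rho ** n / norm for n in range(k + 1)]
    loss = probs[k]                          # an arrival is lost when the system is full
    throughput = arrival_rate * (1.0 - loss)  # rate of accepted requests
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    response_time = mean_in_system / throughput  # Little's law on accepted traffic
    return {"loss": loss, "throughput": throughput, "response_time": response_time}


if __name__ == "__main__":
    print(mm1k_metrics(arrival_rate=8.0, service_rate=10.0, k=100))
```

For a general service-time distribution the state probabilities are no longer geometric, so the talk's M/G/1/K formulas would replace the `probs` computation here.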

14 Numerical Example
- We fix the system size K to 100 requests. We fix [...]
- [...] depends on two factors: (1) m, the number of mappers per node, and (2) the execution speed of each node.
  - If we assume a mapper takes 500 ms to be executed on a single node, and with homogeneous splitting, then [...] ms.

15 Numerical Example
- Similarly, [...] depends on two factors: (1) n, the number of reducers per node, and (2) the execution speed of each node.
  - If we assume a reducer takes 100 ms to be executed on a single node, and with homogeneous splitting, then [...] ms.
- For autoscaling, we assume that the mappers and reducers always autoscale with a ratio of 2:1. That is, one reducer is needed for two mappers, or m = 2n.
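The numerical setup can be sketched as follows. The per-task values on the slides are not preserved, so this sketch assumes that "homogeneous splitting" means the single-node time is divided evenly across the parallel tasks (500/m ms per mapper and 100/n ms per reducer), with two mappers per reducer. The function name and defaults are hypothetical.

```python
def stage_times_ms(n_reducers: int,
                   map_ms: float = 500.0,
                   reduce_ms: float = 100.0,
                   mappers_per_reducer: int = 2):
    """Return (m, per-mapper time, per-reducer time) under even splitting.

    Assumption: work divides evenly, so each of m mappers takes map_ms / m
    and each of n reducers takes reduce_ms / n.
    """
    m = mappers_per_reducer * n_reducers  # 2:1 autoscaling ratio
    return m, map_ms / m, reduce_ms / n_reducers


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        m, t_map, t_reduce = stage_times_ms(n)
        print(f"n={n:2d} m={m:2d} mapper={t_map:.1f} ms reducer={t_reduce:.1f} ms")
```

Tying m to n through the fixed ratio leaves a single scaling knob, which is what makes a delay-versus-workload sweep over cluster sizes straightforward.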

16 Service Delay vs. Workload

20 Concluding Remarks
- We presented an analytical model to estimate the minimum number of cloud resources required for executing MapReduce jobs on the cloud.
- Closed-form solutions were derived for key SLO performance metrics such as response time, blocking probability, and throughput.
- Simulation results validate our analytical model.
- Future work will focus on implementation.

21 Thank you! khaled.salah@kustar.ac.ae

22 Q&A khaled.salah@kustar.ac.ae

