1 Shared Resource Monitoring and Throughput Optimization in Cloud-Computing Datacenters
By: Jaideep Moses, Ravi Iyer, Ramesh Illikkal and Sadagopan Srinivasan
2 Abstract
Datacenters employ server consolidation to maximize the efficiency of platform resource usage, but consolidation impacts application performance.
Focus: use shared resource monitoring to understand resource usage, collect resource usage and performance data, and migrate VMs that are resource-constrained.
Result: improved overall datacenter throughput and improved Quality of Service (QoS).
3 Focus
Monitor and address shared cache contention.
Propose a new optimization metric that captures the priority of each VM and the overall weighted throughput of the datacenter.
Conduct detailed experiments emulating datacenter scenarios, including on-line transaction processing workloads.
Results: monitoring shared resource contention is highly beneficial for managing throughput and QoS in a cloud-computing datacenter environment.
4 Keywords
Benchmarks: TPCC, SPECjAppServer, SPECjbb, PARSEC
Virtualization
LLC: Last Level Cache
Shared cache
CMP: Chip Multiprocessor
Cache contention
VPA: Virtual Platform Architecture
MPI: Misses Per Instruction
IPC: Instructions Per Cycle
5 Outline
Introduction
Background and Motivation
Proposed Approach
Simulation
Related Work
Summary and Conclusions
6 Introduction
Evolved datacenters run a large number of heterogeneous applications within virtual machines on each platform, managed by tools such as vSphere and governed by Service Level Agreements (SLAs).
Key aspects: shared resource monitoring, VM migration, QoS and datacenter throughput.
7 Contribution
A simple methodology that uses cache occupancy monitoring in a shared-cache environment.
A new optimization metric that captures QoS as part of the datacenter's throughput measure.
Detailed experiments emulating datacenter scenarios, resulting in improvements in QoS and throughput.
The work is unique in that it addresses application/VM scheduling in the context of SLAs, manages shared cache occupancy, and uses LLC monitoring to focus on shared cache contention, which has a first-order impact on performance.
9 Background and Motivation
Cloud-computing virtualized datacenters of the future will have machines based on CMP architectures, with multiple cores sharing the same LLC.
We measured the performance of Intel's latest Core 2 Duo platform when running all 26 applications (in Windows XP) from the SPEC CPU2000 benchmark suite, individually and in pair-wise mode.
12 TPCC performance while co-running with other workloads on the same shared LLC
13 Proposed MIMe Approach
Key components:
A mechanism to monitor VM resource usage and identify VMs that suffer from resource contention.
Techniques to identify candidate VMs for migration, based on priorities and behavior, to achieve improved weighted throughput and determinism across priorities.
A metric that quantifies the goodness/efficiency of the datacenter as a weighted throughput measure.
14 MIMe: key components to improve datacenter weighted throughput
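As an illustration of the monitoring component, the sketch below flags VMs whose LLC behavior suggests they are suffering from contention. The `VMStat` fields, the thresholds, and the occupancy/MPI heuristic are illustrative assumptions for this sketch, not the actual implementation described in the slides.

```python
from dataclasses import dataclass

@dataclass
class VMStat:
    name: str
    occupancy_kb: float   # LLC occupancy observed for this VM
    mpi: float            # LLC misses per instruction
    ipc: float            # instructions per cycle

def contended_vms(stats, min_occupancy_kb=512.0, mpi_threshold=0.005):
    """Flag VMs that miss often in the LLC despite holding little of it:
    a small cache footprint combined with a high miss rate suggests the
    VM is being squeezed out of the shared cache by its co-runners.
    Thresholds here are illustrative, not measured values."""
    return [s for s in stats
            if s.occupancy_kb < min_occupancy_kb and s.mpi > mpi_threshold]
```

In this sketch a VM is only flagged when both signals agree, so a cache-insensitive VM with a naturally high miss rate but a large footprint would not trigger a migration.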
17 Identifying VM candidates for migration
Two key factors: the VM's priority, as agreed upon in an SLA, and its behavior, e.g. cache sensitivity.
Example scenarios show that an application like TPCC can exhibit a huge variation in performance depending on the co-scheduled application.
18 The basic algorithm to identify a candidate VM for migration
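One plausible rendering of such a candidate-selection step, assuming per-VM records with hypothetical `priority`, `mpi`, and solo-run `mpi_solo` fields, and an illustrative 1.5x contention threshold (the slides do not specify these details):

```python
def pick_migration_candidate(vms):
    """Among high-priority VMs, pick the one suffering most from cache
    contention, measured as the largest miss-rate inflation over its
    solo (uncontended) run. Returns None if no high-priority VM looks
    contended. Field names and the 1.5x factor are assumptions."""
    high_priority = [v for v in vms if v["priority"] == "high"]
    contended = [v for v in high_priority
                 if v["mpi"] > 1.5 * v["mpi_solo"]]
    if not contended:
        return None
    return max(contended, key=lambda v: v["mpi"])
```

Low-priority VMs are never selected here, which matches the goal on the next slide: migrations only act on behalf of higher-priority VMs.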
19 Goal
After the migrations, no VM of interest with a higher priority runs less efficiently than a VM of lower priority.
The whole process is cyclic, which ensures that workload phase changes or changes in customer SLAs can be addressed with ease.
20 Metric to quantify the efficiency of a datacenter
A common measure is total system IPC.
Benchmarking efforts propose the vConsolidate concept, which uses weights associated with workload performance: a weighted normalized performance metric.
Our new metric incorporates the QoS value into the throughput measure: a QoS-weighted throughput performance metric.
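A minimal sketch of a QoS-weighted throughput score of this kind, assuming each VM carries a hypothetical `qos_weight` plus contended (`ipc`) and solo (`ipc_solo`) IPC measurements; the exact weighting scheme in the paper may differ:

```python
def qos_weighted_throughput(vms):
    """Weighted normalized performance: each VM's IPC is normalized to
    its solo (uncontended) IPC, then scaled by its QoS weight, so
    performance losses on high-priority VMs reduce the datacenter
    score more than equal losses on low-priority VMs."""
    return sum(v["qos_weight"] * v["ipc"] / v["ipc_solo"] for v in vms)
```

Normalizing to solo IPC keeps fast and slow workloads comparable; without it, raw total system IPC would reward packing many naturally high-IPC VMs regardless of their priority.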
21 RESULTS AND ANALYSIS
Simulation-based methodology using CMPSched$im, a parallel multi-core performance simulator.
It utilizes the Pin binary instrumentation system to evaluate the performance of single-threaded, multi-threaded, and multi-programmed workloads on single/multi-core processors, dynamically feeding instructions and memory references to the simulator.
It was modified to be used as a trace-driven simulator, with server workload traces for TPCC, SPECjbb, SPECjAppServer, an indexing workload, and PARSEC.
Result: in the absence of any hardware enforcement mechanisms to control cache occupancy, we have to rely on monitoring information alone to make scheduling decisions.
23 After Migration TPCC IPC and Occupancy with QoS values
24 Effect of minimizing contention for HP applications
25 Mean IPC after VM migration for reducing cache contention for HP applications
26 Mean IPC after VM migration for reducing cache contention for HP applications
27 Experiment Result
Logically clustering identical machines together, then applying the migration policy:
The overall score increases 8% for TPCC workloads.
With SPECjAppServer workloads the increase is 4.5%.
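The clustering step above can be sketched as grouping machines by hardware configuration, so the migration policy only swaps VMs between machines with identical shared-cache resources. The grouping key fields (`sockets`, `llc_kb`) are assumptions for illustration:

```python
from collections import defaultdict

def cluster_identical_machines(machines):
    """Logically group machines with identical hardware configuration
    (here, socket count and LLC size) so the migration policy can swap
    VMs within a cluster without changing the amount of shared cache
    available to any VM."""
    clusters = defaultdict(list)
    for m in machines:
        key = (m["sockets"], m["llc_kb"])
        clusters[key].append(m["name"])
    return dict(clusters)
```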
28 RELATED WORK
Most other studies have focused on a single machine, not on virtualized environments.
Recently, a few studies, such as those by Cherkasova and by Enright Jerger, have focused on cache sharing and better scheduling policies.
We show how identical machines can be logically clustered and, based on VPA monitoring, how the higher-priority applications we care about are always guaranteed more platform resources (cache) than lower-priority applications.
We also propose a new metric that incorporates QoS into the throughput measure.
29 CONCLUSION
Contention in the shared cache is a critical problem in virtualized cloud-computing datacenters.
High-priority applications can suffer if datacenter-level scheduling is not done with cache contention in mind.
We show how the problem can be addressed without waiting for enforcement mechanisms to become available in the shared LLC, using a very simple solution based on a VPA architecture.
30 Future Work
Incorporating memory bandwidth into the VPA architecture.
Scheduling optimizations.
Profiling of VMs to inform scheduling decisions.
Monitoring and enforcement for cache, memory bandwidth, and power can be used together very efficiently.