Tao Zhu1,2, Chengchun Shu1, Haiyan Yu1


Green Scheduling: A Scheduling Policy for Improving the Energy Efficiency of Fair Scheduler. By: Tao Zhu1,2, Chengchun Shu1, Haiyan Yu1. Ensieea Rizwani

Motivation Reducing the energy consumption of data centers is critical to cutting operational costs as well as minimizing their impact on the environment. On one hand, if the performance per watt of servers doesn't improve, power cost could easily overtake hardware cost. On the other hand, CO2 emissions of global data centers will reach 259 million tons by 2020, which will accelerate global warming.

Outline Introduction, Overview, Power conservation, Mechanism, Structure, Simulation and Measurement, Conclusion, Related Work

In the last few years, a lot of effort has been devoted to improving the energy efficiency of data centers. Hardware (efficient building blocks): reference to last presentation. Software techniques: at the software level, improve the energy efficiency of MapReduce. MapReduce has been the dominant framework deployed in data centers for processing large data sets: by 2010, Google processed approximately 1000 PB of data daily using MapReduce [11]; Yahoo had 38000 servers running Hadoop (an open-source implementation of MapReduce) in production [12]. Improving its energy efficiency will therefore help reduce data center energy consumption.

Data Center Fact Servers in data centers are not power-proportional: the energy consumed is not proportional to the work completed. In our experiments, the slave consumes 54.5 W at idle and 87.5 W at peak utilization. Servers reach their peak energy efficiency at peak utilization, and efficiency improves as utilization increases.
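To make the idle-versus-peak figures concrete, here is a minimal sketch that assumes power grows linearly between the two measured points. The linear model and the names `power_watts` and `work_per_watt` are illustrative assumptions; the slides only report the 54.5 W and 87.5 W measurements.

```python
IDLE_WATTS = 54.5   # measured slave power at idle (from the slide)
PEAK_WATTS = 87.5   # measured slave power at peak utilization (from the slide)


def power_watts(utilization: float) -> float:
    """Assumed linear power model between the idle and peak measurements."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return IDLE_WATTS + (PEAK_WATTS - IDLE_WATTS) * utilization


def work_per_watt(utilization: float) -> float:
    """Work delivered (as a fraction of peak throughput) per watt drawn."""
    return utilization / power_watts(utilization)


if __name__ == "__main__":
    for u in (0.1, 0.25, 0.5, 0.75, 1.0):
        print(f"u={u:.2f}  power={power_watts(u):5.1f} W  "
              f"work/W={work_per_watt(u):.5f}")
```

Under this assumed model an idle slave already draws about 62% of its peak power (54.5 / 87.5), so work per watt rises steadily with utilization, which is why keeping slaves busy is the lever the scheduler can pull.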

Management System of HPC MapReduce's energy efficiency is closely tied to its scheduler. We find that fair scheduler outperforms FIFO scheduler in energy efficiency when a CPU-intensive job and an IO-intensive job run simultaneously on the cluster, because fair scheduler achieves better resource utilization by overlapping resource-complementary tasks on slaves. We propose an energy-efficient scheduling policy called green scheduling, which relaxes fairness slightly to create as many opportunities as possible for overlapping resource-complementary tasks. The results show that green scheduling reduces the energy consumption of fair scheduler by 7% to 9%.

We believe the energy saving results from the better resource utilization that fair scheduler achieves by overlapping CPU-intensive and IO-intensive tasks on slaves. The two types of tasks are complementary: an IO-intensive task leaves the CPU idle, so running a CPU-intensive task alongside it increases CPU utilization. The effect on IO is the reverse: a CPU-intensive task leaves IO idle, while an IO-intensive task keeps IO busy.

Simulation to Validate We compare our cluster's CPU and IO utilization under FIFO scheduler and fair scheduler when the CPU-intensive job Pi estimator and the IO-intensive job RandomWriter run simultaneously on it. Experimental results are shown in Figure 1. Under FIFO scheduler, CPU utilization fluctuates between 60% and 100% while IO utilization stays below 10% until job Pi estimator finishes. After job RandomWriter starts, CPU utilization drops dramatically and IO utilization increases significantly. In contrast, fair scheduler keeps both CPU and IO at high utilization for the duration of the two jobs. Clearly, fair scheduler leads to better resource utilization than FIFO scheduler.

Scheduler

Pi estimator

Relaxing Fairness

This motivates us to propose an energy-efficient scheduling policy called green scheduling: when a slave asks for a new task, if the loss of fairness is within a permissible range, our scheduler chooses the job whose resource requirement is the most complementary to the slave's current resource utilization, maximizing the slave's utilization with minimal impact on fairness.
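The policy described above can be sketched roughly as follows. This is not the authors' implementation: the `Job` fields, the `complementarity` score, and the boolean `fairness_loss_ok` flag are all hypothetical stand-ins, since the slides do not say how resource demand or permissible fairness loss are measured.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Job:
    name: str
    cpu_demand: float  # assumed per-task CPU demand, as a fraction of one slave
    io_demand: float   # assumed per-task IO demand, as a fraction of one slave


def complementarity(job: Job, slave_cpu: float, slave_io: float) -> float:
    """Hypothetical score: idle capacity the task can use, minus the demand
    that would contend with tasks already running on the slave."""
    idle_cpu = 1.0 - slave_cpu
    idle_io = 1.0 - slave_io
    used = min(job.cpu_demand, idle_cpu) + min(job.io_demand, idle_io)
    contended = (max(job.cpu_demand - idle_cpu, 0.0)
                 + max(job.io_demand - idle_io, 0.0))
    return used - contended


def pick_job(fair_order: Sequence[Job], slave_cpu: float, slave_io: float,
             fairness_loss_ok: bool) -> Job:
    """Choose the next job for a slave that is asking for a task.

    fair_order is the queue as the fair scheduler would sort it. When relaxing
    fairness is not permissible, the head of that queue is returned unchanged.
    """
    if not fairness_loss_ok:
        return fair_order[0]
    # max() keeps the earliest job on ties, so fair order breaks ties.
    return max(fair_order, key=lambda j: complementarity(j, slave_cpu, slave_io))


if __name__ == "__main__":
    pi = Job("PiEstimator", cpu_demand=0.8, io_demand=0.05)
    writer = Job("RandomWriter", cpu_demand=0.1, io_demand=0.7)
    # A slave whose CPU is busy and whose IO is mostly idle.
    print(pick_job([pi, writer], slave_cpu=0.9, slave_io=0.1,
                   fairness_loss_ok=True).name)
```

With the CPU-bound slave in the example, the IO-heavy job scores higher, so it is chosen even though the fair order lists the CPU-heavy job first.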

Fair Scheduler

Priority The default scheduler in Hadoop is FIFO scheduler. All running jobs are sorted and queued according to their priority and submit time. Five priority levels are defined: very high, high, normal, low, and very low. When a slave is ready to accept a new task, FIFO scheduler always picks the first job in the queue and assigns its required task to the slave. Note: UB Data center CCR implements group priority.

Starvation One drawback of FIFO scheduler is its poor response time. Consider a concrete example: job i is submitted at time t with a duration of 3 days, and job j is submitted at time t+1 with a duration of 10 min. Under FIFO scheduler, the response time of job j is almost 433 times its duration. To address this problem, fair scheduler was proposed, which assigns each job a certain share to avoid starvation.
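The 433x figure can be checked with a small calculation. The sketch below makes two simplifying assumptions that the slide leaves open: "t+1" means one minute after t, and the cluster runs one job at a time in submit order. The function name `fifo_response_times` is illustrative.

```python
MINUTES_PER_DAY = 24 * 60


def fifo_response_times(jobs):
    """Return {name: response_minutes} for jobs run strictly in submit order.

    jobs is a list of (name, submit_minute, duration_minutes). Response time
    is measured from submission to completion.
    """
    clock = 0
    responses = {}
    for name, submit, duration in sorted(jobs, key=lambda job: job[1]):
        start = max(clock, submit)      # wait for the previous job to finish
        clock = start + duration
        responses[name] = clock - submit
    return responses


if __name__ == "__main__":
    jobs = [("i", 0, 3 * MINUTES_PER_DAY), ("j", 1, 10)]
    responses = fifo_response_times(jobs)
    ratio = responses["j"] / 10
    print(f"job j: response {responses['j']} min, {ratio:.1f}x its duration")
```

Job j waits 4319 minutes for job i and then runs for 10, so its response time is 4329 minutes, or about 433 times its own duration.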

Comparison

IV. GREEN SCHEDULING Fair scheduler is often more energy efficient than FIFO scheduler when complementary jobs are running simultaneously on the cluster. However, fair scheduler itself does not take the slave's and tasks' resource utilization into account when scheduling jobs. To investigate opportunities to improve the energy efficiency of fair scheduler, we analyze slot allocation on one slave under FIFO and fair sharing.

D. Green Scheduling To achieve better energy efficiency, green scheduling takes into account the slave's resource utilization and the task's resource utilization when choosing which job should be scheduled next. However, this may violate the primary design goal of fair scheduler: fairness. To minimize the impact on fairness, we consider the slave's resource utilization as a factor in choosing a job in only two scenarios: both jobs are needy, or neither of them is needy. The justification is that the two jobs have received relatively fair shares in these two scenarios. In the scenario where one job is needy and the other is not, the shares the two jobs have received are absolutely unfair. Consequently, relaxing fairness in this scenario would aggravate unfairness.
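The two-scenario rule can be expressed as a job comparator. The sketch below is an illustration under stated assumptions, not the authors' pseudocode: "needy" is taken to mean a job running fewer tasks than its minimum share, `fit` is a hypothetical precomputed complementarity score against the requesting slave, and `deficit` is a hypothetical fairness tie-breaker.

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass
class JobState:
    name: str
    running_tasks: int
    min_share: int   # assumed guaranteed minimum number of slots
    deficit: float   # assumed fairness deficit; larger means more under-served
    fit: float       # assumed complementarity with the requesting slave

    @property
    def needy(self) -> bool:
        # Assumption: a job is needy while it runs below its minimum share.
        return self.running_tasks < self.min_share


def green_compare(a: JobState, b: JobState) -> int:
    """Negative if a should be scheduled before b, positive if after."""
    if a.needy != b.needy:
        # Exactly one job is needy: do not relax fairness, needy job first.
        return -1 if a.needy else 1
    # Both needy or neither needy: shares are relatively fair, so the job
    # that better complements the slave goes first.
    if a.fit != b.fit:
        return -1 if a.fit > b.fit else 1
    # Equal fit: fall back to the fairness deficit.
    if a.deficit != b.deficit:
        return -1 if a.deficit > b.deficit else 1
    return 0


def green_sort(jobs):
    return sorted(jobs, key=cmp_to_key(green_compare))


if __name__ == "__main__":
    jobs = [
        JobState("a", running_tasks=5, min_share=2, deficit=0.0, fit=0.9),
        JobState("b", running_tasks=0, min_share=2, deficit=1.0, fit=0.1),
        JobState("c", running_tasks=1, min_share=2, deficit=0.5, fit=0.8),
    ]
    print([job.name for job in green_sort(jobs)])
```

Job a has the best fit but is not needy, so it cannot jump ahead of the needy jobs b and c; between b and c, both needy, the better-fitting c goes first.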

Pseudo code for comparing naive job sorting

A MapReduce job usually consists of a set of map tasks and reduce tasks. For simplicity, we only consider scheduling map tasks to achieve better utilization.

Green Scheduling algorithm

Conclusion This paper presented a new scheduling policy called green scheduling to improve the energy efficiency of fair scheduler. Knowing the job's resource requirement and the slave's resource utilization, green scheduling can create as many opportunities as possible for overlapping CPU-intensive and IO-intensive tasks. The key insight is that overlapping complementary tasks achieves better energy efficiency as well as better utilization. We perform an evaluation using different workloads that consist of a CPU-intensive job and an IO-intensive job, and the results show that fair sharing with green scheduling reduces energy consumption by 7%-9% compared with naive fair sharing.

Related Work Energy efficiency of Hadoop: Chen et al. [5]. Overlapping CPU-intensive jobs with IO-intensive jobs in scheduling: overlapping a CPU-intensive job with an IO-intensive job leads to better resource utilization, Wiseman et al. [17].

Thank You 