Marcos Dias de Assunção 1,2, Alexandre di Costanzo 1 and Rajkumar Buyya 1 1 Department of Computer Science and Software Engineering 2 National ICT Australia.

Slides:

Advertisements

Similar presentations

Libra: An Economy driven Job Scheduling System for Clusters Jahanzeb Sherwani 1, Nosheen Ali 1, Nausheen Lotia 1, Zahra Hayat 1, Rajkumar Buyya 2 1. Lahore.

Advertisements

GridSim:Java-based Modelling and Simulation of Deadline and Budget-based Scheduling for Grid Computing Rajkumar Buyya and Manzur Murshed Monash University,

Pricing for Utility-driven Resource Management and Allocation in Clusters Chee Shin Yeo and Rajkumar Buyya Grid Computing and Distributed Systems (GRIDS)

Evaluating the Cost-Benefit of Using Cloud Computing to Extend the Capacity of Clusters Presenter: Xiaoyu Sun.

Studies of the User-Scheduler Relationship Cynthia Bailey Lee Advisor: Allan E. Snavely Department of Computer Science and Engineering San Diego Supercomputer.

SLA-Oriented Resource Provisioning for Cloud Computing

1 GridSim 2.0 Adv. Grid Modelling & Simulation Toolkit Rajkumar Buyya, Manzur Murshed (Monash), Anthony Sulistio, Chee Shin Yeo Grid Computing and Distributed.

Anthony Sulistio 1, Kyong Hoon Kim 2, and Rajkumar Buyya 1 Managing Cancellations and No-shows of Reservations with Overbooking to Increase Resource Revenue.

Towards Provision of Quality of Service Guarantees in Job Scheduling Mohammad IslamPavan Balaji P. SadayappanD. K. Panda Computer Science and Engineering.

Virtual Machine Usage in Cloud Computing for Amazon EE126: Computer Engineering Connor Cunningham Tufts University 12/1/14 “Virtual Machine Usage in Cloud.

Opportune Job Shredding: An Efficient Approach for Scheduling Parameter Sweep Applications Rohan Kurian, Pavan Balaji, P. Sadayappan The Ohio State University.

Scheduling of parallel jobs in a heterogeneous grid environment Scheduling of parallel jobs in a heterogeneous grid environment Each site has a homogeneous.

Service Level Agreement based Allocation of Cluster Resources: Handling Penalty to Enhance Utility Chee Shin Yeo and Rajkumar Buyya Grid Computing and.

Green Cloud Computing Hadi Salimi Distributed Systems Lab, School of Computer Engineering, Iran University of Science and Technology,

Managing Risk of Inaccurate Runtime Estimates for Deadline Constrained Job Admission Control in Clusters Chee Shin Yeo and Rajkumar Buyya Grid Computing.

Senior Design Project: Parallel Task Scheduling in Heterogeneous Computing Environments Senior Design Students: Christopher Blandin and Dylan Machovec.

Present By : Bahar Fatholapour M.Sc. Student in Information Technology Mazandaran University of Science and Technology Supervisor:

On Economics and the User-Scheduler Relationship in HPC and Grid Systems Cynthia Bailey Lee Advisor: Allan E. Snavely Department of Computer Science and.

Is 99% Utilization of a Supercomputer a Good Thing? Scheduling in Context: User Utility Functions Cynthia Bailey Lee Department of Computer Science and.

An Introduction to Cloud Computing. The challenge Add new services for your users quickly and cost effectively.

Cloud Computing Systems Lin Gu Hong Kong University of Science and Technology Sept. 21, 2011 Windows Azure—Overview.

Chapter 2 Computer Clusters Lecture 2.1 Overview.

HeteroPar 2013 Optimization of a Cloud Resource Management Problem from a Consumer Perspective Rafaelli de C. Coutinho, Lucia M. A. Drummond and Yuri Frota.

PhD course - Milan, March /09/ Some additional words about cloud computing Lionel Brunie National Institute of Applied Science (INSA) LIRIS.

Integrated Risk Analysis for a Commercial Computing Service Chee Shin Yeo and Rajkumar Buyya Grid Computing and Distributed Systems (GRIDS) Lab. Dept.

Introduction To Windows Azure Cloud

Simulation of Cloud Environments

OPTIMAL SERVER PROVISIONING AND FREQUENCY ADJUSTMENT IN SERVER CLUSTERS Presented by: Xinying Zheng 09/13/ XINYING ZHENG, YU CAI MICHIGAN TECHNOLOGICAL.

Ruppa K. Thulasiram Slide 1/24 Resource Provisioning Policies to Increase IaaS Provider’s Profit in a Federated Cloud Environment Adel Nadjaran Toosi *,

Scheduling of Parallel Jobs In a Heterogeneous Multi-Site Environment By Gerald Sabin from Ohio State Reviewed by Shengchao Yu 02/2005.

Resource Provisioning based on Lease Preemption in InterGrid Mohsen Amini Salehi, Bahman Javadi, Rajkumar Buyya Cloud Computing and Distributed Systems.

A Broker for Cost-efficient QoS aware Resource Allocation in EC2. Kurt Vermeersch Coordinator: Kurt Vanmechelen.

Fault-Tolerant Workflow Scheduling Using Spot Instances on Clouds Deepak Poola, Kotagiri Ramamohanarao, and Rajkumar Buyya Cloud Computing and Distributed.

Meta Scheduling Sathish Vadhiyar Sources/Credits/Taken from: Papers listed in “References” slide.

1 Time & Cost Sensitive Data-Intensive Computing on Hybrid Clouds Tekin Bicer David ChiuGagan Agrawal Department of Compute Science and Engineering The.

1 520 Student Presentation GridSim – Grid Modeling and Simulation Toolkit.

1 A Framework for Data-Intensive Computing with Cloud Bursting Tekin Bicer David ChiuGagan Agrawal Department of Compute Science and Engineering The Ohio.

1 Challenge the future KOALA-C: A Task Allocator for Integrated Multicluster and Multicloud Environments Presenter: Lipu Fei Authors: Lipu Fei, Bogdan.

Grid Computing at The Hartford Condor Week 2008 Robert Nordlund

The Owner Share scheduler for a distributed system 2009 International Conference on Parallel Processing Workshops Reporter: 李長霖.

 Apache Airavata Architecture Overview Shameera Rathnayaka Graduate Assistant Science Gateways Group Indiana University 07/27/2015.

Cost-effective Cloud HPC Resource Provisioning by Building Semi-Elastic Virtual Clusters Shuangcheng Niu 1, Jidong Zhai 1, Xiaosong Ma 2,3 Xiongchao Tang.

Power-Aware Parallel Job Scheduling

Rassul Ayani 1 Performance of parallel and distributed systems  What is the purpose of measurement?  To evaluate a system (or an architecture)  To compare.

Green Computing Metrics: Power, Temperature, CO2, … Computing system: Many-cores, Clusters, Grids and Clouds Algorithm and model: task scheduling, CFD.

Performance Analysis of Preemption-aware Scheduling in Multi-Cluster Grid Environments Mohsen Amini Salehi, Bahman Javadi, Rajkumar Buyya Cloud Computing.

Job Scheduling P. (Saday) Sadayappan Ohio State University.

QoPS: A QoS based Scheme for Parallel Job Scheduling M. IslamP. Balaji P. Sadayappan and D. K. Panda Computer and Information Science The Ohio State University.

1 Hidra: History Based Dynamic Resource Allocation For Server Clusters Jayanth Gummaraju 1 and Yoshio Turner 2 1 Stanford University, CA, USA 2 Hewlett-Packard.

Scheduling MPI Workflow Applications on Computing Grids Juemin Zhang, Waleed Meleis, and David Kaeli Electrical and Computer Engineering Department, Northeastern.

Possibilities for joint procurement of commercial cloud services for WLCG WLCG Overview Board Bob Jones (CERN) 28 November 2014.

Capacity Planning in a Virtual Environment Chris Chesley, Sr. Systems Engineer

Cloudsim: simulator for cloud computing infrastructure and modeling Presented By: SHILPA V PIUS 1.

Universidade Federal do Ceará FOLE: A Framework for Elasticity Performance Evaluation in Cloud Computing Systems Emanuel F. Coutinho Group of Computer.

KAASHIV INFOTECH – A SOFTWARE CUM RESEARCH COMPANY IN ELECTRONICS, ELECTRICAL, CIVIL AND MECHANICAL AREAS

2004 Queue Scheduling and Advance Reservations with COSY Junwei Cao Falk Zimmermann C&C Research Laboratories NEC Europe Ltd.

INTRODUCTION TO GRID & CLOUD COMPUTING U. Jhashuva 1 Asst. Professor Dept. of CSE.

Claudio Grandi INFN Bologna Virtual Pools for Interactive Analysis and Software Development through an Integrated Cloud Environment Claudio Grandi (INFN.

1 Performance Impact of Resource Provisioning on Workflows Gurmeet Singh, Carl Kesselman and Ewa Deelman Information Science Institute University of Southern.

Cloud Technology and the NGS Steve Thorn Edinburgh University (Matteo Turilli, Oxford University)‏ Presented by David Fergusson.

OpenPBS – Distributed Workload Management System

Provisioning 160,000 cores with HEPCloud at SC17

Job Scheduling in a Grid Computing Environment

Geographical Load Balancing for Sustainable Cloud Data Centers

WLCG Collaboration Workshop;

CLUSTER COMPUTING.

P. (Saday) Sadayappan Ohio State University

A Characterization of Approaches to Parrallel Job Scheduling

Creating a Dynamic HPC Infrastructure with Platform Computing

Using and Building Infrastructure Clouds for Science

Presentation transcript:

Marcos Dias de Assunção 1,2, Alexandre di Costanzo 1 and Rajkumar Buyya 1 1 Department of Computer Science and Software Engineering 2 National ICT Australia (NICTA) Victoria Research Laboratory The University of Melbourne

2  Maturity of virtual machines, virtualised storage and Web technologies  Software, Platform and Infrastructure  Emergence of commercial infrastructure managed by virtual machine technologies ◦ Amazon EC2

 Use of resources in a pay as you go manner  Web Services APIs and command line tools  Environments can scale on demand  Start-ups can avoid initial outlays for computing capacity  Organisations may have existing computing infrastructure ◦ How to scale out to the Cloud? 3

 Evaluation of using a commercial provider to extend the capacity of a local cluster  Different provisioning strategies may yield different ratios of performance improvement to money spent using resources from the Cloud 4 Scheduler Local computing cluster Cloud provider Requests Redirect requests according to the strategies VM Scheduling strategy Redirection strategy Request duration Number of VMs required Strategy set

 Conservative and Aggressive  Selective ◦ Requests are given reservations if they have waited long enough in the queue ◦ Long enough is determined by the requests’ eXpansion Factor:  Xfactor = (wait time+runtime)/run time ◦ The threshold is given by the average slowdown of previously completed requests ◦ Use of Adaptive-Selective-Backfilling* * S. Srinivasan, R. Kettimuthu, V. Subramani and P. Sadayappan, Selective Reservation Strategies for Backfill Job Scheduling, 8th International Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP '02), pp ,

 Naïve: ◦ Use commercial provider when the request cannot start immediately on local cluster  Shortest Queue: ◦ Aggressive backfilling ◦ Compute number of VMs required by requests in the queue ◦ Redirect request if commercial provider’s number is smaller  Weighted Queue: ◦ Number of VMs that can be borrowed from commercial provider is the number of VMs required by requests minus VMs in use  Selective ◦ When the request’s xFactor exceeds the threshold, the scheduler makes a reservation at the place that yields the smallest slowdown 6

 Simulation of two-month-long periods  SDSC Blue Horizon machine with 144 nodes ◦ Number of VMs  Price of a virtual machine per hour ◦ Amazon EC2’s small instance: US$0.10 ◦ Network and storage are not considered  Values are averages of 5 simulation rounds 7

 Average Weighted Response Time (AWRT) of site k: ◦ τ k : requests submitted to site k ◦ p j : the runtime of request j ◦ m j : the number of processors required by request j ◦ ct j : request j’s completion time ◦ st j : if the submission time of request j  Performance Improvement Cost of a strategy set st: 8

9 U. Lublin and D. G. Feitelson, The Workload on Parallel Supercomputers: Modeling the Characteristics of Rigid Jobs, Journal of Parallel and Distributed Computing, Vol. 63, n. 11, pp , 2003

 Users may have stringent requirements on when the virtual machines are required  Deadline constrained requests have: ◦ Ready time ◦ Duration ◦ Deadline  Cost of using Cloud resources used to meet requests’ deadlines and decrease the number of deadline violations and request rejections 10

 Conservative ◦ Places a request where it achieves the best start time ◦ If rejections are allowed and deadline cannot be met, reject the request  Aggressive ◦ Builds the schedule using aggressive backfilling * and Earliest Deadline First ◦ If request deadlines are broken in the local cluster, try the commercial provider ◦ If rejections are allowed and deadlines are broken, reject the request 11 *G. Singh, C. Kesselman and E. Deelman, Adaptive Pricing for Resource Reservations in Shared Environments, In 8th IEEE/ACM International Conference on Grid Computing (Grid 2007), pp , Austin, 2007.

 The non-violation cost is given by:  Where: ◦ Amount_spent st : amount spent with Cloud resources ◦ viol base : the number of deadline violations under the base strategy set ◦ viol st : the number of deadline violations under the evaluated strategy set 12

13  SDSC Blue Horizon’s trace divided into two- month-long intervals  We vary the % of requests with deadlines  Stringency factors of 0.9, 1.3 and 1.7

 SDSC Blue Horizon’s trace  We vary the % of requests with deadlines  Stringency factors of 0.9, 1.3 and

MetricNaïveShortest Queue Weighted Queue Selective Amount spent with VMs ($) Number of VM/Hours AWRT (improvement) Req. slowdown (improvement)  SDSC Blue Horizon’s trace divided into two-month-long intervals 15

 Scheduling policies can yield different ratios of performance improvement to money spent ◦ Naïve policy has a higher performance improvement cost  Selective policy provides a good ratio of money spent to job slowdown improvement  Using commercial provider to meet job deadlines ◦ Less than $3,000 were spent to keep the number of rejections close to zero 16

 Scheduling strategy that strikes a balance between money spent and performance improvement  Use of the Cloud to handle peak demands  Experiments with the real system ◦ Applications that can benefit from using local and remote resources ◦ Consider other resources such as storage and network 17

Questions & Answers