1 Linger-Longer: Fine-Grain Cycle Stealing in Networks of Workstations
Kyung Dong Ryu, University of Maryland
© Copyright 2000, Kyung Dong Ryu, All Rights Reserved.

2 Outline
– Introduction & Motivation
– Fine-Grained Cycle Stealing
– Cluster Simulation
– Implementation Mechanisms
– Evaluation
– Conclusion and Future Work

3 Harvest Idle Cycles
Networks of Workstations
– Workstations have high processing power
– Connected via high-speed networks (100 Mbps+)
– Long idle times (50-60%) and low resource usage
Traditional Approach (Coarse-Grained)
– Goal: run resource-intensive programs during idle periods
– While the owner is away: send a guest job and run it
– When the owner returns: stop the job and migrate it away
– Examples: Condor and the NOW system
Can we improve on this?

4 Available Cycles on a "Busy" Workstation
When a user is active
– CPU usage is low (10% or less for 75% of the time)
– a low-priority process could use these cycles

5 How Much Memory Is Available?
90% of the time, at least 14 MBytes is available.

6 Fine-Grain Cycle Stealing
Goals
– Harvest more available resources by also using non-idle machines
– Limit the slowdown of local jobs
Technique
– Lower the guest job's resource priority, so the guest job can linger on a non-idle node
– Migration is now optional: migrate the guest job only to speed up the program
Local Scheduling
– The guest job should use only the remaining resources
– Without disruption to the machine owner
– Needs strict priority scheduling for resources

7 Simulation: Generating the Local Workload
– Coarse-grain usage: traces (sampled every 2 sec)
– Fine-grain run/idle bursts: stochastic model, with fine-grain trace data supplying the model parameters

8 Simulation Study: Migration Policies
Immediate Eviction (IE)
– when a user returns, migrate the job
– policy used by Berkeley NOW
– assumes a free workstation is available, or no penalty to stop the job
Pause and Migrate (PM)
– when a user returns, stop and wait, then migrate the job
– used by Wisconsin Condor
Linger Longer (LL)
– when the user returns, decrease priority and remain; monitor the situation to decide when to migrate
– permits fine-grained cycle stealing
Linger Forever (LF)
– like Linger Longer, but never migrate
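The four policies differ only in how the guest job reacts when the workstation owner returns. A minimal Python sketch of that decision (function and action names are illustrative, not taken from the actual simulator):

```python
def on_owner_return(policy):
    """Actions a guest job takes when the local user becomes active again."""
    if policy == "IE":    # Immediate Eviction (Berkeley NOW)
        return ["migrate"]
    if policy == "PM":    # Pause and Migrate (Wisconsin Condor)
        return ["pause", "migrate"]
    if policy == "LL":    # Linger Longer: drop priority, decide later
        return ["lower_priority", "monitor"]
    if policy == "LF":    # Linger Forever: drop priority, never migrate
        return ["lower_priority"]
    raise ValueError(f"unknown policy: {policy}")
```

Note that LL and LF share the first step (lower the guest's priority); LL additionally keeps monitoring so it can still migrate if lingering proves too slow.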

9 Simulation of Policies
Model each workstation as
– a host process (high priority): requests the CPU, then blocks; a hybrid of trace-based data and a stochastic model
– a guest process (low priority): always ready to run, with a fixed total CPU time
– context switches (100 microseconds each): accounts for both direct state and cache re-load
Study
– What is the benefit of lingering?
– How much will lingering slow foreground processes?
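The workstation model above can be sketched as a loop over host run/idle bursts: the low-priority guest runs only while the host process is blocked, and each idle burst costs two of the stated 100-microsecond context switches (one into the guest, one back). This is an illustrative toy model, not the dissertation's simulator:

```python
CSWITCH = 100e-6  # 100 microseconds per context switch (state + cache re-load)

def guest_cpu_time(bursts, guest_demand):
    """bursts: list of (host_run, host_idle) durations in seconds.
    Returns the CPU time the guest obtains before exhausting its
    fixed total demand."""
    got = 0.0
    for _run, idle in bursts:
        usable = max(0.0, idle - 2 * CSWITCH)  # switch in and out of the guest
        got += min(usable, guest_demand - got)
        if got >= guest_demand:
            break
    return got
```

For example, one half-second idle burst yields 0.4998 s of guest CPU time after the switch overhead, which is how the model charges lingering its cost.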

10 Cluster Performance: Sequential Jobs
Metrics
– Average Job Time: mean time of jobs from submit to exit
– Variation: std. dev. of job execution time (start of execution to finish)
– Family Time: completion time of the last job in the family of processes
– Throughput: mean CPU time used by foreign jobs when the number of jobs in the system is held constant
Guest Workload
– workload 1: 128 jobs, 600 sec CPU time each
– workload 2: 16 jobs, 1800 sec CPU time each
– all jobs submitted at once
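The first three metrics follow directly from per-job (submit, start, end) timestamps. A hypothetical helper (the simulator's real code is not shown in the slides; population standard deviation is an assumption):

```python
from statistics import mean, pstdev

def job_metrics(jobs):
    """jobs: list of (submit, start, end) timestamps in seconds.
    Returns (average job time, variation, family time) as defined above."""
    avg_job_time = mean(end - submit for submit, _start, end in jobs)
    variation = pstdev(end - start for _submit, start, end in jobs)
    family_time = max(end for _submit, _start, end in jobs)
    return avg_job_time, variation, family_time
```

Note the asymmetry: Average Job Time includes queueing delay (submit to exit), while Variation is measured over execution time only (start to finish).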

11 Cluster Performance: Sequential Jobs (Results)
– LF is fastest, but its variation is higher than LL's (slowest 5% of jobs: LL 921 sec, LF 1233 sec)
– LF is a 60% improvement over the PM policy
– LL and LF are as good as IE or PM on workload 2
– Slowdown for foreground jobs is under 1%

12 Parallel DSM Applications
Simulated three Splash benchmark applications
– Used the CVM simulator plus the Linger-Longer simulator (LF policy)
– Varied the local load by selecting workload parameters
– Each application ran on eight nodes
(Figures compare runs with 1, 4, and 8 non-idle nodes.)

13 Outline
– Introduction & Motivation
– Fine-Grained Cycle Stealing
– Cluster Simulation
– Implementation Mechanisms
– Evaluation
– Conclusion and Future Work

14 Local Scheduling Mechanisms
Constraints
– The guest job should use only the remaining resources: CPU cycles, memory, and I/O bandwidth
– Without disruption to the machine owner
Supporting very low resource priority
– Is it currently available?
– How hard is it to extend the kernel?

15 Is Unix "nice" Sufficient?
CPU priority is not strict
– run two empty-loop processes (guest: nice 19)
There is no memory priority
– run two large-memory-footprint processes
The answer is no => extend the kernel

16 Starvation-Level CPU Scheduling
Original Linux CPU scheduler
– Run-time scheduling priority is derived from the nice value and the remaining time quanta: T_i = 20 - nice_level + 1/2 * T_{i-1}
– So it is still possible to schedule niced processes
Modified Linux CPU scheduler
– If runnable host processes exist, schedule the host process with the highest priority
– Only when no host process is runnable, schedule a guest process
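The quanta recurrence and the starvation-level selection rule can be sketched in Python; this models the logic described on the slide, not the actual kernel patch (integer halving of the previous quanta is an assumption):

```python
def quantum(nice_level, prev_quantum):
    """Recompute a process's time quanta per the slide's recurrence:
    T_i = 20 - nice_level + 1/2 * T_{i-1}."""
    return 20 - nice_level + prev_quantum // 2

def pick_next(runnable):
    """Starvation-level selection: a guest process is chosen only when no
    host process is runnable. runnable: list of (is_guest, priority)."""
    hosts = [p for p in runnable if not p[0]]  # host = not guest
    pool = hosts if hosts else runnable        # guests run only if no hosts
    return max(pool, key=lambda p: p[1], default=None)
```

The recurrence shows why plain "nice" is insufficient: a nice-19 process still accumulates positive quanta (e.g. quantum(19, 0) = 1), so the original scheduler will eventually run it even while host processes are runnable; the modified selection rule removes that possibility entirely.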

17 Prioritized Page Replacement
New page replacement algorithm for main memory pages (the original is based only on LRU)
– No limit on taking free pages
– High Limit: the maximum number of pages the guest can hold; above it, replacement favors the host job
– Low Limit: the minimum number of pages the guest can hold; below it, replacement favors the guest job
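The two-limit rule can be sketched as a function choosing where the replacement scan takes its victim. The behavior between the two limits (falling back to plain global LRU) and all names here are assumptions for illustration:

```python
def replacement_target(guest_pages, high_limit, low_limit):
    """Decide which job's pages the replacement scan should victimize,
    given how many pages the guest job currently holds."""
    if guest_pages > high_limit:
        return "guest"  # guest exceeds its maximum share: reclaim guest pages
    if guest_pages < low_limit:
        return "host"   # guest is at its protected minimum: reclaim host pages
    return "lru"        # between the limits: ordinary global LRU
```

The effect is that a memory-hungry guest can never push the host's working set out (High Limit), while a modest guest keeps enough pages to make progress (Low Limit), which is what keeps host slowdown near zero in the micro tests.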

18 Micro Test (2)
Large-memory-footprint job revisited
– with the modified CPU scheduler and page replacement
– each job takes 82 sec to run in isolation
– Host-then-guest: host job delay reduced from 8% to 0.8%
– Guest-then-host: the execution is serialized; the scheduler switches from the guest job to the host job, and the guest job resumes after the host job finishes

19 Application Evaluation: Setup
Experiment environment
– Linux PC cluster: 8 Pentium II PCs running Linux, connected by a 1.2 Gbps Myrinet
Local workload for host jobs
– Emulates an interactive local user (MUSBUS interactive workload benchmark)
– Typical programming environment: coding, compiling, and debugging
Guest jobs
– DSM parallel applications (CVM): SOR, Water, and FFT
Metrics
– Guest job performance and host workload slowdown

20 Application Evaluation: Host Slowdown
Ran DSM parallel applications
– 3 host workloads: 7%, 13%, and 24% CPU usage
– Measured the host workload slowdown
– With equal priority: significant slowdown, increasing with load
– With linger priority: no slowdown

21 Conclusions & Future Work
More efficient use of available resources
– 60% increase in throughput
– Local job slowdown still very limited: a few percent
Potential for parallel jobs
– Acceptable slowdown with linger processes
– Performs better than reconfiguration in some cases
Simple kernel modifications improve isolation
– CPU scheduling: guest process priority
– Memory priority: lower/upper limits on guest memory pages
Current work
– I/O & communication prioritization
– Global (cluster) scheduling policies
– Integration with Condor
– Evaluation of the prototype implementation