
Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher
Tal Ben-Nun (School of Computer Science & Engineering, Hebrew University), Yoav Etsion (Computer Sciences Dept., Barcelona Supercomputing Center), Dror Feitelson (School of Computer Science & Engineering, Hebrew University)
Supported by the Israel Science Foundation, grant no. 28/09

Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher
The goal is to control the share of resources, not to optimize performance – important in virtualization
The same module is used for diverse resources
Mechanism: dispatch the most deserving client at each instant
The deserving client is selected using a virtual-time formalism
Implemented and measured in Linux

Motivation
Context: a VMM used for server consolidation
– Multiple legacy servers share a physical platform
– Improved utilization and easier maintenance
– Flexibility in allocating resources to virtual machines
– Virtual machines typically run a single application (“appliances”)

Motivation
Assumed goal: enforce a predefined allocation of resources to the different virtual machines (“fair-share” scheduling)
– Based on importance / SLA
– Can change with time or due to external events
Problem: what is “30% of the resources” when there are many different resources and diverse requirements?

Global Scheduling
“Fair share” is usually applied to a single resource
– But what if this resource is not the bottleneck?
The global scheduling idea:
1) Identify the system’s bottleneck resource
2) Apply fair-share scheduling to this resource
3) This induces appropriate allocations on the other resources
This paper: how to apply fair-share scheduling to any resource in the system

Previous Work I: Virtual Time
Accounting is inversely proportional to the allocation
Schedule the client that is farthest behind (i.e., with the lowest virtual time)
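The virtual-time idea can be sketched in a few lines (a user-space Python illustration, not the paper's kernel code): each client's virtual time advances inversely proportionally to its weight, and the dispatcher always picks the client whose virtual time is smallest, i.e., the one farthest behind.

```python
# Sketch of classic virtual-time scheduling (illustration only, not the
# paper's kernel implementation). Weights are the relative allocations.
def schedule(weights, quantum, slots):
    """Run `slots` quanta; return how many quanta each client received."""
    vtime = {c: 0.0 for c in weights}     # virtual time per client
    served = {c: 0 for c in weights}
    for _ in range(slots):
        c = min(vtime, key=vtime.get)     # farthest behind in virtual time
        served[c] += 1
        vtime[c] += quantum / weights[c]  # advances inversely to allocation
    return served

# A client with twice the weight receives twice the service.
print(schedule({"A": 2.0, "B": 1.0}, quantum=1.0, slots=300))
```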

Previous Work II: Traffic Shaping
Leaky bucket
– Variable requests
– Constant-rate transmission
– The bucket represents a buffer
Token bucket
– Variable requests
– Constant allocations
– The bucket represents stored capacity
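The token-bucket variant can be sketched as follows (a hypothetical Python illustration): tokens accumulate at a constant rate up to the bucket capacity – the stored, unused allocation – and a request is admitted only if enough tokens are available.

```python
class TokenBucket:
    """Minimal token-bucket shaper sketch (illustration, not kernel code)."""
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per time unit (constant allocation)
        self.capacity = capacity  # bucket size = maximum stored capacity
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now, size):
        """Admit a request of `size` at time `now` if tokens suffice."""
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False

tb = TokenBucket(rate=1.0, capacity=5.0)
print(tb.allow(0.0, 5.0))   # full bucket: a burst up to capacity is admitted
print(tb.allow(1.0, 2.0))   # only 1 token has accumulated since, so refused
print(tb.allow(4.0, 2.0))   # by t=4 enough tokens have accumulated again
```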

Putting them Together: RSVT
“Resource sharing”: all clients make progress continuously
– A generalization of processor sharing
Each job has its ideal resource-sharing progress
– This is considered to be the allocation a_i
– Grows at a constant rate
Each job has its actual consumption c_i
– Grows only when the job runs
The scheduling priority is the difference: p_i = a_i − c_i
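The RSVT rule can be illustrated with a toy discrete-time simulation (an assumed Python sketch, not the paper's implementation): a_i grows every tick in proportion to client i's share, c_i grows only while i runs, and the dispatcher picks the client maximizing p_i = a_i − c_i.

```python
# Toy discrete-time RSVT sketch (illustration only): the ideal allocation
# a[i] grows each tick in proportion to the client's share; the consumption
# c[i] grows only for the client that actually ran.
def rsvt(shares, ticks):
    a = {i: 0.0 for i in shares}
    c = {i: 0.0 for i in shares}
    runs = {i: 0 for i in shares}
    for _ in range(ticks):
        for i in shares:
            a[i] += shares[i]                  # ideal progress, constant rate
        k = max(a, key=lambda i: a[i] - c[i])  # most deserving: max p_i = a_i - c_i
        c[k] += 1.0                            # the dispatched client consumes the tick
        runs[k] += 1
    return runs

# Shares 0.5 / 0.3 / 0.2 over 1000 ticks yield roughly 500 / 300 / 200 runs.
print(rsvt({"A": 0.5, "B": 0.3, "C": 0.2}, 1000))
```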

Example
Three clients, with allocations of roughly 50%, 30%, and 20%
Consumption always occurs in resource time
[Figure: consumed resource time vs. wallclock time for the three clients]

Bookkeeping
The set of active jobs is A
The relative allocation of job i is r_i
During an interval T, job k has run
Update allocations: a_i ← a_i + T · r_i / Σ_{j∈A} r_j, for every i ∈ A
Update consumptions: c_k ← c_k + T
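In code, one bookkeeping step looks like this (an illustrative Python sketch, under the assumption that the interval T is divided among the active jobs in proportion to their relative allocations r_i, while only the job that actually ran, k, is charged):

```python
def account(a, c, r, active, k, T):
    """One bookkeeping step: job k ran for an interval T.

    a, c: allocation / consumption per job; r: relative allocations;
    active: the set A of currently active jobs. (Illustrative sketch.)
    """
    total = sum(r[j] for j in active)
    for i in active:
        a[i] += T * r[i] / total   # every active job's ideal allocation grows
    c[k] += T                      # only the job that ran consumed time
    return a, c

a = {"x": 0.0, "y": 0.0}; c = {"x": 0.0, "y": 0.0}
r = {"x": 3.0, "y": 1.0}
account(a, c, r, active={"x", "y"}, k="x", T=4.0)
print(a, c)   # x gained 3.0 of allocation, y gained 1.0; only x consumed 4.0
```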

The Active Set
Active jobs (the set A) are those that can use the resource now
Allocations are relative to the active set
The active set may change when:
– A new job arrives
– A job terminates
– A job stops using the resource temporarily
– A job resumes use of the resource

Grace Period
Intermittently active jobs (process data / send packet, repeatedly) should retain their allocations even when momentarily inactive
Thus a_i continues to grow during a grace period after the job becomes inactive
The grace period reflects a notion of continuity
– It is on a sub-second time scale

Rebirth
Resumption after a very long inactive period should be treated as a new arrival
Due to the grace period, a job that becomes inactive accrues extra allocation
This extra allocation is forgotten after a rebirth period (by setting a_i = c_i)
The rebirth period is two orders of magnitude longer than the grace period
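Grace and rebirth can be sketched together (illustrative Python with hypothetical constants and field names; the slides only fix the ratio between the two periods at about two orders of magnitude):

```python
GRACE = 0.5      # seconds a job keeps accruing allocation after going idle (assumed value)
REBIRTH = 50.0   # ~two orders of magnitude longer than the grace period

def on_tick(job, now):
    """Illustrative handling of an idle job at a timer tick."""
    idle = now - job["last_active"]
    if idle <= GRACE:
        job["accrues"] = True      # still within grace: a_i keeps growing
    elif idle >= REBIRTH:
        job["a"] = job["c"]        # rebirth: forget the extra allocation
        job["accrues"] = False
    else:
        job["accrues"] = False     # inactive: a_i frozen, not yet reborn

job = {"a": 12.0, "c": 7.0, "last_active": 0.0}
on_tick(job, 0.3);  print(job["accrues"])   # within the grace period
on_tick(job, 5.0);  print(job["accrues"])   # idle past grace, a_i frozen
on_tick(job, 60.0); print(job["a"])         # reborn: a_i reset to c_i
```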

Implementation
A kernel module with generic functionality:
– Create / destroy module
– Create / destroy client
– Make request / set active / set inactive
– Make allocations
– Dispatch
– Check-in (note resource usage)
Glue code for specific subsystems:
– Currently networking and CPU
– Plan to add disk I/O
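The shape of the generic interface can be pictured as follows (hypothetical names in a user-space Python sketch; the actual module is Linux kernel code in C):

```python
class RSVTModule:
    """Sketch of the generic dispatcher interface (hypothetical names)."""
    def __init__(self):
        self.clients = {}

    def create_client(self, cid, share):
        self.clients[cid] = {"share": share, "a": 0.0, "c": 0.0, "active": False}

    def destroy_client(self, cid):
        del self.clients[cid]

    def set_active(self, cid, active):
        self.clients[cid]["active"] = active

    def allocate(self, T):
        """Divide an interval T among active clients by relative share."""
        act = [c for c in self.clients.values() if c["active"]]
        total = sum(c["share"] for c in act)
        for c in act:
            c["a"] += T * c["share"] / total

    def dispatch(self):
        """Return the most deserving active client id (max a_i - c_i)."""
        act = {k: v for k, v in self.clients.items() if v["active"]}
        return max(act, key=lambda k: act[k]["a"] - act[k]["c"])

    def check_in(self, cid, used):
        """Note resource usage by a client."""
        self.clients[cid]["c"] += used

m = RSVTModule()
m.create_client("vm1", 2.0); m.create_client("vm2", 1.0)
m.set_active("vm1", True); m.set_active("vm2", True)
m.allocate(3.0)            # vm1 gains 2.0 of allocation, vm2 gains 1.0
m.check_in("vm1", 1.5)
print(m.dispatch())        # vm2 is now farther behind (1.0 - 0 > 2.0 - 1.5)
```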

Networking Glue Code
Use the Linux QoS framework: create an RSVT queueing discipline
[Figure: App → TCP → IP → RSVT queueing discipline (QoS) → NIC]

Networking Glue Code
Non-RSVT traffic (e.g. NFS traffic) has priority and is counted as dead time
[Figure: packets from App → TCP → IP are checked; RSVT packets are enqueued and later selected and sent, non-RSVT packets are sent immediately to the NIC]

CPU Scheduling Glue Code
Use the Linux modular scheduling core
Add an RSVT scheduling policy
– The RSVT module essentially replaces the policy’s runqueue
– The initial implementation is only for uniprocessors
CFS and possibly other policies also exist and have higher priority
– When they run, this is counted as dead time

Timer Interrupts
Linux employs periodic timer interrupts (250 Hz)
Allocations are done at these times:
– Translate the elapsed time into microseconds
– Subtract known dead time (unavailable to us)
– Divide among the active clients according to their relative allocations
– Bound the divergence of allocation from consumption
The grace period is also handled here (marking jobs as inactive)
As is rebirth (setting a_i = c_i)

Multi-Queue
At dispatch, we need to find the client with the highest priority
But priorities change at different rates
Solution: allow only a limited, discrete set of relative priorities
Each priority has a separate queue
All clients in each queue are maintained in priority order
Only the first client in each queue needs to be checked to find the maximum
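The multi-queue dispatch can be sketched as follows (illustrative Python, not the kernel data structure): with one priority-ordered queue per discrete relative priority, clients in the same queue gain priority at the same rate, so each queue stays sorted and only the queue heads need to be compared.

```python
from collections import deque

# Illustrative multi-queue dispatch: one queue per discrete relative
# priority level. Because all clients in a queue change priority at the
# same rate, each queue stays sorted and only its head can be the maximum.
def dispatch(queues, priority):
    """Pick the highest-priority client by checking only the queue heads."""
    heads = [q[0] for q in queues.values() if q]
    return max(heads, key=priority)

p = {"a": 3.0, "b": 1.0, "c": 2.5, "d": 0.5}
queues = {
    "high": deque(["a", "b"]),   # kept sorted: a's priority >= b's
    "low":  deque(["c", "d"]),   # kept sorted: c's priority >= d's
}
print(dispatch(queues, lambda cl: p[cl]))   # compares only "a" and "c"
```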

Experiment – Basic Allocations
[Figure: allocated rate vs. measured bandwidth; 0.02]

Experiment – Basic Allocations
[Figure: allocated rate vs. measured bandwidth; 0.03]

Experiment – Active Set

Experiment – Grace Period

Experiment – Rebirth

Experiment – Throttling
Two competing MPlayers
The one with the higher allocation does not need all of it
– Its allocation tracks its consumption

Conclusions
Demonstrated a generic virtual-time-based resource-sharing dispatcher
Need to complete the implementation
– Support for I/O scheduling
– More details, e.g. SMP support
A building block of the global scheduling vision