CS444/CS544 Operating Systems Finale 4/27/2007 Prof. Searleman

Slides:



Advertisements
Similar presentations
Distributed Systems Major Design Issues Presented by: Christopher Hector CS8320 – Advanced Operating Systems Spring 2007 – Section 2.6 Presentation Dr.
Advertisements

WHAT IS AN OPERATING SYSTEM? An interface between users and hardware - an environment "architecture ” Allows convenient usage; hides the tedious stuff.
Silberschatz and Galvin  Operating System Concepts Module 16: Distributed-System Structures Network-Operating Systems Distributed-Operating.
Distributed System Structures Network Operating Systems –provide an environment where users can access remote resources through remote login or file transfer.
Silberschatz, Galvin and Gagne  Operating System Concepts Chap15: Distributed System Structures Background Topology Network Types Communication.
Distributed Systems 1 Topics  What is a Distributed System?  Why Distributed Systems?  Examples of Distributed Systems  Distributed System Requirements.
Lecture 12 Page 1 CS 111 Online Devices and Device Drivers CS 111 On-Line MS Program Operating Systems Peter Reiher.
Using DSVM to Implement a Distributed File System Ramon Lawrence Dept. of Computer Science
CS444/CS544 Operating Systems Synchronization 2/16/2006 Prof. Searleman
Network Operating Systems Users are aware of multiplicity of machines. Access to resources of various machines is done explicitly by: –Logging into the.
ECE 526 – Network Processing Systems Design Software-based Protocol Processing Chapter 7: D. E. Comer.
Operating Systems CS451 Brian Bershad
City University London
Distributed Hardware How are computers interconnected ? –via a bus-based –via a switch How are processors and memories interconnected ? –Private –shared.
1: Operating Systems Overview
Operating Systems CS208. What is Operating System? It is a program. It is the first piece of software to run after the system boots. It coordinates the.
CS444/CS544 Operating Systems Introduction 1/12/2007 Prof. Searleman
16: Distributed Systems1 DISTRIBUTED SYSTEM STRUCTURES NETWORK OPERATING SYSTEMS The users are aware of the physical structure of the network. Each site.
1 Distributed Systems: Distributed Process Management – Process Migration.
Chapter 3 Memory Management: Virtual Memory
Computer System Architectures Computer System Software
1 COMPSCI 110 Operating Systems Who - Introductions How - Policies and Administrative Details Why - Objectives and Expectations What - Our Topic: Operating.
1 Lecture 20: Parallel and Distributed Systems n Classification of parallel/distributed architectures n SMPs n Distributed systems n Clusters.
1 Distributed Operating Systems and Process Scheduling Brett O’Neill CSE 8343 – Group A6.
Operating System Review September 10, 2012Introduction to Computer Security ©2004 Matt Bishop Slide #1-1.
Chapter 3: Operating-System Structures System Components Operating System Services System Calls System Programs System Structure Virtual Machines System.
Chapter 6 Operating System Support. This chapter describes how middleware is supported by the operating system facilities at the nodes of a distributed.
SYSTEM SOFTWARE Prepared by: Mrs. Careene McCallum-Rodney.
CSE 451: Operating Systems Section 10 Project 3 wrap-up, final exam review.
Session-8 Data Management for Decision Support
Silberschatz, Galvin and Gagne  Operating System Concepts Chapter 3: Operating-System Structures System Components Operating System Services.
Types of Operating Systems
Advanced Computer Networks Topic 2: Characterization of Distributed Systems.
ECE200 – Computer Organization Chapter 9 – Multiprocessors.
CE Operating Systems Lecture 3 Overview of OS functions and structure.
Multiprossesors Systems.. What are Distributed Databases ? “ A Logically interrelated collection of shared data ( and a description of this data) physically.
Lecture 4: Sun: 23/4/1435 Distributed Operating Systems Lecturer/ Kawther Abas CS- 492 : Distributed system & Parallel Processing.
Module 16: Distributed System Structures
Databases Illuminated
By Teacher Asma Aleisa Year 1433 H.   Goals of memory management  To provide a convenient abstraction for programming.  To allocate scarce memory.
1: Operating Systems Overview 1 Jerry Breecher Fall, 2004 CLARK UNIVERSITY CS215 OPERATING SYSTEMS OVERVIEW.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 16: Distributed System Structures.
Types of Operating Systems 1 Computer Engineering Department Distributed Systems Course Assoc. Prof. Dr. Ahmet Sayar Kocaeli University - Fall 2015.
Distributed Systems. -2 A Distributed System -3 Loosely Coupled Distributed Systems r Users are aware of multiplicity of machines. Access to resources.
Cheating The School of Network Computing, the Faculty of Information Technology and Monash as a whole regard cheating as a serious offence. Where assignments.
Distributed Systems: Motivation, Time, Mutual Exclusion.
Module 16: Distributed System Structures Silberschatz, Galvin and Gagne ©2005 Operating System Concepts – 7 th Edition, Apr 4, 2005 Distributed.
CS307 Operating Systems Distributed System Structures Guihai Chen Department of Computer Science and Engineering Shanghai Jiao Tong University Spring 2012.
Operating Systems Distributed-System Structures. Topics –Network-Operating Systems –Distributed-Operating Systems –Remote Services –Robustness –Design.
Applied Operating System Concepts
CE 454 Computer Architecture
Distributed Shared Memory
COMPSCI 110 Operating Systems
Distributed System 電機四 陳伯翰 b
Operating System Structure
Module 16: Distributed System Structures
Chapter 16: Distributed System Structures
Distributed System Structures 16: Distributed Structures
Outline Midterm results summary Distributed file systems – continued
Operating Systems.
Operating System Concepts
CSE 451: Operating Systems Autumn 2005 Memory Management
Chapter 2: Operating-System Structures
CSE 451: Operating Systems Autumn 2003 Lecture 9 Memory Management
CSE 451: Operating Systems Autumn 2003 Lecture 9 Memory Management
Operating System Concepts
Distributed Systems and Concurrency: Distributed Systems
Lecture Topics: 11/1 Hand back midterms
Presentation transcript:

CS444/CS544 Operating Systems Finale 4/27/2007 Prof. Searleman

Outline Return & discuss Exam#2 Final Exam: Thursday, May 3 rd, 7:00 pm Distributed Systems – a primer Office hours during finals week: posted on my office door

Grading Policy 30% Midterm Exams 30% Final Exam One required section Optional questions (for improving your grade) Bring a calculator if you plan to do the optional questions Bring your old exams & turn them in with the final 10% Homework will drop the lowest 10 point HW 30% Laboratory Projects Lab grades will be posted in your AFS subdirectory Second Chance lab: subdirectory 2ndchance/ make all changes by Wednesday

search TLB fetch PTE page fault overwrite victim page copy victim to disk read in new page fetch word hit miss valid invalid empty frame or clean victim victim page is dirty

A Distributed System

Loosely Coupled Distributed Systems Users are aware of multiplicity of machines. Access to resources of various machines is done explicitly by: Remote logging into the appropriate remote machine. Transferring data from remote machines to local machines, via the File Transfer Protocol (FTP) mechanism.

Tightly Coupled Distributed-Systems Users not aware of multiplicity of machines. Access to remote resources similar to access to local resources Examples Data Migration – transfer data by transferring entire file, or transferring only those portions of the file necessary for the immediate task. Computation Migration – transfer the computation, rather than the data, across the system.

Distributed Operating Systems Process Migration – execute an entire process, or parts of it, at different sites. Load balancing – distribute processes across network to even the workload. Computation speedup – subprocesses can run concurrently on different sites. Hardware preference – process execution may require specialized processor. Software preference – required software may be available at only a particular site. Data access – run process remotely, rather than transfer all data locally.

Why Distributed Systems? Communication Resource sharing Computational speedup Reliability

Resource Sharing Distributed Systems offer access to specialized resources of many systems Example: Some nodes may have special databases Some nodes may have access to special hardware devices (e.g. tape drives, printers, etc.) DS offers benefits of locating processing near data or sharing special devices

OS Support for resource sharing Resource Management? Distributed OS can manage diverse resources of nodes in system Make resources visible on all nodes Like VM, can provide functional illusion bur rarely hide the performance cost Scheduling? Distributed OS could schedule processes to run near the needed resources If need to access data in a large database may be easier to ship code there and results back than to request data be shipped to code

Design Issues Transparency – the distributed system should appear as a conventional, centralized system to the user. Fault tolerance – the distributed system should continue to function in the face of failure. Scalability – as demands increase, the system should easily accept the addition of new resources to accommodate the increased demand. Clusters vs Client/Server Clusters: a collection of semi-autonomous machines that acts as a single system.

Why Distributed Systems? Communication Resource sharing Computational speedup Reliability

Computation Speedup Some tasks too large for even the fastest single computer Real time weather/climate modeling, human genome project, fluid turbulence modeling, ocean circulation modeling, etc. What to do? Leave the problem unsolved? Engineer a bigger/faster computer? Harness resources of many smaller (commodity?) machines in a distributed system?

Breaking up the problems To harness computational speedup must first break up the big problem into many smaller problems More art than science? Sometimes break up by function Pipeline? Job queue? Sometimes break up by data Each node responsible for portion of data set?

Linear Speedup Linear speedup is often the goal. Allocate N nodes to the job goes N times as fast Once you’ve broken up the problem into N pieces, can you expect it to go N times as fast? Are the pieces equal? Is there a piece of the work that cannot be broken up (inherently sequential?) Synchronization and communication overhead between pieces?

OS Support for Parallel Jobs Process Management? OS could manage all pieces of a parallel job as one unit Allow all pieces to be created, managed, destroyed at a single command line Fork (process,machine)? Scheduling? Programmer could specify where pieces should run and or OS could decide Process Migration? Load Balancing? Try to schedule piece together so can communicate effectively

OS Support for Parallel Jobs (con’t) Group Communication? OS could provide facilities for pieces of a single job to communicate easily Location independent addressing? Shared memory? Distributed file system? Synchronization? Support for mutually exclusive access to data across multiple machines Can’t rely on HW atomic operations any more Deadlock management? We’ll talk about clock synchronization and two-phase commit later

Why Distributed Systems? Communication Resource sharing Computational speedup Reliability

Distributed system offers potential for increased reliability If one part of system fails, rest could take over Redundancy, fail-over !BUT! Often the reality is that distributed systems offer less reliability “A distributed system is one in which some machine I’ve never heard of fails and I can’t do work!” Hard to get rid of all hidden dependencies No clean failure model Nodes don’t just fail they can continue in a broken state Partition network = many many nodes fail at once! (Determine who you can still talk to; Are you cut off or are they?) Network goes down and up and down again!

Robustness Detect and recover from site failure, function transfer, reintegrate failed site Failure detection Detecting hardware failure is difficult. To detect a link failure, a handshaking protocol can be used. Reconfiguration When Site A determines a failure has occurred, it must reconfigure the system: When the link or the site becomes available again, this information must again be broadcast to all other sites.

Issues for Distributed Systems Components can fail (not fail-stop) Network partitions can occur in which each portion of the distributed system thinks they are the only ones alive Don’t have a shared clock Can’t rely on hardware primitives like test-and- set for mutual exclusion …

Distributed Coordination To tackle this complexity need to build distributed algorithms for: Event Ordering Mutual Exclusion Atomicity Deadlock Handling Election Algorithms Reaching Agreement

In Summary... Course Objectives To introduce students to the abstractions and facilities provided by modern operating systems. To familiarize students with the issues that arise when implementing operation system services

In Summary... Course Objectives To introduce students to the abstractions and facilities provided by modern operating systems. To familiarize students with the issues that arise when implementing operation system services An Operating System is a software layer that... manages resources (“Who gets what, when”) provides an abstraction of the underlying hardware that is easier to program and use (“Virtual Machine”) Applications Operating Systems Hardware

This Semester Architectural support for OS; Application demand on OS Major components of an OS: Scheduling, Memory Management, Synchronization, File Systems,... How is the OS structured internally and what interfaces does it provide for using its services and tuning its behavior? What are the major abstractions modern OSes provide to applications and how are they supported?