Processes After today’s lecture, you are asked to know

Presentation transcript:

Processes
After today’s lecture, you are asked to know:
- The basic concepts of thread and process.
- What are the advantages of using multithreaded clients and servers?
- How does the client deal with access transparency and replication transparency?
- What is a stateless server and what is a stateful server?
- What are iterative and concurrent servers?
- What are the reasons for migrating code?
- What are the three segments of a process?
- What are the weak mobility model and the strong mobility model?

Naming
Names in computer systems are used to share resources, to uniquely identify entities, to refer to locations, and so on. An important issue with naming is that a name can be resolved to the entity it refers to. To resolve names, it is necessary to implement a naming system. In a distributed system, the implementation of a naming system is itself often distributed across multiple machines. Two things need to be considered for a naming system: efficiency and scalability.
Contents of this section:
- Some general issues with respect to naming
- Organization and implementation of human-friendly names, for example DNS

Process
A process is often defined as a program in execution. To execute a program, an operating system creates a number of virtual processors, each one for running a different program. To keep track of these virtual processors, the operating system has a process table, containing entries to store CPU register values, memory maps, open files, accounting information, privileges, etc.
Processes in a modern OS have:
- Management information
- Resources
- A unique process identifier (PID)

Process
Management information covers:
- allocated resources
- event handling
- permissions
- scheduling information
- utilisation of resources
Resources include:
- memory allocated (address space)
- open files, I/O channels, devices
- secondary storage allocations (swap space, memory mapping)
The address space is protected so that only the owning process may read or write to it. Memory protection stops a process from writing to the address space of other processes or of the kernel, which could otherwise crash the OS or other processes if something goes wrong.

Process
To create a process, the OS must create a complete, independent address space. The price is high. Even switching the CPU between two processes is expensive, because the OS has to modify registers of the memory management unit (MMU) and invalidate address translation caches such as the translation lookaside buffer (TLB).
Requirements:
- Changing the memory map in the MMU
- Flushing the TLB

Process Creation in Java
Class Run.java uses:
- getRuntime(): obtain the context for spawning an external process
- exec(): get the OS command interpreter to run a command
- getInputStream(): get an input stream to read the output from the process
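
Run.java itself is not reproduced in this transcript. As a rough illustration of the same three steps (get a way to spawn an external process, run a command, read what the process prints), here is a minimal sketch in Python, with the standard `subprocess` module standing in for the Java calls. The helper name `run_and_capture` and the command it runs are made up for the example.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Spawn an external process and return what it wrote to standard output."""
    # Popen stands in for exec(): the OS creates a new process that runs argv.
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
    # The pipe behind proc.stdout stands in for getInputStream(): the parent
    # reads the child's output through it.
    output, _ = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(f"child exited with status {proc.returncode}")
    return output


if __name__ == "__main__":
    # Hypothetical command: a second Python interpreter that prints one line.
    print(run_and_capture([sys.executable, "-c", "print('hello from child')"]))
```

Note that the child here gets its own address space and its own PID, which is exactly the cost the previous slides describe.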

Thread
A thread is very similar to a process in the sense that it can also be seen as the execution of a (part of a) program on a virtual processor. A thread context often consists of nothing more than the CPU context, along with some other information for thread management.
Threads are sometimes called lightweight subcomputations running in a process that:
- have their own flow of control and execution state
- share their resource context (address space, open files)

Thread
Compared to processes, threads:
- are quick to create
- are quick to context switch
- can readily share memory, files and sockets
Threads are useful where:
- many concurrent computation units are needed
- computation units need to share an address space easily

Thread Usage in Non-distributed Systems
- For a single-threaded process, whenever a blocking system call is executed, the process as a whole is blocked.
- With a multithreaded process, a program can work on several tasks at the same time, for example in a spreadsheet program.
- Multithreading also makes it possible to exploit parallelism when executing the program on a multiprocessor system.
- Thread switching can sometimes be done entirely in user space.

Java Thread
Wire.java creates two threads that compete to print their IDs.
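
Wire.java is not included in the transcript. The idea of two threads competing to print their IDs can be sketched in Python with the standard `threading` module; the function name `run_two_threads` and the `rounds` parameter are assumptions of this sketch, not part of the original class.

```python
import threading


def run_two_threads(rounds=3):
    """Start two threads that each print and record their own ID `rounds` times."""
    printed = []             # shared by both threads: they live in one address space
    lock = threading.Lock()  # protects the shared list and keeps output lines whole

    def worker(thread_id):
        for _ in range(rounds):
            with lock:
                printed.append(thread_id)
                print(f"thread {thread_id}")

    threads = [threading.Thread(target=worker, args=(i,)) for i in (1, 2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()             # wait until both threads have finished
    return printed


if __name__ == "__main__":
    run_two_threads()
```

The order in which the two IDs appear depends on the scheduler and may differ from run to run; only the total count per thread is fixed.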

Threads in Distributed Systems: Multithreaded Clients
Example: a Web browser does a number of tasks simultaneously. It is designed as a multithreaded client program. Each thread sets up a separate connection to the server and pulls in the data.
Advantages:
- Communication latencies are hidden as much as possible by delivering text content first, then images and other data.
- Several connections can be opened simultaneously.
- When the Web server is replicated across multiple machines, a multithreaded client may set up connections to different replicas, allowing data to be transferred in parallel.

Multithreaded Servers (1) A multithreaded server organized in a dispatcher/worker model.

Multithreaded Servers (2)
Model | Characteristics
Threads | Parallelism, blocking system calls
Single-threaded process | No parallelism, blocking system calls
Finite-state machine | Parallelism, nonblocking system calls
Three ways to construct a server.

Clients: Client-Side Software for Distribution Transparency
Besides the user interface and other application-related software, client software comprises components for achieving distribution transparency.
- Access transparency is generally handled through the generation of a client stub from an interface definition of what the server has to offer.
- Replication transparency in many distributed systems is handled by means of a client-side solution. One way is for the client proxy to forward an invocation request to each replica, collect all responses transparently, and pass a single return value to the client application.
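
The client-side approach to replication transparency can be sketched as a small proxy class. This is only an illustration: the class name `ReplicatedProxy` is hypothetical, plain Python callables stand in for remote replicas, and requiring all replicas to agree is an assumption of the sketch, since the slide does not say how the responses are combined.

```python
class ReplicatedProxy:
    """Client-side proxy that hides the fact that a service is replicated."""

    def __init__(self, replicas):
        self._replicas = list(replicas)  # callables standing in for remote replicas
        if not self._replicas:
            raise ValueError("at least one replica is required")

    def invoke(self, *args):
        # Forward the same invocation request to every replica.
        responses = [replica(*args) for replica in self._replicas]
        # Collapse all responses into a single return value. Insisting on
        # agreement is an assumption made for this sketch.
        first = responses[0]
        if any(response != first for response in responses[1:]):
            raise RuntimeError("replicas disagree")
        return first


if __name__ == "__main__":
    # Two hypothetical replicas of the same "double a number" service.
    proxy = ReplicatedProxy([lambda x: x * 2, lambda x: x + x])
    print(proxy.invoke(21))
```

The client application only ever calls `invoke` and sees one result, so it cannot tell how many replicas exist.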

Client-Side Software for Distribution Transparency A possible approach to transparent replication of a remote object using a client-side solution.

General Server Design Issues
A server is a process implementing a specific service on behalf of a collection of clients. It is organized in this way: it waits for an incoming request from a client and subsequently ensures that the request is taken care of, after which it waits for the next incoming request.
Issues:
- Iterative server: the server itself handles the request and, if necessary, returns a response to the requesting client.
- Concurrent server: it does not handle the request itself, but passes it to a separate thread or another process, after which it immediately waits for the next incoming request. E.g. a multithreaded server, or the UNIX way: fork a new process for each new incoming request.
- The endpoint (port) and how to manage it.
- Whether or not the server is stateless. A stateless server does not keep information on the state of its clients, and can change its own state without having to inform any client, e.g. a Web server. A stateful server does maintain information on its clients, e.g. a file server that allows a client to keep a local copy of a file.
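
The difference between an iterative and a concurrent server can be sketched without any networking. In the sketch below a list of strings stands in for incoming requests, and `handle`, `iterative_server` and `concurrent_server` are hypothetical names chosen for the example; a real server would read requests from an endpoint instead.

```python
import threading


def handle(request):
    """Hypothetical service: turn the request text into upper case."""
    return request.upper()


def iterative_server(requests):
    """The server itself handles each request before looking at the next one."""
    return [handle(request) for request in requests]


def concurrent_server(requests):
    """The server hands each request to a separate thread and moves on."""
    replies = [None] * len(requests)

    def worker(index, request):
        replies[index] = handle(request)  # each worker fills only its own slot

    workers = []
    for index, request in enumerate(requests):
        thread = threading.Thread(target=worker, args=(index, request))
        thread.start()                    # no waiting here: the next request is taken at once
        workers.append(thread)
    for thread in workers:
        thread.join()                     # only so that the sketch can return all replies
    return replies


if __name__ == "__main__":
    print(iterative_server(["get a", "get b"]))
    print(concurrent_server(["get a", "get b"]))
```

Both variants produce the same replies; what differs is that a slow request blocks all later ones in the iterative server but not in the concurrent one.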

Servers: General Design Issues 3.7
Client-to-server binding using a daemon as in DCE.
Client-to-server binding using a superserver as in UNIX (e.g. inetd).

CODE MIGRATION: Reasons for Migrating Code
Traditionally, code migration in distributed systems took place in the form of process migration. The reason has always been performance: the process should be close to where the data reside.
A. Migrating parts of the client to the server when doing database operations.
B. Migrating parts of the server to the client in interactive database applications.
Code migration can also be used to improve performance by exploiting parallelism.

Reasons for Migrating Code The principle of dynamically configuring a client to communicate to a server. The client first fetches the necessary software, and then invokes the server.

Models for Code Migration
As in process migration, the execution status of a program, pending signals and other parts of the environment must be moved as well. A process consists of three segments according to Fuggetta’s framework:
- The code segment is the part that contains the set of instructions that make up the program that is being executed.
- The resource segment contains references to external resources needed by the process, such as files, printers, devices, other processes, and so on.
- The execution segment is used to store the current execution state of a process, consisting of private data, the stack, and the program counter.
Weak mobility model: it is possible to transfer only the code segment, along with perhaps some initialization data. Feature: a transferred program is always started from its initial state, e.g. Java applets.
Strong mobility model: besides the code segment, the execution segment can be transferred as well. Feature: a running process can be stopped, subsequently moved to another machine, and then resume execution where it left off.
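
Weak mobility can be illustrated with a toy sketch in which only a code segment (source text) and some initialization data are handed to the target, which then starts the code from its initial state. Everything here is an assumption made for illustration: the function name `run_migrated_code`, the convention that the shipped code defines an entry point called `main`, and the use of `exec`, which is only acceptable in a sketch because the code comes from the same file.

```python
def run_migrated_code(code_segment, init_data):
    """Run transferred source text on the 'target', starting from its initial state."""
    namespace = {}                        # fresh execution state: nothing is carried over
    exec(code_segment, namespace)         # load the transferred code segment
    return namespace["main"](init_data)   # always start at the entry point


# Hypothetical code segment shipped from the sender: plain source text.
CODE_SEGMENT = '''
def main(values):
    return sum(values)
'''


if __name__ == "__main__":
    print(run_migrated_code(CODE_SEGMENT, [1, 2, 3]))
```

No stack, program counter or private data travels with the code, which is what separates this from strong mobility.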

Models for Code Migration
For both of the models above, a further distinction can be made between sender-initiated and receiver-initiated migration.
- In sender-initiated migration, migration is initiated at the machine where the code currently resides or is being executed.
- In receiver-initiated migration, the initiative for code migration is taken by the target machine.
In the case of weak mobility, it also makes a difference whether the migrated code is executed by the target process or whether a separate process is started; e.g. Java applets are executed in the browser’s address space.
Strong mobility, instead of moving a running process, can also be supported by remote cloning.

Models for Code Migration Alternatives for code migration.

Migration and Local Resources
Three types of process-to-resource bindings:
- The strongest binding, binding by identifier, is when a process refers to a resource by its identifier. E.g. a process uses a URL to refer to a specific Web site, or refers to a server by means of that server’s IP address.
- A weaker form, binding by value, is when only the value of a resource is needed. The execution of the process would not be affected if another resource provided the same value. E.g. a program relies on standard libraries.
- The weakest form, binding by type, is when a process indicates that it needs only a resource of a specific type. It is exemplified by references to local devices, such as monitors, printers, and so on.

Migration and Local Resources
Process-to-resource binding (rows) against resource-to-machine binding (columns):
Process-to-resource binding | Unattached | Fastened | Fixed
By identifier | MV (or GR) | GR (or MV) | GR
By value | CP (or MV, GR) | GR (or CP) | GR
By type | RB (or GR, CP) | RB (or GR, CP) | RB (or GR)
Actions to be taken with respect to the references to local resources when migrating code to another machine.
GR: Establish a global systemwide reference. MV: Move the resource. CP: Copy the value of the resource. RB: Rebind the process to a locally available resource.
Three types of resource-to-machine bindings:
- Unattached resources can be easily moved between different machines (e.g. data).
- Fastened resources: moving or copying may be possible.
- Fixed resources: often refer to local devices.

Naming Entities: Names, Identifiers, and Addresses
A name in a distributed system is a string of bits or characters that is used to refer to an entity.
An entity can be practically anything that can be operated on: a process, printer, mailbox, Web page, host, disk, and so on.
The name of an access point is called an address.
An identifier is a name that has the following properties:
- An identifier refers to at most one entity.
- Each entity is referred to by at most one identifier.
- An identifier always refers to the same entity (i.e. it is never reused).

Name space
Names in a distributed system are organized into a name space. A name space can be represented as a labelled, directed graph with two types of nodes:
- A leaf node represents a named entity and has no outgoing edges.
- A directory node has a number of outgoing edges, each labelled with a name. A directory node stores a directory table in which an outgoing edge is represented as a pair (edge label, node identifier).
Each path in a naming graph can be referred to by the sequence of labels corresponding to the edges in that path, such as N:<label-1, label-2, …, label-n>, where N is the first node of the path. If N is the root of the naming graph, it is called an absolute path name. Otherwise, it is called a relative path name.
Global name and local name.

Name Spaces (1) A general naming graph with a single root node.

Name Spaces (2) The general organization of the UNIX file system implementation on a logical disk of contiguous disk blocks.

Name Resolution
The process of looking up a name is called name resolution. To explain how name resolution works, consider a path name such as N:<label-1, label-2, …, label-n>. Resolution of this name starts at node N of the naming graph, where the name label-1 is looked up in the directory table, which returns the identifier of the node to which label-1 refers. Resolution then continues at that node with label-2, and so on up to label-n, and ends by returning the content of the node that label-n refers to.
Name resolution includes the topics:
- Closure mechanism
- Linking and mounting
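
Path name resolution over directory tables can be sketched in a few lines. The naming graph below is made up for the example (node identifiers `n0` to `n5` and their labels are assumptions, loosely modelled on the /home/steen and /keys names used in these slides), and `resolve` is a hypothetical helper name.

```python
# Directory tables: node identifier -> {edge label: node identifier}.
# Nodes without a directory table are leaf nodes.
DIRECTORIES = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n2"},
    "n2": {"mbox": "n4"},
}


def resolve(start, labels):
    """Resolve start:<label-1, ..., label-n> and return the node it refers to."""
    node = start
    for label in labels:
        table = DIRECTORIES.get(node)
        if table is None:
            raise LookupError(f"{node} is a leaf node, cannot look up {label!r}")
        if label not in table:
            raise LookupError(f"no edge labelled {label!r} at node {node}")
        node = table[label]   # follow the edge to the next node
    return node


if __name__ == "__main__":
    print(resolve("n0", ["home", "steen", "mbox"]))
```

Starting at the root `n0` gives absolute path names; passing any other start node resolves a relative path name with the same loop.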

Closure Mechanism
Knowing how and where to start name resolution is generally referred to as a closure mechanism. Essentially, a closure mechanism deals with selecting the initial node in a name space from which name resolution is to start.
Examples: the telephone number 00442078156340; HOME in UNIX.

Linking and Mounting
Strongly related to name resolution is the use of aliases. An alias is another name for the same entity. Two approaches to implementing aliases:
- The first approach is to simply allow multiple absolute path names to refer to the same node in a naming graph (Fig 4.1) (hard links).
- The second approach is to represent an entity by a leaf node, say N, but instead of storing the address or state of that entity, the node stores an absolute path name (Fig 4.3). The path name /home/steen/keys, which refers to a node containing the absolute path name /keys, is a symbolic link to node n5.
Mounting is one way to merge different name spaces.
Mount point and mounting point: the directory node storing the node identifier is called a mount point. The directory node in the foreign name space is called a mounting point.
To mount a foreign name space in a distributed system requires at least the following information:
- The name of an access protocol.
- The name of the server.
- The name of the mounting point in the foreign name space.
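
The symbolic-link approach can be sketched as a small extension of path resolution: when resolution ends at a leaf that stores an absolute path name, resolution starts again from the root with that stored name. The graph is an assumption built around the /home/steen/keys example (only n5 is named on the slide; the other node identifiers are made up), and the sketch follows a link only when it is the last node of the path.

```python
# Directory tables: node identifier -> {edge label: node identifier}.
DIRECTORIES = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n2"},
    "n2": {"keys": "n6"},
}
# Leaf n6 stores the absolute path name /keys instead of an entity's state.
SYMLINKS = {"n6": ["keys"]}


def resolve(labels, root="n0", max_links=8):
    """Resolve an absolute path name, following a symbolic link at the final node."""
    node = root
    for label in labels:
        node = DIRECTORIES[node][label]
    if node in SYMLINKS:
        if max_links == 0:
            raise LookupError("too many symbolic links")
        # Start over from the root with the path name stored in the leaf.
        return resolve(SYMLINKS[node], root, max_links - 1)
    return node


if __name__ == "__main__":
    print(resolve(["home", "steen", "keys"]))
    print(resolve(["keys"]))
```

The `max_links` limit is a design choice of the sketch: it stops resolution if links ever form a cycle.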

Linking and Mounting (1) The concept of a symbolic link explained in a naming graph.

Linking and Mounting (2) Mounting remote name spaces through a specific access protocol.

The Implementation of a Name Space
A name space forms the heart of a naming service, that is, a service that allows users and processes to add, remove, and look up names. A naming service is implemented by name servers.
This part covers:
- Name space distribution
- Implementation of name resolution

Name Space Distribution
Why should name spaces be arranged hierarchically? To decrease the possibility of name conflicts, reduce the size of naming contexts, make name bindings more meaningful, make lookups more efficient, and enable federation of name servers.

Name Space Distribution
Name spaces for a large-scale, possibly worldwide distributed system are usually organized hierarchically. The name space is partitioned into three logical layers:
- The global layer is formed by the highest-level nodes. This layer is often characterized by its stability; the directory tables in this layer are rarely changed.
- The administrational layer is formed by directory nodes that together are managed within a single organization. A characteristic feature of the directory nodes in the administrational layer is that they represent groups of entities that belong to the same organization or administrational unit.
- The managerial layer consists of nodes that may typically change regularly. The nodes in this layer are maintained not only by system administrators, but also by individual end users of a distributed system.

Name Space Distribution
The name space is divided into nonoverlapping parts, called zones in DNS. A zone is a part of the name space that is implemented by a separate name server. Name servers in each layer have to meet different requirements.

Name Space Distribution (1) An example partitioning of the DNS name space, including Internet-accessible files, into three layers.

Name Space Distribution (2)
Item | Global | Administrational | Managerial
Geographical scale of network | Worldwide | Organization | Department
Total number of nodes | Few | Many | Vast numbers
Responsiveness to lookups | Seconds | Milliseconds | Immediate
Update propagation | Lazy | Immediate | Immediate
Number of replicas | Many | None or few | None
Is client-side caching applied? | Yes | Yes | Sometimes
A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer.

Implementation of Name Resolution
Each client has access to a local name resolver, which is responsible for ensuring that the name resolution process is carried out. Assume the (absolute) path name root:<nl,vu,cs,ftp,pub,globe,index.txt> is to be resolved. Using a URL notation, this path name would correspond to ftp://ftp.cs.vu.nl/pub/globe/index.txt. There are two ways to implement name resolution:
- In iterative name resolution, the name resolver hands over the complete name to the root name server, which resolves it as far as it can and returns the address of the next name server; the resolver then contacts that server itself, and so on.
- With recursive name resolution, a name server does not return each intermediate result to the resolver but passes it on to the next name server it finds.
The drawback of recursive name resolution is that it puts a higher performance demand on each name server. Its two important advantages are that caching results is more effective compared to iterative name resolution, and that communication costs may be reduced.
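
The two resolution styles can be contrasted with a toy model in which each name server is a dictionary that knows only the labels directly below its own node. The server names (`server-nl` and so on), the placeholder address and both function names are assumptions made for this sketch.

```python
# Hypothetical name servers: each one knows only the labels directly below its node.
SERVERS = {
    "root": {"nl": "server-nl"},
    "server-nl": {"vu": "server-vu"},
    "server-vu": {"cs": "server-cs"},
    "server-cs": {"ftp": "address-of-ftp"},
}


def iterative_resolve(labels):
    """The client's resolver contacts every name server itself, one label at a time."""
    server, contacted = "root", []
    for label in labels:
        contacted.append(server)
        server = SERVERS[server][label]   # answer: the next server, or finally the address
    return server, contacted


def recursive_resolve(labels, server="root"):
    """Each name server resolves one label and passes the rest to the next server."""
    next_hop = SERVERS[server][labels[0]]
    if len(labels) == 1:
        return next_hop                   # last label: this is the address itself
    return recursive_resolve(labels[1:], next_hop)


if __name__ == "__main__":
    name = ["nl", "vu", "cs", "ftp"]
    print(iterative_resolve(name))
    print(recursive_resolve(name))
```

In the iterative version the resolver talks to four servers in turn; in the recursive version it only ever talks to the root, which is where the reduced communication cost for the client comes from.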

Implementation of Name Resolution (1) The principle of iterative name resolution.

Implementation of Name Resolution (2) The principle of recursive name resolution.

Implementation of Name Resolution (3)
Server for node | Should resolve | Looks up | Passes to child | Receives and caches | Returns to requester
cs | <ftp> | #<ftp> | -- | -- | #<ftp>
vu | <cs,ftp> | #<cs> | <ftp> | #<ftp> | #<cs>, #<cs,ftp>
nl | <vu,cs,ftp> | #<vu> | <cs,ftp> | #<cs>, #<cs,ftp> | #<vu>, #<vu,cs>, #<vu,cs,ftp>
root | <nl,vu,cs,ftp> | #<nl> | <vu,cs,ftp> | #<vu>, #<vu,cs>, #<vu,cs,ftp> | #<nl>, #<nl,vu>, #<nl,vu,cs>, #<nl,vu,cs,ftp>
Recursive name resolution of <nl, vu, cs, ftp>. Name servers cache intermediate results for subsequent lookups.

Example: The Domain Name System
The DNS Name Space
- The DNS name space is hierarchically organized as a rooted tree.
- A label is a case-insensitive string made up of alphanumeric characters.
- A label has a maximum length of 63 characters; the length of a complete path name is restricted to 255 characters.
- The label attached to a node’s incoming edge is also used as the name for that node.
- A subtree is called a domain; a path name to its root node is called a domain name.
- The contents of a node is formed by a collection of resource records.
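
The length and character rules for DNS names can be turned into a small check. The sketch follows the wording of the slide (alphanumeric labels of at most 63 characters, complete names of at most 255 characters, case-insensitive comparison); real DNS host names also allow hyphens inside labels, which this sketch deliberately leaves out. The function names are made up for the example.

```python
def is_valid_label(label):
    """A label: 1 to 63 ASCII alphanumeric characters (as stated on the slide)."""
    return 1 <= len(label) <= 63 and label.isascii() and label.isalnum()


def is_valid_name(name):
    """A complete name: at most 255 characters, made of dot-separated valid labels."""
    if len(name) > 255:
        return False
    return all(is_valid_label(label) for label in name.split("."))


def same_name(a, b):
    """Labels are case-insensitive, so names are compared after lower-casing."""
    return a.lower() == b.lower()


if __name__ == "__main__":
    print(is_valid_name("ftp.cs.vu.nl"))
    print(same_name("FTP.CS.VU.NL", "ftp.cs.vu.nl"))
```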

The DNS Name Space
The most important types of resource records forming the contents of nodes in the DNS name space.
Type of record | Associated entity | Description
SOA | Zone | Holds information on the represented zone
A | Host | Contains an IP address of the host this node represents
MX | Domain | Refers to a mail server to handle mail addressed to this node
SRV | Domain | Refers to a server handling a specific service
NS | Zone | Refers to a name server that implements the represented zone
CNAME | Node | Symbolic link with the primary name of the represented node
PTR | Host | Contains the canonical name of a host
HINFO | Host | Holds information on the host this node represents
TXT | Any kind | Contains any entity-specific information considered useful

DNS Implementation (1) An excerpt from the DNS database for the zone cs.vu.nl.