Download presentation
Presentation is loading. Please wait.
1
IS473 Distributed Systems
CHAPTER 1 Characterization of Distributed Systems Distributed Systems Concept & Design fourth edition
2
Dr. Almetwally Mostafa © summer 2009
OUTLINE Distributed System Definitions. Distributed Systems Examples: The Internet. Intranets. Mobile and Ubiquitous Computing. Resource Sharing. The World Wide Web. Distributed Systems Challenges. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
3
Distributed System Definitions
A system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing. Concurrency of components. No global clock. Independent failures. A collection of two or more independent computers which coordinate their processing through the exchange of synchronous or asynchronous message passing. A collection of independent computers that appear to the users of the system as a single computer. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
4
Computer Networks vs. Distributed Systems
Computer Network: the independent computers are explicitly visible (can be explicitly addressed). Distributed System: existence of multiple independent computers is transparent. However, many problems in common, in some sense networks or parts of them (e.g., name services) are also distributed systems, and normally, every distributed system relies on services provided by a computer network. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
5
Why Distributed Systems
Functional distribution: computers have different functional capabilities (i.e., sharing of resources with specific functionalities). Load distribution / balancing: assign tasks to processors such that the overall system performance is optimized. Replication of processing power: independent processors working on the same task: Distributed systems consisting of collections of microcomputers may have processing powers that no supercomputer will ever achieve. Physical separation: systems that rely on the fact that computers are physically separated (e.g., to satisfy reliability requirements). Economics: collections of microprocessors offer a better price/performance ration than large mainframes. mainframes: 10 times faster, 1000 times as expensive IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
6
Distributed Systems Examples (The Internet)
The Internet is a vast interconnected collection of computer networks of many types. Its design enabling a program running anywhere to address messages to programs anywhere else. Allowing its users to make use of many services as: WWW, , Web hosting, and File transfer. Its services can be extended by adding new types of service (open-ended services). Small organizations and individual users can to access internet services through Internet Service Providers (ISPs). Independent intranets are linked together by high transmission capacity circuits called backbones. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
7
Distributed Systems Examples (The Internet)
intranet ISP desktop computer: backbone satellite link server: % network link: A typical portion of the Internet IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
8
Distributed Systems Examples (Intranets)
An Intranet is a portion of the internet that is administrated separately and its local security policies are enforced by a configured boundary. Composed of several local area networks (LANs) linked by backbone connections to allow its users to access the provided services. Connected to the Internet via a router which allows its users to make use of the internet services elsewhere. Many organization protect their own services from unauthorized use by filtering incoming and outgoing messages using a firewall. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
9
Distributed Systems Examples (Intranets)
A typical Intranet IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
10
Distributed Systems Examples (Mobile and Ubiquitous Computing)
The portability of many computing devices and the ability to connect to networks in different places makes mobile computing possible. Mobile computing is the performance of computing tasks while the users are on the move and away from their residence intranet but still provided with access to resources via the devices they carry with them. Ubiquitous computing is the harnessing (يستخدم) of many small cheap computational devices that are present in user’s physical environments. Ubiquitous and mobile computing overlap but they are generally distinct. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
11
Distributed Systems Examples (Mobile and Ubiquitous Computing)
Portable and handheld devices in a distributed system IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
12
Dr. Almetwally Mostafa © summer 2009
Resource Sharing Resources in a distributed system are: Encapsulated within computers. Can only accessed from other computers by communication. Managed by a program that offers a communication interface. The term service is used for a distinct part of a computer system which manages a collection of related resources and presents their functionality to users and applications via a well-defined set of operations. Server refers to a running program on a networked computer that accepts request messages from programs, clients, running on other computers to perform a service and responds appropriately by reply messages. A complete interaction between a client and a server is called a remote invocation. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
13
The World Wide Web (WWW)
An executing web browser is an example of a client communicates with a web server to request web pages from it. WWW is an open client-server architecture implemented on top of the Internet. Users use the Web through available web browsers to retrieve and view documents of many types and interact with unlimited set of services. Web provides a hypertext structure among the documents that it stores. (i.e., the documents contain references, links, to other related documents and resources stored also in the web). IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
14
The World Wide Web (WWW)
Internet Browsers Web servers Protocols Activity.html File system of Web servers and web browsers IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
15
The World Wide Web (WWW)
Web is an open system and can be extended and implemented in new ways without disturbing its existing functionality. Its operation is based on communication standards and document standards that are freely published and widely implemented. Types of resources that can be published and shared on it is open (plug-ins handle new document formats without changing the web browser). Web is based on three main standard technological components: The HyperText Markup Language (HTML) specifies the contents and layout of pages as they are displayed by web browser. Uniform Resource Locators (URLs) identify documents and other resources accessible via the web. The HyperText Transfer Protocol (HTTP) determines the standard rules for interaction between browsers and web servers to fetch documents and other resources. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
16
World Wide Web Components (HTML)
HTML is used to specify: The text and images that make up the contents of a web page, How web page components are formatted for presentation to the user, and Links and which resources are associated with them. HTML is produced using a standard text editor or an HTML tool. HTML uses tags to specify content: <IMG SRC = “ <P> Welcome to Earth! Visitor may also be interested in taking a look at the <A HREF = “ Moon </A> </P> ……………………… HTML text is stored in a file accessible by a web server. A browser retrieves the file from the web server and interprets its HTML text to display the web page in the familiar fashion. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
17
World Wide Web Components (URLs)
Browsers examine URLs in order to fetch the corresponding resources from web servers. A URL is typed by the user into the browser or selected from their bookmarks. Also, the browser looks up the corresponding URL when the user clicks on a link or fetches a resource embedded in a web page such an image. Every URL in its full form has tow top-level components: scheme : scheme-specific-location URL examples: ftp://ftp.downloadIt.com/software/aProg.exe IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
18
World Wide Web Components (HTTP)
HTTP defines the ways in which browsers and any other types of client interact with web server. HTTP is a request-reply protocol: The client sends a request message to the server containing the URL of the required resource, Then the server looks up the pathname and if it exists, sends back the file’s contents in a reply message to the client. Browser are not necessarily capable of handling every type of the file’s contents: A browser includes a list of preferred types when makes a request. The server take this into account when returns the file content and includes the content type in the reply message so that the browser know how to process it. HTTP allows the client to request one resource at a time. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
19
Distributed Systems Challenges
A number of challenges are recognized in the design of distributed systems. These challenges have been tackled with varying degrees of success in existing systems. The list below gives some indication of measures that are employed to meet each challenge: Heterogeneity (التغاير): standards and protocols; middleware; virtual machine; Openness: publication of services; notification of interfaces; Security: firewalls; encryption; Scalability: replication; caching; multiple servers; Failure Handling: failure tolerance; recover/roll-back; redundancy; Concurrency: concurrency control to ensure data consistency; Transparency: middleware; location transparent naming; anonymity (كتمان) IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
20
Distributed Systems Challenges (Heterogeneity)
The internet enables users to access services over a variety and difference: Networks. Operating systems. Implementations by different developers. The different networks are masked by the fact that all of the attached computers are communicate to each other using standard internet protocols. The middleware software layers (like CORBA) provide a programming abstraction as well as masking the heterogeneity of the underlying networks, hardware, operating systems and programming languages. Virtual machines approach provides a way of making code executable on any hardware – mobile code. Computer hardware. Programming languages. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
21
Distributed Systems Challenges (Openness)
The openness of distributed systems is determined by the degree to which new resource-sharing services can be added and be available for use by variety of client programs (services publication). The specification and documentation of key software interfaces of the system components must be available to software developers (interfaces notification). Systems that are designed to support resource sharing in this way are termed open distributed systems to emphasize the fact that they are extensible. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
22
Distributed Systems Challenges (Security)
The security of many information resources available and maintained in distributed systems is importance. Security for information resources has three components: Confidentiality(سرية): protection against disclosure (كشف) to unauthorized individuals. Integrity: protection against alteration or corruption. Availability: protection against interference with the means to access the resources. A firewall can be used to form a barrier around an intranet to protect it from outside users but does not deal with ensuring the appropriate use of resources by users within the intranet. Encryption can be used to provide adequate protection of shared resources and to keep sensitive information secret when is transmitted in messages over the internet. Receiving of an executable program (mobile code) as an electronic mail attachment needs to be handled with care – effects of running the program are unpredictable! IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
23
Distributed Systems Challenges (Scalability)
A system is described as scalable if will remain effective when there is a significant increase in the number of resources and the number of users. The design of scalable dist. systems presents many challenges: Controlling the cost of physical resources. Controlling the performance loss. Preventing software resources running out. Avoiding performance bottlenecks. Computers in the internet Date Computers Web servers 1979, Dec. 188 1989, July 130,000 1999, July 56,218,000 5,560,866 IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
24
Distributed Systems Challenges (Failure Handling)
Failures in a distributed system are partial, therefore the handling of it is particularly difficult. Some failures can be detected but others are difficult or impossible to detect it. Some detected failures can be hidden or made less severe. Recovery from failures involves the design of software so that the state of permanent data can be rolled back after a server has crashed. Services can be made to tolerate failures by the use of redundant components: At least two different routes between any two routers in the internet. Databases and domain name tables must be replicated in several servers. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
25
Distributed Systems Challenges (Concurrency)
Several clients in a distributed system can access a shared resource at the same time Any object that represents a shared resource in a distributed system (a programmer that implement it) must be responsible for ensuring that it operate correctly in a concurrent environment. Operations of objects in a concurrent environment must be synchronized in a way that its data remains consistent. The synchronization of an object operations can be achieved by standard techniques such as semaphores. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
26
Distributed Systems Challenges (Transparency)
Access transparency: enables local and remote resources to be accessed using identical operations. Location transparency: enables resources to be accessed without knowledge of their location. Concurrency transparency: enables several processes to operate concurrently using shared resources without interference between them. Replication transparency: enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers. Failure transparency: enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components. Mobility transparency: allows the movement of resources and clients within a system without affecting the operation of users or programs. Performance transparency: allows the system to be reconfigured to improve performance as loads vary. Scaling transparency: allows the system and applications to expand in scale without change to the system structure or the application algorithms. Access and location transparency together provide network transparency. IS473 - Chapter (1) George Coulouris Dr. Almetwally Mostafa © summer 2009
27
IS473 Distributed Systems
CHAPTER 2 System Models Dr. Almetwally Mostafa
28
Dr.Almetwally Mostyafa Summer 2009
OUTLINE Distributed System Models. Distributed System Software Layers Distributed System Architectures. Client-Server Model Variations. Distributed Processes Interfaces and Objects. Distributed Architectures Design Requirements. Fundamental Models: Interaction Model. Failure Model. Security Model. Dr.Almetwally Mostyafa Summer 2009
29
Distributed System Models
Architectural models: (as client-server and peer process models) Define the way in which the components of systems are: Interact with one another, and Mapped onto an underlying network of computers. Describe the layered structure of distributed system software. Fundamental models: Concerned with properties that are common in all of the architectural models. Addressed by three models: The interaction model: deals with the difficulty of setting time limits. The failure model: attempts to give a specification of the exhibited faults by processes and communication channels. The security model: discusses possible threats to processes and communication channels. Dr.Almetwally Mostyafa Summer 2009
30
Software Layers In the layered view of a system each layer offers its services to the level above and builds its own service on the services of the layer below. Software architecture is the structuring of software in terms of layers (modules) or services that can be requested locally or remotely. Applications Computer and network hardware Platform Operating system Middleware Dr.Almetwally Mostyafa Summer 2009
31
Dr.Almetwally Mostyafa Summer 2009
Software Layers Platform: Lowest-level layers that provide services to other higher layers. bring a system’s programming interface for communication and coordination between processes . Examples: Pentium processor / Windows NT SPARC processor / Solaris Middleware: Layer of software to mask heterogeneity and provide a unified distributed programming interface to application programmers. Provide services, infrastructure services, for use by application programs. Object Management Group’s Common Object Request Broker Architecture (CORBA). Java Remote Object Invocation (RMI). Microsoft’s Distributed Common Object Model (DCOM). Limitation: require application level involvement in some tasks. Dr.Almetwally Mostyafa Summer 2009
32
Dr.Almetwally Mostyafa Summer 2009
System Architectures The architecture include: The division of responsibilities between system components. The placement of the components on computers in the network. Client-server model: Most important and most widely distributed system architecture. Client and server roles are assigned and changeable. Servers may in turn be clients of other servers. Services may be implemented as several interacting processes in different host computers to provide a service to client processes: Servers partition the set of objects on which the service is based and distribute them among themselves (e.g. Web data and web servers) Servers maintain replicated copies of the service objects on several hosts for reliability and performance (e.g. AltaVista) Dr.Almetwally Mostyafa Summer 2009
33
System Architectures Clients invoke individual servers Key: Client
invocation invocation result result Server Client Key: Process: Computer: Clients invoke individual servers Dr.Almetwally Mostyafa Summer 2009
34
System Architectures A service provided by multiple servers Service
Client Server Client Server A service provided by multiple servers Dr.Almetwally Mostyafa Summer 2009
35
Centralized Architectures
Figure 2-3. General interaction between a client and a server. Dr.Almetwally Mostyafa Summer 2009 Dr. Almetwally Mostafa
36
The Client-Server Paradigm(نموذج)
Perhaps the best known paradigm for network applications, the client-server2 model assigns asymmetric roles to two collaborating processes. One process, the server, plays the role of a service provider which waits passively for the arrival of requests. The other, the client, issues specific requests to the server and awaits its response. Dr.Almetwally Mostyafa Summer 2009
37
The Client-Server Paradigm - 2
Simple in concept, the client-server model provides an efficient abstraction for the delivery of network services. Operations required include those for a server process to listen and to accept requests, and for a client process to issue requests and accept responses. By assigning asymmetric roles to the two sides, event synchronization is simplified: the server process waits for requests, and the client in turn waits for responses. Many Internet services are client-server applications. These services are often known by the protocol that the application implements. Well known Internet services include HTTP, FTP, DNS, finger, gopher, etc. Dr.Almetwally Mostyafa Summer 2009
38
Multitiered Architectures (3)
Figure 2-6. An example of a server acting as client. Dr.Almetwally Mostyafa Summer 2009 Dr. Almetwally Mostafa
39
Dr.Almetwally Mostyafa Summer 2009
System Architectures Caches and proxy servers: Cache: A store of recently used data objects that is closer to the client process than those remote objects. When an object is needed by a client process the caching service checks the cache and supplies the object from there in case of an up-to-date copy is available. Proxy server: Provides a shared cache of web resources for client machines at a site or across several sites. Increase availability and performance of a service by reducing load on the WAN and web servers. May be used to access remote web servers through a firewall. Dr.Almetwally Mostyafa Summer 2009
40
Dr.Almetwally Mostyafa Summer 2009
System Architectures Client Web server Proxy server Web Client server Web proxy server Dr.Almetwally Mostyafa Summer 2009
41
System Architectures Peer processes:
All processes play similar roles without destination as a client or a server. Interacting cooperatively to perform a distributed activity. Communications pattern will depend on application requirements. Application Coordination code A distributed application based on peer processes Dr.Almetwally Mostyafa Summer 2009
42
Client-server Model Variations (Mobile Code)
Example: Java applets The user running a browser selects a link to an applets whose code is stored on a web server. The code is downloaded to the browser and runs there. Advantage: Good interactive response since. Does not suffer from the delays or variability of bandwidth associated with network communication. Disadvantage: Security threat to the local resources in the destination computer. Dr.Almetwally Mostyafa Summer 2009
43
Client-server Model Variations (Mobile Code)
a) client request results in the downloading of applet code Web server Client Applet Applet code b) client interacts with the applet Web applets Dr.Almetwally Mostyafa Summer 2009
44
Client-server Model Variations (Mobile Agents)
A running program (including both code and data) that travels from one computer to another in a network carrying out a task on someone’s behalf. Can make many invocations to local resources at each visited site. Visited sites must decide which local resources are allowed to use based on the identity of the user owning the agent. Advantage: Reduce communication cost and time by replacing remote invocation with local ones. Disadvantages: Limited applicability. Security threat of the visited sites resources. Dr.Almetwally Mostyafa Summer 2009
45
Dr.Almetwally Mostyafa Summer 2009
Client-server Model Variations (Network Computers) Downloads its operating system and any applications needed by the user from a remote file server. Applications run locally but files are managed by a remote file server. Users can migrate from one network computer to another. Its processor and memory capacities can be restricted to reduce its cost. Its disk (if included) holds only a minimum of software and use the reminder space as cache storage to hold copies of the most recently software and data files loaded from servers. Dr.Almetwally Mostyafa Summer 2009
46
Dr.Almetwally Mostyafa Summer 2009
Client-server Model Variations (Thin Clients) Software layer that supports a window-based user interface on a local computer while executing application programs on a remote computer. Same as the network computer scheme but instead of downloading the applications code into the user’s computer, it runs them on a server machine, compute server. Compute server is a powerful computer that has the capacity to run large numbers of applications simultaneously. Disadvantage: Increasing of the delays in highly interactive graphical applications Dr.Almetwally Mostyafa Summer 2009
47
Client-server Model Variations (Thin Clients)
Compute server Network computer or PC network Application Thin Client Process Thin clients and compute servers Dr.Almetwally Mostyafa Summer 2009
48
Dr.Almetwally Mostyafa Summer 2009
The Peer-to-Peer System Architecture In system architecture and networks, peer-to-peer is an architecture where computer resources and services are direct exchanged between computer systems. These resources and services include the exchange of information, processing cycles, cache storage, and disk storage for files.. In such an architecture, computers that have traditionally been used solely as clients communicate directly among themselves and can act as both clients and servers, assuming whatever role is most efficient for the network. Dr.Almetwally Mostyafa Summer 2009
49
The Peer-to-Peer Distributed Computing Paradigm
In the peer-to-peer paradigm, the participating processes play equal roles, with equivalent capabilities and responsibilities (hence the term “peer”). Each participant may issue a request to another participant and receive a response. Dr.Almetwally Mostyafa Summer 2009
50
Peer-to-Peer distributed computing
Whereas the client-server paradigm is an ideal model for a centralized network service, the peer-to-peer paradigm is more appropriate for applications such as instant messaging, peer-to-peer file transfers, video conferencing, and collaborative work. It is also possible for an application to be based on both the client-server model and the peer-to-peer model. A well-known example of a peer-to-peer file transfer service is Napster.com or similar sites which allow files (primarily audio files) to be transmitted among computers on the Internet. It makes use of a server for directory in addition to the peer-to-peer computing. Dr.Almetwally Mostyafa Summer 2009
51
Dr.Almetwally Mostyafa Summer 2009
52
Peer-to-Peer distributed computing
The peer-to-peer paradigm can be implemented with facilities using any tool that provide message-passing, or with a higher-level tool such as one that supports the point-to-point model of the Message System paradigm. For web applications, the web agent is a protocol promoted by the XNSORG (the XNS Public Trust Organization) for peer-to-peer interprocess communication “Project JXTA is a set of open, generalized peer-to-peer protocols that allow any connected device (cell phone, to PDA, PC to server) on the network to communicate and collaborate. JXTA is short for Juxtapose, as in side by side. It is a recognition that peer to peer is juxtapose to client server or Web based computing -- what is considered today's traditional computing model. “ Dr.Almetwally Mostyafa Summer 2009
53
Interfaces and Objects
Interface definitions specify the set of functions available for invocation in a server (or peer) processes. In OO paradigm, distributed processes can be constructed as objects whose methods can be accessed by remote invocation (COBRA approach). Many objects can be encapsulated in server or peer processes. Number, types, and locations (in some implementation) of objects may change dynamically as system activates require. Therefore, new services can be instantiated and immediately be made available for invocation. Dr.Almetwally Mostyafa Summer 2009
54
The Message Passing Paradigm
Message passing is the most fundamental paradigm for distributed applications. A process sends a message representing a request. The message is delivered to a receiver, which processes the request, and sends a message in response. In turn, the reply may trigger a further request, which leads to a subsequent reply, and so forth. The message- Dr.Almetwally Mostyafa Summer 2009
55
The Message Passing Paradigm - 2
Message passing is the most fundamental paradigm for distributed applications. A process sends a message representing a request. The message is delivered to a receiver, which processes the request, and sends a message in response. In turn, the reply may trigger a further request, which leads to a subsequent reply, and so forth. - Dr.Almetwally Mostyafa Summer 2009
56
The Message Passing Paradigm - 3
The basic operations required to support the basic message passing paradigm are send, and receive. For connection-oriented communication, the operations connect and disconnect are also required. With the abstraction provided by this model, the interconnected processes perform input and output to each other, in a manner similar to file I/O. The I/O operations encapsulate the detail of network communication at the operating-system level. The socket application programming interface is based on this paradigm. Dr.Almetwally Mostyafa Summer 2009
57
Architectures Design Requirements
Performance Issues: Considered under the following factors: Responsiveness: Fast and consistent response time is important for the users of interactive applications. Response speed is determined by the load and performance of the server and the network and the delay in all the involved software components. System must be composed of relatively few software layers and small quantities of transferred data to achieve good response times. Throughput: The rate at which work is done for all users in a distributed system. Load balancing: Enable applications and service processes to proceed concurrently without competing for the same resources. Exploit (مأثرة)available processing resources. Dr.Almetwally Mostyafa Summer 2009
58
Architectures Design Requirements
Quality of Service: Main system properties that affect the service quality are: Reliability: related to failure fundamental model (discussed later). Performance: ability to meet timeliness guarantees. Security: related to security fundamental model (discussed later). Adaptability: ability to meet changing resource availability and system configurations. Dependability issues: A requirement in most application domains. Achieved by: Fault tolerance: continuing to function in the presence of failures. Security: locate sensitive data only in secure computers. Correctness of distributed concurrent programs: research topic. Dr.Almetwally Mostyafa Summer 2009
59
Fundamental Models (Interaction Model)
Distributed systems consists of multiple interacting processes with private set of data that can access. Distributed processes behavior is described by distributed algorithms. Distributed algorithms define the steps to be taken by each process in the system including the transmission of messages between them. Transmitted messages transfer information between these processes and coordinate their ordering and synchronization activities. Dr.Almetwally Mostyafa Summer 2009
60
Fundamental Models (Interaction Model)
Interacting processes in a distributed system are affected by two significant factors: Performance of communication channels: is characterized by: Latency: delay between sending and receipt of a message including Network access time. Time for first bit transmitted through a network to reach its destination. Processing time within the sending and receiving processes. Throughput: number of units (e.g., packets) delivered per time unit. Bandwidth: total amount of information transmitted per time unit. Jitter: variation in the time taken to deliver multiple messages of the same type (relevant to multimedia data). Dr.Almetwally Mostyafa Summer 2009
61
Fundamental Models (Interaction Model)
Computer clocks: Each computer in a distributed system has its own internal clock to supply the value of the current time to local processes. Therefore, two processes running on different computers read their clocks at the same time may take different time values. Clock drift rate refers to the relative amount a computer clock differs from a perfect reference clock. Several approaches to correcting the times on computer clocks are proposed. Clock corrections can be made by sending messages, from a computer has an accurate time to other computers, which will still be affected by network delays. Dr.Almetwally Mostyafa Summer 2009
62
Fundamental Models (Interaction Model)
Setting time limits for process execution, as message delivery, in a distributed system is hard. Two opposing extreme positions provide a pair of simple interaction models: Synchronous distributed systems: A system in which the following bounds are defined: Time to execute each step of a process has known lower and upper bounds. Each message transmitted over a channel is received within a known bounded time. Each process has a local clock whose drift rate from perfect time has a known bound. Easier to handle, but determining realistic bounds can be hard or impossible. Dr.Almetwally Mostyafa Summer 2009
63
Fundamental Models (Interaction Model)
Asynchronous distributed systems: A system in which there are no bounds on: process execution times. message delivery times. clock drift rate. Allows no assumptions about the time intervals involved in any execution. Exactly models the Internet. Browsers are designed to allow users to do other things while they are waiting. More abstract and general: A distributed algorithm executing on one system is likely to also work on another one. Dr.Almetwally Mostyafa Summer 2009
64
Fundamental Models (Interaction Model)
Event ordering: when need to know if an event at one process (sending or receiving a message) occurred before, after, or concurrently with another event at another process. It is impossible for any process in a distributed system to have a view on the current global state of the system. The execution of a system can be described in terms of events and their ordering despite the lack of accurate clocks. Logical clocks define some event order based on causality. Logical time can be used to provide ordering among events in different computers in a distributed system (since real clocks cannot be synchronized). Dr.Almetwally Mostyafa Summer 2009
65
Fundamental Models (Interaction Model)
Real-time ordering of events Dr.Almetwally Mostyafa Summer 2009
66
Fundamental Models (Failure Model)
Defines the ways in which failure may occur in order to provide an understanding of its effects. A taxonomy of failures which distinguish between the failures of processes and communication channels is provided: Omission failures Process or channel failed to do something. Arbitrary failures Any type of error can occur in processes or channels (worst). Timing failures Applicable only to synchronous distributed systems where time limits may not be met. Dr.Almetwally Mostyafa Summer 2009
67
Fundamental Models (Failure Model)
Process p Process q Communication channel send Outgoing message buffer Incoming message buffer receive m Processes and channels Dr.Almetwally Mostyafa Summer 2009
68
Fundamental Models (Failure Model)
Omission and arbitrary failures Class of failure Affects Description Fail-stop Process Process halts and remains halted. Other processes may detect this state. Crash not be able to detect this state. Omission Channel A message inserted in an outgoing message buffer never arrives at the other end’s incoming message buffer. Send-omission A process completes a send, but the message is not put in its outgoing message buffer. Receive-omission A message is put in a process’s incoming message buffer, but that process does not receive it. Arbitrary (Byzantine) Process or channel Process/channel exhibits arbitrary behaviour: it may send/transmit arbitrary messages at arbitrary times, commit omissions; a process may stop or take an incorrect step. Dr.Almetwally Mostyafa Summer 2009
69
Fundamental Models (Failure Model)
Timing failures Class of Failure Affects Description Clock Process Process’s local clock exceeds the bounds on its rate of drift from real time. Performance Process exceeds the bounds on the interval between two steps. Channel A message’s transmission takes longer than the stated bound. Dr.Almetwally Mostyafa Summer 2009
70
Fundamental Models (Security Model)
Secure processes and channels and protect objects encapsulated against unauthorized access. Protecting access to objects Access rights In client server systems: involves authentication of clients. Protecting processes and interactions Threats to processes: problem of unauthenticated requests / replies. e.g., "man in the middle" Threats to communication channels: enemy may copy, alter or inject messages as they travel across network. Use of “secure” channels, based on cryptographic methods. Dr.Almetwally Mostyafa Summer 2009
71
Fundamental Models (Security Model)
Network invocation result Client Server Principal (user) Principal (server) Object Access rights Objects and principals Dr.Almetwally Mostyafa Summer 2009
72
Fundamental Models (Security Model)
Communication channel Copy of m Process p q The enemy m’ The enemy Dr.Almetwally Mostyafa Summer 2009
73
Fundamental Models (Security Model)
Principal A Secure channel Process p q B Secure channels Dr.Almetwally Mostyafa Summer 2009
74
Fundamental Models (Security Model)
Denial of service e.g., “pings” to selected web sites Generating debilitating network or server load so that network services become de facto unavailable Mobile code: Requires executability privileges on target machine Code may be malicious (e.g., mail worms) Dr.Almetwally Mostyafa Summer 2009
75
Slides for Chapter 3: Networking and Internetworking
From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 4, © Pearson Education 2005
76
IS473 Distributed Systems
CHAPTER 3 Networking and Internetworking
77
OUTLINE Communication Subsystem. Types of Network.
Principles of Network. Internet Protocols. Network Case studies. Dr. Almetwally mostafa
78
Communication Subsystem
The hardware and software within a distributed system which provides the communication facilities is known as the communication subsystem. Consists of: Transmission media: providing the physical connectivity, e.g. wire, cable, fibre and wireless channels; Hardware devices: providing the linkage, e.g. routers, bridges, hubs, repeaters, network interfaces and gateways; Software components: managing the communication, e.g. protocol stacks, communication handlers and drivers. Dr. Almetwally mostafa
79
Impact on Distributed Systems
The communication impact on a distributed system will be one of the delay introduced by the message passing. The delay experienced by each individual message can be broken down into two factors: Latency: is the time which is necessary to set up the communication, i.e. it is the delay incurred from the time the message is sent until it starts to arrive at the destination. Transmission delay: determined by the length of the message and the data transfer rate, the speed of data transfer between two computers in the network, usually in bits per second. Message transmission time = latency + length / data transfer rate Above equation is valid only for messages shorter than the maximum allowable length by the underlying network. Longer messages are segmented and the transmission time is the summation of segments transmission times. Dr. Almetwally mostafa
80
Network Types Local Area Networks (LANs)
High-speed communication on proprietary grounds (on-campus). Based on twisted copper wire, coaxial cable or optical fibre. Total system bandwidth is high and latency is low. Most typical solution: Ethernet with 100 Mbps Metropolitan Area Networks (MANs) High-speed communication for nodes distributed over medium-range distances, usually belonging to one organization. Based on high bandwidth copper and optical fibre. Providing "back-bone" to interconnect LAN's. Technology often based on ATM, FDDI or DSL. Dr. Almetwally mostafa
81
Network Types Wide Area Networks Wireless Networks
Communication over long distances (cities, countries, or continents). Covers computers of different organizations. High degree of heterogeneity of underlying computing infrastructure. Involves routers to manage network and route messages to their destinations. Speeds up to a few Mbps possible, but around Kbps more typical. Most prominent example: the Internet. Wireless Networks End user equipment accesses network through short or mid range radio or infrared signal transmission Wireless WANs: GSM (up to about 20 Kbps), UMTS (up to Mbps), PCS. Wireless LANs/MANs: WaveLAN (2-11 Mbps, radio up to 150 meters). Wireless Personal Area Networks: Bluetooth (up to 2 Mbps on low power radio signal, < 10 m distance). Dr. Almetwally mostafa
82
Figure 3.1 Network performance
Example Range Bandwidth (Mbps) Latency (ms) Wired: LAN Ethernet 1-2 kms 1-10 WAN IP routing worldwide MAN ATM 250 kms 1-150 10 Internetwork Internet Wireless: WPAN Bluetooth ( ) m 0.5-2 5-20 WLAN WiFi (IEEE ) km 2-54 WMAN WiMAX (802.16) 550 km 1.5-20 WWAN GSM, 3G phone nets Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn © Pearson Education 2005
83
Network Types Internetworks
Several networks linked together to provide common data communication facilities. Needed for developing open distributed systems that contain very large numbers of computers. Integrate a variety of local and wide area network technologies to provide the network capacity needed by each group of users. Interconnected by dedicated switching computers, routers, and general purpose computers, gateways. Addressing and transmission of data to included computers are supported by a software layer. Dr. Almetwally mostafa
84
Network Principles Packet transmission
A packet is a sequence of binary data with addressing information to identify the source and destination computers. A network message with arbitrary length is divided before transmission into packets of restricted length. Restricted length packets are used: To allow each computer in the network to allocate sufficient buffer storage to hold largest possible incoming packet. To avoid long waiting for communication channels to be free if long messages ware transmitted without subdivision. Dr. Almetwally mostafa
85
Network Principles Switching Schemes
A switching system is required to transmit information between two arbitrary nodes in the network using shared communications link. Four types of switching are used in computer network: Broadcast: Requires no switches. All messages are sent to all connected computers. Each computer is responsible extracting messages addressed to itself. Used approach in Ethernet and wireless networks. Dr. Almetwally mostafa
86
Network Principles Switching Schemes
Circuit switching: Approach taken in the telephone system. A physical link is established between the sender and the receiver. Packet switching: Otherwise known as store-and-forward (postal system). At each switching node (connection point) a computer manages the packets by reading each one into memory, examining its destination, and choosing an outgoing circuit appropriately. Frame relay: Reading in and storing the whole of each packet introduces a performance overhead which can become significant. In ATM networks a frame of fixed size is used in place of a packet and only its header needs to be examined. The remainder of the frame is simply relayed as a stream of bits. Dr. Almetwally mostafa
87
Network Principles Protocols
A well-known set of rules and formats used for communication between processes to perform a given task. Implemented by a pair of software modules located in the sending and receiving computers. Protocol software modules are arranged in a hierarchy of layers. A complete set of protocol layers is referred to as a protocol suite or protocol stack. Protocol layering brings benefits in simplifying and generalizing the software interface for access to the communication services, but it also carries significant performance costs. The application, presentation, and session layers are not distinguish in the Internet protocol stack: The application and presentation layers are implemented as a single middleware layer. The session layer is integrated with the transport layer. Dr. Almetwally mostafa
88
Figure 3.2 Conceptual layering of protocol software
Message sent Message received Layer n Layer 2 Layer 1 Sender Communication Recipient medium Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn © Pearson Education 2005
89
Figure 3.3 Encapsulation as it is applied in layered protocols
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn © Pearson Education 2005
90
Network Principles Protocols
Protocol layers in the ISO model Dr. Almetwally mostafa
91
Network Principles Protocols
Layer Description Application Protocols that are designed to meet the communication requirements of specific applications, often defining the interface to a service. Presentation Protocols at this level transmit data in a network representation that is independent of the representations used in individual computers, which may differ. Encryption is also performed in this layer, if required. Session At this level reliability and adaptation are performed, such as detection of failures and automatic recovery. Transport This is the lowest level at which messages (rather than packets) are handled. Messages are addressed to communication ports attached to processes. Network Transfers data packets between computers in a specific network. In a WAN or an internetwork this involves the generation of a route passing through routers. In a single LAN no routing is required. Data link Responsible for transmission of packets between nodes that are directly connected by a physical link. In a WAN transmission is between pairs of routers or between routers and hosts. In a LAN it is between any pair of hosts. Physical The circuits and hardware that drive the network. It transmits sequences binary data by analogue signalling (on cable circuits), light signals (on fibre optic circuits) or other electromagnetic signals (on radio and microwave circuits). Dr. Almetwally mostafa
92
Network Principles Protocols
The task of dividing messages into packets before transmission and reassembling them at receiving computer is performed in the transport layer. The transport layer is responsible for delivering messages to destinations with transport addresses. A transport address is composed of the network address number of a host computer (an IP number in the Internet) and a port number. Ports are software-definable destination points for communication within a host computer. In the Internet there are typically several ports at each host computer with well-known numbers, each allocated to a given Internet service. Dr. Almetwally mostafa
93
Figure 3.6 Internetwork layers
Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn © Pearson Education 2005
94
Network Principles Routing
A function required in all networks except LANs. The best route for communication between points in the network is re-evaluated periodically to take into account the current traffic and any faults in the network: adaptive routing. Packets delivery to their destinations is the collective responsibility of the routers located at connection points. Routing algorithm, implemented by a program in the network layer at each point, has two functions: Decide the routes for packets transmission (on hop-by-hop basis): Whenever a virtual circuit or connection is established in case of circuit-switched and frame-relay network layers. Separately for each packet in case of packet-switched network layers. Update its knowledge of the network based on traffic monitoring and the detection of failures. Dr. Almetwally mostafa
95
Network Principles Routing
Hosts Links or local networks A D E B C 1 2 5 4 3 6 Routers Routing in wide area network Dr. Almetwally mostafa
96
Network Principles Routing
Routings from A Routings from B Routings from C To Link Cost To Link Cost To Link Cost A local A 1 1 A 2 2 B 1 1 B local B 2 1 C 1 2 C 2 1 C local D 3 1 D 1 2 D 5 2 E 1 2 E 4 1 E 5 1 Routings from D Routings from E To Link Cost To Link Cost A 3 1 A 4 2 B 3 2 B 4 1 C 6 2 C 5 1 D local D 6 1 E 6 1 E local Routing tables for the previous network Dr. Almetwally mostafa
97
Figure 3.9 Pseudo-code for RIP (router information prototcol)routing algorithm
Send: Each t seconds or when Tl changes, send Tl (local table)on each non-faulty outgoing link. Receive: Whenever a routing table Tr is received on link n: for all rows Rr in Tr { if (Rr.link | n) { Rr.cost = Rr.cost + 1; Rr.link = n; if (Rr.destination is not in Tl) add Rr to Tl; // add new destination to Tl else for all rows Rl in Tl { if (Rr.destination = Rl.destination and (Rr.cost < Rl.cost or Rl.link = n)) Rl = Rr; // Rr.cost < Rl.cost : remote node has better route // Rl.link = n : remote node is more authoritative } Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn © Pearson Education 2005
98
Network Principles Internetworking
Many subnets based on many network technologies are integrated to build an internetwork. To make this possible, the following are needed: A unified internetwork addressing scheme enables packets to be addressed to any host connected to any subnets (provided by IP addresses in the Internet). A protocol defining the format of internetwork packets and giving rules of handling them (IP protocol in the Internet). Interconnecting components that route packets to their destination in terms of internetwork addresses (performed by internet routers in the Internet). The next figure shows a small part of the Internet comprises several subnets interconnected by routers and contains also many connection devices as switches, gateways, and hubs. Dr. Almetwally mostafa
99
Network Principles Internetworking
file compute dialup hammer henry hotpoint bruno sickle /29 copper web /29 server desktop computers xx subnet Eswitch custard hub Student subnet Staff subnet other servers router/ firewall 1000 Mbps Ethernet Eswitch: Ethernet switch 100 Mbps Ethernet file server/ gateway printers Campus router xx router/ firewall Dr. Almetwally mostafa
100
Network Principles Internetworking
Routers: Interconnected through subnets in an internetwork. Have distinct identities (IP addresses) within each subnet. Responsible for forwarding the internetwork packets arrived on any connection to the correct outgoing connection and maintain routing tables for that purpose. Bridges: Link networks of different types. Bridge/Routers: Link networks of the same type and perform routing functions. Dr. Almetwally mostafa
101
Network Principles Internetworking
Hubs: Connecting together several segments of LAN cables. Have a number of sockets. A host computer can be connected to each socket. Switches: Perform a similar function to routers but for LANs only. Routing the incoming packets only to the connected hosts. Build up routing tables by the observation of traffic. Dr. Almetwally mostafa
102
Internet Protocols The Internet emerged from the development of the TCP/IP protocol suite. TCP stands for Transmission Control Protocol and IP for Internet Protocol. Many application services and application-level protocols now exist based on TCP/IP including: The Web (HTTP). (SMTP, POP). File transfer (FTP). Net News (NNTP). Telnet (telnet). Dr. Almetwally mostafa
103
Internet Protocols TCP/IP layers Message Layers Application
Messages (UDP) or Streams (TCP) Application Transport Internet UDP or TCP packets IP datagrams Network-specific frames Message Layers Underlying network Network interface TCP/IP layers Dr. Almetwally mostafa
104
Figure 3.13 Encapsulation in a message transmitted via TCP over an Ethernet
Application message TCP header IP header Ethernet header Ethernet frame port TCP IP Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn © Pearson Education 2005
105
Internet Protocols TCP is a transport protocol that can be used to support applications directly or additional protocols can be layered on it to provide additional features. TCP is a reliable connection-oriented protocol used to transport streams of data. Another transport protocol (User Datagram Protocol UDP) is used to meet traditional message-based communication. IP is the underlying network protocol that provide the basic transmission mechanism for the Internet and other subnets. Success of TCP/IP is based on their independence of underlying transmission technology enabling internetworks to built up from many heterogeneous networks and data links. Dr. Almetwally mostafa
106
A programmer’s conceptual view of an Internet TCP/IP
Internet Protocols Application Application TCP UDP IP A programmer’s conceptual view of an Internet TCP/IP Dr. Almetwally mostafa
107
Internet Protocols IP Addressing
Used scheme for assigning addresses to networks and the computers connected to them must satisfy the following requirements: Universal: any host on Internet can send a message to any other. Assign Unique IP address to each host in the Internet. Sufficient: defining large addressing space and using it efficiently. IPv4 (1984): 32-bit addresses for 232 (~ 4 billion) addresses, but insufficient due to: i) Unforeseen growth of internet. ii) Inefficient use of address space. IPv6 (1994): 128-bit addresses for 2128 (~ 3x1038) addressable nodes. Routing: support a flexible and efficient routing scheme, but addresses themselves should not contain routing information. Dr. Almetwally mostafa
108
Internet Protocols IP Addressing
The IP address: 32-bit numeric identifier containing: A unique network identifier within the Internet, allocated by the Internet Network Information Center (NIC). A unique host identifier within that network, assigned by its manager. Written as a sequence of four decimal numbers separated by dots. Has equivalent symbolic domain name represented in a hierarchy. Has five classes: Class A: reserved for very large networks (224 hosts on each). Class B: allocated for organization networks contain more than 255 hosts. Class C: allocated to all other networks (less than 255 hosts on each). Class D: reserved for multicasting but this is not supported by all routers. Class E: unallocated addresses reserved for future requirements. Dr. Almetwally mostafa
109
Internet Protocols IP Addressing
Internet addressing structure Dr. Almetwally mostafa
110
Internet Protocols IP Addressing
octet 1 octet 2 octet 3 Range of addresses Network ID Host ID to Class A: 1 to 127 0 to 255 0 to 255 0 to 255 Network ID Host ID Class B: to 128 to 191 0 to 255 0 to 255 0 to 255 Network ID Host ID to Class C: 192 to 223 0 to 255 0 to 255 1 to 254 Multicast address to Class D (multicast): 224 to 239 0 to 255 0 to 255 1 to 254 to Class E (reserved): 240 to 255 0 to 255 0 to 255 1 to 254 Decimal representation of Internet addressing Dr. Almetwally mostafa
111
Internet Protocols IP Protocol
Network protocol of the Internet protocol stack. Transmits datagrams from one host to another via intermediate routers with the following characteristics: No guarantee of delivery. Duplication possible. Unbounded delay. No order preservation. Address resolution IP addresses may need to be mapped to physical network addresses. Ethernet has 48-bit addresses. Use Address Resolution Protocol (ARP) Either direct relation between IP and physical address, or mapping. Dr. Almetwally mostafa
112
Internet Protocols IP Protocol
When an IP datagram (up to 64 Kbytes) is longer than the Maximum Transfer Unit (MTU) of the underlying network: It is broken into smaller packets at the source and reassembled at its final destination. Each packet has a fragment identifier to enable out-of-order fragments to be collected. data IP address of destination IP address of source header up to 64 kilobytes IP packet layout Dr. Almetwally mostafa
113
Internet Protocols IP Routing
IP network layer routes packets from their source to their destination using a routing algorithm: Distance-vector algorithm: Router Information Protocols (RIP-1, RIP-2, ……). Link state algorithms class. Open Shortest Path First (OSPF) protocol. Different routing algorithms may co-exist since routing tables contain identical information for all algorithms. However, for routing table creation and update, the same algorithm needs to be used. Therefore, the Internet is divided into topological areas and one algorithm used in every area. Dr. Almetwally mostafa
114
Internet Protocols IP Routing
Internet topological map is partitioned into autonomous systems which are subdivided into areas. Every autonomous system has a backbone area. The collection of routers connect non-backbone areas to the backbone and the links that interconnect those routers are the Internet backbone. Backbone links are usually of high bandwidth and are replicated for reliability. Packets addressed to hosts on the local network as the sender: Transmitted in a single hope without routing. IP layer uses the Address Resolution Protocol (ARP) to determine the network address of local destination host Dr. Almetwally mostafa
115
Internet Protocols IP Routing
The need to store information from every node in the IP address space to every other node leads to routing table size explosion. Two possible solutions: Topological grouping of IP addresses, so that addresses in one topological area are all routed to a central router of that area. For example, all addresses to in Europe. Routers outside Europe can have a single table entry to route all addresses in this range to the closest European router, which then perform detailed routing. Problem: before 1993, IP addresses were assigned without regard to geographic location, still in use. Usage of default routes: Not all nodes in a subnet need to store complete routing information as long as key routers close to backbone have complete routing information. Dr. Almetwally mostafa
116
Internet Protocols IP Routing
Hosts Links or local networks A D E B C 1 2 5 4 3 6 Routers Routings from C To Link Cost B C E 2 local 5 1 Default - Default Routing Dr. Almetwally mostafa
117
Internet Protocols IP Routing
Classless Inter Domain Routing (CIDR) is a scheme introduced in 1996 to face the shortage of IP addresses. CIDR scheme is used to allocate IP addresses and manage entries of routing tables. Main problem: scarcity of Class B addresses, while plenty of Class C addresses were available. Solution: allocate batches of contiguous Class C addresses to subnets of more than 255 hosts and vice versa. For efficient routing: add mask field to routing tables used to select the portion of an IP address that is compared with the table entry. Enables the network/host address to be any portion of the IP address. More flexible than the old class-based algorithm. Dr. Almetwally mostafa
118
Internet Protocols IP Routing
CIDR Example: net X: 2048 addresses , mask net Y: 4096 addresses , mask net Z: 1024 addresses , mask Given address , bitwise AND with all masks in table Only result of and-ing with net Y mask gives valid address: => route according to net Y line routing table information. address mask X Y Z Dr. Almetwally mostafa
119
Internet Protocols IP Version 6 (IPv6)
Adopted in 1994 to face the addressing limitations of IPv4. Addresses long are 128-bits (~ 3x1038 addressable entities). Address space is partitioned: One partition will hold the entire range of IPv4 addresses. Two partitions used to organize the address space: One according to the geographical locations of the addressed nodes. The other according to their organizational locations. Improved routing speed: No checksum applied to the packet content, only to its header. No datagram fragmentation occurs inside network Supporting a mechanism for determining the smallest datagram size (MTU) before a packet is transmitted. Dr. Almetwally mostafa
120
Internet Protocols IP Version 6 (IPv6)
Version (4 bits) Priority (4 bits) Flow label (24 bits) Payload length (16 bits) Next header (8 bits) Hop limit (8 bits) Source address (128 bits) Destination address (128 bits) IPv6 header layout Dr. Almetwally mostafa
121
Internet Protocols IP Version 6 (IPv6)
Multimedia streams and other real-time data elements can transmitted in identified flow. The priority and flow label fields can be used to enable handling specific packets more rapidly or with higher reliability than others. Flow labels enable resources to be reserved in order to meet timing requirements of specific real-time data streams. Support multicast (as IPv4 ): The transmission of packets to multiple hosts using a single address. Support a new mode of transmission called anycast: Deliver a packet to at least one of the hosts subscribed to the relevant address. Allow implementing of security at the IP level without the need for security-aware implementations of application programs. Internet protocol stack, routers software, and application programs require upgrading to support the migration to IPv6. Dr. Almetwally mostafa
122
Internet Protocols MobileIP
Support for roaming of laptop computers, personal digital assistants (PDAs), wearable computing devices, etc. IP addresses are bound to subnet addresses, but roaming may leave subnet boundary. MobileIP allows IP communication to continue transparently with respect to current location of the mobile host. The mobile host is allocated a permanent IP address, corresponding to its “home" domain. When the mobile host is roaming: A home agent runs on a fixed machine in the home domain. A foreign agent correspondingly running on a fixed machine at the temporary domain. Dr. Almetwally mostafa
123
Internet Protocols MobileIP
Sender 4. Subsequent IP packets send to FA directly Mobile host MH 2. Address of FA returned to sender 1. First IP packet addressed to MH Internet Foreign agent FA Home 3. First IP packet agent forwarded to FA MobileIP routing mechanism Dr. Almetwally mostafa
124
Internet Protocols MobileIP
The home agent keeps track of the current IP address of the mobile host and acts as a proxy during periods of disconnection. When the mobile machine is registered with the foreign agent, the foreign agent contacts the home agent, notifying it of the new temporary IP address. Requests for the server are captured by home agent and re-routed, embedded in MobileIP packets, to the foreign agent: The sender sends first IP packet addressed to the mobile host . The Home agent receive the packet as a proxy for the mobile host. The home agent returns the address of the foreign agent to the sender. The home agent forwards the first IP packet to the foreign agent. Subsequent IP packets sent to the foreign agent directly. Dr. Almetwally mostafa
125
Internet Protocols TCP and UDP
TCP and UDP provide the communication capabilities of Internet in a useful form for application programs. TCP and UDP are transport protocols that accomplish process-to-process communication by the use of ports. Port numbers are used for addressing messages to processes within a particular computer and are valid only within that computer. UDP (User Datagram Protocol) A transport-level replica of IP: A UDP datagram is encapsulated inside an IP packet. A UDP datagram has a short header includes the source and destination port numbers, a length field, and a checksum. Dr. Almetwally mostafa
126
Internet Protocols TCP and UDP
UDP (User Datagram Protocol) cont. Offer unreliable connectionless transport service: No guarantee of delivery. No guarantee of order preservation. No additional reliability mechanisms except the optional checksum. If the received host finds the checksum field is non-zero: Compute a check value from the packet contents. Compare the computed check value with received checksum. Drop the received packet in case of unmatching. Transmit messages of up to 64 bytes in size with minimal additional costs and delays above IP transmission: No setup costs. No administrative acknowledgement messages. Dr. Almetwally mostafa
127
Internet Protocols TCP and UDP
TCP (Transport Control Protocol) Offer reliable connection-oriented transport service: Guarantee the delivery of all sending data. Guarantee of order preservation. A bi-directional communication channel between the sending and receiving process is established. The sending process divides the data stream into a sequence of data segments and transmits them as IP packets: Each TCP segment is attached with a sequence number. Sequence numbers are used by the receiver to order the segments. No segment is placed in the input stream of the receiver until all lower-numbered segments. Each segment carries a checksum covering the header and data and the receiver drop any received segment with unmatched checksum. Dr. Almetwally mostafa
128
Internet Protocols TCP and UDP
TCP (Transport Control Protocol) cont. A segment acknowledgements system is used to control the flow of stream between the sender and receiving processes: The receiver sends from time to time an acknowledgement to the sender giving the sequence number of the highest successfully received segment. Acknowledgements are carried in the normal data segments if there is a reverse flow of data. If any segment is not acknowledged within a specified timeout the sender retransmits it. The incoming buffer at the receiver is used to balance the flow between the sender and the receiver: The buffer may overflow if the receive operations more slowly than the send operations. Incoming segments are dropped when the buffer is overflowed. The sender is obliged to retransmit that dropped segments. Dr. Almetwally mostafa
129
Internet Protocols Domain Names
The Internet supports a scheme of symbolic domain names for hosts and networks because IP addresses are not very memorable for human users. The domain name is represented in a hierarchical fashion designed to reflect the organizational hierarchy and independent of the physical arrangement of the Internet (location transparency). In order for communication to take place a domain name must be translated into an IP address. The translation of domain names is carried out using the Domain Name Service (DNS). DNS is implemented as a server process can be run on host computers anywhere in the Internet. Dr. Almetwally mostafa
130
Internet Protocols Domain Names
There are at least two DNS servers in each domain and often more. The servers in each domain hold a partial map of the domain name tree below their domain. The domain map tree must consist of all of the domain and host names within its own domain; often it will contain more. Name resolution is carried out recursively from right to left, issuing request to other DNS servers in relevant domains as necessary. The resulting translation is cached at the server handling the original request so that the future requests for the same domain can be resolved without reference to other servers. Dr. Almetwally mostafa
131
Internet Protocols Firewalls
The purpose of a firewall is to monitor and control all communication into and out of an intranet. A firewall is implemented by a set of processes that act as a gateway applying a security policy determined by the organization. The firewall security policy may include any or all of the following: Service control: determine which services on internal hosts are accessible for external access and reject all other incoming service requests. Filtering actions are based on the contents of IP packets and the included TCP and UDP requests. Behavior control: prevent behavior that infringes the organization’s policies and forming part of an attack. Some filtering actions are applicable at the IP or TCP level but others require higher level interpretation of messages. User control: the organization discriminate between its users by allowing some access to external services but inhibiting others from doing so. Dr. Almetwally mostafa
132
Internet Protocols Firewalls
Protected intranet Router/ filter Internet web/ftp server Firewall configuration Dr. Almetwally mostafa
133
Slides for Chapter 4: Interprocess Communication
From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 4, © Addison-Wesley 2005
134
IS473 Distributed Systems
CHAPTER 4 Interprocess Communication (IPC)
135
OUTLINE Applications, services
Remote Method Invocation (RMI) Remote Procedure Calling (RPC) request-reply protocol Middleware This layers chapter marshalling and external data representation UDP and TCP IS473 - Chapter (4) Dr. Almetwally Mostafa
136
OUTLINE Interprocess Communication Characteristics.
API for Internet Protocols (UDP, TCP). External Data Representation and Marshalling. Client-Server Communication. Group Communication Unix Case Study. IS473 - Chapter (4) Dr. Almetwally Mostafa
137
Characteristics of IPC
Massage passing between a pair of processes supported by two communication operations: Send and Receive. A queue is associated with each message destination regarded as a buffer between the sender and receiver: When the buffer is empty the receiver must wait. When the buffer is full the sender must wait, or messages will be lost. Communication is termed as synchronous or asynchronous depending on the degree of coordination imposed between sender and receiver: Synchronous: both send and receive are blocking operations: Sender blocks until a receive is issued Receiver blocks until a message arrives. Asynchronous: send operation is non-blocking Sender is non-blocking (copy goes to its local buffer) Receiver blocking (ALWAYS) or non-blocking (out of control flow). IS473 - Chapter (4) Dr. Almetwally Mostafa
138
Ports and Sockets Communication between processes is made between ports. Each computer has 216 possible ports represented by integer numbers: some ports have specific use but others are available for general use. Each port corresponds to a single receiving process, but each process may have more than one port at which it receives. Any number of senders can send messages to a port. Sockets are software abstractions of ports used within running processes. Messages are sent between a pair of sockets. Sockets need to be bound to a port number and a host IP address in order to be useable for sending and receiving messages. Socket binding may be automatic (Java) or done explicitly (BSD UNIX). Each socket is associated with a particular transport protocol, i.e. UDP (datagrams) or TCP (streams). IS473 - Chapter (4) Dr. Almetwally Mostafa
139
Ports and Sockets agreed port any port socket socket message client
server other ports IP address = IP address = IS473 - Chapter (4) Dr. Almetwally Mostafa
140
Characteristics of IPC mechanisms
Ports do more than identify a process • Single process can use more than one port •Can maintain multiple connections / listen for communication on separate ports Sockets are a software abstraction which provide a communication endpoint for processes • Combination of – Host ID (e.g. IP address) – Port number • (Typically) implement – Non-blocking send – Blocking receive IS473 - Chapter (4) Dr. Almetwally Mostafa
141
IPC Communication using UDP
To send or receive using UDP, a process must – Create a socket, bound to a port on the localhost • Clients can then issue a send, identifying the IP address and port number to which the destination socket is bound • Datagrams are received at a socket without consideration of the source – any client socket can send to any other socket IS473 - Chapter (4) Dr. Almetwally Mostafa
142
UDP Datagram Size IP puts a limit of 64kbytes on packets
• UDP datagram must fit into this size, taking into consideration the addition of IP headers • Typical applications use limit of 8kbytes IS473 - Chapter (4) Dr. Almetwally Mostafa
143
UDP Failure Model Omission Failures
– Datagrams can be dropped because of -Full Buffers -Checksum Failure -Either send-omission, receive omission • Ordering – Messages can arrive out of order • Underlying IP routes packets independently IS473 - Chapter (4) Dr. Almetwally Mostafa
144
UDP Applications Where failures are tolerable – DNS – VOIP
• Where overheads are costly – E.g. where the requirement to store state information at sender or receiver, or the requirement to retransmit messages introduces costly latency in the application IS473 - Chapter (4) Dr. Almetwally Mostafa
145
Java API for Internet Addresses
The java.net package is intended to provide an application programmer with a powerful infrastructure for networking: Java provides three basic socket classes: DatagramSocket for datagram communication and Both of ServerSocket and Socket for stream communication. Java includes classes for representing URLs (the URL class) and Internet addresses (the InetAddress class). More explicit networking can be considered using the socket classes, which makes use of the InetAddress class. InetAddress class has no public constructor but has getByName() method that take a DNS name and make use of the DNS server. InetAddress aComputer = InetAddress.getByName(“wassif.ccis.ksu.edu.sa"); InetAddress class is designed to handle both IPv4 (4 bytes) and IPv6 (16 bytes) addresses. IS473 - Chapter (4) Dr. Almetwally Mostafa
146
UDP Datagram Communication
A datagram is communicated between processes when one process sends it and the other receives it. Datagrams sent from socket to socket without acknowledgement or reliability. UDP applications provide their own checks to achieve the required quality of reliable communication. A server binds its socket to a specific server port known to clients but a client binds its socket to any free port. Sender socket information is passed with the datagram content so that a reply can be sent. The receiver must specify a buffer of a particular size (8 KBytes in most implementations) to place received message. Message will be truncated if the buffer size is not enough. Longer messages must fragmented into chunks to match buffer sizes. IS473 - Chapter (4) Dr. Almetwally Mostafa
147
UDP Datagram Communication
Datagram communication provides non-blocking sends and blocking receives although timeouts can be set. Timeouts can be set on receiver socket to forbid indefinitely waiting in case of crashed sender or lost message. Choosing an appropriate timeout interval is difficult. The receive method does not specify who the message should be received from: It will retrieve any message which has arrived at its socket from any sender. However, it is possible to bind a socket so that it only sends to or receives messages from a particular remote port. UDP is suitable for applications that need low overheads and accepting occasional omission failures (as DNS service). IS473 - Chapter (4) Dr. Almetwally Mostafa
148
Java API for UDP Datagrams
Java API provides datagram communication by two classes: DatagramPacket: Has a constructor to create datagram packets comprising a message, its length, and destination socket information Has another constructor to put a received message into that packet. The message can be retrieved using getData method and the Internet address and the port are accessed by getAddress and getPort methods. Array of bytes containing message-length of message-internet address-port number DatagramSocket: Supports sockets for sending and receiving UDP datagrams. Has a constructor with a port number as argument to tie that port, if a particular port is needed, or without argument to use any free local port. send and receive methods are used for transmitting datagrames between a pair of sockets. The method setSoTimeout allows a timeouts to be set and connect method is used for connecting a socket to a particular remote port and host address. IS473 - Chapter (4) Dr. Almetwally Mostafa
149
Figure 4.3 UDP client sends a message to the server and gets a reply
import java.net.*; import java.io.*; public class UDPClient{ public static void main(String args[]){ // args give message contents and server hostname DatagramSocket aSocket = null; try { aSocket = new DatagramSocket(); byte [] m = args[0].getBytes(); InetAddress aHost = InetAddress.getByName(args[1]); int serverPort = 6789; DatagramPacket request = new DatagramPacket(m, args[0].length(), aHost, serverPort); aSocket.send(request); byte[] buffer = new byte[1000]; DatagramPacket reply = new DatagramPacket(buffer, buffer.length); aSocket.receive(reply); System.out.println("Reply: " + new String(reply.getData())); }catch (SocketException e){System.out.println("Socket: " + e.getMessage()); }catch (IOException e){System.out.println("IO: " + e.getMessage());} }finally {if(aSocket != null) aSocket.close();} }
150
Figure 4.4 UDP server repeatedly receives a request and sends it back to the client
import java.net.*; import java.io.*; public class UDPServer{ public static void main(String args[]){ DatagramSocket aSocket = null; try{ aSocket = new DatagramSocket(6789); byte[] buffer = new byte[1000]; while(true){ DatagramPacket request = new DatagramPacket(buffer, buffer.length); aSocket.receive(request); DatagramPacket reply = new DatagramPacket(request.getData(), request.getLength(), request.getAddress(), request.getPort()); aSocket.send(reply); } }catch (SocketException e){System.out.println("Socket: " + e.getMessage()); }catch (IOException e) {System.out.println("IO: " + e.getMessage());} }finally {if(aSocket != null) aSocket.close();}
151
TCP Stream Communication
An abstraction provided by TCP API as a stream of bytes to which data may be written and from which data may be read. Stream abstraction offers the following characteristics: No restriction about message sizes: Applications can choose how much data written to or read from a stream. Guarantee message delivery using acknowledgment scheme: The sender retransmits the message if it does not receive an acknowledgment within a timeout. Attempt to match the speeds of reader and writer processes: The writer may be blocked until the reader has consumed sufficient data. Guarantee unduplication and ordering of messages: Message identifiers enable the receiver to detect and reject duplicates or to reorder messages not arrived in sender order. IS473 - Chapter (4) Dr. Almetwally Mostafa
152
TCP Stream Communication
A pair of communicating processes establish a connection before they can communicate over a stream: One of the two processes plays the client role and the other plays the server role. The client role involves: Creating a stream socket bound to any free port. Making a connect request to a server at its server port. The server role involves: Creating a listening socket bound to a server port and waiting for clients to request connections. Making accept request to reply a client request at its port. Creating a new stream socket to communicate with this client. Keeping the listening socket for new connecting requests. Once the connection is established the processes write to and read from the stream without needing to use Internet addresses and ports. IS473 - Chapter (4) Dr. Almetwally Mostafa
153
TCP Stream Communication
The pair of sockets in client and server are connected by input and output streams, one for each direction. The connection is broken when any of its sockets is closed: Sockets may be closed by a close request from either of the connected processes or when any of them is failed or exited. A process using a broken connect will be notified when attempting to read from or write to its stream. Many frequently used services run over TCP connections with reserved port numbers: HTTP is used for communication between web browsers and web servers. FTP allows browsing of remote computer directories and transfer files from one computer to another over a connection. SMTP is used to send mail between computers. IS473 - Chapter (4) Dr. Almetwally Mostafa
154
TCP Failure Model Timeout and retransmission – Sliding window protocol
• Connections can break down – No guarantee that connection will not break, so not fully reliable • Limitations – Processes cannot distinguish between network failure and receiving process failure – Processes cannot be certain whether packets sent were received IS473 - Chapter (4) Dr. Almetwally Mostafa
155
Java API for TCP Streams
Java API provides stream communication by two classes: ServerSocket: Intended for use by a server to create a socket at a server port for listening connect request from clients. Has an accept method that is blocked until arriving of any request to create an instance of stream socket. Socket: Used by a pair of processes with a stream connection. Has a constructor used by the client to create its stream socket and send a connect request to the specified remote server. Provides getInputStream and getOutputStream methods for accessing the two input and output streams associated with a socket. read and write methods are used for bytes reading from and writing to InputStream and OutputStream respectively. IS473 - Chapter (4) Dr. Almetwally Mostafa
156
Figure 4.5 TCP client makes connection to server, sends request and receives reply
import java.net.*; import java.io.*; public class TCPClient { public static void main (String args[]) { // arguments supply message and hostname of destination Socket s = null; try{ int serverPort = 7896; s = new Socket(args[1], serverPort); DataInputStream in = new DataInputStream( s.getInputStream()); DataOutputStream out = new DataOutputStream( s.getOutputStream()); out.writeUTF(args[0]); // UTF is a string encoding see Sn 4.3 String data = in.readUTF(); System.out.println("Received: "+ data) ; }catch (UnknownHostException e){ System.out.println("Sock:"+e.getMessage()); }catch (EOFException e){System.out.println("EOF:"+e.getMessage()); }catch (IOException e){System.out.println("IO:"+e.getMessage());} }finally {if(s!=null) try {s.close();}catch (IOException e){System.out.println("close:"+e.getMessage());}} }
157
Figure 4.6 TCP server makes a connection for each client and then echoes the client’s request
import java.net.*; import java.io.*; public class TCPServer { public static void main (String args[]) { try{ int serverPort = 7896; ServerSocket listenSocket = new ServerSocket(serverPort); while(true) { Socket clientSocket = listenSocket.accept(); Connection c = new Connection(clientSocket); } } catch(IOException e) {System.out.println("Listen :"+e.getMessage());} // this figure continues on the next slide
158
Figure 4.6 continued class Connection extends Thread {
DataInputStream in; DataOutputStream out; Socket clientSocket; public Connection (Socket aClientSocket) { try { clientSocket = aClientSocket; in = new DataInputStream( clientSocket.getInputStream()); out =new DataOutputStream( clientSocket.getOutputStream()); this.start(); } catch(IOException e) {System.out.println("Connection:"+e.getMessage());} } public void run(){ try { // an echo server String data = in.readUTF(); out.writeUTF(data); } catch(EOFException e) {System.out.println("EOF:"+e.getMessage()); } catch(IOException e) {System.out.println("IO:"+e.getMessage());} } finally{ try {clientSocket.close();}catch (IOException e){/*close failed*/}}
159
4-3 External Data Representation
Representation problems: Data structures must be flattened, converted to a sequence of bytes, before transmission and rebuilt on arrival. Not all computers store primitive data values in the same order. Solution methods: Values are converted to an agreed external format, external data representation, before transmission and reconverted to the local form on receipt. Values are transmitted in the sender’s format with an indication of the used format and the receipt converts the values to its format (if necessary). Marshalling is the process of assembling a collection of data items in a form suitable for transmission and unmarshalling is the reverse process of disassembling them on arrival. IS473 - Chapter (4) Dr. Almetwally Mostafa
160
External Data Representation
Two alternative approaches to external data representation and marshalling are: CORBA’s common data representation (CDR): Concerned with an external representation for the structured and primitives types. Can be used by a variety of programming languages. Java’s object serialization: Concerned with the flattening and external data representation of any single object or tree of objects. Used only by Java. In both cases, marshalling and unmarshalling activities should be carried out by a middleware layer. The common marshaling approach is to convert the primitive types into a binary form but the conversion into ASCII text (e.g. HTTP protocol) is an alternative. IS473 - Chapter (4) Dr. Almetwally Mostafa
161
CORBA’s CDR CDR can represent all data types used as arguments or return values during remote invocation of CORBA objects. 15 primitive types and a range of composite types are represented by a sequence of bytes in the invocation or result message. Marshaling and unmarshalling operations are generated automatically from the specification of data items transmitted in a message. The type definitions of data structures and primitive data items are described in CORBA Interface Definition Language (IDL). CORBA interface compiler generates appropriate marshaling and unmarshalling operations from the type definitions of remote methods arguments and results IS473 - Chapter (4) Dr. Almetwally Mostafa
162
Java Object Serialization
Serialization and deserialization in Java: Serialization is the activity of flattening object or a related set of objects in a serial form suitable for transmitting in a message. Deserialization is the activity of restoring the state of an object or a set of objects from their serialized form. All objects referenced on an object are serialized as handles when that object is serialized. Serialization of a specific object is done by creating an instance of the class ObjectOutputStream and invoking its writeObject method with the object name as rgument. ObjectInputStream class is opened on a stream of data to deserialize an object from that stream using the readObject method. Making generic serialization and deserialization is possible using reflection rather than to generate special marshalling functions for each type of object (as in CORBA). IS473 - Chapter (4) Dr. Almetwally Mostafa
163
Remote Object References
Needed when a client invokes a method of an object located on a remote server. Must be generated in a manner that ensure uniqueness over space and time. Unique among all of the processes in the various computers in a distributed system. References of deleted remote objects are not reused to avoid accessing different objects. Constructed by connecting the IP address of its computer and the port number of the process created it with the time of its creation and a local object number. Internet address port number time object number interface of remote object 32 bits IS473 - Chapter (4) Dr. Almetwally Mostafa
164
Client-server Communication
Request-reply protocols are designed to support client-server communication with the following characteristics: Synchronous in normal cases because the client process blocks until the reply arrives from the server. Reliable and grantee of message delivery. Acknowledgments are not required since requests are followed by replies. Flow control is not needed due to small amounts of data transferred Connection establishment involves only two extra pair of messages. Usually implemented over UDP datagrames but can also applied over TCP streams. IS473 - Chapter (4) Dr. Almetwally Mostafa
165
Client-server Communication
Request Server Client doOperation (wait) (continuation) Reply message getRequest execute method select object sendReply Request-reply communication IS473 - Chapter (4) Dr. Almetwally Mostafa
166
Client-server Communication
Request-reply protocol is based on a triple of communication primitives: doOperation: Used by clients to send a request message for invoking remote operations. Its arguments specify the remote object reference, the method to be invoked and the arguments of that method. Its result is a remote method reply. getRequest: Used by a server process to acquire service request messages. sendReply: Used by a server process to send the reply messages to clients Its arguments are the reply, and IP address and port of designated client. IS473 - Chapter (4) Dr. Almetwally Mostafa
167
Client-server Communication
The structure of a request or a reply message as follow: Message type: indicate whether the message is a request or a reply. Message ID: contain a request message identifier generated by the client and the server refer to it into the corresponding reply message to enable the client to check that the reply is the result of the current request. Object reference: the reference of remote object. Method ID: the identifier for the method to be invoked. Arguments of the invoked method Each message identifier must be unique in the distributed system and include two parts: A request ID taken from an increasing sequence by the client process to make the identifier unique to the client. An identifier of the client process (IP address + port number). IS473 - Chapter (4) Dr. Almetwally Mostafa
168
Client-server Communication Failure Recovery
Request-reply protocol can suffer from the following failures: Omission failures: Use timeout and resend request when timeout expires and reply hasn’t arrived Server receives repeated request Idempotent (unchangable) operations: same result obtained on every invocation. Non-idempotent operations: re-send result stored from previous request, requires maintenance of a history of replies. Loss of replies: request - reply - acknowledge reply (RRA) protocol Message duplication: return request ID with reply. IS473 - Chapter (4) Dr. Almetwally Mostafa
169
Client-server Communication Over TCP Streams
Used if successive requests and replies between the same client-server pair are sent over the same stream. Reasons: Avoid implementing multi-packet protocols. Reduce the connection overhead to every remote invocation. Allow transmit remote methods arguments and results of any size. Ensure reliable delivery of request and reply messages. Avoid the need to deal with messages retransmission, duplicates filtering, and histories. Provide flow-control mechanism that allows large arguments and results to be passed without special measures. IS473 - Chapter (4) Dr. Almetwally Mostafa
170
Client-server Communication Hypertext Transfer Protocol (HTTP)
A lightweight request-reply protocol for the exchange of network resources between web browser clients and web servers in the following steps: Connection establishment between client and server (likely TCP, but any reliable transport protocol is acceptable) Client sends request Server sends reply Connection closure The need to establish and close a connection for every request-reply exchange is inefficient scheme, therefore HTTP 1.1 allows persistent transport connections (remains open for successive request/reply pairs) Resources are supplied as MIME-like structures (e.g. text/plain, text/html, image/gif and image/jpeg). Requests and replies are marshalled into ASCII text strings. IS473 - Chapter (4) Dr. Almetwally Mostafa
171
Client-server Communication Hypertext Transfer Protocol (HTTP)
Each client request specifies the URL of a server resource and a method name applied to it. The methods include the following: GET: request of resource, identified by URL, may refer to data: server returns identified data. program: server runs program and returns output data. HEAD: request similar like GET, but only meta data on resource is returned (like date of last modification). POST: specifies resource (for instance, a server program) that can deal with the client data provided with previous request. PUT: supplied data to be stored in the given URL. DELETE: delete an identified resource on server. OPTIONS: request a list of methods can be applied to the given URL. IS473 - Chapter (4) Dr. Almetwally Mostafa
172
Client-server Communication Hypertext Transfer Protocol (HTTP)
A request message specifies the method name, the URL of resource, the protocol version, some headers and an optional message body (not needed in case of data resource). A reply message specifies the protocol version, a status code and reason, some header and an optional message body. The status code and reason provide an report on the success otherwise in carrying out the request. method URL or pathname HTTP version headers message body GET // HTTP/ 1.1 HTTP version status code reason headers message body HTTP/1.1 200 OK resource data IS473 - Chapter (4) Dr. Almetwally Mostafa
173
Group Communication Multicasting: operation to send a single message from one process to each of the members of a predefined group of processes transparently. Provide a useful infrastructure for constructing distributed systems with the following characteristics: Replicated services: client requests are multicast to a group of servers, each of them performs an identical operation. Clients can still be served when some of the group members fail. Finding services and their interfaces. Better performance: data are replicated to increase the performance of a service. Propagation of event notification: multicast to notify processes when something happens (e.g. newsgroups). IS473 - Chapter (4) Dr. Almetwally Mostafa
174
Group Communication IP multicast is built on top of IP protocol, and is designed for communication between groups of hosts (not ports at this level of the protocol stack). A specific class of IP addresses (class D) is reserved for multicast groups, on a temporary or a permanent basis. IP multicast is available only via UDP protocol: Datagrams are sent to a multicast IP address and ordinary port numbers. When a multicast message arrives at a host, copies are forwarded to all the local sockets that have joined the specified multicast address and are bound to the specified port number. Java API provides a datagram interface to IP multicast through the subclass MulticastSocket of the class DatagramSocket with the additional capabilities of being able to join multicast groups. IS473 - Chapter (4) Dr. Almetwally Mostafa
175
IS473 Distributed Systems
CHAPTER 5 Distributed Objects & Remote Invocation
176
OUTLINE Applications This RMI, RPC, and events chapter
request-reply protocol Middleware layers marshalling and external data representation UDP and TCP Dr. Almetwally Mostafa
177
OUTLINE Distributed Programming Models. Distributed Module Interfaces.
Distributed Objects Model. Remote Procedure Call. Events and Notifications. Java RMI Case Study. Dr. Almetwally Mostafa
178
Distributed Programming Models
A distributed programming model is as an abstraction layer above communication protocols details. Different Programming models provided by middleware to support distributed applications include: Remote Procedure Call (RPC): Client programs call procedures in server programs running in separate remote hosts but in the same way as local procedure call. Remote Method Invocation (RMI): Client objects may invoke methods of remote objects residing in another process running on another host also in the same way as local method invocation. Event-based processing and event notification: The behavior of the system is driven by events, which represent local state changes within objects. Objects receive notifications of events at other objects in which they have registered interest. Dr. Almetwally Mostafa
179
Distributed Programming Models
Middleware is the provision of transparency and heterogeneity challenges: Location transparency: RMI and RPC invoked without knowledge of the location of invoked method/procedure. Transport protocol transparency: e.g., request/reply protocol used to implement RPC can use either transport protocols (UDP or TCP). Transparency of computer hardware and operating system: e.g., use of external data representations. Transparency of programming language used e.g., by use of programming language independent Interface Definition Languages, such as CORBA IDL. Dr. Almetwally Mostafa
180
Distributed Module Interfaces
Allow interaction between modules of distributed programs. Specify procedures and variables that can be accessed from other modules. Modules are implemented to hide all information about them except the available through their interfaces. Module implementation can be changed without affecting its users. Specifics for interfaces in distributed systems: No direct access to remote variables: Use message passing mechanism to transmit data variables. Local parameter passing mechanisms (by value, by reference) not applicable to remote invocations. Specify input or output as parameters attribute. Input parameters are transmitted with request message and output parameters are transmitted with reply message. Pointers are not valid in remote parameter passing. Dr. Almetwally Mostafa
181
Distributed Module Interfaces
RPC and RMI interfaces are often seen as: Service interface (in client/server model for RPC) Each server provides a set of procedures available for use by clients Defining the types of input and output arguments for each procedure. Remote interface (in distributed object model for RMI) Specify the methods of an object that can be invoked by objects in other processes. Defining the types of input and output arguments for each method. Methods can receive objects or object references (not pointers) as arguments and may return objects as results. Dr. Almetwally Mostafa
182
Distributed Interface Definition
Language-specific approach: The RMI/RPC mechanism is integrated with a particular programming language if its notation can be used to define the interfaces and deal with input and output parameters. Allows the distributed applications programmers to use a single language for local and remote invocation. Example: Java RMI. Interface definition language approach: Allows objects implemented in different languages to invoke useful objects implemented in one another. Uses a general IDL notation for defining interfaces including input and output parameters with their types. Example: CORBA IDL. Dr. Almetwally Mostafa
183
Distributed Objects Model
The logical partition of object-based programs into objects is naturally extended by the physical distribution of objects into different processes or computers in a distributed system. Distributed object systems adopt the client-server architecture: Objects are managed by servers and their clients invoke their methods using remote method invocation. The client’s RMI request is sent in a message to the server managing the invoked method object. The method of the object at the server is executed and its result is returned to the client in another message. Objects in servers are allowed to become clients of objects in other servers. Distributed objects can be replicated and migrated to obtain the benefits of fault tolerance, enhanced performance and availability. Having client and server objects in different processes enforces encapsulation due to concurrency and heterogeneity of RMI calls. Dr. Almetwally Mostafa
184
Distributed Objects Model
invocation remote local A B C D E F Remote and local method invocations Dr. Almetwally Mostafa
185
Distributed Objects Model
Each process contains a collection of objects, some of which can receive both local and remote invocations and others can receive only local invocations. Objects that can receive remote invocations are called remote objects. Other objects need to know the remote object reference in another process in order to invoke its methods. A remote object reference is an identifier that can be used through a distributed system to refer to a particular unique remote object. Every remote object has a remote interface to specify which of its methods can be invoked remotely. Objects in other processes can invoke only the methods that belong to the remote interface of the remote object. Local objects can invoke the methods in the remote interface as well as other methods of the remote object. Dr. Almetwally Mostafa
186
Distributed Objects Model
remote object Data m4 { m1 implementation m5 m2 remote m6 of methods m3 interface A remote object and its remote interface Dr. Almetwally Mostafa
187
Distributed Objects Model Design Issues
Although local invocations are executed exactly once, remote invocations cannot achieve this. Why not? RMI usually applied via request/reply protocol which can suffer from many types of failure: Message omission and duplication can occur if implemented over UDP. Process failures (crash or arbitrary) are also possible. Therefore, different choices of fault-tolerance measures can be applied: Retransmit request message until a reply is received or the server is assumed to be failed Filter out duplicate requests at the server if retransmit is used. Re-execute the operation when retransmitted request arrives or result retransmission (requires keeping a history at the server). Combinations of fault-tolerance choices lead to a variety of invocation semantics for the reliability of remote invocation. Dr. Almetwally Mostafa
188
Distributed Objects Model Design Issues
Fault tolerance measures Invocation semantics Retransmit request message Duplicate filtering Re-execute procedure or retransmit reply No Yes Not applicable Re-execute operation Retransmit reply At-most-once At-least-once Maybe Invocation semantics Dr. Almetwally Mostafa
189
Distributed Objects Model Design Issues
Maybe invocation semantics: No fault-tolerance measure is applied. The invoker cannot tell if a remote method has been executed or not. Suffer from the following types of failure: Omission failures if invocation or result message is lost. Crash failures if the server containing the remote object fails. Useful only for application accepting occasional failed invocations. At-least-once invocation semantics: Uses only retransmission of request messages – masks omission failures. The invoker receives either a result (after at least one execution) or an exception informing no result was received. Crash failure if the server containing the remote object fails. Arbitrary failures when re-execute the method more than once which causing wrong values to stored or returned (non-idempotent operations). Applicable only if all the remote methods are idempotent operations. Dr. Almetwally Mostafa
190
Distributed Objects Model Design Issues
At-most-once invocation semantics: Uses all the fault-tolerance measures. The invoker receives either a result (after exactly one execution) or an exception informing no result was received. Prevents arbitrary failures by ensuring that a method is never executed more than once for the same RMI. CORBA and Java uses at-most-once invocation semantics. COBRA allows maybe invocation semantics to be requested for methods without results. Sun RPC provides at-last-once call semantics. Dr. Almetwally Mostafa
191
Distributed Objects Model RMI Implementation
Several separate objects and modules are involved to achieve a remote method invocation: Communication modules: Responsible for providing a specified invocation semantics. Carry out the request-reply protocol to transmit request and reply messages between the cooperating server and client. Use only the first three items of the request and reply messages: message type, request id, and the invoked remote object reference. Select the dispatcher software for the invoked object class in the server. Dr. Almetwally Mostafa
192
Distributed Objects Model RMI Implementation
server client remote object A proxy for B skeleton object B Request & dispatcher for B’s class Reply Communication Communication Remote reference Remote module reference module module module The proxy and skeleton role in remote method invocation Dr. Almetwally Mostafa
193
Distributed Objects Model RMI Implementation
Remote reference modules: Translate between local and remote object references and create remote object references. Have a remote object table in each process to support the translation. An entry for all the remote objects held by the server in its table. An entry for each local proxy held by the client in its table. Create a remote object reference and add it to the table when a remote object is passed in a request or a replay message for the first time. Translate the remote object reference to the corresponding local object reference which refer either to a remote object (in the server) or to a proxy (in the client). Dr. Almetwally Mostafa
194
Distributed Objects Model RMI Implementation
RMI Software: Proxy: Make remote method invocation transparent to clients by behaving like a local object to the invoker. Forward the call in a message to the remote object and hiding all in-between operations (send, receive, marshalling, and unmarshalling) from the client. A client has one proxy for each remote object to hold its remote reference and implement the methods in its remote interface. Marshal a reference to the target object, its own methodId and its arguments into a request message and send it to the target then await the reply message to unmarshal it and return the result to the invoker. Dispatcher: A server has one dispatcher for each class representing a remote object. Receive the request message from the communication module and use the methodId to select the appropriate method in the skeleton and passing on the request message. Dr. Almetwally Mostafa
195
Distributed Objects Model RMI Implementation
RMI Software: (cont.) Skeleton: Each class representing a remote object has a skeleton in the server to implement each method in the remote interface. Unmarshal the argument in the request message and invoke the corresponding method in the remote object. Await the invocation to complete to marshal the result in the reply message to the client proxy’s method. The classes of the proxy, skeleton, and dispatcher used in RMI are generated automatically by an interface complier. The server program contains the classes for the dispatchers and skeletons together with the implementations of the classes of all remote objects that it supports – servant classes. The client program contains the classes of the proxies for all the remote objects that it will invoke and use a binder to look up remote object references. Dr. Almetwally Mostafa
196
Distributed Objects Model Distributed Garbage Collection
Free resources occupied by objects that have no more reference to it in the system. Work in cooperation with the local garbage collector in each process in the system. Java Distributed Garbage Collection algorithm: Each server maintains a list of clients that hold remote object references for each of its remote objects. When a client C first receives a remote reference to a particular remote object X, it makes an addRef(X) invocation to the server of that remote object and then the server adds C to the list of object X. When a client C’s garbage collector notices that a proxy for a remote object X is no longer referenced, it sends removeRef(X) invocation to the corresponding server which deletes C from the list of object X. When the list of a remote object X is empty and there are no local references to it, the server’s local garbage collector collect resources held by the object X. Dr. Almetwally Mostafa
197
Distributed Objects Model Distributed Garbage Collection
Java algorithm fault tolerance: Carried out by request/reply communication with at-most-once invocation semantics between the remote reference modules in processes. Toleration of communication failures: addRef and removeRef are indempotent. If addRef(X) returns exception, then client will not instantiate proxy but will send removeRef(X). Effect of removeRef(X) is correct no matter whether addRef(X) succeeded or not. Failure of removeRef(X) is masked by the use of leases. Leases Servers lease their objects to clients for limited periods of time, starting with addRef. Helps tolerate failures of removeRef as well as client failures. Avoids problem of having to maintain lists of remote object references if all object references are handled through leases. Dr. Almetwally Mostafa
198
Remote Procedure Call (RPC)
Very similar to a remote method invocation A client process calls a procedure in a server process. Servers may be clients of other servers to allow chains of RPCs. Implemented to have one of the choices of invocation semantics. Implemented over a request-reply protocol . The contents of request and reply messages are the same as RMI except that the object reference field is omitted. The supported software is similar except that no remote reference modules are required and client proxies and server skeletons are replaced by client and server stub procedures. The service interface of the server defines the procedures that are available for calling remotely. The client and server stub procedures and the dispatcher are generated by an interface compiler from the definition of the service interface. Dr. Almetwally Mostafa
199
Remote Procedure Call (RPC)
client process server process Request Reply client stub server stub procedure procedure client service program Communication Communication procedure module module dispatcher Role of client and server stub procedures in RPC Dr. Almetwally Mostafa
200
Events and Notifications
Distributed event-based systems allowing multiple objects at different locations to be notified of an event occurrence at a particular object. The publish-subscribe paradigm is used: An object generating events publishes the list of events available for observation by other objects. Objects requiring notifications subscribes to the notification service at an object offering notification for this particular event through its publication list. When a publisher experiences an event, subscribers that expressed an interest in that type of event will receive notifications. Events and notifications can be used in a wide variety of different applications: A piece of equipment or an electronically tagged book is at new location. A client has entered participation in a collaborative work environment. An electronic document has been modified. Dr. Almetwally Mostafa
201
Events and Notifications Dealing Room System
Allow dealers using computer to see the latest information about the market prices of the stocks. The market price for a single named stock is represented by an object with several instance variables. The information is arrived from different external sources in the form of updates to some or all instance variables of the stock objects and collected by information providers. Dealers are interested only in their specialist stocks. Modeled by processes with two different tasks: An information provider process continuously receives information from a single external source to apply it to the appropriate stock objects as an event and notifies all the dealers subscribed to the corresponding stock. A dealer process creates a local object to represent each named stock to subscribe the stock object at the relevant information provider and receive all information sent in notifications to display it to the user. Dr. Almetwally Mostafa
202
Events and Notifications Dealing Room System
Dealer’s computer Information provider Dealer External source Notification Dr. Almetwally Mostafa
203
Events and Notifications Event Types
An event source can generate events of one or more different types. Each event has attributes to specify information about that event: the name of object generated it, the operation, its parameters and the time. Types and attributes are used in subscribing to events and in notifications. Whenever an event of a specific type that matches the attributes occurs, the interested parties will be notified. In dealing room system example: There is one type of event – the arrival of an update to a stock. The attributes might specify the name of a stock, its current price, and latest rice or fall. Dealers may specify the interesting in all the events of a particular stock. Dr. Almetwally Mostafa
204
IS473 Distributed Systems
CHAPTER 6 Operating System Support
205
Computer and network hardware
OUTLINE Applications Computer and network hardware Platform Operating system Middleware Dr. Almetwally Mostafa
206
OUTLINE Distributed Operating System. Operating System Layer.
Processes and Threads. Communication and Invocation. Operating System Architecture. Dr. Almetwally Mostafa
207
Network and Distributed OS
Network operating system: Have networking capability to access remote resources. Retain autonomy (استقلال) in managing own resources. Remote resource access not always transparent. Separate system image on each node. Distributed operating system: Single system image across multiple nodes. Resource access completely transparent. Not practically in use: Compatibility with existing applications. Emulations offer very bad performance. Users prefer a degree of autonomy for their machines. Middleware and network operating systems combination provides an acceptable balance between the requirement of autonomy and network-transparent resource access. Dr. Almetwally Mostafa
208
Operating System Layer
Applications, services Computer & Platform Middleware OS: kernel, libraries & servers network hardware OS1 Node 1 Node 2 Processes, threads, communication, ... OS2 System layers Dr. Almetwally Mostafa
209
Operating System Layer
Users satisfaction is achieved if middleware-OS combination has good performance. OS running at a node provides its own abstractions of local hardware resources for processing, storage and communication. Middleware utilizes a combination of local resources to implement its mechanisms for remote invocations between objects or processes. OSs provide support for middleware layer to work effectively: Encapsulation: provide transparent service interface to resources of the computer. Protection: protect resources from illegitimate access. Concurrent processing: users/clients may share resources and access concurrently. Provide the resources needed for (distributed) services and applications to complete their task: Communication and scheduling. Dr. Almetwally Mostafa
210
Operating System Layer
Communication manager Thread manager Memory manager Supervisor Process manager Core OS functionality Dr. Almetwally Mostafa
211
Operating System Layer
The core OS components include the following: Process manager: Handles the creation of and operation upon process. Thread manager: Handles thread creation, synchronization and scheduling. Communication manager: Handles communication between threads attached to different processes on the same computer. Some OSs support communication between threads in remote processes. Memory manager: Manages physical and virtual memory. Supervisor: Dispatches interrupts, system call traps and other exceptions. Dr. Almetwally Mostafa
212
Processes and Threads Process (program in execution):
Unit of resource management for operating system. Execution environment: Address space. Thread synchronization and communication resources (e.g. semaphores). Computing resources (file systems, windows, etc.) Expensive to create and manage. Threads (lightweight process): Schedulable activities attached to processes. Arise from the need for concurrent activities to share resources within one process. Enable to overlap computation with input and output. Allow concurrent processing of client requests in servers – each request handled by one thread. Easier to create and destroy. Dr. Almetwally Mostafa
213
Processes and Threads Address Spaces
A unit of management a process’s virtual memory. Large and consists of one or more regions separated by inaccessible area of virtual memory to allow growth. A region is an area of contiguous virtual memory accessible by the threads of the owning process. Each region is specified by: Lowest virtual address and size. Read/write/execute permissions for the process’s threads. Whether can be grown upwards or downwards. Gaps are left between regions to allow for growth and regions can be overlapped when extended in size. Data files can be mapped into the address space as an array of bytes in memory. Dr. Almetwally Mostafa
214
Processes and Threads Address Spaces
There are at least three main regions: Text: a fixed unmodifiable region containing program code. Heap: extensible region initialized by values stored in the program binary file. Stack: downward extensible region used by subroutines. The need to support a separate stack for each thread is the main reason of an indefinite number of regions. Shared regions are regions of virtual memory mapped to identical physical memory for different processes to enable inter-process communication. Dr. Almetwally Mostafa
215
Processes and Threads New Process Creation
An indivisible operation provided by the operating system. The UNIX fork system call creates a process with an execution environment copied from the caller. But, the creation of a new process in a distributed system can be separated into two independent aspects: The choice of a target host. The creation of an execution environment. Choice of process host: Determine the node at which the new process will reside according to transfer and location policies for sharing the processing load: The transfer policy determines whether to situate a new process locally or remotely. The location policy determines which node should host a new process. Dr. Almetwally Mostafa
216
Processes and Threads New Process Creation
Choice of process host (cont.): Location polices may be static or adaptive: Static location policies operate without regard to the current state of the system based on a mathematical analysis aimed at optimizing the all system and may be deterministic or probabilistic. Adaptive location polices apply heuristics to make the decision based on unpredictable run-time factors on each node. Load sharing system may be centralized, hierarchical or decentralized: One manger component take the decision in the centralized system. There are several mangers organized in a tree structure in hierarchical system and each manger makes the decisions as far down the tree. Nodes in the decentralized system exchange information with one another directly to make allocation decisions using: Sender-initiated algorithm: the node requires creating a new process is responsible for initiating the transfer decisions. Receiver-initiated algorithm: the node with relatively low load advertises its existence other nodes to transfer work to it. Dr. Almetwally Mostafa
217
Processes and Threads New Process Creation
Creation of a new execution environment: There are two approaches to defining and initiating the address space of a new created process: The address space is of statically defined format and initialized with zeros. The address space is defined with respect to an existing execution environment. In case of UNIX fork, the newly created child process share the parent’s text region and has its own heap and stack regions. Copy-on-write approach: A general approach of inheriting all regions of the parent process by the child process. An inherited region is logically copied from the parent’s region by sharing its frame between the two address spaces. A page in a region is physically copied when one or other process attempts to modify it. Dr. Almetwally Mostafa
218
Processes and Threads New Process Creation
a) Before write b) After write Shared frame A's page table B's page Process A’s address space Process B’s address space Kernel RA RB RB copied from RA Dr. Almetwally Mostafa
219
Processes and Threads Threads Performance
Consider the server has a pool of one or more threads. Each thread removes a client request from a queue of received requests and process it. Example: (how multi-threading maximize the server throughput) Request processing: 2 ms I/O delay (no caching): 8 ms Single thread: 10 ms per requests, 100 requests per second. Two threads (no caching): 8 ms per request, 125 requests per second two threads and caching: 75% hit rate mean I/O time per request: 0.75 * * 8ms = 2 ms 500 requests per second increased processing time per request as a result of caching : 2.5 ms 400 requests per second Dr. Almetwally Mostafa
220
Processes and Threads Threads Performance
Server N threads Input-output Client Thread 2 makes T1 Thread 1 requests to server generates results Requests Receipt & queuing Client and server with threads Dr. Almetwally Mostafa
221
Processes and Threads Multi-threaded Server Architectures
There are various ways to mapping requests to threads within a server. The threading architectures of various implementations are: Worker pool architecture: Pool of server threads serves requests in queue. Possible to maintain priorities per queue. Thread-per-request architecture: Thread lives only for the duration of request handling. Maximizes throughput (no queueing). Expensive overhead for thread creation and destruction. Thread-per-connection/per-object architecture: Compromise solution. No overhead for creation and deletion of threads. Requests may still block, hence throughput is not maximal. Dr. Almetwally Mostafa
222
Processes and Threads Multi-threaded Server Architectures
per-connection threads per-object threads workers I/O remote I/O remote remote objects objects objects a. Thread-per-request b. Thread-per-connection c. Thread-per-object Alternative server threading architectures Dr. Almetwally Mostafa
223
Processes and Threads Threads vs. Multiple Processes
Why the multi-threaded process model is preferred than multiple single-threaded processes? Creating a new thread within an existing process is cheaper than creating a process (~10-20 times) New process under Unix: 11ms, new thread under Topaz: 1 ms Switching to a different thread within the same process is cheaper than switching between threads in different processes (~5-50 times). Process switch in Unix: 1.8ms, thread switch in Topaz: 0.4 ms. Threads within a process can share data and other resources more conveniently and efficiently (without copying or messages). No need for message passing. Communication via shared memory. Threads within a process are not protected from each other. One thread can access other thread's data, unless a type-safe programming language is being used Dr. Almetwally Mostafa
224
Processes and Threads Threads Programming
Some languages provided direct support for threads concurrent programming (e.g. C, Ada95, Modula-3 and Java). Java provides Thread class that includes the following methods for creating, destroying and synchronizing threads: Thread(ThreadGroup group, Runnable target, String name) Creates a new thread, in the SUSPENDED state, belong to group and be identified as name; the thread will execute the run() method of target. setPriority(int newPriority), getPriority() - Set and return the thread’s priority. run() - A thread executes the run() method of its target object, if it has one, and otherwise its own run() method. start() - Change the state of the thread from SUSPENDED to RUNNABLE. sleep(int millisecs) - Cause the thread to enter the SUSPENDED state for the specified time. destroy() - Destroy the thread. Dr. Almetwally Mostafa
225
Processes and Threads Java Thread Lifetimes
A new thread is created in the SUSPENDED state on the same Java Virtual Machine (JVM) as its creator. A thread executes run() method after it is made RUNABLE with the start() method. Threads can be assigned a priority and Java implementations will run a particular thread in preference to any thread with lower priority. A thread ends its life when it returns from the run() method or when its destroy() method is called. Programs can manage threads in groups: Thread group is assigned at the time of its creation. thread groups useful to shield various applications running in parallel on one JVM. A thread in one group may not interrupt thread in another group. Thread group facilitates control of the relative priorities of threads. Dr. Almetwally Mostafa
226
Processes and Threads Java Thread Synchronization
Each thread’s local variables in methods are private to it. An object can have synchronized and non-synchronized methods. Example: Synchronized addTo() and removeFrom() methods to serialize requests in worker pool example. Any object can only be accessed through one invocation of any of its synchronized methods. Threads can be blocked and woken up via condition variables: Thread awaiting a certain condition calls an object’s wait() method. Other thread calls notify() or notifyAll() to awake one or all blocked threads. Example: When worker thread discovers no requests to be processed calls wait() on instance of Queue. When I/O thread adds request to queue calls notify() method of queue to wake up worker. Dr. Almetwally Mostafa
227
Processes and Threads Java Thread Scheduling
A special yield() method is used to enable scheduling of threads to make progress. There are two types of scheduling threads: Preemptive scheduling threads A thread may be suspended at any point to make away for another thread. Non-preemptive scheduling threads A thread runs until makes a call to the threading system to de-schedule it and schedule another thread to run. A code section without a threading system call is automatically a critical section. Run exclusively and therefore it can not take advantage of a multiprocessor. Must take care long-running code sections that do not contain threading system calls. Unsuitable for real-time applications processed in absolute times. Dr. Almetwally Mostafa
228
Processes and Threads Threads Implementation
Many operating system kernels provide support for multi-threaded processes (e.g. Windows NT, Solaris, and Mach). Provide thread creation, management system calls and scheduling individual threads. Other operating systems have only a single-threaded process abstraction. Multi-threaded processes are implemented by linking a user library of procedures to application programs. Suffer from the following problems: Threads within a process can not take advantage of a multiprocessor. A thread that takes a page fault blocks the entire process and all its threads. Threads within different process can not scheduled according to a single schema of relative prioritization. But have significant advantages: Operations of thread creation are significantly less costly. Allow customizing the thread scheduling module and support more user-level threads to suit particular application requirements. Dr. Almetwally Mostafa
229
Processes and Threads Threads Implementation
The advantages of user-level and kernel-level threads implementations can be combined: Mach OS enable user-level code to provide scheduling hints to the kernel’s thread scheduler. Solaris 2 adopts hierarchical scheduling that supports both kernel-level and user-level threads. A user-level scheduler assigns each user-level thread to a kernel-level thread. Take the advantage of a multiprocessor. Disadvantage: still lakes flexibility If a kernel-level thread is blocked, then all user-level threads assigned to it are also prevented from running. Several research projects have developed hierarchical scheduling further to provide greater efficiency and flexibility: FastThreads implementation of a hierarchic event-based scheduling system: Consider the main system components is a kernel running on a computer with one or more processors and a set of application programs running on it. Each application process contains a user-level scheduler to manage its threads. The kernel is responsible for allocating virtual processors to processes. Dr. Almetwally Mostafa
230
Processes and Threads Threads Implementation
P added Process Process A B SA preempted Process SA unblocked SA blocked Kernel P idle Kernel Virtual processors P needed A. Assignment of virtual processors B. Events between user-level scheduler & kernel to processes Key: P = processor; SA = scheduler activation Scheduler activations Dr. Almetwally Mostafa
231
Communication and Invocation Invocation Performance
The performance of RPC and RMI mechanisms is a critical factor for effective distributed systems. Clients and servers may make many millions of invocation-related operations in their lifetimes. Software overheads often predominate over network overheads in invocation times. Invocation times have not decreased in proportion with increases in network bandwidth. Each invocation mechanism executes a code out of the calling procedure or object scope and involves the arguments communication to the code and the data values return to the caller. The important performance-related distinctions between invocation mechanisms: Whether they are synchronous or asynchronous. Whether they involve a domain transition. Whether they involve communication across a network. Dr. Almetwally Mostafa
232
Communication and Invocation Invocation Performance
Control transfer via (a) System call Thread trap instruction Control transfer via privileged instructions User Kernel Protection domain boundary (b) RPC/RMI (within one computer) Thread 1 Thread 2 User 1 Kernel User 2 (c) RPC/RMI (between computers) Thread 1 Network Thread 2 User 1 User 2 Kernel 1 Kernel 2 Dr. Almetwally Mostafa
233
Communication and Invocation Invocation Performance
A null invocation is an RPC (or RMI) without parameters that execute a null procedure and returns no values. Important to measure a fixed overhead, the latency. Execution time for a null procedure call: Local procedure call < 1 microseconds Remote procedure call ~ 10 milliseconds Much of the delay taken by the actions of the operating system kernel and middleware. Network time involving about 100 bytes (null invocation size) transferred at 100 megabits/sec. accounts for only .01 millisecond. Factors affecting remote invocation performance: Marshalling/unmarshalling + operation dispatch at the server Data copying:- application -> kernel space -> communication buffers Thread scheduling and context switching Protocol processing:- for each protocol layer Network access delays:- connection setup, network latency 10,000 times slower! Dr. Almetwally Mostafa
234
Communication and Invocation Invocation Performance
Shared regions may be used for rapid communication between a user process and the kernel or between user processes. Data is communicated by writing to and reading from the shared region without coping them to and from the kernel address spaces. The delay of a client experiences during request-replay interactions over TCP is not necessarily worse than UDP and it is sometimes better for large messages. The operating system default buffering can be used to collect several small massages and then send them together rather than sending them in separate packets. Develop a more efficient invocation mechanism for the case of two processes on the same computer, lightweight RPC (LRPC): Based on optimization concerning data coping and thread scheduling. Use shared regions for client-server communication with a different private region (A stack) between the server and each of its local clients. Each client and the server are able to pass arguments and return values directly via an A stack. Dr. Almetwally Mostafa
235
Communication and Invocation Invocation Performance
1. Copy args 2. Trap to Kernel 4. Execute procedure and copy results Client User stub Server Kernel 3. Upcall 5. Return (trap) A A stack A lightweight remote procedure call Dr. Almetwally Mostafa
Similar presentations
© 2025 SlidePlayer.com Inc.
All rights reserved.