Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 DISTRIBUTED SYSTEMS.


1 Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 5 Naming

2 Chapter 5 Naming
Objectives:
- NAMES, IDENTIFIERS, AND ADDRESSES
- FLAT NAMING
- STRUCTURED NAMING
- ATTRIBUTE-BASED NAMING

3 NAMES, IDENTIFIERS, AND ADDRESSES
- A name in a distributed system is a string of bits or characters that is used to refer to an entity.
- An entity in a distributed system can be practically anything. Typical examples include resources such as hosts, printers, disks, and files.
- Other examples of entities that are often explicitly named are processes, users, mailboxes, newsgroups, and Web pages.
- To operate on an entity, it is necessary to access it, for which we need an access point.
- An access point is another, special kind of entity in a distributed system. The name of an access point is called an address.

4 NAMES, IDENTIFIERS, AND ADDRESSES
- An entity can offer more than one access point. For example, a telephone can be viewed as an access point of a person, whereas the telephone number corresponds to an address.
- An entity may change its access points over time. For example, when a mobile computer moves to another location, it is assigned a different IP address than the one it had before.

5 NAMES, IDENTIFIERS, AND ADDRESSES
- An entity may easily change an access point, and an access point may be reassigned to a different entity.
- A name for an entity that is independent of its addresses is therefore often much easier and more flexible to use. Such a name is called location independent.

6 Names, Identifiers, And Addresses
Identifiers uniquely represent entities. Properties of a true identifier:
- An identifier refers to at most one entity.
- Each entity is referred to by at most one identifier.
- An identifier always refers to the same entity.

7 Names, Identifiers, And Addresses
- Addresses and identifiers are often represented in machine-readable form only, as bit strings. For example, an Ethernet address is essentially a random string of 48 bits. Likewise, memory addresses are typically represented as 32-bit or 64-bit strings.
- Human-friendly names are generally represented as character strings. For example, files in UNIX systems have character-string names that can be as long as 255 characters, and which are defined entirely by the user.

8 Names, Identifiers, And Addresses
- How do we resolve names and identifiers to addresses?
- In the simplest case, a naming system maintains a name-to-address binding as a table of (name, address) pairs.
- However, in distributed systems that span large networks and for which many resources need to be named, a centralized table is not going to work.
- Instead, a name is decomposed into several parts, and name resolution takes place through a recursive lookup of those parts.
- For example, a client needing to know the address of the FTP server named ftp.cs.vu.nl resolves it step by step: NS(.) → NS(nl) → NS(vu.nl) → address of ftp.cs.vu.nl

9 FLAT NAMING
Problem:
- Identifiers are often simply random bit strings, referred to as unstructured or flat names.
- A flat name does not contain any information whatsoever on how to locate the access point of its associated entity.
- Given an essentially unstructured name (e.g., an identifier), how can we locate its associated access point?
Solutions:
- Simple solutions (broadcasting, forwarding pointers)
- Home-based approaches
- Distributed hash tables (structured P2P)
- Hierarchical location service

10 Simple solutions (broadcasting)
- Broadcasting and multicasting are simple solutions that are applicable only to local-area networks.
- Broadcasting becomes inefficient when the network grows: too many hosts are interrupted by requests they cannot answer.
- An alternative is to switch to multicasting, by which only a restricted group of hosts receives the request.
- Hosts join a specific multicast group. Such groups are identified by a multicast address.

11 Simple solutions (Forwarding Pointers)
- Another approach to locating mobile entities is to make use of forwarding pointers.
- When an entity moves, it leaves behind a pointer to its next location.
- The main advantage of this approach is its simplicity.
Drawbacks:
- A chain can become so long that locating the entity is prohibitively expensive.
- All intermediate locations in a chain have to maintain their part of the chain of forwarding pointers.
- Broken links: as soon as any forwarding pointer is lost, the entity can no longer be reached.

12 Forwarding Pointers (1) Figure 5-1. The principle of forwarding pointers using (client stub, server stub) pairs.

13 Forwarding Pointers (2) Figure 5-2. Redirecting a forwarding pointer by storing a shortcut in a client stub.

14 Forwarding Pointers (3) Figure 5-2. Redirecting a forwarding pointer by storing a shortcut in a client stub.
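
A minimal Python sketch of following a chain of forwarding pointers and then installing a shortcut. The location names and the dictionary representation are illustrative assumptions; this is not the (client stub, server stub) mechanism shown in the figures, only the idea behind it.

```python
def locate(start, pointers, entity_at):
    """Follow forwarding pointers from `start` until the entity is found.

    pointers:  dict mapping a location to the next location in the chain
               (hypothetical representation of the pointers left behind).
    entity_at: the location where the entity currently resides.
    Returns the list of locations visited, ending at the entity.
    """
    path = [start]
    here = start
    while here != entity_at:
        if here not in pointers:
            # A lost pointer breaks the chain: the entity is unreachable.
            raise LookupError(f"broken chain at {here!r}")
        here = pointers[here]
        if here in path:
            raise LookupError(f"cycle detected at {here!r}")
        path.append(here)
    return path


def install_shortcut(start, pointers, entity_at):
    """After a successful lookup, point `start` directly at the entity."""
    path = locate(start, pointers, entity_at)
    if len(path) > 2:
        pointers[start] = entity_at  # skip all intermediate locations
    return path
```

With pointers A to B, B to C, and C to D, the first lookup from A visits four locations; after the shortcut is installed, the same lookup visits only A and D. Removing any pointer from the chain makes `locate` fail, which mirrors the broken-link drawback.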

15 Home-Based Approaches
Problems with simple solutions:
- Broadcasting and forwarding pointers both impose scalability problems.
- Broadcasting or multicasting is difficult to implement efficiently in large-scale networks.
- Long chains of forwarding pointers introduce performance problems and are susceptible to broken links.
Another solution:
- To support mobile entities in large-scale networks, introduce a home location that keeps track of the current location of an entity.
- The home location is often chosen to be the place where the entity was created.

16 Home-Based Approaches
- Each mobile host uses a fixed IP address.
- All communication to that IP address is initially directed to the mobile host's home agent.
- Whenever the mobile host moves to another network, it requests a temporary address that it can use for communication. This care-of address is registered at the home agent.

17 Home-Based Approaches Figure 5-3. The principle of Mobile IP.
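
The bookkeeping done by a home agent can be sketched in a few lines of Python. The class name, method names, and addresses are hypothetical and chosen for illustration; real Mobile IP also involves tunneling and registration messages that are left out here.

```python
class HomeAgent:
    """Tracks the current care-of address for each fixed home address."""

    def __init__(self):
        self.care_of = {}  # fixed home address -> current care-of address

    def register(self, home_address, care_of_address):
        # Called when the mobile host obtains a temporary address elsewhere.
        self.care_of[home_address] = care_of_address

    def deregister(self, home_address):
        # Called when the mobile host returns to its home network.
        self.care_of.pop(home_address, None)

    def deliver_to(self, home_address):
        """Return the address a packet for home_address should be sent to."""
        return self.care_of.get(home_address, home_address)
```

A sender always uses the fixed address; only the home agent needs to know where the host currently is.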

18 Home-Based Approaches
Problems with home-based approaches:
- The home address has to be supported for the entity's entire lifetime.
- The home address is fixed, which becomes an unnecessary burden when the entity permanently moves elsewhere.
- Poor geographical scalability: the entity may be right next to the client, yet the client must still contact the home location first.

19 Distributed Hash Tables
- The Chord system serves as a representative DHT-based system.
- Chord uses an m-bit identifier space to assign randomly chosen identifiers to nodes as well as keys to specific entities. The latter can be virtually anything: files, processes, etc.
- The number m of bits is usually 128 or 160, depending on which hash function is used.
- An entity with key k falls under the jurisdiction of the node with the smallest identifier id ≥ k. This node is referred to as the successor of k and denoted as succ(k).

20 Distributed Hash Tables General Mechanism Figure 5-4. Resolving key 26 from node 1 and key 12 from node 28 in a Chord system.
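
The definition of succ(k) can be sketched directly in Python. This computes the successor from a global list of node identifiers, which a real Chord node does not have (Chord routes through finger tables instead); the node set below is an assumed sample in a 5-bit identifier space, chosen so that keys 26 and 12 can be tried.

```python
from bisect import bisect_left


def succ(k, node_ids, m):
    """Return the node with the smallest identifier id >= k (mod 2**m).

    If no such node exists, the search wraps around to the smallest
    identifier on the ring.
    """
    ring = sorted(node_ids)
    i = bisect_left(ring, k % (2 ** m))
    return ring[i % len(ring)]


# Assumed sample node identifiers for an m = 5 identifier space.
NODES = [1, 4, 9, 11, 14, 18, 20, 21, 28]
```

With these sample nodes, `succ(26, NODES, 5)` is 28 and `succ(12, NODES, 5)` is 14, while a key such as 29 wraps around to node 1.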

21 Hierarchical Approaches
- The network is divided into nonoverlapping domains. Domains can be grouped into higher-level (nonoverlapping) domains, and so on.
- There is a single top-level domain that covers the entire network.
- Each domain at every level has an associated directory node.
- If an entity is located in a domain D, the directory node of the next higher-level domain has a pointer to D's directory node.
- A lowest-level (leaf) directory node stores the address of the entity.
- The top-level (root) directory node knows about all entities.

22 Hierarchical Approaches (1) Figure 5-5. Hierarchical organization of a location service into domains, each having an associated directory node.

23 Hierarchical Approaches (2) Figure 5-6. An example of storing information of an entity having two addresses in different leaf domains.

24 Hierarchical Approaches (3) Figure 5-7. Looking up a location in a hierarchically organized location service.

25 Hierarchical Approaches (4) Figure 5-8. (a) An insert request is forwarded to the first node that knows about entity E.

26 Hierarchical Approaches (5) Figure 5-8. (b) A chain of forwarding pointers to the leaf node is created.
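
A simplified Python sketch of lookup and insert in a hierarchical location service. It assumes one address per entity and always installs pointers all the way up to the root, rather than stopping at the first node that already knows the entity; the class and function names are hypothetical.

```python
class DirectoryNode:
    """Directory node of one domain; `parent` is the enclosing domain."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # entity -> child DirectoryNode, or an address at a leaf node
        self.records = {}


def insert(leaf, entity, address):
    """Store the address at the leaf and pointers on the path to the root."""
    leaf.records[entity] = address
    child, node = leaf, leaf.parent
    while node is not None:
        node.records[entity] = child
        child, node = node, node.parent


def lookup(start_leaf, entity):
    """Climb until some node knows the entity, then follow pointers down."""
    node = start_leaf
    while node is not None and entity not in node.records:
        node = node.parent
    if node is None:
        return None  # even the root does not know the entity
    record = node.records[entity]
    while isinstance(record, DirectoryNode):
        record = record.records[entity]
    return record
```

A lookup that starts in the same leaf domain as the entity never leaves that domain, which is the locality property the hierarchy is meant to provide.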

27 STRUCTURED NAMING
- Flat names are good for machines, but not very convenient for humans to use.
- As an alternative, structured names are composed from simple, human-readable names (e.g., file naming, host naming on the Internet).
- This part covers structured names and how these names are resolved to addresses.

28 Name Spaces
- Names are commonly organized into what is called a name space.
- A name space can be represented as a labeled, directed graph in which a leaf node represents a (named) entity.
- A directory node is an entity that refers to other nodes.
- A directory node contains a (directory) table of (edge label, node identifier) pairs.

29 Name Spaces (1) Figure 5-9. A general naming graph with a single root node.

30 Name Spaces (2) Figure 5-10. The general organization of the UNIX file system implementation on a logical disk of contiguous disk blocks.

31 Name Resolution
- The process of looking up a name is called name resolution.
- Given a path name, it should be possible to look up any information stored in the node referred to by that name.
- A name lookup returns the identifier of a node from where the name resolution process continues.
- Example: N:<label-1, label-2, ..., label-n>
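
Path name resolution in a naming graph can be sketched as a walk through directory tables. The graph below is a small assumed example: the node identifiers and the tables are hypothetical, and each directory node maps edge labels to node identifiers.

```python
# Directory nodes: node identifier -> table of (edge label -> node identifier).
# Nodes that do not appear here are leaf nodes.
DIRECTORY_TABLES = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n4"},
    "n4": {"keys": "n5", "mbox": "n6"},
}


def resolve_path(start, labels):
    """Resolve the path name start:<labels...> to a node identifier."""
    node = start
    for label in labels:
        table = DIRECTORY_TABLES.get(node)
        if table is None:
            raise NotADirectoryError(f"{node} is a leaf node")
        if label not in table:
            raise KeyError(f"no edge labeled {label!r} leaves {node}")
        # Each lookup returns the node from which resolution continues.
        node = table[label]
    return node
```

In this sample graph, both `resolve_path("n0", ["keys"])` and `resolve_path("n0", ["home", "steen", "keys"])` end at the same node, n5.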

32 Closure Mechanism
- Name resolution can take place only if we know how and where to start.
- Knowing how and where to start name resolution is generally referred to as a closure mechanism.
- A closure mechanism deals with selecting the initial node in a name space from which name resolution is to start.

33 Linking and Mounting
An alias is another name for the same entity. In terms of naming graphs, there are two different ways to implement an alias:
- Hard link: allow multiple absolute path names to refer to the same node in a naming graph. For example, node n5 can be referred to by two different path names: /keys and /home/steen/keys (Fig. 5-9).
- Symbolic link: represent an entity by a leaf node that, instead of storing the address or state of that entity, stores an absolute path name (Fig. 5-11).

34 Linking and Mounting Figure 5-11. The concept of a symbolic link explained in a naming graph.
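
A symbolic link can be sketched as a leaf node that stores an absolute path name, so that resolution restarts from the root when the link is reached. The tables below are an assumed example with hypothetical node identifiers; the limit on followed links is an added safeguard against link cycles.

```python
# Directory nodes: node identifier -> (edge label -> node identifier).
DIRECTORIES = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n4"},
    "n4": {"keys": "n6"},
}
# Leaf n6 stores an absolute path name instead of the entity's state.
SYMLINKS = {"n6": "/keys"}


def resolve(path, max_links=8):
    """Resolve an absolute path name, following symbolic links."""
    node = "n0"  # closure mechanism: resolution starts at the root
    for label in path.strip("/").split("/"):
        node = DIRECTORIES[node][label]
        if node in SYMLINKS:
            if max_links == 0:
                raise RecursionError("too many symbolic links")
            # Restart resolution with the stored absolute path name.
            node = resolve(SYMLINKS[node], max_links - 1)
    return node
```

Here `/home/steen/keys` reaches the link node n6 first and then resolves to n5, the same node that `/keys` names directly.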

35 Linking and Mounting
- Name resolution as described so far takes place completely within a single name space.
- Name resolution can also be used to merge different name spaces.
- In distributed systems, name spaces are distributed across different machines: each name space is implemented by a different server, each possibly running on a separate machine.

36 Linking and Mounting
Information required to mount a foreign name space in a distributed system:
- The name of an access protocol.
- The name of the server.
- The name of the mounting point in the foreign name space.

37 Linking and Mounting Figure 5-12. Mounting remote name spaces through a specific access protocol.
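
One common way to carry the three pieces of mount information is a URL. The sketch below assumes that encoding and uses Python's standard `urllib.parse.urlsplit`; the `MountInfo` type and the example URL are hypothetical.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class MountInfo(NamedTuple):
    protocol: str     # name of the access protocol
    server: str       # name of the server
    mount_point: str  # mounting point in the foreign name space


def parse_mount(url):
    """Split a URL into the three items needed to mount a name space."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname or not parts.path:
        raise ValueError(f"incomplete mount specification: {url!r}")
    return MountInfo(parts.scheme, parts.hostname, parts.path)
```

For example, `parse_mount("nfs://flits.cs.vu.nl/home/steen")` separates the protocol `nfs`, the server `flits.cs.vu.nl`, and the mounting point `/home/steen`.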

38 Implementation of a Name Space
- A name space forms the heart of a naming service, that is, a service that allows users and processes to add, remove, and look up names.
- A naming service is implemented by name servers.
- If a distributed system is restricted to a local-area network, it is often feasible to implement a naming service by means of only a single name server.
- In large-scale distributed systems with many entities, possibly spread across a large geographical area, it is necessary to distribute the implementation of a name space over multiple name servers.

39 Name Space Distribution
Distinguish three levels:
- Global level: consists of the high-level directory nodes. The main aspect is that these directory nodes have to be jointly managed by different administrations (e.g., the root and the nodes logically close to it).
- Administrational level: contains mid-level directory nodes that can be grouped in such a way that each group can be assigned to a separate administration (e.g., a single organization).
- Managerial level: consists of low-level directory nodes within a single administration. The main issue is effectively mapping directory nodes to local name servers, since these nodes may change regularly (e.g., nodes representing hosts in a LAN).

40 Name Space Distribution (1) Figure 5-13. An example partitioning of the DNS name space, including Internet-accessible files, into three layers.

41 Name Space Distribution Figure 5-14. A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer.

42 Implementation of Name Resolution
The distribution of a name space across multiple name servers affects the implementation of name resolution:
- Iterative name resolution (Figure 5-15): each name server returns its intermediate result to the client's name resolver, which then contacts the next name server.
- Recursive name resolution (Figure 5-16): instead of returning each intermediate result to the client, a name server passes the result to the next name server.
Disadvantage of recursive name resolution:
- It puts a higher performance demand on each name server.
Advantages of recursive name resolution:
- Caching results is more effective.
- Communication costs may be reduced.

43 Implementation of Name Resolution Figure 5-15. The principle of iterative name resolution.

44 Implementation of Name Resolution (2) Figure 5-16. The principle of recursive name resolution.

45 Implementation of Name Resolution Figure 5-17. Recursive name resolution of. Name servers cache intermediate results for subsequent lookups.

46 Example: The Domain Name System Figure 5-18. The comparison between recursive and iterative name resolution with respect to communication costs.
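
The difference between the two resolution styles can be sketched in Python by counting how often the client itself has to contact a name server. The server names, tables, and the address are made up for illustration, and caching is left out.

```python
# Hypothetical name servers: each maps the next label either to another
# name server or, at the last step, to an address.
SERVERS = {
    "root": {"nl": "ns-nl"},
    "ns-nl": {"vu": "ns-vu"},
    "ns-vu": {"cs": "ns-cs"},
    "ns-cs": {"ftp": "192.0.2.10"},  # documentation-only address
}


def resolve_iterative(labels):
    """The client contacts every server itself, one label at a time."""
    if not labels:
        raise ValueError("empty path name")
    server, client_round_trips = "root", 0
    for label in labels:
        # The server returns an intermediate result back to the client.
        server = SERVERS[server][label]
        client_round_trips += 1
    return server, client_round_trips


def resolve_recursive(labels):
    """The client asks the root once; servers forward the rest themselves."""
    if not labels:
        raise ValueError("empty path name")

    def forward(server, remaining):
        result = SERVERS[server][remaining[0]]
        if len(remaining) == 1:
            return result
        return forward(result, remaining[1:])  # server-to-server request

    return forward("root", labels), 1
```

Both functions return the same address for the labels `["nl", "vu", "cs", "ftp"]`, but the iterative client makes four round trips while the recursive client makes one; the remaining work shifts to the name servers.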

47 Example: The Domain Name System
- The Internet Domain Name System (DNS) is primarily used for looking up IP addresses of hosts and mail servers.
- It is one of the largest distributed naming services in use today.
Topics:
- the organization of the DNS name space
- the information stored in its nodes
- the actual implementation of DNS

48 The DNS Name Space
- The DNS name space is hierarchically organized as a rooted tree.
- A label is a case-insensitive string made up of alphanumeric characters.
- A label has a maximum length of 63 characters; the length of a complete path name is restricted to 255 characters.
- The string representation of a path name consists of listing its labels, starting with the rightmost one, and separating the labels by a dot.
- The root is represented by a dot. For example, the path name root:<nl, vu, cs, flits> is represented by the string flits.cs.vu.nl., which includes the rightmost dot to indicate the root node.
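
The naming rules above can be sketched as a small Python parser that converts the string representation back into a list of path labels. It follows the simplified rule that labels are alphanumeric; real DNS host names also permit hyphens inside labels, which this sketch deliberately rejects. The function name is hypothetical.

```python
import re

LABEL_RE = re.compile(r"[A-Za-z0-9]{1,63}")  # at most 63 characters


def parse_dns_name(name):
    """Turn 'flits.cs.vu.nl.' into its path labels, root side first."""
    if len(name) > 255:
        raise ValueError("path name longer than 255 characters")
    if not name.endswith("."):
        raise ValueError("an absolute name ends with a dot for the root")
    if name == ".":
        return []  # the root node itself
    # Labels are case insensitive, so normalize to lower case.
    labels = name[:-1].lower().split(".")
    for label in labels:
        if not LABEL_RE.fullmatch(label):
            raise ValueError(f"invalid label: {label!r}")
    # The string lists the label farthest from the root first.
    return list(reversed(labels))
```

For example, `parse_dns_name("flits.cs.vu.nl.")` yields `["nl", "vu", "cs", "flits"]`, and the same result is produced for the upper-case spelling of the name.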

49 The DNS Name Space
- Because each node in the DNS name space has exactly one incoming edge (except the root), the label attached to a node's incoming edge is also used as the name for that node.
- A subtree is called a domain; a path name to its root node is called a domain name.
- The contents of a node are formed by a collection of resource records.
- There are different types of resource records. The major ones are shown in Fig. 5-19.

50 The DNS Name Space Figure 5-19. The most important types of resource records forming the contents of nodes in the DNS name space.

51 DNS Implementation
- The DNS name space can be divided into a global layer and an administrational layer, as shown in Fig. 5-13.
- The managerial layer, which is generally formed by local file systems, is formally not part of DNS and is therefore also not managed by it.
- Each zone is implemented by a name server, which is virtually always replicated for availability.
- Updates for a zone are normally handled by the primary name server. Updates take place by modifying the DNS database local to the primary server.
- A DNS database is implemented as a (small) collection of files. Fig. 5-20 shows a small part of the file for the cs.vu.nl domain.

52 DNS Implementation Figure 5-20. An excerpt from the DNS database for the zone cs.vu.nl.

53 DNS Implementation (2) Figure 5-20. An excerpt from the DNS database for the zone cs.vu.nl.

54 Decentralized DNS Implementations
- The implementation of DNS described so far is the standard one: a hierarchy of servers with root servers at the top and millions of servers at the leaves.
- Higher-level nodes receive many more requests than lower-level nodes. Only by caching the name-to-address bindings of these higher levels is it possible to avoid sending requests to them and thus swamping them.
- These scalability problems can be avoided with fully decentralized solutions.

55 Decentralized DNS Implementations
- Compute the hash of a DNS name, and subsequently take that hash as a key value to be looked up in a distributed hash table or a hierarchical location service with a fully partitioned root node.
- Disadvantage: loss of the structure of the original name.
- Advantage: scalability.
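
Hashing a DNS name into a key can be sketched with Python's standard `hashlib`. The choice of SHA-1 (a 160-bit hash) and the normalization step are assumptions made for illustration, not a description of any particular decentralized DNS system.

```python
import hashlib


def dht_key(dns_name, m=160):
    """Map a DNS name to a key in an m-bit identifier space."""
    # DNS labels are case insensitive and the trailing root dot is optional,
    # so normalize before hashing.
    canonical = dns_name.rstrip(".").lower()
    digest = hashlib.sha1(canonical.encode("ascii")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)
```

The resulting key can then be looked up like any other flat identifier. Names that are neighbors in the hierarchy, such as ftp.cs.vu.nl and cs.vu.nl, map to unrelated keys, which illustrates the loss of structure noted as the disadvantage.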

56 ATTRIBUTE-BASED NAMING
- Flat and structured names provide location independence and human friendliness, but that is not always enough.
- Users should also be able to search for an entity by providing a description of it.
- Solution: attribute-based naming, which describes an entity in terms of (attribute, value) pairs.

57 ATTRIBUTE-BASED NAMING
- Attribute-based naming systems are also known as directory services.
- Entities have a set of associated attributes that can be used for searching. For example, in an e-mail system, messages are tagged with attributes for the sender, recipient, subject, etc.
- Designing an appropriate set of attributes is not trivial.
- One approach to unifying the ways that resources can be described is the Resource Description Framework (RDF).
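
Searching by (attribute, value) pairs can be sketched in Python with plain dictionaries. The function name and the sample messages are hypothetical; a real directory service would index attributes rather than scan every entry.

```python
def search(entries, **criteria):
    """Return the entries whose attributes match every given value."""
    return [
        entry
        for entry in entries
        if all(entry.get(attr) == value for attr, value in criteria.items())
    ]


# Sample e-mail messages described by (attribute, value) pairs.
MESSAGES = [
    {"sender": "alice", "recipient": "bob", "subject": "lunch"},
    {"sender": "alice", "recipient": "carol", "subject": "report"},
    {"sender": "dave", "recipient": "bob", "subject": "report"},
]
```

A query such as `search(MESSAGES, sender="alice", subject="report")` describes the wanted entity instead of naming it, and returns every entry that fits the description.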

58 Hierarchical Implementations: LDAP
- A common approach to tackling distributed directory services is to combine structured naming with attribute-based naming.
- This approach has been adopted, for example, in Microsoft's Active Directory service.
- Many such systems use or rely on the Lightweight Directory Access Protocol (LDAP).

59 Hierarchical Implementations: LDAP (1) Figure 5-22. A simple example of an LDAP directory entry using LDAP naming conventions.

60 Hierarchical Implementations: LDAP (2) Figure 5-23. (a) Part of a directory information tree.

61 Hierarchical Implementations: LDAP (3) Figure 5-23. (b) Two directory entries having Host_Name as RDN.

62 SUMMARY
- Names are used to refer to entities. There are three types of names: addresses, identifiers, and human-friendly names.
- An address is the name of an access point associated with an entity, also simply called the address of an entity.
- An identifier is another type of name. It has three properties: (1) each entity is referred to by exactly one identifier; (2) an identifier refers to only one entity; (3) an identifier is never assigned to another entity.
- Human-friendly names are targeted to be used by humans and are represented as character strings.
- Naming systems covered: flat naming, structured naming, and attribute-based naming.

63 SUMMARY
Flat naming essentially requires resolving an identifier to the address of its associated entity. Approaches:
- Broadcasting or multicasting: the identifier of the entity is broadcast to every process in the distributed system.
- Forwarding pointers: each time an entity moves to a next location, it leaves behind a pointer telling where it will be next.
- Allocating a home to an entity: each time an entity moves to another location, it informs its home where it is.
- Distributed hash tables: organize all nodes into a structured peer-to-peer system.
- Hierarchical search tree: the network is divided into nonoverlapping domains.

64 SUMMARY
- Structured names are easily organized in a name space; they are composed of simple, human-readable names.
- Attribute-based naming schemes describe entities by a collection of (attribute, value) pairs.

