Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 5/A Naming Modified by Dr. Gheith Abandah
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Overview 5.1 Names, Identifiers, and Addresses 5.2 Flat Naming Simple Solutions Home-based Approaches Distributed Hash Tables Hierarchical Approaches 5.3 Structured Naming Name Spaces Name Resolution The Implementation of a Name Space Example: The Domain Name System 5.4 Attribute-based Naming Directory Services Hierarchical Implementations: LDAP Decentralized Implementations
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Names, Identifiers, and Addresses (1) We have names, national numbers (identifier), and addresses (like phone number). Names are used to denote entities in a distributed system. To operate on an entity, we need to access it at an access point. Access points are entities that are named by means of an address. Note A location-independent name for an entity E, is independent from the addresses of the access points offered by E.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Names, Identifiers, and Addresses (2) Pure name: A name that has no meaning at all; it is just a random string. Pure names can be used for comparison only. Identifier: A name having the following properties: P1: Each identifier refers to at most one entity P2: Each entity is referred to by at most one identifier P3: An identifier always refers to the same entity (prohibits reusing an identifier) Observation: An identifier need not necessarily be a pure name, i.e., it may have content.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Flat Naming (1) Problem: Given an essentially unstructured name (e.g., an identifier), how can we locate its associated access point? 1.Simple solutions (broadcasting) 2.Home-based approaches 3.Distributed Hash Tables (structured P2P) 4.Hierarchical location service
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Simple Solutions Broadcasting: Broadcast the ID, requesting the entity to return its current address. Can never scale beyond local-area networks Requires all processes to listen to incoming location requests Forwarding pointers: When an entity moves, it leaves behind a pointer to next location Dereferencing can be made entirely transparent to clients by simply following the chain of pointers Update a client’s reference when present location is found Geographical scalability problems (for which separate chain reduction mechanisms are needed): Long chains are not fault tolerant Increased network latency at dereferencing
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Forwarding Pointers (1) The principle of forwarding pointers using (client stub, server stub) pairs.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Forwarding Pointers (2) Redirecting a forwarding pointer by storing a shortcut in a client stub.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Forwarding Pointers (3) Redirecting a forwarding pointer by storing a shortcut in a client stub.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Home-Based Approaches (1) Single-tiered scheme Let a home keep track of where the entity is: Entity’s home address registered at a naming service The home registers the foreign address of the entity Client contacts the home first, and then continues with foreign location
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Home-Based Approaches (2) The principle of Mobile IP: Home address and care-of address
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Home-Based Approaches (3) Two-tiered scheme: Keep track of visiting entities: Check local visitor register first Fall back to home location if local lookup fails Problems with home-based approaches: Home address has to be supported for entity’s lifetime Home address is fixed → unnecessary burden when the entity permanently moves Poor geographical scalability (entity may be next to client) Question How can we solve the “permanent move” problem?
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Distributed Hash Tables (1) Chord: Consider the organization of many nodes into a logical ring Each node is assigned a random m-bit identifier. Every entity is assigned a unique m-bit key. Entity with key k falls under jurisdiction of node with smallest id ≥ k (called its successor).
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Distributed Hash Tables (2) General Mechanism Resolving key 26 from node 1, and key 12 from node 28 in a Chord system. Nodes can join and leave.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Hierarchical Approaches (1) Hierarchical organization of a location service into domains. Build a large-scale search tree for which the underlying network is divided into hierarchical domains. Each domain is represented by a separate directory node.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Hierarchical Approaches (2) An example of storing information of an entity having two addresses in different leaf domains. Tree organization: Address of entity E is stored in a leaf or intermediate node Intermediate nodes contain a pointer to a child iff the subtree rooted at the child stores an address of the entity The root knows about all entities
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Hierarchical Approaches (3) Looking up a location in a hierarchically organized location service.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Hierarchical Approaches (4) (a) An insert request is forwarded to the first node that knows about entity E. (b) A chain of forwarding pointers to the leaf node is created.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Overview 5.1 Names, Identifiers, and Addresses 5.2 Flat Naming Simple Solutions Home-based Approaches Distributed Hash Tables Hierarchical Approaches 5.3 Structured Naming Name Spaces Name Resolution The Implementation of a Name Space Example: The Domain Name System 5.4 Attribute-based Naming Directory Services Hierarchical Implementations: LDAP Decentralized Implementations
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Name Spaces (1) A general naming graph with a single root node. A graph in which a leaf node represents a (named) entity. A directory node is an entity that refers to other nodes. A directory node contains a (directory) table of (edge label, node identifier) pairs.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Name Spaces (2) The general organization of the UNIX file system implementation on a logical disk of contiguous disk blocks. The Superblock contains information about the entire file system. The inode contains information about a directory or file.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Name Resolution Problem: To resolve a name we need a directory node. How do we actually find that (initial) node? Closure mechanism: The mechanism to select the implicit context from which to start name resolution: start at a DNS name server /home/steen/mbox: start at the local NFS file server (possible recursive search) : dial a phone number : route to the VU’s Web server
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Linking and Mounting (1) Name Linking Hard link What we have described so far as a path name: a name that is resolved by following a specific path in a naming graph from one node to another. Soft link Allow a node O to contain a name of another node: First resolve O’s name (leading to O) Read the content of O, yielding name Name resolution continues with name
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Linking and Mounting (2) The concept of a symbolic link explained in a naming graph. n6 is a symbolic link to n5
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Linking and Mounting (3) Information required to mount a foreign name space in a distributed system The name of an access protocol. The name of the server. The name of the mounting point in the foreign name space. Example: Sun NFS network file system
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Linking and Mounting (4) Mounting remote name spaces through a specific access protocol.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Name-space Implementation (1) Basic issue: Distribute the name resolution process as well as name space management across multiple machines, by distributing nodes of the naming graph. Distinguish three levels Global level: Consists of the high-level directory nodes. Main aspect is that these directory nodes have to be jointly managed by different administrations Administrational level: Contains mid-level directory nodes that can be grouped in such a way that each group can be assigned to a separate administration. Managerial level: Consists of low-level directory nodes within a single administration. Main issue is effectively mapping directory nodes to local name servers.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Name-space Implementation (2) An example partitioning of the DNS name space, including Internet-accessible files, into three layers.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Name-space Implementation (3) A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Implementation of Name Resolution (1) The principle of iterative name resolution.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Implementation of Name Resolution (2) The principle of recursive name resolution. Great burden, not used in global layer
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Implementation of Name Resolution (3) Recursive name resolution of. Name servers cache intermediate results for subsequent lookups.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Example: The Domain Name System The comparison between recursive and iterative name resolution with respect to communication costs. Recursive resolution offers less costly communication.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved The DNS Name Space The most important types of resource records forming the contents of nodes in the DNS name space. A: one more multiple per name, PTR: IP to host name HINFO and TXT: information
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DNS Implementation (1) An excerpt from the DNS database for the zone cs.vu.nl.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DNS Implementation (2) Rest of the excerpt. soling is ftp and web server, inkt and pen are local printers