ICS362 Distributed Systems Dr Ken Cosh Week 5. Review Communication – Fundamentals – Remote Procedure Calls (RPC) – Message Oriented Communication – Stream.


1 ICS362 Distributed Systems Dr Ken Cosh Week 5

2 Review Communication – Fundamentals – Remote Procedure Calls (RPC) – Message Oriented Communication – Stream Oriented Communication – Multicast Communication

3 This Week Naming – Names, Identifiers & Addresses – Flat Naming – Structured Naming

4 Names A string of bits / characters referring to an entity. – Entity could be resources, hosts, printers, disks, files, processes, users, mailboxes, newsgroups, webpages, messages… Entities can be operated on through their interfaces – But for that we need an access point – or address

5 Access points An entity can have more than one access point – We have more than one telephone – A host offers multiple ports An entity can change its access points – A new IP address in a new network – A new email address

6 Entity Access Point It appears an access point is tightly associated with an entity But the name of the entity and the name of the access point should be independent – Making a naming system which is more flexible and easier to use.

7 Identifiers Uniquely refer to an entity – An identifier refers to at most one entity. – Each entity is referred to by at most one identifier. – An identifier always refers to the same entity (i.e., it is never reused).

8 Human Friendly Names Most names are represented in machine readable form, i.e. a bit string. Human Friendly Names convert this to a character string.

9 Name Resolution The crucial aspect is how to resolve names and identifiers to addresses. – Close link to message routing The simplest approach is a table of (name, address) pairs – With a large distributed system this becomes a large table which can’t be centralised. Most of this section will deal with alternative approaches to name resolution – Flat Naming – Structured Naming – Attribute Based Naming

10 Flat Naming Generally names are just random bit strings – i.e. nothing about the name gives any indication of where the access point is. – (In contrast to cis.payap.ac.th for example) Alternatives here include: – Broadcast Based – Home Based – Distributed Hash Tables – Hierarchical Based

11 Broadcasting A message is sent out to all machines in the network – Broadcast a message containing the identifier of the entity being looked for – Each machine checks whether it has the entity – Those with an access point respond accordingly As the network grows it becomes inefficient – Wasted bandwidth – Too many hosts being interrupted with messages they can’t answer
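The cost of broadcasting can be made concrete with a small simulation. This is only a sketch: the `Machine` class, the addresses and the entity name `printer-7` are made up for illustration, and a real broadcast would of course run over the network rather than a Python loop.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    address: str
    entities: set = field(default_factory=set)

    def handle_lookup(self, entity_id):
        # Reply with our address only if we offer an access point for the entity.
        return self.address if entity_id in self.entities else None


def broadcast_lookup(network, entity_id):
    """Deliver the request to every machine.

    Returns the list of replies and the number of machines that were
    interrupted without being able to answer.
    """
    answers, wasted = [], 0
    for machine in network:
        reply = machine.handle_lookup(entity_id)
        if reply is None:
            wasted += 1
        else:
            answers.append(reply)
    return answers, wasted


# Hypothetical network of 100 machines; only one hosts the entity.
network = [Machine(f"10.0.0.{i}") for i in range(1, 101)]
network[41].entities.add("printer-7")

answers, wasted = broadcast_lookup(network, "printer-7")
print(answers, wasted)
```

Every lookup touches all 100 machines even though only one can answer, which is the inefficiency the slide points at.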

12 Multicasting Multicasting can improve things as only a specified group of machines will receive the ‘broadcast’

13 Forwarding Pointers When an entity moves, it leaves a forwarding pointer at its last address – Once an entity has been found we can find the current address by following forwarding pointers Drawbacks – The chain for a mobile entity can become very long! – What happens if part of the chain is unreliable? Scalability?
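A forwarding-pointer chain can be sketched as a dictionary from an old address to the next one. The addresses below are hypothetical, and the loop check is an extra safeguard added for the example rather than something the slide describes.

```python
def follow_pointers(forwarding, start):
    """Follow forwarding pointers from a known (possibly old) address.

    `forwarding` maps an old address to the address the entity moved to.
    An address with no entry is treated as the current location.
    Returns (address, number_of_hops).
    """
    address, hops = start, 0
    visited = {start}
    while address in forwarding:
        address = forwarding[address]
        if address in visited:
            raise RuntimeError("forwarding pointers form a loop")
        visited.add(address)
        hops += 1
    return address, hops


# The entity moved A -> B -> C -> D, leaving a pointer behind each time.
forwarding = {"addr-A": "addr-B", "addr-B": "addr-C", "addr-C": "addr-D"}
current, hops = follow_pointers(forwarding, "addr-A")
print(current, hops)

# If the pointer at addr-B is lost, the lookup stops at a stale address.
broken = {"addr-A": "addr-B", "addr-C": "addr-D"}
stale, _ = follow_pointers(broken, "addr-A")
print(stale)
```

The hop count grows with every move, and a single lost pointer silently yields the wrong address, which illustrates both drawbacks listed on the slide.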

14 Home Based Approaches A Home Location keeps track of the current location of an entity. – The current location is the ‘Care of’ address of the entity When a request arrives it is first routed to the home location, and then forwarded to the current location – With the client being updated with the new location.
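The home-based scheme might be sketched as follows. The class names, the entity id `laptop` and the address are assumptions made for the example; the sketch also ignores what happens when the entity moves again and the client's cached address goes stale.

```python
class HomeLocation:
    """Keeps the current (care-of) address of each entity it is home to."""

    def __init__(self):
        self.care_of = {}  # entity id -> current address

    def register(self, entity_id, address):
        self.care_of[entity_id] = address

    def locate(self, entity_id):
        return self.care_of[entity_id]


class Client:
    def __init__(self, home):
        self.home = home
        self.known = {}          # addresses learned from the home location
        self.home_requests = 0   # how often we had to go via the home

    def send(self, entity_id, message):
        if entity_id not in self.known:
            # First contact: route via the home, which tells us the new location.
            self.home_requests += 1
            self.known[entity_id] = self.home.locate(entity_id)
        return self.known[entity_id], message


home = HomeLocation()
home.register("laptop", "192.0.2.10")  # hypothetical care-of address

client = Client(home)
first = client.send("laptop", "hello")
second = client.send("laptop", "hello again")
print(first, second, client.home_requests)
```

Only the first message pays the detour through the home location; later messages go straight to the care-of address.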

15 Home Based Approaches

16 Home Location Drawbacks Communication latency due to potential distances between locations What if the Home Location doesn’t exist or is unavailable? What if the entity decides to move permanently?

17 Distributed Hash Tables A hash function is used to allocate random identifiers to nodes and keys to entities An entity with key k is under the jurisdiction of the node with the smallest id >= k If a node needs to find an entity that isn’t under its jurisdiction it could simply check with its predecessor or successor node. – This is made more efficient by storing a finger table of shortcuts to other nodes.
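A lookup with finger tables can be sketched in a few lines. The slide does not name a particular system, so the following assumes a Chord-style ring: a 5-bit identifier space, a made-up set of node ids, and finger entry i pointing at the node responsible for p + 2^(i-1). For brevity every finger table is computed from a global node list, which a real node would not have.

```python
from bisect import bisect_left

M = 5                                        # identifier bits: ids and keys in 0..31
NODES = [1, 4, 9, 11, 14, 18, 20, 21, 28]    # hypothetical node ids, sorted


def succ(k):
    """Node responsible for key k: smallest node id >= k, wrapping around."""
    i = bisect_left(NODES, k % 2**M)
    return NODES[i % len(NODES)]


def finger_table(p):
    """Entry i (1..M) holds succ(p + 2^(i-1))."""
    return [succ((p + 2 ** (i - 1)) % 2**M) for i in range(1, M + 1)]


def between(x, a, b, closed_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] if closed_right."""
    if closed_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(start, k):
    """Route a request for key k from node `start`.

    Returns (responsible_node, nodes_visited_before_the_answer).
    """
    node, path = start, [start]
    while True:
        fingers = finger_table(node)
        successor = fingers[0]
        if between(k, node, successor, closed_right=True):
            return successor, path
        # Forward to the farthest finger that still precedes the key.
        nxt = successor
        for f in reversed(fingers):
            if between(f, node, k):
                nxt = f
                break
        node = nxt
        path.append(node)


print(lookup(1, 26))
print(lookup(28, 12))
```

Each hop roughly halves the remaining distance to the key, so the request from node 1 for key 26 visits four nodes instead of walking the whole ring.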

18 Distributed Hash Table

19 Distributed Hash Tables With randomly assigned ids the requests could be routed across long distances Topology based assignments of node identifiers – Make sure that nearby nodes get nearby ids Proximity Routing – By storing multiple successors & predecessors a node can choose to check with a nearby node assuming it satisfies the conditions ( ) of the key

20 Hierarchical Approaches The network is divided into a collection of domains, each with subdomains until you reach a leaf domain Each domain has an associated directory node dir(D) which leads to a tree of directory nodes. – With a root directory node at the top.

21 Hierarchical Approaches Root Directory Top Level Domain Subdomain Leaf Domain

22 Location Records Each directory node has a location record for each entity within its domain – If an entity is within a subdomain then the record points to the directory node of the subdomain containing the entity. If an entity has multiple locations (is replicated) a directory node may contain more than one reference for the entity

23 Location Records

24 Look Up Look Up is done through ever increasing circles – based on locality. Consider Worst Case?

25 Insertion
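Insertion and look-up in the directory tree can be sketched together. The domain names and the address are hypothetical, and the sketch assumes each entity has a single location (no replication).

```python
class DirNode:
    """Directory node dir(D) for a domain D."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # entity -> address (in a leaf domain) or child DirNode (higher up)
        self.records = {}


def insert(leaf, entity, address):
    """Store the address in the leaf domain, then add pointers up to the root."""
    leaf.records[entity] = address
    child, node = leaf, leaf.parent
    while node is not None and entity not in node.records:
        node.records[entity] = child
        child, node = node, node.parent


def lookup(start_leaf, entity):
    """Search in ever larger domains, then follow pointers down.

    Returns (address, hops); address is None if the entity is unknown.
    """
    node, hops = start_leaf, 0
    while entity not in node.records:          # climb towards the root
        if node.parent is None:
            return None, hops
        node, hops = node.parent, hops + 1
    while isinstance(node.records[entity], DirNode):   # descend to the leaf
        node, hops = node.records[entity], hops + 1
    return node.records[entity], hops


# Hypothetical domain tree: root -> {europe, asia} -> leaf domains.
root = DirNode("root")
europe, asia = DirNode("europe", root), DirNode("asia", root)
ams, ber = DirNode("amsterdam", europe), DirNode("berlin", europe)
cnx = DirNode("chiangmai", asia)

insert(ams, "E", "addr-in-amsterdam")
print(lookup(ams, "E"), lookup(ber, "E"), lookup(cnx, "E"))
```

A nearby client finds the entity after a short climb, while the worst case has to go all the way up to the root and back down again.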

26 Structured Naming Flat names are convenient for machines, – But not really for humans File naming & host naming allow convenient human friendly names. Here we discuss Namespaces & Name Resolution

27 Namespaces Names can be represented as a labeled, directed graph. 2 Types of node – Leaf Nodes The address of a named entity, or the actual entity. No outgoing edges – Directory Nodes Named nodes with a number of outgoing edges

28 Naming Graph with 1 root node

29 Naming Graphs Most have a single root Many are strictly hierarchical – Making them into a tree where each node has exactly 1 incoming edge Some are directed acyclic graphs (as in previous slide) – Each node can have multiple incoming edges, but no cycles allowed

30 Aliases In the previous example the entity “/keys” has an alias “/home/steen/keys” – Multiple absolute paths referring to the same node (Hard links) An alternative is to use symbolic links – When resolving “/home/steen/keys” the absolute path “/keys” is returned. – (As in the following slide)

31 Symbolic Link

32 Name Resolution Resolving a name involves following a path through the graph; – E.g. /home/steen/mbox Closure Mechanism – Resolution works on the assumption that we know where to start the path from – i.e. where is the root node? Is it a node in a higher graph? Have we already resolved that node? – What would you do with the string 0031204430784?
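Path resolution over a naming graph, including the hard link and symbolic link discussed above, might look like this. The node ids (`n0` to `n6`) and the `max` directory are invented for the example; choosing `n0` as the starting point stands in for the closure mechanism.

```python
# Directory nodes: node id -> {edge label: target node id}
GRAPH = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n2", "max": "n3"},
    "n2": {"mbox": "n4", "keys": "n5"},   # hard link: second path to n5
    "n3": {"keys": "n6"},
}

# Leaf nodes: node id -> content. A symbolic link stores a path, not a node.
LEAVES = {
    "n4": "mailbox contents",
    "n5": "key file contents",
    "n6": ("symlink", "/keys"),
}


def resolve(path, root="n0", max_links=8):
    """Follow `path` label by label from `root`; return the final node id."""
    node = root
    for label in path.strip("/").split("/"):
        node = GRAPH[node][label]   # KeyError if the label does not exist
    content = LEAVES.get(node)
    if isinstance(content, tuple) and content[0] == "symlink":
        if max_links == 0:
            raise RuntimeError("too many symbolic links")
        # Restart resolution with the absolute path stored in the link.
        return resolve(content[1], root, max_links - 1)
    return node


print(resolve("/home/steen/mbox"))
print(resolve("/keys"), resolve("/home/steen/keys"), resolve("/home/max/keys"))
```

The hard link reaches `n5` directly through a second edge, whereas the symbolic link reaches it only after a second resolution of the stored path.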

33 Mounting Points A directory node can store the identifier of a directory node from a different namespace. – This is the Mounting Point Given a collection of distributed namespaces, we can mount a foreign namespace using: – The name of an access protocol – The name of the server – The name of the mounting point in the foreign name space For Example – ftp://cis.payap.ac.th
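The three pieces of information needed to mount a foreign namespace map naturally onto the parts of a URL. As a rough illustration, Python's standard `urllib.parse.urlsplit` can pull them apart; treating an empty path as the foreign root `/` and the `/pub` path in the second call are assumptions made for the example.

```python
from urllib.parse import urlsplit


def mount_info(url):
    """Split a URL into the three names needed to mount a foreign namespace."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,            # name of the access protocol
        "server": parts.hostname,            # name of the server
        "mount_point": parts.path or "/",    # mounting point in the foreign namespace
    }


print(mount_info("ftp://cis.payap.ac.th"))
print(mount_info("ftp://cis.payap.ac.th/pub"))  # hypothetical path
```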

34 Foreign Mounting Point

35 Namespace Implementation A naming service implemented by name servers – For large scale DS, it is distributed across multiple servers This is separated into layers – Global Layer High level nodes (root node and neighbours), hence relatively fixed & stable. – Administrational Layer Nodes from within a single organisation, e.g. groups of entities, perhaps a node for each department in an organisation – Managerial Layer Frequently changing nodes e.g. hosts in a local network

36 DNS example

37 Global Layer High Availability is particularly necessary – If a global layer server fails, a large part of the namespace becomes unreachable, as resolution cannot continue past the failed server. But, as names rarely change, clients can cache the results – So speedy results are not as important as availability Normally implemented using replicated servers

38 Administrational Layer Availability is important – for clients in the same organisation as the nameserver, but less important for those outside of the organisation. Responsiveness is much more important at this layer – Updates need to be processed more quickly – e.g. a new user account needs to be processed quickly.

39 Managerial Layer Availability is less demanding – Can be managed on a single machine Performance is crucial – Responses should be immediate

40 Layer Comparison

41 Name Resolution Implementation Choices: – Iterative or Recursive? Let’s consider needing to resolve: – root:<nl, vu, cs, ftp, pub, globe, index.html> – Otherwise known as: – ftp://ftp.cs.vu.nl/pub/globe/index.html

42 Iterative Resolution root:<nl, vu, cs, ftp, pub, globe, index.html> The root server resolves ‘nl’ and returns that location to the client – Remaining pathname: nl:<vu, cs, ftp, pub, globe, index.html> The nl nameserver resolves ‘vu’ – Remaining pathname: vu:<cs, ftp, pub, globe, index.html> The vu nameserver resolves ‘cs’ and ‘ftp’ – Remaining pathname: ftp:<pub, globe, index.html> Then the ftp server can return the requested file. Each time the location of the next server is returned to the client and the client makes a new request.
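The iterative steps above can be sketched as a loop run by the client. The `ZONES` table is a made-up stand-in for the name servers, with server names used in place of real addresses; each server only consumes the labels it knows and hands the rest back.

```python
# Hypothetical zone data: server -> {labels it can resolve: next server}
ZONES = {
    "root": {("nl",): "nl"},
    "nl": {("vu",): "vu"},
    "vu": {("cs", "ftp"): "ftp"},
}


def server_resolve(server, path):
    """One step at one name server: return (next_server, remaining_path)."""
    for labels, next_server in ZONES[server].items():
        if tuple(path[: len(labels)]) == labels:
            return next_server, path[len(labels):]
    raise KeyError(f"{server} cannot resolve {path}")


def iterative_resolve(path):
    """The client contacts each name server in turn.

    Returns (final_server, remaining_path, servers_contacted, messages).
    """
    server, contacted, messages = "root", [], 0
    while server in ZONES:
        contacted.append(server)
        server, path = server_resolve(server, path)
        messages += 2   # one request from the client, one reply to the client
    return server, path, contacted, messages


path = ["nl", "vu", "cs", "ftp", "pub", "globe", "index.html"]
result = iterative_resolve(path)
print(result)
```

The client ends up holding the location of the ftp server together with the remaining pathname, having exchanged a request and a reply with every name server along the way.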

43 Iterative Name Resolution

44 Recursive Resolution root:<nl, vu, cs, ftp, pub, globe, index.html> The nameserver passes the request on to the next nameserver it finds; – i.e. root identifies nl and passes on the request: nl:<vu, cs, ftp, pub, globe, index.html> – nl passes on the request to vu: vu:<cs, ftp, pub, globe, index.html> – vu passes on the request to ftp: ftp:<pub, globe, index.html> Finally the results are returned to the client back through the chain.
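Recursive resolution can be sketched with the servers forwarding the request themselves. As before, the `ZONES` table and the server names are hypothetical placeholders for real name servers.

```python
# Hypothetical zone data: server -> {labels it can resolve: next server}
ZONES = {
    "root": {("nl",): "nl"},
    "nl": {("vu",): "vu"},
    "vu": {("cs", "ftp"): "ftp"},
}


def recursive_resolve(server, path, chain):
    """Resolve at `server`; if more work is needed, the server forwards it.

    `chain` collects the name servers that handled the request.
    Returns (final_server, remaining_path), passed back up the chain.
    """
    if server not in ZONES:
        return server, path
    chain.append(server)
    for labels, next_server in ZONES[server].items():
        if tuple(path[: len(labels)]) == labels:
            # The server, not the client, contacts the next name server.
            return recursive_resolve(next_server, path[len(labels):], chain)
    raise KeyError(f"{server} cannot resolve {path}")


chain = []
path = ["nl", "vu", "cs", "ftp", "pub", "globe", "index.html"]
answer = recursive_resolve("root", path, chain)
client_messages = 2   # the client sends one request to root and gets one reply
print(answer, chain, client_messages)
```

The same servers are involved as in the iterative case, but the client exchanges only one request and one reply; the remaining traffic is between name servers.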

45 Recursive Resolution

46 Recursive vs Iterative Recursive places more demands on the servers – Which generally makes it prohibitive for global layer servers dealing with many requests

47 Recursive vs Iterative Recursive name resolution enables each server to learn the address of lower level nodes – And cache these results This makes subsequent requests much quicker – The results can be cached both by the root server and every other server in the chain
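The caching effect can be sketched by giving each server its own cache. The `NameServer` class and the address `192.0.2.21` are assumptions for the example, and cache expiry is left out entirely.

```python
class NameServer:
    def __init__(self, name, table):
        self.name = name
        self.table = table      # label -> child NameServer or final address
        self.cache = {}         # remaining path (tuple) -> resolved address
        self.forwarded = 0      # requests passed on to a lower server

    def resolve(self, path):
        path = tuple(path)
        if path in self.cache:
            return self.cache[path]          # answered without forwarding
        target = self.table[path[0]]
        if isinstance(target, NameServer):
            self.forwarded += 1
            answer = target.resolve(path[1:])
        else:
            answer = target
        self.cache[path] = answer            # learn the lower-level result
        return answer


# Hypothetical chain of servers: root -> nl -> vu -> cs
cs = NameServer("cs", {"ftp": "192.0.2.21"})
vu = NameServer("vu", {"cs": cs})
nl = NameServer("nl", {"vu": vu})
root = NameServer("root", {"nl": nl})

first = root.resolve(["nl", "vu", "cs", "ftp"])
second = root.resolve(["nl", "vu", "cs", "ftp"])
print(first, second, root.forwarded, nl.forwarded)
```

After the first request every server in the chain holds the answer for its part of the path, so the second request is answered by the root without contacting anyone.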

48 Recursive Caching

49 Recursive vs Iterative Recursive can also be cheaper in terms of communication – Consider if the request in the example given was made from Chiang Mai…
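A back-of-the-envelope calculation illustrates the point. The round-trip times are invented, and the sketch assumes all the name servers sit close to one another and far from the client.

```python
def iterative_cost(n_servers, far_rtt):
    """The client makes one long-distance round trip per name server."""
    return n_servers * far_rtt


def recursive_cost(n_servers, far_rtt, near_rtt):
    """One long-distance round trip, then short hops between the servers."""
    return far_rtt + (n_servers - 1) * near_rtt


# Assumed figures: 300 ms from the client to the servers, 10 ms between servers.
FAR_MS, NEAR_MS, SERVERS = 300, 10, 3

print(iterative_cost(SERVERS, FAR_MS), recursive_cost(SERVERS, FAR_MS, NEAR_MS))
```

Under these assumptions the iterative client waits for three long-distance round trips, while the recursive one waits for a single long-distance round trip plus two short ones.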

