CSCI 599: Beyond Web Browsers Professor Shahram Ghandeharizadeh Computer Science Department Los Angeles, CA 90089
QUIZ 1:
1. When you register with Google, Google sends you a key that must be included with each request to its web methods. (True)
2. The Google web service API can be invoked using ASP.NET. (True)
3. Once a client caches the results of a Google search, Google will invalidate this client’s cache when it detects updates to its information system. (False)
4. Napster employed a central server to store the index of all files available for download by a client. (True)
5. CAN assumes nodes that insert (key,value) pairs will periodically refresh their inserted entries. (True)
A Scalable Content-Addressable Network (CAN) by S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker.
CAN
CAN is composed of individual nodes.
CAN employs a hash function to insert, lookup, and delete (key,value) pairs.
A node stores a chunk, termed a zone, of the entire hash table.
A node maintains information about its neighboring nodes.
EXAMPLE HASH FUNCTION
A two-dimensional hash function: h(K) = a 6-bit unsigned integer.
The low three bits and the high three bits form the two dimensions of a hash index.
e.g., h(“Thriller”) = 111011: high 3 bits = 111, low 3 bits = 011.
Three bits range in value from 0 (000) to 7 (111).
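The split into two coordinates can be sketched as below. This is a minimal illustration, not the hash used in the lecture: `hash6` is a hypothetical stand-in that keeps the low 6 bits of a SHA-1 digest, so it will not reproduce the slide's value for “Thriller”. Only `split_hash` mirrors the slide directly.

```python
import hashlib

BITS_PER_DIM = 3  # three bits per dimension, values 0 (000) to 7 (111)

def hash6(key: str) -> int:
    """Hypothetical 6-bit hash: keep the low 6 bits of a SHA-1 digest."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return digest[-1] & 0b111111

def split_hash(value: int) -> tuple[int, int]:
    """Split a 6-bit value into (high 3 bits, low 3 bits)."""
    mask = (1 << BITS_PER_DIM) - 1
    return (value >> BITS_PER_DIM) & mask, value & mask

# The slide's value for "Thriller" is 111011.
high, low = split_hash(0b111011)
print(f"{high:03b} {low:03b}")  # prints "111 011"
```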
ADDRESS SPACE
A 2-dimensional address space can be partitioned across 64 nodes.
[Figure: address space grid; axes: High bits, Low bits]
EXAMPLE
A 2-dimensional space partitioned across six nodes.
[Figure: space partitioned into six zones; axes: High bits, Low bits]
EXAMPLE (CONT…)
h(“Thriller”) = 111011 = (111, 011), which maps to node 6.
[Figure: partitioned space; axes: High bits, Low bits]
NEIGHBORS
Two nodes are neighbors if their coordinate spans overlap along d-1 dimensions and abut along one dimension.
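The neighbor rule can be checked mechanically. The sketch below assumes a zone is represented as a tuple of half-open `(lo, hi)` spans, one per dimension; that representation and the optional `size` argument (used to let the space wrap) are illustrative choices, not part of the slides.

```python
def is_neighbor(zone_a, zone_b, size=None):
    """Return True if the zones overlap along d-1 dimensions and abut along one.

    Zones are tuples of half-open (lo, hi) spans, one per dimension.
    If size is given, span ends are compared modulo size so the space wraps.
    """
    overlaps = 0
    abuts = 0
    for (a_lo, a_hi), (b_lo, b_hi) in zip(zone_a, zone_b):
        if max(a_lo, b_lo) < min(a_hi, b_hi):
            overlaps += 1
        elif size is None:
            if a_hi == b_lo or b_hi == a_lo:
                abuts += 1
        elif a_hi % size == b_lo % size or b_hi % size == a_lo % size:
            abuts += 1
    return overlaps == len(zone_a) - 1 and abuts == 1

a = ((0, 4), (0, 4))
b = ((4, 8), (0, 4))  # shares an edge with a
c = ((4, 8), (4, 8))  # touches a only at a corner
print(is_neighbor(a, b), is_neighbor(a, c))  # prints "True False"
```

Zones that touch only at a corner abut along two dimensions and overlap along none, so they fail the test.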
NEIGHBORS (CONT…)
3’s neighbors: [Figure: node 3 and its neighboring zones]
NEIGHBORS (CONT…)
5 is not 3’s neighbor: their spans do not overlap along any dimension; the zones only abut along both dimensions.
NEIGHBORS (CONT…)
The coordinate space is a d-torus: it wraps around.
For example, 5’s neighbors: [Figure: node 5 and its neighboring zones, including those reached by wrapping]
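Wrapping changes how far apart two points are. A minimal sketch, assuming an 8 x 8 grid of points and counting distance in grid steps (both are assumptions for illustration):

```python
def torus_distance(p, q, size=8):
    """Sum over dimensions of the shorter way around a ring of `size` points."""
    total = 0
    for a, b in zip(p, q):
        direct = abs(a - b)
        total += min(direct, size - direct)
    return total

# Coordinates 7 and 0 are adjacent once the axis wraps.
print(torus_distance((7, 0), (0, 0)))  # prints 1
```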
NEIGHBORS (CONT…)
A node maintains information about its neighbors in order to route lookup, insert, and delete requests.
NODE ADDRESSING
CAN has an associated DNS domain name that resolves to the IP address of one or more CAN bootstrap nodes.
A bootstrap node maintains a partial list of CAN nodes it believes are currently in the system.
A request is routed to one of these nodes.
The contacted node applies the hash function and routes the request towards its target destination (using information about its neighbors).
EXAMPLE
A client looks up “Fragile”, h(“Fragile”) = 100010, i.e., (4,2), by contacting N5 at (7,0).
Routing reduces the y-value (high bits) from 7 to 4 and increases the x-value (low bits) from 0 to 2.
[Figure: partitioned space; axes: High bits, Low bits]
EXAMPLE (CONT…)
A client looks up “Fragile”, h(“Fragile”) = 100010, by contacting N5.
[Figure: request routed through the partitioned space; axes: High bits, Low bits]
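The “Fragile” lookup can be traced with a simplified router. This sketch assumes the fully partitioned case in which every grid point is its own node, so each hop changes one coordinate by one; with larger zones a hop would cross a whole zone instead. The function names are hypothetical.

```python
def step_toward(cur, target, size):
    """Move one unit along a wrapped axis, taking the shorter way around."""
    forward = (target - cur) % size   # hops needed if the coordinate increases
    backward = (cur - target) % size  # hops needed if the coordinate decreases
    return (cur + 1) % size if forward <= backward else (cur - 1) % size

def route(start, target, size=8):
    """Return the list of (high, low) points visited from start to target."""
    path = [tuple(start)]
    cur = list(start)
    for dim in range(len(cur)):
        while cur[dim] != target[dim]:
            cur[dim] = step_toward(cur[dim], target[dim], size)
            path.append(tuple(cur))
    return path

# From N5 at (7,0) to h("Fragile") = (4,2): high bits 7 -> 4, then low bits 0 -> 2.
print(route((7, 0), (4, 2)))
# prints [(7, 0), (6, 0), (5, 0), (4, 0), (4, 1), (4, 2)]
```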
EXAMPLE
A client looks up “Hey bebe!” (hash value shown in the figure) by contacting N5; how is the request routed?
[Figure: partitioned space; axes: High bits, Low bits]
OBSERVATIONS
Observation 1: In a d-dimensional space, each node has 2d neighbors. A node maintains information only about its neighbors. Thus, one may grow the number of nodes without increasing the per-node state.
Observation 2: The average path length grows as O(n^(1/d)) as a function of the number of nodes, n.
Observation 3: The path length is O(d n^(1/d)) hops for d dimensions and n nodes.
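To get a feel for these growth rates, the sketch below uses the CAN paper's estimate for a space split into n equal zones: an average path of (d/4) n^(1/d) hops and 2d neighbors per node. The function names are illustrative, and real zone sizes are rarely perfectly equal.

```python
def neighbors_per_node(d: int) -> int:
    """Per-node routing state: two neighbors along each of d dimensions."""
    return 2 * d

def avg_path_length(n: int, d: int) -> float:
    """Average hops for n equal zones in d dimensions: (d/4) * n^(1/d)."""
    return (d / 4) * n ** (1 / d)

# Per-node state stays at 2d while n grows; only the path length changes.
for d in (2, 3, 4):
    print(d, neighbors_per_node(d), round(avg_path_length(4096, d), 1))
```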
NUMBER OF DIMENSIONS
Figure 4: Substantial improvement in going from d=2 to d=4. Beyond 4, the percentage improvement levels off. The same observation is shown in Figure 6.
NEW NODE
CAN incorporates a new node, say N7, as follows:
1. N7 must find a CAN node.
2. N7 randomly chooses a point P that maps to a node, say N1, and sends it a join request.
3. N1’s zone is partitioned between N7 and N1: N1 splits its zone in half, retains one half, and hands the other half to N7.
4. N7 identifies its neighbors, and the neighbors of N1 are notified to include N7 for routing.
NEW NODE (CONT…)
The zone belonging to N1 is partitioned between N1 and N7.
[Figure: partitioned space with new node 7; axes: High bits, Low bits]
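The zone split at join time can be sketched as follows. The representation (a zone as a tuple of half-open `(lo, hi)` spans, a store as a dict keyed by hash point) and the choice of which dimension to split are assumptions for illustration. Moving the entries that fall in the handed-over half follows from each node storing the chunk of the hash table for its zone.

```python
def split_zone(zone, dim):
    """Split a zone in half along `dim`; return (kept half, handed-over half)."""
    lo, hi = zone[dim]
    mid = (lo + hi) // 2
    kept = zone[:dim] + ((lo, mid),) + zone[dim + 1:]
    handed = zone[:dim] + ((mid, hi),) + zone[dim + 1:]
    return kept, handed

def contains(zone, point):
    """True if the point lies inside the zone's half-open spans."""
    return all(lo <= x < hi for (lo, hi), x in zip(zone, point))

def join(owner_zone, owner_store, dim=0):
    """The existing node keeps one half; entries in the other half move to the newcomer."""
    kept, handed = split_zone(owner_zone, dim)
    old_store = {p: v for p, v in owner_store.items() if contains(kept, p)}
    new_store = {p: v for p, v in owner_store.items() if contains(handed, p)}
    return (kept, old_store), (handed, new_store)

# Hypothetical N1 zone and contents before N7 joins.
n1, n7 = join(((0, 8), (0, 4)), {(7, 3): "Thriller", (1, 1): "other"})
print(n1[0], n7[0])  # prints "((0, 4), (0, 4)) ((4, 8), (0, 4))"
```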
Questions & Answers
NODE REMOVAL (FAILURE)
IMPROVEMENTS
Categories of improvements:
  Replication: reduce path length
    Multiple realities: additional state information (Sec 3.2)
    Multiple hash functions (Sec 3.5)
    MAXPEERS-based replication: additional state information (Sec 3.4)
  Routing of requests: reduce path latency
    Route requests to a candidate neighbor with minimum RTT: additional state information (Sec 3.3)
  Assignment of nodes to zones
    Uniform partitioning of space (Sec 3.7): load balancing
    Topologically close nodes are assigned to neighboring zones (Sec 3.6): reduce path latency
IMPROVEMENTS
A matrix perspective (missing: data caching):
  Columns: reduce path length, reduce path latency, load balancing
  Rows: replication, better routing, data placement
One may consider a combination, e.g., data placement & replication.
REPLICATION: MULTIPLE REALITIES
Maintain multiple, independent coordinate spaces.
Each node is assigned a different zone in each coordinate space.
Here are two realities: [Figure: Reality-1 and Reality-2]
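One way a node can exploit several realities when routing is to start in the reality where it already sits closest to the target point. The sketch below is an illustration under simplifying assumptions: each node is reduced to a single point per reality, distance is counted in grid steps on an 8 x 8 torus, and `best_reality` is a hypothetical helper name.

```python
def torus_distance(p, q, size=8):
    """Grid-step distance on a wrapped space."""
    return sum(min(abs(a - b), size - abs(a - b)) for a, b in zip(p, q))

def best_reality(node_points, target, size=8):
    """node_points[r] is this node's position in reality r; return the closest reality's index."""
    distances = [torus_distance(p, target, size) for p in node_points]
    return min(range(len(distances)), key=distances.__getitem__)

# The same node sits at (7,0) in reality 0 and at (4,3) in reality 1.
print(best_reality([(7, 0), (4, 3)], (4, 2)))  # prints 1
```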
MULTIPLE REALITIES
Replication increases availability of data in the presence of failures.
Figure 5: The benefit (percentage improvement) with 2 and 3 realities is substantial. It levels off with 4 or more realities.
Figure 6: The x-axis is the number of neighbors, for both (a) d=2 with r varying and (b) d varying with r=2.
  To improve routing efficiency, increasing the number of dimensions is more beneficial than increasing the number of realities (given the same amount of per-node state).
  Qualitatively: additional realities provide a higher degree of data availability (in the presence of failures).
  Notice the knee of both curves in Figure 6 (the impact of realities and dimensions is marginal beyond a certain point).
REPLICATION: MULTIPLE HASH FUNCTIONS
Use k different hash functions to map a single key to k different points in the coordinate space.
This results in up to k replicas of a single key; when several hash functions map to the same zone, no additional replica is constructed there.
With a lookup, retrieve the entry from the closest node. (Alternatively, retrieve the entry from all k potential targets, consuming more bandwidth.)
Figure 7
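A minimal sketch of the k-hash-function idea, assuming the k functions are obtained by salting one SHA-256 hash with the function index and that "closest" means fewest grid steps on an 8 x 8 torus. Both choices are illustrative and not taken from the slides.

```python
import hashlib

def torus_distance(p, q, size=8):
    """Grid-step distance on a wrapped space."""
    return sum(min(abs(a - b), size - abs(a - b)) for a, b in zip(p, q))

def replica_points(key, k, size=8):
    """Map one key to k points by salting the hash with the function index."""
    points = []
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
        points.append((digest[0] % size, digest[1] % size))
    return points

def closest_replica(origin, key, k, size=8):
    """Pick the replica point nearest to the node issuing the lookup."""
    return min(replica_points(key, k, size),
               key=lambda p: torus_distance(origin, p, size))

print(replica_points("Thriller", 3))
print(closest_replica((7, 0), "Thriller", 3))
```

Two salted hashes may land on the same point, which corresponds to the collision case on the slide.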
CONTROLLED REPLICATION
Multiple nodes share the same zone. These nodes are termed peers.
MAXPEERS is a system parameter to control the number of replicas.
Logically, a node has 2d × MAXPEERS neighbors.
To maintain a fixed amount of state information per node:
  A node selects one neighbor from amongst the peers in each of its neighboring zones.
CONTROLLED REPLICATION
Neighbor selection:
  Periodically, a node requests its coordinate neighbor to transmit its peer list.
  It measures the RTT to all nodes in the neighboring zone.
  It retains the node with the lowest RTT as its neighbor.
Replication:
  Increases data availability
  Improves performance: path length, path latency, and load balancing
  Increases update overhead
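The neighbor selection step reduces to a minimum over measured RTTs. In this sketch the peer names and RTT values are made up, and `measure_rtt` stands for whatever probing mechanism a node actually uses.

```python
def select_neighbor(peer_list, measure_rtt):
    """Retain the peer with the lowest measured RTT as the neighbor for that zone."""
    return min(peer_list, key=measure_rtt)

# Made-up RTTs (ms) to the peers of one neighboring zone.
rtt_ms = {"peer-a": 80.0, "peer-b": 35.0, "peer-c": 120.0}
print(select_neighbor(list(rtt_ms), rtt_ms.get))  # prints "peer-b"
```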
CONTROLLED REPLICATION
Table 2: per-hop latency (columns: MAXPEERS, relative % improvement)
Questions & Answers
BETTER ROUTING
Standard routing metric: progress towards the destination in terms of the Cartesian distance.
Better routing:
  Each node measures the network-level round-trip time (RTT) to each of its neighbors.
  For a given destination, a message is forwarded to the neighbor with the maximum ratio of progress to RTT.
Table 1 (20-40% improvement)
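The progress-to-RTT rule can be sketched as below. Assumptions for illustration: each neighbor is represented by a single point, Cartesian distance is measured on an 8 x 8 wrapped space, and the RTT values are made up.

```python
import math

def wrapped_cartesian(p, q, size=8):
    """Cartesian distance on a wrapped space, using the shorter way around each axis."""
    return math.sqrt(sum(min(abs(a - b), size - abs(a - b)) ** 2
                         for a, b in zip(p, q)))

def next_hop(current, dest, neighbors, size=8):
    """neighbors maps a neighbor's point to its RTT (ms); maximize progress / RTT."""
    here = wrapped_cartesian(current, dest, size)
    best, best_ratio = None, 0.0
    for point, rtt in neighbors.items():
        progress = here - wrapped_cartesian(point, dest, size)
        if progress > 0 and progress / rtt > best_ratio:
            best, best_ratio = point, progress / rtt
    return best

# (6,0) makes more progress, but (7,1) wins on progress per millisecond.
print(next_hop((7, 0), (4, 2), {(6, 0): 100.0, (7, 1): 20.0, (0, 0): 5.0}))
# prints (7, 1)
```

Neighbors that move away from the destination, such as (0,0) here, are never chosen regardless of how low their RTT is.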
Questions & Answers
Topologically sensitive routing
Assign zones to nodes in a manner that assigns neighboring zones to nodes that have a minimum RTT.
  USC’s neighboring node should be UCLA (instead of Cornell), assuming the RTT to UCLA is smaller.
How?
  Identify a well-known set of m machines as landmarks.
  Every CAN node measures its RTT to these machines and maintains a vector listing them from closest to farthest.
  m! orderings of the landmarks are possible.
  Partition the coordinate space into m! portions.
  When a new node joins, it is mapped to the portion with a matching landmark ordering.
Topologically sensitive routing
Assuming 3 landmarks, the Cartesian space is divided into six portions: m portions along the x-axis, each divided into m-1 portions along the y-axis.
[Figure: space divided into six portions; axes: High bits, Low bits]
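Mapping a landmark ordering to one of the m! portions can be sketched as follows. The landmark names, the RTT values, and the particular numbering of orderings (lexicographic order from `itertools.permutations`) are assumptions; the m by m-1 layout covers all m! orderings only for m = 3, as in the slide's example.

```python
from itertools import permutations

def landmark_ordering(rtts):
    """rtts maps landmark name -> measured RTT; return names from closest to farthest."""
    return tuple(sorted(rtts, key=rtts.get))

def portion_for(ordering, landmarks):
    """Map an ordering to an (x_portion, y_portion) cell of an m by (m-1) layout."""
    all_orderings = list(permutations(sorted(landmarks)))
    index = all_orderings.index(tuple(ordering))
    return divmod(index, len(landmarks) - 1)

# Made-up RTTs (ms) from a joining node to three landmarks.
rtts = {"L1": 30.0, "L2": 10.0, "L3": 20.0}
ordering = landmark_ordering(rtts)
print(ordering, portion_for(ordering, rtts))  # prints "('L2', 'L3', 'L1') (1, 1)"
```

Nodes that see the landmarks in the same order land in the same portion, which is what places topologically close nodes near each other in the coordinate space.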
Topologically sensitive routing
Figure 8: 4 landmarks with a minimum hop distance of 5.
Latency stretch = (CAN network latency) / (average IP network latency).
2-d with landmark ordering outperforms 4-d without landmark ordering.
Questions & Answers