The Cache Location Problem. Overview TERCs Vs. Proxies Stability Cache location.

P. Krishnan, Danny Raz, and Yuval Shavitt. The Cache Location Problem. IEEE/ACM Transactions on Networking, Vol. 8, No. 5, October 2000.
Presentation transcript:

The Cache Location Problem

Overview
- TERCs vs. proxies
- Stability
- Cache location

Proxy Web Caching is Good
- Saves network bandwidth
- Reduces delay
- Reduces the server's load
- But it is not perfect:
  - not everybody uses it (configuration)
  - may become a bottleneck and increase delay
  - increases delay for unsatisfied pages

Transparent En-Route Caches (TERCs)
- Caches are located along routes from clients to servers, and are transparent to both server and client
- Requests are intercepted by the TERC on their way to the server, and are either answered by the cache if the information exists there or, otherwise, forwarded to the server
- Advantages:
  - No configuration required!
  - No management!
  - No change required in current network infrastructure
  - Can be deployed independently within an ISP subnetwork

TERCs (-)
- Must be on the route from client to server:
  - sensitive to route changes
  - hierarchies are much harder to implement
- Needs to intercept traffic:
  - implementation problem
  - more complex
  - can TERCs work at line speed?
- Depends on routing stability and flow stability
- Where should TERCs be placed?

Route Stability
- Published results indicate that routing is stable (Paxson, Labovitz)
- We need stability only during the connection lifetime (~1 min.):
  - [KRS00] measurements to more than [...] destinations show that >93% of connections were stable
  - real numbers are probably higher
- TCP route caching equivalent of IP addresses

Stability of Flows
- We built the flow tree from servers:
  - Data from Bell-Labs servers ( labs.com, labs.com )
  - Nov Jan. 98
  - ~14000 different hosts, 1 GByte, ~200k cacheable requests (per week)
- From log files to results:
  - extract the unique hosts
  - run traceroute for each host
  - obtain the routing tree (or is it a DAG?)
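The last step above, merging per-host traceroute results into one routing structure and checking whether it is a tree or a DAG, can be sketched as follows. This is a minimal illustration, not the authors' tooling: the hop names, the `paths` data, and both function names are hypothetical, and each path is assumed to be a list of hops from the server to one client host.

```python
from collections import defaultdict

def build_routing_graph(paths):
    """Merge server-to-client hop lists into a map: node -> set of upstream hops."""
    parents = defaultdict(set)
    for path in paths:
        parents[path[0]]  # make sure the server itself appears as a node
        for upstream, downstream in zip(path, path[1:]):
            parents[downstream].add(upstream)
    return dict(parents)

def multi_parent_nodes(parents):
    """Nodes reached through more than one upstream hop; any such node means 'not a tree'."""
    return sorted(node for node, ups in parents.items() if len(ups) > 1)

# Hypothetical traceroute output, one hop list per client host.
paths = [
    ["server", "r1", "r2", "hostA"],
    ["server", "r1", "r3", "hostB"],
    ["server", "r4", "r3", "hostC"],  # r3 is also reached via r4
]
graph = build_routing_graph(paths)
print(multi_parent_nodes(graph))  # ['r3']
```

If the list is empty, the merged routes form a tree rooted at the server; otherwise the listed nodes are where routes from the server diverge and rejoin.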

Stability - Visual

Client return rate between days (figure; x-axis: day)

Stability (3)
- The relative flow in the tree is stable in time, although the client population changes significantly
- Routing is stable for the lifetime of the connection
- Placing caches based on past traffic yields good results

How Fixed is the Hit Ratio?

How Fixed is the Hit Ratio?(2)

Where Should the TERCs be Placed?

The Model
- Wide area network
- Requests are represented by a set of demands (of client $i$ from server $j$)
- Goal: minimize average delay = minimize total flow
- The hit ratio ($p$) abstracts cache behavior:
  - most hits are due to a small number of popular pages
  - full dependency: the same pages are cached everywhere
- But part of the flow can come from proxies, so each flow is associated with its own hit ratio $p_{i,j}$

The General k-cache Location Problem
- Instance:
  - an undirected graph $G=(V,E)$
  - a set of demands $F=\{f_{i,j}\}$
  - a set of hit ratios $P=\{p_{i,j}\}$
  - $k$, the number of caches
- Solution: $K$, a subset of $V$ of size $k$
- Objective: minimizing total flow

$$\sum_{i,j} \; \min_{v \in K \cup \{j\}} f_{i,j} \left[ p_{i,j}\, d(i,v) + (1-p_{i,j})\,\bigl(d(i,v)+d(v,j)\bigr) \right]$$
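To make the objective concrete, the sketch below evaluates the total flow for one candidate cache set $K$. It assumes that $d$ is the shortest-path distance in $G$ and that demands and hit ratios are given as dictionaries keyed by (client, server); the function names and the toy four-node line are illustrative, not taken from the slides.

```python
from itertools import product

def all_pairs_distances(n, edges):
    """Floyd-Warshall on an undirected graph given as (u, v, length) triples."""
    inf = float("inf")
    d = [[0 if u == v else inf for v in range(n)] for u in range(n)]
    for u, v, w in edges:
        d[u][v] = d[v][u] = min(d[u][v], w)
    # product() varies its first component slowest, so m is the outer loop.
    for m, u, v in product(range(n), repeat=3):
        d[u][v] = min(d[u][v], d[u][m] + d[m][v])
    return d

def total_flow(d, demands, hit_ratio, caches):
    """Objective of the general k-cache location problem for a fixed set K = caches."""
    total = 0.0
    for (i, j), f in demands.items():
        p = hit_ratio[(i, j)]
        # Either some cache v serves the hits, or the server j itself does (v = j).
        total += f * min(
            p * d[i][v] + (1 - p) * (d[i][v] + d[v][j])
            for v in set(caches) | {j}
        )
    return total

# Toy instance: line 0 - 1 - 2 - 3, client 0 asks server 3.
d = all_pairs_distances(4, [(0, 1, 1), (1, 2, 1), (2, 3, 1)])
demands = {(0, 3): 10}
hit_ratio = {(0, 3): 0.5}
print(total_flow(d, demands, hit_ratio, caches=set()))  # 30.0
print(total_flow(d, demands, hit_ratio, caches={1}))    # 20.0
```

With no cache, all 10 units travel 3 hops. With a cache at node 1, every request travels 1 hop and only the missed half travels the remaining 2 hops.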

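The k-TERC Location Problem
- Instance:
  - an undirected graph $G=(V,E)$
  - a set of demands $F=\{f_{i,j}\}$
  - a set of hit ratios $P=\{p_{i,j}\}$
  - $k$, the number of caches
- Solution: $K$, a subset of $V$ of size $k$
- Objective: minimizing total flow, where $v$ must lie on the path from $j$ to $i$

$$\sum_{i,j} \; \min_{\substack{v \in K \cup \{j\} \\ v \text{ on the path from } j \text{ to } i}} f_{i,j} \left[ p_{i,j}\, d(i,v) + (1-p_{i,j})\,\bigl(d(i,v)+d(v,j)\bigr) \right]$$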

Remarks
- A generalization of the p-median problem (in the p-median problem we want to minimize the total cost of serving a set of demands from at most p centers)
- In the k-TERC location problem:
  - it is enough to solve the problem for fixed $p$ ($p_{i,j} = p$)
  - the optimal set $K$ does not depend on $p$
  - (not true in general)
- The k-TERC location problem is a special case of the general k-location problem (p=1/n)
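One way to see why the optimal set $K$ does not depend on $p$ in the TERC case is the short derivation below. It is a sketch written from the objective as stated in the problem definition, under two assumptions that are not spelled out on the slide: all hit ratios equal a common $p > 0$, and distances add up along a route, so $d(i,v) + d(v,j) = d(i,j)$ for every $v$ on the path from $j$ to $i$.

```latex
\begin{align*}
p\,d(i,v) + (1-p)\bigl(d(i,v)+d(v,j)\bigr)
  &= d(i,v) + (1-p)\,d(v,j) \\
  &= d(i,j) - p\,d(v,j), \\[4pt]
\sum_{i,j} \min_{v} f_{i,j}\bigl[d(i,j) - p\,d(v,j)\bigr]
  &= \sum_{i,j} f_{i,j}\,d(i,j) \;-\; p \sum_{i,j} f_{i,j} \max_{v} d(v,j).
\end{align*}
```

The first sum is a constant of the instance, so minimizing the total flow is the same as maximizing $\sum_{i,j} f_{i,j} \max_v d(v,j)$, an expression in which $p$ no longer appears. In the general problem $v$ need not lie on the route, the additivity assumption fails, and the argument does not go through.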

The independence of p s,c TERC constant

Hardness Results
Table with columns line, tree, and general graph, and rows one server and m servers; each cell is marked either Poly. or NP-hard.

Placement on a line
- Topology: a line of n nodes, numbered 0, 1, 2, ..., n-1
- Every node may be a server, a client, or both.
- $FR(i)$: the flow demand on the segment $(i-1,i)$. FR can be easily computed from the input.
- $FC(i,l_o,l_i)$: the flow on the segment $(i-1,i)$ when the closest caches to $i$ are in $l_o$ and $l_i$. FC can be computed from the input with p=1.
- Note: $FR(i) = FC(i,n-1,0)$
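As an illustration of how FR can be computed from the input, the sketch below accumulates the demand that crosses each segment of the line. It assumes FR is the flow with no caches in place and that demands are given as a dictionary keyed by (client, server); the function name and the sample demands are hypothetical.

```python
def segment_flow(n, demands):
    """FR[i] = total demand crossing segment (i-1, i) on a line with nodes 0..n-1."""
    diff = [0] * (n + 1)
    for (client, server), f in demands.items():
        lo, hi = sorted((client, server))
        diff[lo + 1] += f  # first segment crossed is (lo, lo+1)
        diff[hi + 1] -= f  # last segment crossed is (hi-1, hi)
    fr, running = [0] * n, 0
    for i in range(1, n):
        running += diff[i]
        fr[i] = running
    return fr  # fr[0] is unused: there is no segment (-1, 0)

# Client 0 requests 4 units from server 3; client 4 requests 1 unit from server 2.
print(segment_flow(5, {(0, 3): 4, (4, 2): 1}))  # [0, 4, 4, 5, 1]
```

The difference array keeps the computation linear in the number of nodes plus the number of demands, instead of walking every segment of every demand.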

Placement on a line
$C(j,l_o,l_i,k)$: the overall flow in segment $[0,j]$ when $k$ caches are located optimally inside the segment, and the closest caches to $j$ are in $l_o$ and $l_i$.

The Dynamic Program
- Base case (j=1):
- For j>1:

The Algorithm
1. Compute $C(1,l_o,1,1)$ and $C(1,l_o,0,0)$ for $1 \le l_o \le n-1$
2. For each $j>1$, compute $C(j,l_o,l_i,k')$ for all $0 \le k' \le k$ and $0 \le l_i \le j \le l_o \le n-1$
Complexity: $O(n^3 k)$

Optimizing for a single server
- The routes from the server to all clients form a tree (actually a DAG)
- We'll use dynamic programming to find the optimal cache locations

The Greedy Algorithm
- Optimal algorithm using bottom-up dynamic programming:
  - not trivial
  - complexity $O(n k^2 h)$
- Greedy:
  - repeat k times {find the best cache location}
  - complexity $O(nk)$
- How bad can it be?
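The greedy rule can be sketched on a single-server tree as follows, with an exhaustive search alongside it for comparison. Assumptions in this sketch: node 0 is the server, `parent[v] < v`, all edges have unit length, every demand shares one hit ratio `p`, and a request is served by the first cache on its way to the server. The function names and the toy tree are hypothetical, and the greedy step re-evaluates the whole objective for each candidate, so it is simpler but slower than the $O(nk)$ bound quoted above.

```python
from itertools import combinations

def depths(parent):
    """Hop distance from each node to the server (node 0); assumes parent[v] < v."""
    d = [0] * len(parent)
    for v in range(1, len(parent)):
        d[v] = d[parent[v]] + 1
    return d

def total_flow(parent, demand, caches, p):
    """Total flow when each request is served by the first cache on its route."""
    d = depths(parent)
    total = 0.0
    for u, f in enumerate(demand):
        v = u
        while v != 0 and v not in caches:
            v = parent[v]
        # d(u, v) hops carry all the flow; d(v, server) hops carry only the misses.
        total += f * ((d[u] - d[v]) + (1 - p) * d[v])
    return total

def greedy(parent, demand, k, p):
    """Repeat k times: add the single location that lowers the total flow most."""
    caches = set()
    for _ in range(k):
        candidates = [v for v in range(1, len(parent)) if v not in caches]
        caches.add(min(candidates,
                       key=lambda v: total_flow(parent, demand, caches | {v}, p)))
    return caches

def optimal(parent, demand, k, p):
    """Exhaustive search over all k-subsets; only feasible for tiny trees."""
    nodes = range(1, len(parent))
    return min((set(c) for c in combinations(nodes, k)),
               key=lambda c: total_flow(parent, demand, c, p))

# Chain 0-1-2-3 with two leaves (4 and 5) under node 3, 10 units of demand each.
parent = [-1, 0, 1, 2, 3, 3]
demand = [0, 0, 0, 0, 10, 10]
g, o = greedy(parent, demand, 2, 0.5), optimal(parent, demand, 2, 0.5)
print(sorted(g), total_flow(parent, demand, g, 0.5))  # [3, 4] 45.0
print(sorted(o), total_flow(parent, demand, o, 0.5))  # [4, 5] 40.0
```

On this toy tree the first greedy pick is the shared node 3, which is the best single location but is not part of the best pair, so greedy ends 12.5% above the optimum.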

Greedy Vs. Optimal

Dynamic Programming for Tree
- First we convert the tree to a binary tree by adding dummy nodes.
- Sort all nodes in reverse BFS order: a node's descendants are numbered before the node itself.
- The children of node $i$ are $i_R$ and $i_L$.
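Both preprocessing steps can be sketched in a few lines. The representation is an assumption of this example: the tree is a dictionary from node to its list of children, and dummy nodes are tuples of the form `("dummy", n)`. For the later dynamic program, a dummy node would carry no demand and connect to its parent with a zero-length edge, so it does not change any cost.

```python
from collections import deque

def binarize(children):
    """Return a tree in which every node has at most two children.

    Extra children are pushed down a chain of dummy nodes.
    """
    result, counter = {}, 0
    for node, kids in children.items():
        kids = list(kids)
        current = node
        while len(kids) > 2:
            dummy = ("dummy", counter)
            counter += 1
            result[current] = [kids.pop(0), dummy]
            current = dummy
        result[current] = kids
    return result

def reverse_bfs_order(children, root):
    """Nodes in reverse BFS order: every descendant comes before its ancestors."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children.get(node, []))
    return order[::-1]

tree = {"s": ["a", "b", "c"], "a": ["d"]}  # "s" has three children
binary = binarize(tree)
print(binary["s"])                     # ['a', ('dummy', 0)]
print(reverse_bfs_order(binary, "s"))  # ['c', 'b', 'd', ('dummy', 0), 'a', 's']
```

Processing nodes in this order guarantees that the values for both children are already available when a node is reached.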

Notations
- $C(i,k',l)$ is the cost of a subtree rooted at $i$ with $k'$ optimally located caches, where the next cache up the tree is at distance $l$ from $i$.
- $F(i,k',l)$ is the sum of demands in the subtree $i$ that do not pass through a cache in the solution $C(i,k',l)$.

The Dynamic Program

The DP Formula for C(i,k,l)
- The cost if a cache is not placed at node i:
- The cost if a cache is placed at node i:
- Complexity: $O(n \cdot h \cdot k)$ variables, hence $O(n \cdot h \cdot k^2)$ time complexity
- Finer analysis yields $O(n \cdot h \cdot k)$ time complexity
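A dynamic program over states $(i, k', l)$ can be sketched as below. The recurrence is written from the problem definition and is not the formula from the slides. It assumes a single server at node 0, `parent[v] < v`, unit-length edges, one common hit ratio `p`, and "at most $k'$ caches" per subtree. It also handles any number of children directly instead of binarizing first. All names are hypothetical.

```python
from functools import lru_cache

def optimal_tree_placement(parent, demand, k, p):
    """Minimum total flow on a single-server tree with at most k caches."""
    n = len(parent)
    children = [[] for _ in range(n)]
    depth = [0] * n
    for v in range(1, n):
        children[parent[v]].append(v)
        depth[v] = depth[parent[v]] + 1

    def served_cost(u, cache_depth):
        # Flow of u's own demand when its first cache (or the server) is at cache_depth.
        return demand[u] * ((depth[u] - cache_depth) + (1 - p) * cache_depth)

    @lru_cache(maxsize=None)
    def C(i, kk, l):
        """Subtree i, at most kk caches, next cache (or server) l hops above i."""
        no_cache = served_cost(i, depth[i] - l) + split(i, 0, kk, l + 1)
        if kk == 0:
            return no_cache
        with_cache = served_cost(i, depth[i]) + split(i, 0, kk - 1, 1)
        return min(no_cache, with_cache)

    @lru_cache(maxsize=None)
    def split(i, idx, kk, l):
        """Best way to share kk caches among children[i][idx:]."""
        if idx == len(children[i]):
            return 0.0
        child = children[i][idx]
        return min(C(child, a, l) + split(i, idx + 1, kk - a, l)
                   for a in range(kk + 1))

    return C(0, k, 0)

# Chain 0-1-2-3 with two leaves (4 and 5) under node 3, 10 units of demand each.
parent = [-1, 0, 1, 2, 3, 3]
demand = [0, 0, 0, 0, 10, 10]
for k in range(3):
    print(k, optimal_tree_placement(parent, demand, k, 0.5))
# 0 80.0
# 1 50.0
# 2 40.0
```

There are $O(n \cdot h \cdot k)$ states for `C`, and each is combined in $O(k)$ steps per child, which is in line with the $O(n \cdot h \cdot k^2)$ bound quoted above. The finer $O(n \cdot h \cdot k)$ analysis is not reproduced here.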

The Server’s Point of View

Traffic Reduction

TERCs Vs. Edge Caches

The Server’s Point of View (2)

Popularity Stability