
1 Efficient Algorithms for Locating Web Proxies. Li-Chuan Chen (lichen@mitre.org), The MITRE Corporation. Co-author: Hyeong-Ah Choi, George Washington University. 2001 Conference on Information Sciences and Systems (CISS'01)

2 MITRE: Li-Chuan Chen. Outline: Research motivation. Background. Research goals. Literature review, problem formulation, and results. Summary.

3 Research Motivation: With the increased popularity of the World Wide Web (WWW or Web), a number of problems have emerged: – Overloaded servers – Internet backbone congestion – Slow access to Web services

4 Background: Approaches to Reducing Server Load. Mirror Web Sites: Replicate Web server contents throughout the network. (The user must select a server.) Distributed Web Server: A cluster of distributed servers acting as a single server. Web Caching: Store frequently requested Web documents in proxy servers or on users' machines.

5 Research Goals: To reduce Web server load and to increase the efficiency and reliability of Web system performance by caching frequently accessed documents at strategically chosen Web proxy locations throughout the network. We will consider the design of optimization algorithms for achieving these objectives. – Note that most formulations of these problems are NP-hard. – We consider special cases and approximation algorithms for Proxy Location.

6 Proxy Location Problem: Popular Web sites have to cope with an enormous number of requests. A Web proxy (cache) sits between users and servers. The proxy returns the requested document to the user if it is in the cache; otherwise it requests the document from the server and stores it before returning it to the user.
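The hit/miss behavior described on this slide can be sketched in a few lines. This is a toy illustration only: the class name `CachingProxy`, the `origin` callback, and the hit/miss counters are assumptions introduced here, and replacement and consistency policies are ignored.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy proxy: serve from the cache on a hit, else fetch, store, and return."""

    def __init__(self, fetch_from_server: Callable[[str], str]) -> None:
        self._fetch = fetch_from_server   # hypothetical origin-server callback
        self._cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        if url in self._cache:            # cache hit: answer locally
            self.hits += 1
            return self._cache[url]
        self.misses += 1                  # cache miss: ask the server
        document = self._fetch(url)
        self._cache[url] = document       # store before returning to the user
        return document


def origin(url: str) -> str:
    # Stand-in for the real Web server (assumption for this sketch).
    return "content of " + url


proxy = CachingProxy(origin)
first = proxy.get("/index.html")    # miss: fetched from the origin
second = proxy.get("/index.html")   # hit: served from the cache
```

Only the first request reaches the origin server; the second is answered by the proxy.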

7 Proxy Location Problem: A popular Web site places its documents closer to users by replicating them on Web proxies throughout the network. Goal: Locate k proxy servers in a network of n nodes to minimize the overall cost of accessing Web documents.

8 Proxy Location: Literature. Replacement Algorithms: When the cache is full, how do you replace existing Web documents with new ones? [CK99, Ira97, AY97] Cache Consistency: Deals with the problem of keeping cached Web documents consistent with the original copy. [LC98, Din96] Proxy Placement: Where should proxies be placed so that Web documents are closer to the user? [KRS00, LGIDS99, LDGS98, HT91]

9 Proxy Location: Problem Formulation. Given a network G = (V, E) with n nodes and an integer k. Each node v_i is associated with a number of document requests w(v_i). Let D(v, u_i) denote the communication cost from node v to proxy u_i. Objective: Place k proxies U = {u_1, u_2, …, u_k} and assign each node v to its nearest proxy u_i, so as to minimize the sum of w(v) D(v, u_i) over all nodes v, where u_i is the proxy nearest to v. Topologies considered: linear topology and ring topology.
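As a small worked instance of this objective, the sketch below evaluates the cost of a given placement. The function name `placement_cost`, the distance matrix `dist`, and the 5-node path with unit-length edges are illustrative assumptions, not data from the talk.

```python
from typing import Sequence


def placement_cost(weights: Sequence[float],
                   dist: Sequence[Sequence[float]],
                   proxies: Sequence[int]) -> float:
    """Sum over all nodes v of w(v) times the distance to v's nearest proxy."""
    return sum(w * min(dist[v][u] for u in proxies)
               for v, w in enumerate(weights))


# Assumed example: a 5-node linear topology with unit-length edges.
n = 5
dist = [[abs(i - j) for j in range(n)] for i in range(n)]
weights = [3, 1, 1, 1, 3]           # w(v): requests issued at each node

cost_ends = placement_cost(weights, dist, [0, 4])   # proxies at both ends
cost_mid = placement_cost(weights, dist, [2])       # single proxy in the middle
```

With these weights, two proxies at the ends cost 4, while a single proxy in the middle costs 14.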

10 Proxy Location: History. Li et al. [LDGS98] presented an O(kn^2) algorithm for the linear unidirectional case. We improved this to O((log k) n^2) and generalized it to the bidirectional case with the same running time. Krishnan et al. [KRS00] recently presented an O(kn^3)-time algorithm for the unidirectional case. Later we discovered an O(kn)-time algorithm in the operations research (OR) literature, by Hassin and Tamir [HT91].

11 Proxy Allocation: Results. Uni- and bidirectional ring topologies: Compute an optimal proxy placement in O(n^2) time. (Improves on the O(kn^4) algorithm of Krishnan et al. [KRS00].)

12 Proxy Location: Linear Topology. Dynamic Programming Formulation: Break the interval [1, j] into subintervals [1, j'] and [j'+1, j]. Place one proxy in [j'+1, j] and k-1 proxies in [1, j']. Find the best split point j'. [Figure: the interval from i = 1 to j, split between j' and j'+1]
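The recurrence on this slide can be sketched as a plain dynamic program for the bidirectional linear case. This is the straightforward formulation with a naive interval-cost table, not the faster algorithms cited in the talk; the function name `linear_proxy_dp` and the use of node coordinates `pos` are assumptions of the sketch, which also assumes k ≤ n.

```python
from typing import List, Sequence


def linear_proxy_dp(pos: Sequence[float], w: Sequence[float], k: int) -> float:
    """Optimal cost of placing k proxies on a line (nodes sorted by position)."""
    n = len(pos)
    inf = float("inf")

    # one[i][j]: cheapest cost of serving nodes i..j (inclusive) with a
    # single proxy located at one of those nodes. Naive O(n^4) table.
    one: List[List[float]] = [[inf] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            one[i][j] = min(
                sum(w[v] * abs(pos[v] - pos[u]) for v in range(i, j + 1))
                for u in range(i, j + 1)
            )

    # best[m][j]: optimal cost of serving the first j nodes with m proxies.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(1, n + 1):
            # jp plays the role of j': the first jp nodes get m-1 proxies,
            # and the last interval (nodes jp..j-1, 0-based) gets one proxy.
            best[m][j] = min(best[m - 1][jp] + one[jp][j - 1]
                             for jp in range(j))
    return best[k][n]


example = linear_proxy_dp([0, 1, 2, 3, 4], [3, 1, 1, 1, 3], 2)
```

Splitting into contiguous intervals loses nothing on a line, because assigning every node to its nearest proxy always produces contiguous intervals that each contain their proxy.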

13 Proxy Location: Ring Topology. Break the ring at any point, and reduce to the linear case. Solve the linear problem in O(kn) time [HT91]. To get the optimal solution, we need to break the ring at an optimal break point. A brute-force approach that tries every break point would result in an O(kn^2)-time algorithm.
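The brute-force reduction can be sketched as follows: cut the ring at each of its n edges, solve the resulting linear instance, and keep the cheapest answer. The linear solver here is a simple dynamic program standing in for the O(kn) algorithm of [HT91], so the sketch does not achieve the running time quoted on the slide. The names `linear_opt`, `ring_opt_brute_force`, and `edge_len` are assumptions.

```python
from typing import Sequence


def linear_opt(pos: Sequence[float], w: Sequence[float], k: int) -> float:
    """Optimal k-proxy cost on a line; simple DP, not the O(kn) method."""
    n = len(pos)
    inf = float("inf")
    # one[i][j]: best single-proxy cost for nodes i..j (inf when i > j).
    one = [[min(sum(w[v] * abs(pos[v] - pos[u]) for v in range(i, j + 1))
                for u in range(i, j + 1)) if i <= j else inf
            for j in range(n)] for i in range(n)]
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(1, n + 1):
            best[m][j] = min(best[m - 1][jp] + one[jp][j - 1]
                             for jp in range(j))
    return best[k][n]


def ring_opt_brute_force(edge_len: Sequence[float],
                         w: Sequence[float], k: int) -> float:
    """edge_len[i] is the length of the ring edge between i and (i+1) % n."""
    n = len(w)
    best = float("inf")
    for cut in range(n):                      # remove edge (cut, cut+1)
        order = [(cut + 1 + t) % n for t in range(n)]
        pos = [0.0]
        for a in order[:-1]:                  # walk the remaining path
            pos.append(pos[-1] + edge_len[a])
        best = min(best, linear_opt(pos, [w[v] for v in order], k))
    return best


ring_cost = ring_opt_brute_force([1] * 6, [1] * 6, 2)
```

On a 6-node ring with unit edges and unit demand, two proxies placed opposite each other give cost 4, which is what the sketch returns.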

14 Ring Topology: Our Method. Rather than trying all n possible choices for the optimal break point, we show that the optimal break point can be selected from a set of only n/k candidate break points. Interleaving Property: Let x_1, x_2, …, x_k denote the optimal break point sequence for the ring, and let y_1, y_2, …, y_k be the optimal linear break points resulting from an arbitrary cut of the ring; then the two sequences interleave around the ring. [Figure: break points x_1, …, x_4 and y_1, …, y_4 on a ring]

15 Ring Topology: Our Method. Using interleaving, we can find a set of n/k candidate break points as follows. Break the ring at an arbitrary point and compute the optimal linear break points. Choose the interval between consecutive break points that has the fewest nodes (at most n/k). Break the ring at each position in this interval, and solve the linear problem for each. By interleaving, one of these will be optimal. Select the one with the lowest cost.

16 Heuristics and Performance Analysis: Many of our existing results are approximations or apply to special cases, because the underlying optimization problems are NP-hard. We implemented heuristics for the proxy location problem for general Internet topologies given a fixed number of servers k.

17 Internet Topology Input Graph: We used the Tiers model of Calvert, Doar, and Zegura of Georgia Tech [CDZ97] for Internet topology generation. Tiers is based on a three-level hierarchical network (WAN, MAN, LAN). Twenty random Internet graphs were generated for each of n = 63, 119, 267, 575, and 1144 nodes.

18 Internet Topology Input Graph: Example with n = 575 nodes:

19 Heuristics for Proxy Location Given Number of Servers k: Random: Randomly select k servers and output the cost. n-Random-Pairs and (n log n)-Random-Pairs: Start with Random. Repeatedly select a node i at random and consider swapping it with a random existing server; perform the swap if it is profitable. The process is repeated n (or n log n) times. (n log n)-Random-Clients: Similar to (n log n)-Random-Pairs, except that after randomly selecting node i, we swap it with the server that gives the best cost. We assume all nodes have equal demand, w(v_i) = 1.
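A minimal sketch of the Random-Pairs and Random-Clients heuristics, assuming unit demand and an all-pairs distance matrix `dist`. The function names, the `seed` parameter, and the `try_all_servers` flag (which switches from Random-Pairs to the Random-Clients variant) are assumptions of this sketch, not the implementation used in the simulations.

```python
import random
from typing import List, Sequence, Tuple


def cost(dist: Sequence[Sequence[float]], servers: Sequence[int]) -> float:
    """Unit-demand cost: every node pays the distance to its nearest server."""
    return sum(min(dist[v][s] for s in servers) for v in range(len(dist)))


def random_pairs(dist: Sequence[Sequence[float]], k: int, iterations: int,
                 try_all_servers: bool = False,
                 seed: int = 0) -> Tuple[List[int], float]:
    rng = random.Random(seed)
    n = len(dist)
    servers = rng.sample(range(n), k)        # the "Random" starting solution
    best = cost(dist, servers)
    for _ in range(iterations):
        i = rng.randrange(n)                 # random candidate node
        if i in servers:
            continue
        # Random-Pairs: one random server slot; Random-Clients: every slot.
        slots = range(k) if try_all_servers else [rng.randrange(k)]
        options = []
        for slot in slots:
            trial = servers[:slot] + [i] + servers[slot + 1:]
            options.append((cost(dist, trial), trial))
        trial_cost, trial = min(options)
        if trial_cost < best:                # keep only profitable swaps
            servers, best = trial, trial_cost
    return sorted(servers), best


# Assumed example: an 8-node path with unit-length edges.
n = 8
dist = [[abs(a - b) for b in range(n)] for a in range(n)]
servers, final_cost = random_pairs(dist, k=2, iterations=n)
```

Passing `iterations=n` corresponds to n-Random-Pairs; an iteration count near n log n, with `try_all_servers=True`, corresponds to the Random-Clients variant.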

20 Heuristics for Proxy Location Given Number of Servers k (continued): Swap-to-Limit: Start with Random. For each existing server j, try swapping j with each client i. Select the swap that gives the best cost. Repeat until no swap improves the cost.
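Swap-to-Limit reads as a best-improvement local search, which might be sketched as below. Unit demand and a distance matrix `dist` are assumed, and the starting set is passed in explicitly (rather than drawn at random) so that the example is reproducible; all names are illustrative.

```python
from typing import List, Sequence, Tuple


def cost(dist: Sequence[Sequence[float]], servers: Sequence[int]) -> float:
    """Unit-demand cost: every node pays the distance to its nearest server."""
    return sum(min(dist[v][s] for s in servers) for v in range(len(dist)))


def swap_to_limit(dist: Sequence[Sequence[float]],
                  start: Sequence[int]) -> Tuple[List[int], float]:
    n = len(dist)
    servers = list(start)
    best = cost(dist, servers)
    while True:
        best_trial = None
        for slot in range(len(servers)):          # each existing server j
            for i in range(n):                    # each client i
                if i in servers:
                    continue
                trial = servers[:slot] + [i] + servers[slot + 1:]
                trial_cost = cost(dist, trial)
                if trial_cost < best:             # remember the best swap
                    best, best_trial = trial_cost, trial
        if best_trial is None:                    # no swap improves the cost
            return sorted(servers), best
        servers = best_trial


# Assumed example: a 6-node path with unit-length edges, poor starting set.
dist = [[abs(a - b) for b in range(6)] for a in range(6)]
result = swap_to_limit(dist, [0, 1])
```

Starting from servers {0, 1} (cost 10), a single best swap moves to {1, 4} (cost 4), after which no swap helps.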

21 Simulation Results: Brute-Force Search: Computes the optimal solution by generating all k-node subsets of {1, 2, …, n} and computing the cost of each subset. Requires O(n^k) time, and so is not practical for large values of k and n. For small values of k (k = 2, 3, 4, 5, 6 servers), we ran the heuristics and compared them with the brute-force algorithm.
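The brute-force baseline is short to sketch with the standard library. Unit demand and a distance matrix `dist` are assumed; the function names are illustrative.

```python
from itertools import combinations
from typing import Sequence, Tuple


def cost(dist: Sequence[Sequence[float]], servers: Sequence[int]) -> float:
    """Unit-demand cost: every node pays the distance to its nearest server."""
    return sum(min(dist[v][s] for s in servers) for v in range(len(dist)))


def brute_force(dist: Sequence[Sequence[float]],
                k: int) -> Tuple[float, Tuple[int, ...]]:
    """Try every k-node subset and return (optimal cost, optimal subset)."""
    n = len(dist)
    return min((cost(dist, subset), subset)
               for subset in combinations(range(n), k))


# Assumed example: a 6-node path with unit-length edges.
dist = [[abs(a - b) for b in range(6)] for a in range(6)]
optimum = brute_force(dist, 2)
```

The number of subsets examined is the binomial coefficient C(n, k), which is why this is only feasible for the small values of k listed on the slide.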

22 Simulation Results: Brute force versus heuristics for k = 3: cost

23 Simulation Results: Brute force versus heuristics for k = 3: CPU

24 Simulation Results: For larger values of k (k = 2, 4, 8, 16, 32 servers), we ran and compared the heuristics for proxy location given a fixed number of servers k. We also collected statistics on the intermediate costs for n = 63, 119, 267, 575, 1144 and k = 2, 4, 8, 16, 32.

25 Simulation Results: Heuristics given number of servers for k = 32: cost

26 Simulation Results: Heuristics given number of servers for k = 32: CPU time

27 Simulation Results: Intermediate cost for n = 1144, k = 32:

28 Summary: We have introduced the problem of improving the efficiency of access to Web services through proxy location. Most problem formulations are NP-hard. We have presented algorithms for the ring topology. We have implemented heuristics for the general case and presented simulations for performance evaluation.

29 Thank you!

30 Proxy Location: Ring. Submodular:

31 Proxy Location: Ring. Interleaving: X' and Y' interleave, but not X_opt and Y_opt

32 Heuristics for Proxy Location Given the Cost of Opening Each Server: We assume that there is a fixed cost for opening each server. Random: Opens a random server and computes the cost; repeats as long as the cost decreases. Greedy: Similar to Random, but repeatedly selects the server that gives the maximum cost reduction (never deletes a server); repeats until no server can be added without increasing the cost.
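The Greedy heuristic with a fixed opening cost might look like the sketch below. It assumes unit demand, a distance matrix `dist`, and a total cost equal to the opening cost per server plus the access cost; it stops as soon as no addition strictly lowers the total. The names `total_cost`, `greedy_open`, and `open_cost` are illustrative.

```python
from typing import List, Sequence, Tuple


def total_cost(dist: Sequence[Sequence[float]], servers: Sequence[int],
               open_cost: float) -> float:
    """Opening cost for every server plus unit-demand access cost."""
    access = sum(min(dist[v][s] for s in servers) for v in range(len(dist)))
    return open_cost * len(servers) + access


def greedy_open(dist: Sequence[Sequence[float]],
                open_cost: float) -> Tuple[List[int], float]:
    n = len(dist)
    servers: List[int] = []
    best = float("inf")
    while len(servers) < n:
        # Evaluate opening each closed node and take the cheapest result.
        options = [(total_cost(dist, servers + [i], open_cost), i)
                   for i in range(n) if i not in servers]
        trial_cost, i = min(options)
        if trial_cost >= best:        # no addition lowers the total: stop
            break
        servers.append(i)             # servers are never closed again
        best = trial_cost
    return sorted(servers), best


# Assumed example: a 6-node path with unit-length edges, opening cost 3.
dist = [[abs(a - b) for b in range(6)] for a in range(6)]
opened = greedy_open(dist, open_cost=3)
```

In this example Greedy opens node 2 (total 12), then node 4 (total 11), and stops because a third server would raise the total to at least 12.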

33 Heuristics for Proxy Location: Run-(n log n) (Charikar and Guha [CG99]): – Start with Random. – Repeat n log n times: Select a node i at random as a new server location. For each existing server i', consider closing i' and reassigning its clients to i; if this is profitable, do it. If the overall cost is lower, then open i and apply these changes; otherwise ignore i. Run-to-Limit: Same as Run-(n log n), except that the algorithm terminates only when no more improvement can be made.

34 Simulation Results: Heuristics given a cost of $20K for opening each server:

35 Simulation Results: Heuristics given a cost of $20K for opening each server: CPU

36 Fault Tolerance: Possible Faults: network failures, server failures, document demand changes, network transfer rate changes. Constraints: After any single failure, a constant number of proxies may be relocated. Goal: Design an algorithm that achieves an approximately optimal solution for restoring Web services when a server fails.

37 Fault Tolerance: Optimal placement for 5 proxies. Proxy fails. Optimal placement for 4 proxies: very costly.

38 Fault Tolerance: On-line Approach. On-line placement for 5 proxies: not optimal but good. Proxy fails. Move the last proxy to replace the failed server. The result is the same as the 4-proxy on-line placement. [Figure labels: proxies 1 to 5 before the failure; proxies 1, 4, 3, and (2) after]

39 Fault Tolerance: Server Failures. On-line Algorithm: Makes decisions in response to a series of requests without knowledge of the entire input sequence. Approximate Optimality: For any m, the cost of our on-line algorithm with 2m proxies is less than that of the optimal algorithm with m proxies. Strategy: Build an initial set of m proxies using the on-line algorithm. When a server x fails: – If x is the last proxy added, then no action is needed. – Else let y be the last proxy; move y's documents to the node nearest to x, and remove y. We now have m-1 remaining servers, and approximate optimality.
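The recovery strategy can be sketched as an operation on the list of proxies kept in the order the on-line algorithm added them. The slide does not specify the on-line placement algorithm itself, so it is not modeled here. The sketch also assumes that "the node nearest to x" means the closest node that is neither x nor a remaining proxy; that reading, and the names `handle_failure`, `order`, and `dist`, are assumptions.

```python
from typing import List, Sequence


def handle_failure(order: Sequence[int], failed: int,
                   dist: Sequence[Sequence[float]]) -> List[int]:
    """Return the proxy list after proxy `failed` goes down.

    order: proxies in the order the on-line algorithm added them.
    """
    remaining = list(order)
    last = remaining.pop()                 # y: the last proxy added
    if failed == last:
        return remaining                   # nothing to relocate
    slot = remaining.index(failed)
    # Assumed interpretation: relocate y's documents to the free node
    # closest to the failed proxy x (x itself is unavailable).
    candidates = [v for v in range(len(dist))
                  if v != failed and v not in remaining]
    replacement = min(candidates, key=lambda v: dist[failed][v])
    remaining[slot] = replacement          # takes over x's place in the order
    return remaining


# Assumed example: a 6-node path with unit-length edges, proxies added 2, 5, 0.
dist = [[abs(a - b) for b in range(6)] for a in range(6)]
after_inner_failure = handle_failure([2, 5, 0], failed=2, dist=dist)
after_last_failure = handle_failure([2, 5, 0], failed=0, dist=dist)
```

Either way, m-1 proxies remain and only one proxy is relocated per failure, in line with the constant-relocation constraint stated earlier in the talk.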

40 Fault Tolerance: Other Problems. Network Failures: How to reroute network traffic to make use of the existing set of proxies? How to determine the best way to place proxies in the updated topology? (Network failures cannot be tolerated in the linear topology; this applies only to more general topologies.) Link Transfer Rate Changes: How to move proxies when such changes occur? (This can be modeled as a change in the distance function.) Temporal Variations: Demand rates and network transfer rates vary (e.g., at lunch time or during events). Determine solutions that are approximately optimal for each possible demand scenario and apply them accordingly.

41 Major Contributions: Efficient algorithms for optimal proxy location on ring topologies. Use of submodularity to produce more efficient DP solutions.

42 Future Directions: Consider ways of strengthening our existing results, either by improving the efficiency of the algorithms or by eliminating some of the assumptions made. Tree topology: Generalize the proxy location results from linear to tree topologies. Non-homogeneous proxies/documents: We have assumed that all proxies hold the same documents. An important generalization would be to determine both the placement of proxies and how documents are assigned to proxies. Fault tolerance: How to deal with proxy failures and fluctuations in demand.

