Internet Networking Spring 2002 Tutorial 13 Web Caching Protocols ICP, CARP.

1 Internet Networking Spring 2002 Tutorial 13 Web Caching Protocols ICP, CARP

2 ICP - Internet Caching Protocol
ICP is a Web caching protocol
– ICP version 2 is defined in RFC 2186
– Used to exchange hints about the existence of URLs in neighbor caches
Caches exchange ICP queries and replies
– To gather information for selecting the most appropriate location from which to retrieve an object

3 ICPv2 Protocol specification
Generally, Web caches use HTTP to transfer object data
However, caches can benefit from a simpler, lighter communication protocol
ICP is primarily used in a cache mesh to locate specific Web objects in neighboring caches
– One cache sends an ICP query to its neighbors
– The neighbors send back ICP replies indicating a "HIT" or a "MISS"

4 ICP Implementation
In current practice, ICP is implemented on top of UDP
– There is no requirement that it be limited to UDP
ICP over UDP offers features important to Web caching applications
– The query/reply exchange needs to occur quickly, typically within a second or two
– A cache cannot wait longer than that before beginning to retrieve an object
– Failure to receive a reply means the network path is either congested or broken; in either case we would not want to select that neighbor

5 Cache selection
Failure to receive a reply from a cache
– Indicates a network or system failure
The ICP reply may include extra information
– This can assist in selecting the most appropriate source from which to retrieve an object

6 ICP Opcodes
ICP_OP_QUERY
ICP_OP_HIT
ICP_OP_MISS
ICP_OP_ERR
ICP_OP_DENIED
ICP_OP_HIT_OBJ
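The opcodes above can be written down as an enumeration. The numeric values below are the ones assigned in RFC 2186; the class name `IcpOpcode` and the helper `is_usable_hit` are hypothetical names chosen for this sketch, not part of any cache implementation.

```python
from enum import IntEnum


class IcpOpcode(IntEnum):
    """ICP version 2 opcodes listed on the slide (values per RFC 2186)."""
    QUERY = 1      # ICP_OP_QUERY: "do you have this URL?"
    HIT = 2        # ICP_OP_HIT: the neighbor holds the object
    MISS = 3       # ICP_OP_MISS: the neighbor does not hold it
    ERR = 4        # ICP_OP_ERR: the query could not be processed
    DENIED = 22    # ICP_OP_DENIED: the sender may not query this cache
    HIT_OBJ = 23   # ICP_OP_HIT_OBJ: hit, with the object data in the reply


def is_usable_hit(opcode: int) -> bool:
    """Return True when a reply opcode says the neighbor has the object."""
    return opcode in (IcpOpcode.HIT, IcpOpcode.HIT_OBJ)
```

A receiver would typically map the first byte of a reply onto this enumeration before deciding where to fetch the object.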

7 ICPv2 application specification
RFC 2187
A single Web cache reduces the amount of traffic generated by the clients behind it
Similarly, a group of Web caches can benefit by sharing another cache in much the same way
In a cache hierarchy (or mesh), one cache establishes peering relationships with its neighbor caches

8 Types of relationship
Parent
– A parent cache is essentially one level up in a cache hierarchy
Sibling
– A sibling cache is on the same level
Neighbor (peer)
– Either a parent or a sibling that is a single “cache-hop” away

9 Levels
The general flow of document requests is up the hierarchy
When a cache does not hold a requested object
– It may ask via ICP whether any of its neighbor caches has the object
– If there is a hit, the cache requests the object from that neighbor
– Otherwise the cache must forward the request either to a parent or directly to the origin server

10 The essential difference between a parent and a sibling
A "neighbor hit" may be fetched from either one
A "neighbor miss" may NOT be fetched from a sibling
– Only a parent can be asked to retrieve an object regardless of whether or not it has the object cached
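The parent/sibling rule can be sketched as a small selection function. This is only an illustration under stated assumptions: the `Reply` record and `choose_source` function are hypothetical names, replies are assumed to be listed in arrival order, and returning `None` stands for "go directly to the origin server".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Reply:
    """One ICP reply, reduced to the fields the rule needs (hypothetical)."""
    neighbor: str
    is_parent: bool
    hit: bool


def choose_source(replies: List[Reply]) -> Optional[str]:
    """Pick the neighbor to fetch from, or None for the origin server."""
    # A neighbor hit may be fetched from a parent or a sibling alike.
    for reply in replies:
        if reply.hit:
            return reply.neighbor
    # A neighbor miss may only be forwarded to a parent, never a sibling.
    for reply in replies:
        if reply.is_parent:
            return reply.neighbor
    # Only siblings answered, all with a miss: fetch from the origin server.
    return None
```

For example, a sibling hit beats a parent miss, while a sibling miss on its own sends the request to the origin server.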

11 ICP Delay
Caches are designed to answer ICP requests quickly
– The application does minimal processing of the ICP request, so most ICP-related delay is due to transmission on the network
ICP also provides an indication of neighbor reachability
If ICP replies from a neighbor fail to arrive, either
– The network path is congested (or down), or
– The cache application is not running on the ICP-queried neighbor machine

12 Determine whether to use ICP
Not every HTTP request requires an ICP query to be sent
– Cache hits do not need ICP because the request is satisfied immediately
– For origin servers very close to the cache, we do not want to use any neighbor caches

13 Determine whether to use ICP (cont.)
For an HTTP request to yield an ICP transaction, it must
– Not be a cache hit
– Not be to a local server
– Be a GET request
– Not match the `hierarchy_stoplist' configuration
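The four conditions can be expressed as one predicate. This is a minimal sketch, not Squid's code: the function name `needs_icp_query` and all of its parameters are hypothetical, the cache is modeled as a plain set of URLs, and the stoplist is assumed to be a list of substrings that disqualify a URL when any of them appears in it.

```python
from typing import Iterable, Set
from urllib.parse import urlsplit


def needs_icp_query(method: str, url: str, cached_urls: Set[str],
                    local_hosts: Set[str], stoplist: Iterable[str]) -> bool:
    """Return True only if all four conditions from the slide hold."""
    if url in cached_urls:                       # must not be a cache hit
        return False
    if urlsplit(url).hostname in local_hosts:    # must not be a local server
        return False
    if method.upper() != "GET":                  # must be a GET request
        return False
    if any(word in url for word in stoplist):    # must not match the stoplist
        return False
    return True
```

With a stoplist such as `("cgi-bin", "?")`, dynamic URLs skip the neighbors entirely, which matches the intent of not querying peers for objects that are unlikely to be cached.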

14 Timeouts
ICP uses UDP as its underlying transport
– ICP queries and replies may sometimes be dropped by the network
The cache installs a timeout event in case not all of the expected replies arrive
– By default, Squid and Harvest use a two-second timeout
– When a peer fails to reply to 20 consecutive ICP queries, it is marked as down
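The dead-peer rule amounts to a per-peer counter of consecutive unanswered queries. The sketch below uses the two figures from the slide (two seconds, 20 queries); the class `PeerState` and its method names are hypothetical, and resetting the counter on any reply is an assumption about how a peer comes back up.

```python
ICP_TIMEOUT_SECONDS = 2.0   # default wait for replies, per the slide
DEAD_AFTER_MISSES = 20      # consecutive unanswered queries before "down"


class PeerState:
    """Tracks consecutive unanswered ICP queries for one neighbor."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.missed = 0

    def record_timeout(self) -> None:
        """Call when a query to this peer got no reply within the timeout."""
        self.missed += 1

    def record_reply(self) -> None:
        """Call when any reply arrives; assumed to revive the peer."""
        self.missed = 0

    @property
    def alive(self) -> bool:
        return self.missed < DEAD_AFTER_MISSES
```

A cache would skip peers whose `alive` flag is false when deciding where to send queries.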

15 Multicast for Efficient Distribution
A cache may deliver ICP queries to a multicast address
Neighbor caches may join the multicast group to receive such queries
– With multicast, however, there is no way to know exactly how many replies to expect
ICP replies are sent to a unicast address

16 Differences Between ICP and HTTP
HTTP supports a rich and sophisticated set of features
ICP was designed to be simple, small, and efficient
HTTP request and reply headers consist of lines of ASCII text
ICP uses a fixed-size header and represents numbers in binary
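To make the "fixed-size binary header" concrete, here is a sketch that packs and unpacks an ICP query with Python's `struct` module. The field order (opcode, version, message length, request number, options, option data, sender host address) and the query payload (requester address followed by a null-terminated URL) follow RFC 2186; the function names and the zeroed address fields are assumptions made for the example.

```python
import socket
import struct

# 20-byte header in network byte order: opcode, version, message length,
# request number, options, option data, sender host address.
ICP_HEADER = struct.Struct("!BBHIII4s")
ICP_OP_QUERY = 1
ICP_VERSION = 2


def build_query(request_number: int, url: str,
                requester_ip: str = "0.0.0.0") -> bytes:
    """Build an ICP_OP_QUERY message for the given URL."""
    payload = socket.inet_aton(requester_ip) + url.encode("ascii") + b"\x00"
    length = ICP_HEADER.size + len(payload)
    header = ICP_HEADER.pack(ICP_OP_QUERY, ICP_VERSION, length,
                             request_number, 0, 0,
                             socket.inet_aton("0.0.0.0"))
    return header + payload


def parse_header(packet: bytes) -> dict:
    """Decode the fixed header of an ICP message into a dictionary."""
    opcode, version, length, reqnum, options, option_data, sender = \
        ICP_HEADER.unpack_from(packet)
    return {"opcode": opcode, "version": version, "length": length,
            "request_number": reqnum, "options": options,
            "option_data": option_data,
            "sender": socket.inet_ntoa(sender)}
```

Every field sits at a fixed offset, so a receiver can read the opcode and request number without scanning text lines the way an HTTP parser must.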

17 CARP - Cache Array Routing Protocol
Microsoft® Proxy Server 2.0 uses the Cache Array Routing Protocol (CARP)
A series of algorithms applied on top of HTTP
Multiple proxy servers are arrayed as a single logical cache
Does not require a new wire protocol
– Uses HTTP, so it is compatible with existing firewalls and proxy servers

18 Hash-based Routing
Provides a deterministic "request resolution path" through an array of proxies
The request resolution path
– Is derived by hashing proxy array member identities and URLs
– For any given URL request, the proxy server knows exactly where in the proxy array the information is stored, or will be stored once it is cached

19 Benefits
Deterministic request resolution path
– No query messaging between proxy servers, unlike ICP
– Eliminates the duplication of content that otherwise occurs across an array of proxy servers
– Scales positively: becomes faster and more efficient as more proxy servers are added

20 How CARP works
A hash value is computed for the name of each proxy server
A hash value is computed for each requested URL
The hash value of the URL is combined with the hash value of each proxy
– Whichever URL + proxy server combination yields the highest value becomes the "owner" of the cached information
– If a server fails, its URLs are automatically rerouted to the server with the next-highest score
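The highest-score selection can be sketched in a few lines. Note the swap: this example does not use CARP's own hash function. It stands in a 32-bit value taken from SHA-256 and combines the two hashes with XOR followed by a second hashing pass; the names `hash32`, `score`, `ranked_members`, and `owner` are hypothetical.

```python
import hashlib
from typing import List


def hash32(text: str) -> int:
    """Stand-in 32-bit hash (NOT the CARP hash): first 4 bytes of SHA-256."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:4], "big")


def score(url: str, proxy: str) -> int:
    """Combine the URL hash with one proxy's hash into a 32-bit score."""
    combined = hash32(url) ^ hash32(proxy)
    return hash32(format(combined, "08x"))  # re-mix so scores spread evenly


def ranked_members(url: str, proxies: List[str]) -> List[str]:
    """Array members ordered from highest to lowest score for this URL."""
    return sorted(proxies, key=lambda p: (score(url, p), p), reverse=True)


def owner(url: str, proxies: List[str]) -> str:
    """The member with the highest score owns the URL."""
    return ranked_members(url, proxies)[0]
```

Because each member's score depends only on the URL and that member's name, removing a failed server leaves every other score unchanged: its URLs fall to the next-highest member and all other URLs keep their owner.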

21 How CARP works (cont.)
The result
– A deterministic location for all cached information
– A Web browser or downstream proxy server can know exactly where in the array a requested URL is already stored, or where it will be stored after caching
– Because the range of hash values is so large (2^32 = 4294967296), the result is a statistically even distribution of load across the array

22 Updating Membership List
The array manager maintains a current list of the members of a particular proxy array
All proxy servers in the array store their own local copies of the array list and periodically send requests for updates to the array manager
– They also watch all HTTP requests to any array member; if a request fails, they mark that member as down until the next update from the array manager

