Cache Digest Alex Rousskov Duane Wessels National Laboratory for Applied Network Research April 17, 1998 元智大學 資訊工程研究所 系統實驗室 陳桂慧 February 9, 1999.



Problem of ICP
The problem: determine which member of a cache group (if any) a cache miss should be forwarded to.
– The Internet Cache Protocol (ICP) runs over UDP; each message is a 20-byte fixed-format header plus a URL.
– A cache sends an ICP_QUERY message to one or more neighbor caches.
– Each neighbor replies with ICP_HIT or ICP_MISS.
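
The message layout described above can be sketched in a few lines. This is only an illustration: the field order, opcode values and version number follow RFC 2186 (ICP version 2), which the slides do not quote, and the function names `build_icp_query` and `parse_icp_opcode` are hypothetical. The sender and requester address fields are simply zeroed here.

```python
import struct

# Opcode values and version as given in RFC 2186 (ICP version 2).
ICP_OP_QUERY, ICP_OP_HIT, ICP_OP_MISS = 1, 2, 3
ICP_VERSION = 2

# opcode, version, message length, request number, options,
# option data, sender host address: 20 bytes in network byte order.
HEADER = struct.Struct("!BBHIII4s")


def build_icp_query(url: str, request_number: int) -> bytes:
    """Pack an ICP_QUERY: header, 4-byte requester address, NUL-terminated URL."""
    payload = b"\x00" * 4 + url.encode("ascii") + b"\x00"
    header = HEADER.pack(
        ICP_OP_QUERY,
        ICP_VERSION,
        HEADER.size + len(payload),  # total message length
        request_number,
        0,                           # options (unused in this sketch)
        0,                           # option data (unused in this sketch)
        b"\x00" * 4,                 # sender host address (zeroed)
    )
    return header + payload


def parse_icp_opcode(message: bytes) -> int:
    """Return the opcode stored in the first byte of the header."""
    return HEADER.unpack(message[:HEADER.size])[0]
```

A neighbor's reply would carry the same request number, so the querying cache can match ICP_HIT or ICP_MISS replies to the outstanding query.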

ICP
– Disadvantages
  – additional delay introduced by the query/reply (RTT)
  – effectiveness drops off noticeably when sending more than four queries per cache miss
– Advantages
  – replies inherently indicate current network conditions
  – provides a logical tie-breaking mechanism if all neighbors return ICP_MISS

Cache Digests
– Cache Digest: an alternative method of object location that eliminates the query/reply step and its associated delays.
– Bloom Filter: an array of bits, some of which are on and some of which are off.
  – To add an entry to the Bloom Filter, compute hash function values for its key; these values specify which bits of the filter should be turned on.
  – To check whether a specific entry is in the filter, calculate the same hash function values for its key and examine the corresponding bits.
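
The add and check operations just described can be sketched as a small class. The slides do not say which hash functions are used, so salting MD5 with the hash index is purely an assumption made for illustration, as are the class and method names.

```python
import hashlib


class BloomFilter:
    """A bit array plus a fixed number of hash functions."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Assumption: each hash function is MD5 salted with its index.
        for i in range(self.num_hashes):
            digest = hashlib.md5(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        # Turn on every bit selected by the hash values.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: str) -> bool:
        # Predict a hit only if all selected bits are on.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )
```

After `f = BloomFilter(1024, 4)` and `f.add(url)`, the expression `url in f` is always true; for a key that was never added it is usually, but not always, false.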

Bloom Filter
The size of a Bloom Filter determines the probability that an “all-bits-on” lookup is correct.
– Hit and miss: whether or not the bits of the Bloom Filter predict that a given object is in the filter.
– True and false: the correctness of the prediction.

Bloom Filter
The filter’s prediction:
– true hit: correctly predicts an entry is in the cache.
– false hit: incorrectly predicts the entry is in the cache.
– true miss: correctly predicts the entry is not in the cache.
– false miss: incorrectly predicts the entry is not in the cache.
Goal: maximize the true values.
– If a Bloom Filter is perfectly synchronized with its source, there will be zero false misses.
– In our implementation, cache digests are not always perfectly synchronized, so we expect some number of false misses.
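
The four outcomes combine two independent questions: what the digest predicted, and whether the object was really in the cache. A minimal sketch (the function name `classify` is hypothetical):

```python
def classify(digest_predicts_hit: bool, actually_cached: bool) -> str:
    """Name the outcome of one digest lookup."""
    # "hit" or "miss" is what the digest bits predicted.
    outcome = "hit" if digest_predicts_hit else "miss"
    # "true" or "false" is whether that prediction matched the cache.
    correct = digest_predicts_hit == actually_cached
    return f"{'true' if correct else 'false'} {outcome}"
```

For instance, `classify(True, False)` names the case where the digest claims an object the peer no longer holds, which is a false hit.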

Building the Bloom Filter
– What are the tradeoffs between cache digest size and effectiveness? When we use more bits per object, the percentage of false hits should decrease.
– A related question is the optimal density of the Bloom Filter. As objects are added to the filter, its density will increase.
– Note that the peer that checks our cache digest will always use the same (maximum) number of hash functions.
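
The size tradeoff can be explored numerically with the standard textbook approximation for Bloom filters, which is not taken from the slides: with m/n bits per object and k hash functions, the expected fraction of bits turned on is 1 - e^(-k/(m/n)), and the false hit probability is that density raised to the power k. The function names below are hypothetical.

```python
import math


def expected_density(bits_per_object: float, num_hashes: float) -> float:
    """Approximate fraction of bits turned on after all objects are added."""
    return 1.0 - math.exp(-num_hashes / bits_per_object)


def false_hit_probability(bits_per_object: float, num_hashes: int) -> float:
    """Approximate chance that a key not in the filter finds all its bits on."""
    return expected_density(bits_per_object, num_hashes) ** num_hashes


if __name__ == "__main__":
    for bits in (4, 6, 8, 10, 12):
        print(bits, round(false_hit_probability(bits, 4), 4))
```

Under this approximation the false hit rate falls as bits per object grow, and for a given size it is lowest when the number of hash functions is chosen so that about half of the bits are on.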

Local Storage

Local Updates and Deletes
There are two kinds of updates to a local digest.
– New entries need to be added as they enter the cache.
– If an object gets purged from the cache, it is desirable to undo its effect on the local digest.
It is not possible to delete entries from a standard Bloom filter.
– Any bit in the array that is on may have been set by any number of entries.
– One way to support deletes is to use integer counters instead of single bits. This increases the digest size and requires checking for overflows.
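
The counter-based alternative mentioned above might look like the following sketch. The counter limit of 15 (a 4-bit counter), the salted-MD5 hashing and all names are assumptions made for illustration; the slides only state that counters replace bits and that overflow must be checked.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter variant with small integer counters, so entries can be removed."""

    def __init__(self, num_slots: int, num_hashes: int, max_count: int = 15) -> None:
        self.num_slots = num_slots
        self.num_hashes = num_hashes
        self.max_count = max_count  # 15 models a 4-bit counter (an assumption)
        self.counters = [0] * num_slots

    def _positions(self, key: str) -> set:
        # A set, so a key whose hash values collide touches each slot only once.
        return {
            int.from_bytes(
                hashlib.md5(f"{i}:{key}".encode("utf-8")).digest()[:8], "big"
            ) % self.num_slots
            for i in range(self.num_hashes)
        }

    def add(self, key: str) -> None:
        positions = self._positions(key)
        # Check for overflow before changing anything, so a failed add leaves no trace.
        if any(self.counters[p] >= self.max_count for p in positions):
            raise OverflowError("counter would overflow")
        for p in positions:
            self.counters[p] += 1

    def remove(self, key: str) -> None:
        positions = self._positions(key)
        if any(self.counters[p] == 0 for p in positions):
            raise KeyError(key)
        for p in positions:
            self.counters[p] -= 1

    def __contains__(self, key: str) -> bool:
        return all(self.counters[p] > 0 for p in self._positions(key))
```

The digest sent to peers can still be a plain bit array (a bit is on wherever the counter is non-zero), so the extra storage is only a local cost.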

Digest Dissemination and Update Propagation
Which transport protocol is best for exchanging digests?
– ICP messages (via UDP): we must be able to tolerate gaps from lost packets, or implement a message retransmission scheme.
– TCP: provides reliability; we can either develop a customized digest exchange protocol or exchange digests via HTTP.

Digest Dissemination and Update Propagation
In exchanging cache digests, should we employ a push or pull technique?
– The push model puts the server in control of distributing updates to its clients.
  – The server knows exactly how rapidly its cache contents change.
  – This also requires that the server know which of its clients are digest-aware caches, and which are browsers or caches that do not support digests.

Remote Storage
A proxy must keep peer digests in memory because they are consulted on most misses.
– When a proxy gets restarted, disk-resident digests can be reused if they are fresh enough.
– Note that expiration and other useful information is stored on disk along with the digest itself.
Use of disk-resident digests:
– Sharing those digests with other proxies.
– It makes sense to fetch a cached digest from a neighbor rather than from its original producer, if the neighbor is closer.
– Treat them as any other cached object.

Handling False Hits
For sibling relationships:
– We cannot tolerate a significant (or perhaps any) number of false hits.
– We will require some feature in HTTP to detect false hits and deal with them appropriately.

Proposed Approach- Building a Digest
– Cache digests are built using a fixed set of hash functions; the hash values, modulo the digest size, determine which bits will be turned on.
– Each cache determines the size of its digest independently from its peers.
Note that the proposed scheme allows for a very efficient lookup implementation on foreign digests while allowing great flexibility in building a digest.
– To reduce the number of false hits, a proxy may decide to apply fewer hash functions for old or stale entries.
– This will affect the way a cache digest is computed.
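
Because the hash functions are fixed and only the modulus varies, a peer can compute the hash values for a key once and reuse them against digests of different sizes. The sketch below assumes four hash values obtained by splitting one MD5 digest into 32-bit words; that choice, the constant and the function names are illustrative and not confirmed by the slides.

```python
import hashlib

NUM_HASHES = 4  # assumption: one MD5 digest split into four 32-bit values


def hash_values(key: str) -> list:
    """Size-independent hash values for a key (computed once per lookup)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return [int.from_bytes(digest[i:i + 4], "big") for i in range(0, 16, 4)]


def bit_positions(key: str, digest_size_bits: int) -> list:
    """Map the fixed hash values onto one particular digest by taking the modulus."""
    return [h % digest_size_bits for h in hash_values(key)]
```

A cache checking the same URL against two neighbors would call `hash_values` once and apply each neighbor's own digest size as the modulus.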

Proposed Approach- Storing a Digest
The local digest can be recomputed at configurable (but fixed) time intervals.
– Monitor the density of the current digest.
– Trigger a rebuild when the number of bits turned on exceeds some threshold.
The presence and quality of local update algorithms is transparent to other proxies.
– A proxy attaches an expiration date to its cache digest so that peers will know when the digest may become out-of-date.
– A precise expiry time is not required, and peers are not required to discard expired digests.
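
The density check that triggers a rebuild can be sketched as follows. The 0.5 default threshold and the function names are assumptions; the slides only say "some threshold".

```python
def density(bits: bytearray, num_bits: int) -> float:
    """Fraction of the digest's bits that are currently turned on."""
    bits_on = sum(bin(byte).count("1") for byte in bits)
    return bits_on / num_bits


def needs_rebuild(bits: bytearray, num_bits: int, threshold: float = 0.5) -> bool:
    """True once the digest is denser than the configured threshold."""
    return density(bits, num_bits) > threshold
```

A proxy could run this check after each batch of additions and fall back to the fixed rebuild interval otherwise.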

Proposed Approach- Disseminating Digest
Use a pull technique for disseminating cache digests.
– Pull fits well with the current distribution and access control schemes for Web objects.
– Push requires a parent proxy to maintain state data for all of its children.
To preserve bandwidth and proxy resources:
– Peers will use an “If-Modified-Since” request or equivalent technique when refreshing neighbor digests.
– It is desirable to fetch the digest of a remote parent via a local neighbor.
– Digests are treated as any other Web object and can be disseminated using existing cache hierarchies.
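
A conditional refresh of a neighbor's digest could be sketched like this. Only the use of If-Modified-Since comes from the slides; the function names are hypothetical, and the handling of status codes is the ordinary HTTP behavior for conditional requests (304 means the stored copy is still current).

```python
from email.utils import formatdate


def digest_refresh_headers(last_fetch_epoch: float) -> dict:
    """Request headers asking for the digest only if it changed since our copy."""
    return {"If-Modified-Since": formatdate(last_fetch_epoch, usegmt=True)}


def apply_refresh_reply(status: int, body: bytes, cached_digest: bytes) -> bytes:
    """Pick the digest to keep after a conditional request."""
    if status == 304:   # Not Modified: the stored digest is still current
        return cached_digest
    if status == 200:   # a newer digest was sent in the reply body
        return body
    raise ValueError(f"unexpected status {status}")
```

When the digest has not changed, the reply carries no body, so the refresh costs little bandwidth.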

Proposed Approach- Digest Update
The difference between a peer's copy of a digest and the neighbor's current cache contents may be acceptable if the following two conditions hold until the next digest transfer:
– (1) the neighbor does not add a lot of new, popular objects to its cache, and
– (2) only unpopular documents are purged from the neighbor's cache.
If both conditions hold, the inconsistency embedded within an unsynchronized digest is harmless.
If at least one of the conditions is false, it may be desirable for a proxy to notify its digest users about recent changes in its cache contents.
– Frequent updates reduce false hits and false misses, but require significant amounts of bandwidth.

Proposed Approach- Digest Update
Piggyback update messages in HTTP replies by using two custom HTTP headers:
– X-Accept-Digest-Update: is used in the requests to notify the recipient that the originator can accept digest updates in the reply for that particular message.
– X-Accept-Update: is used in the reply to send current updates.
Major implications:
1. HTTP takes care to handle exceptional conditions.
2. No extra messages are generated between cooperating proxies.
3. Parents are not required to maintain state information about child caches.
4. There are practical limits on the amount of information we can or should place into HTTP headers.

Proposed Approach- Digest Accuracy
When a proxy requests an object based on a cache digest hit prediction, and the request is sent to a sibling (which cannot forward misses for us):
1. We add a standard HTTP cache control directive called only-if-cached.
2. If the request turns out to be a miss, according to HTTP/1.1[5], the sibling will send a 504 (Gateway Timeout) reply.
3. Upon receipt of a 504 reply, the originating proxy will forward the request either to a parent or directly to the origin server (this time the only-if-cached directive is not added).
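
The three steps above amount to a small retry rule. In the sketch below the two callables stand in for whatever HTTP client the proxy uses; they, and the function name, are hypothetical. Each takes a URL and a dictionary of request headers and returns a `(status, body)` pair.

```python
def fetch_on_digest_hit(url, fetch_from_sibling, fetch_from_fallback):
    """Try the sibling with only-if-cached; on 504, go to a parent or the origin."""
    # Step 1: the directive forbids the sibling from forwarding a miss for us.
    status, body = fetch_from_sibling(url, {"Cache-Control": "only-if-cached"})
    # Step 2: a 504 reply means the digest prediction was a false hit.
    if status == 504:
        # Step 3: retry elsewhere, this time without only-if-cached.
        return fetch_from_fallback(url, {})
    return status, body
```

With this rule a false hit costs one extra round trip to the sibling but never makes the sibling fetch an object on the requester's behalf.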

Implementation and Methodology
1. Each cache builds a digest of its own contents.
– The digest is built as the swap.state file is read.
2. The digest is updated as new objects enter the cache.
– Because deletes are not supported, the entire cache digest is rebuilt periodically (every hour).
We choose to transfer cache digests as HTTP messages.
– Individual UDP messages are limited to a system's socket buffer size (typically ranging from KB).
– There are numerous commercial cache products available which do not support ICP.
– The peer cache's digests are requested on demand; cache digests are served as standard HTTP replies.

Implementation and Methodology(2)
Handling false hits:
– For parent relationships, this is not a problem because a proxy is allowed to cause a cache miss at its parent.
– For sibling relationships, false hits need to be dealt with properly.
– Use the only-if-cached directive.
Our problem is that Squid is not fully programmed to properly handle a 504 reply and re-forward the request to a parent cache or directly to the origin server.

Implementation and Methodology(4)