Squirrel: A peer-to-peer web cache Sitaram Iyer Joint work with Ant Rowstron (MSRC) and Peter Druschel
Peer-to-peer Computing: Decentralize a distributed protocol: – Scalable – Self-organizing – Fault tolerant – Load balanced. Not automatic!
Web Caching: Reduces 1. latency, 2. external bandwidth, 3. server load. Deployed at ISPs, corporate network boundaries, etc. Cooperative Web Caching: a group of web caches tied together and acting as one web cache.
Web Cache [Diagram: browser caches on the LAN share a centralized web cache, which sits between the LAN and the web server on the Internet.]
Decentralized Web Cache [Diagram: browser caches on the LAN and the web server on the Internet, with no centralized web cache.] Why? How?
Why peer-to-peer? 1. Cost of dedicated web cache: no additional hardware. 2. Administrative costs: self-organizing. 3. Scaling needs upgrading: resources grow with clients. 4. Single point of failure: fault-tolerant by design.
Setting: Corporate LAN; 100 - 100,000 desktop machines; single physical location. Each node runs an instance of Squirrel and sets it as the browser’s proxy.
Pastry: Peer-to-peer object location and routing substrate. Distributed Hash Table: reliably maps an object key to a live node. Routes in $\log_{2^b}(N)$ steps (e.g. 3-4 steps for 100,000 nodes, with $2^b = 16$).
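The hop count quoted on this slide can be checked with a few lines of arithmetic. This is only a sketch of the formula log base 2^b of N; the function name `expected_hops` is hypothetical, and the default b = 4 (routing base 2^b = 16) is an assumption chosen to match the 100,000-node example.

```python
import math


def expected_hops(num_nodes: int, b: int = 4) -> float:
    """Approximate routing steps: log base 2**b of the node count.

    b = 4 (base 16) is an assumed default, not a value read from Pastry itself.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be positive")
    return math.log(num_nodes, 2 ** b)


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(f"{n:>7} nodes: about {expected_hops(n):.2f} hops")
```

For 100,000 nodes this gives a little over 4, in line with the rough "3-4 steps" figure above.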
Home-store model [Diagram: on the LAN, the client's request is routed by URL hash to the home node; the Internet lies beyond the LAN.]
Home-store model client home … that’s how it works!
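A minimal sketch of the home-store idea may make the flow concrete. Everything here is an illustrative assumption rather than Squirrel's actual code: the class `HomeStoreNetwork`, the use of SHA-1 truncated to 128 bits as the URL hash, and the choice of the numerically closest node ID as the home node. Freshness checks and cache eviction are left out.

```python
import hashlib

ID_BITS = 128           # assumed identifier width
RING = 1 << ID_BITS


def key_for(text: str) -> int:
    """Hash a string (URL or node name) to an ID_BITS-wide integer key."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest[: ID_BITS // 8], "big")


def home_node(url: str, node_ids):
    """Pick the node whose ID is closest to the URL's key on a circular ID space."""
    key = key_for(url)
    return min(node_ids, key=lambda n: min(abs(n - key), RING - abs(n - key)))


class HomeStoreNetwork:
    """Toy model: the home node stores the object and serves other clients."""

    def __init__(self, node_ids, fetch_origin):
        self.node_ids = list(node_ids)
        self.fetch_origin = fetch_origin              # callable: url -> object
        self.cache = {n: {} for n in self.node_ids}   # one cache per node

    def get(self, client, url):
        if url in self.cache[client]:
            return self.cache[client][url], "local"
        home = home_node(url, self.node_ids)
        if url in self.cache[home]:
            source = "home"
        else:
            # Home miss: the home node fetches from the origin and keeps a copy.
            self.cache[home][url] = self.fetch_origin(url)
            source = "origin"
        obj = self.cache[home][url]
        self.cache[client][url] = obj                 # client caches it locally too
        return obj, source
```

In this sketch the first request for a URL goes to the origin via the home node, a later request from another client is served from the home node's copy, and a repeat request from the same client is a local hit.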
Directory model Client nodes always store objects in local caches. Main difference between the two schemes: whether the home node also stores the object. In the directory model, it only stores pointers to recent clients, and forwards requests to them.
Directory model [Diagram: the client's request reaches the home node, which forwards it to a delegate chosen as a random entry from its directory.]
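The directory model can be sketched in the same toy style. Here the home node keeps only a short list of pointers to recent clients and forwards a request to a random entry. The names `DirectoryNetwork` and `MAX_POINTERS`, the limit of 4 pointers, and the SHA-1 based home selection are assumptions made for illustration; eviction and conditional requests are not modelled.

```python
import hashlib
import random
from collections import defaultdict, deque

MAX_POINTERS = 4        # assumed directory size per URL
RING = 1 << 128         # assumed 128-bit circular ID space


def home_node(url: str, node_ids):
    """Pick the node whose ID is closest to the truncated SHA-1 of the URL."""
    key = int.from_bytes(hashlib.sha1(url.encode("utf-8")).digest()[:16], "big")
    return min(node_ids, key=lambda n: min(abs(n - key), RING - abs(n - key)))


class DirectoryNetwork:
    """Toy model: objects live only in client caches; homes keep pointers."""

    def __init__(self, node_ids, fetch_origin, rng=None):
        self.node_ids = list(node_ids)
        self.fetch_origin = fetch_origin              # callable: url -> object
        self.rng = rng or random.Random(0)
        self.cache = {n: {} for n in self.node_ids}
        # directory[home][url] -> bounded queue of recent clients (delegates)
        self.directory = {
            n: defaultdict(lambda: deque(maxlen=MAX_POINTERS))
            for n in self.node_ids
        }

    def get(self, client, url):
        if url in self.cache[client]:
            return self.cache[client][url], "local"
        home = home_node(url, self.node_ids)
        pointers = self.directory[home][url]
        if pointers:
            # Forward to a random entry in the directory.
            delegate = self.rng.choice(list(pointers))
            obj, source = self.cache[delegate][url], "delegate"
        else:
            # No directory yet: the client goes to the origin itself.
            obj, source = self.fetch_origin(url), "origin"
        self.cache[client][url] = obj
        pointers.append(client)                       # home remembers this client
        return obj, source
```

Unlike the home-store sketch, the home node here never holds the object unless it also happened to request it as a client.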
(skip) Full directory protocol [Diagram: client, home (dir), delegate, and origin server, with numbered message steps. Labels: "a, d: req"; "a: no dir, go to origin. Also d"; "b: not-modified"; "c,e: req"; "c,e: object"; "e: cGET req"; "other req"; "not-modified"; "object or delegate".]
Recap: Two endpoints of the design space, based on the choice of storage location. At first sight, both seem to do about as well (e.g. hit ratio, latency).
Quirk: Consider a – web page with many images, or – heavily browsing node. In the Directory scheme, many home nodes point to one delegate. Home-store: natural load balancing. Next: evaluation on trace-based workloads.
Trace characteristics (Redmond / Cambridge): Total duration: 1 day / 31 days. Number of clients: 36,782 / 105. Number of HTTP requests: 16.41 million / 0.971 million. Peak request rate: 606 req/sec / 186 req/sec. Number of objects: 5.13 million / 0.469 million. Number of cacheable objects: 2.56 million / 0.226 million. Mean cacheable object reuse: 5.4 times / 3.22 times.
Total external bandwidth, Redmond [Figure: total external bandwidth in GB (lower is better) vs. per-node cache size in MB (0.001 to 100); curves: Directory, Home-store, No web cache, Centralized cache.]
Total external bandwidth, Cambridge [Figure: total external bandwidth in GB (lower is better) vs. per-node cache size in MB (0.001 to 100); curves: Directory, Home-store, No web cache, Centralized cache.]
LAN Hops, Cambridge [Figure: fraction of cacheable requests (0% to 100%) vs. total hops within the LAN (0 to 5); series: Centralized, Home-store, Directory.]
Load in requests per sec, Redmond [Figure: number of such seconds vs. max objects served per-node / second (0 to 50); series: Home-store, Directory.]
Load in requests per sec, Cambridge [Figure: number of such seconds vs. max objects served per-node / second (0 to 50); series: Home-store, Directory.]
Load in requests per min, Redmond [Figure: number of such minutes vs. max objects served per-node / minute (0 to 350); series: Home-store, Directory.]
Load in requests per min, Cambridge [Figure: number of such minutes vs. max objects served per-node / minute (0 to 120); series: Home-store, Directory.]
Conclusion: It is possible to decentralize web caching. Performance is comparable to a centralized cache. The decentralized cache is better in terms of cost, administration, scalability and fault tolerance.
(backup) Storage utilization, Redmond (Home-store / Directory): Total: 97641 MB / 61652 MB. Mean per-node: 2.6 MB / 1.6 MB. Max per-node: 1664 MB.
(backup) Fault tolerance. Equations: Home-store: mean $H/O$, max $H_{\max}/O$. Directory: mean $(H+S)/O$, max $\max(H_{\max}, S_{\max})/O$. Redmond: Home-store mean 0.0027%, max 0.0048%; Directory mean 0.198%, max 1.5%. Cambridge: Home-store mean 0.95%, max 3.34%; Directory mean 1.68%, max 12.4%.
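As a rough sanity check on the home-store means, suppose H is the number of objects homed at a node and O is the total number of objects; these definitions are an assumption, since the slide does not spell them out. Under that reading the per-node values of H sum to O, so the mean of H/O over N nodes is 1/N. The function name `mean_home_loss_percent` is hypothetical.

```python
def mean_home_loss_percent(num_nodes: int) -> float:
    """Mean H/O as a percentage, assuming the per-node H values sum to O."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be positive")
    return 100.0 / num_nodes


if __name__ == "__main__":
    # Client counts taken from the trace characteristics slide.
    for name, clients in (("Redmond", 36_782), ("Cambridge", 105)):
        print(f"{name}: {mean_home_loss_percent(clients):.4f}%")
```

With the client counts from the traces (36,782 and 105), this reproduces the 0.0027% and 0.95% home-store means after rounding.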
(backup) Full home-store protocol [Diagram: client, home, and origin server, with numbered message steps; the client and home communicate over the LAN, the home and origin over the WAN. Labels: "req"; "other req"; "a: object or notmod from home"; "b: req"; "b: object or notmod from origin".]
(backup) Full directory protocol [Diagram: client, home (dir), delegate, and origin server, with numbered message steps. Labels: "a, d: req"; "a: no dir, go to origin. Also d"; "b: not-modified"; "c,e: req"; "c,e: object"; "e: cGET req"; "other req"; "not-modified"; "object or delegate".]