CacheCast: Eliminating Redundant Link Traffic for Single Source Multiple Destination Transfers. Piotr Srebrny, Thomas Plagemann, Vera Goebel (Department of Informatics, University of Oslo); Andreas Mauthe (Computing Department, Lancaster University)
Outline: Problem statement; Idea; Design; Related work; CacheCast evaluation; Conclusions
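What is the Problem? The Internet provides only a mechanism for single-source to single-destination datagram transmission (unicast). This is expressed in the IP header, which carries exactly one source address and one destination address. (Figure: IP header with … Source Address, Destination Address …, followed by PAYLOAD.)
What is the Problem? (cont.) The Internet – a network of routers and links. (Figure: source S sends the same payload P to destinations A, B and C as three separate packets AP, BP and CP.) Redundancy!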
What has been done? “Datagram Routing for Internet Multicasting”, L. Aguilar – explicit list of destinations in the IP header ◦ Follow-ups: XCast, Small Group Multicast. “Host Extensions for IP Multicasting”, S. Deering – destination address denotes a group of hosts. “A Case for End System Multicast”, Y.-H. Chu et al. – application-level multicast
CacheCast: CacheCast is a network-layer caching mechanism that eliminates redundant data transmissions. (Figure: source S and destinations A, B, C; packet labels AP, B, C, BP, CP.)
CacheCast Idea
CacheCast Idea (cont.)
Link Cache: Point-to-point logical links. Caching is done per link: ◦ Cache Management Unit (CMU) ◦ Cache Store (CS)
Link Cache Requirements: Simple processing ◦ ~72ns to process a minimum-size packet on a 10Gbps link and ~18ns on a 40Gbps link ◦ Modern DDR r/w cycle ~20ns ◦ Modern SRAM r/w cycle ~5ns. Small cache size ◦ A link queue is scaled to 250ms of the link traffic
Source Requirements: Simple cache processing ◦ A source provides information on payload ID and payload size. Minimise cache size ◦ A source batches requests for the same data and transmits the data to all of them within the minimum amount of time
Cacheable Packets: A CacheCast packet carries metadata describing the packet payload ◦ Payload ID ◦ Payload size ◦ Index. Only packets with the metadata are cached
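CMU & CS: Cache miss. (Figure: CMU table with columns Index and Payload ID; cache store with columns Index and Payload; a packet to A carrying payload P1 arrives, misses, and is assigned index 0.)
CMU & CS (cont.): Cache hit. (Figure: CMU table and cache store holding P1, P2 and P3; a packet to B carrying payload P2 arrives and hits the entry for P2 at index 1.)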
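The hit and miss cases above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' Click implementation: the `Packet`, `CMU` and `CS` classes, the round-robin slot replacement and the linear table lookup are all assumptions made for the example. The CMU sits at the link entry and strips payloads it has already seen; the CS sits at the link exit and restores them from the slot named by the index.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Packet:
    dst: str                      # unicast destination (header part)
    payload_id: str               # CacheCast metadata: payload ID
    payload: Optional[bytes]      # None once the CMU has stripped it
    index: Optional[int] = None   # CacheCast metadata: slot in the cache store


class CMU:
    """Link-entry side: maps payload IDs to cache store slots (hypothetical sketch)."""

    def __init__(self, slots: int) -> None:
        self.table: List[Optional[str]] = [None] * slots  # slot index -> payload ID
        self.next_slot = 0  # round-robin replacement, an assumption of this sketch

    def process(self, pkt: Packet) -> Tuple[Packet, bool]:
        """Return the packet as sent on the link and whether it was a cache hit."""
        if pkt.payload_id in self.table:
            # Cache hit: send only the header plus the slot index.
            idx = self.table.index(pkt.payload_id)
            return Packet(pkt.dst, pkt.payload_id, None, idx), True
        # Cache miss: claim a slot and send the full packet with that index.
        idx = self.next_slot
        self.table[idx] = pkt.payload_id
        self.next_slot = (idx + 1) % len(self.table)
        return Packet(pkt.dst, pkt.payload_id, pkt.payload, idx), False


class CS:
    """Link-exit side: stores payloads by index and restores stripped ones."""

    def __init__(self, slots: int) -> None:
        self.store: List[Optional[bytes]] = [None] * slots

    def process(self, pkt: Packet) -> Packet:
        if pkt.payload is None:
            # Stripped packet: re-attach the payload held in the indexed slot.
            return Packet(pkt.dst, pkt.payload_id, self.store[pkt.index], pkt.index)
        # Full packet: remember the payload for later packets of the train.
        self.store[pkt.index] = pkt.payload
        return pkt


cmu, cs = CMU(slots=4), CS(slots=4)
on_link_a, hit_a = cmu.process(Packet("A", "P1", b"data"))   # miss, full packet
on_link_b, hit_b = cmu.process(Packet("B", "P1", b"data"))   # hit, header only
out_a, out_b = cs.process(on_link_a), cs.process(on_link_b)  # both carry b"data"
```

Only the first packet of the train crosses the link with its payload; the second crosses as a header and is rebuilt at the far end.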
Estimating Cache Size: Concept of a packet train. It is sufficient to hold a payload in the CS for the duration of the packet train. How many packet headers can a source send within this time?
Estimating Cache Size (cont.) Back-of-the-envelope calculations: ~10ms caches are sufficient. Packet headers a source can send, by source uplink speed and packet train time (2ms / 10ms / 50ms): 512Kbps: 2 / 8 / 40; 1Mbps: …; …Mbps: …; …Mbps: …
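The back-of-the-envelope calculation can be sketched as below. The slides do not state how the numbers were derived, so the header size (84 bytes, the s_h used in the evaluation), the binary prefixes (1Kbps = 1024bps) and the rounding up are assumptions; they were chosen because they reproduce the 512Kbps row (2, 8 and 40 headers). The function name `headers_in_train` is hypothetical.

```python
import math

HEADER_BYTES = 84        # assumed header size (s_h from the evaluation slides)
KBPS = 1024              # assumed binary prefixes
MBPS = 1024 * 1024


def headers_in_train(uplink_bps: int, train_ms: int,
                     header_bytes: int = HEADER_BYTES) -> int:
    """Number of packet headers a source can emit within one packet train time."""
    bits_per_header = 8 * header_bytes
    # uplink_bps * train_ms / 1000 is the number of bits sent during the train.
    return math.ceil(uplink_bps * train_ms / (1000 * bits_per_header))


for train_ms in (2, 10, 50):
    print(f"512Kbps, {train_ms}ms: {headers_in_train(512 * KBPS, train_ms)} headers")
```

Under the same assumptions, other uplink speeds can be evaluated by passing, for example, `1 * MBPS` as the first argument.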
Implication of the Small Size: A 10ms cache on a 10Gbps link: ◦ ~12.8MB for the CS storage space ◦ ~13000 entries in the CMU table. What about a 100Mbps LAN? ◦ ~130KB for the CS ◦ ~130 entries in the CMU table. We can afford that!
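These sizes follow from multiplying the link rate by the cache holding time. The sketch below assumes binary prefixes and a mean packet size of 1000 bytes per CMU entry; neither is stated on the slide, and both were picked because they land close to the quoted figures. The function names are hypothetical.

```python
MBPS = 1024 * 1024       # assumed binary prefixes
GBPS = 1024 * MBPS


def cache_store_bytes(link_bps: int, cache_ms: int) -> float:
    """Bytes that cross the link during the cache holding time."""
    return link_bps * cache_ms / (1000 * 8)


def cmu_entries(link_bps: int, cache_ms: int, mean_packet_bytes: int = 1000) -> int:
    """CMU table entries needed if each entry covers one packet of the assumed mean size."""
    return int(cache_store_bytes(link_bps, cache_ms) / mean_packet_bytes)


for name, rate in (("10Gbps", 10 * GBPS), ("100Mbps", 100 * MBPS)):
    size_kib = cache_store_bytes(rate, 10) / 1024
    print(f"{name}: {size_kib:.0f} KiB cache store, {cmu_entries(rate, 10)} CMU entries")
```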
Related Work: “Packet Caches on Routers: The Implications of Universal Redundant Traffic Elimination”, Ashok Anand et al. ◦ Per link cache ◦ Universal redundancy elimination ◦ No server support
Evaluation: Bandwidth consumption ◦ CacheCast vs. IP Multicast: unique packet headers, finite cache sizes. Incremental deployment ◦ Benefits of partial cache deployment. Congestion control ◦ CacheCast impact on TFRC throughput
Bandwidth Consumption: Multicast efficiency metric, defined in terms of: L_m – total number of multicast links; L_u – total number of unicast links. (Example figure: source S and destinations A, B, C.)
Bandwidth Consumption (cont.) A CacheCast packet has a unicast header part (h) and a multicast payload part (p), so only the payload part benefits from caching. E.g., with packets where s_p = 1416B and s_h = 84B, efficiency is reduced by about 5% compared with IP Multicast
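The formulas on these two slides did not survive in the text. One reconstruction that is consistent with the definitions of L_m, L_u, s_h and s_p is given below. It assumes the metric is the commonly used multicast efficiency metric and that CacheCast with ideal caches sends every header over all L_u unicast links but every payload over only L_m links; the symbols δ_m and δ_CC are names chosen here, not taken from the slides.

```latex
\begin{align}
  \delta_m &= 1 - \frac{L_m}{L_u} \\
  \delta_{CC} &= 1 - \frac{s_h L_u + s_p L_m}{(s_h + s_p)\, L_u}
               = \frac{s_p}{s_h + s_p}\left(1 - \frac{L_m}{L_u}\right)
               = \frac{s_p}{s_h + s_p}\,\delta_m \\
  \frac{s_p}{s_h + s_p} &= \frac{1416}{1416 + 84} = 0.944
\end{align}
```

Under these assumptions CacheCast reaches about 94% of the IP Multicast efficiency for 1500-byte packets, which matches the reduction of roughly 5% quoted on the slide.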
Finite Cache Size: The more destinations, the higher the efficiency. E.g. ◦ 512Kbps – 8 headers in 10ms, but 12 destinations. Slow sources transmitting to many destinations cannot achieve the maximum efficiency. (Figure: packet train A P B C D E F G H, then I P J K L; the payload P is sent again after the first 8 headers.)
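The effect of a finite cache on the source uplink can be sketched as follows. The model is an assumption made for illustration: the payload has to be resent once for every window of headers that fits in the cache holding time, and packets use the s_h = 84B and s_p = 1416B sizes from the bandwidth consumption slide. The function names are hypothetical.

```python
import math


def payload_copies(destinations: int, headers_per_window: int) -> int:
    """Times the payload must be sent when the cache only covers one window of headers."""
    return math.ceil(destinations / headers_per_window)


def uplink_bytes(destinations: int, headers_per_window: int,
                 s_h: int = 84, s_p: int = 1416) -> int:
    """Bytes on the source uplink: one header per destination plus the payload copies."""
    return destinations * s_h + payload_copies(destinations, headers_per_window) * s_p


def saving_vs_unicast(destinations: int, headers_per_window: int,
                      s_h: int = 84, s_p: int = 1416) -> float:
    """Fraction of uplink bytes saved compared with plain unicast."""
    unicast = destinations * (s_h + s_p)
    return 1 - uplink_bytes(destinations, headers_per_window, s_h, s_p) / unicast


# 512Kbps source: 8 headers fit in 10ms, but there are 12 destinations.
copies = payload_copies(12, 8)
limited = saving_vs_unicast(12, 8)
ideal = saving_vs_unicast(12, 12)   # cache long enough to cover the whole train
```

With 12 destinations and a window of 8 headers the payload is sent twice, so the saving is lower than with a cache that covers the whole train.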
Finite Cache Size (cont.) Uplink and PH: 512Kbps: 8; 1Mbps: 16; 10Mbps: …; …Mbps: 1561. Sources with different uplink speeds transmitting to a growing number of destinations
Incremental Deployment: The CS and CMU are deployed incrementally
Incremental Deployment
Bottleneck Link Test: ns2 implementation. 100 TCP flows compete with 100 TFRC flows on a bottleneck link
Bottleneck Link Test (cont.) Both TCP and TFRC benefit from CacheCast
CacheCast Implementation: Router part ◦ Click Modular Router: CMU and CS elements, ~400 lines of code in total. Server part ◦ Linux kernel – system call msend(fd_set *write_to, fd_set *written, char *buf, int len) ◦ Paraslash tools – a streaming server that uses the msend system call, and a receiver
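The shape of the msend call (one buffer, a set of descriptors to write to, and a set reporting which writes succeeded) can be illustrated with a user-space Python stand-in. This is not the kernel system call and does none of the CacheCast packet batching; it simply loops over ordinary sockets, and the function and variable names are hypothetical.

```python
import socket
from typing import Iterable, Set


def msend(write_to: Iterable[socket.socket], buf: bytes) -> Set[socket.socket]:
    """Send the same buffer to every socket; return the sockets that accepted it."""
    written: Set[socket.socket] = set()
    for sock in write_to:
        try:
            sock.sendall(buf)
        except OSError:
            # Skip destinations that fail; the caller sees them missing from the result.
            continue
        written.add(sock)
    return written


# Two local socket pairs stand in for connections to receivers A and B.
a_tx, a_rx = socket.socketpair()
b_tx, b_rx = socket.socketpair()
done = msend([a_tx, b_tx], b"chunk")
got_a, got_b = a_rx.recv(16), b_rx.recv(16)
for s in (a_tx, a_rx, b_tx, b_rx):
    s.close()
```

A streaming server would call such a function once per data chunk with the sockets of all current listeners, which is what lets the sender emit the packets for one payload back to back.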
Testbed Testbed setup: ◦ Paraslash server (S) ◦ Click Modular Router (R) ◦ Paraslash receivers (A,B)
Testbed Results: Bandwidth is consumed by packet header transmission; msend overhead is negligible
Conclusions: CacheCast is: ◦ A valid solution for single source multiple destination transfers ◦ Simple and reliable ◦ Fully distributed ◦ Incrementally deployable
Thank You for Your Attention!