CoBlitz: A Scalable Large-file Transfer Service (COS 461) KyoungSoo Park Princeton University.

1 CoBlitz: A Scalable Large-file Transfer Service (COS 461) KyoungSoo Park Princeton University

2 Large-file Distribution Increasing demand for large files: movie or software releases, online movie/software downloads, Linux distributions. Files are 100MB to tens of GB, and downloads are one-to-many. How to serve large files to many clients? A Content Distribution Network (CDN)? A peer-to-peer system?

3 What CDNs Are Optimized For Most Web files are small (1KB ~ 100KB).

4 Why Not Web CDNs? Whole-file caching in each participating proxy, optimized for ~10KB objects: a 2GB file = 200,000 x 10KB objects. Memory pressure: working sets do not fit in memory, and disk access is 1000 times slower. Waste of resources: more servers are needed, and provisioning is a must.

5 Peer-to-Peer? BitTorrent takes up ~30% of Internet bandwidth. 1. Download a torrent file 2. Contact the tracker 3. Enter the swarm network 4. Chunk exchange policy: rarest chunk first or random; tit-for-tat as an incentive to upload; optimistic unchoking 5. Validate the checksums. Benefit: extremely good use of resources!
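The rarest-first policy in step 4 can be sketched as follows. This is a minimal illustration, not BitTorrent's actual implementation; the function names and set-based bitfields are assumptions for clarity.

```python
import random
from collections import Counter

def pick_rarest_chunk(needed, peer_bitfields):
    """Pick the chunk we still need that the fewest peers hold
    (rarest-first); break ties randomly."""
    counts = Counter()
    for bitfield in peer_bitfields:
        for chunk in bitfield:
            if chunk in needed:
                counts[chunk] += 1
    available = [c for c in needed if counts[c] > 0]
    if not available:
        return None  # no peer has anything we need yet
    rarest = min(counts[c] for c in available)
    return random.choice([c for c in available if counts[c] == rarest])

# Chunk 2 is held by only one peer, so it is requested first:
peers = [{0, 1}, {0, 1, 2}, {0, 1}]
print(pick_rarest_chunk({0, 1, 2}, peers))  # 2
```

Fetching rare chunks first keeps every chunk well replicated in the swarm, which is part of why BitTorrent uses resources so well.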

6 Peer-to-Peer? Custom software: deployment is a must, and configuration is needed. Companies may want a managed service that handles flash crowds and long-lived objects. Performance problems: service quality is hard to guarantee (others are discussed later).

7 What We'd Like Is A large-file service with no custom client, no custom server, no prepositioning, no rehosting, and no manual provisioning.

8 CoBlitz: Scalable Large-file CDN Reduce the problem to a small-file CDN: split large files into chunks, distribute the chunks across proxies, and aggregate their memory/cache. Plain HTTP, so nothing needs to be deployed. Benefits: 55-86% faster than BitTorrent (up to ~500%), one copy from the origin serves 43-55 nodes, and it builds incrementally on existing CDNs.

9 How It Works CDN = redirector + reverse proxy. [Diagram: the client's agent splits a download into chunk requests (HTTP range queries); DNS redirection spreads chunks 1-5 across CDN nodes; only the reverse proxy (CDN) caches the chunks, fetching misses from the origin server.]
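The range-query mechanism on this slide can be sketched in a few lines. The chunk size and URL handling here are illustrative assumptions, not CoBlitz's actual parameters; the point is that ordinary HTTP suffices, so no custom client or server is needed.

```python
import urllib.request

CHUNK_SIZE = 60 * 1024  # illustrative: in the small-object range CDNs favor

def range_header(index, chunk_size=CHUNK_SIZE):
    """Build the standard HTTP Range header for chunk `index`."""
    start = index * chunk_size
    return {"Range": f"bytes={start}-{start + chunk_size - 1}"}

def fetch_chunk(url, index, chunk_size=CHUNK_SIZE):
    """Fetch one chunk of `url` with an ordinary HTTP range query;
    any standard HTTP server or proxy understands this."""
    req = urllib.request.Request(url, headers=range_header(index, chunk_size))
    with urllib.request.urlopen(req) as resp:
        return resp.read()

print(range_header(2, 1024))  # {'Range': 'bytes=2048-3071'}
```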

10 Smart Agent Preserves HTTP semantics while issuing parallel chunk requests over a client-side sliding window of chunks. [Diagram: the agent keeps several chunk requests in flight to the CDN, marking each slot done/waiting and delivering bytes to the client in order.]
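A minimal sketch of the agent's sliding window, assuming a `fetch` callback that stands in for a real CDN chunk request: up to `window` requests are outstanding at once, but chunks are delivered strictly in order so HTTP semantics are preserved.

```python
from concurrent.futures import ThreadPoolExecutor

def sliding_window_download(fetch, num_chunks, window=4):
    """Download chunks 0..num_chunks-1 with up to `window` requests
    in flight, yielding the bytes strictly in order."""
    with ThreadPoolExecutor(max_workers=window) as pool:
        futures = {}
        next_to_yield = 0
        next_to_issue = 0
        while next_to_yield < num_chunks:
            # Keep the window full of outstanding requests.
            while (next_to_issue < num_chunks
                   and next_to_issue < next_to_yield + window):
                futures[next_to_issue] = pool.submit(fetch, next_to_issue)
                next_to_issue += 1
            # Hand the next in-order chunk to the client once it completes.
            yield futures.pop(next_to_yield).result()
            next_to_yield += 1

# Dummy fetch standing in for an HTTP range query to the CDN:
chunks = list(sliding_window_download(lambda i: f"chunk{i}".encode(), 5, window=2))
print(chunks)  # [b'chunk0', b'chunk1', b'chunk2', b'chunk3', b'chunk4']
```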

11 Chunk Indexing: Consistent Hashing Problem: how to find the CDN node (proxy) responsible for a specific chunk? Static hashing, f(x) = some_f(x) % n, fails because n is dynamic for servers: a node can go down, or a new node can join. Consistent hashing: F(x) = some_F(x) % N, where N is a large but fixed number; for a chunk request, find the live node k where |F(k) - F(URL)| is minimum.
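The slide's scheme can be sketched directly. The SHA-1 hash, the value of N, and the node names below are illustrative assumptions; production consistent-hashing systems also typically use clockwise ring distance and virtual nodes rather than the absolute difference the slide states.

```python
import hashlib

N = 2**32  # large but fixed hash space

def F(key):
    """Map a string into the fixed space [0, N)."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % N

def responsible_node(chunk_url, live_nodes):
    """Find the live node k where |F(k) - F(URL)| is minimum.
    Because N is fixed, a node joining or leaving only moves the
    chunks whose hashes fall nearest to it."""
    target = F(chunk_url)
    return min(live_nodes, key=lambda node: abs(F(node) - target))

nodes = ["proxy-a", "proxy-b", "proxy-c"]
print(responsible_node("http://example.com/big.iso#chunk3", nodes))
```

Note the contrast with static hashing: with `some_f(x) % n`, changing `n` reshuffles almost every chunk-to-node assignment, while here most assignments survive membership changes.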

12 Operation & Challenges In public service for over 2.5 years. Challenges: scalability & robustness, peering set difference, and load on the origin server.

13 Unilateral Peering Independent proximity-aware peering: each node picks the n closest nodes around it (cf. BitTorrent, which picks n nodes randomly). Motivation: partial network connectivity (Internet2, CANARIE nodes), routing disruption, isolated nodes. Benefits: no synchronized-maintenance problem; improves both scalability & robustness.

14 Peering Set Difference No perfect clustering by design. Assumption: close nodes share common peers. [Diagram: peers that both nodes can reach vs. peers only one can reach.]

15 Peering Set Difference Application-level RTTs are highly variable, with 10x the variance of ICMP, causing a high rate of change in the peer set: close nodes share less than 50% of their peers. Result: low cache hit rates, low memory utility, and excessive load on the origin.

16 Peering Set Difference How to fix? Use the minimum RTT instead of the average RTT, increase the number of samples, increase the number of peers, and add hysteresis. Result: close nodes share more than 90% of their peers.

17 Reducing Origin Load Peering set difference remains, and it is critical for traffic to the origin. Proximity-based routing: on a miss, a node reruns the hashing over its own peers and forwards the request one more hop (3-15% of requests take an extra hop), forming an implicit overlay tree rooted at the origin server that converges exponentially fast. Result: origin load reduced by 5x.

18 Scale Experiments All live PlanetLab nodes are used as clients (380~400 live nodes at any time), simultaneously fetching a 50MB file. Test scenarios: direct; BitTorrent total/core; CoBlitz uncached/cached/staggered. Out-of-order numbers are in the paper.

19 Throughput Distribution [CDF: fraction of nodes vs. throughput (Kbps, 0-10000) for direct, BT-total, BT-core, and CoBlitz in-order uncached/staggered/cached; in-order CoBlitz is 55-86% faster than BT-core, out-of-order staggered even more so.]

20 Downloading Times At the 95th percentile, CoBlitz is 1000+ seconds faster.

21 Why Is BitTorrent Slow? In the experiments: no locality (peers are chosen randomly); chunk indexing costs extra communication; trackerless BitTorrent uses the Kademlia DHT. In practice: the upload capacity of typical peers is low (10 to a few hundred Kbps for cable/DSL users), and tit-for-tat may not be fair; a few high-capacity uploaders help the most (BitTyrant [NSDI '07]).

22 Synchronized Workload [Diagram: synchronized requests congest the paths toward the origin server.]

23 Addressing Congestion Proximity-based multi-hop routing builds an overlay tree for each chunk, and the chunk window is resized dynamically: increase by 1/log(x) (where x is the window size) if a chunk finishes faster than average; decrease by 1 if a retry kills the first chunk.
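The window-resizing rule can be sketched as below. This is a minimal reading of the slide's two rules; the function signature, the guard for a window of 1, and the floor at 1 are my assumptions, not CoBlitz's exact code.

```python
import math

def resize_window(win, chunk_time, avg_time, first_chunk_killed=False):
    """Dynamic chunk-window resizing: grow sublinearly (by 1/log(win))
    when a chunk finishes faster than average; shrink by 1 when a
    retry kills the first outstanding chunk."""
    if first_chunk_killed:
        return max(1.0, win - 1)            # back off under congestion
    if chunk_time < avg_time:               # chunk finished faster than average
        return win + 1.0 / math.log(win) if win > 1 else win + 1.0
    return win

print(resize_window(4, 1.0, 2.0))  # grows by 1/log(4), to about 4.72
```

Growing by 1/log(x) makes larger windows expand ever more cautiously, so a fast node cannot swamp the congested path to the origin.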

24 Number of Failures

25 Performance After Flash Crowds BitTorrent: 20% of nodes exceed 5Mbps; CoBlitz: 70+% exceed 5Mbps.

26 Data Reuse 7 fetches from the origin serve 400 nodes: a 98% cache hit rate.

27 Real-world Usage 1-2 terabytes/day: official Fedora Core mirror (US East/West, England, Germany, Korea, Japan); the CiteSeer repository (50,000+ links); University Channel (podcast/video); public lecture distribution by Princeton's OIT; popular game patch distribution; PlanetLab researchers, e.g. Stork (U of Arizona) plus ~10 others.

28 Fedora Core 6 Release Released October 24th, 2006, at 10am. Peak throughput of 1.44Gbps, while the origin server served only 30-40Mbps.

29 On the Fedora Core Mirror List Many people complained about I/O: peaking at 500Mbps out of 2Gbps on 2 Sun x4200s with dual Opterons, 2GB memory, and a 2.5TB SATA-based SAN, even with all ISOs in the disk cache or an in-memory FS. By contrast, CoBlitz uses 100MB of memory per node, many PlanetLab node disks are IDE, and most nodes are bandwidth-capped at 10Mbps.

30 Conclusion A scalable large-file transfer service, evolved under real traffic and up and running 24/7 for over 2.5 years. Unilateral peering, multi-hop routing, and window-size adjustment give better performance than P2P: better throughput and download times with far less origin traffic.

31 Thank you! More information: How to use:* *Some content restrictions apply; see the Web site for details. Contact me if you want full access!
