1
Inter-Datacenter Bulk Transfers with NetStitcher. Nikolaos Laoutaris, Telefonica Research. Joint work with: Michael Sirivianos, Xiaoyuan Yang and Pablo Rodriguez.
2
Additional applications. Big data is important for business, science, and society at large. Densification of IT with datacenters and the cloud fuels the big data mill.
3
Value proposition: NetStitcher is a solution for moving petabytes across the Internet. TCP, single-path routing, and the end-to-end principle are not a good fit for bulk.
4
It is cost-effective because it uses leftover bandwidth.
5
Trend 1 – Fault tolerance to catastrophic failures
6
Trend 2 – PoP replication for improved user QoS. Start thinking long-tail…
8
Additional applications: scientific computing; distributed production/delivery of movies. And things will get worse.
9
NetStitcher in a nutshell. A system for carrying bulk data for large customers — volume ~ TBytes/PBytes, delivery time ~ hours/days. Main idea: peak-load dimensioning & backup paths leave lots of leftover bandwidth. Create a volume service for interconnecting datacenters: X TBs from A to B within the next Y hours.
10
Leftover b/w appears whenever and wherever, at both sender and receiver. You may have guessed already: store & forward is the solution.
11
Stitching together leftover bandwidth is tricky (sender in time zone A, receiver in time zone B).
12
A storage overlay that is aware of leftover bandwidth — both leftover network bandwidth and leftover edge bandwidth.
13
NetStitcher’s bag of tricks No in-network constraints and time-aligned sender and receiver bandwidth availability — NetStitcher can perform end-to-end transfer
14
NetStitcher’s bag of tricks In-network constraints and time-aligned sender and receiver bandwidth availability — NetStitcher can perform multi-path overlay routing
15
NetStitcher’s bag of tricks In-network constraints and misaligned sender, receiver and intermediate node bandwidth availability — NetStitcher can perform multi-path and multi-hop store and forward
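The third trick is the interesting one: when availability windows never overlap, an end-to-end transfer moves nothing, while a relay with storage can still carry the full volume. A toy simulation of that effect (the 1 Gbps rate and the hourly windows below are hypothetical, not from the talk):

```python
# Toy model: 1 Gbps link, hourly slots over one day. The sender's uplink is
# free for 3 hours at night, the receiver's downlink likewise, but an 8-hour
# time-zone offset misaligns the two windows (all numbers hypothetical).
GB_PER_HOUR = 450  # GB moved in one hour at 1 Gbps (1e9 bits/s * 3600 / 8 / 1e9)

sender_up = {2, 3, 4}                                  # UTC hours sender uplink is free
offset = 8                                             # receiver is 8 time zones away
receiver_dn = {(h + offset) % 24 for h in sender_up}   # receiver's free hours, in UTC

# End-to-end transfer only works while BOTH windows are open at once.
e2e_volume = len(sender_up & receiver_dn) * GB_PER_HOUR

# Store-and-forward: an always-on relay with storage absorbs data during the
# sender's window and drains it later during the receiver's window.
stored = snf_volume = 0
for h in range(24):
    if h in sender_up:
        stored += GB_PER_HOUR                # relay soaks up leftover uplink
    if h in receiver_dn and stored > 0:
        sent = min(stored, GB_PER_HOUR)      # drain, capped by the link rate
        snf_volume += sent
        stored -= sent

print(e2e_volume, snf_volume)  # prints: 0 1350
```

With fully misaligned windows the end-to-end transfer delivers 0 GB while store-and-forward delivers the whole 1350 GB the windows allow.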
16
How do we schedule around all these constraints?
17
Time expansion of a dynamic graph: each node — source Src, intermediate nodes I1 and I2, destination Dst — is replicated once per time interval, e.g. Src(1), Src(2), Src(3). Network edges N_Src-I1(t), N_I1-I2(t), N_I2-Dst(t) connect copies within the same interval and carry the network constraint; storage edges S connect successive copies of the same node (data stored in interval t is available in t+1) and carry the storage constraint; uplink & downlink constraints bound what each node can send and receive per interval. A super-source and super-sink complete the flow network.
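The point of the time-expanded construction is that scheduling over a dynamic network reduces to max-flow on a static graph. A self-contained sketch (Edmonds-Karp max-flow; the three intervals and all capacities below are made up for illustration):

```python
from collections import deque, defaultdict

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly push flow along shortest augmenting paths."""
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:          # BFS in the residual graph
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow                       # no augmenting path left
        path, v = [], t                       # reconstruct s -> t path
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:                     # update residual capacities
            cap[u][v] -= push
            cap[v][u] = cap[v].get(u, 0) + push
        flow += push

cap = defaultdict(dict)
def edge(u, v, c):
    cap[u][v] = c
    cap[v].setdefault(u, 0)

# Source can only send in intervals 1-2, destination can only receive in 2-3;
# intermediate node I1 bridges the gap via storage (volumes in TB, hypothetical).
edge('S', 'Src(1)', 10); edge('S', 'Src(2)', 10)        # uplink windows
edge('Src(1)', 'I1(1)', 5); edge('Src(2)', 'I1(2)', 5)  # network constraints N
edge('I1(1)', 'I1(2)', 4); edge('I1(2)', 'I1(3)', 4)    # storage constraints S
edge('I1(2)', 'Dst(2)', 5); edge('I1(3)', 'Dst(3)', 5)  # network constraints N
edge('Dst(2)', 'T', 10); edge('Dst(3)', 'T', 10)        # downlink windows

total = max_flow(cap, 'S', 'T')
print(total)  # prints: 9
```

The max-flow value (9 TB here) is the largest volume deliverable within the horizon, and the flow on each edge is the transmission/storage schedule.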
18
Uplink & downlink constraints. [Figure: zoom on the time-expanded graph around Src(2) and I1(2), showing storage edges S_Src(1), S_Src(2), S_I1(1) and network edges N_Src-I1(2), N_I1-I2(2).]
19
Uplink & downlink constraints: each node copy is split in two. Src(2) splits into Src(2) and Src(2)+, joined by an edge of capacity U_Src(2) — the Src uplink constraint. I1(2) splits into I1(2)- and I1(2), joined by an edge of capacity D_I1(2) — the I1 downlink constraint. The network edge N_Src-I1(2) between Src(2)+ and I1(2)- carries the network constraint.
20
But we need to predict the future (of bandwidth). [Chart: international backbone traffic.]
21
Prediction is easy when data are bulk: 1. Periodic patterns. 2. We care about VOLUMES, not RATES — VOLUME = ∫ RATE(t) dt. In our NetStitcher implementation we use: a simple Sparse Periodic Auto-regression predictor (Chen et al., NSDI’08); recomputation of the transmission schedule; an end-game “pull mode” to handle occasional churn and prediction failures.
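The "volumes, not rates" point can be made concrete: a noisy per-hour rate forecast still yields an accurate volume estimate, because errors average out under the integral. Below, a toy periodic predictor (the hourly mean over past days) stands in for the Sparse Periodic Auto-regression of Chen et al.; the diurnal rate profile and noise level are invented for illustration:

```python
import random

random.seed(0)
PERIOD = 24  # diurnal pattern, in hours

def true_rate(h):
    # leftover bandwidth in Gbps: high at night, low during the day (toy profile)
    return 0.9 if h % PERIOD < 6 else 0.1

# observe three noisy days of leftover-bandwidth samples
history = [max(0.0, true_rate(h) + random.gauss(0, 0.05))
           for h in range(3 * PERIOD)]

def predict(h):
    """Predict the rate at hour h as the mean of past samples at the same phase."""
    samples = history[h % PERIOD::PERIOD]
    return sum(samples) / len(samples)

# deliverable volume over the next day, in Gbit (rate in Gbps * seconds)
predicted_volume = sum(predict(h) for h in range(PERIOD)) * 3600
actual_volume = sum(true_rate(h) for h in range(PERIOD)) * 3600
print(round(predicted_volume), round(actual_volume))
```

Even with per-hour noise, the predicted volume lands within a few percent of the true volume — which is all the scheduler needs.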
22
Case study 1: Equinix datacenters — 62 datacenters at 22 locations all over North America.
23
How much data can we back up? 3 hours used for backup (3–6 am local time at each datacenter), 1 Gbps network access capacity. NetStitcher can move 5× more bytes.
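A quick ceiling check on those numbers: a 3-hour window at 1 Gbps caps what any one site can push per night. (The 5× factor itself comes from the talk's evaluation — stitching non-overlapping windows across time zones — not from this arithmetic.)

```python
# Back-of-envelope for the Equinix scenario: one 3-hour nightly backup
# window (3-6 am local) at full 1 Gbps access capacity.
GBPS = 1e9                       # link rate in bits per second
window_s = 3 * 3600              # 3-hour window, in seconds
per_night_bytes = GBPS * window_s / 8
print(per_night_bytes / 1e12)    # prints: 1.35  (TB per datacenter per night)
```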
24
Case study 2: Telefonica CDN — 49 servers in Europe, Latin America and the USA, spanning GMT-1 to GMT-8. Need to send a 4.2 TB file over 24h. Beyond leftover: 95th-percentile pricing at $7/Mbps/month; storage cost $0.055/GB/month. [Map: cds service centers and TIWS entry/end points across the USA (New York, Washington, Miami, Dallas, Palo Alto), Latin America (Colombia, Peru, Chile, Argentina, Brazil) and Europe (Spain, UK, Germany, France, Czech Republic), Phase I 2010 and Phase II 2011.] NetStitcher is 80–90% cheaper between Europe & US.
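A simplified back-of-envelope on the slide's prices shows why leftover bandwidth wins (the talk's 80–90% figure comes from their full evaluation, not from this arithmetic, and the one-day-of-storage worst case is my assumption):

```python
# Prices from the slide; the cost model below is a deliberate simplification.
TB = 4.2
transit_usd_per_mbps_month = 7.0
storage_usd_per_gb_month = 0.055

# Dedicated transfer: sustained rate needed to push 4.2 TB in 24 h, billed
# as if it raised the 95th-percentile by that rate for the month.
rate_mbps = TB * 1e12 * 8 / 86400 / 1e6
transit_cost = rate_mbps * transit_usd_per_mbps_month

# Leftover-bandwidth transfer: the bandwidth itself is free; pay storage at
# the relays, assuming (worst case) the whole file is buffered for one day.
storage_cost = TB * 1000 * storage_usd_per_gb_month / 30

print(round(rate_mbps), round(transit_cost), round(storage_cost, 2))
# prints: 389 2722 7.7
```

Roughly $2700/month of 95th-percentile transit versus under $10 of storage: even with scheduling overheads, there is ample room for the 80–90% savings the evaluation reports.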
25
Conclusion: a practical application of DTNs. The utilization of a network can be improved, but for this we need: 1. Delay-elastic traffic shifted into off-peak hours. 2. In-network storage. 3. High-level knowledge of traffic behaviour throughout the day.
26
More info at: http://people.tid.es/Nikolaos.Laoutaris