New Algorithms for Planning Bulk Transfer via Internet and Shipping Networks
Brian Cho, Indranil Gupta
University of Illinois at Urbana-Champaign

Presentation transcript:


Motivation: Ad-hoc Data Processing
- Data-intensive research on OpenCirrus
  - Federated cloud: diverse geographic locations
  - Data scale of TBs
- Limited wide-area bandwidth is a big bottleneck: a transfer over the Internet can take days or weeks [Garfinkel 07]
- Success story: Washington Post, Hillary Clinton White House schedule
  - Released as 17,481 pages of non-searchable PDF images
  - Converted to searchable text and delivered to the newsroom within the same news cycle
  - Done within 26 hours with Amazon AWS
  - Pay for bandwidth and computer usage

Bulk Transfer Options
- Internet Transfer
  - Grid: [GridFTP]
  - PlanetLab: [CoBlitz 06]
- Disk Shipping Transfer
  - [Jim Gray 03]
  - [PostManet 04]
  - [DOT 06]
  - Amazon AWS Import/Export
- Pandora (People and networks moving data around)
  - First ever solution to transfer data cooperatively between multiple sources with Internet and shipping edges
  - Produces optimal transfer plans that obey time deadlines and minimize dollar cost
  - Better than Internet-only and shipping-only strategies

Option 1: Internet Transfer
[Figure: Data Source (Illinois) and Data Source (CMU) each send data over the Internet to the Computation Provider (Amazon).]
- Link bandwidth: 5-20 Mbps, so 1 TB takes 5-20 days
- Cost labels in the figure: $0.10 per GB; No Cost
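The "1 TB in 5-20 days" figure can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, it assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits/s) and a link that runs at its full rate for the whole transfer, and it applies the slide's $0.10 per GB rate to every byte moved.

```python
SECONDS_PER_DAY = 86_400


def internet_transfer_days(data_tb: float, bandwidth_mbps: float) -> float:
    """Days needed to push data_tb terabytes through a bandwidth_mbps link.

    Assumes decimal units and a fully utilized link (both assumptions).
    """
    bits = data_tb * 1e12 * 8
    seconds = bits / (bandwidth_mbps * 1e6)
    return seconds / SECONDS_PER_DAY


def internet_transfer_cost(data_tb: float, dollars_per_gb: float = 0.10) -> float:
    """Dollar cost at a flat per-GB rate (the slide quotes $0.10 per GB)."""
    return data_tb * 1000 * dollars_per_gb


if __name__ == "__main__":
    for mbps in (5, 20):
        days = internet_transfer_days(1.0, mbps)
        print(f"1 TB at {mbps} Mbps: {days:.1f} days")
    print(f"1 TB at $0.10/GB: ${internet_transfer_cost(1.0):.2f}")
```

Under these assumptions, 1 TB takes roughly 18.5 days at 5 Mbps and 4.6 days at 20 Mbps, which matches the slide's rounded range.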

Option 2: Disk Shipping Transfer
[Figure: Data Source (Illinois), Data Source (CMU), and the Computation Provider (Amazon) connected by shipping links.]
- Disk interface: 40 MB/s
- Shipping rates labeled on the three links (per disk):
  - Overnight: $60, Two-Day: $30, Ground: $10
  - Overnight: $50, Two-Day: $25, Ground: $5
  - Overnight: $40, Two-Day: $15, Ground: $5
- Charges labeled at the computation provider: $0.02 per GB, $80 per disk

Cooperative Transfer Solutions
- Good solutions
  - Meet deadlines
  - Minimize dollar cost
- Complexity
  - Global scale
  - Many strategies
  - Collaboration helps
- How to find the best solution?
[Figure: Open Cirrus Sites]

Example: Minimize Dollar Cost
[Figure: Data Source A, Data Source B, and the Cloud Service Provider; data sizes 0.8 TB and 1.2 TB.]
- Internet link: 15 days, no cost
- Shipment: 5 days, Ground: $5
- 14 hours
- Loading: $40; Handling: $80
- Total cost: $125; total time: 20 days
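The $125 total can be reproduced from the per-unit rates quoted on the disk-shipping slide. The following is a rough sketch rather than the authors' cost model: the function names are hypothetical, and it assumes that loading is billed at $0.02 per GB, handling at $80 per disk, that one disk holds 2 TB, and that all 2 TB (0.8 TB + 1.2 TB) end up on a single ground shipment.

```python
import math

GB_PER_TB = 1000  # decimal units (assumption)


def disk_shipping_cost(data_tb: float, ship_fee: float,
                       loading_per_gb: float = 0.02,
                       handling_per_disk: float = 80.0,
                       disk_capacity_tb: float = 2.0) -> float:
    """Shipping + handling per disk, plus per-GB loading (assumed rates)."""
    disks = math.ceil(data_tb / disk_capacity_tb)
    loading = data_tb * GB_PER_TB * loading_per_gb
    return disks * (ship_fee + handling_per_disk) + loading


def disk_load_hours(data_tb: float, interface_mb_per_s: float = 40.0) -> float:
    """Hours to read data_tb terabytes through the disk interface."""
    return data_tb * 1e12 / (interface_mb_per_s * 1e6) / 3600


if __name__ == "__main__":
    # One ground shipment ($5) carrying the combined 2 TB.
    print(f"cost: ${disk_shipping_cost(2.0, ship_fee=5.0):.2f}")
    print(f"disk load time: {disk_load_hours(2.0):.1f} hours")
```

With these assumptions the cost is $5 + $80 + $40 = $125, and reading 2 TB at 40 MB/s takes just under 14 hours, consistent with the "14 hours" label on the slide.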

Example: Meet Deadline (3 days) while Minimizing Dollar Cost
[Figure: Data Source A, Data Source B, and the Cloud Service Provider; data sizes 0.8 TB and 1.2 TB.]
- Shipment: 1 day, Overnight: $40
- Shipment: 1 day, Overnight: $…
- … hours, 6 hours
- Loading: $40; Handling: $80
- Total cost: $210; total time: 3 days

Outline
- Motivation
- Problem Formulation
  - Graph Model
  - Flow Over Time
- Solution: Pandora
- Experimental Results
- Conclusion

Graph Model: Internet Links
[Figure: Site A and Site B, each with inet_out and inet_in vertices.]
- Edges within a site: incoming/outgoing bandwidth
- Internet edges between sites:
  - Capacity (Mb/s)
  - Cost ($/GB)
  - Transit time (almost instantaneous)

Graph Model: Shipment Links
[Figure: Site A and Site B, each with inet_out, inet_in, and ship_in vertices.]
- Edges within a site: incoming/outgoing bandwidth
- Disk interface edges:
  - Disk interface bandwidth, e.g., 40 MB/s
  - Cost: loading ($/GB)
- Internet edges between sites:
  - Capacity (Mb/s)
  - Cost ($/GB)
  - Transit time (almost instantaneous)
- Shipment edges between sites:
  - Capacity (almost infinite)
  - Cost: shipping and handling ($/disk)
  - Transit time (hours)

Data Transfer Over Time
- Goal: meet time deadline T while minimizing dollar cost C
- Hard problem on a graph with both Internet and shipment links
  - NP-hard
  - Formal problem and proof in paper
- Solution: Pandora computes optimal and approximate solutions

Solution: Pandora Overview
- Transform into a static time-expanded network
  - Decomposition of shipping edges
- Solve min-cost flow on the static network
  - Mixed Integer Program
  - Optimizations to reduce computation time

Time-expanded Network
- Intuitively, incorporate time into the graph to create an extended graph representation
- Make T (= deadline) copies of each vertex
- Draw edges according to transit time
- Draw holdover edges [Ford Fulkerson 58]
- Disk shipment is represented in the time-expanded network
[Figure: example with edge transit times τ = 1 and τ = 3, deadline T = 5, and a time axis.]
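The construction steps above can be sketched in a few lines. This is a minimal illustration, not Pandora's implementation: `time_expand` and the vertex names are hypothetical, time is discretized into integer steps, and the T copies of each vertex are indexed 0 to T-1 (an assumption about the indexing convention).

```python
from typing import List, Tuple

Node = Tuple[str, int]  # (vertex name, time step)


def time_expand(vertices: List[str],
                edges: List[Tuple[str, str, int]],
                T: int) -> List[Tuple[Node, Node]]:
    """Build the edge list of a time-expanded network.

    edges holds (u, v, transit_time) triples of the original graph.
    Each vertex gets T copies, one per time step 0..T-1.
    """
    expanded: List[Tuple[Node, Node]] = []
    for t in range(T):
        # Holdover edges: data may wait at a vertex for one time step.
        if t + 1 < T:
            for v in vertices:
                expanded.append(((v, t), (v, t + 1)))
        # Transit edges: leaving u at time t arrives at v at time t + tau.
        for u, v, tau in edges:
            if t + tau < T:
                expanded.append(((u, t), (v, t + tau)))
    return expanded


if __name__ == "__main__":
    # Transit times taken from the slide's figure labels (tau = 1, tau = 3, T = 5).
    net = time_expand(["A", "B", "C"], [("A", "B", 1), ("B", "C", 3)], T=5)
    print(len(net), "edges in the time-expanded network")
```

Edges whose arrival time would fall past the last layer are dropped, since flow sent on them could not meet the deadline.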

Decomposed Shipping Edges
- Decompose shipping edges into fixed-cost edges, each with:
  1. Transit time
  2. Fixed cost
  3. Capacity
[Figure: example fixed-cost edges labeled cost = $130, capacity = 2 TB; cost = $110, capacity = 2 TB; cost = $100, capacity = 2 TB.]

Solution: Min-cost Flow Calculation using Mixed-Integer Program
- Fixed-cost edges make the min-cost flow calculation NP-hard
- Mixed-Integer Program (MIP)
  - Binary variable y_e defined on fixed-cost edges
  - Goal: minimize dollar cost
  - Subject to:
    - Capacity constraints (flow_e ≤ capacity_e · y_e)
    - Conservation of flow
    - Demands of sources and sink
- Proof of NP-hardness and formal MIP in paper
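For orientation, a generic fixed-charge min-cost flow program of the kind outlined above can be written as follows. This is a textbook-style sketch, not the formal MIP from the paper; the symbols are assumptions introduced here: f_e is the flow on edge e, u_e its capacity, c_e its per-unit cost, k_e the fixed cost of a fixed-cost edge, E_F the set of fixed-cost edges, and b_v the supply (positive at sources, negative at the sink) of vertex v in the time-expanded network.

```latex
\begin{aligned}
\min \quad & \sum_{e \in E} c_e f_e + \sum_{e \in E_F} k_e y_e \\
\text{s.t.} \quad
 & 0 \le f_e \le u_e && \forall e \in E \setminus E_F \\
 & 0 \le f_e \le u_e \, y_e && \forall e \in E_F \\
 & \sum_{e \in \delta^{+}(v)} f_e - \sum_{e \in \delta^{-}(v)} f_e = b_v && \forall v \in V \\
 & y_e \in \{0, 1\} && \forall e \in E_F
\end{aligned}
```

Setting y_e = 0 forces the flow on a fixed-cost edge to zero, while y_e = 1 opens the edge at its full capacity and charges k_e once, regardless of how much flow it carries.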

Optimizations: Overview
- Size of the MIP grows linearly with deadline T
  - Worst-case running time grows exponentially with T
- Reduce the size of the MIP
  - Reduce the number of shipment edges
  - Δ-condensed time-expanded networks
- More optimizations in paper

Optimizations: Reduce Number of Shipment Edges
- Redundant shipment edges can be removed
- Example: an overnight shipment sent any time before 4pm will arrive at the destination at 8am
[Figure: timeline with send times 7am, noon, 1pm, 2pm, 3pm, 4pm and arrival at 8am.]
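One way to picture this pruning: among shipment edges that arrive at the same time for the same price, only the latest departure needs to be kept, because data can wait at the sender on holdover edges. The sketch below rests on that assumption (free holdover at the source); `prune_shipment_edges` and the tuple layout are hypothetical and not taken from the paper.

```python
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[int, int, float]  # (send_hour, arrive_hour, cost)


def prune_shipment_edges(edges: Iterable[Edge]) -> List[Edge]:
    """Keep, for each (arrival time, cost) pair, only the latest send time.

    Earlier sends with the same arrival and cost are redundant if data can
    be held at the source at no cost (assumption).
    """
    latest: Dict[Tuple[int, float], Edge] = {}
    for send, arrive, cost in edges:
        key = (arrive, cost)
        if key not in latest or send > latest[key][0]:
            latest[key] = (send, arrive, cost)
    return sorted(latest.values())


if __name__ == "__main__":
    # Hourly overnight sends from 7am (hour 7) to 4pm (hour 16), all arriving
    # at 8am the next day (hour 32); the $40 price is illustrative.
    candidates = [(hour, 32, 40.0) for hour in range(7, 17)]
    print(prune_shipment_edges(candidates))
```

In this example ten candidate edges collapse into one, which directly reduces the number of binary variables in the MIP.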

Optimization: Δ-condensed Time-expanded Network
- Each batch of Δ consecutive time units is condensed into one virtual time unit
- The solution has:
  - Minimum cost
  - A deadline approximation that depends on Δ
- More details in paper [Fleischer Skutella 07]
[Figure: example with Δ = 2.]
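The effect of condensing on network size can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: `condense` is a hypothetical helper, and rounding each transit time up to a whole number of Δ-sized units is assumed here as the usual way to keep condensed plans feasible.

```python
import math
from typing import Dict, Tuple


def condense(T: int, transit_times: Dict[str, int],
             delta: int) -> Tuple[int, Dict[str, int]]:
    """Return the number of condensed layers and the condensed transit times.

    Each batch of delta time units becomes one virtual unit; transit times
    are rounded up to whole virtual units (assumption).
    """
    layers = math.ceil(T / delta)
    condensed = {edge: math.ceil(tau / delta)
                 for edge, tau in transit_times.items()}
    return layers, condensed


if __name__ == "__main__":
    # Illustrative numbers: a 96-hour deadline with hourly time steps.
    layers, taus = condense(96, {"overnight": 24, "disk_load": 5}, delta=2)
    print(layers, taus)
```

Halving the number of layers roughly halves the number of vertices and edges in the MIP, at the price of the rounding error that the slide describes as a deadline approximation depending on Δ.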

Experimental Setup
- Trace-driven
  - Wrote scripts to communicate with FedEx web services: queried package rates and delivery times
  - Internet bandwidth from PlanetLab measurements
- GNU Linear Programming Kit (GLPK)

Experimental Results: 8 sources, 0.25 TB per node, Heterogeneous BW
- Direct Internet
  - Cost: $200
  - Time: 280 hrs
  - Cannot take advantage of heterogeneous bandwidth
- Direct Overnight
  - Cost: $1,500
  - Time: 38 hrs
  - Cannot fill disks to capacity
[Figure: 0.25 TB x 8 sources; link width proportional to bandwidth.]

Experimental Results: 8 sources, 0.25 TB per node, Heterogeneous BW
- Direct Internet
  - Cost: $200
  - Time: 280 hrs
  - Cannot take advantage of heterogeneous bandwidth
- Direct Overnight
  - Cost: $1,500
  - Time: 38 hrs
  - Cannot fill disks to capacity
- Pandora (deadline = 96 hrs)
  - Cost: $183
  - Time: < 96 hrs
[Figure: transfer amounts labeled … TB, 0.14 TB, 0.06 TB, 0.08 TB.]

Experimental Results: Optimizations
- Reducing shipment edges decreases computation time
- Using Δ-condensed time-expanded networks decreases computation time
  - Deadlines met in our experiments
[Figure: panels labeled "2 sources" and "1 source".]

Conclusion
- First ever solution to transfer data cooperatively between multiple sources with Internet and shipping edges
- Produces optimal transfer plans that obey time deadlines and minimize dollar cost
- Better than Internet-only and shipping-only strategies
- Reasonable computation time through optimizations