
1 Transparent and Flexible Network Management for Big Data Processing in the Cloud Anupam Das Curtis Yu Cristian Lumezanu Yueping Zhang Vishal Singh Guofei Jiang


4 Data processing Network

5 Schedule computation

6 Schedule communication 33% of average job running time

7 FlowComb Network management framework for Big Data processing: 1. What is the traffic demand? 2. Which path to choose? 3. How to change the path?

8 Demand prediction Use application semantics information to effectively and transparently infer network transfers (possibly before they start)

9 Demand prediction Agents on Hadoop nodes analyze Hadoop logs, query nodes, and predict data transfers. The agent on each Hadoop node parses TaskTracker logs to identify reducers and the size of map output, and parses JobTracker logs to identify finished mappers.
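The log-driven prediction described on this slide can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three log line formats (`MAP_DONE`, `MAP_OUTPUT`, `REDUCER`) are invented and are not the real Hadoop JobTracker or TaskTracker formats, and `predict_transfers` is a hypothetical helper, not part of FlowComb.

```python
import re

# Hypothetical, simplified log formats (NOT the real Hadoop log formats).
MAP_DONE = re.compile(r"MAP_DONE job=(\S+) map=(\S+) host=(\S+)")
MAP_OUTPUT = re.compile(r"MAP_OUTPUT map=(\S+) reducer=(\d+) bytes=(\d+)")
REDUCER_AT = re.compile(r"REDUCER job=(\S+) reducer=(\d+) host=(\S+)")


def predict_transfers(jobtracker_lines, tasktracker_lines):
    """Return predicted shuffle transfers as (src_host, dst_host, bytes)."""
    # JobTracker-style lines: which mappers have finished, and where they ran.
    map_host = {}
    for line in jobtracker_lines:
        m = MAP_DONE.search(line)
        if m:
            _job, map_id, host = m.groups()
            map_host[map_id] = host

    # TaskTracker-style lines: reducer placement and map output sizes.
    reducer_host = {}
    outputs = []
    for line in tasktracker_lines:
        m = REDUCER_AT.search(line)
        if m:
            _job, reducer, host = m.groups()
            reducer_host[int(reducer)] = host
            continue
        m = MAP_OUTPUT.search(line)
        if m:
            map_id, reducer, size = m.groups()
            outputs.append((map_id, int(reducer), int(size)))

    # A transfer is predicted only once the mapper is finished and the
    # reducer's host is known; same-host transfers never touch the network.
    transfers = []
    for map_id, reducer, size in outputs:
        src = map_host.get(map_id)
        dst = reducer_host.get(reducer)
        if src is None or dst is None or src == dst:
            continue
        transfers.append((src, dst, size))
    return transfers
```

In this sketch, map output belonging to a mapper that has not yet been reported as finished is simply skipped, which mirrors the idea that a transfer is predicted only when both endpoints and the size are known.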

10 Flow scheduling Reroute flows on paths with sufficient available bandwidth

11 Flow scheduling Where? Centralized decision engine. Which flows? FIFO. Reroute? If congestion on default path. Which path? First with available bandwidth.
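The policy on this slide (FIFO order, reroute only when the default path is congested, take the first path with available bandwidth) can be sketched as follows. This is an illustration under stated assumptions, not the FlowComb implementation: the `Flow` type, the link-capacity model, treating "congested" as "available bandwidth below the flow's demand", and leaving a flow on its default path when no alternative fits are all assumptions.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Flow:
    flow_id: str
    demand: float   # predicted bandwidth demand (assumed unit: Mbps)
    paths: list     # candidate paths, default first; each path is a tuple of link names


def available_bandwidth(path, capacity, load):
    """Bottleneck spare capacity along a path."""
    return min(capacity[link] - load.get(link, 0.0) for link in path)


def schedule_fifo(flows, capacity, load=None):
    """Assign each flow a path, processing flows in arrival (FIFO) order."""
    load = dict(load or {})
    decisions = {}
    queue = deque(flows)
    while queue:
        flow = queue.popleft()
        default, *alternatives = flow.paths
        chosen = default
        # Reroute only if the default path cannot carry the flow (assumption).
        if available_bandwidth(default, capacity, load) < flow.demand:
            for path in alternatives:
                if available_bandwidth(path, capacity, load) >= flow.demand:
                    chosen = path   # first path with enough spare bandwidth
                    break
        # Account for the flow on whichever path it ends up using.
        for link in chosen:
            load[link] = load.get(link, 0.0) + flow.demand
        decisions[flow.flow_id] = chosen
    return decisions
```

With two 100 Mbps links `a` and `b`, an 80 Mbps flow stays on its default link `a`, and a following 50 Mbps flow with the same default is moved to `b`.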

12 Flow control Use OpenFlow to install new forwarding rules in the network and enforce the new paths
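To make the idea of enforcing a path concrete, the sketch below expands a chosen path into one forwarding entry per switch. It is controller-agnostic and purely illustrative: the rule dictionary layout and the `rules_for_path` helper are hypothetical, no real OpenFlow controller API is used, and the match keys merely borrow OpenFlow-style field names.

```python
def rules_for_path(flow_match, hops):
    """Build one forwarding rule per switch along a path.

    flow_match: dict of header fields identifying the flow.
    hops: list of (switch_id, out_port) pairs, in path order.
    The returned dict layout is hypothetical, not a controller API.
    """
    return [
        {"switch": switch_id, "match": dict(flow_match), "action": {"output": out_port}}
        for switch_id, out_port in hops
    ]


# Example: a shuffle transfer between two hosts, routed over two switches.
match = {"ipv4_src": "10.0.0.1", "ipv4_dst": "10.0.0.2", "tcp_dst": 50060}
rules = rules_for_path(match, [("s1", 3), ("s2", 1)])
```

A real deployment would hand entries like these to the OpenFlow controller, which translates them into flow-table modifications on each switch.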

13 System Architecture Components: Hadoop cluster (master, slaves), PFS, FlowComb agent, FlowComb middleware, OpenFlow controller. Steps: 1. Analyze Hadoop logs 2. Extract flow information 3. Schedule upcoming flows 4. Set up flow paths 5. Install routing rules

14 Experiments

15 Does the network matter? [Table: link capacity (Mbps) vs. avg. processing time (min); surviving entries: (x1.3), 2567, (x1.7), (x3.7)] 4 times slower!

16 Can FlowComb predict transfers? 28% of transfers detected before they start (and 56% before they end)

17 How quickly can FlowComb change paths? [Figure labels: 10%, 70%, 20%] 60% before transfer midpoint

18 Can FlowComb reduce processing time? 36% faster than Hadoop without FlowComb (and 28% faster than Hadoop with ECMP)

19 FlowComb Network management platform for Big Data processing that is transparent to applications and quick and accurate in detecting their demand. It uses application semantics to detect data transfers (sometimes before they even start).

20 Testbed

21 OpenFlow network Controller

22 Hadoop sort performance [Figure: avg. utilization (MBps) vs. time (s), FlowComb vs. baseline]
