
Slide 1: A Fully Automated Fault-tolerant System for Distributed Video Processing and Off-site Replication
George Kola, Tevfik Kosar and Miron Livny
University of Wisconsin-Madison, June 2004

Slide 2: What is the talk about?
- You heard about streaming videos and switching video quality to mitigate congestion.
- This talk is about getting those videos:
  - encoding/processing videos using commodity cluster/grid resources
  - replicating videos over the wide-area network
- Insights into issues in the above process.

Slide 3: Motivation
- Education research, bio-medical engineering, ... have a large amount of video.
- Digital libraries: videos need to be processed (WCER 'Transana').
- Collaboration => videos need to be shared.
- Conventional approach: mail tapes.
  - Work to load the tape into the collaborator's digital library.
  - High turn-around time.
- Desire for a fully electronic solution.

Slide 4: Hardware Encoding Issues
- Products do not support tape robots, and users need multiple formats => would need a lot of operators!
- No hardware miniDV-to-MPEG-4 encoders are available.
- Lack of flexibility: no video processing support (e.g., night video where you want to change the white balance).
- Some hardware encoders are essentially PCs, but cost a lot more!

Slide 5: Our Goals
- Fully electronic solution.
- Shorter turn-around time.
- Full automation: no need for operators, more reliable.
- Flexible software-based solution.
- Use idle CPUs in commodity clusters/grid resources.
- Cost effective.

Slide 6: Issues
- 1 hour of DV video is ~13 GB.
- A typical educational research video uses 3 cameras => 39 GB for 1 hour.
- Transferring these videos over the network (see the sketch below):
  - an intermittent network outage => retransfer of the whole file => the transfer may never complete.
- Need for fault tolerance and failure handling:
  - software/machine crashes
  - downtime for upgrades (we do not control the machines!)
- Problems with existing distributed scheduling systems.
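A back-of-the-envelope illustration of the problem as a small Python sketch. The 13 GB/hour and three-camera figures come from the slide; the link speeds are assumed purely for the example.

```python
def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Approximate hours to move size_gb gigabytes over a link_mbps megabit/s link."""
    return size_gb * 8 * 1024 / link_mbps / 3600

session_gb = 3 * 13          # 3 cameras x ~13 GB per hour of DV video = 39 GB

for mbps in (10, 100):       # assumed WAN link speeds, for illustration only
    print(f"{session_gb} GB at {mbps} Mbps: ~{transfer_hours(session_gb, mbps):.1f} h per attempt")
# At 10 Mbps a single attempt takes roughly 9 hours; if every outage forces a
# restart from byte zero, the transfer of a long session may simply never complete.
```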

Slide 7: Problems with Existing Systems
- They couple data movement and computation: failure of either results in a redo of both.
- They do not schedule data movement:
  - 100+ nodes, each picking up a different 13 GB file => server thrashing; some transfers may never complete.
  - 100+ nodes, each writing a 13 GB file to the server => distributed denial of service; server crash.

Slide 8: Our Approach
- Decouple data placement and computation => fault isolation.
- Make data placement a full-fledged job => improved failure handling, alternate task failure recovery / protocol switching.
- Schedule data placement (see the sketch below):
  - prevents thrashing and crashing due to overload
  - can optimize the schedule using storage-server and end-host characteristics
  - can tune TCP buffers at run time
  - can optimize for full-system throughput
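A minimal Python sketch of the scheduling idea, not Stork itself: the job fields, the `run_transfer` hook and the concurrency limit are illustrative, but capping how many placements run at once is what keeps 100+ nodes from hammering one storage server.

```python
import queue, threading
from dataclasses import dataclass

@dataclass
class DataPlacementJob:              # a transfer described as a first-class job
    src_url: str                     # e.g. "file://stagearea/session07.dv"   (hypothetical)
    dest_url: str                    # e.g. "gsiftp://node42/scratch/session07.dv"
    protocol: str = "gridftp"

class DataPlacementScheduler:
    """Run at most max_concurrent transfers at a time instead of letting
    every compute node hit the storage server simultaneously."""
    def __init__(self, run_transfer, max_concurrent: int = 4):
        self.run_transfer = run_transfer          # callable(job) -> None, supplied by the caller
        self.jobs: queue.Queue[DataPlacementJob] = queue.Queue()
        self.workers = [threading.Thread(target=self._worker, daemon=True)
                        for _ in range(max_concurrent)]

    def submit(self, job: DataPlacementJob) -> None:
        self.jobs.put(job)                        # queued, not started immediately

    def start(self) -> None:
        for w in self.workers:
            w.start()

    def _worker(self) -> None:
        while True:
            job = self.jobs.get()
            try:
                self.run_transfer(job)            # per-transfer tuning (TCP buffers, ...) would go here
            finally:
                self.jobs.task_done()
```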

Slide 9: Fault Tolerance
- Small network outages were the most common failures.
- The data placement scheduler was made fault aware and retries until success (see the sketch below).
- Can tolerate system upgrades during processing: software had to be upgraded during operation.
- Avoids bad compute nodes.
- Persistent logging allows resuming from a whole-system crash.
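A hedged sketch of the retry and protocol-switching behaviour described above; the protocol names, retry counts and `attempt_transfer` hook are placeholders, not the actual implementation. Every state change is appended to a log on disk so the pipeline can be resumed after a whole-system crash.

```python
import json, pathlib, time

LOG = pathlib.Path("placement.log")          # persistent log, survives a machine crash

def log_state(job_id: str, state: str) -> None:
    with LOG.open("a") as f:
        f.write(json.dumps({"job": job_id, "state": state, "ts": time.time()}) + "\n")

def transfer_with_retries(job_id: str, attempt_transfer, protocols=("gridftp", "http"),
                          retries_per_protocol: int = 5, backoff_s: float = 30.0) -> None:
    """Retry until success; if one protocol keeps failing, switch to the next."""
    log_state(job_id, "started")
    for proto in protocols:
        for attempt in range(retries_per_protocol):
            try:
                attempt_transfer(job_id, proto)          # caller-supplied transfer hook
                log_state(job_id, "done")
                return
            except Exception as err:                     # e.g. a short network outage
                log_state(job_id, f"failed({proto}, try {attempt + 1}): {err}")
                time.sleep(backoff_s)                    # wait out transient failures
    log_state(job_id, "gave-up")
    raise RuntimeError(f"{job_id}: all protocols exhausted")
```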

Slide 10: Three Designs
- Some clusters have a stage area => Design 1: stage to the cluster stage area.
- Some clusters have a stage area and allow running computation there => Design 2: run a hierarchical buffer server in the stage area.
- No cluster stage area => Design 3: direct staging to the compute node.

Slide 11: Design 1 & 2: Using Stage Area
[Diagram: source → wide area → cluster stage area → compute nodes]

Slide 12: Design 1 versus Design 2
- Design 1 uses the default transfer protocol to move data from the stage area to the compute node.
  - Not scheduled: problems when the number of concurrent transfers increases.
- Design 2 runs a hierarchical buffer server at the stage node; a client at the compute node picks up the data (see the sketch below).
  - Scheduled.
  - Hierarchical buffer server crashes need to be handled.
- 25%-30% performance improvement in our current setting.
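For intuition, a toy sketch of the Design 2 pickup step: a client running on the compute node pulls its file from the stage-area buffer server. The length-prefixed wire protocol here is invented for illustration and is not the actual hierarchical buffer server protocol.

```python
import pathlib, socket, struct

CHUNK = 1 << 20  # stream in 1 MiB pieces

def _read_exact(sock: socket.socket, n: int) -> bytes:
    data = b""
    while len(data) < n:
        part = sock.recv(n - len(data))
        if not part:
            raise ConnectionError("short read from buffer server")
        data += part
    return data

def pick_up(host: str, port: int, name: str, dest: pathlib.Path) -> None:
    """Compute-node client: request `name` from the stage-area buffer server
    and stream it to local disk (toy protocol: name, then 8-byte size, then bytes)."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(name.encode() + b"\n")
        size = struct.unpack("!Q", _read_exact(sock, 8))[0]
        with dest.open("wb") as out:
            remaining = size
            while remaining:
                buf = sock.recv(min(CHUNK, remaining))
                if not buf:
                    raise ConnectionError("buffer server closed the connection early")
                out.write(buf)
                remaining -= len(buf)
```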

Slide 13: Design 3: Direct Staging
[Diagram: source → wide area → compute nodes, with no intermediate stage area]

Slide 14: Design 3
- Applicable when there is no stage area.
- Most flexible.
- CPU is wasted during the data transfer / needs additional features.
- Optimization is possible if transfer and compute times can be estimated (see the sketch below).
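The optimization can be made concrete with a simple model; the minute figures below are invented for illustration. Once per-file transfer and compute times can be estimated, prefetching the next file while the current one is processed drops the steady-state cost per file from transfer + compute to max(transfer, compute).

```python
def files_per_day(transfer_min: float, compute_min: float, overlap: bool) -> float:
    # Without overlap the CPU idles during the transfer; with prefetching the
    # steady-state cost per file is whichever phase is the bottleneck.
    per_file = max(transfer_min, compute_min) if overlap else transfer_min + compute_min
    return 24 * 60 / per_file

# Illustrative estimates only (not measured): 30 min to fetch a split file,
# 45 min to encode it on one compute node.
print(files_per_day(30, 45, overlap=False))   # 19.2 files/day per node
print(files_per_day(30, 45, overlap=True))    # 32.0 files/day per node
```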

Slide 15: WCER Video Pipeline
[Diagram: input data flows from WCER to the staging site @UW, where files are split; processing runs on the Condor pool @UW; output data flows back to WCER and to the SRB server @SDSC]

Slide 16: WCER Video Pipeline
- Data transfer protocols had a 2 GB file-size limit => split the files and rejoin them (see the sketch below).
- File-size limits with Linux video decoders => picked up a new decoder from CVS.
- File-system performance issues.
- Flaky network connectivity => got the network administrators to fix it.
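A minimal sketch of the split-and-rejoin workaround for the 2 GB limit; the chunk size, block size and naming scheme are illustrative, not the pipeline's actual tool.

```python
import pathlib

LIMIT = 2 * 1024**3 - 1024**2     # stay safely under the 2 GB protocol limit
BLOCK = 64 * 1024**2              # copy in 64 MiB blocks to keep memory use low

def split(src: pathlib.Path) -> list[pathlib.Path]:
    """Cut src into numbered pieces, each below the 2 GB transfer limit."""
    parts, i = [], 0
    with src.open("rb") as f:
        while True:
            part_path = src.with_name(f"{src.name}.part{i:03d}")
            written = 0
            with part_path.open("wb") as out:
                while written < LIMIT:
                    buf = f.read(min(BLOCK, LIMIT - written))
                    if not buf:
                        break
                    out.write(buf)
                    written += len(buf)
            if written == 0:                 # nothing left: drop the empty piece
                part_path.unlink()
                break
            parts.append(part_path)
            i += 1
    return parts

def rejoin(parts: list[pathlib.Path], dest: pathlib.Path) -> None:
    """Concatenate the numbered pieces back into the original file after transfer."""
    with dest.open("wb") as out:
        for part in sorted(parts):
            with part.open("rb") as f:
                for buf in iter(lambda: f.read(BLOCK), b""):
                    out.write(buf)
```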

Slide 17: WCER Video Pipeline

  Encoding  Resolution        File Size  Average Time
  MPEG-1    Half (320 x 240)  600 MB     2 hours
  MPEG-2    Full (720 x 480)  2 GB       8 hours
  MPEG-4    Half (320 x 240)  250 MB     4 hours

- Started processing in Jan 2004.
- DV video encoded to MPEG-1, MPEG-2 and MPEG-4.
- Has been a good test for data-intensive distributed computing.
- Fault-tolerance issues were the most important.
- In a real system, downtime for software upgrades should be taken into account.

Slide 18: How can I use this?
- Stork data placement scheduler: http://www.cs.wisc.edu/stork
- Dependency manager (DAGMan) enhanced with data placement (DaP) support.
- Condor/Condor-G distributed scheduler: http://www.cs.wisc.edu/condor
- Flexible DAG generator (see the sketch below).
- Pick our tools and you can perform data-intensive computing on commodity cluster/grid resources.
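As a rough illustration of what a generated DAG for one tape might look like: the file names are invented, JOB/PARENT/CHILD are standard DAGMan keywords, and the DATA lines follow the Stork-style data placement extension of DAGMan and should be read as a sketch rather than exact syntax.

```python
def make_dag(tape_id: str, formats=("mpeg1", "mpeg2", "mpeg4")) -> str:
    """Emit a DAGMan-style DAG: stage the tape in, encode it into each format,
    then stage each result out for off-site replication."""
    lines = [f"DATA stage_in {tape_id}.transfer_in.stork"]          # data placement job
    for fmt in formats:
        lines += [
            f"JOB encode_{fmt} {tape_id}.{fmt}.condor",             # computation job
            f"DATA stage_out_{fmt} {tape_id}.{fmt}.transfer_out.stork",
            f"PARENT stage_in CHILD encode_{fmt}",
            f"PARENT encode_{fmt} CHILD stage_out_{fmt}",
        ]
    return "\n".join(lines) + "\n"

# Example: print the DAG for a hypothetical tape.
print(make_dag("wcer_tape_0042"))
```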

Slide 19: Conclusion & Future Work
- Successfully processed and replicated several terabytes of video.
- Working on extending Design 3:
  - building a client-centric, data-aware distributed scheduler
  - deploying the new scheduler inside existing schedulers (an idea borrowed from virtual machines)

Slide 20: Questions
Contact: George Kola, kola@cs.wisc.edu
http://www.cs.wisc.edu/condor/didc

