
Twister: A Runtime for Iterative MapReduce. Jaliya Ekanayake, Community Grids Laboratory, Digital Science Center, Pervasive Technology Institute.


1 Twister: A Runtime for Iterative MapReduce
Jaliya Ekanayake
Community Grids Laboratory, Digital Science Center, Pervasive Technology Institute, Indiana University
HPDC 2010, MAPREDUCE’10 Workshop, Chicago, 06/22/2010

2 Acknowledgements to:
Co-authors: Hui Li, Bingjing Zhang, Thilina Gunarathne, Seung-Hee Bae, Judy Qiu, Geoffrey Fox
School of Informatics and Computing, Indiana University Bloomington
Team at IU

3 Motivation
Data Deluge: experienced in many domains
MapReduce: data centered, QoS
Classic Parallel Runtimes (MPI): efficient and proven techniques
Expand the applicability of MapReduce to more classes of applications
[Figure: Map-Only (input, map, output); MapReduce (input, map, reduce); Iterative MapReduce (input, map, reduce, iterations); More Extensions (Pij)]

4 Features of Existing Architectures (1)
Google, Apache Hadoop, Sector/Sphere, Dryad/DryadLINQ (DAG based)
Programming Model
– MapReduce (optionally “map-only”)
– Focus on single-step MapReduce computations (DryadLINQ supports more than one stage)
Input and Output Handling
– Distributed data access (HDFS in Hadoop, Sector in Sphere, and shared directories in Dryad)
– Outputs normally go to the distributed file systems
Intermediate Data
– Transferred via file systems (local disk -> HTTP -> local disk in Hadoop)
– Easy to support fault tolerance
– Considerably high latencies

5 Features of Existing Architectures (2)
Scheduling
– A master schedules tasks to slaves depending on their availability
– Dynamic scheduling in Hadoop, static scheduling in Dryad/DryadLINQ
– Natural load balancing
Fault Tolerance
– Data flows through disks -> channels -> disks
– A master keeps track of the data products
– Re-execution of failed or slow tasks
– Overheads are justifiable for large single-step MapReduce computations
– Iterative MapReduce

6 A Programming Model for Iterative MapReduce
Distributed data access
In-memory MapReduce
Distinction between static data and variable data (data flow vs. δ flow)
Cacheable map/reduce tasks (long-running tasks)
Combine operation
Support for fast intermediate data transfers
[Figure: User Program iterating over Configure(), Map(Key, Value), Reduce(Key, List), Combine(Map), Close(), with static data and δ flow]
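The split between static data and δ flow on this slide can be illustrated with a small in-process sketch. It is not Twister code: `CachedMapTask`, `run_iteration`, and the other names are hypothetical, and everything runs in a single Python process. It assumes a long-running map task is configured once with its static data and then receives only the small variable data in each iteration.

```python
from collections import defaultdict


class CachedMapTask:
    """Hypothetical long-running map task that keeps its static data in memory."""

    def __init__(self):
        self.static_data = None
        self.configure_calls = 0

    def configure(self, static_data):
        # Called once, before the first iteration: load the static data.
        self.static_data = list(static_data)
        self.configure_calls += 1

    def map(self, variable_data):
        # Called every iteration; only the small variable data (a scale factor) is passed in.
        yield "sum", sum(x * variable_data for x in self.static_data)


def reduce_values(key, values):
    # Reduce(Key, List): merge all values emitted for one key.
    return key, sum(values)


def combine(reduced_pairs):
    # Combine: collect the reduce outputs into one result for the user program.
    return dict(reduced_pairs)


def run_iteration(tasks, variable_data):
    grouped = defaultdict(list)
    for task in tasks:
        for key, value in task.map(variable_data):
            grouped[key].append(value)
    return combine(reduce_values(k, v) for k, v in grouped.items())


tasks = [CachedMapTask(), CachedMapTask()]
tasks[0].configure([1, 2])
tasks[1].configure([3, 4])

scale = 1.0
history = []
for _ in range(3):
    result = run_iteration(tasks, scale)
    history.append(result["sum"])
    scale = scale / 2  # the variable data changes between iterations
```

Each task is configured exactly once, while `map` runs in every iteration with a new `scale` value.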

7 Twister Programming Model
User program’s process space:
  configureMaps(..)
  configureReduce(..)
  while(condition){
    runMapReduce(..)
    updateCondition()
  } //end while
  close()
Two configuration options:
1. Using local disks (only for maps)
2. Using pub-sub bus
Communications/data transfers via the pub-sub broker network
May send pairs directly
[Figure: user program with the Combine() operation; Worker Nodes with Local Disk running cacheable map/reduce tasks (Map(), Reduce()); Iterations]
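The driver loop on this slide can be sketched with a one-dimensional K-means example. This is a minimal sketch under stated assumptions, not the Twister API: `ToyDriver` is a hypothetical in-process stand-in whose method names only mirror the calls on the slide, and the toy data guarantees that every center keeps at least one point.

```python
class ToyDriver:
    """Hypothetical in-process stand-in for an iterative MapReduce driver."""

    def __init__(self, map_fn, reduce_fn, combine_fn):
        self.map_fn, self.reduce_fn, self.combine_fn = map_fn, reduce_fn, combine_fn
        self.partitions = []

    def configure_maps(self, partitions):
        # Static data is handed to the map side once, before the loop.
        self.partitions = [list(p) for p in partitions]

    def run_map_reduce(self, variable_data):
        grouped = {}
        for part in self.partitions:
            for key, value in self.map_fn(part, variable_data):
                grouped.setdefault(key, []).append(value)
        return self.combine_fn([self.reduce_fn(k, vs) for k, vs in grouped.items()])

    def close(self):
        self.partitions = []


def kmeans_map(points, centers):
    # Emit (nearest center index, (partial sum, count)) for one partition.
    acc = {}
    for p in points:
        idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
        s, c = acc.get(idx, (0.0, 0))
        acc[idx] = (s + p, c + 1)
    return acc.items()


def kmeans_reduce(idx, partials):
    # Merge the partial sums and counts for one center into its new position.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return idx, total / count


def kmeans_combine(reduced):
    # Assumes every center received at least one point in this iteration.
    return [center for _, center in sorted(reduced)]


driver = ToyDriver(kmeans_map, kmeans_reduce, kmeans_combine)
driver.configure_maps([[1.0, 2.0, 9.0], [3.0, 10.0, 11.0]])

centers = [0.0, 5.0]
iterations = 0
while True:
    new_centers = driver.run_map_reduce(centers)
    iterations += 1
    converged = max(abs(a - b) for a, b in zip(new_centers, centers)) < 1e-9
    centers = new_centers  # plays the role of updateCondition() on the slide
    if converged:
        break
driver.close()
```

Only the current centers travel into each `run_map_reduce` call; the points themselves are loaded once in `configure_maps`.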

8 Twister Architecture
[Figure: a Master Node (Main Program, Twister Driver) connected through a Pub/sub Broker Network (brokers marked B) to Worker Nodes, each with a Twister Daemon, a Worker Pool of cacheable map and reduce tasks, and a Local Disk]
One broker serves several Twister daemons
Scripts perform data distribution, data collection, and partition file creation

9 Input/Output Handling
Data Manipulation Tool:
– Provides basic functionality to manipulate data across the local disks of the compute nodes
– Data partitions are assumed to be files (in contrast to fixed-size blocks in Hadoop)
– Supported commands: mkdir, rmdir, put, putall, get, ls, copy resources, create partition file
[Figure: the Data Manipulation Tool works on a common directory in the local disks of individual nodes (Node 0, Node 1, ..., Node n), e.g. /tmp/twister_data, and the Partition File]

10 Partition file allows duplicates
One data partition may reside on multiple nodes
In the event of a failure, the duplicates are used to reschedule the tasks

File No  Node IP        Daemon No  File partition path
4        156.56.104.96  2          /home/jaliya/data/mds/GD-4D-23.bin
5        156.56.104.96  2          /home/jaliya/data/mds/GD-4D-0.bin
6        156.56.104.96  2          /home/jaliya/data/mds/GD-4D-27.bin
7        156.56.104.96  2          /home/jaliya/data/mds/GD-4D-20.bin
8        156.56.104.97  4          /home/jaliya/data/mds/GD-4D-23.bin
9        156.56.104.97  4          /home/jaliya/data/mds/GD-4D-25.bin
10       156.56.104.97  4          /home/jaliya/data/mds/GD-4D-18.bin
11       156.56.104.97  4          /home/jaliya/data/mds/GD-4D-15.bin
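How duplicate entries could drive rescheduling can be sketched as follows. This is an illustration only: the whitespace-separated layout, `load_replicas`, and `pick_daemon` are assumptions rather than Twister's actual file format or API, and the three rows are taken from the table on this slide.

```python
from collections import defaultdict

# Rows as read from the slide's table; the whitespace-separated layout is an assumption.
PARTITION_FILE = """\
4 156.56.104.96 2 /home/jaliya/data/mds/GD-4D-23.bin
5 156.56.104.96 2 /home/jaliya/data/mds/GD-4D-0.bin
8 156.56.104.97 4 /home/jaliya/data/mds/GD-4D-23.bin
"""


def load_replicas(text):
    """Group the (node IP, daemon number) locations of each partition by its path."""
    replicas = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        _file_no, node_ip, daemon_no, path = line.split()
        replicas[path].append((node_ip, int(daemon_no)))
    return replicas


def pick_daemon(replicas, path, failed_daemons):
    """Return the first listed location whose daemon has not failed, or None."""
    for node_ip, daemon_no in replicas.get(path, []):
        if daemon_no not in failed_daemons:
            return node_ip, daemon_no
    return None


replicas = load_replicas(PARTITION_FILE)
primary = pick_daemon(replicas, "/home/jaliya/data/mds/GD-4D-23.bin", set())
fallback = pick_daemon(replicas, "/home/jaliya/data/mds/GD-4D-23.bin", {2})
unavailable = pick_daemon(replicas, "/home/jaliya/data/mds/GD-4D-0.bin", {2})
```

GD-4D-23.bin is listed under two daemons, so a task reading it can move to daemon 4 when daemon 2 fails. GD-4D-0.bin has a single listed location in this excerpt and cannot be rescheduled.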

11 The use of pub/sub messaging
Intermediate data is transferred via the broker network
Network of brokers used for load balancing
– Different broker topologies
Interspersed computation and data transfer minimizes large message load at the brokers
Currently supports
– NaradaBrokering
– ActiveMQ
[Figure: map workers and map task queues sending data through the broker network to Reduce()]
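The idea of routing intermediate data through topics can be sketched with a toy in-memory broker. It does not use the NaradaBrokering or ActiveMQ APIs; `ToyBroker`, the `reduce/<n>` topic names, and the byte-sum partitioner are all hypothetical choices made for this example.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory topic broker; delivery is synchronous and in order."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.subscribers[topic]:
            callback(message)


broker = ToyBroker()
num_reducers = 2

# Each reduce task subscribes to its own topic and buffers what it receives.
inboxes = {r: [] for r in range(num_reducers)}
for r in range(num_reducers):
    broker.subscribe(f"reduce/{r}", inboxes[r].append)


def map_worker(words):
    # Publish each pair as soon as it is produced, interleaving compute and transfer.
    for word in words:
        reducer = sum(word.encode()) % num_reducers  # deterministic toy partitioner
        broker.publish(f"reduce/{reducer}", (word, 1))


def reduce_inbox(pairs):
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


map_worker(["a", "b", "a"])
word_counts = {r: reduce_inbox(inboxes[r]) for r in range(num_reducers)}
```

Publishing pairs one at a time keeps individual messages small, which is the effect the slide attributes to interspersing computation and data transfer.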

12 Scheduling
Twister supports long-running tasks
Avoids unnecessary initializations in each iteration
Tasks are scheduled statically
– Supports task reuse
– May lead to inefficient resource utilization
Users are expected to randomize data distributions to minimize processing skew caused by skewness in the data
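One possible way to combine static scheduling with a randomized data distribution is sketched below. `static_schedule` is a hypothetical helper, not part of Twister: it shuffles the partition list once with a fixed seed and then assigns partitions round-robin, so the assignment stays the same across iterations.

```python
import random


def static_schedule(partitions, num_workers, seed=0):
    """Shuffle once, then assign round-robin; the mapping never changes afterwards."""
    shuffled = list(partitions)
    random.Random(seed).shuffle(shuffled)  # a seeded shuffle keeps the schedule reproducible
    assignment = {worker: [] for worker in range(num_workers)}
    for i, partition in enumerate(shuffled):
        assignment[i % num_workers].append(partition)
    return assignment


partitions = [f"part-{i}" for i in range(10)]
schedule = static_schedule(partitions, num_workers=3)
```

Because the mapping is fixed, each worker can keep its partitions cached between iterations. The shuffle only spreads neighbouring, possibly similar, partitions across workers and does not rebalance load at run time.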

13 Fault Tolerance
Recover at iteration boundaries
Does not handle individual task failures
Assumptions:
– Broker network is reliable
– Main program & Twister Driver have no failures
Any failure (hardware/daemons) results in the following fault handling sequence
– Terminate currently running tasks (remove from memory)
– Poll for currently available worker nodes (& daemons)
– Configure map/reduce using static data (re-assign data partitions to tasks depending on the data locality)
– Re-execute the failed iteration
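The recovery sequence above can be sketched as a retry loop around one iteration. This is a simplified illustration with hypothetical names (`WorkerFailure`, `run_with_recovery`): polling for available workers and terminating tasks are folded into a single `configure` callback, and the driver itself is assumed never to fail, as on the slide.

```python
class WorkerFailure(Exception):
    """Hypothetical signal that a worker node or daemon was lost during an iteration."""


def run_with_recovery(run_iteration, configure, state, num_iterations, max_retries=3):
    configure()  # initial configuration with static data
    i = 0
    retries = 0
    while i < num_iterations:
        try:
            new_state = run_iteration(i, state)
        except WorkerFailure:
            retries += 1
            if retries > max_retries:
                raise
            configure()  # reconfigure map/reduce tasks on the workers still available
            continue     # re-execute the failed iteration from its starting state
        state = new_state  # the iteration boundary: results are accepted only here
        i += 1
        retries = 0
    return state


log = {"configure": 0, "failed_once": False}


def configure():
    log["configure"] += 1


def flaky_iteration(i, state):
    # Simulate a daemon failure the first time iteration 1 runs.
    if i == 1 and not log["failed_once"]:
        log["failed_once"] = True
        raise WorkerFailure("daemon lost")
    return state + 1


final_state = run_with_recovery(flaky_iteration, configure, 0, 3)
```

State only advances after an iteration completes, so a failure discards nothing but the work of the iteration in progress.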

14 Performance Evaluation
Hardware Configurations

Cluster ID            Cluster-I  Cluster-II
# nodes               32         230
# CPUs in each node   6          2
# Cores in each CPU   8          4
Total CPU cores       768        1840
Supported OSs (Cluster-I): Linux (Red Hat Enterprise Linux Server release 5.4, 64 bit), Windows (Windows Server 2008, 64 bit)
Supported OSs (Cluster-II): Red Hat Enterprise Linux Server release 5.4, 64 bit

We use the academic release of DryadLINQ, Apache Hadoop version 0.20.2, and Twister for our performance comparisons. Both Twister and Hadoop use JDK (64 bit) version 1.6.0_18, while DryadLINQ and MPI use Microsoft .NET version 3.5.

15 Pagerank – An Iterative MapReduce Algorithm
Well-known pagerank algorithm [1]
Used ClueWeb09 [2] (1TB in size) from CMU
Reuse of map tasks and faster communication pays off
[Figure: Iterations over M (inputs: Current Page ranks (Compressed), Partial Adjacency Matrix; output: Partial Updates), R (Partially merged Updates), and C]
[1] Pagerank Algorithm, http://en.wikipedia.org/wiki/PageRank
[2] ClueWeb09 Data Set, http://boston.lti.cs.cmu.edu/Data/clueweb09/
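The iteration structure on this slide can be sketched on a three-page toy graph. This is an illustrative single-process version, not the implementation evaluated in the talk: the function names, the damping factor of 0.85, and the assumption that every page has at least one outgoing link are all choices made for the example.

```python
def pagerank_map(partial_adjacency, ranks):
    """Map: use one partition of the adjacency data to emit partial rank updates."""
    updates = {}
    for page, links in partial_adjacency.items():
        share = ranks[page] / len(links)  # assumes no page has zero out-links
        for target in links:
            updates[target] = updates.get(target, 0.0) + share
    return updates


def pagerank_reduce(partial_updates):
    """Reduce/combine: merge the partial updates from all map tasks."""
    merged = {}
    for updates in partial_updates:
        for page, value in updates.items():
            merged[page] = merged.get(page, 0.0) + value
    return merged


def pagerank(partitions, num_pages, damping=0.85, iterations=50):
    ranks = {p: 1.0 / num_pages for p in range(num_pages)}
    for _ in range(iterations):
        # The adjacency partitions are static; only the ranks change per iteration.
        partials = [pagerank_map(part, ranks) for part in partitions]
        merged = pagerank_reduce(partials)
        ranks = {
            p: (1.0 - damping) / num_pages + damping * merged.get(p, 0.0)
            for p in range(num_pages)
        }
    return ranks


# Toy graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0, split into two static partitions.
partitions = [{0: [1, 2]}, {1: [2], 2: [0]}]
ranks = pagerank(partitions, num_pages=3)
```

The adjacency partitions play the role of static data that a cached map task would load once, while the rank vector is the small variable data sent in each iteration.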

16 Conclusions & Future Work
Twister extends MapReduce to iterative algorithms
Several iterative algorithms we have implemented
– K-Means Clustering
– Pagerank
– Matrix Multiplication
– Multidimensional scaling (MDS)
– Breadth First Search
Future work
– Integrating a distributed file system
– Programming with side effects while still supporting fault tolerance

