SALSA IU Twister Supports Data Intensive Science Applications. School of Informatics and Computing, Indiana University.

Presentation transcript:

SALSA IU Twister Supports Data Intensive Science Applications. School of Informatics and Computing, Indiana University.

SALSA Application Classes (an older classification of parallel software/hardware in terms of five, now becoming six, "application architecture" structures):
1. Synchronous: lockstep operation as in SIMD architectures. (SIMD)
2. Loosely Synchronous: iterative compute-communication stages with independent compute (map) operations for each CPU; the heart of most MPI jobs. (MPP)
3. Asynchronous: computer chess and combinatorial search, often supported by dynamic threads. (MPP)
4. Pleasingly Parallel: each component independent; in 1988 Fox estimated this class at 20% of the total number of applications. (Grids)
5. Metaproblems: coarse-grain (asynchronous) combinations of classes 1-4; the preserve of workflow. (Grids)
6. MapReduce++: file (or database) to file (or database) operations, with subcategories: (a) pleasingly parallel map-only, (b) map followed by reductions, (c) iterative "map followed by reductions", an extension of current technologies that supports much linear algebra and data mining. (Clouds: Hadoop/Dryad, Twister)

SALSA Applications and Different Interconnection Patterns:
- Map Only (Input -> map -> Output): CAP3 analysis, document conversion (PDF -> HTML), brute-force searches in cryptography, parametric sweeps; examples include CAP3 gene assembly and PolarGrid MATLAB data analysis.
- Classic MapReduce (Input -> map -> reduce): High Energy Physics (HEP) histograms, SWG gene alignment, distributed search, distributed sorting, information retrieval; examples include information retrieval, HEP data analysis, and calculation of pairwise distances for ALU sequences.
- Iterative Reductions, MapReduce++ (Input -> map -> reduce, iterated): expectation-maximization algorithms, clustering, linear algebra; examples include K-means, deterministic annealing clustering, and multidimensional scaling (MDS).
- Loosely Synchronous (Pij exchanges, as in MPI): many MPI scientific applications utilizing a wide variety of communication constructs, including local interactions; examples include solving differential equations and particle dynamics with short-range forces.
The first three patterns are the domain of MapReduce and its iterative extensions; the last is the domain of MPI.

SALSA Motivation: the data deluge, experienced in many domains, meets MapReduce (data centered, with QoS) on one side and classic parallel runtimes such as MPI (efficient and proven techniques) on the other. The goal is to expand the applicability of MapReduce to more classes of applications, moving from map-only and classic MapReduce to iterative MapReduce and further extensions (map-only: Input -> map -> Output; MapReduce: Input -> map -> reduce; iterative MapReduce: Input -> map -> reduce with iterations; MPI: Pij exchanges).

SALSA Twister (MapReduce++):
- Streaming-based communication: intermediate results are transferred directly from the map tasks to the reduce tasks, eliminating local files.
- Cacheable map/reduce tasks: static data remains in memory.
- A combine phase to combine reductions.
- The user program is the composer of MapReduce computations.
- Extends the MapReduce model to iterative computations.
[Architecture figure: the user program and MapReduce driver on the master side talk to map/reduce workers and MR daemons on the worker nodes through a pub/sub broker network and the file system; the iteration loops through Configure(), Map(Key, Value), Reduce(Key, List<Value>), Combine(Key, List<Value>), and Close(), with static data cached in the tasks and only the δ flow moving each iteration. Different synchronization and intercommunication mechanisms are used by the parallel runtimes.]

SALSA Twister New Release

SALSA TwisterMPIReduce: a runtime package supporting a subset of MPI (set-up, barrier, broadcast, reduce) mapped to Twister. [Stack figure: applications such as pairwise clustering (MPI), multidimensional scaling (MPI), generative topographic mapping (MPI), and others run on TwisterMPIReduce, which maps onto Azure Twister (C#/C++) or Java Twister, running on Microsoft Azure, FutureGrid, local clusters, and Amazon EC2.]

SALSA Iterative Computations: K-means and matrix multiplication. [Performance figures: K-means, matrix multiplication, and Smith Waterman.]

SALSA A Programming Model for Iterative MapReduce:
- Distributed data access
- In-memory MapReduce
- Distinction between static data and variable data (data flow vs. δ flow)
- Cacheable map/reduce tasks (long-running tasks)
- Combine operation
- Support for fast intermediate data transfers
[Figure: the user program drives the loop Configure() -> Map(Key, Value) -> Reduce(Key, List<Value>) -> Combine(Map<Key, Value>) -> iterate -> Close(), with static data cached and only the δ flow moving each iteration.]
Twister constraints: side-effect-free map/reduce tasks, and computation complexity >> complexity of the size of the mutant data (state).

SALSA Iterative MapReduce using Existing Runtimes: existing runtimes focus mainly on single-step map -> reduce computations, so iterating adds considerable overheads from reinitializing tasks, reloading static data, and communication and data transfers. [Figure: the main program drives Map(Key, Value) and Reduce(Key, List<Value>) each iteration; static data is loaded in every iteration, variable data is passed via e.g. the Hadoop distributed cache, intermediate data travels local disk -> HTTP -> local disk, reduce outputs are saved into multiple files, and new map/reduce tasks are created in every iteration.]
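To make these overheads concrete, here is a minimal sketch (not taken from the slides) of how an iterative algorithm is typically driven on a classic runtime such as Hadoop: a brand-new job, and therefore new map/reduce tasks, is submitted in every iteration, the static input is re-read from the distributed file system, and the variable data travels through the distributed cache and the per-iteration output directories. The driver uses standard Hadoop 2.x calls; the identity Mapper and Reducer stand in for application logic, and the paths and iteration count are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NaiveIterativeDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path staticData = new Path(args[0]);     // large, unchanging input
        Path variableData = new Path(args[1]);   // e.g. the initial cluster centers
        for (int i = 0; i < 10; i++) {           // a real driver would loop until convergence
            Job job = Job.getInstance(conf, "iteration-" + i);
            job.setJarByClass(NaiveIterativeDriver.class);
            job.setMapperClass(Mapper.class);    // identity; real logic would subclass these
            job.setReducerClass(Reducer.class);
            // New tasks are initialized and the static data is reloaded from HDFS every iteration.
            FileInputFormat.addInputPath(job, staticData);
            // Variable data is shipped to the tasks via the distributed cache.
            job.addCacheFile(variableData.toUri());
            Path out = new Path("iteration-out-" + i);
            FileOutputFormat.setOutputPath(job, out);
            job.waitForCompletion(true);
            // Reduce outputs are written to HDFS (local disk -> HTTP -> local disk on the way)
            // and read back as the variable data of the next iteration.
            variableData = out;
        }
    }
}
```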

SALSA Features of Existing Architectures (1): Google MapReduce, Apache Hadoop, Sector/Sphere, and Dryad/DryadLINQ (DAG based).
- Programming model: MapReduce (optionally map-only), focused on single-step MapReduce computations (DryadLINQ supports more than one stage).
- Input and output handling: distributed data access (HDFS in Hadoop, Sector in Sphere, shared directories in Dryad); outputs normally go to the distributed file systems.
- Intermediate data: transferred via file systems (local disk -> HTTP -> local disk in Hadoop); this makes fault tolerance easy to support, but incurs considerably high latencies.

SALSA Features of Existing Architectures (2):
- Scheduling: a master schedules tasks to slaves depending on their availability; dynamic scheduling in Hadoop, static scheduling in Dryad/DryadLINQ; naturally load balancing.
- Fault tolerance: data flows through disks -> channels -> disks, a master keeps track of the data products, and failed or slow tasks are re-executed. These overheads are justifiable for large single-step MapReduce computations, but not for iterative MapReduce.

SALSA Iterative MapReduce using Twister: [Figure: the main program drives the iteration through Configure(), Map(Key, Value), Reduce(Key, List<Value>), and Combine(Map<Key, Value>); static data is loaded only once into long-running (cached) map/reduce tasks, data is transferred directly via pub/sub, and a combiner operation collects all reduce outputs.] The key features:
- Distributed data access
- Distinction between static data and variable data (data flow vs. δ flow)
- Cacheable map/reduce tasks (long-running tasks)
- Combine operation
- Support for fast intermediate data transfers

SALSA Twister Architecture: [Figure: the master node runs the Twister driver and the main program; each worker node runs a Twister daemon with a worker pool of cacheable map/reduce tasks and a local disk; master and workers communicate through a pub/sub broker network, with one broker serving several Twister daemons.] Scripts perform data distribution, data collection, and partition file creation.

SALSA Twister Programming Model: the user program calls configureMaps(..) and configureReduce(..) once, then runs runMapReduce(..) inside a while(condition) loop, calling updateCondition() at the end of each iteration and close() when done. There are two configuration options for static data: (1) using local disks (only for maps) and (2) using the pub-sub bus. Map() and Reduce() run in cacheable map/reduce tasks on the worker nodes (with access to local disks), the Combine() operation returns results to the user program's process space, and communications and data transfers go via the pub-sub broker network (key-value pairs may also be sent directly).

SALSA Twister API:
1. configureMaps(PartitionFile partitionFile)
2. configureMaps(Value[] values)
3. configureReduce(Value[] values)
4. runMapReduce()
5. runMapReduce(KeyValue[] keyValues)
6. runMapReduceBCast(Value value)
7. map(MapOutputCollector collector, Key key, Value val)
8. reduce(ReduceOutputCollector collector, Key key, List<Value> values)
9. combine(Map<Key, Value> keyValues)
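For readability, the nine calls above can be transcribed as Java declarations. The interface and stub types below are an illustrative reconstruction, not Twister's actual class definitions: the method names and parameter lists come directly from the list, while the return types, the generic parameters on List and Map, and the collect() method on the collectors are assumptions.

```java
import java.util.List;
import java.util.Map;

// Stub types standing in for Twister's real classes (assumptions).
class Value {}           // opaque serializable value
class Key {}             // opaque key
class KeyValue {}        // key-value pair handed to the maps as variable data
class PartitionFile {}   // describes the static data partitions on worker local disks
interface MapOutputCollector    { void collect(Key k, Value v); }
interface ReduceOutputCollector { void collect(Key k, Value v); }

// Driver-side calls (items 1-6 of the list).
interface TwisterJob {
    void configureMaps(PartitionFile partitionFile); // cache file partitions in long-running map tasks
    void configureMaps(Value[] values);              // or push static values directly
    void configureReduce(Value[] values);            // static data for the reduce tasks
    void runMapReduce();                             // one iteration without new variable data
    void runMapReduce(KeyValue[] keyValues);         // one iteration with per-task variable data
    void runMapReduceBCast(Value value);             // broadcast the same variable data (the δ flow) to all maps
}

// User-supplied task hooks (items 7-9 of the list).
interface MapTask     { void map(MapOutputCollector collector, Key key, Value val); }
interface ReduceTask  { void reduce(ReduceOutputCollector collector, Key key, List<Value> values); }
interface CombineTask { void combine(Map<Key, Value> keyValues); }
```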

SALSA Input/Output Handling: a Data Manipulation Tool provides basic functionality to manipulate data across the local disks of the compute nodes. Data partitions are assumed to be files (in contrast to the fixed-size blocks in Hadoop). Supported commands: mkdir, rmdir, put, putall, get, ls, copy resources, and create partition file. [Figure: the tool moves data into a common directory in the local disks of the individual nodes, e.g. /tmp/twister_data on node 0 through node n, and creates the partition file.]

SALSA Partition File: the partition file allows duplicates, so one data partition may reside in multiple nodes; in the event of a failure, the duplicates are used to re-schedule the tasks. Each entry records a file number, node IP, daemon number, and file partition path; the example lists paths such as /home/jaliya/data/mds/GD-4D-23.bin, GD-4D-0.bin, GD-4D-27.bin, GD-4D-20.bin, GD-4D-25.bin, GD-4D-18.bin, and GD-4D-15.bin, with GD-4D-23.bin appearing twice (on two nodes).

SALSA The Use of Pub/Sub Messaging: intermediate data is transferred via the broker network, and the network of brokers is used for load balancing (different broker topologies are possible). Interspersed computation and data transfer minimizes the large-message load at the brokers; for example, with 100 map tasks and 10 workers on 10 nodes, only about 10 tasks are producing outputs at once. Currently supported brokers: NaradaBrokering and ActiveMQ. [Figure: map workers feed the broker network from the map task queues, and the brokers deliver to Reduce().]
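As an illustration of what "intermediate data transferred via the broker network" means in practice, the sketch below publishes one serialized map output to a topic using ActiveMQ's standard JMS API (ActiveMQ being one of the two brokers listed above). This is not Twister's internal transport code: the broker URL, the topic naming, and the idea of serializing each key-value pair into a byte array beforehand are assumptions made for the example.

```java
import javax.jms.BytesMessage;
import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.Topic;
import org.apache.activemq.ActiveMQConnectionFactory;

public class IntermediateDataPublisher {
    // Publish one serialized (key, value) payload produced by a map task to a broker topic.
    public static void publish(String brokerUrl, String topicName, byte[] payload) throws Exception {
        Connection connection = new ActiveMQConnectionFactory(brokerUrl).createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic(topicName);        // e.g. one topic per reduce task (assumption)
        MessageProducer producer = session.createProducer(topic);
        BytesMessage message = session.createBytesMessage();
        message.writeBytes(payload);                         // the serialized map output
        producer.send(message);                              // the broker delivers it to the reducer side
        connection.close();
    }
}
```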

SALSA Twister Applications: Twister extends MapReduce to iterative algorithms. Iterative algorithms implemented so far: matrix multiplication, K-means clustering, PageRank, breadth-first search, and multidimensional scaling (MDS). Non-iterative applications: HEP histograms, biology all-pairs comparison using the Smith-Waterman-Gotoh algorithm, and Twister BLAST.

SALSA High Energy Physics Data Analysis: an application analyzing data from the Large Hadron Collider (1 TB now, eventually around 100 petabytes).
- Input to a map task: key = some id, value = HEP file name.
- Output of a map task: key = a random number in [0, max reduce tasks), value = a histogram as binary data.
- Input to a reduce task: key = a random number in [0, max reduce tasks), value = a list of histograms as binary data.
- Output from a reduce task: value = a histogram file.
- The outputs from the reduce tasks are combined to form the final histogram.
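A plain-Java sketch of this dataflow, independent of any runtime API: the analyze() stub stands in for the ROOT histogramming of one HEP file, the number of bins and the MAX_REDUCE_TASKS value are arbitrary placeholders, and the reduce step simply adds the partial histograms bin by bin (the final combine step merges the reduce outputs the same way).

```java
import java.util.List;
import java.util.Random;

// Map side: analyze one HEP file and emit (random reduce key, partial histogram).
class HepMap {
    static final int MAX_REDUCE_TASKS = 16;              // placeholder for the job configuration
    static Object[] map(String hepFileName) {
        double[] histogram = analyze(hepFileName);        // stub for the ROOT analysis of the file
        int key = new Random().nextInt(MAX_REDUCE_TASKS); // key = random # in [0, max reduce tasks)
        return new Object[] { key, histogram };           // value = histogram as binary data
    }
    static double[] analyze(String fileName) { return new double[1000]; } // stub
}

// Reduce side: merge all partial histograms that arrived under one key.
class HepReduce {
    static double[] reduce(List<double[]> partialHistograms) {
        double[] merged = new double[partialHistograms.get(0).length];
        for (double[] h : partialHistograms)
            for (int bin = 0; bin < h.length; bin++) merged[bin] += h[bin];
        return merged;                                    // written out as a histogram file
    }
}
```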

SALSA Reduce Phase of Particle Physics: "Find the Higgs" using Dryad. Histograms produced by separate ROOT "maps" (of event data to partial histograms) are combined into a single histogram delivered to the client; this is an example of using MapReduce to do distributed histogramming. [Figure: the Higgs peak in Monte Carlo data.]

SALSA All-Pairs using DryadLINQ: calculate pairwise distances (Smith-Waterman-Gotoh) for a collection of genes (used for clustering and MDS); 125 million distances were computed in 4 hours and 46 minutes on 768 cores (the Tempest cluster). The tasks are fine grained in MPI and coarse grained in DryadLINQ. Reference: Moretti, C., Bui, H., Hollingsworth, K., Rich, B., Flynn, P., & Thain, D. (2009). All-Pairs: An Abstraction for Data Intensive Computing on Campus Grids. IEEE Transactions on Parallel and Distributed Systems, 21.

SALSA Dryad versus MPI for Smith Waterman. [Performance figure: a flat curve indicates perfect scaling.]

SALSA Pairwise Sequence Comparison using Smith-Waterman-Gotoh: a typical MapReduce computation; the runtimes show comparable efficiencies, with Twister performing best.

SALSA K-Means Clustering: points are distributed in an n-dimensional space; the task is to identify a given number of cluster centers, associate points with cluster centers using Euclidean distance, and refine the cluster centers iteratively.

SALSA K-Means Clustering with MapReduce: each map task processes one data partition; it calculates the Euclidean distance from each point in its partition to each cluster center, assigns the points to cluster centers, sums the partial cluster center values, and emits the cluster center sums together with the number of points assigned. The reduce task sums all the corresponding partial sums and calculates the new cluster centers. The main program loops while the stopping condition is not met, passing the nth set of cluster centers to the maps and receiving the (n+1)th set.
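A runtime-agnostic sketch of the per-iteration arithmetic just described: each map task produces, for its partition, the per-center coordinate sums plus the count of assigned points (kept in an extra column), and the reduce step adds the partial results and divides to obtain the new centers. The array layouts are assumptions chosen for brevity.

```java
import java.util.List;

// Map: assign each point in one partition to its nearest center and accumulate partial sums.
class KMeansMap {
    static double[][] map(double[][] points, double[][] centers) {
        int k = centers.length, d = centers[0].length;
        double[][] partial = new double[k][d + 1];        // last column holds the point count
        for (double[] p : points) {
            int best = 0;
            double bestDist = Double.MAX_VALUE;
            for (int c = 0; c < k; c++) {
                double dist = 0;
                for (int j = 0; j < d; j++) dist += (p[j] - centers[c][j]) * (p[j] - centers[c][j]);
                if (dist < bestDist) { bestDist = dist; best = c; }
            }
            for (int j = 0; j < d; j++) partial[best][j] += p[j];
            partial[best][d] += 1;                        // number of points assigned to this center
        }
        return partial;                                   // emitted to the reduce task
    }
}

// Reduce: sum the partial sums from all map tasks and compute the (n+1)th cluster centers.
class KMeansReduce {
    static double[][] reduce(List<double[][]> partials) {
        int k = partials.get(0).length, d = partials.get(0)[0].length - 1;
        double[][] total = new double[k][d + 1];
        for (double[][] p : partials)
            for (int c = 0; c < k; c++)
                for (int j = 0; j <= d; j++) total[c][j] += p[c][j];
        double[][] newCenters = new double[k][d];
        for (int c = 0; c < k; c++)
            for (int j = 0; j < d; j++)
                newCenters[c][j] = total[c][d] > 0 ? total[c][j] / total[c][d] : 0;
        return newCenters;                                // fed back to the maps in the next iteration
    }
}
```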

SALSA PageRank, an Iterative MapReduce Algorithm: the well-known PageRank algorithm [1] applied to the ClueWeb09 data set [2] from CMU (1 TB in size). Reuse of map tasks and faster communication pays off. [Figure: each iteration broadcasts the current (compressed) page ranks to map tasks holding partial adjacency matrices; the maps emit partial updates, which are partially merged and combined into the next rank vector.] [1] The PageRank algorithm. [2] The ClueWeb09 data set.
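A simplified sketch of one PageRank iteration following the figure: the cached partial adjacency matrix (here an out-link table) is the static data, the current rank vector is the broadcast variable data, and the partial updates emitted by the maps are merged, with damping applied, into the next rank vector. The data structures and the damping handling are assumptions; the actual implementation uses a compressed representation of the ranks and adjacency data.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Map: a long-running task caches its partial adjacency matrix (static data) and
// receives the current page ranks (variable data) each iteration.
class PageRankMap {
    Map<Integer, int[]> outLinks = new HashMap<>();        // page id -> ids of pages it links to (cached)
    Map<Integer, Double> map(double[] currentRanks) {
        Map<Integer, Double> partial = new HashMap<>();
        for (Map.Entry<Integer, int[]> e : outLinks.entrySet()) {
            if (e.getValue().length == 0) continue;        // dangling page: skipped in this sketch
            double share = currentRanks[e.getKey()] / e.getValue().length;
            for (int target : e.getValue())
                partial.merge(target, share, Double::sum); // partial update for the target page
        }
        return partial;
    }
}

// Reduce/combine: merge the partial updates into the next rank vector, applying damping.
class PageRankReduce {
    static double[] merge(List<Map<Integer, Double>> partials, int numPages, double damping) {
        double[] next = new double[numPages];
        for (Map<Integer, Double> p : partials)
            for (Map.Entry<Integer, Double> e : p.entrySet()) next[e.getKey()] += e.getValue();
        for (int i = 0; i < numPages; i++) next[i] = (1 - damping) / numPages + damping * next[i];
        return next;                                       // broadcast to the maps in the next iteration
    }
}
```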

SALSA Multi-dimensional Scaling (MDS): maps high-dimensional data to lower dimensions (typically 2D or 3D) using SMACOF (Scaling by MAjorizing a COmplicated Function) [1]. The sequential iteration, while(condition) { X = [A][B]X; C = CalcStress(X) }, is parallelized in Twister as three MapReduce computations per iteration: while(condition) { X' = MapReduce1([B], X); X = MapReduce2([A], X'); C = MapReduce3(X) }. [1] J. de Leeuw, "Applications of convex analysis to multidimensional scaling," Recent Developments in Statistics.
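For reference, the textbook form of the SMACOF iteration that the pseudocode above parallelizes is shown below; this is the standard statement of the algorithm, not copied from the slide, and the bracketed [A] and [B] of the slide presumably correspond to the constant and configuration-dependent matrices in the update.

```latex
% Stress to be minimized over the low-dimensional configuration X
\sigma(X) = \sum_{i<j} w_{ij}\,\bigl(d_{ij}(X) - \delta_{ij}\bigr)^{2}
% Guttman-transform (majorization) update applied each iteration
X^{(k)} = V^{+}\, B\!\bigl(X^{(k-1)}\bigr)\, X^{(k-1)}
```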

SALSA [Performance figure: runs of 343 iterations on 768 CPU cores, 968 iterations on 384 CPU cores, and 2916 iterations on 384 CPU cores.]

SALSA Future Work on Twister: integrating a distributed file system, integrating with a high-performance messaging system, and supporting programming with side effects while still providing fault tolerance.

SALSA NCSA Summer School Workshop, July 26-30, 2010: students from the University of Arkansas, Indiana University, University of California at Los Angeles, Penn State, Iowa State University, University of Illinois at Chicago, University of Minnesota, Michigan State, Notre Dame, University of Texas at El Paso, IBM Almaden Research Center, Washington University, San Diego Supercomputer Center, University of Florida, and Johns Hopkins learned about Twister and Hadoop MapReduce technologies, supported by FutureGrid.
