Cloud Computing CS 15-319: Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Today… Last session: Pregel. Today's session: Dryad and GraphLab. Announcement: Project Phases I-A and I-B are due today. © Carnegie Mellon University in Qatar

Objectives Last 3 sessions: why parallelism? Parallel computer architectures, traditional models of parallel programming, examples of parallel processing, Message Passing Interface (MPI), and MapReduce. Today's session: discussion on programming models – Pregel, Dryad and GraphLab. © Carnegie Mellon University in Qatar

Dryad © Carnegie Mellon University in Qatar

Dryad In this part, the following concepts of Dryad will be described: Dryad Model Dryad Organization Dryad Description Language and An Example Program Fault Tolerance in Dryad © Carnegie Mellon University in Qatar

Dryad Dryad is a general-purpose, high-performance, distributed computation engine. Dryad is designed for: high-throughput, data-parallel computation, and use in a private datacenter. Computation is expressed as a directed acyclic graph (DAG): vertices represent programs, and edges represent data channels between vertices. © Carnegie Mellon University in Qatar

Unix Pipes vs. Dryad DAG © Carnegie Mellon University in Qatar

Dryad Job Structure This is the basic Dryad terminology: the Unix pipeline grep | sed | sort | awk | perl becomes the Dryad job grep^1000 | sed^500 | sort^1000 | awk^500 | perl^50, in which each command is scaled out to many vertices (processes). Vertices are grouped into stages and connected by channels, which read from input files and write to output files. © Carnegie Mellon University in Qatar

Dryad In this part, the following concepts of Dryad will be described: Dryad Model Dryad Organization Dryad Description Language and An Example Program Fault Tolerance in Dryad © Carnegie Mellon University in Qatar

Dryad System Organization There are three roles for machines in Dryad: Job Manager (JM), Name Server (NS), and Daemon (D). The JM can run within the cluster or on a user's workstation with network access to the cluster. It contains the application-specific code to construct the job's communication graph, along with library code to schedule the work across the available resources. All data is sent directly between vertices; the JM is therefore only responsible for control decisions and is not a bottleneck for any data transfers. The NS can be used to enumerate all the available computers. The NS also exposes the position of each computer within the network topology so that scheduling decisions can take account of locality. A D is responsible for creating processes on behalf of the JM. The binary of a vertex (V) is sent from the JM to a D for execution. © Carnegie Mellon University in Qatar

Program Execution (1) The Job Manager (JM): Creates the job communication graph (job schedule). Contacts the NS to determine the number of Ds and the topology. Assigns Vs to Ds (using a simple task scheduler, not described here) for execution. Coordinates data flow through the data plane. Data is distributed using a distributed storage system that shares some properties with the Google File System (e.g., data is split into chunks and replicated across machines). Dryad also supports the use of NTFS for accessing files locally. © Carnegie Mellon University in Qatar

Program Execution (2) Staging a computation proceeds through the following steps: 1. Build the graph. 2. Send the .exe (vertex code). 3. Start the JM. 4. Query cluster resources. 5. Generate the graph. 6. Initialize vertices. 7. Serialize vertices. 8. Monitor vertex execution. © Carnegie Mellon University in Qatar

Data Channels in Dryad Data items can be shuffled between vertices through data channels. Data channels can be: shared-memory FIFOs (intra-machine), TCP streams (inter-machine), SMB/NTFS local files (temporary), or a distributed file system (persistent). The performance and fault tolerance of these mechanisms vary. Data channels are abstracted for maximum flexibility. © Carnegie Mellon University in Qatar
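To make the channel abstraction concrete, here is a minimal, purely illustrative C++ sketch (not Dryad's actual API; all class and method names are hypothetical) of a channel interface with two of the transport choices listed above. The vertex logic reads and writes items without knowing which transport is underneath.

// Illustrative sketch only – not Dryad's channel API; names are hypothetical.
#include <cstdio>
#include <deque>
#include <fstream>
#include <memory>
#include <string>

// A vertex reads items from input channels and writes items to output channels
// without knowing whether the bytes move through memory, TCP, or a file.
struct Channel {
    virtual ~Channel() = default;
    virtual void write(const std::string& item) = 0;
    virtual bool read(std::string& item) = 0;  // returns false when the channel is drained
};

// Intra-machine channel: a shared-memory FIFO (modeled here as an in-process queue).
struct SharedMemoryFifo : Channel {
    std::deque<std::string> items;
    void write(const std::string& item) override { items.push_back(item); }
    bool read(std::string& item) override {
        if (items.empty()) return false;
        item = items.front();
        items.pop_front();
        return true;
    }
};

// Temporary local-file channel (standing in for the SMB/NTFS case above).
struct LocalFileChannel : Channel {
    explicit LocalFileChannel(const std::string& path) : path_(path), out_(path) {}
    void write(const std::string& item) override { out_ << item << '\n'; }
    bool read(std::string& item) override {
        if (!in_.is_open()) { out_.close(); in_.open(path_); }
        return static_cast<bool>(std::getline(in_, item));
    }
    std::string path_;
    std::ofstream out_;
    std::ifstream in_;
};

int main() {
    // Swap in LocalFileChannel("tmp.txt") here and the consuming loop is unchanged.
    std::unique_ptr<Channel> ch = std::make_unique<SharedMemoryFifo>();
    ch->write("record-1");
    ch->write("record-2");
    std::string item;
    while (ch->read(item)) std::printf("consumed %s\n", item.c_str());
    return 0;
}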

Dryad In this part, the following concepts of Dryad will be described: Dryad Model Dryad Organization Dryad Description Language and An Example Program Fault Tolerance in Dryad © Carnegie Mellon University in Qatar

Dryad Graph Description Language Here are some operators in the Dryad graph description language: AS = A^n (cloning: n copies of vertex A form stage AS), AS >= BS (pointwise composition), AS >> BS (bipartite composition), and (B >= C) || (B >= D) (merge). © Carnegie Mellon University in Qatar
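The difference between pointwise and bipartite composition is just the edge pattern generated between two stages. The following C++ sketch spells that out; it is purely illustrative and is not Dryad's GraphBuilder library, and the wrap-around rule for unequal stage sizes is an assumption made for the example.

// Illustrative sketch only – not Dryad's GraphBuilder library.
#include <cstdio>
#include <utility>
#include <vector>

using EdgeList = std::vector<std::pair<int, int>>;  // (source vertex, destination vertex)

// AS >= BS (pointwise composition): vertex i of AS feeds vertex i of BS;
// here we assume a simple wrap-around when the stages have different sizes.
EdgeList pointwise(int a, int b) {
    EdgeList edges;
    int n = a > b ? a : b;
    for (int i = 0; i < n; ++i) edges.push_back({i % a, i % b});
    return edges;
}

// AS >> BS (bipartite composition): every vertex of AS feeds every vertex of BS.
EdgeList bipartite(int a, int b) {
    EdgeList edges;
    for (int i = 0; i < a; ++i)
        for (int j = 0; j < b; ++j) edges.push_back({i, j});
    return edges;
}

int main() {
    // A stage of 2 vertices composed with a stage of 4 vertices.
    std::printf("pointwise edges: %zu\n", pointwise(2, 4).size());  // 4 edges
    std::printf("bipartite edges: %zu\n", bipartite(2, 4).size());  // 8 edges
    return 0;
}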

Example Program in Dryad (1) SkyServer SQL Query (Q18): Find all the objects in the database that have neighboring objects within 30 arc seconds such that at least one of the neighbors has a color similar to the primary object's color. There are two tables involved: photoObjAll, with 354,254,163 records, and neighbors, with 2,803,165,372 records. For the equivalent Dryad computation, the columns of interest were extracted into two binary files, "ugriz.bin" and "neighbors.bin". The resulting Dryad graph reads these inputs (U and N) and flows through stages X (n vertices), D (n), M (4n), S (4n), Y (n), and H, as built by the code two slides ahead. © Carnegie Mellon University in Qatar

Example Program in Dryad (2) This is an explanation of the SkyServer query plan: the SQL plan was taken, manually coded in Dryad, and the data was manually partitioned. The inputs are u (objid, color) and n (objid, neighborobjid), each partitioned by objid. The first join computes: select u.color, n.neighborobjid from u join n where u.objid = n.objid. Its output (u.color, n.neighborobjid) is re-partitioned by n.neighborobjid, ordered by n.neighborobjid, made distinct, and the outputs are merged. The second join then computes: select u.objid from u join <temp> where u.objid = <temp>.neighborobjid and |u.color - <temp>.color| < d. © Carnegie Mellon University in Qatar

Example Program in Dryad (3) Here is the corresponding Dryad code:
// Clone the processing modules into stages
GraphBuilder XSet = moduleX^N;
GraphBuilder DSet = moduleD^N;
GraphBuilder MSet = moduleM^(N*4);
GraphBuilder SSet = moduleS^(N*4);
GraphBuilder YSet = moduleY^N;
GraphBuilder HSet = moduleH^1;
// Attach the inputs to the X and Y stages
GraphBuilder XInputs = (ugriz1 >= XSet) || (neighbor >= XSet);
GraphBuilder YInputs = ugriz2 >= YSet;
// X feeds D pointwise, D feeds M bipartitely, M feeds S pointwise
GraphBuilder XToY = XSet >= DSet >> MSet >= SSet;
// Every four S vertices feed one Y vertex
for (i = 0; i < N*4; ++i) {
  XToY = XToY || (SSet.GetVertex(i) >= YSet.GetVertex(i/4));
}
GraphBuilder YToH = YSet >= HSet;
GraphBuilder HOutputs = HSet >= output;
// Merge all the pieces into the final graph
GraphBuilder final = XInputs || YInputs || XToY || YToH || HOutputs;
© Carnegie Mellon University in Qatar

Dryad In this part, the following concepts of Dryad will be described: Dryad Model Dryad Organization Dryad Description Language and An Example Program Fault Tolerance in Dryad © Carnegie Mellon University in Qatar

Fault Tolerance in Dryad (1) Dryad is designed to handle two types of failures: Vertex failures Channel failures Vertex failures are handled by the JM and the failed vertex is re-executed on another machine Channel failures cause the preceding vertex to be re-executed © Carnegie Mellon University in Qatar

Fault Tolerance in Dryad (2) Apparently very slow computation is handled by duplication of vertices, managed by a stage manager: while vertices X[0], X[1], and X[3] complete, a duplicate X'[2] is launched for the slow vertex X[2]. Duplication Policy = f(running times, data volumes) © Carnegie Mellon University in Qatar

GraphLab © Carnegie Mellon University in Qatar

GraphLab In this part, the following concepts of GraphLab will be described: Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab © Carnegie Mellon University in Qatar

Motivation for GraphLab Shortcomings of MapReduce: Computation on interdependent data is difficult to express. Running jobs iteratively incurs disk-access and startup overheads. The communication pattern is not user-definable/flexible. Shortcomings of Pregel: The BSP model requires synchronous computation, so one slow machine can slow down the entire computation considerably. Shortcomings of Dryad: Very flexible, but a steep learning curve for the programming model. © Carnegie Mellon University in Qatar

GraphLab GraphLab is a framework for parallel machine learning. Its main components are: the data graph, update functions and scopes, the shared data table, and scheduling. © Carnegie Mellon University in Qatar

GraphLab In this part, the following concepts of GraphLab will be described: Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab © Carnegie Mellon University in Qatar

Data Graph A graph in GraphLab is associated with data at every vertex and edge: arbitrary blocks of data can be assigned to vertices and edges. © Carnegie Mellon University in Qatar

Update Functions The data graph is modified using update functions. An update function can modify a vertex v and its neighborhood, defined as the scope of v (Sv). © Carnegie Mellon University in Qatar
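To make the data graph, the scope, and the update function concrete, here is a minimal self-contained C++ sketch. It is purely illustrative and is not the GraphLab API; all type and function names are invented for the example.

// Illustrative sketch only – not the GraphLab API.
#include <cstdio>
#include <vector>

struct VertexData { double value; };   // arbitrary per-vertex data block
struct EdgeData   { double weight; };  // arbitrary per-edge data block
struct Edge { int src, dst; EdgeData data; };

struct DataGraph {
    std::vector<VertexData> vertices;
    std::vector<Edge> edges;
    std::vector<std::vector<int>> in_edges;  // ids of the edges entering each vertex

    int add_vertex(double value) {
        vertices.push_back({value});
        in_edges.push_back({});
        return (int)vertices.size() - 1;
    }
    void add_edge(int s, int d, double w) {
        edges.push_back({s, d, {w}});
        in_edges[d].push_back((int)edges.size() - 1);
    }
};

// The scope Sv of vertex v: the vertex itself plus its adjacent edges and neighbors.
struct Scope {
    DataGraph& g;
    int v;
    VertexData& vertex() { return g.vertices[v]; }
    const std::vector<int>& in_edge_ids() { return g.in_edges[v]; }
};

// An update function reads and writes only data reachable through its scope.
void smooth_update(Scope scope) {
    double sum = 0.0;
    for (int eid : scope.in_edge_ids()) {
        const Edge& e = scope.g.edges[eid];
        sum += e.data.weight * scope.g.vertices[e.src].value;  // read neighbor data
    }
    if (!scope.in_edge_ids().empty())
        scope.vertex().value = sum / scope.in_edge_ids().size();  // write the vertex's own data
}

int main() {
    DataGraph g;
    int a = g.add_vertex(2.0);
    int b = g.add_vertex(1.0);
    int c = g.add_vertex(1.0);
    g.add_edge(a, c, 1.0);
    g.add_edge(b, c, 1.0);
    smooth_update({g, c});
    std::printf("vertex c after update: %.2f\n", g.vertices[c].value);  // prints 1.50
    return 0;
}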

Shared Data Table Certain algorithms require global information that is shared among all vertices (algorithm parameters, statistics, etc.). GraphLab exposes a Shared Data Table (SDT). The SDT is an associative map between keys and arbitrary blocks of data: T[Key] → Value. The shared data table is updated using the sync mechanism. © Carnegie Mellon University in Qatar

Sync Mechanism The sync mechanism is similar to Reduce in MapReduce. The user can define fold, merge and apply functions that are triggered during the global sync: The fold function lets the user sequentially aggregate information across all vertices. The merge function optionally lets the user perform a parallel tree reduction on the aggregated data collected during the fold operation. The apply function lets the user finalize the resulting value from the fold/merge operations (such as normalization, etc.). © Carnegie Mellon University in Qatar
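The sketch below shows how fold, merge and apply might fit together when syncing an "average vertex value" entry into the shared data table. It is an illustrative C++ sketch under assumed names, not the GraphLab sync API.

// Illustrative sketch only – not the GraphLab sync API.
#include <cstdio>
#include <vector>

struct VertexData { double value; };
struct Accum { double sum = 0.0; int count = 0; };  // partial aggregate

// fold: sequentially aggregate one vertex into an accumulator.
Accum fold(Accum acc, const VertexData& v) {
    acc.sum += v.value;
    acc.count += 1;
    return acc;
}

// merge: combine two partial aggregates (this is what enables a parallel tree reduction).
Accum merge(Accum a, Accum b) {
    Accum out;
    out.sum = a.sum + b.sum;
    out.count = a.count + b.count;
    return out;
}

// apply: finalize the aggregate before writing it into the shared data table.
double apply(Accum acc) {
    return acc.count ? acc.sum / acc.count : 0.0;
}

int main() {
    std::vector<VertexData> part1 = {{1.0}, {2.0}}, part2 = {{3.0}, {4.0}};
    // Each partition is folded independently (possibly on different CPUs)...
    Accum a1, a2;
    for (const auto& v : part1) a1 = fold(a1, v);
    for (const auto& v : part2) a2 = fold(a2, v);
    // ...then merged, then finalized; the result would be stored under T["avg_value"].
    double avg = apply(merge(a1, a2));
    std::printf("synced average = %.2f\n", avg);  // prints 2.50
    return 0;
}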

GraphLab In this part, the following concepts of GraphLab will be described: Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab © Carnegie Mellon University in Qatar

Scheduling in GraphLab (1) The scheduler determines the order in which vertices are updated: it maintains an abstract list of vertex update tasks, and the available CPUs repeatedly pull tasks from it and execute the corresponding update functions. The process repeats until the scheduler is empty. © Carnegie Mellon University in Qatar

Scheduling in GraphLab (2) An update schedule defines the order in which update functions are applied to vertices. A parallel data structure called the scheduler represents an abstract list of tasks to be executed in GraphLab. Base (vertex) schedulers in GraphLab: the synchronous scheduler and the round-robin scheduler. Job schedulers in GraphLab: the FIFO scheduler and the priority scheduler. Custom schedulers can be defined via the set scheduler. Termination assessment: execution ends when the scheduler has no remaining tasks, or a termination function can be defined to check for convergence in the data; a minimal sketch of this execution loop follows below. © Carnegie Mellon University in Qatar
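As referenced above, here is a minimal single-threaded C++ sketch of a FIFO-style execution loop: tasks are popped until the queue is empty or a termination check succeeds, and an update may push follow-up tasks. This illustrates the idea only; it is not GraphLab's scheduler implementation, and the termination condition is a stand-in.

// Illustrative sketch only – not GraphLab's scheduler implementation.
#include <cstdio>
#include <queue>

struct Task { int vertex; };

int main() {
    std::queue<Task> scheduler;                     // FIFO list of pending update tasks
    for (int v = 0; v < 3; ++v) scheduler.push({v});

    int rounds_left = 2;                            // stand-in termination condition
    auto terminated = [&]() { return rounds_left <= 0; };

    // The engine loop: pop a task, run the update, possibly reschedule vertices.
    while (!scheduler.empty() && !terminated()) {
        Task t = scheduler.front();
        scheduler.pop();
        std::printf("updating vertex %d\n", t.vertex);
        if (t.vertex == 2) {                        // an update may add new tasks
            --rounds_left;
            if (!terminated())
                for (int v = 0; v < 3; ++v) scheduler.push({v});
        }
    }
    return 0;
}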

GraphLab In this part, the following concepts of GraphLab will be described: Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab © Carnegie Mellon University in Qatar

Need for Consistency Models How much can computation overlap? © Carnegie Mellon University in Qatar

Consistency Models in GraphLab GraphLab guarantees sequential consistency: the parallel execution is guaranteed to give the same result as a sequential execution of the computational steps. The user chooses among consistency models: full consistency, edge consistency, and vertex consistency. © Carnegie Mellon University in Qatar

GraphLab In this part, the following concepts of GraphLab will be described: Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab © Carnegie Mellon University in Qatar

PageRank (1) PageRank is a link-analysis algorithm. The rank value indicates the importance of a particular web page: a hyperlink to a page counts as a vote of support, and a page that is linked to by many pages with high PageRank receives a high rank itself. PageRank is a probability distribution used to represent the likelihood that a person randomly clicking on links will arrive at any particular page; a PageRank of 0.5 means there is a 50% chance that a person clicking on a random link will be directed to the document with the 0.5 PageRank. © Carnegie Mellon University in Qatar

PageRank (2) Iterate: R[i] = α + (1 − α) Σ_{j ∈ in-neighbors(i)} R[j] / L[j], where α is the random reset probability and L[j] is the number of links on page j. © Carnegie Mellon University in Qatar

PageRank Example in GraphLab The PageRank algorithm is defined as a per-vertex operation working on the scope of the vertex:
pagerank(i, scope){
  // Get neighborhood data
  (R[i], Wij, R[j]) ← scope;
  // Update the vertex data
  R[i] ← α + (1 − α) * Σ_{j ∈ in-neighbors(i)} Wji * R[j];
  // Reschedule neighbors if needed
  if R[i] changes then reschedule_neighbors_of(i);
}
This is dynamic computation: work is rescheduled only where ranks are still changing. © Carnegie Mellon University in Qatar
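Putting the pieces together, below is a small self-contained C++ sketch of this dynamic PageRank pattern: an update over a vertex's neighborhood plus a FIFO scheduler that reschedules out-neighbors only when a rank changes noticeably. It is an illustration of the slide's pseudocode, not GraphLab code; the graph, names, and the convergence threshold are assumptions.

// Illustrative sketch of the slide's pseudocode – not GraphLab code.
#include <cmath>
#include <cstdio>
#include <queue>
#include <vector>

int main() {
    const double alpha = 0.15;       // random reset probability
    const double tolerance = 1e-6;   // assumed convergence threshold

    // A tiny link graph: out_links[j] lists the pages that page j links to.
    std::vector<std::vector<int>> out_links = {{1, 2}, {2}, {0}};
    const int n = (int)out_links.size();

    // Build the reverse adjacency (in-neighbors), needed by the update rule.
    std::vector<std::vector<int>> in_links(n);
    for (int j = 0; j < n; ++j)
        for (int i : out_links[j]) in_links[i].push_back(j);

    std::vector<double> R(n, 1.0);   // per-vertex data: the current rank

    // FIFO scheduler seeded with every vertex.
    std::queue<int> scheduler;
    std::vector<bool> queued(n, true);
    for (int v = 0; v < n; ++v) scheduler.push(v);

    while (!scheduler.empty()) {
        int i = scheduler.front();
        scheduler.pop();
        queued[i] = false;

        // Update the vertex data: R[i] = alpha + (1 - alpha) * sum_j R[j] / L[j].
        double sum = 0.0;
        for (int j : in_links[i]) sum += R[j] / out_links[j].size();
        double new_rank = alpha + (1.0 - alpha) * sum;

        // Reschedule out-neighbors only if the rank changed enough (dynamic computation).
        if (std::fabs(new_rank - R[i]) > tolerance) {
            R[i] = new_rank;
            for (int k : out_links[i])
                if (!queued[k]) { scheduler.push(k); queued[k] = true; }
        } else {
            R[i] = new_rank;
        }
    }

    for (int v = 0; v < n; ++v) std::printf("R[%d] = %.4f\n", v, R[v]);
    return 0;
}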

How Do MapReduce, Pregel, Dryad and GraphLab Compare Against Each Other? © Carnegie Mellon University in Qatar

Comparison of the Programming Models
Programming model: MapReduce uses fixed functions (Map and Reduce); Pregel uses supersteps over a data graph with messages passed; Dryad uses a DAG with program vertices and data edges; GraphLab uses a data graph with a shared data table and update functions.
Parallelism: MapReduce runs tasks concurrently within the map and reduce phases; Pregel runs user functions concurrently over vertices within a superstep; Dryad runs vertices concurrently within a stage; GraphLab runs non-overlapping scopes concurrently, as defined by the consistency model.
Data handling: MapReduce uses a distributed file system; Dryad uses flexible data channels (memory, files, DFS, etc.); in GraphLab it is undefined, as graphs can be in memory or on disk.
Task scheduling: MapReduce uses fixed phases with HDFS-locality-based map task assignment; Pregel assigns the partitioned graph and inputs by assignment functions; Dryad's job and stage managers assign vertices to available daemons; GraphLab uses pluggable schedulers to schedule update functions.
Fault tolerance: MapReduce uses DFS replication plus task reassignment / speculative execution of tasks; Pregel uses checkpointing and superstep re-execution; Dryad uses vertex and edge failure recovery; GraphLab uses synchronous and asynchronous snapshots.
Developed by: MapReduce and Pregel by Google; Dryad by Microsoft; GraphLab by Carnegie Mellon.
© Carnegie Mellon University in Qatar

References This presentation has elements borrowed from various papers and presentations: Papers: Pregel: http://kowshik.github.com/JPregel/pregel_paper.pdf Dryad: http://research.microsoft.com/pubs/63785/eurosys07.pdf GraphLab: http://www.select.cs.cmu.edu/publications/paperdir/uai2010-low-gonzalez-kyrola-bickson-guestrin-hellerstein.pdf Presentations: Dryad Presentation at Berkeley by M. Budiu: http://budiu.info/work/dryad-talk-berkeley09.pptx GraphLab1 Presentation: http://graphlab.org/uai2010_graphlab.pptx GraphLab2 Presentation: http://graphlab.org/presentations/nips-biglearn-2011.pptx © Carnegie Mellon University in Qatar

Next Class: Distributed File Systems © Carnegie Mellon University in Qatar