Parallel Programming Models
Basic question: what is the “right” way to write parallel programs?
– And deal with the complexity of finding parallelism, coarsening granularity, distributing computation and data, synchronizing, optimizing, etc.
– Oh, and we want to achieve high performance
– And make the program portable across different architectures

Parallelizing with Functional Languages
It’s “easy”: the user does nothing but write the sequential program.
(Pure) functional languages do not have side effects:
– As soon as the arguments are available to a function, the function can be executed
– A function can operate only on local data (again, assuming a pure functional language)
– Therefore, determining what executes in parallel is a trivial problem… but limiting the parallelism (i.e., coarsening granularity) becomes a problem (as in the Adaptive Quadrature program)
– Also, functional languages do not handle data distribution, communication, or optimization
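A minimal sketch (plain Python, not from the slides) of why purity makes the "what can run in parallel" question trivial: every call depends only on its argument, so independent calls can be dispatched to workers with no synchronization.

    from concurrent.futures import ProcessPoolExecutor

    def f(x):
        # Pure: the result depends only on the argument; no shared state is touched.
        return x * x

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            # Each f(x) may execute on a different worker; no locking is needed
            # because no call can interfere with another.
            results = list(pool.map(f, range(16)))
        print(results)

Note that this tiny f(x) also illustrates the granularity problem the slide raises: each call is far too cheap to be worth a process dispatch, so a real system must coarsen.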

Parallelizing with Functional Languages
One approach was a single-assignment language
– Note that functional languages are basically also single assignment
Best known was SISAL, developed to be a competitor to Fortran+libraries
Every variable can be written to at most once
– Can express this as: x = old x, meaning that conceptually there is a new variable on the left-hand side (otherwise you would violate single assignment)
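To make the "x = old x" idea concrete, here is a minimal sketch (plain Python, not SISAL) in which every "assignment" produces a conceptually new binding instead of overwriting the old one:

    # Imperative style overwrites x, violating single assignment:
    #   x = 0
    #   for i in range(3): x = x + i
    #
    # Single-assignment style: each step is x_new = old x + i, and every
    # intermediate value remains a distinct, immutable binding.
    def step(old_x, i):
        return old_x + i      # produces a new value; old_x is untouched

    x0 = 0
    x1 = step(x0, 0)
    x2 = step(x1, 1)
    x3 = step(x2, 2)
    print(x3)                 # 3; x0..x3 all still exist conceptually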

Parallelizing with Functional Languages
Think about Jacobi/Red-Black… an array is a variable, so every assignment (in the inner loop!) is to a new entire array
– The compiler must determine when it can update in place.

    double A[N][N];
    for (red points i, j) {
        A[i][j] = 0.25 * (old A[i][j+1] + old A[i][j-1] +
                          old A[i+1][j] + old A[i-1][j]);
    }
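A minimal sketch (Python with NumPy, not part of the slides) of what the single-assignment semantics literally say: each red sweep conceptually builds a brand-new array from the old one, and an optimizing compiler must prove the old values are dead so it can reuse the storage instead.

    import numpy as np

    def red_sweep(old_A):
        # Literal single-assignment semantics: construct a new array.
        A = old_A.copy()
        n = old_A.shape[0]
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i + j) % 2 == 0:        # "red" points
                    A[i, j] = 0.25 * (old_A[i, j+1] + old_A[i, j-1] +
                                      old_A[i+1, j] + old_A[i-1, j])
        return A

    A = np.random.rand(8, 8)
    A = red_sweep(A)   # a smart compiler would update A's storage in place

The in-place update is safe here because every neighbor a red point reads is a black point, untouched by the red sweep; discovering that is the compiler's job.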

Parallelizing with MapReduce
Allows massive parallelization on large numbers of nodes
Functional programming model
– Restricts the application domain
– Map(f, nil) = nil
– Map(f, cons(e, L)) = cons(f(e), Map(f, L))
– Reduce(f, z, nil) = z
– Reduce(f, z, cons(e, L)) = f(e, Reduce(f, z, L))
e and z are elements; L is a list; cons is the list constructor
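These equations transcribe directly into code; here is a minimal Python rendering (an illustration, not from the slides), with Python lists standing in for cons cells:

    def map_(f, L):
        # Map(f, nil) = nil;  Map(f, cons(e, L)) = cons(f(e), Map(f, L))
        if not L:
            return []
        e, rest = L[0], L[1:]
        return [f(e)] + map_(f, rest)

    def reduce_(f, z, L):
        # Reduce(f, z, nil) = z;  Reduce(f, z, cons(e, L)) = f(e, Reduce(f, z, L))
        if not L:
            return z
        e, rest = L[0], L[1:]
        return f(e, reduce_(f, z, rest))

    print(map_(lambda x: x * x, [1, 2, 3]))               # [1, 4, 9]
    print(reduce_(lambda e, acc: e + acc, 0, [1, 2, 3]))  # 6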

Parallelizing with MapReduce
Google’s MapReduce involves:
– First, applying a map function to each logical record in the input
  This produces a set of intermediate key/value pairs
– Then, applying a reduce function to all values that have a common key
Again, note that this is a functional model, so there are no side effects (this will become quite important in the implementation)

MapReduce Example (dumb!)

    Map(String key, String value):
        // key: document name; value: document contents
        for each word w in value:
            EmitIntermediate(w, "1");

    Reduce(String key, Iterator values):
        // key: a word; values: list of counts (1’s)
        int result = 0;
        for each v in values:
            result += ParseInt(v);
        Emit(AsString(result));

Note: the key is different in map and in reduce
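For reference, a minimal runnable rendering of this WordCount in plain Python (illustrative only; it simulates the shuffle step in memory, which the real system does across machines):

    from collections import defaultdict

    def map_fn(key, value):
        # key: document name; value: document contents
        for w in value.split():
            yield (w, 1)

    def reduce_fn(key, values):
        # key: a word; values: all counts emitted for that word
        return (key, sum(values))

    docs = {"doc1": "the quick fox", "doc2": "the lazy dog the end"}

    # Shuffle phase: group intermediate pairs by key.
    groups = defaultdict(list)
    for name, text in docs.items():
        for k, v in map_fn(name, text):
            groups[k].append(v)

    print([reduce_fn(k, vs) for k, vs in sorted(groups.items())])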

Other MapReduce Examples
URL Access Frequency (same structure as WordCount)
– Map outputs (URL, 1) for each URL in a request log
– Reduce adds the values for the same URL, emitting (URL, total count)
Reverse Web-Link Graph
– Map outputs (target, source) for each link to a target URL found in a source page
– Reduce concatenates the source URLs for each target, emitting (target, list(source))
Inverted Index
– Map outputs (word, document ID) for each word in a document
– Reduce outputs (word, sorted list of document IDs)
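As one concrete instance, a minimal inverted-index sketch in Python (illustrative, not from the slides; the shuffle is again simulated in memory):

    from collections import defaultdict

    def map_fn(doc_id, text):
        # Emit (word, doc_id) for each distinct word in the document.
        for w in set(text.split()):
            yield (w, doc_id)

    def reduce_fn(word, doc_ids):
        # Emit (word, sorted list of documents containing it).
        return (word, sorted(doc_ids))

    docs = {1: "map reduce on clusters", 2: "reduce shuffle sort"}
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        for k, v in map_fn(doc_id, text):
            groups[k].append(v)
    index = dict(reduce_fn(k, vs) for k, vs in groups.items())
    print(index["reduce"])   # [1, 2]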

MapReduce Implementation

Basic idea (for clusters of commodity machines):
– 1. Split the input files into M chunks
– 2. Use the master/worker paradigm, with one master; the master assigns the M map tasks and R reduce tasks to workers
– 3. Split the reduce work into R pieces (hash on the key and apply “mod R”, for example)
– 4. Workers doing a map task write their key/value lists into a different file per reduce piece
  If the hash of the key mod R equals i, write the key/value pair to file i. Pass this file information back to the master, who tells the reduce tasks where to read (a sketch of such a partitioning function follows).
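A minimal sketch (Python, illustrative) of the hash-mod-R partitioning the slide describes; within one run, every occurrence of the same key lands in the same reduce piece:

    def partition(key, R):
        # Send each intermediate key to one of R reduce pieces.
        return hash(key) % R

    R = 4
    # Both calls return the same piece number, so one reduce task will
    # see every value emitted for the key "the".
    print(partition("the", R), partition("the", R))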

MapReduce Implementation
– 5. A reduce worker grabs its data from all the map workers’ local disks via RPC, sorts it, and reduces
  Sorting occurs because many keys may be assigned to each reduce task
  One call to the user’s reduce function per unique key; the parameter is the list of all values for that key
  It appends its output to the final output file
– 6. When everything is done, wake up the MapReduce call
– The master records a state per task: idle, in progress, or completed
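A minimal sketch (Python, illustrative) of the reduce side’s sort-then-group step: sorting brings equal keys together, and then the user’s reduce function is called once per unique key.

    from itertools import groupby
    from operator import itemgetter

    # Intermediate pairs fetched from several map workers, in arbitrary order.
    pairs = [("dog", 1), ("the", 1), ("dog", 1), ("the", 1), ("the", 1)]

    pairs.sort(key=itemgetter(0))   # sorting brings equal keys together
    for key, group in groupby(pairs, key=itemgetter(0)):
        values = [v for _, v in group]
        # One call to the user's reduce function per unique key.
        print(key, sum(values))      # dog 2, then the 3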

Fault Tolerance with MapReduce
Critical to handle fault tolerance because there will be thousands of machines involved
– Lessons learned from RAID
The master keeps track of each map and reduce task
– Marks it as idle, in progress, or completed
– The master is assumed to never fail (it could checkpoint its state to alleviate this problem)

Fault Tolerance with MapReduce
For workers:
– Ping periodically; if there is no response, mark the worker as “failed” and mark its task as idle
– If a map worker fails, even its completed map tasks are re-executed, because the worker’s local disk is assumed inaccessible
– Notify the reduce tasks if a map task changes hands, so the reduce tasks know where to read from
– If a reduce worker fails, assign a new worker and re-execute
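A toy sketch (Python, hypothetical names; not Google’s code) of the bookkeeping those rules imply. The asymmetry is the point: lost map output lives on a dead local disk, so even completed map tasks reset, while completed reduce output is already in the global file system.

    IDLE, IN_PROGRESS, COMPLETED = "idle", "in progress", "completed"

    def handle_worker_failure(worker, tasks):
        # tasks: dicts like {"kind": "map", "state": ..., "worker": ...}
        for t in tasks:
            if t["worker"] != worker:
                continue
            if t["kind"] == "map":
                # Completed map output is lost with the local disk,
                # so completed map tasks go back to idle too.
                t["state"], t["worker"] = IDLE, None
            elif t["state"] == IN_PROGRESS:
                # Only in-progress reduce tasks need re-execution.
                t["state"], t["worker"] = IDLE, None

    tasks = [{"kind": "map", "state": COMPLETED, "worker": "w1"},
             {"kind": "reduce", "state": IN_PROGRESS, "worker": "w1"}]
    handle_worker_failure("w1", tasks)
    print([t["state"] for t in tasks])   # ['idle', 'idle']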

Fault Tolerance with MapReduce
Fault tolerance is simple because MapReduce is a functional model
– It can have duplicate tasks, for example, because there are no side effects (and it uses “backup” tasks when a job is nearly done)
– Note: this depends on atomic commits of map and reduce task outputs
  A map task sends a completion message after writing its R files; the message includes the file names (sent to the master)
  A reduce task writes to a temporary file, then renames it to the final output file; this relies on the file system’s atomic rename operation, in case many machines execute the same reduce task
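A minimal sketch (Python, illustrative) of the temp-file-then-rename commit; os.replace maps to an atomic rename on POSIX file systems, which is the property the implementation relies on:

    import os, tempfile

    def commit_reduce_output(data, final_path):
        # Write to a private temporary file in the same directory first...
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(final_path) or ".")
        with os.fdopen(fd, "w") as f:
            f.write(data)
        # ...then atomically rename it into place. If two duplicate reduce
        # tasks both commit, one rename wins and the output stays consistent.
        os.replace(tmp_path, final_path)

    commit_reduce_output("the 3\ndog 2\n", "part-00000")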

MapReduce “Optimizations”
There are a host of these
– Many of them are completely obvious
  E.g., a “combiner function” to avoid sending large amounts of easily reducible data
  For WordCount, there will be many (“the”, 1) records; may as well just add them all up locally
– You can also “roll your own” partitioning function
– All “optimizations” affect only performance, not correctness
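A minimal sketch (Python, illustrative) of the combiner idea: apply the reducer’s associative logic locally on each map worker’s output, so far fewer records cross the network.

    from collections import Counter

    def map_fn(doc):
        for w in doc.split():
            yield (w, 1)

    def combine(pairs):
        # Same associative logic as the reducer, applied locally:
        # many ("the", 1) records collapse into one ("the", n) record.
        c = Counter()
        for k, v in pairs:
            c[k] += v
        return list(c.items())

    local = combine(map_fn("the cat and the hat and the bat"))
    print(local)   # [('the', 3), ('and', 2), ('cat', 1), ('hat', 1), ('bat', 1)]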

MapReduce Performance