
MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. To appear in OSDI 2004 (Operating Systems Design and Implementation)

Presentation on theme: "MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. To appear in OSDI 2004 (Operating Systems Design and Implementation)" — Presentation transcript:

1 MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. To appear in OSDI 2004 (Operating Systems Design and Implementation)

2 Jeff Dean Sanjay Ghemawat

3 Introduction: MapReduce is an important programming model for large-scale data-parallel applications

4 Motivation
- Parallel applications: widely used, but typically built as special-purpose applications
- Common functionality needed in each: parallelize the computation, distribute the data, handle failures
- Goal: large-scale (big data) data processing

5 What is MapReduce?
- Programming model: parallel, generic, scalable
- Data model: (key, value) pairs
- Implementation: runs on clusters of commodity PCs

6 What is MapReduce? (user-defined functions)
# map(key, val) is run on each item in the input set → emits (new-key, new-val) pairs
# reduce(key, vals) is run once for each unique key emitted by map() → emits the final output
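The map/reduce contract on this slide can be sketched in a single process, using the classic word-count example. This is a minimal illustration, not the paper's C++ library; the function names (`word_count_map`, `run_mapreduce`, etc.) are invented for this sketch.

```python
from collections import defaultdict

def word_count_map(key, value):
    # key: document name (unused here), value: document contents
    for word in value.split():
        yield (word, 1)

def word_count_reduce(key, values):
    # key: a word, values: all counts emitted for that word
    return (key, sum(values))

def run_mapreduce(inputs, map_fn, reduce_fn):
    intermediate = defaultdict(list)
    for k, v in inputs:                       # map phase
        for nk, nv in map_fn(k, v):
            intermediate[nk].append(nv)       # group values by new key
    # reduce phase: one call per unique intermediate key
    return dict(reduce_fn(k, vs) for k, vs in intermediate.items())

counts = run_mapreduce([("doc1", "to be or not to be")],
                       word_count_map, word_count_reduce)
print(counts)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

The grouping of all values for a key before the reduce call is the essential step the real system performs across machines.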

7 Examples
# Distributed Grep (Global / Regular Expression / Print)
  map: emits a line if it matches the supplied pattern; reduce: identity function
# Count of URL Access Frequency (from logs of web page requests)
  map: emits (URL, 1); reduce: adds the values and emits (URL, total count)
# Reverse Web-Link Graph
  map: emits (target, source) for each link to a target URL found in a page named source; reduce: emits (target, list(sources))
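The reverse web-link graph example above can be sketched with a toy in-memory link graph; the page names and helper functions here are illustrative, not from the paper.

```python
from collections import defaultdict

def reverse_map(source, targets):
    # source: URL of the page, targets: URLs that page links to
    for target in targets:
        yield (target, source)

def reverse_reduce(target, sources):
    # concatenate (and here, sort) the list of pages linking to target
    return (target, sorted(sources))

# toy link graph: page -> pages it links to (hypothetical data)
graph = {"a.html": ["b.html", "c.html"], "b.html": ["c.html"]}

intermediate = defaultdict(list)
for src, tgts in graph.items():
    for k, v in reverse_map(src, tgts):
        intermediate[k].append(v)

inverted = {k: reverse_reduce(k, vs)[1] for k, vs in intermediate.items()}
print(inverted)  # {'b.html': ['a.html'], 'c.html': ['a.html', 'b.html']}
```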

8 Examples (continued)
# Inverted Index
  map: emits (word, document ID); reduce: sorts the document IDs and emits (word, list(document IDs))
# Distributed Sort
  map: extracts the key from each record and emits (key, record); reduce: emits all pairs unchanged
# Term-Vector per Host (a term vector is a list of (word, frequency) pairs)
  map: emits (hostname, term vector) for each input document; reduce: adds the vectors together, throwing away infrequent terms, and emits a final (hostname, term vector) pair
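The inverted-index example maps cleanly to the same pattern; the two toy documents and function names below are illustrative only.

```python
from collections import defaultdict

def index_map(doc_id, text):
    # emit (word, doc_id) once per distinct word in the document
    for word in set(text.split()):
        yield (word, doc_id)

def index_reduce(word, doc_ids):
    # sort the document IDs to form the posting list for this word
    return (word, sorted(doc_ids))

docs = {"d1": "big data systems", "d2": "big clusters"}  # toy corpus

postings = defaultdict(list)
for doc_id, text in docs.items():
    for w, d in index_map(doc_id, text):
        postings[w].append(d)

index = {w: index_reduce(w, ds)[1] for w, ds in postings.items()}
print(index["big"])  # ['d1', 'd2']
```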

9 Execution overview

10 Typical cluster
# Machines: typically 100s or 1000s of dual-processor x86 machines running Linux, with 2-4 GB of memory
# Network: 100 megabits/second or 1 gigabit/second
# Storage: local IDE disks
# GFS: a distributed file system manages the data
# Job scheduling system: jobs are made up of tasks; the scheduler assigns tasks to machines
# Implementation: a C++ library linked into user programs

11 Distributed execution (1)
#1 - The MapReduce library splits the input file into M pieces (16-64 MB each, controllable by the user via an optional parameter) and starts up many copies of the program on a cluster of machines
#2 - Master (1): one of the copies of the program is special; the others are workers (n), assigned work by the master: M map tasks and R reduce tasks
#3 - A map worker reads the contents of its input split, parses out (key, value) pairs, passes each to the user-defined map function, and buffers the output pairs in memory
#4 - Periodically, the buffered pairs are written to local disk; their locations on local disk are passed back to the master, who is responsible for forwarding these locations to the reduce workers
#5 - A reduce worker uses remote procedure calls to read the buffered data from the local disks of the map workers
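In step 4, the buffered pairs are split into R regions, one per reduce task; the paper's default partitioning function is hash(key) mod R. A minimal sketch of that bucketing (the MD5-based stable hash and the toy key/value pairs are illustrative choices, not the paper's implementation):

```python
import hashlib

R = 4  # number of reduce tasks (illustrative)

def partition(key, r=R):
    # stable hash(key) mod R, mirroring the paper's default partitioner;
    # a stable hash is used so equal keys always map to the same bucket
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % r

buckets = [[] for _ in range(R)]
for key, value in [("apple", 1), ("pear", 1), ("apple", 1)]:
    buckets[partition(key)].append((key, value))

# all pairs with the same key land in the same bucket, hence the same
# reduce task -- the property reduce() relies on
```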

12 Distributed execution (2)
#6 - The reduce worker sorts the intermediate data by key and iterates over it; for each unique intermediate key encountered, it passes the key and the corresponding set of values to the user's Reduce function. The output of the Reduce function is appended to a final output file for this reduce partition
#7 - When all map tasks and reduce tasks have been completed, the master wakes up the user program, and the MapReduce call in the user program returns back to the user code
#8 - After successful completion, the output is available in R output files, one per reduce task, with file names as specified by the user

13 Master Data Structures
# For each map task and reduce task, the master stores a status: idle, in-progress, or completed

14 Fault Tolerance
# Worker failure
- The master pings every worker periodically; a worker that does not respond is marked as failed and its tasks are rescheduled
- MapReduce is resilient to large-scale worker failures
# Master failure → the MapReduce computation stops
- It is easy to make the master write periodic checkpoints of the master data structures described above
- If the master task dies, a new copy can be started from the last checkpointed state
- Clients can check for this condition and retry the MapReduce operation if they desire
# Semantics in the presence of failures

15 Locality
# Storing input data in GFS conserves network bandwidth
- GFS divides each file into 64 MB blocks and stores several copies of each block (typically 3) on different machines
# When running large MapReduce operations on a significant fraction of the workers in a cluster, most input data is read locally and consumes no network bandwidth

16 Task Granularity
# Ideally, the number of map tasks (M) and reduce tasks (R) should be much larger than the number of machines
- improves dynamic load balancing
- speeds recovery when a worker fails
# Practical bounds: the master makes O(M+R) scheduling decisions and keeps O(M*R) state in memory
- the O(M*R) state is small: roughly one byte per map-task/reduce-task pair
# R is often constrained by users, because the output of each reduce task ends up in a separate output file
# In practice: MapReduce computations are often run with M=200,000 and R=5,000 on 2,000 worker machines
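A back-of-the-envelope check of the master's memory footprint for the configuration on this slide, assuming (as the paper states) roughly one byte of state per map-task/reduce-task pair:

```python
M = 200_000   # map tasks
R = 5_000     # reduce tasks

# O(M*R) pieces of state, ~1 byte each (per the paper's estimate)
state_bytes = M * R

print(state_bytes)          # 1000000000
print(state_bytes / 2**30)  # roughly 0.93 GiB on the master
```

So even at this scale, the per-pair state fits comfortably on a single master machine.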

17 Backup Tasks
# A "straggler" is a machine that takes an unusually long time to complete one of the last few map or reduce tasks in the computation
# When a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks
# The task is marked as completed whenever either the primary or the backup execution completes

18

19 Combiner Function
[diagram: a master coordinating map tasks on nodes N1-N3 and reduce tasks; the combiner partially merges intermediate data on the map worker, trading CPU time for reduced network traffic]
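For word count, the combiner can be the same logic as the reducer, applied on the map worker before data crosses the network. A minimal sketch (function names are illustrative):

```python
from collections import Counter

def word_count_map(doc):
    for word in doc.split():
        yield (word, 1)

def combine(pairs):
    # combiner: partially merge counts on the map worker; for word
    # count this is the same operation the reducer performs later
    merged = Counter()
    for word, n in pairs:
        merged[word] += n
    return list(merged.items())

pairs = list(word_count_map("to be or not to be"))
combined = combine(pairs)

# 6 raw pairs shrink to 4 before being shipped to the reduce tasks
print(len(pairs), len(combined))  # 6 4
```

The saving grows with repetition in the input, which is exactly the case (frequent words) where network traffic would otherwise be largest.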

20 Status Information
# The master runs an internal HTTP server and exports a set of status pages for human consumption
# The pages show how many tasks have been completed, how many are in progress, bytes of input, bytes of intermediate data, bytes of output, and processing rates
# The user can use this data to predict how long the computation will take

21 Conclusions
Why MapReduce has been successful:
# First, the model is easy to use, even for programmers without experience with parallel and distributed systems
# Second, a large variety of problems are easily expressible as MapReduce computations
# Third, we have developed an implementation of MapReduce that scales to large clusters comprising thousands of machines
Lessons learned:
# First, restricting the programming model makes it easy to parallelize and distribute computations and to make such computations fault-tolerant
# Second, network bandwidth is a scarce resource
# Third, redundant execution can be used to reduce the impact of slow machines, and to handle machine failures and data loss

