Cluster Computing with DryadLINQ Mihai Budiu Microsoft Research, Silicon Valley Cloudera, February 12, 2010.


1 Cluster Computing with DryadLINQ Mihai Budiu Microsoft Research, Silicon Valley Cloudera, February 12, 2010

2 Goal

3 Design Space. Diagram axes: Throughput vs. Latency; Internet vs. Private data center; Data-parallel vs. Shared memory

4 Data-Parallel Computation. Layers: Application, Language, Execution, Storage. Diagram labels: Parallel Databases; Map-Reduce; GFS, BigTable; Sawzall; Hadoop; HDFS, S3; Pig, Hive; Dryad; DryadLINQ, Scope; Cosmos, Azure, SQL Server. Language annotations: SQL; ≈SQL; LINQ, SQL; Sawzall. Platform annotation: Cosmos, HPC, Azure

5 Software Stack. Diagram labels: Windows Server; Cosmos; Cosmos FS; Dryad; Distributed Shell; PSQL; DryadLINQ; SQL Server; C++; NTFS; legacy code; SSIS; Scope; C#; Machine Learning; .Net; Distributed Data Structures; Graphs; Data mining; Applications; Azure XCompute; Windows HPC; Azure XStore; Analytics; Tidy FS; Optimization

6 Introduction, Dryad, DryadLINQ, Building on DryadLINQ, Conclusions

7 Dryad. Continuously deployed since 2006; Running on >> 10^4 machines; Sifting through > 10 PB of data daily; Runs on clusters of > 3000 machines; Handles jobs with > 10^5 processes each; Platform for a rich software ecosystem; Used by >> 100 developers; Written at Microsoft Research, Silicon Valley

8 Dryad = Execution Layer. Job (application) : Dryad : Cluster ≈ Pipeline : Shell : Machine

9 2-D Piping. Unix Pipes (1-D): grep | sed | sort | awk | perl. Dryad (2-D): grep^1000 | sed^500 | sort^1000 | awk^500 | perl^50

10 Virtualized 2-D Pipelines

11 Virtualized 2-D Pipelines

12 Virtualized 2-D Pipelines

13 Virtualized 2-D Pipelines

14 Virtualized 2-D Pipelines: 2D, DAG, multi-machine, virtualized

15 Dryad Job Structure. Diagram labels: Input files; Vertices (processes): grep, sed, sort, awk, perl; Channels; Stage; Output files

16 Channels. Finite streams of items, carried over one of: distributed filesystem files (persistent); SMB/NTFS files (temporary); TCP pipes (inter-machine); memory FIFOs (intra-machine). Diagram: vertices X and M connected by a channel of items

17 Dryad System Architecture. Diagram labels: Job manager; cluster; control plane; data plane; job schedule; NS, Sched; PD; V; Files, TCP, FIFO, Network

18 Fault Tolerance

19 Policy Managers. Diagram labels: Job Manager; Stage R with R manager; Stage X with X Manager; Connection R-X with R-X Manager

20 Dynamic Graph Rewriting. Diagram: X[0], X[1], X[3] are completed vertices; X[2] is a slow vertex; X’[2] is its duplicate vertex. Duplication Policy = f(running times, data volumes)
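
The slide gives the duplication policy only as f(running times, data volumes). The following is a minimal sketch of one such function, not Dryad's actual policy: it compares a running vertex against the median seconds-per-byte of the completed vertices in the same stage. The names `ShouldDuplicate`, `slack`, and `minSamples` and the factor of 2 are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DuplicationPolicy
{
    // Hypothetical policy: duplicate a running vertex when it has already run
    // longer than `slack` times what the median completed vertex would have
    // needed for the same input volume.
    public static bool ShouldDuplicate(
        IReadOnlyList<(double Seconds, long Bytes)> completed,
        double runningSeconds, long runningBytes,
        double slack = 2.0, int minSamples = 3)
    {
        // Seconds per input byte for every completed vertex, sorted ascending.
        var rates = completed.Where(c => c.Bytes > 0)
                             .Select(c => c.Seconds / c.Bytes)
                             .OrderBy(r => r)
                             .ToList();

        // Too little evidence: do not speculate.
        if (rates.Count < minSamples || runningBytes <= 0) return false;

        double median = rates[rates.Count / 2];
        double expectedSeconds = median * runningBytes;
        return runningSeconds > slack * expectedSeconds;
    }
}

static class Program
{
    static void Main()
    {
        var completed = new List<(double Seconds, long Bytes)>
        {
            (10.0, 100L), (12.0, 100L), (11.0, 100L)
        };

        // Median rate is 0.11 s/byte, so 100 bytes should take about 11 s.
        Console.WriteLine(DuplicationPolicy.ShouldDuplicate(completed, 15.0, 100L)); // False
        Console.WriteLine(DuplicationPolicy.ShouldDuplicate(completed, 30.0, 100L)); // True
    }
}
```

Normalizing by data volume keeps a vertex with a large input from being flagged as slow merely because it has more work to do.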

21 Cluster network topology: rack, top-of-rack switch, top-level switch

22 Dynamic Aggregation. Diagram: static S vertices, tagged with rack numbers (#1, #2, #3), feed dynamically inserted per-rack A vertices, which feed T

23 Policy vs. Mechanism. Policy: Application-level; Most complex in C++ code; Invoked with upcalls; Need good default implementations; DryadLINQ provides a comprehensive set. Mechanism: Built-in; Scheduling; Graph rewriting; Fault tolerance; Statistics and reporting

24 Introduction, Dryad, DryadLINQ, Building on DryadLINQ, Conclusions

25 LINQ. Dryad => DryadLINQ

26 LINQ = .Net + Queries
    Collection<T> collection;
    bool IsLegal(Key k);
    string Hash(Key k);
    var results = from c in collection
                  where IsLegal(c.key)
                  select new { hash = Hash(c.key), c.value };
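
The query on this slide can be tried locally with LINQ-to-objects. The sketch below is illustrative only: the `Record` type, the use of `string` as the key type, and the bodies of `IsLegal` and `Hash` are assumptions, since the slide shows only their signatures.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type; the slide only shows that elements have a key and a value.
class Record
{
    public string key;
    public int value;
}

static class Program
{
    // Stand-ins for the slide's IsLegal and Hash; their bodies are assumptions.
    public static bool IsLegal(string key) => !string.IsNullOrEmpty(key);
    public static string Hash(string key) => key.ToUpperInvariant();

    static void Main()
    {
        var collection = new List<Record>
        {
            new Record { key = "a", value = 1 },
            new Record { key = "",  value = 2 },   // filtered out by IsLegal
            new Record { key = "b", value = 3 },
        };

        // Same shape as the slide's query. The anonymous type needs an explicit
        // member name for the method call result, hence `hash = ...`.
        var results = from c in collection
                      where IsLegal(c.key)
                      select new { hash = Hash(c.key), c.value };

        foreach (var r in results)
            Console.WriteLine($"{r.hash} {r.value}");
        // prints "A 1" then "B 3"
    }
}
```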

27 Collections and Iterators
    class Collection<T> : IEnumerable<T>;
    public interface IEnumerable<T> { IEnumerator<T> GetEnumerator(); }
    public interface IEnumerator<T> { T Current { get; } bool MoveNext(); void Reset(); }
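
To make the iterator contract concrete, here is a small sketch of a collection that implements `IEnumerable<T>` and is then consumed through `MoveNext` and `Current` directly. The `NumberSequence` class is a hypothetical example, not something from the talk.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection that yields the integers start, start + 1, ..., start + count - 1.
class NumberSequence : IEnumerable<int>
{
    private readonly int start;
    private readonly int count;

    public NumberSequence(int start, int count)
    {
        this.start = start;
        this.count = count;
    }

    public IEnumerator<int> GetEnumerator()
    {
        // The compiler turns this iterator block into an IEnumerator<int>,
        // generating Current and MoveNext for us.
        for (int i = 0; i < count; i++)
            yield return start + i;
    }

    // IEnumerable<T> inherits the non-generic IEnumerable, so this is required too.
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

static class Program
{
    static void Main()
    {
        var sequence = new NumberSequence(10, 3);

        // Roughly what a foreach loop expands to.
        using (IEnumerator<int> e = sequence.GetEnumerator())
        {
            while (e.MoveNext())
                Console.WriteLine(e.Current);   // prints 10, 11, 12
        }
    }
}
```

Because LINQ operators are written against `IEnumerable<T>`, any collection that exposes this interface can be queried without further work.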

28 DryadLINQ Data Model. Diagram labels: Collection; Partition; .Net objects

29 DryadLINQ = LINQ + Dryad
    Collection<T> collection;
    bool IsLegal(Key k);
    string Hash(Key k);
    var results = from c in collection
                  where IsLegal(c.key)
                  select new { hash = Hash(c.key), c.value };
    Diagram labels: collection (C#); results (C#); Vertex code; Query plan (Dryad job); Data

30 Demo

31 Example: Histogram
    public static IQueryable<Pair> Histogram(IQueryable<LineRecord> input, int k)
    {
        var words = input.SelectMany(x => x.line.Split(' '));
        var groups = words.GroupBy(x => x);
        var counts = groups.Select(x => new Pair(x.Key, x.Count()));
        var ordered = counts.OrderByDescending(x => x.count);
        var top = ordered.Take(k);
        return top;
    }
    Intermediate values for k = 3:
    input:   “A line of words of wisdom”
    words:   [“A”, “line”, “of”, “words”, “of”, “wisdom”]
    groups:  [[“A”], [“line”], [“of”, “of”], [“words”], [“wisdom”]]
    counts:  [{“A”, 1}, {“line”, 1}, {“of”, 2}, {“words”, 1}, {“wisdom”, 1}]
    ordered: [{“of”, 2}, {“A”, 1}, {“line”, 1}, {“words”, 1}, {“wisdom”, 1}]
    top:     [{“of”, 2}, {“A”, 1}, {“line”, 1}]
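
The same pipeline can be run on a single machine with LINQ-to-objects, which is a convenient way to check the intermediate values listed on the slide. This sketch swaps `IQueryable` for `IEnumerable<string>` and defines its own `Pair` class; both choices are assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the slide's Pair type.
class Pair
{
    public readonly string word;
    public readonly int count;

    public Pair(string word, int count)
    {
        this.word = word;
        this.count = count;
    }
}

static class Program
{
    public static IEnumerable<Pair> Histogram(IEnumerable<string> lines, int k)
    {
        var words = lines.SelectMany(line => line.Split(' '));     // split lines into words
        var groups = words.GroupBy(w => w);                         // one group per distinct word
        var counts = groups.Select(g => new Pair(g.Key, g.Count()));
        var ordered = counts.OrderByDescending(p => p.count);       // stable sort in LINQ-to-objects
        return ordered.Take(k);                                      // keep the k most frequent
    }

    static void Main()
    {
        var lines = new[] { "A line of words of wisdom" };

        foreach (var p in Histogram(lines, 3))
            Console.WriteLine($"{p.word} {p.count}");
        // prints "of 2", "A 1", "line 1"
    }
}
```

The tie between "A" and "line" is resolved by first occurrence because LINQ-to-objects groups in encounter order and sorts stably; a distributed execution is not guaranteed to break ties the same way.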

32 Histogram Plan. Stages: SelectMany, Sort, GroupBy+Select, HashDistribute, MergeSort, GroupBy, Select, Sort, Take, MergeSort, Take

33 Map-Reduce in DryadLINQ
    public static IQueryable<S> MapReduce<T,M,K,S>(
        this IQueryable<T> input,
        Expression<Func<T, IEnumerable<M>>> mapper,
        Expression<Func<M,K>> keySelector,
        Expression<Func<IGrouping<K,M>,S>> reducer)
    {
        var map = input.SelectMany(mapper);
        var group = map.GroupBy(keySelector);
        var result = group.Select(reducer);
        return result;
    }
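
As a usage illustration, the three-operator composition can be exercised locally as a word count. This sketch targets `IEnumerable<T>` with plain delegates rather than DryadLINQ's `IQueryable<T>`, so it runs with LINQ-to-objects only; the class name `MapReduceExtensions` and the sample input are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MapReduceExtensions
{
    // Local analogue of the slide's operator: map, group by key, reduce each group.
    public static IEnumerable<S> MapReduce<T, M, K, S>(
        this IEnumerable<T> input,
        Func<T, IEnumerable<M>> mapper,
        Func<M, K> keySelector,
        Func<IGrouping<K, M>, S> reducer)
    {
        var map = input.SelectMany(mapper);
        var groups = map.GroupBy(keySelector);
        return groups.Select(reducer);
    }
}

static class Program
{
    static void Main()
    {
        var lines = new[] { "a b a", "b c" };

        // Word count: the mapper emits words, the key is the word itself,
        // and the reducer counts the members of each group.
        var counts = lines.MapReduce(
            line => line.Split(' '),
            word => word,
            g => new KeyValuePair<string, int>(g.Key, g.Count()));

        foreach (var kv in counts)
            Console.WriteLine($"{kv.Key} {kv.Value}");
        // prints "a 2", "b 2", "c 1"
    }
}
```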

34 Map-Reduce Plan. Diagram legend: M = map, Q = sort, G1 = groupby, R = reduce, D = distribute, MS = mergesort, G2 = groupby, R = reduce, X = consumer. Stage groups: map; partial aggregation; reduce. Static and dynamic versions are shown; the dynamic version adds an aggregation tree (S, A, T)

35 Distributed Sorting Plan. Diagram vertices: O, DS, H, D, M, S. Static and dynamic versions are shown

36 Expectation Maximization lines 3 iterations shown

37 Probabilistic Index Maps. Diagram labels: Images; features

38 Language Summary: Where, Select, GroupBy, OrderBy, Aggregate, Join, Apply, Materialize

39 LINQ System Architecture. Diagram labels: Local machine; .Net program (C#, VB, F#, etc); Query; Objects; LINQ Provider; Execution engine. Providers shown: LINQ-to-obj, PLINQ, LINQ-to-SQL, LINQ-to-WS, DryadLINQ, Flickr, Oracle, LINQ-to-XML, Your own

40 The DryadLINQ Provider. Diagram labels: Client machine: .Net, Query Expr, ToCollection, foreach, .Net Objects, Results, DryadLINQ, Context. Data center: Dryad JM, Dryad Execution, Distributed query plan, Vertex code, Input Tables, Output Tables, Output DryadTable. Invoke Query

41 Combining Query Providers. Diagram labels: Local machine; .Net program (C#, VB, F#, etc); Query; Objects; LINQ Provider; Execution engines; PLINQ; SQL Server; DryadLINQ; LINQ-to-obj

42 Using PLINQ. Diagram labels: Query; DryadLINQ; Local query; PLINQ

43 Using LINQ to SQL Server. Diagram labels: Query; DryadLINQ; Query; LINQ to SQL

44 Using LINQ-to-objects. Diagram labels: Query; Local machine: LINQ to obj (debug); Cluster: DryadLINQ (production)

45 Introduction, Dryad, DryadLINQ, Building on/for DryadLINQ (System monitoring with Artemis; Privacy-preserving query language (PINQ); Machine learning), Conclusions

46 Artemis: measuring clusters. Diagram labels: Cosmos Cluster; HPC Cluster; Azure Cluster; Cluster/Job State API; Log collection; DryadLINQ; DB; Statistics; Visualization; Plug-ins; Job browser; Cluster browser/manager

47 DryadLINQ job browser

48 Automated diagnostics

49 Job statistics: schedule and critical path

50 Running time distribution

51 Performance counters

52 CPU Utilization

53 Load imbalance: rack assignment

54 PINQ. Diagram labels: Queries (LINQ); Answer; Privacy-sensitive database

55 PINQ = Privacy-Preserving LINQ. “Type-safety” for privacy. Provides an interface to data that looks very much like LINQ. All access through the interface gives differential privacy. Analysts write arbitrary C# code against data sets, as in LINQ. No privacy expertise is needed to produce analyses. Privacy currency is used to limit the per-record information released.
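
To illustrate how a "privacy currency" can limit what is released, here is a minimal sketch of a budgeted noisy count. It is not PINQ's implementation: the `BudgetedData` class and its members are hypothetical, and the sketch uses the standard Laplace mechanism, in which a count (sensitivity 1) plus Laplace noise of scale 1/epsilon gives epsilon-differential privacy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper: every query spends part of a fixed privacy budget.
class BudgetedData<T>
{
    private readonly List<T> records;
    private readonly Random rng;

    public double RemainingBudget { get; private set; }

    public BudgetedData(IEnumerable<T> records, double budget, int seed)
    {
        this.records = records.ToList();
        this.rng = new Random(seed);
        RemainingBudget = budget;
    }

    // Laplace(0, scale) sampled as the difference of two exponential samples.
    // 1 - NextDouble() lies in (0, 1], so the logarithm is always finite.
    private double Laplace(double scale)
    {
        double e1 = -Math.Log(1.0 - rng.NextDouble());
        double e2 = -Math.Log(1.0 - rng.NextDouble());
        return scale * (e1 - e2);
    }

    public double NoisyCount(Func<T, bool> predicate, double epsilon)
    {
        if (epsilon <= 0)
            throw new ArgumentOutOfRangeException(nameof(epsilon));
        if (epsilon > RemainingBudget)
            throw new InvalidOperationException("Privacy budget exhausted.");

        RemainingBudget -= epsilon;   // pay for the query before answering
        return records.Count(predicate) + Laplace(1.0 / epsilon);
    }
}

static class Program
{
    static void Main()
    {
        var ages = new[] { 23, 35, 41, 58, 62, 70 };
        var data = new BudgetedData<int>(ages, budget: 1.0, seed: 42);

        Console.WriteLine(data.NoisyCount(a => a >= 40, 0.5));  // about 4, plus noise
        Console.WriteLine(data.NoisyCount(a => a < 30, 0.5));   // about 1, plus noise
        Console.WriteLine(data.RemainingBudget);                // 0

        try
        {
            data.NoisyCount(a => a > 0, 0.5);
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);   // the budget is spent
        }
    }
}
```

Smaller epsilon values cost less budget but add more noise, which is the trade-off an analyst manages when spending the currency.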

56 Example: search logs mining. Distribution of queries about “Cricket”
    // Open sensitive data set with state-of-the-art security
    PINQueryable visits = OpenSecretData(password);
    // Group visits by patient and identify frequent patients.
    var patients = visits.GroupBy(x => x.Patient.SSN).Where(x => x.Count() > 5);
    // Map each patient to their post code using their SSN.
    var locations = patients.Join(SSNtoPost, x => x.SSN, y => y.SSN, (x,y) => y.PostCode);
    // Count post codes containing at least 10 frequent patients.
    var activity = locations.GroupBy(x => x).Where(x => x.Count() > 10);
    Visualize(activity); // Who knows what this does???

57 PINQ Download. Implemented on top of DryadLINQ; Allows mining very sensitive datasets privately; Code is available. Frank McSherry, Privacy Integrated Queries, SIGMOD 2009

58 Natal Training

59 Natal Problem. Recognize players from depth map; At frame rate; Minimal resource usage

60 Learn from Data. Diagram labels: Motion Capture (ground truth); Rasterize; Training examples; Machine learning; Classifier

61 Running on Xbox

62 Learning from data. Diagram labels: Training examples; Machine learning; DryadLINQ; Dryad; Classifier

63 Highly efficient parallelization

64 Introduction, Dryad, DryadLINQ, Building on DryadLINQ, Conclusions

65 Lessons Learned. Complete separation of storage / execution / language; Using LINQ + .Net (language integration); Static typing (no protocol buffers, i.e. no serialization code); Allowing flexible and powerful policies; Centralized job manager: no replication, no consensus, no checkpointing; Porting (HPC, Cosmos, Azure, SQL Server)

66 Conclusions

67 “What’s the point if I can’t have it?” Dryad+DryadLINQ available for download (Academic license; Commercial evaluation license); Runs on Windows HPC platform; Dryad is in binary form, DryadLINQ in source; Requires signing a 3-page licensing agreement

68 Backup Slides

69 What does DryadLINQ do?
    public struct Data { … public static int Compare(Data left, Data right); }
    Data g = new Data();
    var result = table.Where(s => Data.Compare(s, g) < 0);
    // Data serialization
    public static void Read(this DryadBinaryReader reader, out Data obj);
    public static int Write(this DryadBinaryWriter writer, Data obj);
    // Data factory
    public class DryadFactoryType__0 : LinqToDryad.DryadFactory
    DryadVertexEnv denv = new DryadVertexEnv(args);
    // Channel writer
    var dwriter__2 = denv.MakeWriter(FactoryType__0);
    // Channel reader
    var dreader__3 = denv.MakeReader(FactoryType__0);
    // LINQ code, with context serialization for the captured variable g
    var source__4 = DryadLinqVertex.Where(dreader__3, s => (Data.Compare(s, ((Data)DryadLinqObjectStore.Get(0))) < ((System.Int32)(0))), false);
    dwriter__2.WriteItemSequence(source__4);

70 Ongoing Dryad/DryadLINQ Research. Performance modeling; Scheduling and resource allocation; Profiling and performance debugging; Incremental computation; Hardware acceleration; High-level programming abstractions; Many domain-specific applications

71 Sample applications written using DryadLINQ | Class
    Distributed linear algebra | Numerical
    Accelerated Page-Rank computation | Web graph
    Privacy-preserving query language | Data mining
    Expectation maximization for a mixture of Gaussians | Clustering
    K-means | Clustering
    Linear regression | Statistics
    Probabilistic Index Maps | Image processing
    Principal component analysis | Data mining
    Probabilistic Latent Semantic Indexing | Data mining
    Performance analysis and visualization | Debugging
    Road network shortest-path preprocessing | Graph
    Botnet detection | Data mining
    Epitome computation | Image processing
    Neural network training | Statistics
    Parallel machine learning framework infer.net | Machine learning
    Distributed query caching | Optimization
    Image indexing | Image processing
    Web indexing structure | Web graph

72 Staging. 1. Build; 2. Send .exe; 3. Start JM; 4. Query cluster resources; 5. Generate graph; 6. Initialize vertices; 7. Serialize vertices; 8. Monitor Vertex execution. Diagram labels: JM code; vertex code; Cluster services

73 Bibliography
    Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. European Conference on Computer Systems (EuroSys), Lisbon, Portugal, March 21-23, 2007
    DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language. Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Úlfar Erlingsson, Pradeep Kumar Gunda, and Jon Currey. Symposium on Operating System Design and Implementation (OSDI), San Diego, CA, December 8-10, 2008
    SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets. Ronnie Chaiken, Bob Jenkins, Per-Åke Larson, Bill Ramsey, Darren Shakib, Simon Weaver, and Jingren Zhou. Very Large Databases Conference (VLDB), Auckland, New Zealand, August
    Hunting for problems with Artemis. Gabriela F. Creţu-Ciocârlie, Mihai Budiu, and Moises Goldszmidt. USENIX Workshop on the Analysis of System Logs (WASL), San Diego, CA, December 7, 2008
    DryadInc: Reusing work in large-scale computations. Lucian Popa, Mihai Budiu, Yuan Yu, and Michael Isard. Workshop on Hot Topics in Cloud Computing (HotCloud), San Diego, CA, June 15, 2009
    Distributed Aggregation for Data-Parallel Computing: Interfaces and Implementations. Yuan Yu, Pradeep Kumar Gunda, and Michael Isard. ACM Symposium on Operating Systems Principles (SOSP), October 2009
    Quincy: Fair Scheduling for Distributed Computing Clusters. Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar, and Andrew Goldberg. ACM Symposium on Operating Systems Principles (SOSP), October 2009

74 Incremental Computation. Goal: reuse (part of) prior computations to speed up the current job, increase cluster throughput, and reduce energy and costs. Diagram labels: Inputs (append-only data); Distributed Computation; Outputs

75 Propose Two Approaches. 1. Reuse identical computations from the past (like make or memoization). 2. Do only incremental computation on the new data and merge the results with the previous ones (like patch).

76 Context. Implemented for Dryad. Dryad Job = Computational DAG; Vertex: arbitrary computation + inputs/outputs; Edge: data flows. Simple Example: Record Count. Diagram: Inputs (partitions) I1, I2; Count vertices C, C; Add vertex A; Outputs

77 Identical Computation. First execution DAG (Record Count). Diagram: Inputs (partitions) I1, I2; Count vertices C, C; Add vertex A; Outputs

78 Identical Computation. Second execution DAG (Record Count). Diagram: Inputs (partitions) I1, I2 and New Input I3; Count vertices C, C, C; Add vertex A; Outputs

79 IDE – IDEntical Computation. Second execution DAG (Record Count), with the Identical subDAG marked. Diagram: Inputs (partitions) I1, I2, I3; Count vertices C, C, C; Add vertex A; Outputs

80 Identical Computation. IDE Modified DAG: replace the identical computational subDAG with edge data cached from the previous execution. Diagram: Cached Data (Replaced with Cached Data); Inputs (partitions) I3; Count vertex C; Add vertex A; Outputs

81 Identical Computation. IDE Modified DAG: replace the identical computational subDAG with edge data cached from the previous execution. Use DAG fingerprints to determine if computations are identical. Diagram: Inputs (partitions) I3; Count vertex C; Add vertex A; Outputs
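
A rough sketch of the fingerprint idea for the record-count example follows. It is an illustration under stated assumptions, not DryadInc's design: the fingerprint here is simply a SHA-256 hash of a vertex name and an input partition identifier, and partitions are assumed never to change once written (append-only data). The names `CountCache`, `Fingerprint`, and `RunJob` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical cache of Count-vertex outputs, keyed by a fingerprint of
// (vertex code, input partition).
class CountCache
{
    private readonly Dictionary<string, long> cache = new Dictionary<string, long>();

    // Number of Count vertices that actually had to run.
    public int VertexExecutions { get; private set; }

    private static string Fingerprint(string vertexCode, string inputId)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(vertexCode + "\n" + inputId));
            return BitConverter.ToString(hash);
        }
    }

    public long Count(string inputId, IReadOnlyCollection<string> partition)
    {
        string fp = Fingerprint("Count", inputId);
        if (cache.TryGetValue(fp, out long cached))
            return cached;                 // identical computation: reuse cached edge data

        VertexExecutions++;                // new computation: run the vertex
        long n = partition.Count;
        cache[fp] = n;
        return n;
    }
}

static class Program
{
    // The Add vertex: sums the outputs of all Count vertices.
    public static long RunJob(CountCache cache, IDictionary<string, string[]> inputs) =>
        inputs.Sum(kv => cache.Count(kv.Key, kv.Value));

    static void Main()
    {
        var cache = new CountCache();
        var inputs = new Dictionary<string, string[]>
        {
            ["I1"] = new[] { "a", "b", "c" },
            ["I2"] = new[] { "d", "e" },
        };

        Console.WriteLine(RunJob(cache, inputs));      // 5
        Console.WriteLine(cache.VertexExecutions);     // 2

        inputs["I3"] = new[] { "f", "g", "h", "i" };   // new input appended

        Console.WriteLine(RunJob(cache, inputs));      // 9
        Console.WriteLine(cache.VertexExecutions);     // 3: only Count over I3 ran
    }
}
```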

82 Semantic Knowledge Can Help. Diagram: Inputs I1, I2; Count vertices C, C; Add vertex A; Reuse Output

83 Semantic Knowledge Can Help. Incremental DAG. Diagram: Previous Output; Input I3; Count vertex C; Add vertex A; Merge (Add)

84 Mergeable Computation. Diagram: Inputs I1, I2, I3; Count vertices C, C, C; Add vertices A, A; Merge (Add). Labels: User-specified; Automatically Inferred; Automatically Built

85 Mergeable Computation. Incremental DAG – Remove Old Inputs. Diagram: Inputs I1, I2 (removed, Empty) and I3; Count vertices C; Add vertices A; Merge Vertex; Save to Cache
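
The mergeable approach can be sketched in the same record-count setting: the previous output is kept, only new partitions are counted, and a merge function combines the two. This is a simplified illustration with hypothetical names (`IncrementalCount`, `Merge`); it assumes append-only inputs and a user-supplied merge that is just addition.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical incremental record count.
class IncrementalCount
{
    private long cachedTotal;   // output saved from the previous run
    private readonly HashSet<string> seen = new HashSet<string>();

    // Number of partitions counted by the most recent run.
    public int PartitionsProcessed { get; private set; }

    // User-specified merge function for this computation.
    private static long Merge(long previous, long delta) => previous + delta;

    public long Run(IDictionary<string, string[]> inputs)
    {
        PartitionsProcessed = 0;
        long delta = 0;

        // Incremental DAG: old inputs are removed, only new partitions are counted.
        foreach (var kv in inputs)
        {
            if (seen.Add(kv.Key))
            {
                delta += kv.Value.Length;
                PartitionsProcessed++;
            }
        }

        cachedTotal = Merge(cachedTotal, delta);   // merge, then save to cache
        return cachedTotal;
    }
}

static class Program
{
    static void Main()
    {
        var job = new IncrementalCount();
        var inputs = new Dictionary<string, string[]>
        {
            ["I1"] = new[] { "a", "b", "c" },
            ["I2"] = new[] { "d", "e" },
        };

        Console.WriteLine(job.Run(inputs));            // 5
        Console.WriteLine(job.PartitionsProcessed);    // 2

        inputs["I3"] = new[] { "f", "g", "h", "i" };

        Console.WriteLine(job.Run(inputs));            // 9
        Console.WriteLine(job.PartitionsProcessed);    // 1: only I3 was counted
    }
}
```

Unlike fingerprint-based reuse, this variant needs to know that the computation is mergeable, which is the semantic knowledge the slides refer to.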

