Jingren Zhou Microsoft Corp.. Large-scale Distributed Computing Large data centers (x1000 machines): storage and computation Key technology for search.

Jingren Zhou Microsoft Corp.

Large-scale Distributed Computing Large data centers (x1000 machines): storage and computation Key technology for search (Bing, Google, Yahoo) Web data analysis, user log analysis, relevance studies, etc.... ……… … … How to program the beast?

Map-Reduce / GFS GFS / Bigtable provide distributed storage The Map-Reduce programming model Good abstraction of group-by-aggregation operations Map function -> grouping Reduce function -> aggregation Very rigid: every computation has to be structured as a sequence of map-reduce pairs Not completely transparent: users still have to use a parallel mindset Error-prone and suboptimal: writing map-reduce programs is equivalent to writing physical execution plans in DBMS

Pig Latin / Hadoop Hadoop: distributed file system and map-reduce execution engine Pig Latin: a dataflow language using a nested data model Imperative programming style Relational data manipulation primitives and plug-in code to customize processing New syntax – need to learn a new language Queries are mapped to map-reduce engine

SCOPE / Cosmos Cosmos Storage System Append-only distributed file system for storing petabytes of data Optimized for sequential I/O Data is compressed and replicated Cosmos Execution Environment Flexible model: a job is a DAG (directed acyclic graph) Vertices -> processes, edges -> data flows The job manager schedules and coordinates vertex execution Provides runtime optimization, fault tolerance, resource management

An Example: QCount Compute the popular queries that have been requested at least 1000 times Data model: a relational rowset with well-defined schema

Input and Output SCOPE works on both relational and nonrelational data sources EXTRACTOUTPUT EXTRACT and OUTPUT commands provide a relational abstraction of underlying data sources Built-in/customized extractors and outputters (C# classes) EXTRACT column[: ] [, …] FROM USING [(args)] [HAVING ] OUTPUT [ ] TO [USING [(args)]] public class LineitemExtractor : Extractor { … public override Schema Produce(string[] requestedColumns, string[] args) { … } public override IEnumerable Extract(StreamReader reader, Row outputRow, string[] args) { … } } public class LineitemExtractor : Extractor { … public override Schema Produce(string[] requestedColumns, string[] args) { … } public override IEnumerable Extract(StreamReader reader, Row outputRow, string[] args) { … } }

Select and Join Supports different Agg functions: COUNT, COUNTIF, MIN, MAX, SUM, AVG, STDEV, VAR, FIRST, LAST. No subqueries (but same functionality available because of outer join) SELECT [DISTINCT] [TOP count] select_expression [AS ] [, …] FROM { USING | { [ […]]} [, …] } [WHERE ] [GROUP BY [, …] ] [HAVING ] [ORDER BY [ASC | DESC] [, …]] joined input: JOIN [ON ] join_type: [INNER | {LEFT | RIGHT | FULL} OUTER]

Deep Integration with.NET (C#) SCOPE supports C# expressions and built-in.NET functions/library User-defined scalar expressions User-defined aggregation functions R1 = SELECT A+C AS ac, B.Trim() AS B1 FROM R WHERE StringOccurs(C, “xyz”) > 2 #CS public static int StringOccurs(string str, string ptrn) { … } #ENDCS

User Defined Operators PROCESS REDUCECOMBINE SCOPE supports three highly extensible commands: PROCESS, REDUCE, and COMBINE Complements SELECT for complicated analysis Easy to customize by extending built-in C# components Easy to reuse code in other SCOPE scripts

Process PROCESS PROCESS command takes a rowset as input, processes each row, and outputs a sequence of rows PROCESS [ ] USING [ (args) ] [PRODUCE column [, …]] [WHERE ] [HAVING ] public class MyProcessor : Processor { public override Schema Produce(string[] requestedColumns, string[] args, Schema inputSchema) { … } public override IEnumerable Process(RowSet input, Row outRow, string[] args) { … } } public class MyProcessor : Processor { public override Schema Produce(string[] requestedColumns, string[] args, Schema inputSchema) { … } public override IEnumerable Process(RowSet input, Row outRow, string[] args) { … } }

Reduce REDUCE REDUCE command takes a grouped rowset, processes each group, and outputs zero, one, or multiple rows per group REDUCE [ [PRESORT column [ASC|DESC] [, …]]] ON grouping_column [, …] USING [ (args) ] [PRODUCE column [, …]] [WHERE ] [HAVING ] public class MyReducer : Reducer { public override Schema Produce(string[] requestedColumns, string[] args, Schema inputSchema) { … } public override IEnumerable Reduce(RowSet input, Row outRow, string[] args) { … } } public class MyReducer : Reducer { public override Schema Produce(string[] requestedColumns, string[] args, Schema inputSchema) { … } public override IEnumerable Reduce(RowSet input, Row outRow, string[] args) { … } } Map/ReduceProcess/Reduce Map/Reduce can be easily expressed by Process/Reduce

Combine COMBINE command takes two matching input rowsets, combines them in some way, and outputs a sequence of rows COMBINE [AS ] [PRESORT …] WITH [AS ] [PRESORT …] ON USING [ (args) ] PRODUCE column [, …] [HAVING ] public class MyCombiner : Combiner { public override Schema Produce(string[] requestedColumns, string[] args, Schema leftSchema, string leftTable, Schema rightSchema, string rightTable) { … } public override IEnumerable Combine(RowSet left, RowSet right, Row outputRow, string[] args) { … } } public class MyCombiner : Combiner { public override Schema Produce(string[] requestedColumns, string[] args, Schema leftSchema, string leftTable, Schema rightSchema, string rightTable) { … } public override IEnumerable Combine(RowSet left, RowSet right, Row outputRow, string[] args) { … } } COMBINE S1 WITH S2 ON S1.A==S2.A AND S1.B==S2.B AND S1.C==S2.C USING MyCombiner PRODUCE D, E, F

Importing Scripts Combines the benefits of virtual views and stored procedures in SQL Enables modularity and information hiding Improves reusability and allows parameterization Provides a security mechanism IMPORT [PARAMS = [,…]]

Life of a SCOPE Query... ……… Scope Queries Parser / Compiler / Security

Optimizer and Runtime Transformation Engine Optimization Rules Logical Operators Physical operators Cardinality Estimation Cost Estimat ion Scope Queries (Logical Operator Trees) Optimal Query Plans (Vertex DAG) SCOPE optimizer Transformation-based optimizer Reasons about plan properties (partitioning, grouping, sorting, etc.) Chooses an optimal plan based on cost estimates Vertex DAG: each vertex contains a pipeline of operators SCOPE Runtime Provides a rich class of composable physical operators Operators are implemented using the iterator model Executes a series of operators in a pipelined fashion

Example Query Plan (QCount) 1. Extract the input cosmos file 2. Partially aggregate at the rack level 3. Partition on “query” 4. Fully aggregate 5. Apply filter on “count” 6. Sort results in parallel 7. Merge results 8. Output as a cosmos file SELECT query, COUNT(*) AS count FROM “search.log” USING LogExtractor GROUP BY query HAVING count> 1000 ORDER BY count DESC; OUTPUT TO “qcount.result” SELECT query, COUNT(*) AS count FROM “search.log” USING LogExtractor GROUP BY query HAVING count> 1000 ORDER BY count DESC; OUTPUT TO “qcount.result”

TPC-H Query 2 // Extract region, nation, supplier, partsupp, part … RNS_JOIN = SELECT s_suppkey, n_name FROM region, nation, supplier WHERE r_regionkey == n_regionkey AND n_nationkey == s_nationkey; RNSPS_JOIN = SELECT p_partkey, ps_supplycost, ps_suppkey, p_mfgr, n_name FROM part, partsupp, rns_join WHERE p_partkey == ps_partkey AND s_suppkey == ps_suppkey; SUBQ = SELECT p_partkey AS subq_partkey, MIN(ps_supplycost) AS min_cost FROM rnsps_join GROUP BY p_partkey; RESULT = SELECT s_acctbal, s_name, p_partkey, p_mfgr, s_address, s_phone, s_comment FROM rnsps_join AS lo, subq AS sq, supplier AS s WHERE lo.p_partkey == sq.subq_partkey AND lo.ps_supplycost == min_cost AND lo.ps_suppkey == s.s_suppkey ORDER BY acctbal DESC, n_name, s_name, partkey; OUTPUT RESULT TO "tpchQ2.tbl"; // Extract region, nation, supplier, partsupp, part … RNS_JOIN = SELECT s_suppkey, n_name FROM region, nation, supplier WHERE r_regionkey == n_regionkey AND n_nationkey == s_nationkey; RNSPS_JOIN = SELECT p_partkey, ps_supplycost, ps_suppkey, p_mfgr, n_name FROM part, partsupp, rns_join WHERE p_partkey == ps_partkey AND s_suppkey == ps_suppkey; SUBQ = SELECT p_partkey AS subq_partkey, MIN(ps_supplycost) AS min_cost FROM rnsps_join GROUP BY p_partkey; RESULT = SELECT s_acctbal, s_name, p_partkey, p_mfgr, s_address, s_phone, s_comment FROM rnsps_join AS lo, subq AS sq, supplier AS s WHERE lo.p_partkey == sq.subq_partkey AND lo.ps_supplycost == min_cost AND lo.ps_suppkey == s.s_suppkey ORDER BY acctbal DESC, n_name, s_name, partkey; OUTPUT RESULT TO "tpchQ2.tbl";

Sub Execution Plan to TPCH Q2 1. Join on suppkey 2. Partially aggregate at the rack level 3. Partition on group-by column 4. Fully aggregate 5. Partition on partkey 6. Merge corresponding partitions 7. Partition on partkey 8. Merge corresponding partitions 9. Perform join

A Real Example

Conclusions SCOPE: a new scripting language for large-scale analysis Strong resemblance to SQL: easy to learn and port existing applications Very extensible Fully benefits from.NET library Supports built-in C# templates for customized operations Highly composable Supports a rich class of physical operators Great reusability with views, user-defined operators Improves productivity High-level declarative language Implementation details (including parallelism, system complexity) are transparent to users Allows sophisticated optimization Good foundation for performance study and improvement

Current/Future Work

Jingren Zhou Microsoft Corp.. Large-scale Distributed Computing Large data centers (x1000 machines): storage and computation Key technology for search.

Similar presentations

Presentation on theme: "Jingren Zhou Microsoft Corp.. Large-scale Distributed Computing Large data centers (x1000 machines): storage and computation Key technology for search."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Jingren Zhou Microsoft Corp.. Large-scale Distributed Computing Large data centers (x1000 machines): storage and computation Key technology for search.

Similar presentations

Presentation on theme: "Jingren Zhou Microsoft Corp.. Large-scale Distributed Computing Large data centers (x1000 machines): storage and computation Key technology for search."— Presentation transcript:

Similar presentations

About project

Feedback