Software and Services Group SQL (92 and Beyond) Support for Hive Jason Dai Principal Engineer Intel SSG (Software and Services Group)


1 SQL (92 and Beyond) Support for Hive
Jason Dai, Principal Engineer, Intel SSG (Software and Services Group)

2 What SQL support is needed?
More SQL-92 support for analytics
Complete SQL data type system
– Data types (e.g., Datetime, fixed precision numbers), type conversion rules & function (CAST), Datetime expressions and functions (e.g., extract, +/- interval), etc.
Full subquery support
– Subquery in WHERE clauses, correlated subquery, scalar subquery, etc.
– New expressions (EXISTS, ALL, ANY, etc.)
Complete set operators
– DISTINCT UNION, INTERSECT, EXCEPT, etc.
Multiple-table SELECT statement
Update/delete?
– On HBase only?
(Almost) SQL-92 compliance? How about transactions?
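To make the subquery and set-operator items above concrete, here is a minimal sketch of what those SQL-92 constructs compute. It runs on Python's built-in SQLite purely as a stand-in engine (it is not Hive), and the `customers` and `orders` tables and their rows are hypothetical.

```python
import sqlite3

# SQLite is only a stand-in to show SQL-92 semantics; it is not Hive.
# The tables and rows below are hypothetical illustration data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO orders VALUES (1, 10.0), (1, 30.0), (2, 5.0);
""")

# Correlated subquery with EXISTS: customers that have at least one order.
with_orders = [row[0] for row in conn.execute("""
    SELECT c.name FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.name
""")]

# Scalar subquery in WHERE: orders above the overall average amount.
above_avg = [row[0] for row in conn.execute("""
    SELECT amount FROM orders
    WHERE amount > (SELECT AVG(amount) FROM orders)
""")]

# Set operator: customer ids that never appear in orders.
no_orders = [row[0] for row in conn.execute("""
    SELECT id FROM customers
    EXCEPT
    SELECT customer_id FROM orders
""")]

print(with_orders, above_avg, no_orders)
```

Each query uses a construct from the list above: a correlated subquery with EXISTS, a scalar subquery in a WHERE clause, and the EXCEPT set operator.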

3 What SQL support is needed (continued)?
Additional analytics support (beyond SQL-92)
Advanced OLAP functions for analysis & reporting
– E.g., rank, rollup, cube, window function (SQL 2003), etc.
Advanced SQL syntax
– E.g., WITH clause (SQL-99)
Procedural extensions
– E.g., Begin, End, If…Then…Else, Loop/Exit/Continue, etc.
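As a rough illustration of two items above, the sketch below uses a WITH clause to name an intermediate result and then emulates a one-level rollup (per-group totals plus a grand total) with UNION ALL. It again runs on SQLite as a stand-in, which has no ROLLUP keyword, so the UNION ALL form is an assumption made for the example rather than how any Hive extension implements it. The `sales` table is hypothetical.

```python
import sqlite3

# Stand-in engine and hypothetical data; this only illustrates the semantics.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('east', 10), ('east', 20), ('west', 5);
""")

# WITH names the per-region aggregate once; it is then read twice:
# once for the per-region rows and once for the grand-total row
# (region = NULL), which is what a one-level rollup would add.
rows = conn.execute("""
    WITH sales_by_region AS (
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region
    )
    SELECT region, total FROM sales_by_region
    UNION ALL
    SELECT NULL, SUM(total) FROM sales_by_region
""").fetchall()

# Map region -> total; the grand total is keyed by None.
totals = dict(rows)
print(totals)
```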

4 Workload Analysis
Columns: TPC-H, TPC-DS
Complex subquery: Y Y
Multiple-table SELECT: Y Y
Set operators: Y
SQL data types (especially Datetime): Y Y
Advanced OLAP functions (e.g., rank, grouping and window functions): Y
WITH clause (SQL-99): Y
UPDATE/DELETE: Y

5 Let’s Get Our Hands Dirty
[Diagram: Query → Parser → AST (Abstract Syntax Tree) → Semantic Analyzer (Optimizer) → Execution Plan → Execution]
(Almost) SQL-compliant Hive parser
A lot of work: SQL is much more complex than HiveQL
– HiveQL grammar file: ~61KB with 2487 lines
– SQL (with PL/SQL extensions) grammar file: ~524KB with 8583 lines
Also complex: many existing Hive grammar rules need to be changed
– To support more complex SQL constructs (e.g., subquery)
UDF/UDAF/UDTF
– For some operators (e.g., rank)
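The parser, semantic analyzer and execution stages named on this slide can be sketched as a toy pipeline. The code below is illustrative only: it handles a single `SELECT cols FROM table` shape, and every name in it (`SelectAst`, `parse`, `analyze`, `execute`, the catalog and data) is hypothetical rather than taken from Hive.

```python
import re
from dataclasses import dataclass


@dataclass
class SelectAst:
    """Toy AST node for `SELECT <columns> FROM <table>`."""
    columns: list
    table: str


def parse(query):
    """Parser stage: query text -> AST (only one query shape is supported)."""
    match = re.fullmatch(
        r"\s*SELECT\s+(.+?)\s+FROM\s+(\w+)\s*;?\s*", query, re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    columns = [col.strip() for col in match.group(1).split(",")]
    return SelectAst(columns=columns, table=match.group(2))


def analyze(ast, catalog):
    """Semantic analysis stage: check names against a catalog, emit a plan."""
    if ast.table not in catalog:
        raise ValueError(f"unknown table: {ast.table}")
    unknown = [col for col in ast.columns if col not in catalog[ast.table]]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    return [("scan", ast.table), ("project", ast.columns)]


def execute(plan, data):
    """Execution stage: run the plan operators in order."""
    rows = []
    for op, arg in plan:
        if op == "scan":
            rows = data[arg]
        elif op == "project":
            rows = [tuple(row[col] for col in arg) for row in rows]
    return rows


# Hypothetical catalog and data for the usage example.
catalog = {"users": {"id", "name"}}
data = {"users": [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]}

ast = parse("SELECT name, id FROM users")
plan = analyze(ast, catalog)
result = execute(plan, data)
print(ast, plan, result)
```

A real SQL grammar is far larger than this single regular expression, which is the point the grammar file sizes above make.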

6 Let’s Get Our Hands Dirty
[Diagram: Query → Parser → AST (Abstract Syntax Tree) → Semantic Analyzer (Optimizer) → Execution Plan → Execution]
Analysis, transformation & optimization
– SQL data type system
– Subquery support (incl. subquery unnesting)
– Multiple-table SELECT
– Set operations
– Advanced OLAP functions
– …
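Subquery unnesting rewrites a nested subquery as a join so that a join-based engine can run it. The sketch below checks one such rewrite on a toy dataset: an `IN (SELECT …)` filter versus a join against the de-duplicated subquery result. It uses SQLite as a stand-in and hypothetical tables, and shows the general idea only, not the specific transformation any particular analyzer performs.

```python
import sqlite3

# Stand-in engine and hypothetical data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    CREATE TABLE vip (customer_id INTEGER);
    INSERT INTO orders VALUES (1, 10), (1, 30), (2, 5), (3, 7);
    INSERT INTO vip VALUES (1), (1), (3);  -- customer 1 appears twice
""")


def amounts(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# Original form: subquery in the WHERE clause.
nested = amounts("""
    SELECT amount FROM orders
    WHERE customer_id IN (SELECT customer_id FROM vip)
""")

# Unnested form: join against the distinct keys (semi-join semantics).
unnested = amounts("""
    SELECT o.amount
    FROM orders o
    JOIN (SELECT DISTINCT customer_id FROM vip) v
      ON o.customer_id = v.customer_id
""")

# A naive join without DISTINCT duplicates rows for repeated keys,
# which is why the rewrite has to preserve semi-join semantics.
naive = amounts("""
    SELECT o.amount
    FROM orders o
    JOIN vip v ON o.customer_id = v.customer_id
""")

print(nested, unnested, naive)
```

In HiveQL, a LEFT SEMI JOIN is a natural target for this kind of rewrite, since it returns each left-side row at most once.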

7 How to Leverage Existing Work?
Project Panthera: our open source effort to enable better analytics capabilities on Hadoop/HBase
https://github.com/intel-hadoop/project-panthera
A SQL engine for Hive MapReduce
Goal: full analytical SQL support for OLAP
– Subquery in WHERE clause
– Correlated subquery
– Multiple-table SELECT statement
– …
[Diagram labels: Query; Driver; HiveQL → Hive Parser → Hive-AST; SQL → (Open Source) SQL Parser* → SQL-AST → SQL-AST Analyzer & Translator (Multi-Table SELECT, Subquery Unnesting, …) → Hive-AST; Hive Semantic Analyzer (INTERSECT Support, MINUS Support, …) → Hadoop MR]
* https://github.com/porcelli/plsql-parser
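One common way to support INTERSECT and MINUS on an engine that only has joins is to rewrite them as join queries. The sketch below compares the native set operators with such join-based rewrites on a toy dataset. It is an assumption-laden illustration, not a description of how Project Panthera implements them: SQLite stands in for the engine, SQLite spells MINUS as EXCEPT, the tables are hypothetical, and the columns are assumed to contain no NULLs (set operators treat NULLs as equal, joins do not).

```python
import sqlite3

# Stand-in engine and hypothetical data; columns are assumed NULL-free.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_t (x INTEGER);
    CREATE TABLE right_t (x INTEGER);
    INSERT INTO left_t VALUES (1), (2), (2), (3);
    INSERT INTO right_t VALUES (2), (3), (3), (4);
""")


def values(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# Native set operators (EXCEPT is the SQL-92 spelling of Oracle's MINUS).
intersect_native = values("SELECT x FROM left_t INTERSECT SELECT x FROM right_t")
minus_native = values("SELECT x FROM left_t EXCEPT SELECT x FROM right_t")

# INTERSECT as an inner join; DISTINCT restores set semantics.
intersect_join = values("""
    SELECT DISTINCT l.x
    FROM left_t l JOIN right_t r ON l.x = r.x
""")

# MINUS as an anti-join: keep left rows that found no match on the right.
minus_join = values("""
    SELECT DISTINCT l.x
    FROM left_t l LEFT JOIN right_t r ON l.x = r.x
    WHERE r.x IS NULL
""")

print(intersect_native, intersect_join, minus_native, minus_join)
```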

8 How to Leverage Existing Work?
NexR Hive UDFs
https://github.com/nexr/hive-udf
– UDFs for Oracle db extensions (rank, decode, nvl, etc.)
SQL windowing functions for Hive
https://github.com/hbutani/SQLWindowing
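For readers unfamiliar with what a rank function computes, here is a minimal pure-Python sketch of standard SQL `RANK() OVER (PARTITION BY … ORDER BY … DESC)` semantics, where tied rows share a rank and the next rank skips ahead. The function name `rank_over` and the sample rows are hypothetical, and this is not the implementation used by either project listed above.

```python
from collections import defaultdict


def rank_over(rows, partition_by, order_by):
    """Return one rank per input row, in input order.

    Ranks follow SQL RANK() semantics with descending order:
    ties share a rank and leave a gap afterwards (1, 1, 3, ...).
    """
    # Group the ordering values by partition key.
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_by]].append(row[order_by])

    # Within each partition, a value's rank is the 1-based position
    # of its first occurrence in the sorted list.
    first_position = {}
    for key, vals in partitions.items():
        for position, value in enumerate(sorted(vals, reverse=True), start=1):
            first_position.setdefault((key, value), position)

    return [first_position[(row[partition_by], row[order_by])] for row in rows]


# Hypothetical sample rows.
rows = [
    {"dept": "a", "salary": 50},
    {"dept": "a", "salary": 70},
    {"dept": "b", "salary": 40},
    {"dept": "a", "salary": 70},
]
ranks = rank_over(rows, partition_by="dept", order_by="salary")
print(ranks)
```

In department `a` the two rows with salary 70 tie at rank 1, so the row with salary 50 gets rank 3 rather than 2.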
