Copyright © 2013, Oracle and/or its affiliates. All rights reserved. 1.

2 Oracle In-Database MapReduce: When Hadoop Meets Exadata
Kuassi Mensah, Director, Product Management

3 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

4 Agenda
• Big Data & In-Database MapReduce (current section)
• SQL MapReduce
• In-Database Container for Hadoop
• Oracle’s Big Data Solution

6 Big Data Concept
[Diagram labels: Any Data; MapReduce Infrastructure; MapReduce (phase I); RDBMS; Data Mining (phase II); MapReduce convention: process data locally]

7 Big Data In Real Life, Today
[Diagram labels: Unstructured Data (HDFS, NoSQL, etc.); MapReduce Infrastructure; MapReduce (phase I); Structured Data; RDBMS; Data Mining (phase II)]

8 Problems with Big Data Today
• Shipping Data from RDBMS to MapReduce Infrastructure
– Too Big to Move
– Operational Issues
– Data Correctness/Loss
– Lack of Enterprise Class Security on MapReduce Infrastructure
– Breaking MapReduce Convention
– Cost of MapReduce Infrastructure or Storage
– Lack of MapReduce Development Skills
– Lack of MapReduce Deployment Skills

9 Big Data with In-Database MapReduce
[Diagram labels: Hadoop Cluster; Unstructured Data (HDFS, NoSQL, etc.); Structured Data (RDBMS); RDBMS; Data Mining; MapReduce; Data Mining; In-Database MapReduce]

10 In-Database MapReduce Trends
• Hybrid Platforms: DBMS + MapReduce
• Projects/Products/Initiatives
– DataStax: Cassandra + Hadoop
– Hadapt HadoopDB: PostgreSQL + Hadoop
– Greenplum HD
– MongoDB MapReduce: JavaScript
– Aster Data / Teradata
• Limitations
– Dependency on a Hadoop infrastructure in addition to the DBMS
– Source compatibility: need to rewrite Hadoop jobs in a different language

11 Oracle’s Big Data Strategy
[Diagram labels: MapReduce APIs Across Data Infrastructure; Hadoop, R, SQL; Weblogs; Sales Records; RDBMS (In-Database MapReduce); Big Data Appliance]

12 Oracle In-Database MapReduce
Integration with Oracle Big Data Solution

13 Oracle In-Database MapReduce
Feature of Oracle Database 12c releases
• In-Database Container for Hadoop (currently Beta)
• SQL MapReduce (12.1.0.1)

14 Agenda
• Big Data & In-Database MapReduce
• SQL MapReduce (current section)
• In-Database Container for Hadoop
• Oracle’s Big Data Solution

15 SQL MapReduce: Declarative MR Analytics
Collection of Existing and New Features
• SQL analytic functions
• User-defined aggregate functions
• Parallel pipelined table functions
• SQL Pattern Matching: MATCH_RECOGNIZE (new)

16 SQL Pattern Matching
SQL Pattern Matching provides expressive syntax and fast execution for pattern matching.
New SQL construct: MATCH_RECOGNIZE
Define patterns using regular expression syntax.
Examples:
– Find event A (“privilege revoked”) followed by 3 or more occurrences of event B (“attempted login”) within 1 minute
– Find 10-day periods where a stock price has “double-bottomed”
[Chart: stock price over days]

17 SQL Pattern Matching: Sessionization
SELECT user_id, session_id, start_time, no_of_events, duration
FROM Events MATCH_RECOGNIZE (
  PARTITION BY User_ID
  ORDER BY Time_Stamp
  MEASURES match_number() session_id,
           count(*) as no_of_events,
           first(time_stamp) start_time,
           last(time_stamp) - first(time_stamp) duration
  PATTERN (b s*)
  DEFINE s as (s.Time_Stamp - prev(Time_Stamp) <= 10)
)
ORDER BY user_id, session_id;
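The sessionization rule in the query above can be read as: within one user's events, ordered by time stamp, an event extends the current session when it is at most 10 time units after the previous event, otherwise it starts a new session. The following is a minimal plain-Java sketch of that rule for a single user. It is not Oracle code; the class `SessionizeSketch`, its `Session` type, and the `maxGap` parameter are hypothetical names introduced only for illustration, and the unit of the gap depends on the type of `Time_Stamp`, which the slide does not state.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the sessionization rule; not part of any Oracle API. */
class SessionizeSketch {

    /** One row per session, mirroring the MEASURES clause of the query. */
    static final class Session {
        final int sessionId;    // match_number()
        final int noOfEvents;   // count(*)
        final long startTime;   // first(time_stamp)
        final long duration;    // last(time_stamp) - first(time_stamp)

        Session(int sessionId, int noOfEvents, long startTime, long duration) {
            this.sessionId = sessionId;
            this.noOfEvents = noOfEvents;
            this.startTime = startTime;
            this.duration = duration;
        }

        @Override
        public String toString() {
            return "session " + sessionId + ": events=" + noOfEvents
                    + ", start=" + startTime + ", duration=" + duration;
        }
    }

    /**
     * Splits one user's time stamps (assumed sorted ascending) into sessions.
     * An event joins the current session if it follows the previous event
     * by at most maxGap; otherwise it opens a new session.
     */
    static List<Session> sessionize(long[] sortedTimeStamps, long maxGap) {
        List<Session> sessions = new ArrayList<>();
        if (sortedTimeStamps.length == 0) {
            return sessions;
        }
        long start = sortedTimeStamps[0];
        long prev = start;
        int count = 1;
        for (int i = 1; i < sortedTimeStamps.length; i++) {
            long ts = sortedTimeStamps[i];
            if (ts - prev <= maxGap) {
                count++;                      // row matches pattern variable s
            } else {
                // Close the current session and start a new one (pattern variable b).
                sessions.add(new Session(sessions.size() + 1, count, start, prev - start));
                start = ts;
                count = 1;
            }
            prev = ts;
        }
        sessions.add(new Session(sessions.size() + 1, count, start, prev - start));
        return sessions;
    }

    public static void main(String[] args) {
        long[] timeStamps = {1, 5, 12, 40, 45, 100};
        for (Session s : sessionize(timeStamps, 10)) {
            System.out.println(s);
        }
    }
}
```

In the SQL version, `PARTITION BY User_ID` applies this logic per user and the database can evaluate partitions in parallel; the sketch handles one partition only.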

18 DEMO

19 Agenda
• Big Data & In-Database MapReduce
• SQL MapReduce
• In-Database Container for Hadoop (current section)
• Oracle’s Big Data Solution

20 Vanilla Hadoop
[Diagram labels: Mappers; Reducers; Materialization of intermediate data; Hadoop Cluster; Physical partitions (DataNodes)]

21 In-Database Container for Hadoop: Components
• Apache Hadoop
• Task execution: In-Database JVM
• Data partitioning & task scheduling: PQ engine
• Data storage: table, external table, object view
• Data type mapping: TableReader, TableWriter

22 In-Database Container for Hadoop
[Diagram labels: Mapper processes; Reducer processes; Pipelining of intermediate data; Table partitions; Parallel DML; RDBMS Server]

23 In-Database Container for Hadoop vs Vanilla Hadoop
[Diagram labels: Mappers; Reducers; Materialization vs pipelining of intermediate data; Physical vs logical data partitions; Parallel DML; RDBMS Server; Hadoop Cluster]

24 In-Database Container for Hadoop: Summary
• A “Hadoop container” in the RDBMS engine: no Hadoop cluster required.
• Data processing in situ: no need to ship data to a separate infrastructure.
• API and source compatibility: accepts Hadoop Mappers and Reducers as-is.
• Java interface: invoke Hadoop jobs à la vanilla Hadoop.
• SQL interface: Map and Reduce steps in SQL statements.

25 In-Database Container for Hadoop: SQL and Java interfaces
SQL interface:
SELECT * FROM TABLE
  (HREDUCE_JP_WORDCOUNT(:ConfKey, CURSOR(
     SELECT * FROM TABLE
       (HMAP_JP_WORDCOUNT(:ConfKey, CURSOR(SELECT * from InTable))))))
Java interface:
public class WordCount {
  public static void main() throws Exception {
    /* Set up the parameters and run the job */
    ……
    job.init();
    job.run();
  }
}
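In the SQL form, the rows of InTable flow into a map table function, and its output flows into a reduce table function. As a rough illustration of what those two steps compute for word count, here is a plain-Java sketch. It deliberately uses no Hadoop or Oracle APIs: the class `WordCountSketch` and its `map` and `reduce` methods are hypothetical stand-ins, not the Mapper and Reducer classes the container would actually load.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for the map and reduce steps of word count. */
class WordCountSketch {

    /** Map step: emit a (word, 1) pair for every whitespace-separated token. */
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(new SimpleEntry<>(word, 1));
                }
            }
        }
        return pairs;
    }

    /** Reduce step: sum the values of all pairs that share a key. */
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> inTable = Arrays.asList("a b a", "b c");
        // Same nesting as the SQL statement: reduce(map(input)).
        System.out.println(reduce(map(inTable)));
    }
}
```

The call `reduce(map(inTable))` mirrors the nesting of `HREDUCE_JP_WORDCOUNT` around `HMAP_JP_WORDCOUNT`; the sketch does not model parallel execution or the job configuration passed as `:ConfKey`.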

26 DEMO

27 Pipelining Hadoop Jobs Through the SQL Interface
Pipelining Hadoop steps without intermediate materialization:
select * from table
  (HREDUCE_JP_JOB2 (:ConfKey2,....
    (HMAP_JP_JOB2 (:ConfKey2,....
      (HREDUCE_JP_JOB1 (:ConfKey1,....
        (HMAP_JP_JOB1 (:ConfKey1,...), ))));
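The nesting in this statement is function composition: the output of each step is consumed directly as the input of the next, so job 2 reads job 1's result without an intermediate table. The sketch below shows only that shape, using Java streams and two made-up jobs (word count, then a histogram of the counts). The class `ChainedJobsSketch` and all method names are hypothetical. The reduce steps here group in memory, so how the database actually streams rows between steps is not modeled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration of chaining two map/reduce jobs by composition. */
class ChainedJobsSketch {

    /** Job 1, map: split lines into words (evaluated lazily by the stream). */
    static Stream<String> map1(Stream<String> lines) {
        return lines
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                .filter(word -> !word.isEmpty());
    }

    /** Job 1, reduce: count occurrences per word. */
    static Stream<Map.Entry<String, Long>> reduce1(Stream<String> words) {
        Map<String, Long> counts = words.collect(
                Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
        return counts.entrySet().stream();
    }

    /** Job 2, map: keep only each word's count. */
    static Stream<Long> map2(Stream<Map.Entry<String, Long>> wordCounts) {
        return wordCounts.map(Map.Entry::getValue);
    }

    /** Job 2, reduce: for each count n, how many distinct words occur n times. */
    static Map<Long, Long> reduce2(Stream<Long> counts) {
        return counts.collect(
                Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    /** Same nesting order as the SQL statement: reduce2(map2(reduce1(map1(input)))). */
    static Map<Long, Long> run(List<String> lines) {
        return reduce2(map2(reduce1(map1(lines.stream()))));
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a b a", "b c")));
    }
}
```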

28 In-Database Container for Hadoop: Projected Features
• Reuse Mappers & Reducers (including R-generated)
• Dynamic Data Partitioning
• Apache Hadoop API 2.00
• Custom Writables Hadoop types
• Serialized Data Formats
• InputFormats: HDFS, HBase, Others
• Java interface (similar to the vanilla Hadoop driver)
• SQL interface: Hadoop job steps in SQL queries
• Mahout

29 Develop/Deploy with In-Database Container for Hadoop
• Develop Hadoop Mappers & Reducers from scratch
• Create or update the Hadoop job configuration file
• Reuse existing Mappers & Reducers
• Load all Java code in the RDBMS and create Call Specs
• Invoke the Hadoop job via the Java or SQL interface
• Populate the output table with parallel INSERT

30 Agenda
• Big Data & In-Database MapReduce
• SQL MapReduce
• In-Database Container for Hadoop
• Oracle’s Big Data Solution (current section)

31 Oracle’s Big Data Solution
[Diagram labels: Acquire, Organize, Analyze, Decide; Oracle Big Data Appliance; Oracle Exadata; Oracle Exalytics; InfiniBand; Oracle Real-Time Decisions; Oracle Endeca Information Discovery]

32 Oracle In-Database MapReduce Summary
• Declarative analytics (SQL MapReduce)
• Programmatic analytics (complex algorithms, Hadoop)
• MapReduce job steps in SQL queries
• Custom extensions (InputFormats)
• RDBMS QoS (e.g., Enterprise Class Security)
• Developer- and DBA-friendly
• Seamless integration with Oracle’s Big Data solution
