
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. What’s New in Oracle Database 12c Graph Database Xavier Lopez, Ph.D. Senior Director.


1 What’s New in Oracle Database 12c Graph Database
Xavier Lopez, Ph.D., Senior Director
Zhe Wu, Ph.D., Architect

2 Agenda
- Graph Database Strategy
- Customer Use Cases
- Oracle Spatial and Graph RDF Graph Features
- Future Plans

3 Graph Database Strategy
Support graph data types on all enterprise platforms:
- Oracle Database
- Oracle NoSQL Database
- Oracle Big Data Appliance
- Oracle Cloud

4 What Sets Us Apart?
- Scalability: trillions of triples
- Transactional: concurrent loading and updates with ACID properties
- Security: Oracle Label Security (OLS) labels at the triple level
- Standards based: W3C
- Manageable: use existing DB tools, utilities and expertise
- Multi-type support: graph, relational, search, geospatial, …
- Multi-platform: relational database, NoSQL, Hadoop

5 RDF Graph v. Property Graph
RDF Semantic Graphs
- Use case: linked data, semantic metadata layer
- Analytics: pattern matching, inferencing
- Analytics execution: in-database
Property Graphs
- Use case: social network analysis
- Analytics: clustering, centrality, page rank, path finding
- Analytics execution: in-memory, in-database
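To make the contrast concrete, here is a minimal sketch of the same small social network in both models, followed by a breadth-first path search of the kind listed under property-graph analytics. All names (`ex:alice`, `since`, `shortest_path`) are hypothetical and chosen for illustration only; this is not Oracle's API or storage layout.

```python
from collections import deque

# RDF view: every fact is a (subject, predicate, object) triple.
triples = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:name", '"Alice"'),
}

# Property-graph view: vertices and edges both carry key/value properties.
vertices = {
    "alice": {"name": "Alice"},
    "bob": {"name": "Bob"},
    "carol": {"name": "Carol"},
}
edges = [
    ("alice", "knows", "bob", {"since": 2010}),
    ("bob", "knows", "carol", {"since": 2012}),
]

def shortest_path(edges, start, goal):
    """Breadth-first search over directed edges; returns a vertex list or None."""
    adjacency = {}
    for src, _label, dst, _props in edges:
        adjacency.setdefault(src, []).append(dst)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adjacency.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(shortest_path(edges, "alice", "carol"))
```

Note that the edge property `since` has no direct counterpart in a plain triple; in RDF it would need an extra node or a named graph.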

6 RDF Semantic Graph feature of Oracle Spatial and Graph
For Oracle Database 12c

7 Two Application Use Cases: Linked Data and Entity Analytics
- Find related content & relations by navigating connected entities
- “Reason” across entities
- Unified metadata model for distributed data sources
- Flexible model for sparse and evolving data
- Validate semantic and structural consistency
- SPARQL pattern matching
- Detecting related entities across large, sparse, disparate collections of data
- Inferencing: applying rules on asserted data
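As an illustration of what "SPARQL pattern matching" means, the following sketch evaluates a basic graph pattern (a conjunction of triple patterns with `?variables`) against an in-memory list of triples. It is a naive nested-loop matcher written for clarity, not how the database executes SPARQL; the `ex:` identifiers and the function name `match_pattern` are assumptions.

```python
def match_pattern(triples, pattern, binding=None):
    """Yield variable bindings that satisfy every triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match_pattern(triples, rest, candidate)

graph = [
    ("ex:p1", "ex:memberOf", "ex:g1"),
    ("ex:p1", "ex:residesIn", "ex:uk"),
    ("ex:p2", "ex:memberOf", "ex:g1"),
    ("ex:p2", "ex:residesIn", "ex:fr"),
    ("ex:p3", "ex:memberOf", "ex:g2"),
]

# Roughly: SELECT ?person ?country
#          WHERE { ?person ex:memberOf ex:g1 . ?person ex:residesIn ?country }
query = [
    ("?person", "ex:memberOf", "ex:g1"),
    ("?person", "ex:residesIn", "?country"),
]
results = list(match_pattern(graph, query))
print(results)
```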

8 Graph-based Metadata Layer
Linked Data in Support of Distributed Data
- W3C standard, flexible model for sparse and evolving data
- Common vocabulary enables data integration & app development
- Relational data stays in place, apps don’t need to change
[Diagram labels: Database Server (HR Database, Sales Database, Inventory Database; HR Schema, Inventory Schema, Sales Schema); Mid-Tier Server (Application 1, Application 2, Application 3); SQL; RDF Graph (Inventory Graph, Sales Graph); Shared Ontologies; SPARQL]

9 Linked Data in Enterprise
[Diagram labels: Access & Presentation Layer; Index; Semantic Graph model; Data Servers (Content Mgmt, BI Server, Data Warehouse, Hadoop Appliance, Event Server); Data Sources / Types (Machine Generated Data, Transaction Systems, Subscription Services, Human Sourced Information, Social Media)]

10 Linked Data / Enterprise Metadata
Industries: Life Sciences; Finance; Media; Networks & Communications; Defense & Intelligence; Police
Hutchinson 3G Austria

11 Novartis Institutes for BioMedical Research (NIBR)
Business Challenge
- Link database information on genes, proteins, metabolic pathways, compounds, ligands, etc. to original sources.
- Increase productivity for accessing, sharing, searching, navigating, cross-linking, analyzing internal/external data
Solution
- Semantic integration layer on RDF graph
- Rich domain-specific terminology (biology, chemistry and medicine): 1.6 M terms
- Terminology Hub: 8 GB of referential data that cross-references between data repositories.

12 RDF Semantic Graph-based Applications: Linked Data and Entity Analytics
- Find related content & relations by navigating connected entities
- “Reason” across entities
- Unified metadata model for distributed data sources
- Flexible model for sparse and evolving data
- Validate semantic and structural consistency
- SPARQL pattern matching
- Detecting related entities across large, sparse, disparate collections of data
- Inferencing: applying rules on asserted data

13 Knowledge Management in Intelligence Domain
National Intelligence Scenario
[Diagram labels, pipeline: Data Sources (Contents Repository; Databases; Web resources; Blogs, Mails, news, RSS feeds; Enterprise Data; Spatial; Documents) / Information Extraction (Feature Extraction, Term Extraction) / Extracted Entities & Relationships / RDF + Intelligence Ontologies / SQL/SPARQL / Search, Presentation, Report, Visualization, Query]
[Diagram labels, example graph:
- Person: Abduwali Abdukhadir Muse; Nationality: Somalian; Country: UK; Group: Al Shabab; Ideology: Islamist
- Person: ?; Nationality: Pakistani; Country: Pakistan; Group: ?
- Person: Chehab Abdouljamid Bouyaly; Country: Morocco; Group: al Qaeda
- Edge labels: Currently resides; Member of; Supports; Link ?; Has images]

14 Oracle Spatial and Graph RDF Semantic Graph Features

15 Oracle Database 12c RDF Semantic Graph
Database
- Exadata ready
- Compression & partitioning
- Parallel load, inference, query
- High availability
- Label security: triple-level
- W3C standards compliance
- Semantic indexing of text
- Enterprise Manager
Load / Storage
- Native RDF graph data store
- Manages billions of triples
- Optimized storage architecture
Query
- SPARQL: Jena/Joseki, Sesame
- SQL/graph query, B-tree indexing
- Ontology-assisted SQL query
Reasoning
- RDFS, OWL2 RL, EL, SKOS
- User-defined rules
- Incremental, parallel reasoning
- User-defined inferencing
- Plug-in architecture
Analytics
- Semantic indexing framework
- Integration with OBIEE, Oracle R Enterprise
- Oracle Data Mining

16 Support for Apache Jena and OpenRDF Sesame
Provides application developers with:
- Easy-to-use Java APIs to access Oracle databases and RDF files
- A standard-compliant SPARQL web service endpoint (Joseki, Fuseki)
- Data loading (RDF/XML, N-TRIPLES, N-QUADS, TriG, Turtle)
- JSON output
- Oracle-specific extensions for query execution control and management
Leverage existing investments in open source frameworks

17 Relational to RDF Mapping (RDB to RDF)
- RDF views on relational tables
- Enables SPARQL query on distributed resources
- Views: automatic and custom
- Aligns with W3C RDB2RDF standard
- No duplication of data and storage
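The idea of an RDF view over relational rows can be sketched as a generator that derives triples on demand instead of storing a second copy. The IRI scheme below is a simplification loosely modeled on the W3C Direct Mapping, and the base IRI, table name, and function name are hypothetical; it does not reproduce the Oracle RDF view API.

```python
def rows_to_triples(table, rows, key, base="http://example.org/"):
    """Lazily derive triples from relational rows; nothing is materialized."""
    for row in rows:
        subject = f"<{base}{table}/{row[key]}>"
        # One triple states which table (class) the row belongs to.
        yield (subject, "rdf:type", f"<{base}{table}>")
        # One triple per non-key, non-NULL column value.
        for column, value in row.items():
            if column != key and value is not None:
                yield (subject, f"<{base}{table}#{column}>", f'"{value}"')

employees = [
    {"id": 1, "name": "Ann", "dept": None},
    {"id": 2, "name": "Raj", "dept": "Sales"},
]
for triple in rows_to_triples("EMP", employees, key="id"):
    print(triple)
```

Because the triples are computed from the rows each time, an update to the relational data is visible to the next query without any reload step, which is the point of "no duplication of data and storage".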

18 Oracle Label Security Data Classification
- Fine-grained security through integration with Oracle Label Security
- Model-level security through GRANT/REVOKE privileges
- Oracle Label Security: mandatory access control
- Labels are assigned to both users and data
- Data labels determine the sensitivity of the rows, or the rights a person must possess in order to read or write the data.
- User labels indicate their access rights to the data records.
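The label comparison can be illustrated with a deliberately reduced model: each triple carries one sensitivity level and a user sees only triples at or below their own level. Real Oracle Label Security labels also include compartments and groups, which this sketch omits; the level names and the function `visible_triples` are hypothetical.

```python
# Hypothetical ordered sensitivity levels (higher number = more sensitive).
LEVELS = {"PUBLIC": 0, "CONFIDENTIAL": 1, "SECRET": 2}

def visible_triples(labeled_triples, user_label):
    """Return the triples whose data label does not exceed the user's label."""
    clearance = LEVELS[user_label]
    return [t for t, label in labeled_triples if LEVELS[label] <= clearance]

labeled = [
    (("ex:p1", "ex:worksFor", "ex:acme"), "PUBLIC"),
    (("ex:p1", "ex:salaryBand", '"B"'), "CONFIDENTIAL"),
    (("ex:p1", "ex:underInvestigation", '"true"'), "SECRET"),
]
print(visible_triples(labeled, "CONFIDENTIAL"))
```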

19 Core Inferencing Features
- Forward-chaining based inference engine in the database
- Native rulebases: RDFS, OWL 2 RL, OWL 2 EL, SKOS
- Validation of inferred data
- Proof generation
- User-defined inferencing: temporal reasoning, spatial reasoning
- Ladder-based inference: fine-grained security for inference graph
- Integration with external OWL 2 reasoners (TrOWL, Pellet)
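Forward chaining means applying rules to the asserted triples repeatedly until no new triples appear. The sketch below does this for two RDFS rules only (transitivity of `rdfs:subClassOf`, and propagation of `rdf:type` along `rdfs:subClassOf`); it is a naive fixpoint loop for illustration under those assumptions, not the in-database engine, and the `ex:` names are hypothetical.

```python
def forward_chain(asserted):
    """Return the triples inferred from `asserted` under two RDFS rules."""
    closure = set(asserted)
    while True:
        new = set()
        for (s, p, o) in closure:
            if p != "rdfs:subClassOf":
                continue
            for (s2, p2, o2) in closure:
                # (s subClassOf o), (o subClassOf o2) => (s subClassOf o2)
                if p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdfs:subClassOf", o2))
                # (s2 type s), (s subClassOf o) => (s2 type o)
                if p2 == "rdf:type" and o2 == s:
                    new.add((s2, "rdf:type", o))
        new -= closure
        if not new:
            # Fixpoint reached: report only what was not asserted.
            return closure - set(asserted)
        closure |= new

asserted = {
    ("ex:Professor", "rdfs:subClassOf", "ex:Faculty"),
    ("ex:Faculty", "rdfs:subClassOf", "ex:Person"),
    ("ex:ann", "rdf:type", "ex:Professor"),
}
inferred = forward_chain(asserted)
print(sorted(inferred))
```

Keeping the inferred triples separate from the asserted ones mirrors the slide's distinction between asserted data and an entailment.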

20 RDF Semantic Graph: Graph Visualization & Modeling Support
- Graph visualization: Cytoscape
- Semantic modeling: Protégé

21 Analyzing RDF with Oracle BI and Oracle Advanced Analytics
[Panel titles: Oracle BI; Oracle Advanced Analytics]

22 Oracle Partner Tools: IO Informatics

23 Oracle Partner Tools: Tom Sawyer Social Network Analysis

24 Manageability of RDF Semantic Graph
Built-in support from Oracle Database utilities and tools
Manage
- Control query execution: in database & Jena client
- Create & monitor graph w/ SQL Developer: semantic network, models, virtual models, B-tree indexes, rule bases, entailments, security data labels, semantic index policies
Tune / Analyze
- Tune load / query / inference: parallelism, B-tree indexing (triple/quad), typed literals indexing, SPARQL query hints, statistics gathering, dynamic sampling
- Analyze performance: Enterprise Manager (view optimizer plans, monitor execution / resource usage)
Ingest / Replicate / Recover
- Bulk load: Apache Jena bulk loader; Oracle external tables & SQL*Loader (Direct Path) w/ PL/SQL Bulk Load API
- Replicate & recover: Data Guard (physical standby); Data Pump (staging tables); Recovery Manager (RMAN)

25 Open Geospatial Consortium: GeoSPARQL Support
Defines a vocabulary for spatial query patterns
- Classes: Spatial Object, Feature, Geometry
- Properties: topological relations; links between features and geometries
- Datatypes for geometry literals: ogc:wktLiteral, ogc:gmlLiteral
Query functions
- Topological relations, distance, buffer, intersection, …

26 RDF Graph for Oracle NoSQL: Graph Support on Oracle NoSQL DB
RDF Graph support in Oracle NoSQL Database Enterprise Edition
- High-performance key-value store
- SPARQL 1.1 access to graph data
- Jena & Joseki SPARQL Web Services
- Massive horizontal scalability
- Support for World Wide Web Consortium (W3C) Semantic Web standards
Brings horizontal scalability to RDF graph applications

27 RDF Graph for Oracle NoSQL: When to Consider a NoSQL Database for Graphs
- High volume, simple queries (low latency)
- Queries aggregating over most of the graph (e.g. what are the hobbies of the 100 most popular people in the network)
- Frequent, large-scale updates
- Large data centers
Horizontal scalability, low query latency/cost, ease of install & management

28 Quick Steps to Get Started

29 Quick Steps to Get Started
- Install Oracle Database 12c or use a prebuilt VM from OTN
- Initialize
  - Create a tablespace 'ts'
  - Run as SYS in SQL*Plus: exec sem_apis.create_sem_network('ts')
  - Run as SYS (for only) in SQL*Plus: exec mdsys.enableGeoRaster;
- Configure Joseki/Fuseki web service endpoint: SPARQL Query, SPARQL Update, REST APIs
- Using Java APIs: load/query/inference through GraphOracleSem, DatasetGraphOracleSem, OracleBulkUpdateHandler, …
- Using SQL/PLSQL APIs: exec create_sem_model, insert/delete triples, bulk load, run SEM_MATCH, create_entailment, …


32 Performance

33 Oracle Spatial and Graph: LUBM 200K on 3-Node RAC X2-4
Load, Inference and Query Performance
- The LUBM 200K graph has 48+ billion triples (edges)
  - Original graph has 26.6 billion unique triples (quads)
  - Inference produced another 21.4 billion triples
- Data loading performance: Triples Loaded and Indexed Per Second (TLIPS): 273K
- Inference performance: Triples Inferred and Indexed Per Second (TIIPS): 327K
- SPARQL query performance: Query Results Per Second (QRPS): 459K
Setup
- Hardware: Sun Server X2-4, 3-node RAC
  - Each node configured with 1TB RAM, 4 CPU 2.4GHz 10-Core (Intel E7-4870)
  - Storage: Dual Node 7420, both heads configured as: Sun ZFS Storage, CPU 2.00GHz 8-Core (Intel E7-4820), 256G memory, 4x SSD SATA2 512G (READZ), 2x SATA 500G 10K. Four disk trays with 20 x 900GB, 4x SSD 73GB (WRITEZ)
- Software: Oracle Database, SGA_TARGET=750G and PGA_AGGREGATE_TARGET=200G
Note: Only one node in this RAC was used for the performance test. Test performed in April.
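The figures on this slide can be cross-checked with simple arithmetic. The sketch below sums the asserted and inferred triple counts and converts the per-second rates into implied elapsed hours, which come out at roughly 27 hours for loading and 18 hours for inference. It assumes the quoted rates are sustained averages over the whole run, which the slide does not state.

```python
# Figures quoted on the slide.
asserted_triples = 26.6e9   # original graph
inferred_triples = 21.4e9   # produced by inference
tlips = 273e3               # triples loaded and indexed per second
tiips = 327e3               # triples inferred and indexed per second

total_triples = asserted_triples + inferred_triples

# Assumption: each rate is an average over its entire run.
load_hours = asserted_triples / tlips / 3600
inference_hours = inferred_triples / tiips / 3600

print(f"total triples: {total_triples / 1e9:.1f} billion")
print(f"implied load time: {load_hours:.1f} h")
print(f"implied inference time: {inference_hours:.1f} h")
```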

34 Oracle Spatial and Graph: LUBM 4400K on Exadata X4-2
Load, Inference and Query Performance
1.08 trillion edges graph
Results (data set: LUBM 4400K; degree of parallelism: 256 *)
- Load (B triples/hr): 605.4B / 115.2hrs
- OWL Inference (B triples/hr): B / 86hrs 30m
- Query (B answers/hr): 92.5B / 22.5 hrs
* A mix of DOP used: 296, 256, 192
- Data loading performance: Triples Loaded and Indexed Per Second (TLIPS): 1.420M
- Inference performance: Triples Inferred and Indexed Per Second (TIIPS): 1.527M
- SPARQL query performance: Query Results Per Second (QRPS): 1.130M
Setup
- Exadata X4-2 high capacity full rack
- ZS3-2 with 2 controllers, 8 trays of disk
- Eight compute nodes of Exadata
- Oracle DB standard install of Exadata
- Open cursors = 1000, Processes = 1000
- SGA = 132GB, PGA = 100GB
- 32K blocksize was given to all graph tablespaces
- TEMP group was created with 3 bigfile tablespaces
- Test performed in Aug/Sept

35 Best Practices in Solving Performance Issues
When there is an underperforming SQL in RDF data loading, inference, or query operations, check:
- Have you gathered statistics? APIs: export_model_stats, export_entailment_stats, export_network_stats, import_model_stats, import_entailment_stats, import_network_stats
- Have you tried parallel execution? Balanced hardware is key.
- Have you tried dynamic sampling? (Level 6, 8, 11)
- Is there a lack of indexes (including text index)? DO NOT just add indexes without careful & thorough testing

36 Best Practices in Solving Performance Issues (2)
When there is an underperforming SQL in RDF data loading, inference, or query operations, check:
- Have you looked at the plan?
- Is it possible to write the same query in a different way?
- Is it possible to simplify? Simpler queries: better chance to find more efficient ways to execute
- Tweak the plan through hints
- Send a small, reproducible test case with the execution plan to Oracle Support or post it on the Forum

37 Best Practices in Solving Performance Issues (3)
- Find the top thread(s) in the Java VM
- Are there excessive GC activities? Try -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC, …
- Has the heap size been set properly? Try a larger heap size; analyze the heap by performing a heap dump
- Send a small, reproducible test case with the thread dump to Oracle Support or post it on the Forum

38 Cool Ongoing Activities
- Enable Oracle Cloud Services: Oracle Social Network
- Integration with Oracle business applications and middleware
- Ongoing support for RDF Graph on all major platforms: Relational Database, NoSQL Database, Big Data (Hadoop), Cloud


40 Appendix

41 W3C Semantic Technology Stack
Core Technologies
- URI: Uniform Resource Identifier
- RDF: Resource Description Framework
- RDFS: RDF Schema
- OWL: Web Ontology Language

42 What is RDF
- A graph data model for web resources and their relationships
- The graph can be serialized into RDF/XML, N3, N-TRIPLE, …
- Construction unit: triple (or assertion, or fact)
- Quads (named graphs) add context, provenance, identification, etc. to assertions
[Diagram labels: Subject, Predicate, Object; example object literal “CA”]
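A small sketch of the triple and quad construction units, serialized in the line-based N-Triples / N-Quads style. The IRIs and the `Quad` type are hypothetical; only the literal "CA" comes from the slide. Object terms are passed already formatted (a quoted literal or an IRI in angle brackets), which keeps the example short at the cost of no escaping or validation.

```python
from typing import NamedTuple, Optional

class Quad(NamedTuple):
    subject: str                 # IRI without angle brackets
    predicate: str               # IRI without angle brackets
    obj: str                     # pre-formatted: '"literal"' or '<iri>'
    graph: Optional[str] = None  # named graph IRI; None means a plain triple

def to_line(q: Quad) -> str:
    """Serialize one statement as an N-Triples line, or N-Quads if graph is set."""
    parts = [f"<{q.subject}>", f"<{q.predicate}>", q.obj]
    if q.graph:
        parts.append(f"<{q.graph}>")
    return " ".join(parts) + " ."

triple = Quad("http://example.org/person/1", "http://example.org/livesIn", '"CA"')
quad = triple._replace(graph="http://example.org/graph/census")

print(to_line(triple))
print(to_line(quad))
```

The fourth term is what lets an application record where an assertion came from without changing the assertion itself.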

