Presentation is loading. Please wait.

Presentation is loading. Please wait.

Oracle Database Semantic Technologies: An Overview of Core and Enterprise Functionality

Similar presentations


Presentation on theme: "Oracle Database Semantic Technologies: An Overview of Core and Enterprise Functionality"— Presentation transcript:

1

2 Oracle Database Semantic Technologies: An Overview of Core and Enterprise Functionality
Souri Das, Ph.D. Architect, Oracle Feb 2012

3 Outline Fundamentals Overview Detailed look
Semantic Technologies in a nutshell Source for RDF data OWL Inferencing Primer Overview Architecture Core Functionality: Load, Infer, Query Enterprise Functionality Tools: OBIEE, Visualization Detailed look Storage Installation and configuration Loading Querying in SQL & SPARQL Inferencing Semantic Indexing of Unstructured Content Security 3

4 Semantic Technologies in a nutshell
Data (expressed in RDF) <entity, attr, value> triplets (similar to: unpivoted tables) entity typically is associated with a type (rdf:type <employee>) May be extended to quads: a grouping of triples BENEFIT: uniform structure => allows syntactic integration Schema / Ontology (expressed in OWL) extended with class and property hierarchies Many other such “rules” BENEFIT => allows discovery of implicit knowledge BENEFIT => allows semantic integration Query (expressed in SPARQL) BENEFIT => Suits the triple or quad-based structure of RDF

5 Source for RDF data Native RDF Converted to RDF Viewed as RDF
Example: Social network Converted to RDF Example: Tables (via unpivoting) Example: XML Example: Text (via NLP extraction) Example: Spatial (ogc:WKTLiteral) Example: Multimedia (via extraction) Viewed as RDF Example: Tables

6 OWL Inferencing: A short Primer
rdfs:subClassOf rdfs:subPropertyOf rdfs:domain rdfs:range owl:FunctionalProperty owl:InverseFunctionalProperty owl:SymmetricProperty owl:TransitiveProperty owl:inverseOf owl:equivalentClass owl:equivalentProperty owl:disjointWith owl:complementOf owl:sameAs owl:differentFrom owl:someValuesFrom owl:allValuesFrom owl:hasValue

7 Inference: Examples :hasMother rdf:type owl:FunctionalProperty
owl:InverseFunctionalProperty owl:SymmetricProperty owl:TransitiveProperty owl:inverseOf :hasMother rdf:type owl:FunctionalProperty :John :hasMother :Mary :John :hasMother :Maria => :Mary owl:sameAs :Maria :Maria owl:sameAs :Mary

8 Inference: Examples :hasSSN rdf:type owl:InverseFunctionalProperty
owl:FunctionalProperty owl:InverseFunctionalProperty owl:SymmetricProperty owl:TransitiveProperty owl:inverseOf :hasSSN rdf:type owl:InverseFunctionalProperty :John :hasSSN :Johny :hasSSN => :John owl:sameAs :Johny :Johny owl:sameAs :John

9 Inference: Examples :hasSibling rdf:type owl:SymmetricProperty
owl:FunctionalProperty owl:InverseFunctionalProperty owl:SymmetricProperty owl:TransitiveProperty owl:inverseOf :hasSibling rdf:type owl:SymmetricProperty :John :hasSibling :Mary => :Mary :hasSibling :John

10 Inference: Examples :hasAncestor rdf:type owl:TransitiveProperty
owl:FunctionalProperty owl:InverseFunctionalProperty owl:SymmetricProperty owl:TransitiveProperty owl:inverseOf :hasAncestor rdf:type owl:TransitiveProperty :John :hasAncestor :Mary :Mary :hasAncestor :Tom => :John :hasAncestor :Tom

11 Inference: Examples :hasParent owl:inverseOf :hasChild
owl:FunctionalProperty owl:InverseFunctionalProperty owl:SymmetricProperty owl:TransitiveProperty owl:inverseOf :hasParent owl:inverseOf :hasChild :John :hasParent :Mary => :Mary :hasChild :John

12 Inference: Examples :Male owl:disjointWith :Female
owl:equivalentClass owl:equivalentProperty owl:disjointWith owl:complementOf :Male owl:disjointWith :Female :John rdf:type :Male :Mary rdf:type :Female => :John owl:differentFrom :Mary :Mary owl:differentFrom :John

13 Inference: Examples :NonHuman owl:complementOf :Human
owl:equivalentClass owl:equivalentProperty owl:disjointWith owl:complementOf :NonHuman owl:complementOf :Human :Fish rdfs:subClassOf :NonHuman => :Fish owl:disjointWith :Human :Human owl:disjointWith :Fish

14 Inference: Examples :Player owl:equivalentClass _:c1
owl:someValuesFrom owl:allValuesFrom owl:hasValue :Player owl:equivalentClass _:c1 _:c owl:onProperty :participateIn _:c owl:someValuesFrom :Sports _:c rdf:type owl:Restriction :Soccer rdf:type :Sports :John :particpateIn :Soccer => :John rdf:type :Player

15 Inference: Examples :Vegetarian rdfs:subClassOf _:c1
owl:someValuesFrom owl:allValuesFrom owl:hasValue :Vegetarian rdfs:subClassOf _:c1 _:c owl:onProperty :eat _:c owl:allValuesFrom :Vegetable _:c rdf:type owl:Restriction :John rdf:type :Vegetarian :John :eat :bean => :bean rdf:type :Vegetable

16 Inference: Examples :EligibleToBePresident owl:equivalentClass _:c1
owl:someValuesFrom owl:allValuesFrom owl:hasValue :EligibleToBePresident owl:equivalentClass _:c1 _:c owl:onProperty :countryOfBirth _:c owl:hasValue :USA _:c rdf:type owl:Restriction :John :countryOfBirth :USA => :John rdf:type :EligibleToBePresident :Tom rdf:type :EligibleToBePresident => :Tom :countryOfBirth :USA

17 THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL PRODUCT DIRECTION
THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL PRODUCT DIRECTION. IT IS INTENDED FOR INFORMATION PURPOSES ONLY, AND MAY NOT BE INCORPORATED INTO ANY CONTRACT. IT IS NOT A COMMITMENT TO DELIVER ANY MATERIAL, CODE, OR FUNCTIONALITY, AND SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISION. THE DEVELOPMENT, RELEASE, AND TIMING OF ANY FEATURES OR FUNCTIONALITY DESCRIBED FOR ORACLE'S PRODUCTS REMAINS AT THE SOLE DISCRETION OF ORACLE. 17

18 Importance of W3C & OGC Semantic Standards
Key W3C Web Semantic Activities: W3C RDF Working Group W3C SPARQL Working Group W3C RDB2RDF Working Group (Editors of R2RML) W3C OWL Working group W3C Semantic Web Education & Outreach (SWEO) W3C Health Care & Life Sciences Interest Group (HCLS) W3C Multimedia Semantics Incubator group W3C Semantic Web Rules Language (SWRL) OGC GeoSPARQL Standard Working Group (Tech. Editor) RDB2RDF will provide a standard for creating an RDF view on tables GeoSPARQL will provide a standard for including location predicates in SPARQL queries 18

19 Architectural Overview
SPARQL Endpoints Joseki / Sesame OBIEE via SPARQL Gateway Tools Visualizer Cytoscape-based Java API support SPARQL: Jena / Sesame JDBC Java Programs SQL Interface SQLplus PL/SQL SQLdev. Programming Interface Ontology-assisted Query of Enterprise Data Query RDF/OWL data and ontologies INFER LOAD RDF/S User-def. OWLsubsets Bulk-Load Incr. DML Core functionality QUERY (SPARQL in SQL) 3rd-Party Callouts Reasoners: Pellet NLP Extractors Enterprise (Relational) data RDF/OWL data and ontologies Rulebases: OWL, RDF/S, user-defined Inferred RDF/OWL data RDF/OWL Oracle DB Security: Oracle Label Security Semantic Indexes 19 19

20 Semantic Store: Core Entities
Sem. Network  Dictionary and data tables. New entities: models, rule bases, and rules indexes (entailments). OWL and RDFS rule bases are preloaded. Oracle Database Rulebases & Vocabularies OWL subset RDF / RDFS Rule base m A1 A2 An R herman scott Application Tables Model  A model holds an RDF graph (set of S-P-O triples) M1 M2 Mn Models X1 X2 Xp Entailments Rulebase  Built-in or user-defined rulebase is a set of rules used for inferencing. Entailments  An entailment stores triples derived via inferencing. Values Application Table  Contains col. of object type sdo_rdf_triple_s, associated with an RDF model, to allow DML on the RDF model. Triples Semantic Network (MDSYS)

21 Mapping Core Entities to DB objects
Each core entity is mapped to a view object in the database: Sem. Store entity type Database View name Model m mdsys.RDFM_m Rulebase rb mdsys.RDFR_rb Rules Index (entailment) x mdsys.RDFI_x Virtual Model vm mdsys.SEMV_vm (allows duplicates) mdsys.SEMU_vm (unique) SELECT privilege for a core entity is directly related to SELECT privilege for corresponding view object. 21

22 Core Functionality: Load / Query / Inference
Oracle Database Load  Bulk load Incremental load Rulebases & Vocabularies OWL subset RDF / RDFS Rule base m A1 A2 An R herman scott Application Tables M1 M2 Mn Models X1 X2 Xp Entailments Query and DML  SPARQL (from Java/endpoint) Inference  Native support for OWL 2 RL, SNOMED (OWL 2 EL subset), OWLprime, SKOSCORE, etc. Named Graph (Local/Global) Inference User-defined rules Values Triples Semantic Network (MDSYS)

23 Enterprise Functionality: SQL / Sem. Indexing / Security
SPARQL query (embedding) in SQL Allows joining SPARQL results with relational data Allows use of rich SQL operators (such as aggregates) Semantic indexing Index consists of RDF triples extracted from documents stored (directly or indirectly) in a table column Extraction done by one or more 3rd party information extractors Security: Fine-Grained Access Control (for each triple) Uses Oracle Label Security (OLS) Each RDF triple has an associated sensitivity label Querying Text and Spatial data using SPARQL The base extractor policy has no Rule bases and user-defined models. Virtual extractor policy has the name of the one base extractor policy and a set of rule bases and/or user defined models (uses virtual models).

24 Tools: OBIEE for RDF data (using SPARQL Gateway)
Easy integration of RDF data with Business Intelligence (OBIEE) through SPARQL Gateway

25 Tools: Visualization using Cytoscape and SEM_ANALYSIS
Detail Graph Whole graph or a subgraph Summary Graph Static Summaries Representative Instance Particular Instance Dynamic Summaries Summary for SPARQL-pattern based dynamic subgraph (Summary-Detail) Hybrid Graph 25

26 Example 1: Detail Graph

27 Example 2: Detail SubGraphs

28 Example 3: Representative & Particular Instance Summary Graphs
Note: The right graph was manually re-arranged to simplify comparison.

29 Example 4: Dynamic Subset-based Summary

30 Example 5: Particular Instance also an example of hybrid (detail-summary) graph
Detail Edge Aggregate Edge (can be expanded Using expand Property) Absent Edge

31 Installation and Configuration of Oracle Database Semantic Technologies
31

32 Installation and Configuration (1)
Load the PL/SQL packages and jar file cd $ORACLE_HOME/md/admin Login as sysdba Create a tablespace for semantic network create bigfile tablespace semts datafile '?/dbs/semts01.dat' size 512M reuse autoextend on next 512M maxsize unlimited extent management local segment space management auto; customize 32

33 Installation and Configuration (2)
Create a temporary tablespace create bigfile temporary tablespace semtmpts tempfile ‘?/dbs/semtmpts.dat' size 512M reuse autoextend on next 512M maxsize unlimited EXTENT MANAGEMENT LOCAL ; ALTER DATABASE DEFAULT TEMPORARY TABLESPACE semtmpts; Create an undo tablespace CREATE bigfile UNDO TABLESPACE semundots DATAFILE ‘?/dbs/semundots.dat' SIZE 512M REUSE AUTOEXTEND ON next 512M maxsize unlimited ALTER SYSTEM SET UNDO_TABLESPACE=semundots; 33

34 Installation and Configuration (3)
Create a semantic network As sysdba SQL> exec sem_apis.create_sem_network(‘semts’); Create a semantic model As scott (or other) SQL> create table test_tpl (triple sdo_rdf_triple_s) compress; SQL> exec sem_apis.create_sem_model(‘test’,’test_tpl’,’triple’); 34

35 Loading RDF triples 35

36 Incremental DMLs (small number of changes)
Loading Semantic Data Incremental DMLs (small number of changes) SQL: Insert SQL: Delete Java API (Jena): GraphOracleSem.add, delete Java API (Sesame): OracleSailConnection.addStatement, removeStatements Bulk load (adding many triples) PL/SQL: sem_apis.bulk_load_from_staging_table(…) Staging table may be populated using SQL*Loader or External Table Java API (Jena) OracleBulkUpdateHandler.addInBulk, prepareBulk, completeBulk Java API (Sesame) Recommended loading method for very small number of triples Recommended loading method for very large number of triples 36

37 Bulk load: Load Data into Staging Table
Create a staging table CREATE TABLE STAGE_TABLE ( RDF$STC_sub varchar2(4000) not null, RDF$STC_pred varchar2(4000) not null, RDF$STC_obj varchar2(4000) not null, RDF$STC_graph varchar2(4000) ) compress; -- RDF$STC_graph column is required if loading N-Quads Grant appropriate privileges to MDSYS GRANT SELECT, INSERT on STAGE_TABLE to MDSYS; -- INSERT privilege is required if using External Table (see below) Two ways for loading from file(s) Using External Table (for N-Triple or N-Quad format) Using SQL*Loader (for N-Triple format only) 37

38 Load Data into Staging Table: using External Table
Create an External Table and associate it with the data files BEGIN sem_apis.create_source_external_table( source_table => 'stage_table_source' ,def_directory => 'DATA_DIR' ,bad_file => 'CLOBrows.bad' ); END; / grant SELECT on "stage_table_source" to MDSYS; alter table "stage_table_source" location ('demo_datafile.nt'); Load content of External Table into the Staging Table BEGIN sem_apis.load_into_staging_table( staging_table => 'STAGE_TABLE' ,source_table => 'stage_table_source' ,input_format => 'N-QUAD'); END; / For large loads, consider parallel loading Distribute the input data into multiple files (associate with a single External Table) Use parallel=><n> when invoking sem_apis.load_into_staging_table 38

39 Load Data into Staging Table: using SQL*Loader
Create a control file for SQL*Loader UNRECOVERABLE LOAD DATA APPEND into table stage_table when (1) <> '#‘ (   RDF$STC_sub   CHAR(4000) terminated by whitespace,   RDF$STC_pred  CHAR(4000) terminated by whitespace,   RDF$STC_obj   CHAR(5000) “rtrim(:RDF$STC_obj,'. '||CHR(9)||CHR(10)||CHR(13))” ) Invoke SQL*Loader sqlldr userid=<DBuser>/<passwd> control=<control_file_name> data=<data_file_name> direct=true For large loads, consider parallel loading Distribute the input data into multiple files and invoke sqlldr from multiple sessions sqlldr userid=<DBuser>/<passwd> control=<control_file_name> data=<data_file_name> direct=true parallel=true & 39

40 Loading RDF model from Staging Table
Once Staging Table has been loaded, issue the following call BEGIN sem_apis.bulk_load_from_staging_table( model_name => ‘my_rdf_model' ,table_owner => ‘SCOTT' ,table_name => 'STAGE_TABLE' ,flags => ‘PARSE'); END; / For parallel loading of large data consider additional attributes in flags parameter PARALLEL=<n> MBV_METHOD=SHADOW 40

41 Load Data into Staging Table using prepareBulk
When you have many RDF/XML, N3, TriX or TriG files OracleSailConnection osc = oracleSailStore.getConnection(); store.disableAllAppTabIndexes(); for (int idx = 0; idx < szAllFiles.length; idx++) { osc.getBulkUpdateHandler().prepareBulk( fis, "http://abc", // baseURI RDFFormat.NTRIPLES, // dataFormat "SEMTS", // tablespaceName null, // flags null, // register a // StatusListener "STAGE_TABLE", // table name (Resource[]) null // Resource... for contexts ); osc.commit(); fis.close(); } The latest Jena Adapter has prepareBulk and completeBulk APIs Can start multiple threads and load files in parallel 41

42 More Data Loading Choices (1)
Use External Table to load data into Staging Table CREATE TABLE stable_ext( RDF$STC_sub varchar2(4000), RDF$STC_pred varchar2(4000), RDF$STC_obj varchar2(4000)) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER DEFAULT DIRECTORY tmp_dir ACCESS PARAMETERS( RECORDS DELIMITED by NEWLINE PREPROCESSOR bin_dir:'uncompress.sh' FIELDS TERMINATED BY ' ' ) LOCATION (‘data1.nt.gz',‘data2.nt.gz',…,‘data_4.nt.gz') ) REJECT LIMIT UNLIMITED ; Multiple files is critical to performance 42

43 More Data Loading Choices (2)
Load directly using Jena Adapter Oracle oracle = new Oracle(szJdbcURL, szUser, szPasswd); Model model = ModelOracleSem.createOracleSemModel( oracle, szModelName); InputStream in = FileManager.get().open("./univ.owl" ); model.read(in, null); More loading examples using Jena Adapter Examples 7-2, 7-3, and 7-12 (SPARUL) [1] Loading RDFa graphOracleSem.getBulkUpdateHandler().prepareBulk( rdfaUrl, … ) 43 [1]: Oracle® Database Semantic Technologies Developer's Guide

44 More Data Loading Choices (3)
Load directly using Sesame Adapter OraclePool op = new OraclePool( OraclePool.getOracleDataSource(jdbcUrl, user, password)); OracleSailStore store = new OracleSailStore(op, model); SailRepository sr = new SailRepository(store); RepositoryConnection repConn = sr.getConnection(); repConn.setAutoCommit(false); repConn.add(new File(trigFile), "http://my.com/", RDFFormat.TRIG); repConn.commit(); More loading examples using Sesame Adapter Examples 8-5, 8-7, 8-8, 8-9, and 8-10 [1] 44 [1]: Oracle® Database Semantic Technologies Developer's Guide

45 Utility APIs SEM_APIS.remove_duplicates SEM_APIS.merge_models
e.g. exec sem_apis.remove_duplicates(’graph_model’); SEM_APIS.merge_models Can be used to clone model as well. e.g. exec sem_apis.merge_models(’model1’,’model2’); SEM_APIS.swap_names e.g. exec sem_apis.swap_names(’production_model’,’prototype_model’); SEM_APIS.alter_model (entailment) e.g. sem_apis.alter_model(’m1’, ’MOVE’, ’TBS_SLOWER’); SEM_APIS.rename_model/rename_entailment 45

46 Inference 46

47 Core Inference Features
Inference done using forward chaining Triples inferred and stored ahead of query time Removes on-the-fly reasoning and results in fast query times Various native rulebases provided E.g., RDFS, OWL 2 RL, SNOMED (EL+), SKOS Validation of inferred data User-defined rules Proof generation Shows one deduction path 47

48 OWL Subsets Supported OWL subsets for different applications
RDFS++ RDFS plus owl:sameAs and owl:InverseFunctionalProperty OWLSIF (OWL with IF semantics) Based on Dr. Horst’s pD* vocabulary¹ OWLPrime Includes RDFS++, OWLSIF with additional rules Jointly determined with domain experts, customers and partners OWL 2 RL W3C Standard Adds rules about keys, property chains, unions and intersections to OWLPrime SNOMED Choice of rulebases If ontology is in EL, choose SNOMED component If OWL 2 features (chains, keys) are not used, choose OWLPrime Choose OWL2RL otherwise. 48 1 Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary

49 11g Release 2 Inference Features
Richer semantics support OWL 2 RL, SKOS, SNOMED (subset of OWL 2 EL) Performance enhancements Large scale owl:sameAs handling Compact materialization of owl:sameAs closure Parallel inference Leverage native Oracle parallel query and parallel DML Incremental inference Efficient updates of inferred graph through additions Compact Data Structures Talk more about owl:sameAs, Parallel inference and Incremental inference here (since no other slides talk about them) 49

50 Semantics Characterized by Entailment Rules
RDFS has 14 entailment rules defined in the spec. E.g. rule : p rdfs:domain x . u p y  u rdf:type x . OWL 2 RL has 70+ entailment rules. E.g. rule : p rdf:type owl:FunctionalProperty . x p y1 . x p y  y1 owl:sameAs y2 . x owl:disjointWith y . a rdf:type x . b rdf:type y  a owl:differentFrom b . We have tuned each one of these rules to provide efficient implementation in the database These rules have efficient implementations in RDBMS 50

51 Inference APIs SEM_APIS.CREATE_ENTAILMENT( index_name
sem_models(‘GraphTBox’, ‘GraphABox’, …), sem_rulebases(‘OWL2RL), passes, inf_components, options ) Use “PROOF=T” to generate inference proof SEM_APIS.VALIDATE_ENTAILMENT( sem_models((‘GraphTBox’, ‘GraphABox’, …), sem_rulebases(‘OWLPrime’), criteria, max_conflicts, Jena Adapter API: GraphOracleSem.performInference() Typical Usage: First load RDF/OWL data Call create_entailment to generate inferred graph Query both original graph and inferred data Inferred graph contains only new triples! Saves time & resources Recommended API for inference Typical Usage: First load RDF/OWL data Call create_entailment to generate inferred graph Call validate_entailment to find inconsistencies Should we remove the “recommended API for inference” bubble? 51

52 Extending Semantics Supported by 11.2 OWL Inference
Option 1: add user-defined rules Both 10g and 11g RDF/OWL support user-defined rules in this form: Filter expressions are allowed ?x :hasAge ?age. ?age >  ?x :type :Adult. Antecedents Consequents ?x :parentOf ?y . ?z :brotherOf ?x . ?z :uncleOf ?y 52

53 TBox & Complete class tree Inference Engine in Oracle
Extending Semantics Supported by 11.2 OWL Inference Option 2: Separation in TBox and ABox reasoning through PelletDb (using Oracle Jena Adapter) TBox (schema related) tends to be small in size Generate a class subsumption tree using a complete DL reasoners like Pellet ABox (instance related) can be arbitrarily large Use the native inference engine in Oracle to infer new knowledge based on class subsumption tree from TBox TBox TBox & Complete class tree ABox DL reasoner Inference Engine in Oracle 53

54 Enabling Advanced Inference Capabilities
Parallel inference option EXECUTE sem_apis.create_entailment('M_IDX',sem_models('M'), sem_rulebases('OWLPRIME'), null, null, 'DOP=x'); Where ‘x’ is the degree of parallelism (DOP) Incremental inference option EXECUTE sem_apis.create_entailment ('M_IDX',sem_models('M'), sem_rulebases('OWLPRIME'),null,null, 'INC=T'); Enabling owl:sameAs option to limit duplicates EXECUTE Sem_apis.create _entailment('M_IDX',sem_models('M'), sem_rulebases('OWLPRIME'),null,null,'OPT_SAMEAS=T'); Compact data structures EXECUTE Sem_apis.create _entailment('M_IDX',sem_models('M'), sem_rulebases(‘OWLPRIME'),null,null, 'RAW8=T'); OWL2RL/SKOS inference EXECUTE Sem_apis.create_entailment('M_IDX',sem_models('M'), sem_rulebases(x),null,null…); x in (‘OWL2RL’,’SKOSCORE’) 54

55 Named Graph Based Global and Local Inference
Named Graph Based Global Inference (NGGI) Perform inference on just a subset of the triples Some usage examples Run NGGI on just the TBox Run NGGI on just a single named graph Run NGGI on just a single named graph and a TBox Named Graph Based Local Inference (NGLI) Perform local inference for each named graph (optionally with a common Tbox) Triples from different named graphs will not be mixed together. NGGI and NGLI together can achieve efficient named graph based inference maintenance 55

56 Querying Semantic Data
56

57 Semantic Operators Expand Terms for SQL SELECT
Scalable, efficient SQL operators to perform ontology-assisted query against enterprise relational data Query: “Find all entries in diagnosis column that are related to ‘Upper_Extremity_Fracture’” Syntactic query against relational table will not work! SELECT p_id, diagnosis FROM Patients  Zero Matches! WHERE diagnosis = ‘Upper_Extremity_Fracture; Rheumatoid_Arthritis 2 Hand_Fracture 1 DIAGNOSIS ID Patients diagnosis table Traditional Syntactic query against relational data New Semantic query against relational data (while consulting ontology) Finger_Fracture Arm_Fracture Upper_Extremity_Fracture Hand_Fracture Elbow_Fracture Forearm_Fracture rdfs:subClassOf SELECT p_id, diagnosis FROM Patients WHERE SEM_RELATED ( diagnosis, ‘rdfs:subClassOf’, ‘Upper_Extremity_Fracture’, ‘Medical_ontology’ = 1) AND SEM_DISTANCE() <= 2; SELECT p_id, diagnosis FROM Patients WHERE SEM_RELATED ( diagnosis, ‘rdfs:subClassOf’, ‘Upper_Extremity_Fracture’, ‘Medical_ontology’) = 1; 57

58 SPARQL Query Architecture
HTTP Standard SPARQL Endpoint Enhanced with query management control Jena API Jena Adapter Sesame API Sesame Adapter Java SPARQL-to-SQL Core Logic SQL SEM_MATCH 58

59 SEM_MATCH: Adding SPARQL to SQL
Extends SQL with SPARQL constructs Graph Patterns, OPTIONAL, UNION Dataset Constructs FILTER – including SPARQL built-ins Prologue Solution Modifiers Benefits: Allows SQL constructs/functions: JOINs with other object-relational data DDL Statements: create tables/views 59

60 SEM_MATCH: Adding SPARQL to SQL
SELECT n1, n2 FROM TABLE( SEM_MATCH( ‘PREFIX foaf: <http://...> SELECT ?n1 ?n2 FROM <http://g1> WHERE {?p foaf:name ?n1 OPTIONAL {?p foaf:knows ?f . ?f foaf:name ?n2 } FILTER (REGEX(?n1, “^A”)) } ORDER BY ?n1 ?n2’, SEM_MODELS(‘M1’),…));

61 SEM_MATCH: Adding SPARQL to SQL
SQL Table Function SELECT n1, n2 FROM TABLE( SEM_MATCH( ‘PREFIX foaf: <http://...> SELECT ?n1 ?n2 FROM <http://g1> WHERE {?p foaf:name ?n1 OPTIONAL {?p foaf:knows ?f . ?f foaf:name ?n2 } FILTER (REGEX(?n1, “^A”)) } ORDER BY ?n1 ?n2’, SEM_MODELS(‘M1’),…)); n1 n2 Alex Jerry Tom Alice Bill Jill John

62 SEM_MATCH: Adding SPARQL to SQL
Rewritable SQL Table Function SELECT n1, n2 FROM TABLE( SEM_MATCH( ‘PREFIX foaf: <http://...> SELECT ?n1 ?n2 FROM <http://g1> WHERE {?p foaf:name ?n1 OPTIONAL {?p foaf:knows ?f . ?f foaf:name ?n2 } FILTER (REGEX(?n1, “^A”)) } ORDER BY ?n1 ?n2’, SEM_MODELS(‘M1’),…)); ( SELECT v1.value AS n1, v2.value AS n2 FROM VALUES v1, VALUES v2 TRIPLES t1, TRIPLES t2, … WHERE t1.obj_id = v1.value_id AND t1.pred_id = 1234 AND … ) Get 1 declarative SQL query - Query optimizer sees 1 query - Get all the performance of Oracle SQL Engine - compression, indexes, parallelism, etc.

63 SEM_MATCH Table Function Arguments
‘SELECT ?a WHERE { ?a foaf:name ?b }’ SEM_MATCH( query, models, rulebases, options ); Basic unit of access control Container(s) for asserted quads Entailed triples + Built-in (e.g. OWL2RL) and user-defined rulebases ‘ALLOW_DUP=T STRICT_TERM_COMP=F’ 63 63

64 GovTrack in Oracle GovTrack RDF Data
RDF/OWL data about activities of US Congress Political Party Membership Voting Records Bill Sponsorship Committee Membership Offices and Terms GovTrack in Oracle GOV_TBOX GOV_PEOPLE GOV_ASSERT_VM GOV_BILLS_110 Asserted data only (2.8M triples) GOV_BILLS_111 OWL2RL GOV_VOTES_07 GOV_ALL_VM INFERENCE Asserted + Inferred (3.1M triples) GOV_VOTES_08 GOV_VOTES_09 Virtual Models Semantic Models Rulebases Entailments GOV_DISTRICTS (US Census) GOV_TRACK_OWL 64

65 create_virtual_model (vm_name, models, rulebases)
Virtual Models A virtual model is a logical RDF graph that can be used in a SEM_MATCH query. Result of UNION or UNION ALL of one or more models and optionally the corresponding entailment create_virtual_model (vm_name, models, rulebases) drop_virtual_model (vm_name) SEM_MATCH query accepts a single virtual model No other models or rulebases need to be specified DMLs on virtual models are not supported Creates a virtual model with the set of specified models and optionally the entailment (if rulebases are specified) 65 65

66 Virtual Model Example Creation Access Control
begin sem_apis.create_virtual_model('gov_assert_vm', sem_models('gov_tbox', 'gov_people', 'gov_votes_07', 'gov_votes_08', 'gov_votes_09', 'gov_bills_110', 'gov_bills_111', 'gov_districts')); sem_apis.create_virtual_model('gov_all_vm', 'gov_bills_111', 'gov_districts'), sem_rulebases('OWL2RL')); end; / Access Control grant select on mdsys.semv_gov_assert_vm to scott; grant select on mdsys.semv_gov_all_vm to scott; 66

67 Query Example 1: Basic Query
select fn, bday, g, t, hp, r from table(sem_match( 'SELECT ?fn ?bday ?g ?t ?hp ?r WHERE { ?s vcard:N ?n . ?n vcard:Family "Kennedy" . ?s foaf:name ?fn . ?s vcard:BDAY ?bday . ?s foaf:gender ?g . ?s foaf:title ?t . ?s foaf:homepage ?hp . ?s foaf:religion ?r }' ,sem_models('gov_all_vm'), null, null, null ,null,' ALLOW_DUP=T ' )); Find information about all Kennedys 67

68 Query Example 2: OPTIONAL Query
select fn, bday, g, t, hp, r from table(sem_match( 'SELECT ?fn ?bday ?g ?t ?hp ?r WHERE { ?s vcard:N ?n . ?n vcard:Family "Kennedy" . ?s foaf:name ?fn . ?s vcard:BDAY ?bday . ?s foaf:gender ?g . OPTIONAL { ?s foaf:title ?t . ?s foaf:homepage ?hp . ?s foaf:religion ?r } }' ,sem_models('gov_all_vm'), null, null, null, ,null, ' ALLOW_DUP=T ' )); Find information about all Kennedys, with title, homepage and religion optional 68

69 Query Example 3: Simple FILTER
select fname, lname from table(sem_match( 'SELECT ?fname ?lname WHERE { ?s rdf:type foaf:Person . ?s vcard:N ?vcard . ?vcard vcard:Given ?fname . ?vcard vcard:Family ?lname FILTER (STR(?lname) < "B") }' ,sem_models('gov_all_vm'), null, null, null ,null, ' ALLOW_DUP=T ' )); Find all people with a last name that starts with “A” 69

70 Query Example 4: Negation as Failure
select fn, bday, hp from table(sem_match( 'SELECT ?fn ?bday ?hp WHERE { ?s vcard:N ?n . ?n vcard:Family "Lincoln" . ?s vcard:BDAY ?bday . ?s foaf:name ?fn . FILTER (!BOUND(?hp)) OPTIONAL { ?s foaf:homepage ?hp } }' ,sem_models('gov_all_vm'), null, null, null ,null, ' ALLOW_DUP=T ' )); Find all Lincolns without a homepage 70

71 Query Example 5: UNION select title, dt from table(sem_match( 'SELECT ?title ?dt WHERE { ?s foaf:name "Barack Obama" . { ?b bill:sponsor ?s } UNION { ?b bill:cosponsor ?s } ?b dc:title ?title . ?b rdf:type bill:LegislativeDocument . ?b bill:introduced ?dt FILTER(" "^^xsd:date <= ?dt && ?dt < " "^^xsd:date ) }' ,sem_models('gov_all_vm'), null, null, null ,null, ' ALLOW_DUP=T ' )); Find all Legislative Documents introduced in February 2007 that were sponsored or cosponsored by Barack Obama 71

72 Query Example 6: Inference GovTrack Bill Types 72

73 Query Example 6: Inference
select title, dt, btype from table(sem_match( 'SELECT ?title ?dt ?btype WHERE { ?s foaf:name "Barack Obama" . ?b bill:sponsor ?s . ?b dc:title ?title . ?b rdf:type ?btype . ?b bill:introduced ?dt FILTER(" "^^xsd:date <= ?dt && ?dt < " "^^xsd:date ) }' ,sem_models('gov_assert_vm'), null, null, null ,null, ' ALLOW_DUP=T ' )); Find all Legislative Documents (and their types) sponsored by Barack Obama 73

74 Query Example 7: SQL Constructs (temporal interval computations)
select * from ( select fn, bday, tsfrom, (to_date(tsfrom,'YYYY-MM-DD') – to_date(bday,'YYYY-MM-DD')) YEAR(3) TO MONTH from table(sem_match( 'SELECT ?fn ?bay ?tsfrom WHERE { ?s vcard:BDAY ?bday . ?s foaf:name ?fn . ?s pol:hasRole ?role . ?role pol:forOffice ?office . ?role time:from ?tfrom . ?tfrom time:at ?tsfrom FILTER (?tsfrom >= " "^^xsd:date) }' ,sem_models('gov_all_vm'), null, null, null)) order by (to_date(tsfrom,'YYYY-MM-DD') – to_date(bday,'YYYY-MM-DD')) asc) where rownum <= 1; Find the youngest person to take office since 2000 74 74

75 Oracle Extensions for Text and Spatial
75

76 Full Text Indexing with Oracle Text
Filters graph patterns based on text search string Indexes all RDF Terms URIs, Literals, Language Tags, etc. Provide SPARQL extension function orardf:textContains(?var, “Oracle text search string”) Search String Group Operators: AND, OR, NOT, NEAR, … Term Operators: stem($), soundex(!), wildcard(%) SQL> exec sem_apis.add_datatype_index( 'http://xmlns.oracle.com/rdf/text');

77 Text Query Example Find all bills about Children and Taxes
select s, title, dt from table(sem_match( 'SELECT ?s ?title ?dt WHERE { ?b bill:sponsor ?s . ?s foaf:name ?n . ?b dc:title ?title . ?b bill:introduced ?dt FILTER (orardf:textContains(?title, "$children AND $taxes"))}' ,sem_models('gov_all_vm'), null, null, null ,null, ' ALLOW_DUP=T ' )); Find all bills about Children and Taxes

78 Spatial Support with Oracle Spatial
Support geometries encoded as orageo:WKTLiterals :semTech2011 orageo:hasPointGeometry "POINT( )"^^orageo:WKTLiteral . Provide library of spatial query functions SELECT ?s WHERE { ?s orageo:hasPointGeometry ?geom FILTER(orageo:withinDistance(?geom, "POINT( )"^^orageo:WKTLiteral, "distance=10 unit=KM"))

79 orageo:WKTLiteral Datatype
Optional leading Spatial Reference System URI followed by OGC WKT geometry string. <http://xmlns.oracle.com/rdf/geo/srid/{srid}> WGS 84 Longitude, Latitude is the default SRS (assumed if SRS URI is absent) SRS: WGS84 Longitude, Latitude "POINT( )"^^orageo:WKTLiteral SRS: NAD27 Longitude, Latitude "<http://xmlns.oracle.com/rdf/geo/srid/8260> POINT( )"^^orageo:WKTLiteral Prepare for spatial querying by creating a spatial index for the orageo:WKTLiteral datatype SQL> exec sem_apis.add_datatype_index( 'http://xmlns.oracle.com/rdf/geo/WKTLiteral', options=>'TOLERANCE=1.0 SRID=8307 DIMENSIONS=((LONGITUDE,-180,180)(LATITUDE,-90,90))');

80 What Types of Spatial Data are Supported?
Spatial Reference Systems Built-in support for 1000’s of SRS Plus you can define your own Coordinate system transformations applied transparently during indexing and query Geometry Types Support OGC Simple Features geometry types Point, Line, Polygon Multi-Point, Multi-Line, Multi-Polyon Geometry Collection Up to 500,000 vertices per Geometry

81 Spatial Function Library
Topological Relations orageo:relate Distance-based Operations orageo:distance, orageo:withinDistance, orageo:buffer, orageo:nearestNeighbor Geometry Operations orageo:area, orageo:length orageo:centroid, orageo:mbr, orageo:convexHull Geometry-Geometry Operations orageo:intersection, orageo:union, orageo:difference, orageo:xor

82 Congressional District Polygons (435)
GovTrack Spatial Demo Congressional District Polygons (435) Complex Geometries Average over 1000 vertices per geometry Load .shp file from US Census into Oracle Spatial Load into Oracle semantic model Generate triples using sdo_util.toWKTGeometry()

83 Spatial Query 1 Which congressional district contains Nahsua, NH
select name, cdist from table(sem_match( 'SELECT ?name ?cdist WHERE { ?person usgovt:name ?name . ?person pol:hasRole ?role . ?role pol:forOffice ?office . ?office pol:represents ?cdist . ?cdist orageo:hasWKTGeometry ?cgeom FILTER (orageo:relate(?cgeom, "POINT( )"^^orageo:WKTLiteral, "mask=contains")) } ' ,sem_models('gov_all_vm'), null, null, null ,null, ' ALLOW_DUP=T ' )); Which congressional district contains Nahsua, NH

84 Spatial Query 2 select name, cdist from table(sem_match( 'SELECT ?name ?cdist WHERE { ?person usgovt:name ?name . ?person pol:hasRole ?role . ?role pol:forOffice ?office . ?office pol:represents ?cdist . ?cdist orageo:hasWKTGeometry ?cgeom FILTER (orageo:nearestNeighbor(?cgeom, "POINT( )"^^orageo:WKTLiteral, "sdo_num_res=10")) } ORDER BY ASC(orageo:distance(orageo:centroid(?cgeom), "unit=KM"))' ,sem_models('gov_all_vm'), null, null, null ,null, ' ALLOW_DUP=T ')); Who are my nearest 10 representatives ordered by centerpoint

85 Jena Adapter for Oracle Database 11g Release 2
SPARQL Querying Jena Adapter for Oracle Database 11g Release 2 85

86 Jena Adapter for Oracle Database 11g Release 2
Implements Jena Semantic Web Framework APIs Popular Java APIs for semantic web based applications Adapter adds Oracle-specific extensions Jena Adapter provides three core features: Java API for Oracle RDF Store SPARQL Endpoint for Oracle with SPARQL 1.1. support Oracle-specific extensions for query execution control and management 86

87 Jena Adapter as a Java API for Oracle RDF
“Proxy” like design Data not cached in memory for scalability SPARQL query converted into SQL and executed inside DB Various optimizations to minimize the number of Oracle queries generated given a SPARQL 1.1. query Various data loading methods Bulk/Batch/Incremental load RDF or OWL (in N3, RDF/XML, N-TRIPLE etc.) with strict syntax verification and long literal support Allows integration of Oracle Database 11g RDF/OWL with various tools TopBraid Composer External OWL DL reasoners (e.g., Pellet) For instance, a SPARQL query containing only conjunctive patterns is converted into a single SEM_MATCH query 87

88 No need to create model manually!
Programming Semantic Applications in Java Create a connection object oracle = new Oracle(oracleConnection); Create a GraphOracleSem Object graph = new GraphOracleSem(oracle, model_name, attachment); Load data graph.add(Triple.create(…)); // for incremental triple additions Collect statistics graph.analyze(); Run inference graph.performInference(); graph.analyzeInferredGraph(); Query QueryFactory.create(…); queryExec = QueryExecutionFactory.create(query, model); resultSet = queryExec.execSelect(); No need to create model manually! Important for performance! GraphOracleSem is our implemetation of Jena’s Graph interface. A bit easier to user than PL/SQL APIs since a lot of work is done behind the scenes. 88

89 Jena Adapter Feature: SPARQL Endpoint
SPARQL service endpoint supporting full SPARQL Protocol Integrated with Jena/Joseki (deployed in WLS 10.3 or Tomcat 6) Uses J2EE data source for DB connection specification SPARQL 1.1. and Update (SPARUL) supported Oracle-specific declarative configuration options in Joseki Each URI endpoint is mapped to a Joseki service: <#service> rdf:type joseki:Service ; rdfs:label "SPARQL with Oracle Semantic Data Management" ; joseki:serviceRef "GOV_ALL_VM" ;#web.xml must route this name to Joseki joseki:dataset <#oracle> ; # dataset part joseki:processor joseki:ProcessorSPARQL_FixedDS; 89

90 SPARQL Endpoint: Example
Example Joseki Dataset configuration: <#oracle> rdf:type oracle:Dataset; joseki:poolSize 4; # Number of concurrent connections # allowed to this dataset. oracle:connection [ a oracle:OracleConnection ; ]; oracle:defaultModel [ oracle:firstModel "GOV_PEOPLE"; oracle:modelName "GOV_TBOX“; oracle:modelName "GOV_VOTES_07“; oracle:rulebaseName "OWLPRIME"; oracle:useVM "TRUE” ] ; oracle:namedModel [ oracle:namedModelURI <http://oracle.com/govtrack#GOV_VOTES_07>; oracle:firstModel "GOV_VOTES_07" ]. A virtual model will be created transparently. Users could also specify named graphs in the Joseki config file. Each named graph is mapped to an Oracle model. 90

91 Property Path Queries Part of SPARQL 1.1.
Regular expressions for properties: ? + * ^ / | Translated to hierarchical SQL queries Using Oracle CONNECT BY clause Examples: Find all reachable friends of John SELECT * WHERE { :John foaf:friendOf+ ?friend. } Find reachable friends through two different paths SELECT * WHERE { :John (foaf:friendOf|urn:friend)+ ?friend. } Get names of people John knows one step away: SELECT * WHERE {:John foaf:knows/foaf:name ?person}. Find all people that can be reached from John by foaf:knows ?x foaf:mbox . ?x foaf:knows+/foaf:name ?name . } 91

92 Query Extensions in Jena Adapter
Query management and execution control Timeout Query abort framework Including monitoring threads and a management servlet Designed for a J2EE cluster environment Hints allowed in SPARQL query syntax Parallel execution Support ARQ functions for projected variables fn:lower-case, upper-case, substring, … Native, system provided functions can be used in SPARQL oext:lower-literal, oext:upper-literal, oext:build-uri-for-id, … 92 92

93 Query Extensions in Jena Adapter
Extensible user-defined functions in SPARQL Example PREFIX ouext: <http://oracle.com/semtech/jena-adapter/ext/user-def-function#> SELECT ?subject ?object (ouext:my_strlen(?object) as ?obj1) WHERE { ?subject dc:title ?object } User can implement the my_strlen functions in Oracle Database Connection Pooling through OraclePool java.util.Properties prop = new java.util.Properties(); prop.setProperty("InitialLimit", "2"); // create 2 connections prop.setProperty("InactivityTimeout", "1800"); // seconds …. OraclePool op = new OraclePool(szJdbcURL, szUser, szPasswd, prop, "OracleSemConnPool"); Oracle oracle = op.getOracle(); 93

94 Semantic Indexing for Unstructured Content
94

95 Overview: Creating and Using a Semantic Index
auto maintained like a B-tree index Triples table with rowid references Analytical Queries On Graph Data Subject Property Object Graph p:Marcus rdf:type rc::Person <…/r1> :hasName “Marcus”^^… :hasAge “38”^^xsd:.. CREATE INDEX ArticleIndex ON Newsfeed (Article) INDEXTYPE IS SemContext PARAMETERS (‘my_policy’) LOCAL1 PARALLEL 4 Bulk SemContext index on Article column Newsfeed table extractor Batch Rowid docId Article p_date 1 Indiana authorities filed felony charges and a court issued an arrest warrant for a financial manager who apparently tried to fake his death by crashing his airplane in a Florida swamp. Marcus Schrenker, 38 … 02/01/11 2 Major dealers and investors … 11/30/10 .. SELECT Sem_Contains_Select(1) FROM Newsfeed WHERE Sem_Contains (Article, ‘{?x rdf:type rc:Person ?x :hasAge ?age . FILTER(?age >= 35)}’,1)=1 r1 content type: File (path) Text URL r2 AND p_date > to_date(‘01-Jan-11’) 1LOCAL index support for semantic indexing is restricted to range-partitioned base tables only.

96 Combining Ontologies with extracted triples
The triples extracted from each document can be augmented with Entailment created by combining the triples with schema ontologies and rulebase(s) Knowledge bases (stored as RDF models) and their entailments Augmentation achieved using dependent policies begin sem_rdfctx.create_policy ( policy_name => ‘my_policy_plus_geo’ , base_policy => ‘my_policy’ , user_models => SEM_MODELS(‘USGeography’) , user_entailments => SEM_MODELS( ‘Doc_inf’ ,‘USGeography_inf’)); end; SELECT docId FROM Newsfeed WHERE SEM_CONTAINS (Articles, ‘ { ?comp rdf:type c:Company . ?comp p:categoryName c:BusinessFinance . ?comp p:location ?city . ?city geo:state “NY”^^xsd:string}’, ‘my_policy_plus_geo’) = 1  Extracted triples  Knowledge bases (KBs)  (local) Entailments from extr. triples  Entailments from KBs Will result in a multi-model query involving: the RDF model for my_policy index, the RDF model USGeography, and the entailments.

97 Inference: document-centric
Base table <Man> <Adult> <Parent> <Male> <familiarWith> <grewUpIn> rdfs:subClassOf rdfs:subPropertyOf Ontology: schema triples (for extracted data) id document 1 John is a parent. He grew up in NYC. 2 John is a man. Semantic Index: extracted triples Entailment: set of inferred triples Subject Property Object Graph <John> rdf:type <Parent> <…/r1> <grewUpIn> <NYC> <Man> <…/r2> Subject Property Object Graph <John> rdf:type <Adult> <…/r1> <familiarWith> <NYC> <…/r2> <Male>

98 Abstract Extractor Type
An abstract extractor type, rdfctx_extractor (in PL/SQL), defines the common interfaces for all extractor implementations. Specific implementations for the abstract type interact with individual third-party extractors and produce RDF/XML documents for the input document. create or replace type rdfctx_extractor authid current_user as object ( member function extractRdf (document CLOB, docId VARCHAR2, params VARCHAR2, options VARCHAR2 default NULL) return CLOB member function batchExtractRdf (docCursor SYS_REFCURSOR, extracted_info_table VARCHAR2, params VARCHAR2, partition_name VARCHAR2 default NULL, docId VARCHAR2 default NULL, preferences SYS.XMLType default NULL, options VARCHAR2 default NULL) return CLOB, ) not instantiable not final / The base extractor policy has no Rule bases and user-defined models. Virtual extractor policy has the name of the one base extractor policy and a set of rule bases and/or user defined models (uses virtual models).

99 A sample extractor type -- interface
create or replace type rdfctxu.info_extractor under rdfctx_extractor ( overriding member function getDescription return VARCHAR2, overriding member function rdfReturnType return VARCHAR2, overriding member function getContext(attribute VARCHAR2) return VARCHAR2, ) overriding member function extractRDF( document CLOB, docId VARCHAR2, params VARCHAR2 …) return CLOB, overriding member function batchExtractRdf( docCursor SYS_REFCURSOR, extracted_info_table VARCHAR2, params VARCHAR2, partition_name VARCHAR2 default NULL …) return CLOB The base extractor policy has no Rule bases and user-defined models. Virtual extractor policy has the name of the one base extractor policy and a set of rule bases and/or user defined models (uses virtual models).

100 Enterprise Security for Semantic Data
100

101 Enterprise Security for Semantic Data
Model-level access control Each semantic model accessible through a view (RDFM_modelName) Grant/revoke privileges on the view Discretionary access control on application table for model Finer granularity possible through Oracle Label Security Triple level security Mandatory Access Control 101

102 Oracle Label Security – Mandatory Access Control
Data records and users tagged with security labels Labels determine the sensitivity of the data or the rights a person must posses in order to read or write the data. User labels indicate their access rights to the data records. For reads/deletes/updates: user’s label must dominate row’s label For inserts: user’s label applied to inserted row A Security Administrator assigns labels to users ContractID Organization ContractValue Label ProjectHLS N. America SE:HLS:US

103 OLS Data Classification
Label Components: Levels – Determine the vertical sensitivity of data and the highest classification level a user can access. Compartments – Facilitate compartmentalization of data. Users need exclusive membership to a compartment to access its data. Groups – Allow hierarchical categorization of data. A user with authorization to a parent group can access data in any of its child groups. All compartments / Any Groups CONF : NAVY,MILITARY : NY,DC  Row Label matches User Access Label HIGHCONF : MILITARY,NAVY,SPCLOPS : US,UK 103

104 RDF Triple-level Security with OLS
Security Label Organization N.America SE:HLS:US projectHLS ContractValue Security Label SE:HLS:FIN,US Subject Predicate Objects Triples table SE:HLS:FIN,US ContractValue projectHLS SE:HLS:US N.America Organization RowLabel Object Predicate Subject Sensitivity labels associated with individual triples control read access to the triples. Triples describing a single resource may employ different sensitivity labels for greater control.

105 Securing RDF Data using OLS: Example (1)
Create an OLS policy Policy is the container for all the labels and user authorizations Can create multiple policies containing different labels Create label components Levels: UN (unclassified) < SE (secret) < TS (top secret) Compartments: HLS (Homeland Security), CIA, FBI Groups: NY, DC EASTUS US SD, SF WESTUS Create labels “EASTSE” = SE:CIA,HLS:EASTUS “USUN” = UN:FBI,HLS:US 105

106 Securing RDF Data using OLS: Example (2)
Assign labels to users John “EASTSE” (SE:CIA,HLS:EASTUS) John can read SE and UN triples John can read triples for CIA and HLS John can read triples for NY, DC, and EASTUS When inserting a row, the default write label is “EASTSE” Mary “USUN” (UN:FBI,HLS:US) Mary can only read UN triples Mary can read triples for FBI and HLS Mary can read all group triples (e.g. SF, NY, WESTUS, etc) When inserting a row, the default write label is “USUN” 106

107 Securing RDF Data using OLS: Example (3)
Apply the OLS policy to RDF store Triple inserts, deletes, updates, and reads will use the policy John inserts triple: <http://John> <rdf:type> <http://Person> Mary inserts triple: <http://Mary> <rdf:type> <http://Person> Both these triples inserted in model but tagged with different label values (“EASTSE”, “USUN”) Users can have multiple labels Only one label active at any time (user can switch labels) Only active label applied to operations (e.g. queries, deletes, inferred triples) A user can be assigned multiple labels - By switching labels between triple insertions, triples can be tagged with different labels By switching labels between queries, the same user may see different subsets of the graph By switching labels between inference calls, inferred triples can be tagged with different labels 107

108 Securing RDF Data using OLS: Example (4)
Example labels and read access “EASTSE” (SE:CIA,HLS:EASTUS) “USUN” (UN:FBI,HLS:US) John Read Triple Label Mary Read No TS:HLS:DC SE:HLS,FBI:DC Yes UN:HLS:DC UN:HLS,CIA:NY SE:CIA:SF UN:HLS,FBI:NY UN:HLS:SF 108

109 Securing RDF Data using OLS: Example (5)
Same triple may exist with different labels: <http://John> <rdf:type> <http://Person> ‘UN:HLS:DC’ <http://John> <rdf:type> <http://Person> ‘SE:HLS:DC’ When Mary queries, only 1 triple returned (UN triple) When John queries, both UN and SE triples are returned No way to distinguish since we don’t return label information! Solution: use FILTER_LABEL option in SEM_MATCH This query will filter out triples that are dominated by SE: SELECT s,p,y FROM table(sem_match('{?s ?p ?y}' , sem_models(TEST'), null, null, null, null, ‘FILTER_LABEL=SE POLICY_NAME=DEFENSE’)); MIN_LABEL can be used to filter out untrustworthy data 109

110 or Search web for: Oracle RDF or oracle.com For More Information
See documentation at: or Search web for: Oracle RDF or oracle.com 110

111 111

112


Download ppt "Oracle Database Semantic Technologies: An Overview of Core and Enterprise Functionality"

Similar presentations


Ads by Google