Rajashree Deka Tetherless World Constellation Rensselaer Polytechnic Institute.




1 Rajashree Deka Tetherless World Constellation Rensselaer Polytechnic Institute

2
  - The majority of data underpinning the Web are stored in relational databases (RDB).
  - Advantages:
    - Secure and scalable architecture.
    - Efficient storage.
    - Reliability.
  - Disadvantages:
    - Difficult to share data across large organizations where different database schemata are used.
    - Most importantly, there is no check on semantics.

3
  - As the Semantic Web matures, there is a growing need for RDF applications to access the content of legacy databases.
  - Compared to RDB, RDF is:
    - More expressive.
    - More easily processed and interpreted.
    - Easily reasoned over by software agents.
  - We need a way to make data in an RDBMS available as RDF.

4 In order to generate Semantic Web content from an RDB, Tim Berners-Lee proposed a very direct mapping:
  - Each table in the RDB is an RDF class.
  - Each field (column) name is an RDF property.
  - Each record is an RDF node: an instance of the RDF class, which can therefore play the role of a subject or an object in an RDF statement.
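The three rules above can be sketched in a few lines of Python. This is only an illustration of the table-to-class idea, not code from the presentation: the base URI, the URI layout, the example table and the function name direct_mapping are all assumptions, and triples are represented as plain tuples instead of using an RDF library.

```python
import sqlite3

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"  # hypothetical base URI, not from the slides


def direct_mapping(conn, table, key_column):
    """Yield (subject, predicate, object) tuples for every row of `table`.

    table  -> class URI
    column -> property URI
    record -> node (subject) typed with the class
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    class_uri = f"{BASE}{table}"
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        yield (subject, RDF_TYPE, class_uri)
        for column, value in record.items():
            if value is not None:  # NULL columns produce no triple
                yield (subject, f"{BASE}{table}#{column}", value)


# Illustrative in-memory table; the row content is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (dataset_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO dataset VALUES (1, 'CTD casts')")
triples = list(direct_mapping(conn, "dataset", "dataset_id"))
```

Each row yields one rdf:type triple plus one triple per non-NULL column, which is why this kind of mapping is easy to automate but leaves no room for an existing ontology.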

5
  - Semi-automatic generation of an ontology from an RDB:
    - Read all records, export as RDF triples.
    - Mappings are direct; complex mappings do not usually appear.
    - Need to convert to RDF regularly.
    - Does not allow the population of an existing ontology, which is a big limitation.
  - Map an existing RDB to an existing ontology:
    - Customize the mapping according to the existing ontology.
    - Complex mappings can be implemented.

6
  - D2RQ provides an integrated environment for accessing the content of non-RDF, relational databases as virtual, read-only RDF graphs.
  - Using D2RQ we can:
    - Query a non-RDF database using SPARQL queries.
    - Access information in a non-RDF database using the Jena API or the Sesame API.
    - Access the content of the database as Linked Data over the Web.
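As a rough illustration of the first option, a SPARQL query can be sent to an endpoint as an HTTP GET request with the query text in a query parameter, as the SPARQL Protocol specifies. The sketch below only builds the request; the endpoint URL is an assumption (a locally running server), and the query itself is a made-up example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Assumed endpoint of a locally running server; adjust to the real deployment.
ENDPOINT = "http://localhost:2020/sparql"

# Illustrative query: list a few resources with their labels.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s ?label WHERE { ?s rdfs:label ?label } LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
# To execute against a live endpoint:
#   import urllib.request
#   with urllib.request.urlopen(request) as response:
#       body = response.read()
```

Because the relational data is exposed as a virtual graph, the client does not need to know that SQL is being run behind the endpoint.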

7
  - D2RQ mapping language: describes the relation between the ontology and the RDB.
  - D2RQ engine: uses the mappings to rewrite Jena and Sesame API calls to SQL queries.
  - D2R Server: provides a Linked Data view, an HTML view for debugging, and a SPARQL Protocol endpoint over the database.


9
  - The D2RQ mapping language is formally defined by http://www4.wiwiss.fu-berlin.de/bizer/d2rq/0.1/
  - The D2RQ namespace is defined by http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#
  - Database compatibility:
    - Oracle
    - MySQL
    - PostgreSQL
    - Microsoft SQL Server
    - ODBC data sources (e.g. Microsoft Access); for these, the mapping generator and automatic detection of column types do not work.

10 Two command line tools (only on Windows and Unix systems):
  - Mapping generator:
    - Analyzes the database schema.
    - Generates a default mapping file.
    - The resulting D2RQ map is an RDF document in N3 format.
    - The mapping can be used as-is or can be customized.
  - Dump script:
    - Writes the content of the RDB into a single RDF file.
    - Supported syntaxes are "RDF/XML" (the default), "RDF/XML-ABBREV", "N3" and "N-TRIPLE".

11 An ontology is mapped to a database schema using:
  - d2rq:ClassMap: represents a class or a group of similar classes in the ontology, and specifies how instances of the class are identified.
  - d2rq:PropertyBridge: each ClassMap has a set of PropertyBridges, which specify how the properties of an instance are created.


14
# Table dataset (default mapping)
map:dataset a d2rq:ClassMap;
    d2rq:dataStorage map:database;
    d2rq:uriPattern "dataset/@@dataset.dataset_id@@";
    d2rq:class vocab:dataset;
    d2rq:classDefinitionLabel "dataset";
    .
map:dataset__label a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:dataset;
    d2rq:property rdfs:label;
    d2rq:pattern "dataset #@@dataset.dataset_id@@";
    .
map:dataset_dataset_id a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:dataset;
    d2rq:property vocab:dataset_dataset_id;
    d2rq:propertyDefinitionLabel "dataset dataset_id";
    d2rq:column "dataset.dataset_id";
    d2rq:datatype xsd:int;
    .

# Table dataset (customized mapping)
map:dataset a d2rq:ClassMap;
    d2rq:dataStorage map:database;
    d2rq:uriPattern "http://escience.rpi.edu/ontology/BCO-DMO/bcodmo/2/0/DeploymentDatasetCollection_@@dataset.dataset_id@@";
    d2rq:class bcodmo:DeploymentDatasetCollection;
    d2rq:classDefinitionLabel "DeploymentDatasetCollection";
    .
map:seeAlsoStatement a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:dataset;
    d2rq:property rdfs:seeAlso;
    d2rq:uriPattern "http://osprey.bcodmo.org/dataset.cfm?id=@@dataset.dataset_id@@&flag=view";
    .
map:hasIdentifier a d2rq:PropertyBridge;
    d2rq:property bcodmo:hasIdentifier;
    d2rq:belongsToClassMap map:dataset;
    d2rq:column "dataset.dataset_id";
    d2rq:datatype xsd:int;
    .
map:dataset_dataset_id a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:dataset;
    d2rq:property bcodmo:hasParameter;
    d2rq:refersToClassMap map:parameters;
    d2rq:propertyDefinitionLabel "dataset dataset_id";
    d2rq:join "dataset.dataset_id = dataset_parameters.dataset_id";
    d2rq:join "dataset_parameters.parameters_id = parameters.parameters_id";
    .
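The d2rq:uriPattern and d2rq:pattern values in the mapping contain @@table.column@@ placeholders that are filled in from each database row. A minimal Python sketch of that substitution follows; the function name expand_pattern and the row value 42 are made up for illustration, and the sketch ignores details a real engine has to handle, such as percent-encoding of values inserted into URIs.

```python
import re

# Matches placeholders of the form @@table.column@@.
PLACEHOLDER = re.compile(r"@@([A-Za-z_]\w*)\.([A-Za-z_]\w*)@@")


def expand_pattern(pattern, row):
    """Replace every @@table.column@@ placeholder with the row's value.

    `row` maps "table.column" names to values (an assumed representation).
    """
    def substitute(match):
        key = f"{match.group(1)}.{match.group(2)}"
        return str(row[key])

    return PLACEHOLDER.sub(substitute, pattern)


row = {"dataset.dataset_id": 42}  # hypothetical record
default_uri = expand_pattern("dataset/@@dataset.dataset_id@@", row)
label = expand_pattern("dataset #@@dataset.dataset_id@@", row)
see_also = expand_pattern(
    "http://osprey.bcodmo.org/dataset.cfm?id=@@dataset.dataset_id@@&flag=view", row
)
```

The customized mapping changes only the pattern strings and the target class and properties; the substitution step itself stays the same.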

15
  - Customization is straightforward when a class in the ontology is represented by a table in the database.
  - Mapping is complicated, and sometimes not possible, when a class in the ontology corresponds not to a table in the database but to a record in a database table.

16
  - Define primary keys wherever possible and create indexes.
  - Indicate join directions in d2rq:join conditions (=> or <= in place of =).
  - Set d2r:autoReloadMapping (a D2R Server setting) to false whenever automatic reloading is not needed.
  - Use hint properties:
    - d2rq:valueMaxLength
    - d2rq:valueRegex
    - d2rq:valueContains
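The hint properties help because they let the engine rule out a property bridge before any SQL is issued: if a value in the query cannot possibly occur in the column, the bridge can be skipped. The sketch below illustrates that pruning idea only. The class ColumnHints and its method are hypothetical, and treating the regex as a search over the value (instead of a full match) is an assumption about the semantics.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnHints:
    """Hypothetical container mirroring the three D2RQ hint properties."""
    max_length: Optional[int] = None  # mirrors d2rq:valueMaxLength
    regex: Optional[str] = None       # mirrors d2rq:valueRegex
    contains: Optional[str] = None    # mirrors d2rq:valueContains

    def may_match(self, literal: str) -> bool:
        """Return False when the hints prove the literal cannot be in the column."""
        if self.max_length is not None and len(literal) > self.max_length:
            return False
        if self.regex is not None and re.search(self.regex, literal) is None:
            return False
        if self.contains is not None and self.contains not in literal:
            return False
        return True


# Example: a numeric identifier column of at most 10 characters.
id_hints = ColumnHints(max_length=10, regex=r"^[0-9]+$")
email_hints = ColumnHints(contains="@")
```

With these hints, a query asking for the value "not-a-number" never needs to touch the identifier column, which saves one SQL round trip per excluded bridge.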

17
  - D2RQ performs reasonably well with basic triple patterns, but performance deteriorates when SPARQL features such as OPTIONAL, FILTER and LIMIT are used.
  - It does not have reasoning capability. Reasoning can be added by using the D2RQ engine within Jena.
  - Integration of multiple databases or other data sources using D2RQ alone is not possible.
  - It is read-only: INSERT, DELETE and UPDATE operations cannot be performed.
  - It cannot handle complicated database structures such as views.

18
  - Virtuoso RDF View:
    - Uses the table-to-class and column-to-predicate approach.
    - RDB data are represented as virtual RDF graphs.
    - Customization of the mapping is possible.
  - Triplify:
    - Maps HTTP-URI requests to relational database queries expressed in SQL.
    - No SPARQL support.
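The Triplify approach of tying URI requests to SQL queries can be pictured with a small lookup table. The sketch below shows the general idea only and is not Triplify's actual configuration format: the QUERIES table, the resolve function, the path layout and the example rows are all assumptions.

```python
import sqlite3

# Hypothetical configuration: first path segment -> SQL query.
# Each query is assumed to expose an `id` column for single-record lookups.
QUERIES = {
    "dataset": "SELECT dataset_id AS id, name FROM dataset",
}


def resolve(conn, path):
    """Return the rows that answer a request path such as /dataset or /dataset/2."""
    parts = [part for part in path.strip("/").split("/") if part]
    if not parts or parts[0] not in QUERIES:
        return []
    sql = QUERIES[parts[0]]
    params = ()
    if len(parts) > 1:
        # Narrow the configured query to a single record.
        sql = f"SELECT * FROM ({sql}) WHERE id = ?"
        params = (int(parts[1]),) if parts[1].isdigit() else (parts[1],)
    return conn.execute(sql, params).fetchall()


# Illustrative in-memory data; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (dataset_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO dataset VALUES (?, ?)",
    [(1, "CTD casts"), (2, "Bottle samples")],
)
all_rows = resolve(conn, "/dataset")
one_row = resolve(conn, "/dataset/2")
missing = resolve(conn, "/unknown")
```

Since every request is answered by a fixed SQL query, there is no query language on the RDF side, which matches the "No SPARQL support" point above.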

19
  - R2O:
    - XML-based declarative mapping language.
  - DartGrid Semantic Web toolkit:
    - Provides a visual tool to define mappings.
  - RDBToOnto:
    - User-oriented tool that creates a static mapping (RDF dump).
  - Asio Semantic Bridge for Relational Databases (SBRD) and Automapper:
    - Uses the table-to-class approach.

20  Prof. Peter Fox  Patrick West  Eric Rozell  Ankesh Khandelwal  Evan Patton

