Towards linked sensor data: analysis of project task, tools and Hackystat architecture
Author: Myriam Leggieri
GSoC 2009 project for Hackystat

Overview
- Hackystat Architecture
    Data Flow
    Modifications to the Data flow
    Modifications to the Subsystem
- Project Task
    Why RDF
    How RDF solves problems
- RDF Model (to be added tomorrow after a final revision)
    Metadata of Sensor Data
    Sensor Data Type
- RDF Storage
    Requirements
    Performance
    Relational DB support
    OpenLink Virtuoso
- Semantic Web Framework for Java
    Jena
    Sesame
- Conclusions

Hackystat Architecture: Data Flow
[Diagram]
- Client-side: Sensor_1, Sensor_2, ..., Sensor_n send Text/XML data to the SensorBase REST API (PUT).
- Server-side: SensorBase (REST API), DailyProjectData (REST API), Telemetry. Text/XML data is retrieved between the services with GET.
- Each exchange involves unmarshalling and marshalling of Text/XML data.
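The marshalling and unmarshalling step in the data flow above can be sketched as follows. This is a minimal illustration using only the Python standard library; the XML element names and the sample values are hypothetical and are not taken from the Hackystat schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sensor data document; element names are illustrative only.
SAMPLE = """<SensorData>
  <Timestamp>2009-06-01T10:15:00</Timestamp>
  <Tool>Eclipse</Tool>
  <SensorDataType>DevEvent</SensorDataType>
  <Resource>file://home/user/Foo.java</Resource>
  <Owner>user@example.org</Owner>
</SensorData>"""


def unmarshal(xml_text):
    """Turn an XML document into a flat dict of child tag -> text."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


def marshal(record, root_tag="SensorData"):
    """Turn a flat dict back into an XML string."""
    root = ET.Element(root_tag)
    for tag, text in record.items():
        ET.SubElement(root, tag).text = text
    return ET.tostring(root, encoding="unicode")


record = unmarshal(SAMPLE)
round_trip = unmarshal(marshal(record))
```

A sender would call marshal before a PUT and a receiver would call unmarshal after a GET; the round trip preserves the record.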

Project Task: Modifications to the Data flow
- Add an RDF representation of Sensor Data.
[Diagram, server-side]
- SensorBase (REST API), DailyProjectData (REST API), Telemetry.
- DataBase: contains both simple data and RDF triples.
- RDF_Manager: received requests for RDF are forwarded to it; necessary to enhance performance.

Project Task: Modifications to the Subsystem
Extensible and configurable along 3 dimensions:
1. The set of Sensors
2. The set of Sensor Data Types
3. The set of Applications
Modules organized into 4 subsystems:
- CORE: basic framework mechanisms
- APP: applications that generate useful analyses over the Sensor Data
- SENSOR: implement Sensors for development tools
- SDT: implement Sensor Data Types
(Diagram labels: Server-side, Server/Client-side, Client-side)
TASK: Add basic framework mechanisms to handle RDF representations (Server-side)

Project Task: Why RDF
Sensor data was originally built for human consumption: it is machine-readable but not machine-understandable.
PROBLEMS:
1. It is hard to automate its manipulation
2. and especially its aggregation -> sparseness and redundancy of data
SOLUTION: Metadata to describe the available sensor data.
The Resource Description Framework (RDF) is the W3C Recommendation for describing resources, and it is domain-independent.

Project Task: How RDF solves problem 1
Framework = basic conceptual structure used to solve or address complex issues
1. Resource (conceptual mapping of entities)
2. Property (particular feature characterizing a resource)
3. Statement (triple in the form (subject, predicate, object))
[Diagram] Resource -- Property --> Resource or Literal
But what is the meaning of 'Creator'?
RDF Schema = collection of classes organized in a hierarchy, defining the terms used in the model and the restrictions on their usage. A sort of vocabulary -> machine-understandable
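The statement structure above can be sketched as plain tuples. This is an illustration only, not a real RDF library: the http://example.org/ namespace, the resource names and the Literal helper class are assumptions made for the example; the Dublin Core creator URI is the real one.

```python
from collections import namedtuple

Statement = namedtuple("Statement", ["subject", "predicate", "object"])

EX = "http://example.org/"  # hypothetical namespace for the example
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"


class Literal(str):
    """Marks an object value as a literal rather than a resource URI."""


graph = {
    # Resource -- Property --> Resource
    Statement(EX + "report42", DC_CREATOR, EX + "people/alice"),
    # Resource -- Property --> Literal
    Statement(EX + "people/alice", EX + "name", Literal("Alice")),
}


def match(graph, subject=None, predicate=None, obj=None):
    """Return the statements matching a pattern; None acts as a wildcard."""
    return sorted(
        s for s in graph
        if (subject is None or s.subject == subject)
        and (predicate is None or s.predicate == predicate)
        and (obj is None or s.object == obj)
    )


creators = [s.object for s in match(graph, predicate=DC_CREATOR)]
```

Because the predicate is a full URI rather than the bare word 'Creator', its meaning can be looked up in the vocabulary that defines it.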

Project Task: How RDF solves problem 2
- Different meanings for the same resource == different namespaces associated with that resource
- Different namespaces can be combined == different ways of classifying the world can be combined
- Different schemas are linkable through proper properties (e.g. rdfs:subClassOf, rdfs:subPropertyOf, rdfs:seeAlso) and easily mergeable
-> Easy aggregation of sparse data and integration of redundant data
Example: RSS 1.0 describes web resources using title, description and link. It is extended here by adding modules under different namespaces -> further information added
<rdf:RDF xmlns:rdf=" xmlns:dc=" xmlns=" >
(element values of the example item)
XML: A Disruptive Technology
XML is placing increasingly heavy loads on the existing technical infrastructure of the Internet.
The O'Reilly Network
Simon St.Laurent
Copyright © 2000 O'Reilly & Associates, Inc.
XML
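The claim that schemas linked through rdfs:subPropertyOf are easy to merge can be illustrated with a small sketch. Graphs are modelled as sets of tuples, so merging is a set union; the hackystat# namespace, the owner property and the data URIs are hypothetical, and the inference step covers only this one RDFS rule, not full RDFS entailment.

```python
RDFS_SUBPROP = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
DC = "http://purl.org/dc/elements/1.1/"
HS = "http://example.org/hackystat#"  # hypothetical namespace

# Two sources describing data with different vocabularies.
graph_a = {("http://example.org/data/1", DC + "creator", "alice")}
graph_b = {
    ("http://example.org/data/2", HS + "owner", "bob"),
    # Link between the two schemas: every owner is also a creator.
    (HS + "owner", RDFS_SUBPROP, DC + "creator"),
}


def merge(*graphs):
    """Merging RDF graphs amounts to a set union of their triples."""
    merged = set()
    for g in graphs:
        merged |= g
    return merged


def apply_subproperty(graph):
    """Add (s, super, o) for each (s, p, o) with p subPropertyOf super."""
    result = set(graph)
    changed = True
    while changed:  # repeat until no new triple is produced
        changed = False
        links = [(s, o) for (s, p, o) in result if p == RDFS_SUBPROP]
        for sub, sup in links:
            for (s, p, o) in list(result):
                if p == sub and (s, sup, o) not in result:
                    result.add((s, sup, o))
                    changed = True
    return result


merged = apply_subproperty(merge(graph_a, graph_b))
creators = sorted(o for (s, p, o) in merged if p == DC + "creator")
```

After the merge, a single query on dc:creator finds values that were originally recorded under two different properties.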

RDF Storage: Requirements
Hackystat:
- embedded DB = Apache Derby
- has also been ported to PostgreSQL server and Microsoft SQL Server
The RDF storage SHOULD:
- use the W3C recommended SPARQL as query language
- support large datasets
- provide means to implement owl:sameAs inference
- support at least the same relational DBs supported by Hackystat
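One possible reading of the owl:sameAs requirement is sketched below: resources declared equal are collapsed onto a single representative with a union-find structure, so that data recorded under either identifier is aggregated. This is an assumption about how such inference could be done, not how any of the stores discussed here implement it; all example.org URIs are hypothetical.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def canonicalize(graph):
    """Rewrite triples so that owl:sameAs-equal terms share one identifier."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lexicographically smaller root, for determinism.
            if rb < ra:
                ra, rb = rb, ra
            parent[rb] = ra

    for s, p, o in graph:
        if p == OWL_SAME_AS:
            union(s, o)

    return {(find(s), p, find(o)) for s, p, o in graph if p != OWL_SAME_AS}


graph = {
    ("http://example.org/a/user1", OWL_SAME_AS, "http://example.org/b/u-001"),
    ("http://example.org/a/user1", "http://example.org/commits", "3"),
    ("http://example.org/b/u-001", "http://example.org/builds", "5"),
}
canon = canonicalize(graph)
subjects = {s for s, _, _ in canon}
```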

RDF Storage: Performance
RDF stores:
- Virtuoso
- Sesame
- Jena TDB
- Jena SDB
Relational DB-to-RDF wrappers (rewrite SPARQL queries into SQL queries against an application-specific relational schema, based on a mapping):
- D2R Server
- Virtuoso RDF Views
All support SPARQL. How do they perform?
The Berlin SPARQL Benchmark (BSBM) compares the performance of storage systems that expose SPARQL endpoints.
[Chart: performance increase of the SUTs (Systems Under Test) between the second query mix and the average query mix in steady state]
[Chart: load times]
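The wrapper idea, rewriting a query pattern into SQL against an application-specific schema based on a mapping, can be sketched as follows. The mapping format, the sensor_data table and the predicate URIs are hypothetical, and only a single triple pattern is handled; real wrappers such as D2R Server use their own mapping languages and cover full SPARQL.

```python
import sqlite3

# Hypothetical mapping: predicate URI -> (table, key column, value column).
MAPPING = {
    "http://example.org/hackystat#tool": ("sensor_data", "id", "tool"),
    "http://example.org/hackystat#owner": ("sensor_data", "id", "owner"),
}


def rewrite(predicate, obj=None):
    """Rewrite the triple pattern (?s, predicate, obj) into (sql, params)."""
    table, key, column = MAPPING[predicate]
    sql = f"SELECT {key}, {column} FROM {table}"
    params = ()
    if obj is not None:
        sql += f" WHERE {column} = ?"  # bound object becomes a SQL filter
        params = (obj,)
    return sql + f" ORDER BY {key}", params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sensor_data (id INTEGER PRIMARY KEY, tool TEXT, owner TEXT)"
)
conn.executemany(
    "INSERT INTO sensor_data VALUES (?, ?, ?)",
    [(1, "Eclipse", "alice"), (2, "Ant", "bob"), (3, "Eclipse", "bob")],
)

sql, params = rewrite("http://example.org/hackystat#tool", "Eclipse")
rows = conn.execute(sql, params).fetchall()
```

The relational data never has to be copied into a triple store; the cost is that every query pays for the rewriting step.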

RDF Storage: Relational DB support
(V = supported, X = not supported)

Relational DB     Sesame  Jena  Virtuoso  D2R Server
HSQLDB            X       V     V         X
MySQL             V       V     V         V
PostgreSQL        V       V     V         V
Oracle            V       V     V         V
MS SQL Server     X       V     V         V
Apache Derby      X       V     V         X

(Slide callout) has two unsolved issues (though the critical one can be worked around)

RDF Storage: OpenLink Virtuoso
- Has a general-purpose relational database engine enhanced with RDF-oriented data types (e.g. IRIs and language- and type-tagged strings).
- RDF data may be stored as RDF quads (i.e. graph, subject, predicate, object tuples).
- RDF data may also be generated on demand by SPARQL queries against a virtual graph mapped from relational data, which may reside in Virtuoso tables or in tables managed by any third-party RDBMS.
- Presents heterogeneous RDBMSes as a single consistent SQL-queryable data universe.
- Virtuoso RDF Views allow mapping arbitrary collections of relational tables, views, procedures, or web services into SPARQL-accessible RDF. The RDF data is constructed on demand by evaluating SQL queries and stored procedures generated on the fly as part of a SPARQL query-processing pipeline.
- A Virtuoso Jena Provider (Native Graph Model Storage Provider for the Jena Framework) and a Virtuoso Sesame Provider (Native Graph Model Storage Provider for the Sesame Framework) exist.
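The idea of a virtual graph, where quads are produced on demand from relational rows instead of being stored, can be sketched with a generator. The URI scheme (base + table + key), the graph URI and the sample rows are assumptions for illustration and do not reflect the actual Virtuoso RDF Views mapping syntax.

```python
def rows_to_quads(graph_uri, table, rows, key="id",
                  base="http://example.org/"):
    """Lazily yield (graph, subject, predicate, object) quads for each row."""
    for row in rows:
        subject = f"{base}{table}/{row[key]}"
        for column, value in row.items():
            if column == key or value is None:
                continue  # the key forms the subject; NULLs yield no quad
            yield (graph_uri, subject, f"{base}{table}#{column}", str(value))


# Hypothetical relational rows, e.g. as returned by a database cursor.
rows = [
    {"id": 1, "tool": "Eclipse", "owner": "alice"},
    {"id": 2, "tool": "Ant", "owner": None},
]

quads = list(
    rows_to_quads("http://example.org/graph/sensors", "sensor_data", rows)
)
```

Nothing is materialized until the generator is consumed, which mirrors the on-demand construction described above.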

Semantic Web framework for Java: Jena
[Diagram] Applications interact with an abstract model.

Semantic Web framework for Java: Sesame
[Diagram, layer descriptions]
- Defines interfaces and implementations for all basic RDF entities.
- RDF parsers/writers from/to statement/file.
- Developer-oriented methods for uploading data files, querying, and extracting and manipulating data (implementations are e.g. SailRepository and HttpRepository).
- For a client/server implementation: Java Servlets that implement a protocol for accessing Sesame repositories over HTTP (there are client libraries to use this protocol, e.g. HttpClient, used by HttpRepository).
- For a local implementation: a layer that abstracts from the storage and inference details, allowing various types of storage and inference to be used (implementations are e.g. MemoryStore, NativeStore, JDBCStore).
- Storage types: JDBC, Memory, Native. The native store writes data directly to disk (instead of keeping it in main memory) in a binary format optimized for compact storage and fast retrieval.
- 3 types of queries, depending on the returned type: tuples, graphs, boolean.
- 2 query languages supported: SeRQL and SPARQL (a W3C recommendation).
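The three query result types (tuples, graphs, boolean) can be illustrated with a toy matcher over an in-memory set of triples. This is not the Sesame API; the function names select, construct and ask are borrowed from the corresponding SPARQL query forms, and the data is hypothetical. Terms starting with "?" are treated as variables.

```python
def _bind(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    binding = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if binding.setdefault(term, value) != value:
                return None  # same variable bound to two different values
        elif term != value:
            return None
    return binding


def select(graph, pattern):
    """Tuple query: a list of variable bindings."""
    results = []
    for triple in sorted(graph):
        binding = _bind(pattern, triple)
        if binding is not None:
            results.append(binding)
    return results


def construct(graph, pattern, template):
    """Graph query: new triples built from the bindings."""
    return {
        tuple(b.get(term, term) for term in template)
        for b in select(graph, pattern)
    }


def ask(graph, pattern):
    """Boolean query: does at least one triple match?"""
    return any(_bind(pattern, t) is not None for t in graph)


graph = {
    ("s1", "tool", "Eclipse"),
    ("s2", "tool", "Ant"),
    ("s1", "owner", "alice"),
}
tuples = select(graph, ("?s", "tool", "?t"))
new_graph = construct(graph, ("?s", "tool", "?t"), ("?t", "usedIn", "?s"))
found = ask(graph, ("?s", "owner", "bob"))
```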

Conclusions
The following options are available:
1. Using the Jena API
2. Using OpenLink Virtuoso + Jena OR OpenLink Virtuoso + Sesame, with:
   - OpenLink Virtuoso as Relational-to-RDF wrapper
   - OpenLink Virtuoso as relational database engine
       only for RDF storage
       for RDF, XML and any kind of storage (substituting any other existing relational DB)

Conclusions
Jena
PROS with respect to Sesame:
1. Supports Derby and the most common relational DBs
2. Simplicity
As RDF storage system:
1. Quite good performance in the benchmark (using Jena TDB)
CONS:
1. Doesn't provide a REST API
As RDF storage system:
1. Poor support for large datasets

Sesame
PROS with respect to Jena:
1. Availability as a web application through a REST API
2. More complete set of functionality, especially the web-oriented features
As RDF storage system:
1. Better support for large datasets
CONS:
As RDF storage system:
1. Poor performance in the benchmark

OpenLink Virtuoso
PROS:
1. Supports any relational DB
2. Incomparably better performance on benchmarks
3. Heterogeneous RDBMSes can be viewed as a single consistent SQL-queryable data universe

Which is the most suitable combination of tools?