Data Intensive Techniques to Boost the Real-time Performance of Global Agricultural Data Infrastructures. SemaGrow: Using a POWDER Triple Store for Boosting the Real-time Performance of Global Agricultural Data Infrastructures

Presentation transcript:

SemaGrow: Using a POWDER Triple Store for Boosting the Real-time Performance of Global Agricultural Data Infrastructures
Project: Data Intensive Techniques to Boost the Real-time Performance of Global Agricultural Data Infrastructures
KREAM, June 2013
Pythagoras Karampiperis, National Centre for Scientific Research “Demokritos”

Outline (KREAM, 5 June 2013)
- Introduction / Problem Statement
- The SemaGrow Solution
- The POWDER W3C Recommendation
- SemaGrow Architecture
- The SemaGrow Stack
- SemaGrow Maintenance Components

Moving Forward with “Old” Technologies
How many? Big Data problem! Is it feasible?

What the Semantic Web Can Bring into the Picture
- Going beyond existing distributed triple store implementations
  - Link heterogeneous but semantically connected data
  - Index extremely large information volumes (peta sizes)
  - Improve information retrieval response
- Data (+ metadata) physically stored at the data provider
  - No need for harvesting
- Vocabularies / thesauri / ontologies of the data provider's choice
  - No need for aligning according to common schemas
- One data access point for the entire data cloud
  - Enabling service-data level agreements with data providers
- Application-level vocabularies / thesauri / ontologies
  - Enabling different application facets for different communities of users over the SAME data pool

The SemaGrow Solution
- Use POWDER to mass-annotate large subspaces
- Exploit naming convention regularities to compress the indexes used by the system
- Partition triple patterns in the original query
- Annotate each fragment with an ordered list of data sources most likely to contain relevant data
- Distribute and transform the query fragments
- Collect and align the results
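
The partition, annotate, distribute, and collect steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SemaGrow implementation: `SOURCE_INDEX`, `SOURCES`, the `ex:` terms, and all function names are hypothetical, the data sources are in-memory lists rather than remote endpoints, and the transform and align steps are left out.

```python
from collections import defaultdict

# Hypothetical source index: predicate -> {source name: estimated relevance}.
SOURCE_INDEX = {
    "ex:hasCrop": {"sourceA": 0.9, "sourceB": 0.2},
    "ex:hasYield": {"sourceB": 0.7},
}

# Hypothetical in-memory stand-ins for remote data sources.
SOURCES = {
    "sourceA": [("ex:farm1", "ex:hasCrop", "ex:wheat")],
    "sourceB": [("ex:farm2", "ex:hasCrop", "ex:maize"),
                ("ex:farm1", "ex:hasYield", "3.2")],
}

def annotate(pattern):
    """Return the sources likely to answer a triple pattern, most promising first."""
    scores = SOURCE_INDEX.get(pattern[1], {})
    return sorted(scores, key=scores.get, reverse=True)

def matches(pattern, triple):
    """A term starting with '?' is a variable and matches anything."""
    return all(p.startswith("?") or p == t for p, t in zip(pattern, triple))

def run(query_patterns):
    """Treat each triple pattern as one fragment, dispatch it, and collect results."""
    results = defaultdict(list)
    for pattern in query_patterns:
        for source in annotate(pattern):
            results[pattern].extend(
                t for t in SOURCES[source] if matches(pattern, t))
    return dict(results)
```

Calling `run([("?f", "ex:hasCrop", "?c")])` queries `sourceA` before `sourceB`, because the hypothetical index ranks it higher for that predicate.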

The POWDER W3C Recommendation
- Exploits natural groupings of URIs to annotate all resources in a subset of the URI space
  - Regular expression based grouping
- Allows properties and their values to be associated with an arbitrary number of subjects within a fully defined semantic framework
- POWDER Description Resources:
- POWDER Formal Semantics:
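
The idea of regular expression based grouping can be illustrated with a small Python sketch. It does not use the POWDER XML syntax or any POWDER library; `DESCRIPTION_RESOURCES`, the `example.org` URIs, and the property names are assumptions made up for this example. Each rule attaches a set of properties to every URI matching a pattern, so one rule stands in for an arbitrary number of per-resource statements.

```python
import re

# Hypothetical description resources: each pairs a URI pattern with the
# properties asserted for every resource whose URI matches it.
DESCRIPTION_RESOURCES = [
    (re.compile(r"^http://data\.example\.org/crops/.*"),
     {"topic": "crops", "source": "endpoint-A"}),
    (re.compile(r"^http://data\.example\.org/soil/\d+$"),
     {"topic": "soil", "source": "endpoint-B"}),
]

def describe(uri):
    """Merge the properties of every description resource whose pattern matches."""
    properties = {}
    for pattern, props in DESCRIPTION_RESOURCES:
        if pattern.match(uri):
            properties.update(props)
    return properties
```

Two rules here describe every URI under `/crops/` and every numeric URI under `/soil/`, which is the kind of index compression that naming regularities make possible.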

The SemaGrow Stack
- Integrates the components in order to offer a single SPARQL endpoint that federates a number of heterogeneous data sources
- Targets the federation of independently provided data sources

SemaGrow Architecture
Components: Query Decomposition, Resource Discovery, Data Summaries Endpoint, Federated Endpoint Wrapper

Query Decomposition
- Analyses SPARQL queries
- Decides on the optimal way to create query fragments to be dispatched to the sources' endpoints
- Components
  - Query Decomposition: suggests possible decompositions
  - Selector: evaluates these suggestions based on information and predictions from the Resource Discovery component
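
The split between suggesting decompositions and selecting one can be sketched as follows. This is a toy model under loud assumptions: `PREDICTED_COST` stands in for whatever predictions Resource Discovery supplies (here imagined as expected response times in milliseconds), the per-source overhead of 100 is arbitrary, and each pattern is assigned to exactly one source, which a real federation engine need not do.

```python
from itertools import product

# Hypothetical predictions: for each triple pattern, the candidate sources
# and a predicted cost (for example, expected response time in ms).
PREDICTED_COST = {
    "?f ex:hasCrop ?c": {"sourceA": 1000, "sourceB": 50},
    "?f ex:hasYield ?y": {"sourceB": 200, "sourceC": 5000},
}

def suggest_decompositions(patterns):
    """Enumerate every assignment of one candidate source to each pattern."""
    candidates = [list(PREDICTED_COST[p]) for p in patterns]
    for choice in product(*candidates):
        yield dict(zip(patterns, choice))

def estimated_cost(decomposition):
    """Toy cost: summed predicted cost plus a fixed overhead per distinct source."""
    total = sum(PREDICTED_COST[p][s] for p, s in decomposition.items())
    return total + 100 * len(set(decomposition.values()))

def select(patterns):
    """Selector: pick the suggestion with the lowest estimated cost."""
    return min(suggest_decompositions(patterns), key=estimated_cost)
```

With these numbers, `select(list(PREDICTED_COST))` sends both patterns to `sourceB`, since a single fragment to one source avoids the per-source overhead.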

Resource Discovery
- Provides an annotated list of candidate data sources that (possibly) hold triples matching a query pattern
- Sources are annotated with additional information
  - Schema-level metadata
  - Instance-level metadata
  - Predicted response volume
  - Run-time information about current source load
  - Semantic proximity of source and query schemas
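
One way to picture an annotated candidate list is as records that carry the annotations above and are ordered by a combined score. The sketch below is an assumption for illustration only: the `Candidate` fields, the weights, and the linear scoring formula are invented here and are not taken from SemaGrow.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    predicted_volume: int      # predicted number of matching triples
    current_load: float        # 0.0 (idle) to 1.0 (saturated)
    semantic_proximity: float  # 0.0 (unrelated schema) to 1.0 (same schema)

def score(c, w_volume=0.5, w_load=0.2, w_proximity=0.3, max_volume=10_000):
    """Weighted sum: more predicted data, less load, closer schema rank higher."""
    volume = min(c.predicted_volume, max_volume) / max_volume
    return (w_volume * volume
            + w_load * (1.0 - c.current_load)
            + w_proximity * c.semantic_proximity)

def rank(candidates):
    """Return candidates ordered from most to least promising."""
    return sorted(candidates, key=score, reverse=True)
```

For example, `rank([Candidate("B", 1000, 0.1, 0.4), Candidate("A", 9000, 0.5, 0.8)])` places `A` first, because its larger predicted volume and closer schema outweigh its higher load under these hypothetical weights.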

Data Summaries Endpoint
- Serves metadata about the schema and instances of the various federated data stores
- Receives entity URIs
- Returns the repositories where these entities are located (either at the schema or instance level)
- Returns ontology alignment knowledge regarding entity equivalence between different sources

Federated Endpoint Wrapper
- Manages the communication with external data sources federated by the SemaGrow Stack
- Query Manager
  - Calls the Query Transformation Service when necessary
  - Forwards query fragments to the Query Results Merger
  - Collects and forwards run-time statistics to the Resource Discovery component
- Query Results Merger
  - Pay-as-you-go behaviour
  - Provides first approximations and iteratively refines them if more computational resources are warranted by the reactivity parameters
- Query Transformation Service
  - Accesses the Schema Mappings Repository
  - Rewrites query fragments from the original query schema to that of the data source that will be used for the fragment
  - Rewrites query results from the source schema to the query schema
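
The two rewriting directions of the Query Transformation Service can be sketched with a simple term-for-term mapping. This is a minimal illustration, assuming one-to-one mappings between schema terms; `MAPPING`, the `q:` and `src:` prefixes, and the sample triples are hypothetical, and real schema mappings are generally richer than a dictionary lookup.

```python
# Hypothetical schema mapping: query-schema term -> source-schema term.
MAPPING = {
    "q:cropName": "src:crop_label",
    "q:Farm": "src:AgriculturalHolding",
}
# Inverse mapping, used to bring results back to the query schema.
REVERSE = {v: k for k, v in MAPPING.items()}

def rewrite(triples, mapping):
    """Replace every mapped term; leave variables and unmapped terms untouched."""
    return [tuple(mapping.get(term, term) for term in triple)
            for triple in triples]

# Query fragment expressed in the query schema.
fragment = [("?f", "rdf:type", "q:Farm"), ("?f", "q:cropName", "?name")]
source_fragment = rewrite(fragment, MAPPING)

# Results as a source might return them, rewritten back to the query schema.
source_results = [("src:h1", "src:crop_label", "wheat")]
query_results = rewrite(source_results, REVERSE)
```

The same `rewrite` function serves both directions; only the mapping passed to it changes.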

Maintenance Components
- Authoring Tool
  - Visual tool for assisting data providers
  - Construction of POWDER statements
  - Provenance and cataloguing metadata
- Ontology Alignment Tool
  - Semi-automatic (with human intervention) alignment of the semantic vocabularies used by data providers and consumers
- Content Classification and Ontology Evolution
  - Refines coarsely annotated data to a level of detail where they can be more accurately aligned with other schemas within the federation

Project info
- SemaGrow: Data intensive techniques to boost the real-time performance of global agricultural data infrastructures
- FP7-ICT (Intelligent Information Management)

Partners:
1. Universidad de Alcala
2. NCSR “Demokritos”
3. Universita Degli Studi di Roma Tor Vergata
4. Semantic Web Company
5. Institut Za Fiziku
6. Stichting Dienst Landbouwkundig Onderzoek
7. Food and Agriculture Organization of the UN
8. Agroknow Technologies

Thank You!
Dr. Pythagoras P. Karampiperis
Institute of Informatics & Telecommunications (IIT), NCSR “Demokritos” (NCSR)