FORTH Research Activities PlanetData WP1-3 Meeting (Frankfurt, Nov10) Giorgos Flouris, Irini Fundulaki – FORTH.


Slide 2 of 40 (November 25, 2010, Frankfurt, Germany): FORTH in PD Research
WP2: Quality Assessment and Context
◦ T2.1: Data Quality Assessment and Repair (FUB, M1-M42)
WP3: Provenance and Access Policies
◦ T3.1: Provenance Management (FORTH, M1-M36)
◦ T3.2: Privacy, DRM, and Access Control (FORTH, M1-M42)

Slide 3 of 40: Presentation Outline
Three main topics/tasks
◦ Repair (T2.1)
◦ Provenance (T3.1)
◦ Access control, privacy, DRM (T3.2)
Outline
◦ Summary and objectives
◦ Introduction and motivation
◦ Existing work and research plan
◦ Innovation
◦ Interactions within the project

Slide 4 of 40: PART I: Repairs
WP2: Quality Assessment and Context
◦ T2.1: Data Quality Assessment and Repair (FUB, M1-M42)
Objective of our work
◦ Study methodologies for repairing invalidities in a way that causes minimal effects on the data
Extra
◦ Apply the same methodologies to updates

Slide 5 of 40: Repairs: Introduction
Validity rules guarantee:
◦ Special semantics (e.g., acyclic subsumptions)
◦ Application-specific rules or requirements (e.g., functional properties)
Validity: an important dimension of quality
◦ Violated at design time
◦ Violated during updates or other changes
◦ Violated when validity rules change
Solution: Repair
◦ Given an invalid graph, produce a valid one that is as close as possible to the original

Slide 6 of 40: Repairing Process
Invalid graph → Repair → Valid graph
Main challenges:
1) Several potential repairs
2) Must find the “closest” ones
Major questions:
1) How to determine potential repairs?
2) How is “distance” measured?
Assumptions:
1) RDF/S graphs
2) Rules expressed in DED form

Slide 7 of 40: Example
Validity rules:
◦ Properties should have a unique domain and range
◦ The subject/object of a property instance should be correctly classified per the property’s domain/range
Graph:
A rdf:type rdfs:Class, B rdf:type rdfs:Class, C rdf:type rdfs:Class
P rdf:type rdf:Property
P rdfs:range B, P rdfs:domain A, P rdfs:domain C
x P y, x rdf:type A, y rdf:type B
[Diagram: the RDF/S graph described by the triples above]

Slide 8 of 40: Example (Resolution #1)
Validity rules:
◦ Properties should have a unique domain and range
◦ The subject/object of a property instance should be correctly classified per the property’s domain/range
Graph:
A rdf:type rdfs:Class, B rdf:type rdfs:Class, C rdf:type rdfs:Class
P rdf:type rdf:Property
P rdfs:range B, P rdfs:domain A, P rdfs:domain C
x P y, x rdf:type A, y rdf:type B
Problem: two domains for the same property
Solution: delete one of the domains

Slide 9 of 40: Example (Resolution #2)
Validity rules:
◦ Properties should have a unique domain and range
◦ The subject/object of a property instance should be correctly classified per the property’s domain/range
Solution #1 (domain C deleted):
A rdf:type rdfs:Class, B rdf:type rdfs:Class, C rdf:type rdfs:Class
P rdf:type rdf:Property
P rdfs:range B, P rdfs:domain A
x P y, x rdf:type A, y rdf:type B
Solution #2 (domain A deleted):
A rdf:type rdfs:Class, B rdf:type rdfs:Class, C rdf:type rdfs:Class
P rdf:type rdf:Property
P rdfs:range B, P rdfs:domain C
x P y, x rdf:type A, y rdf:type B
Problem (in Solution #2): incorrect classification
Solution: change the domain OR make x an instance of C OR delete the property instance [by the rule syntax]
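The two validity rules of the example can be checked mechanically. The following is a minimal sketch, not the authors' implementation: it assumes triples are plain Python tuples, ignores RDFS inference (no subclass reasoning), and the function name `violations` is hypothetical.

```python
RDF_TYPE, DOMAIN, RANGE = "rdf:type", "rdfs:domain", "rdfs:range"

# The example graph from the slides, as a set of (subject, predicate, object) tuples.
graph = {
    ("A", RDF_TYPE, "rdfs:Class"), ("B", RDF_TYPE, "rdfs:Class"),
    ("C", RDF_TYPE, "rdfs:Class"), ("P", RDF_TYPE, "rdf:Property"),
    ("P", RANGE, "B"), ("P", DOMAIN, "A"), ("P", DOMAIN, "C"),
    ("x", "P", "y"), ("x", RDF_TYPE, "A"), ("y", RDF_TYPE, "B"),
}

def violations(graph):
    """Return a list of violations of the two example validity rules."""
    found = []
    props = {s for (s, p, o) in graph if p == RDF_TYPE and o == "rdf:Property"}
    for prop in sorted(props):
        # Rule 1: a property has a unique domain and a unique range.
        for kind in (DOMAIN, RANGE):
            targets = sorted(o for (s, p, o) in graph if s == prop and p == kind)
            if len(targets) > 1:
                found.append(("non-unique", prop, kind, tuple(targets)))
        # Rule 2: subject/object of each instance is classified per domain/range.
        for (s, p, o) in sorted(graph):
            if p != prop:
                continue
            for (ds, dp, cls) in sorted(graph):
                if ds == prop and dp == DOMAIN and (s, RDF_TYPE, cls) not in graph:
                    found.append(("misclassified", s, cls))
                if ds == prop and dp == RANGE and (o, RDF_TYPE, cls) not in graph:
                    found.append(("misclassified", o, cls))
    return found

solution1 = graph - {("P", DOMAIN, "C")}  # delete domain C
solution2 = graph - {("P", DOMAIN, "A")}  # delete domain A
print(violations(solution1))
print(violations(solution2))
```

Under these assumptions, deleting domain C leaves a valid graph, while deleting domain A leaves x misclassified with respect to C, which is the situation Resolution #2 addresses.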

Slide 10 of 40: Potential Repairs: Challenges
Easy to determine how to resolve a single violated rule, but …
◦ Several violations
◦ Several repairing options per violation
Resolution interdependencies:
◦ Repairing one violation in a certain way may cause another violation
◦ Repairing one violation in a certain way may repair multiple violations
Need for an exhaustive, rule-based search to determine all potential repairs
◦ Tree-based search (recursive)
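The recursive, tree-based search can be sketched as follows. This is an illustration under stated assumptions only: the single rule ("a property has a unique domain"), the helper names `find_violation`, `resolutions` and `potential_repairs`, and the depth bound are all hypothetical and not taken from the slides.

```python
DOMAIN = "rdfs:domain"

def find_violation(graph):
    """Return one violated rule instance, or None if the graph is valid."""
    domains = {}
    for (s, p, o) in sorted(graph):
        if p == DOMAIN:
            domains.setdefault(s, []).append(o)
    for prop, classes in sorted(domains.items()):
        if len(classes) > 1:
            return (prop, tuple(classes))
    return None

def resolutions(graph, violation):
    """Yield every graph obtained by resolving the violation in one way."""
    prop, classes = violation
    for cls in classes:
        yield frozenset(graph) - {(prop, DOMAIN, cls)}

def potential_repairs(graph, depth=10):
    """Explore the tree of resolutions; leaves without violations are repairs."""
    violation = find_violation(graph)
    if violation is None:
        return {frozenset(graph)}
    if depth == 0:          # guard against resolutions that never converge
        return set()
    repairs = set()
    for fixed in resolutions(graph, violation):
        # Re-check after each resolution: it may cause or fix other violations.
        repairs |= potential_repairs(fixed, depth - 1)
    return repairs

graph = frozenset({
    ("P", DOMAIN, "A"), ("P", DOMAIN, "C"),
    ("Q", DOMAIN, "A"), ("Q", DOMAIN, "B"),
})
repairs = potential_repairs(graph)
print(len(repairs))
```

Collecting repairs in a set means that repairs reached along different branches of the tree are reported once.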

Slide 11 of 40: Selecting a Potential Repair
Which repair should be returned?
◦ We want the repaired KB to be as close as possible to the original
User-defined notion of “distance”
◦ Specifications for selecting “preferred repairs”
◦ Based on user-defined preferences
Preferred repair depends on the context and application, for example:
◦ Under complete knowledge, prefer removals
◦ In an open setting, prefer additions

Slide 12 of 40: Determining Preferences
Provide specifications to determine the preferred repair
◦ Important features of a potential repair (e.g., additions, schema changes)
◦ Comparing the values of important features (e.g., minimize, “around”)
◦ Combining features (preferences) (e.g., prioritize, Pareto preference)
Flight analogy
◦ Minimize number of stops
◦ Minimize cost
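A sketch of how such preferences might select among potential repairs follows. The two features (number of additions, number of deletions) and the function names are assumptions chosen for illustration; the slides do not fix a concrete feature set or preference language.

```python
def features(original, repaired):
    """Feature vector of a repair: (number of additions, number of deletions)."""
    return (len(repaired - original), len(original - repaired))

def pareto_preferred(original, candidates):
    """Keep repairs whose feature vector is not dominated by another one."""
    scored = [(features(original, c), c) for c in candidates]

    def dominated(f):
        # g dominates f if it is no worse on every feature and differs from f.
        return any(all(a <= b for a, b in zip(g, f)) and g != f
                   for g, _ in scored)

    return [c for f, c in scored if not dominated(f)]

def prioritized(original, candidates, order):
    """Minimize features lexicographically, in the given priority order."""
    def key(c):
        f = features(original, c)
        return tuple(f[i] for i in order)
    best = min(key(c) for c in candidates)
    return [c for c in candidates if key(c) == best]

original = frozenset({"t1", "t2", "t3"})
r1 = original - {"t1"}                 # one deletion
r2 = original | {"t4"}                 # one addition
r3 = (original - {"t1", "t2"}) | {"t4"}  # one addition, two deletions
candidates = [r1, r2, r3]

print(pareto_preferred(original, candidates))
print(prioritized(original, candidates, order=(0, 1)))  # fewest additions first
print(prioritized(original, candidates, order=(1, 0)))  # fewest deletions first
```

Minimizing additions first corresponds to preferring removals (the complete-knowledge setting), and minimizing deletions first to preferring additions (the open setting).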

Slide 13 of 40: Repairs: Summary
Framework for repairing invalidities in RDF/S graphs
◦ Potential repairs determined using syntactical manipulations over the validity rules
◦ Preferred repairs determined using formal preferences
Research plan:
◦ Formal description of a repair framework
◦ Develop, optimize, experiment with, and study the repair algorithm

Slide 14 of 40: Innovation
Existing approaches:
◦ Built-in preferences
◦ Built-in validity rules
Our proposal is:
◦ Flexible: preferences can be set at run-time
◦ Adaptable: different rules and preferences
◦ Intuitive: easy-to-define preferences
◦ Very general: different repair policies from the literature can be expressed in our framework
◦ Easy to implement: we can use off-the-shelf implementations for preference evaluation

Slide 15 of 40: Interactions within PD
Only within WP2 and Task 2.1
Related deliverable: D2.2 (M18)

Slide 16 of 40: An Extra: Updates
Apply an update on an RDF/S graph, in the presence of validity rules
◦ Originally a part of WP1, but no longer
Similar ideas as with repair
◦ Apply the update (in a naïve manner)
◦ Repair the result, taking into account what the update was
Principles
◦ Success (update must be applied)
◦ Validity (result must be valid)
◦ Minimal change (minimal “distance”, per the preferences)

Slide 17 of 40: PART II: Provenance
WP3: Provenance and Access Policies
◦ T3.1: Provenance Management (FORTH, M1-M36)
Objectives of our work
◦ Provenance for RDF and RDFS (inference)
◦ Provenance for SPARQL query and update
◦ Efficient storage schemes for provenance

Slide 18 of 40: Provenance: Introduction
Provenance: information on the origin of data
◦ From where and how the piece of data was obtained
Allows/supports:
◦ Assessment of data trustworthiness and quality
◦ Reproducibility of experiments
◦ Justification of decisions (e.g., argumentation)
◦ Access control, privacy, DRM, trust
Focus on RDF/S
◦ Inspired by DB provenance and annotation models

Slide 19 of 40: Main Challenge
[Diagram: RDF triples combined through RDFS inference rules; which provenance does the resulting triple get?]
Provenance tag = colour
◦ A subset I of URIs distinguished from the set of class and property names or types

Slide 20 of 40: Annotation Models
Annotation models:
◦ Annotation computation is coupled with a particular application and a particular assignment of source data annotations
Example (t: trusted, f: untrusted):
R1 (X, Y, Annot): (a, b, t), (c, d, t)
R2 (Y, Z): (b, e)
R1 ⋈ R2 (X, Y, Z): (a, b, e)
[Diagram: tuples annotated with t and f values; re-evaluate the query]

Slide 21 of 40: Abstract Annotation Models
Abstract annotation models:
◦ Abstract provenance tokens and operators are substituted by appropriate concrete tokens for a particular application and assignment
Example:
R1 (X, Y, Annot): (a, b, c1), (c, d, c2)
R2 (Y, Z, Annot): (b, e, c3)
R1 ⋈ R2 (X, Y, Z, Annot): (a, b, e, c1 x c3)
Substituting concrete tokens for c1 x c3: t ∧ t, or t ∧ f
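The separation between abstract tokens and their concrete interpretation can be sketched in a few lines. This is a hedged illustration, not the authors' system: the tuple encoding of the expression `c1 x c3` and the names `join` and `evaluate` are assumptions.

```python
def join(r1, r2):
    """Join R1(X, Y, annot) with R2(Y, Z, annot) on Y.

    Each result tuple carries the abstract expression ("x", a1, a2),
    standing for the product of the two source tokens.
    """
    return [((x, y, z), ("x", a1, a2))
            for (x, y, a1) in r1
            for (y2, z, a2) in r2
            if y == y2]

def evaluate(expr, assignment, times):
    """Substitute concrete values for tokens and a concrete operator for 'x'."""
    if isinstance(expr, tuple):
        _, left, right = expr
        return times(evaluate(left, assignment, times),
                     evaluate(right, assignment, times))
    return assignment[expr]

r1 = [("a", "b", "c1"), ("c", "d", "c2")]
r2 = [("b", "e", "c3")]
result = join(r1, r2)          # computed once, with abstract tokens

def trust_and(p, q):
    """Concrete operator for a trust application: logical conjunction."""
    return p and q

# Two different assignments, evaluated without re-running the join.
all_trusted = {"c1": True, "c2": True, "c3": True}
c3_untrusted = {"c1": True, "c2": True, "c3": False}
for (tup, expr) in result:
    print(tup, evaluate(expr, all_trusted, trust_and),
          evaluate(expr, c3_untrusted, trust_and))
```

The point of the design is that changing the trust assignment only requires re-evaluating the stored expression, not the query itself.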

Slide 22 of 40: Inference and Provenance
Colours: a subset I of URIs distinguished from the set of class and property names or types
To model colour propagation through inference rules, we define an operation ‘+’ to compose colours
(I, ‘+’) is a commutative semigroup
◦ c1 + c2 = c2 + c1 (commutativity)
◦ c1 + (c2 + c3) = (c1 + c2) + c3 (associativity)
◦ c + c = c (idempotence)

Slide 23 of 40: Inference and Provenance
Why-provenance
◦ Which explicit triples contributed to get an implicit one?
◦ Ignore how (i.e., which rules were used)
◦ A single operator ‘+’ for all inference rules
Ignore how many times a triple was used
◦ ‘+’ is idempotent: c + c = c
Ignore the order of application
◦ ‘+’ is commutative: c1 + c2 = c2 + c1
◦ ‘+’ is associative: c1 + (c2 + c3) = (c1 + c2) + c3
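One concrete structure that satisfies these three properties is set union over sets of source colours. The sketch below assumes that model purely for illustration (the slides only state the algebraic properties), applies a single step of rdfs:subClassOf transitivity, and uses hypothetical names and colour labels throughout.

```python
SUBCLASS = "rdfs:subClassOf"

def colour(*sources):
    """A colour, modelled here (as an assumption) as a set of source names."""
    return frozenset(sources)

def plus(c1, c2):
    """The '+' operation: set union is commutative, associative, idempotent."""
    return c1 | c2

def apply_transitivity(quads):
    """One application of subClassOf transitivity over coloured triples.

    Each quad is (subject, predicate, object, colour); an implicit triple
    receives the composition of the colours of the triples that produced it.
    """
    derived = set()
    for (a, p1, b, col1) in quads:
        for (b2, p2, c, col2) in quads:
            if p1 == SUBCLASS and p2 == SUBCLASS and b == b2:
                derived.add((a, SUBCLASS, c, plus(col1, col2)))
    return derived

red, blue, green = colour("red"), colour("blue"), colour("green")
quads = {("A", SUBCLASS, "B", red), ("B", SUBCLASS, "C", blue)}
derived = apply_transitivity(quads)
print(derived)
```

With this model the implicit triple A rdfs:subClassOf C is tagged with both source colours, regardless of the order in which the explicit triples are combined.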

Slide 24 of 40: SPARQL Provenance Model
SPARQL CONSTRUCT queries generate triples in a manner similar to inference
◦ Except that it is query-dependent
Similar problems
Abstract annotation models can capture the provenance of SPARQL
◦ For queries that do not use the OPTIONAL operator
◦ Monotonicity no longer holds in the case of OPTIONAL

Slide 25 of 40: Work So Far (Provenance)
Provenance models for RDF/S
◦ Pediaditis, Flouris, Fundulaki, Christophides. On Explicit Provenance Management in RDF/S Graphs. TAPP-09.
◦ Flouris, Fundulaki, Pediaditis, Theoharis, Christophides. Coloring RDF Triples to Capture Provenance. ISWC-09.
Provenance models for SPARQL
◦ Theoharis, Fundulaki, Karvounarakis, Christophides. On Provenance of Queries on Linked Web Data. To appear in IEEE Internet Computing: Jan/Feb Provenance in Web Applications.

Slide 26 of 40: Research Plans (Provenance)
◦ How-provenance (more expressive)
◦ Provenance for dynamically evolving data
◦ Support OPTIONAL (in SPARQL)
◦ Efficient storage schemes for provenance
◦ Apply this work to privacy, DRM and access control

Slide 27 of 40: Innovation
Use of abstract annotation models to model provenance propagation in the Semantic Web context (RDF/S)
The current state of the art is either concrete and designed for a given application, or designed for the DB context
Advantages
◦ Easy to update the KB
◦ Easy to change or experiment with different provenance propagation models
◦ Flexibility

Slide 28 of 40: Interactions within PD
Within WP3 the work is very relevant to:
◦ T3.2 (Privacy, DRM and Access Control) – FORTH
◦ T3.3 (Trust Management) – EPFL
◦ Provenance is essential in the above
◦ General approach related to annotation models, tagging etc. (T3.1, T3.2, T3.3)
WP2 deals with provenance and annotation models as well (KIT)
◦ Unsure about exact interaction and/or overlaps
Related deliverable: D3.2 (M36)

Slide 29 of 40: PART III: Access Control
WP3: Provenance and Access Policies
◦ T3.2: Privacy, DRM and Access Control (FORTH, M1-M42)
Objectives of our work
◦ Access control specification language
◦ Access control enforcement mechanism
◦ Data model agnostic access control framework
◦ Privacy-aware framework (purpose)
◦ Effects of provenance and access control on DRM

Slide 30 of 40: Access Control: Introduction
Crucial for sensitive content
◦ Refers to the ability to permit or deny the use of a particular resource by a particular entity
◦ Ensures the selective exposure of information to different classes of users
Focus
◦ For RDF graphs
◦ Fine-grained (triple-level)

Slide 31 of 40: Permissions
Used to tag triples (+/- tags)
◦ Allow access for the user in question (+)
◦ Deny access for the user in question (-)
SPARQL query to identify which triples to tag
R = include/exclude (x, p, y) where TP, C
where
◦ (x, p, y) is a SPARQL triple pattern
◦ TP is a conjunction of triple patterns and
◦ C is a conjunction of constraints

Slide 32 of 40: Access Control Policies
Some triples are untagged (missing permissions)
Default semantics
◦ Will access be granted by default?
◦ Access granted: +, access denied: -
Some triples are multiply tagged with different tags (ambiguous permissions)
Conflict resolution
◦ Will access be granted to multiply tagged triples?
◦ Access granted: +, access denied: -

Slide 33 of 40: Accessible Triples
[Diagram: the set of all triples (in the graph), with the subsets tagged by “include” permissions and by “exclude” permissions]
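How permissions, default semantics and conflict resolution combine into the set of accessible triples can be sketched as follows. This is a simplified illustration under assumptions: permissions are reduced to a single triple pattern (the `where TP, C` part is omitted), variables are strings starting with `?`, and all names and data are hypothetical.

```python
def matches(pattern, triple):
    """True if the triple matches the pattern; '?name' terms are variables."""
    binding = {}
    for want, got in zip(pattern, triple):
        if want.startswith("?"):
            if binding.setdefault(want, got) != got:
                return False
        elif want != got:
            return False
    return True

def tagged(graph, patterns):
    """Triples selected by at least one of the permission patterns."""
    return {t for t in graph if any(matches(p, t) for p in patterns)}

def accessible(graph, include, exclude, default="-", conflict="-"):
    """Apply +/- tags, then default semantics and conflict resolution."""
    plus, minus = tagged(graph, include), tagged(graph, exclude)
    result = set()
    for t in graph:
        if t in plus and t in minus:       # ambiguous permissions
            grant = conflict == "+"
        elif t in plus:
            grant = True
        elif t in minus:
            grant = False
        else:                              # missing permissions
            grant = default == "+"
        if grant:
            result.add(t)
    return result

graph = {
    ("alice", "name", "Alice"), ("alice", "salary", "5000"),
    ("bob", "name", "Bob"), ("bob", "salary", "4000"),
    ("bob", "email", "bob@example.org"),
}
include = [("?x", "name", "?y"), ("alice", "?p", "?o")]
exclude = [("?x", "salary", "?y")]

strict = accessible(graph, include, exclude, default="-", conflict="-")
lenient = accessible(graph, include, exclude, default="+", conflict="+")
print(sorted(strict))
print(sorted(lenient))
```

Here alice's salary is tagged both + and -, and bob's email is untagged, so the two policy settings differ exactly on those two triples.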

Slide 34 of 40: Our Work (Access Control)
Access control framework for RDF graphs
◦ Flouris, Fundulaki, Michou, Antoniou. Controlling Access to RDF Graphs. FIS-10.
At the moment
◦ RDF only (RDFS inference not supported)
◦ Focus on read-only operations (no update or write permissions can be set)
◦ Implementation exists (repository-independent and portable across platforms)
◦ Specific access permissions allowed (+/-)

Slide 35 of 40: Research Plans
Abstract access control models
◦ More expressive tags (e.g., permission levels)
Access control for RDFS
◦ Requires more expressive policies
◦ Support inference
◦ Support propagation in access control
◦ “Safe” access control policies
Access control for dynamic data
Access control for edits (not only read)
Data model agnostic access control
◦ Extension/generalization of existing work

Slide 36 of 40: Privacy
Privacy: controlling access to private data
◦ Access control, enhanced with the notion of purpose
◦ Ensure the selective exposure of sensitive data to different requesters and requester purposes
Apply our access control model for privacy
◦ Privacy-aware framework
◦ Enhance our model with the notion of purpose

Slide 37 of 40: Digital Rights Management
DRM
◦ Specification of digital rights
◦ Controlling access/usage based on digital rights
◦ Prevent/detect abuse of data (violation of digital rights)
Importance
◦ Users must know what they can (legally) do with the data
Effects of provenance and access control on DRM
◦ DRM is closely related to provenance and access control models
◦ Identify peculiarities of DRM, extend the approach

Slide 38 of 40: Innovation
The current state of the art is concrete and designed for a given application
General approaches apply to general annotation models in the DB context
Generality
◦ Data model agnostic
◦ Abstract access control policies
◦ Policies support propagation and inference
Advantages
◦ Easy to update the KB
◦ Easy to change or experiment with different access control policies
◦ Flexible

Slide 39 of 40: Interactions within PD
Within WP3 the work is very relevant to:
◦ T3.1 (Provenance Management) – FORTH
◦ T3.3 (Trust Management) – EPFL
◦ Provenance is essential for access control
◦ Trust management is related to access control
◦ General approach related to annotation models (applicable in T3.1, T3.2, T3.3)
Related deliverables: D3.1 (M24), D3.3 (M42)

Slide 40 of 40: Conclusion
Research activities of FORTH within PlanetData
◦ T2.1: Repair (plus update)
◦ T3.1: Provenance
◦ T3.2: Privacy, DRM, and Access Control
Innovative work, focusing on generality, flexibility, adaptability
Work already started
◦ Basic ideas and preliminary results established
◦ Some publications as well
◦ Research plans established (subject to change)
Interactions mainly within the respective WPs
◦ WP3: KIT, EPFL (interactions to be defined/discussed)