FORTH Research Activities PlanetData WP1-3 Meeting (Frankfurt, Nov10) Giorgos Flouris, Irini Fundulaki – FORTH.


1 FORTH Research Activities PlanetData WP1-3 Meeting (Frankfurt, Nov10) Giorgos Flouris, Irini Fundulaki – FORTH

2 Slide 2 of 40, November 25, 2010, Frankfurt, Germany
FORTH in PD Research
WP2: Quality Assessment and Context
◦ T2.1: Data Quality Assessment and Repair (FUB, M1-M42)
WP3: Provenance and Access Policies
◦ T3.1: Provenance Management (FORTH, M1-M36)
◦ T3.2: Privacy, DRM, and Access Control (FORTH, M1-M42)

3 Presentation Outline
Three main topics/tasks
◦ Repair (T2.1)
◦ Provenance (T3.1)
◦ Access control, privacy, DRM (T3.2)
Outline
◦ Summary and objectives
◦ Introduction and motivation
◦ Existing work and research plan
◦ Innovation
◦ Interactions within the project

4 PART I: Repairs
WP2: Quality Assessment and Context
◦ T2.1: Data Quality Assessment and Repair (FUB, M1-M42)
Objective of our work
◦ Study methodologies for repairing invalidities with minimal effect on the data
Extra:
◦ Apply the same methodologies to updates

5 Repairs: Introduction
Validity rules to guarantee:
◦ Special semantics (e.g., acyclic subsumptions)
◦ Application-specific rules or requirements (e.g., functional properties)
Validity: important dimension of quality
◦ Violated at design time
◦ Violated during updates or other changes
◦ Violated when validity rules change
Solution: Repair
◦ Given an invalid graph, produce a valid one that is as close as possible to the original

6 Repairing Process
[Diagram: Invalid graph → Repair → Valid graph]
Main Challenges:
1) Several potential repairs
2) Must find the “closest” ones
Major Questions:
1) How to determine potential repairs?
2) How is “distance” measured?
Assumptions:
1) RDF/S graphs
2) Rules expressed in DED form

7 Example
Validity Rules:
◦ Properties should have a unique domain and range
◦ Subject/object of a property instance should be correctly classified per the property’s domain/range
Graph:
A rdf:type rdfs:Class
B rdf:type rdfs:Class
C rdf:type rdfs:Class
P rdf:type rdf:Property
P rdfs:range B
P rdfs:domain A
P rdfs:domain C
x P y
x rdf:type A
y rdf:type B
[Figure: graph drawing of classes A, B, C, property P, and the property instance x P y]

8 Example (Resolution #1)
Validity Rules:
◦ Properties should have a unique domain and range
◦ Subject/object of a property instance should be correctly classified per the property’s domain/range
Graph:
A rdf:type rdfs:Class
B rdf:type rdfs:Class
C rdf:type rdfs:Class
P rdf:type rdf:Property
P rdfs:range B
P rdfs:domain A
P rdfs:domain C
x P y
x rdf:type A
y rdf:type B
Problem: two domains for the same property
Solution: delete one of the domains
[Figure: graph drawing of classes A, B, C, property P, and the property instance x P y]

9 Example (Resolution #2)
Validity Rules:
◦ Properties should have a unique domain and range
◦ Subject/object of a property instance should be correctly classified per the property’s domain/range
Solution #1 (keeps P rdfs:domain A):
A rdf:type rdfs:Class
B rdf:type rdfs:Class
C rdf:type rdfs:Class
P rdf:type rdf:Property
P rdfs:range B
P rdfs:domain A
x P y
x rdf:type A
y rdf:type B
Solution #2 (keeps P rdfs:domain C):
A rdf:type rdfs:Class
B rdf:type rdfs:Class
C rdf:type rdfs:Class
P rdf:type rdf:Property
P rdfs:range B
P rdfs:domain C
x P y
x rdf:type A
y rdf:type B
Problem (in Solution #2): incorrect classification, since x is an instance of A but the domain of P is C
Solution: change domain OR make x instance of C OR delete property instance [by the rule syntax]
[Figure: graph drawings of the two candidate solutions]
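The two validity rules of the example can be checked mechanically. The following is a minimal sketch, assuming the graph is held as a plain Python set of (subject, predicate, object) tuples; the function names `multiple_domains` and `misclassified_subjects` are hypothetical and not taken from the presentation, and only the domain half of each rule is covered (the range case is symmetric).

```python
RDF_TYPE, DOMAIN, RANGE = "rdf:type", "rdfs:domain", "rdfs:range"

# The invalid graph from the example, as (subject, predicate, object) triples.
GRAPH = {
    ("A", RDF_TYPE, "rdfs:Class"),
    ("B", RDF_TYPE, "rdfs:Class"),
    ("C", RDF_TYPE, "rdfs:Class"),
    ("P", RDF_TYPE, "rdf:Property"),
    ("P", RANGE, "B"),
    ("P", DOMAIN, "A"),
    ("P", DOMAIN, "C"),
    ("x", "P", "y"),
    ("x", RDF_TYPE, "A"),
    ("y", RDF_TYPE, "B"),
}


def multiple_domains(graph):
    """Return {property: sorted domains} for properties with more than one domain."""
    domains = {}
    for s, p, o in graph:
        if p == DOMAIN:
            domains.setdefault(s, set()).add(o)
    return {prop: sorted(ds) for prop, ds in domains.items() if len(ds) > 1}


def misclassified_subjects(graph):
    """Return (subject, property, domain) where the subject is not typed by the domain."""
    found = []
    for prop, p, dom in graph:
        if p != DOMAIN:
            continue
        for s, q, _ in graph:
            if q == prop and (s, RDF_TYPE, dom) not in graph:
                found.append((s, prop, dom))
    return sorted(found)


# Candidate resolutions of the "two domains" violation.
solution_1 = GRAPH - {("P", DOMAIN, "C")}  # keeps domain A: valid
solution_2 = GRAPH - {("P", DOMAIN, "A")}  # keeps domain C: x is now misclassified
```

Under these assumptions, `solution_1` passes both checks, while `solution_2` still reports `("x", "P", "C")` as a misclassification, which is the situation Resolution #2 addresses.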

10 Potential Repairs: Challenges
Easy to determine how to resolve a single violated rule, but …
◦ Several violations
◦ Several repairing options per violation
Resolution interdependencies:
◦ Repairing one violation in a certain way may cause another violation
◦ Repairing one violation in a certain way may repair multiple violations
Need for an exhaustive, rule-based search to determine all potential repairs
◦ Tree-based search (recursive)
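The recursive, tree-based search can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: violations are detected by a hand-written function instead of being derived from DED rules, only the domain-related rules of the earlier example are encoded, and all names (`violations`, `potential_repairs`) are hypothetical. Each branch of the tree picks one resolution option for the first violation found, applies it, and recurses until no violation remains.

```python
DOMAIN, TYPE = "rdfs:domain", "rdf:type"


def violations(graph):
    """Return a list of (description, options).

    Each option is a (to_remove, to_add) pair of triple sets that resolves
    the violation it belongs to.
    """
    doms = {}
    for s, p, o in graph:
        if p == DOMAIN:
            doms.setdefault(s, set()).add(o)
    out = []
    for prop, ds in sorted(doms.items()):
        if len(ds) > 1:
            # Unique-domain rule: delete any one of the domains.
            opts = [({(prop, DOMAIN, d)}, set()) for d in sorted(ds)]
            out.append(("multiple domains", opts))
        for d in sorted(ds):
            for s, q, o in sorted(graph):
                if q == prop and (s, TYPE, d) not in graph:
                    out.append(("misclassified", [
                        ({(prop, DOMAIN, d)}, set()),  # drop the domain
                        (set(), {(s, TYPE, d)}),       # classify the subject
                        ({(s, q, o)}, set()),          # delete the property instance
                    ]))
    return out


def potential_repairs(graph, delta=frozenset()):
    """Explore every resolution option recursively; return the set of repair deltas."""
    graph = frozenset(graph)
    found = violations(graph)
    if not found:
        return {delta}  # leaf of the search tree: the graph is valid
    _, options = found[0]
    repairs = set()
    for remove, add in options:
        next_graph = (graph - remove) | add
        step = frozenset({("-", t) for t in remove} | {("+", t) for t in add})
        repairs |= potential_repairs(next_graph, delta | step)
    return repairs


GRAPH = {("P", DOMAIN, "A"), ("P", DOMAIN, "C"), ("x", "P", "y"), ("x", TYPE, "A")}
repairs = potential_repairs(GRAPH)
```

For this small graph the search yields four potential repairs, one of which (deleting both domains) contains another (deleting only `P rdfs:domain C`). Ranking such candidates is left to the preference step. Termination is not guaranteed for arbitrary rule sets in this sketch; it terminates here because no option introduces a new violation.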

11 Selecting a Potential Repair
Which repair should be returned?
◦ We want the repaired KB to be as close as possible to the original
User-defined notion of “distance”
◦ Specifications for selecting “preferred repairs”
◦ Based on user-defined preferences
Preferred repair depends on the context and application, for example:
◦ Under complete knowledge, prefer removals
◦ In an open setting, prefer additions

12 Determining Preferences
Provide specifications to determine the preferred repair
◦ Important features of a potential repair (e.g., additions, schema changes)
◦ Comparing the values of important features (e.g., minimize, “around”)
◦ Combine features (preferences) (e.g., prioritize, pareto-preference)
Flight analogy
◦ Minimize number of stops
◦ Minimize cost
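The difference between the “prioritize” and “pareto-preference” combinators can be illustrated with the flight analogy. The sketch below is an assumption-laden illustration: the flight data is invented, both features are minimized, and the function names `pareto_best` and `prioritized_best` are hypothetical rather than part of any preference library.

```python
from operator import itemgetter


def pareto_best(items, features):
    """Keep the items that no other item dominates (lower is better on every feature)."""
    def dominates(a, b):
        fa = [f(a) for f in features]
        fb = [f(b) for f in features]
        # a dominates b if it is at least as good everywhere and differs somewhere.
        return all(x <= y for x, y in zip(fa, fb)) and fa != fb

    return [b for b in items if not any(dominates(a, b) for a in items)]


def prioritized_best(items, features):
    """Lexicographic preference: minimize the first feature, break ties with the next."""
    best = min(tuple(f(i) for f in features) for i in items)
    return [i for i in items if tuple(f(i) for f in features) == best]


# Invented example data for the flight analogy.
flights = [
    {"id": "F1", "stops": 0, "cost": 400},
    {"id": "F2", "stops": 1, "cost": 250},
    {"id": "F3", "stops": 1, "cost": 300},
    {"id": "F4", "stops": 2, "cost": 250},
]
stops = itemgetter("stops")
cost = itemgetter("cost")

pareto_ids = [f["id"] for f in pareto_best(flights, [stops, cost])]
stops_first_ids = [f["id"] for f in prioritized_best(flights, [stops, cost])]
cost_first_ids = [f["id"] for f in prioritized_best(flights, [cost, stops])]
```

The Pareto preference keeps both F1 and F2 because neither is better on both features, whereas a prioritized preference returns a single winner that depends on which feature comes first. The same combinators would apply to repairs if features such as “number of additions” or “number of schema changes” were used instead of stops and cost.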

13 Repairs: Summary
Framework for repairing invalidities in RDF/S graphs
◦ Potential repairs determined using syntactical manipulations over the validity rules
◦ Preferred repairs determined using formal preferences
Research plan:
◦ Formal description of a repair framework
◦ Develop, optimize, experiment with, and study the repair algorithm

14 Innovation
Existing approaches:
◦ In-built preferences
◦ In-built validity rules
Our proposal is:
◦ Flexible: preferences can be set at run-time
◦ Adaptable: different rules and preferences
◦ Intuitive: easy-to-define preferences
◦ Very general: different repair policies from the literature can be expressed in our framework
◦ Easy to implement: we can use off-the-shelf implementations for preference evaluation

15 Interactions within PD
Only within WP2 and Task 2.1
Related deliverable: D2.2 (M18)

16 An Extra: Updates
Apply an update on an RDF/S graph, in the presence of validity rules
◦ Originally a part of WP1 – not any more
Similar ideas as with repair
◦ Apply the update (in a naïve manner)
◦ Repair the result, taking into account what the update was
Principles
◦ Success (update must be applied)
◦ Validity (result must be valid)
◦ Minimal change (minimal “distance” - preferences)

17 PART II: Provenance
WP3: Provenance and Access Policies
◦ T3.1: Provenance Management (FORTH, M1-M36)
Objectives of our work
◦ Provenance for RDF and RDFS (inference)
◦ Provenance for SPARQL query and update
◦ Efficient storage schemes for provenance

18 Provenance: Introduction
Provenance: information on the origin of data
◦ From where and how the piece of data was obtained
Allows/supports:
◦ Assessment of data trustworthiness and quality
◦ Reproducibility of experiments
◦ Justification of decisions (e.g., argumentation)
◦ Access control, privacy, DRM, trust
Focus on RDF/S
◦ Inspired by DB provenance and annotation models

19 Main Challenge
[Figure: RDF triples (labelled a to h) over classes A, B, C, connected by RDFS inference rules; which provenance does an inferred triple get?]
Provenance tag = colour
◦ A subset I of URIs distinguished from the set of class and property names or types

20 Annotation Models
Annotation models:
◦ Annotation computation coupled with a particular application and a particular assignment of source data annotations
R1:
X  Y  Annot
a  b  t
c  d  t
R2:
Y  Z
b  e
R1 ⋈ R2:
X  Y  Z
a  b  e
Legend: t = trusted, f = untrusted
[Figure: annotation values t/f attached to the relations; when a source annotation changes, re-evaluate the query]

21 Abstract Annotation Models
Abstract annotation models:
◦ Abstract provenance tokens and operators are substituted by appropriate concrete tokens for a particular application and assignment
R1:
X  Y  Annot
a  b  c1
c  d  c2
R2:
Y  Z  Annot
b  e  c3
R1 ⋈ R2:
X  Y  Z  Annot
a  b  e  c1 x c3
Substituting concrete tokens:
◦ c1 = t, c3 = t: t Λ t = t
◦ c1 = t, c3 = f: t Λ f = f
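The benefit of the abstract model is that the query is evaluated once, producing symbolic annotations, and concrete values are substituted afterwards. A minimal sketch of that idea follows, assuming relations are lists of (row, token) pairs and the abstract product is kept as a nested tuple; the names `join` and `evaluate` and the tuple encoding are illustrative assumptions, not the presentation's formalism.

```python
# Source relations annotated with abstract tokens, as in the tables above.
R1 = [(("a", "b"), "c1"), (("c", "d"), "c2")]
R2 = [(("b", "e"), "c3")]


def join(r1, r2):
    """Natural join on Y; each result row carries the symbolic product of its input tokens."""
    return [
        ((x, y, z), ("x", t1, t2))
        for (x, y), t1 in r1
        for (y2, z), t2 in r2
        if y == y2
    ]


def evaluate(expr, assignment, times):
    """Replace abstract tokens by concrete values and the product by a concrete operator."""
    if isinstance(expr, tuple):
        _, left, right = expr
        return times(evaluate(left, assignment, times),
                     evaluate(right, assignment, times))
    return assignment[expr]


def both_trusted(a, b):
    """Concrete operator for the trust application: logical AND."""
    return a and b


joined = join(R1, R2)          # computed once
_, annotation = joined[0]      # symbolic annotation of row (a, b, e)

all_trusted = evaluate(annotation, {"c1": True, "c2": True, "c3": True}, both_trusted)
r2_untrusted = evaluate(annotation, {"c1": True, "c2": True, "c3": False}, both_trusted)
```

Changing the trust assignment only requires calling `evaluate` again on the stored annotation; the join itself is not recomputed, in contrast to the concrete annotation model of the previous slide.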

22 Inference and Provenance
Colours: a subset I of URIs distinguished from the set of class and property names or types
To model colour propagation through inference rules we define an operation ‘+’ to compose colours
(I, ‘+’) is a commutative semigroup
◦ c1 + c2 = c2 + c1 (commutativity)
◦ c1 + (c2 + c3) = (c1 + c2) + c3 (associativity)
◦ c + c = c (idempotence)

23 Inference and Provenance
Why provenance
◦ Which explicit triples contributed to get an implicit one?
◦ Ignore how (i.e., which rules were used)
◦ A single operator ‘+’ for all inference rules
Ignore how many times a triple was used
◦ ‘+’: idempotent [c + c = c]
Ignore the order of application
◦ ‘+’: commutative [c1 + c2 = c2 + c1]
◦ ‘+’: associative [c1 + (c2 + c3) = (c1 + c2) + c3]
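One simple structure that satisfies all three laws is set union. The sketch below assumes, purely for illustration, that a colour is represented as a frozenset of source URIs and that ‘+’ is union; the slides only require a commutative, associative, idempotent operator and do not prescribe this encoding. The URIs and the function names are hypothetical, and only a single application of the rdfs:subClassOf transitivity rule is shown.

```python
SUBCLASS = "rdfs:subClassOf"


def colour(*sources):
    """A colour, modelled here as a set of source URIs (an assumption of this sketch)."""
    return frozenset(sources)


def plus(c1, c2):
    """The '+' operator: union is commutative, associative and idempotent."""
    return c1 | c2


# Explicit triples with their colours (hypothetical sources).
explicit = {
    ("A", SUBCLASS, "B"): colour("ex:source1"),
    ("B", SUBCLASS, "C"): colour("ex:source2"),
}


def infer_subclass_transitivity(triples):
    """Apply the transitivity rule once; return coloured quadruples (s, p, o, colour)."""
    inferred = set()
    for (a, p1, b), c1 in triples.items():
        for (b2, p2, c), c2 in triples.items():
            if p1 == p2 == SUBCLASS and b == b2:
                inferred.add((a, SUBCLASS, c, plus(c1, c2)))
    return inferred


inferred = infer_subclass_transitivity(explicit)
```

The inferred triple `A rdfs:subClassOf C` receives the colour composed from both contributing triples. Because the operator is idempotent and order-insensitive, the result records which explicit triples contributed, but not how often or in which order they were used.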

24 SPARQL Provenance Model
SPARQL CONSTRUCT queries generate triples in a manner similar to inference
◦ Except that it is query-dependent
Similar problems
Abstract annotation models can capture the provenance of SPARQL
◦ Only for queries that do not use the OPTIONAL operator
◦ Monotonicity no longer holds in the case of OPTIONAL

25 Work So Far (Provenance)
Provenance models for RDF/S
◦ Pediaditis, Flouris, Fundulaki, Christophides. On Explicit Provenance Management in RDF/S Graphs. TAPP-09.
◦ Flouris, Fundulaki, Pediaditis, Theoharis, Christophides. Coloring RDF Triples to Capture Provenance. ISWC-09.
Provenance models for SPARQL
◦ Theoharis, Fundulaki, Karvounarakis, Christophides. On Provenance of Queries on Linked Web Data. To appear in IEEE Internet Computing: Jan/Feb 2011 - Provenance in Web Applications.

26 Research Plans (Provenance)
◦ How provenance (more expressive)
◦ Provenance for dynamically evolving data
◦ Support OPTIONAL (in SPARQL)
◦ Efficient storage schemes for provenance
◦ Apply this work on privacy, DRM and access control

27 Innovation
Use of abstract annotation models to model provenance propagation in the Semantic Web context (RDF/S)
Current state-of-the-art is either concrete and designed for a given application, or designed for the DB context
Advantages
◦ Easy to update the KB
◦ Easy to change or experiment with different provenance propagation models
◦ Flexibility

28 Interactions within PD
Within WP3 the work is very relevant to:
◦ T3.2 (Privacy, DRM and Access Control) – FORTH
◦ T3.3 (Trust Management) – EPFL
◦ Provenance essential in the above
◦ General approach related to annotation models, tagging etc. (T3.1, T3.2, T3.3)
WP2 deals with provenance and annotation models as well (KIT)
◦ Unsure about exact interaction and/or overlaps
Related deliverable: D3.2 (M36)

29 PART III: Access Control
WP3: Provenance and Access Policies
◦ T3.2: Privacy, DRM and Access Control (FORTH, M1-M42)
Objectives of our work
◦ Access control specification language
◦ Access control enforcement mechanism
◦ Data model agnostic access control framework
◦ Privacy-aware framework (purpose)
◦ Effects of provenance and access control on DRM

30 Access Control: Introduction
Crucial for sensitive content
◦ Refers to the ability to permit or deny the use of a particular resource by a particular entity
◦ Ensures the selective exposure of information to different classes of users
Focus
◦ For RDF graphs
◦ Fine-grained (triple-level)

31 Permissions
Used to tag triples (+/- tags)
◦ Allow access for the user in question (+)
◦ Deny access for the user in question (-)
SPARQL query to identify which triples to tag
R = include/exclude (x, p, y) where TP, C
where
◦ (x, p, y) is a SPARQL triple pattern
◦ TP is a conjunction of triple patterns and
◦ C is a conjunction of constraints

32 Access Control Policies
Some triples are untagged (missing permissions)
Default Semantics
◦ Will access be granted by default?
◦ Access granted: +, access denied: -
Some triples are multiply tagged with different tags (ambiguous permissions)
Conflict Resolution
◦ Will access be granted to multiply tagged triples?
◦ Access granted: +, access denied: -

33 Accessible Triples
[Figure: the set of all triples (in the graph), with the subsets selected by “include” permissions and by “exclude” permissions]
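How tags, default semantics, and conflict resolution combine into the set of accessible triples can be sketched as follows. This is a simplified illustration: permissions are modelled as Python predicates over a triple rather than as SPARQL-based include/exclude rules, and the function name `accessible`, the example data, and the `ex:` identifiers are all hypothetical.

```python
def accessible(graph, include, exclude, default="-", conflict="-"):
    """Return the triples readable under the policy.

    include / exclude: lists of predicates over a triple, standing in for the
    "include" and "exclude" permissions.
    default:  decision for untagged triples ("+" grants, "-" denies).
    conflict: decision for triples tagged both "+" and "-".
    """
    result = set()
    for triple in graph:
        tags = set()
        if any(rule(triple) for rule in include):
            tags.add("+")
        if any(rule(triple) for rule in exclude):
            tags.add("-")
        if not tags:
            decision = default        # missing permission
        elif len(tags) == 2:
            decision = conflict       # ambiguous permission
        else:
            decision = next(iter(tags))
        if decision == "+":
            result.add(triple)
    return result


# Invented example data.
graph = {
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:salary", "5000"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:bob", "ex:salary", "4000"),
    ("ex:bob", "ex:phone", "123"),
}
include = [lambda t: t[1] == "ex:name", lambda t: t[0] == "ex:alice"]
exclude = [lambda t: t[1] == "ex:salary"]

closed_policy = accessible(graph, include, exclude, default="-", conflict="-")
open_policy = accessible(graph, include, exclude, default="+", conflict="+")
```

In this example Alice's salary is tagged both “+” and “-”, and Bob's phone number is untagged, so the two policies differ exactly on those two triples. Bob's salary is denied under both policies because it carries only a “-” tag.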

34 Our Work (Access Control)
Access control framework for RDF graphs
◦ Flouris, Fundulaki, Michou, Antoniou. Controlling Access to RDF Graphs. FIS-10.
At the moment
◦ RDF only (RDFS inference not supported)
◦ Focus on read-only operations (no update or write permissions can be set)
◦ Implementation exists (repository-independent and portable across platforms)
◦ Specific access permissions allowed (+/-)

35 Research Plans
Abstract access control models
◦ More expressive tags (e.g., permission levels)
Access control for RDFS
◦ Requires more expressive policies
◦ Support inference
◦ Support propagation in access control
◦ “Safe” access control policies
Access control for dynamic data
Access control for edits (not only read)
Data model agnostic access control
◦ Extension/generalization of existing work

36 Privacy
Privacy: controlling access to private data
◦ Access control, enhanced with the notion of purpose
◦ Ensure the selective exposure of sensitive data to different requesters and requester purposes
Apply our access control model for privacy
◦ Privacy-aware framework
◦ Enhance our model with the notion of purpose

37 Digital Rights Management
DRM
◦ Specification of digital rights
◦ Controlling access/usage based on digital rights
◦ Prevent/detect abuse of data (violation of digital rights)
Importance
◦ Users must know what they can (legally) do with the data
Effects of provenance and access control on DRM
◦ DRM is closely related to provenance and access control models
◦ Identify peculiarities of DRM, extend the approach

38 Innovation
Current state-of-the-art is concrete and designed for a given application
General approaches apply to general annotation models in the DB context
Generality
◦ Data model agnostic
◦ Abstract access control policies
◦ Policies support propagation and inference
Advantages
◦ Easy to update the KB
◦ Easy to change or experiment with different access control policies
◦ Flexible

39 Interactions within PD
Within WP3 the work is very relevant to:
◦ T3.1 (Provenance Management) – FORTH
◦ T3.3 (Trust Management) – EPFL
◦ Provenance essential for access control
◦ Trust management related to access control
◦ General approach related to annotation models (applicable in T3.1, T3.2, T3.3)
Related deliverables: D3.1 (M24), D3.3 (M42)

40 Conclusion
Research activities of FORTH within PlanetData
◦ T2.1: Repair (plus update)
◦ T3.1: Provenance
◦ T3.2: Privacy, DRM, and Access Control
Innovative work, focusing on generality, flexibility, adaptability
Work already started
◦ Basic ideas and preliminary results established
◦ Some publications as well
◦ Research plans established (subject to change)
Interactions mainly within the respective WPs
◦ WP3: KIT, EPFL – interactions to be defined/discussed

