XPath and Beyond: Formal Foundations. Jean-Yves Vion-Dury (Xerox Research Centre Europe / INRIA) and Pierre Genevès (INRIA).



05/2004, XPath and Beyond: Formal foundations

Roadmap: Part 1
- XPath: a cornerstone of the XML architecture
- Theory and engineering
- Some key problems
- The trends around XPath theoretical studies
- A logic-based approach
- Mathematical characterization
- Why use the Coq proof assistant?

XPath: a cornerstone of the XML architecture
- Expresses node selection and/or structural properties
- Currently used in XSLT, XQuery, XML Schema, XLink, XPointer, …
- XPath is elegant, compact, effective and powerful
- Claim: it will be increasingly used and studied in the future, for:
  - Indexing large document bases
  - Checking integrity constraints / global structural properties
  - Linking increasing document volumes

Theory and Engineering in Computer Science
- Some decades ago, theoretical studies prepared the ground for engineering:
  - Relational algebra enabled a huge market around data storage and access
  - Information theory prepared digital processing (networks, image and sound processing, compression algorithms, …)
  - Linguistics, logic and formal mathematics prepared programming languages
- A strange situation today around documents:
  - W3C standardization activities produce specifications, yet many problems remain open
  - Theoreticians try to capture these problems and understand the underlying issues long after the specifications have been published
- This induces new difficulties and requires different approaches:
  - To deal with low-level issues, close to implementations
  - To face the complexity of systems

Some Key Problems around XPath
- Formal semantics definition
  - Formal model of documents (trees, streams, graphs, strings, …?)
  - Precise, useful and simple denotational/operational semantics
- Type checking
  - Constraints on document structure (tree grammars, graph grammars, pattern matching)
  - Valid/invalid path expressions with respect to a particular schema
- Rewriting path expressions
  - To customize compilation/interpretation
  - Normalization
- Optimization
  - Reducing the complexity of suitable models
  - Simplifying expressions while preserving semantics
- Equivalence p1 ≈ p2: gives a fundamental understanding of the language
- Containment p1 ≤ p2: gives an even more fundamental view
  - Key inference: if p is a key for a schema S, then every p' such that p' ≤ p is a key too

Linking Key Problems around XPath
- Invalid expression and containment: p ≤ ⊥ (void)
- Rewriting and equivalence: (p1 | p2)/p -> p1/p | p2/p and (p1 | p2)/p ≈ p1/p | p2/p
- Optimization and containment: if p1 ≤ p2 then (p1 | p2)/p -> p2/p
- Equivalence and containment: p1 ≈ p2 iff p1 ≤ p2 and p2 ≤ p1
- Containment and type checking:
  - Structural constraints can be captured in XPath expressions
  - Structural constraint satisfaction can thus be checked
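The links on this slide become explicit once containment is read semantically. The block below is a sketch under that assumption: the notation S[[p]]_x, standing for the set of nodes selected by p from context node x of a tree t, is introduced here for illustration and is not taken from the slides.

```latex
% Assumed notation: \mathcal{S}[\![p]\!]_x = set of nodes selected by p from context node x.
\begin{align*}
p_1 \le p_2 &\iff \forall t,\ \forall x \in t:\
  \mathcal{S}[\![p_1]\!]_x \subseteq \mathcal{S}[\![p_2]\!]_x \\
p_1 \approx p_2 &\iff p_1 \le p_2 \ \wedge\ p_2 \le p_1 \\
p_1 \le p_2 &\implies
  \mathcal{S}[\![p_1 \mid p_2]\!]_x
  = \mathcal{S}[\![p_1]\!]_x \cup \mathcal{S}[\![p_2]\!]_x
  = \mathcal{S}[\![p_2]\!]_x \\
&\implies (p_1 \mid p_2)/p \approx p_2/p
\end{align*}
```

The last step relies on composition being evaluated node by node: if two paths select the same nodes from every context, appending /p to both yields the same result.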

The problem of containment (expression)

The problem of typed containment (expression)

The Trends around XPath Theoretical Studies
Topics surveyed: formal semantics, rewriting, optimization, containment & equivalence, typed containment/optimization.
- Child and descendant axes: [Geneves,vion04], [Flesca03], [Miklau,Suciu], [Neven,Schwentick], [Deutsch,Tannen], [Geneves,vion04], [Deutsch,Tannen], [Kwong,Gertz]
- All axes: [Olteanu01], [vion,layaida03], [Geneves,vion], [vion,layaida03]?
- All axes + position and count: [Wadler99], [vion,layaida03], [Gottlob,koch03], [Geneves,Rose04], [Geneves,vion]

A Logic-Based Approach
- A set of axioms to reason on the term comparison ≤, as opposed to model-based approaches
- A partial equivalence relation to minimize the axiom set; it is fully congruent (e.g. p1 ≤ p2 and p1 == p3 implies p3 ≤ p2)
- Theorems for simplifying containment proofs, e.g. reflexivity, transitivity
- Drawback: the syntactic level is more combinatorial than model-based approaches
- Advantage: the syntactic level is more extensible, provided the previous point is addressed
- Gives more insight into the underlying issues due to language peculiarities

XPath: abstract syntax ([Wadler99], [Olteanu01])

Denotational semantics ([Wadler99], [Olteanu01]) (three slides)

Basic axioms

Union & Intersection

Qualifiers

The equivalence relation ([Olteanu01])

Using equivalence in proofs

Mathematical Characterization
- Soundness of the equivalence
- Soundness of rules (e.g.)
- Completeness of rule system (e.g.)

Why Use the Coq Proof Assistant?
- Coq is a proof assistant based on the Calculus of Inductive Constructions (http://coq.inria.fr):
  - Higher-order logic
  - Constructive logic
  - Typed
- To address the complexity problem related to proofs
- To benefit from the help of the proof assistant in case analysis
- To maintain the whole mathematical architecture throughout exploratory work
- To work in a rigorous framework
- To produce rock-solid and readable results
- The challenge:
  - Requires powerful data-structure modelling capabilities
  - Learning Coq is an additional difficulty
  - Developing a proof in Coq is more demanding
- But Coq is quite mature now (v8.0, 25 years of research) and very expressive

Roadmap: Part 2
- Modelling XPath using inductive constructions
- Formal semantics and interpretations
  - Interpreter based on the denotational semantics
  - A relational semantics for XPath
- Modelling the containment relation
- Using the proof system: containment checking
- Current work on characterization
- Methodology and expected outcomes

Modelling XPath using inductive constructions
- Paths are defined inductively:
  - “void” (⊥) and “top” (⊤) are atoms
  - |, /, ∩, … are binary constructors
  - [] involves qualifiers:
    - _true, _false are atoms
    - “and”, “or”, “not” are constructors
    - “leq” (≤): a cross-inductive definition
- Functional notation, example: a/b[c] is written slash a (qualif b c)
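The inductive definition above can be mirrored outside Coq with ordinary algebraic data types. The following Python sketch is only an illustration: the class names (Name, Slash, Qualif, Leq, …) are hypothetical and are not the authors' Coq identifiers, and a path is allowed to stand directly in qualifier position so that the slide's example `slash a (qualif b c)` can be written as is. The remaining constructors (_false, "or", intersection) would follow the same pattern.

```python
from dataclasses import dataclass


class Path:
    """Base class for path terms."""


class Qualifier:
    """Base class for qualifier terms."""


@dataclass(frozen=True)
class Void(Path):          # the "void" atom
    pass


@dataclass(frozen=True)
class Top(Path):           # the "top" atom
    pass


@dataclass(frozen=True)
class Name(Path):          # an element name such as a, b or c (assumed leaf form)
    label: str


@dataclass(frozen=True)
class Slash(Path):         # p1/p2
    left: Path
    right: Path


@dataclass(frozen=True)
class Union(Path):         # p1 | p2
    left: Path
    right: Path


@dataclass(frozen=True)
class Qualif(Path):        # p[q]; q is a Qualifier or, as on the slide, a Path
    path: Path
    qual: object


@dataclass(frozen=True)
class QTrue(Qualifier):    # the _true atom
    pass


@dataclass(frozen=True)
class Not(Qualifier):      # not q
    arg: Qualifier


@dataclass(frozen=True)
class And(Qualifier):      # q1 and q2
    left: Qualifier
    right: Qualifier


@dataclass(frozen=True)
class Leq(Qualifier):      # "leq": a qualifier built from two paths (cross-induction)
    left: Path
    right: Path


def show(term) -> str:
    """Render a term back to an XPath-like concrete syntax."""
    if isinstance(term, Void):
        return "void"
    if isinstance(term, Top):
        return "top"
    if isinstance(term, Name):
        return term.label
    if isinstance(term, Slash):
        return f"{show(term.left)}/{show(term.right)}"
    if isinstance(term, Union):
        return f"({show(term.left)} | {show(term.right)})"
    if isinstance(term, Qualif):
        return f"{show(term.path)}[{show(term.qual)}]"
    if isinstance(term, QTrue):
        return "true"
    if isinstance(term, Not):
        return f"not({show(term.arg)})"
    if isinstance(term, And):
        return f"{show(term.left)} and {show(term.right)}"
    if isinstance(term, Leq):
        return f"{show(term.left)} <= {show(term.right)}"
    raise TypeError(f"unknown term: {term!r}")


# The slide's example: a/b[c] written as slash a (qualif b c).
expr = Slash(Name("a"), Qualif(Name("b"), Name("c")))
print(show(expr))  # a/b[c]
```

Because Leq takes paths while Qualif takes a qualifier, the two families refer to each other, which is the cross-induction the slide mentions.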

Interpreter based on the denotational semantics
- Evaluates a path p from the context node x of the tree t
- The evaluation of a path returns a set of nodes
- Cross-recursive (mutually recursive) and terminating functions
- The evaluation of a qualifier returns a boolean
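A minimal sketch of such an interpreter, written in Python rather than Coq, may help fix ideas. Everything here is an assumption made for illustration: terms are encoded as tagged tuples, the tree is a pair of dictionaries (node labels and child lists), and only the child and descendant axes are covered. It is not the authors' interpreter.

```python
def descendants(children, x):
    """All proper descendants of node x, in document order."""
    out = []
    for c in children.get(x, []):
        out.append(c)
        out.extend(descendants(children, c))
    return out


def eval_path(p, x, tree):
    """Set of nodes selected by path p from context node x of the tree."""
    label, children = tree
    kind = p[0]
    if kind == "void":
        return set()
    if kind == "child":      # ("child", name), with "*" as wildcard
        return {c for c in children.get(x, []) if p[1] in ("*", label[c])}
    if kind == "desc":       # ("desc", name), the descendant axis
        return {d for d in descendants(children, x) if p[1] in ("*", label[d])}
    if kind == "slash":      # p1/p2: evaluate p2 from every node reached by p1
        return {y for z in eval_path(p[1], x, tree)
                  for y in eval_path(p[2], z, tree)}
    if kind == "union":      # p1 | p2
        return eval_path(p[1], x, tree) | eval_path(p[2], x, tree)
    if kind == "qualif":     # p[q]: keep the nodes where the qualifier holds
        return {y for y in eval_path(p[1], x, tree) if eval_qual(p[2], y, tree)}
    raise ValueError(f"unknown path: {p!r}")


def eval_qual(q, x, tree):
    """Truth value of qualifier q at node x."""
    kind = q[0]
    if kind == "true":
        return True
    if kind == "false":
        return False
    if kind == "and":
        return eval_qual(q[1], x, tree) and eval_qual(q[2], x, tree)
    if kind == "or":
        return eval_qual(q[1], x, tree) or eval_qual(q[2], x, tree)
    if kind == "not":
        return not eval_qual(q[1], x, tree)
    if kind == "exists":     # a path used as a qualifier: true if it selects a node
        return bool(eval_path(q[1], x, tree))
    raise ValueError(f"unknown qualifier: {q!r}")


# Sample document: r(a(b(c)), a(b)), with node ids 0..5.
label = {0: "r", 1: "a", 2: "a", 3: "b", 4: "c", 5: "b"}
children = {0: [1, 2], 1: [3], 3: [4], 2: [5]}
tree = (label, children)

# a/b[c] evaluated from the root.
a_b_c = ("slash", ("child", "a"),
         ("qualif", ("child", "b"), ("exists", ("child", "c"))))
print(eval_path(a_b_c, 0, tree))  # {3}
```

Both functions recurse only on strict subterms, so they terminate on any finite term and finite tree.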

Need for a logic-based semantics
- The classical semantics describes an interpreter that computes node sets
- This computational view leads to needless complexity in proofs
- Is there another way to capture XPath semantics?

A Relational Semantics for XPath
- An interpretation of paths in first-order logic
- A path is translated into a dyadic formula: R_p holds for all pairs (x, y) of nodes such that y is reached from x through the path p
- Advantages:
  - Interpretations of paths and qualifiers are unified
  - Direct translation into Coq
- (Mathematical semantics from the paper)
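The formulas from the paper are not reproduced in this transcript. As a hedged illustration only, one natural first-order reading of the constructors mentioned earlier is sketched below; the monadic predicate Q_q for qualifiers is a name introduced here, and the clauses are not claimed to be the authors' exact definitions.

```latex
% Illustrative clauses; Q_q is an assumed name for the qualifier predicate.
\begin{align*}
R_{\bot}(x,y)            &\equiv \mathit{false} \\
R_{p_1/p_2}(x,y)         &\equiv \exists z.\ R_{p_1}(x,z) \wedge R_{p_2}(z,y) \\
R_{p_1 \mid p_2}(x,y)    &\equiv R_{p_1}(x,y) \vee R_{p_2}(x,y) \\
R_{p_1 \cap p_2}(x,y)    &\equiv R_{p_1}(x,y) \wedge R_{p_2}(x,y) \\
R_{p[q]}(x,y)            &\equiv R_{p}(x,y) \wedge Q_{q}(y) \\
Q_{q_1\ \mathrm{and}\ q_2}(x) &\equiv Q_{q_1}(x) \wedge Q_{q_2}(x) \\
Q_{\mathrm{not}\ q}(x)   &\equiv \neg\, Q_{q}(x) \\
Q_{p}(x)                 &\equiv \exists y.\ R_{p}(x,y)
\end{align*}
```

Under this reading, containment p1 ≤ p2 becomes the implication: for all x and y, R_{p1}(x, y) implies R_{p2}(x, y), so no node sets need to be computed.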

Modelling the containment relation (1)
- A binary logical relation “Ple”
- Gathers all containment rules in a single inductive construction
- Suited to Coq's built-in tactics (constructor, inversion)

Modelling the containment relation (2)
- The containment relation ≤ for paths:
  - Is inductive
  - Is defined using its dual relation ⇒ for qualifiers (“Qipl”)

Using the proof system: containment checking
- We have modelled:
  - XPath terms
  - Their interpretation
  - The containment relation (which gathers our containment axioms)
- We can now check containment facts with the proof engine
- Demo of a tactical that proves the fact: ./*/b ≤ ./descendant::b
- Underlying goal: extend the tactical to automate the checking of all containment facts
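The Coq demo itself is not part of this transcript. As a complement, the fact ./*/b ≤ ./descendant::b can be tested by brute force on small trees. The Python sketch below swaps in a different technique from the proof-based one described above: it enumerates every tree up to a bound and searches for a counterexample, so it can refute a containment but never prove one. The tuple encoding and all function names are assumptions made for this example.

```python
from itertools import product


def trees(depth, labels=("a", "b"), max_children=2):
    """All ordered labelled trees (label, children) of height at most depth."""
    if depth == 0:
        return [(l, ()) for l in labels]
    smaller = trees(depth - 1, labels, max_children)
    return [(l, kids)
            for l in labels
            for k in range(max_children + 1)
            for kids in product(smaller, repeat=k)]


def subtree(t, pos):
    """Subtree at position pos, a tuple of child indices from the root."""
    for i in pos:
        t = t[1][i]
    return t


def children(t, pos):
    return [pos + (i,) for i in range(len(subtree(t, pos)[1]))]


def descendants(t, pos):
    out = []
    for c in children(t, pos):
        out.append(c)
        out.extend(descendants(t, c))
    return out


def matches(t, pos, name):
    return name == "*" or subtree(t, pos)[0] == name


def select(p, pos, t):
    """Positions selected by path p from the context position pos."""
    kind = p[0]
    if kind == "child":
        return {c for c in children(t, pos) if matches(t, c, p[1])}
    if kind == "desc":
        return {d for d in descendants(t, pos) if matches(t, d, p[1])}
    if kind == "slash":
        return {y for z in select(p[1], pos, t) for y in select(p[2], z, t)}
    raise ValueError(f"unknown path: {p!r}")


def find_counterexample(p1, p2, depth=2):
    """Return (tree, context) where p1 selects a node that p2 misses, else None."""
    for t in trees(depth):
        for x in [()] + descendants(t, ()):
            if not select(p1, x, t) <= select(p2, x, t):
                return t, x
    return None


star_b = ("slash", ("child", "*"), ("child", "b"))   # ./*/b
desc_b = ("desc", "b")                               # ./descendant::b

print(find_counterexample(star_b, desc_b))  # None: no counterexample within the bound
print(find_counterexample(desc_b, star_b) is not None)  # True: the converse fails
```

Nodes are identified by their position rather than by their subtree, so two identical siblings are still counted as distinct nodes when the selected sets are compared.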

Proving Properties: Characterization
- Proving the equivalence of the semantics (done)
- Current work: proving the validity of our axiomatization:
  - Soundness
  - Completeness
- Finding relevant induction schemes:
  - Mutual induction (duality between paths and qualifiers)
  - Induction on a measure of term complexity
- Finding generic and modular Coq tactics (to reduce combinatorial issues)

Methodology and Possible Outcomes (diagram)
Labels appearing in the diagram: inductive relation Ple; sound / not sound; complete / incomplete; fix wrong rules; add missing rules; intrinsically incomplete; algorithm / incomplete algorithm; extend the fragment; decidable / undecidable; why?

Conclusion
- We proposed a logic-based framework for static analysis of XPath
- Modelling with inductive constructions (XPath terms and interpretations, containment relation)
- Preliminary result: a simpler semantics
- Ongoing work on characterization

Backup slides: Applications

Some Applications (1)
- Optimization of XPath queries:
  - Detecting contradictions (p ≤ void)
  - Eliminating redundancies
- Example: //a[*/b/c and descendant::b] can be rewritten to /descendant::a[*/b/c], because */b/c => descendant::b
- An optimization not currently achieved at runtime by XPath engines (Xalan C++)

Some Applications (2)
- Static analysis of XPath host languages
  - Example: XSLT
    - Checking XSLT stylesheets
    - Optimization of XSLT stylesheets
- Extending XPath expressive power with an inclusion constraint: p[p1 ≤ p2]
  - Integrity constraint checking
  - ⇒ Transformation languages strongly based on XPath