Exchange Intensional XML Data. Tova Milo (INRIA & Tel-Aviv U.); Serge Abiteboul (INRIA); Bernd Amann (Cedric-CNAM).

Presentation transcript:

1 Exchange Intensional XML Data. Tova Milo (INRIA & Tel-Aviv U.); Serge Abiteboul (INRIA); Bernd Amann (Cedric-CNAM); Omar Benjelloun (INRIA); Fred Dang Ngoc (INRIA)

2 Outline
• Introduction
• The Model and The Problem
• Exchanging Intensional Data
• Safe Rewriting
• Possible Rewriting
• Implementation
• Conclusion and Related Work

3 Introduction
• What are intensional documents? XML documents where:
  - some parts of the document are defined explicitly;
  - some parts are defined by programs (e.g. Web services) that generate data.
• Materialisation: the process of evaluating some of the programs included in an XML document and replacing them by their results.

4 Introduction (cont'd)
• The goals of the paper:
  - Study the new issues raised by the exchange of intensional XML documents between applications.
  - Decide which data should be materialised before it is sent and which should not.

5 Introduction (cont'd)
[Figure: data exchange scenario for intensional documents. A sender and a receiver, each with its own capabilities, ACL and cost, exchange documents containing function calls according to a data exchange schema.]

6 Outline
• Introduction
• The Model and The Problem
• Exchanging Intensional Data
• Safe Rewriting
• Possible Rewriting
• Implementation
• Conclusion and Related Work

7 The Model and The Problem
• Simple intensional XML model
  - Intensional documents
  - Simple schema
  - Instance of a schema
  - About rewritings
• A Richer Data Model
  - Function patterns
  - Restricted service invocations

8 The Model and The Problem: Simple intensional XML
• Model intensional XML documents as labelled trees consisting of two types of nodes:
  - Data nodes: nodes with a label in L ∪ D.
  - Function nodes: nodes with a label in F; they correspond to "service calls".
• The children subtrees of a function node are the function parameters. When the function is called:
  - these subtrees are passed to it;
  - the return value replaces the function node in the document.
• Assume the existence of some disjoint domains:
  - N: domain of nodes
  - L: domain of labels
  - F: domain of function names
  - D: domain of data values

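9 The Model and The Problem: Simple intensional XML (cont'd)
• An example of an intensional XML document:
  newspaper
    title: "The Sun"
    date: "04/10/2002"
    Get_Temp (function node)
      city: "Paris"
    TimeOut (function node)
      "Exhibits"
• The figure also shows the data node temp: "16 ºC", the kind of result that replaces Get_Temp once it is invoked.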
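The tree model and the materialisation of a function node can be sketched in Python. This is only a minimal sketch: the `Node` class, the `materialise` helper and the stub service `fake_get_temp` are hypothetical names introduced for illustration, and the stub stands in for a real Web service call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    label: str                                   # label, function name or data value
    children: List["Node"] = field(default_factory=list)
    is_function: bool = False                    # True for service-call nodes


Service = Callable[[List[Node]], List[Node]]


def materialise(node: Node, services: Dict[str, Service]) -> List[Node]:
    """Return the forest that replaces `node` once the known calls are evaluated."""
    if node.is_function and node.label in services:
        # The children subtrees are the parameters; the result replaces the node.
        return services[node.label](node.children)
    new_children: List[Node] = []
    for child in node.children:
        new_children.extend(materialise(child, services))
    return [Node(node.label, new_children, node.is_function)]


def fake_get_temp(params: List[Node]) -> List[Node]:
    # Hypothetical stand-in for the remote Get_Temp service.
    return [Node("temp", [Node("16 ºC")])]


doc = Node("newspaper", [
    Node("title", [Node("The Sun")]),
    Node("date", [Node("04/10/2002")]),
    Node("Get_Temp", [Node("city", [Node("Paris")])], is_function=True),
    Node("TimeOut", [Node("Exhibits")], is_function=True),
])

(result,) = materialise(doc, {"Get_Temp": fake_get_temp})
child_labels = [c.label for c in result.children]
```

Only Get_Temp is materialised here; TimeOut has no registered service in this sketch, so it stays in the document as an intensional part.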

10 The Model and The Problem: Simple intensional XML (cont'd)
• Simple schema: a document schema s is an expression (L, F, τ) where
  - L is a finite set of labels;
  - F is a finite set of function names;
  - τ is a function that maps:
    - each label name l ∈ L to a regular expression over L ∪ F or to the keyword "data";
    - each function name f ∈ F to a pair of expressions: τin(f), the input type of f, and τout(f), the output type of f.

11 The Model and The Problem: Simple intensional XML (cont'd)
• An example of a schema.
  Data:
  - τ(newspaper) = title.date.(Get_Temp|temp).(TimeOut|exhibit)
  - τ(title) = data
  - τ(date) = data
  - τ(temp) = data
  - τ(city) = data
  - τ(exhibit) = data
  Functions:
  - τin(Get_Temp) = city
  - τout(Get_Temp) = temp
  - τin(TimeOut) = data
  - τout(TimeOut) = (exhibit|performance)
  - τin(Get_Date) = title
  - τout(Get_Date) = date

12 The Model and The Problem: Simple intensional XML (cont'd)
• Instances of a schema: an intensional document t is an instance of a schema s = (L, F, τ) if:
  - for each data node n ∈ t with label l ∈ L, the labels of n's children form a word in lang(τ(l));
  - the same holds for each function node n with label f ∈ F, with respect to lang(τin(f)).
• lang(τ(l)) denotes the regular language defined by τ(l).
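The instance check can be sketched with Python's `re` module. This is a minimal sketch under several assumptions that are not in the slides: trees are nested `(label, children)` tuples, each child label is encoded as the label followed by a space, `to_regex` and `is_instance` are hypothetical helper names, and an element typed "data" is taken to hold exactly one data-value leaf.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")


def to_regex(expr):
    """Turn e.g. 'title.date.(Get_Temp|temp)' into a regex over space-terminated labels."""
    spaced = IDENT.sub(lambda m: "(?:" + m.group(0) + " )", expr)
    return re.compile(spaced.replace(".", ""))


def is_instance(tree, tau, tau_in):
    label, children = tree
    known = lambda name: name in tau or name in tau_in
    if not known(label):
        return not children                      # a data value must be a leaf
    content = tau[label] if label in tau else tau_in[label]
    if content == "data":
        # Assumption: "data" means exactly one data-value leaf.
        return len(children) == 1 and not known(children[0][0]) and not children[0][1]
    if not all(known(child[0]) for child in children):
        return False
    word = "".join(child[0] + " " for child in children)
    if to_regex(content).fullmatch(word) is None:
        return False
    return all(is_instance(child, tau, tau_in) for child in children)


# The example schema from the slides (labels and input types only).
TAU = {
    "newspaper": "title.date.(Get_Temp|temp).(TimeOut|exhibit)",
    "title": "data", "date": "data", "temp": "data",
    "city": "data", "exhibit": "data",
}
TAU_IN = {"Get_Temp": "city", "TimeOut": "data", "Get_Date": "title"}

DOC = ("newspaper", [
    ("title", [("The Sun", [])]),
    ("date", [("04/10/2002", [])]),
    ("Get_Temp", [("city", [("Paris", [])])]),
    ("TimeOut", [("Exhibits", [])]),
])
# Same root, but title and date are swapped, so the children's word is not in lang(τ(newspaper)).
BAD = ("newspaper", [("date", [("04/10/2002", [])]), ("title", [("The Sun", [])])])
```

With these definitions `is_instance(DOC, TAU, TAU_IN)` holds while `is_instance(BAD, TAU, TAU_IN)` does not.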

13 The Model and The Problem: Simple intensional XML (cont'd)
• About rewritings. Let t, t' be trees.
  - If t' is obtained from t by selecting a function node v in t with some label f and replacing it by an arbitrary output instance of f,
  - then we say that $t \xrightarrow{v} t'$.

14 The Model and The Problem: Simple intensional XML (cont'd)
• About rewritings (cont'd)
  - If $t \xrightarrow{v_1} t_1 \xrightarrow{v_2} t_2 \cdots \xrightarrow{v_n} t_n$, then we say that $t \xrightarrow{*} t_n$ (t rewrites into $t_n$).
  - The nodes $v_1, \ldots, v_n$ are called a rewriting sequence.
  - The set of all trees t' such that $t \xrightarrow{*} t'$ is denoted ext(t).

15 The Model and The Problem: Simple intensional XML (cont'd)
• About rewritings (cont'd). Let t be a tree and s be a schema.
  1. If ext(t) contains some instance of s, then we say that t possibly rewrites into s.
  2. If either t is already an instance of s, or there exists some node v in t such that all trees t' where $t \xrightarrow{v} t'$ safely rewrite into s, then we say that t safely rewrites into s.

16 The Model and The Problem: Simple intensional XML (cont'd)
• Safe rewriting of a schema. Let:
  - s be a schema;
  - r be a distinguished label called the root label.
• If all the instances t of s with root label r rewrite safely into instances of s', then we say that s safely rewrites into s'.
• Problems:

17 The Model and The Problem: Simple intensional XML (cont'd)
[Figure: the data exchange scenario again. A sender and a receiver, each with its own capabilities, ACL and cost, exchange documents containing function calls according to a data exchange schema.]

18 The Model and The Problem: A Richer Data Model
• Function patterns: a function belongs to the pattern if its name satisfies the boolean predicate and its signature is the same as the required one.
• Example:
  - τname(Forecast) = UDDIF ∧ InACL
  - τin(Forecast) = city
  - τout(Forecast) = temp
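Pattern membership can be sketched as a predicate on the name plus an equality test on the signature. This is a minimal sketch: `belongs_to`, `UDDI_REGISTRY` and `ACL` are hypothetical names, and reading UDDIF and InACL as "listed in a UDDI registry" and "allowed by an access control list" is an assumption made only for the example.

```python
# Hypothetical registries standing in for the UDDIF and InACL predicates.
UDDI_REGISTRY = {"Get_Temp", "TimeOut"}
ACL = {"Get_Temp"}


def belongs_to(function, pattern):
    """A function matches a pattern if its name satisfies the predicate
    and its input/output types equal the required ones."""
    return (pattern["name"](function["name"])
            and function["in"] == pattern["in"]
            and function["out"] == pattern["out"])


FORECAST = {
    "name": lambda n: n in UDDI_REGISTRY and n in ACL,   # UDDIF and InACL
    "in": "city",
    "out": "temp",
}

get_temp = {"name": "Get_Temp", "in": "city", "out": "temp"}
time_out = {"name": "TimeOut", "in": "data", "out": "(exhibit|performance)"}
```

Here `get_temp` belongs to the Forecast pattern, while `time_out` fails both the name predicate and the signature test.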

19 The Model and The Problem: A Richer Data Model (cont'd)
• Restricted service invocations
  - We assumed so far that all the functions appearing in a document may be invoked in a rewriting, in order to match a given schema.
  - This is not always the case, for reasons such as security, cost, access rights, etc.
  - Thus, function names/patterns in the schema can be partitioned into two disjoint groups: invocable and non-invocable ones.
  - A legal rewriting is then one that invokes only invocable functions.

20 Outline
• Introduction
• The Model and The Problem
• Exchanging Intensional Data
• Safe Rewriting
• Possible Rewriting
• Schema Rewriting
• Implementation
• Conclusion and Related Work

21 Exchanging Intensional Data
• Rewriting process
  - Safe rewriting
  - Possible rewriting
  - Mixed approach
• Restriction

22 Exchanging Intensional Data: rewriting process
• Safe rewriting: check whether t safely rewrites to s; if so, find a rewriting sequence.
  - Rewriting sequence: a sequence of functions that need to be invoked to transform t into the required structure.
  - Preferred rewriting sequence: the shortest/cheapest one.

23 Exchanging Intensional Data: rewriting process (cont'd)
• Possible rewriting: if a safe rewriting does not exist, check whether t may at least rewrite to s.
  - If it is acceptable to do so (the sender accepts that the rewriting may fail), try to find a successful rewriting sequence, if one exists.
  - Preferred rewriting sequence: the one with the least cost.

24 Exchanging Intensional Data: rewriting process (cont'd)
• Mixed approach: one could first invoke some function calls and then attempt, from there, to find safe rewritings.

25 Exchanging Intensional Data: rewriting process (cont'd)
• k-depth rewriting sequence. For a rewriting sequence $t \xrightarrow{v_1} t_1 \cdots \xrightarrow{v_n} t_n$:
  - If the node $v_j$ was returned by the invocation of the function $v_i$, then we say that function node $v_j$ depends on function node $v_i$.
  - If the dependency graph among the nodes contains no paths of length greater than k, then we say that the rewriting sequence is of depth k.
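The depth of a rewriting sequence can be computed directly from the dependency relation. This is a minimal sketch with hypothetical names (`rewriting_depth`, `returned_by`). It assumes that path length is counted in nodes, so that a sequence invoking only functions already present in the original document has depth 1; the slides do not state the convention, and this one is chosen to match the 1-depth automaton used later.

```python
def rewriting_depth(returned_by):
    """`returned_by` maps each invoked function node to the invoked node whose
    call returned it, or to None if the node was in the original document."""
    def chain_length(v):
        # Walk up the dependency chain, counting the nodes on it.
        length = 0
        while v is not None:
            length += 1
            v = returned_by[v]
        return length

    return max((chain_length(v) for v in returned_by), default=0)


def is_k_depth(returned_by, k):
    return rewriting_depth(returned_by) <= k


# v2 was returned by the call v1; v1 and v3 come from the original document.
sequence = {"v1": None, "v2": "v1", "v3": None}
```

Because each node is returned by at most one invocation, the dependency graph is a forest, and following parent links is enough to find the longest path.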

26 Exchanging Intensional Data: Restriction
• Restriction: "Consider only k-depth left-to-right rewritings."

27 Outline
• Introduction
• The Model and The Problem
• Exchanging Intensional Data
• Safe Rewriting
• Possible Rewriting
• Schema Rewriting
• Implementation
• Conclusion and Related Work

28 Safe Rewriting (Dec 16, 2004)
• Algorithm for k-depth left-to-right safe rewriting
• Safe rewriting algorithm:
  - Given: a word w; the output types $R_{f_1}, \ldots, R_{f_n}$ of the available functions; a target regular language R.
  - Purpose: to test whether w can be safely rewritten into a word in R and, if so, to find a safe rewriting sequence.

29 Safe Rewriting (cont'd)
• Note: for illustration purposes we use the newspaper document.
  - w = title.date.Get_Temp.TimeOut: the word formed by the children's labels.
  - R = title.date.temp.(TimeOut|exhibit*): the target language; we look for a safe rewriting of the above word into a word in R.
• The algorithm, main idea: in regular language terms, the intersection of the language generated by the k-depth invocations with the complement of the target language R should be empty.

30 Safe Rewriting (cont'd)
1. Build the finite state automata for the following regular languages:
  (1) $A_w$, accepting the word w = title.date.Get_Temp.TimeOut:
      q0 --title--> q1 --date--> q2 --Get_Temp--> q3 --TimeOut--> q4
  (2) Automata $A_{f_i}$, each accepting the regular language $R_{f_i}$ (the output types of the available functions).

31 Safe Rewriting (cont'd)
  (3) Build an automaton $\bar{A}$ accepting the complement of the regular language R. The automaton should be deterministic and complete.
[Figure: the complement automaton $\bar{A}$ for schema τ'(newspaper) = title.date.temp.(TimeOut|exhibit*), with states p0 to p6 and transitions labelled title, date, temp, TimeOut, exhibit and *.]
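Building a deterministic, complete automaton for the complement can be sketched as follows: add a sink state for every missing transition, then swap accepting and non-accepting states. This is a minimal sketch; the state names and transitions below are chosen for the example (the exact layout of the figure is not recoverable from the transcript), and `complement`, `accepts` and the sink name are hypothetical.

```python
def complement(states, alphabet, delta, start, accepting):
    """Complete a DFA with a sink state, then flip its accepting states."""
    sink = "sink"                      # assumed not to clash with existing state names
    all_states = set(states) | {sink}
    full = dict(delta)
    for q in all_states:
        for a in alphabet:
            full.setdefault((q, a), sink)
    return all_states, full, start, all_states - set(accepting)


def accepts(delta, start, accepting, word):
    q = start
    for a in word:
        q = delta[(q, a)]
    return q in accepting


ALPHABET = ["title", "date", "temp", "TimeOut", "exhibit", "Get_Temp", "performance"]
# A DFA for R = title.date.temp.(TimeOut|exhibit*).
DELTA = {
    ("p0", "title"): "p1",
    ("p1", "date"): "p2",
    ("p2", "temp"): "p3",
    ("p3", "TimeOut"): "p4",
    ("p3", "exhibit"): "p5",
    ("p5", "exhibit"): "p5",
}
STATES = ["p0", "p1", "p2", "p3", "p4", "p5"]
ACCEPTING = ["p3", "p4", "p5"]

_, C_DELTA, C_START, C_ACCEPTING = complement(STATES, ALPHABET, DELTA, "p0", ACCEPTING)
```

For example, `accepts(C_DELTA, C_START, C_ACCEPTING, ["title", "date", "Get_Temp", "TimeOut"])` holds because that word is not in R, whereas the word title.date.temp.TimeOut is rejected by the complement.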

32 Safe Rewriting (cont'd)
2. Construct the automaton $A_w^k$ that represents all the words that can be generated by such a k-depth rewriting process (by iteration).
[Figure: the 1-depth automaton $A_w^1$ for the word w = title.date.Get_Temp.TimeOut, with states q0 to q7 and ε-transitions. At each fork node, one option represents the choice of invoking the function (leading to its output: temp for Get_Temp, exhibit or performance for TimeOut) and the other represents the choice of not invoking it.]

33 Safe Rewriting (cont'd)
3. Construct the cartesian product automaton $A_\times = A_w^k \times \bar{A}$ of $A_w^k$ and the complement automaton $\bar{A}$.
[Figure 6: the product automaton, with states [q, p] starting at [q0, p0] and transitions labelled title, date, Get_Temp, temp, TimeOut, exhibit, performance and ε.]

34 Safe Rewriting (cont'd)
4. Mark nodes in $A_\times$.
[Figure 6: the product automaton with its marked nodes.]

35 Safe Rewriting (cont'd)
• Try to obtain a safe rewriting: "A safe rewriting exists iff the initial state is not marked."
  - Follow a non-marked path (corresponding to w) starting from the initial state of $A_\times$ to a state [q, p] where q is an accepting state of $A_w^k$.
  - Non-marked fork options on the path determine the rewriting choices (i.e. which functions to call).
  - When a function is invoked, we continue the path with the new rewritten word rather than the word w.

36 Safe Rewriting (cont'd)
• To minimize the rewriting cost, choose a path with minimal number/cost of function invocations.
• EXIT (end of the algorithm)
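The safe rewriting test can be illustrated on words with a brute-force search. This is not the automata-marking algorithm of the slides but a simple game search swapped in for illustration: the sender chooses, left to right, whether to keep or invoke each function, and an invocation counts as safe only if every possible answer still leads to success. The sketch assumes each function has a finite set of possible output words (as in the schema example, where τout(TimeOut) = (exhibit|performance)); `safely_rewrites` and the regex encoding of R over space-terminated labels are hypothetical.

```python
import re


def safely_rewrites(word, outputs, target, k):
    """word: tuple of labels; outputs: function name -> list of output words;
    target: compiled regex over space-terminated labels; k: maximum depth."""
    def play(done, rest):
        if not rest:
            return target.fullmatch("".join(a + " " for a in done)) is not None
        (label, depth), tail = rest[0], rest[1:]
        # Option 1: keep the letter as it is.
        if play(done + (label,), tail):
            return True
        # Option 2: invoke it; every possible answer must still succeed.
        if label in outputs and depth < k:
            return all(
                play(done, tuple((b, depth + 1) for b in out) + tail)
                for out in outputs[label]
            )
        return False

    return play((), tuple((a, 0) for a in word))


W = ("title", "date", "Get_Temp", "TimeOut")
OUTPUTS = {"Get_Temp": [("temp",)], "TimeOut": [("exhibit",), ("performance",)]}
# R  = title.date.temp.(TimeOut|exhibit*)
R = re.compile(r"title date temp (?:TimeOut |(?:exhibit )*)")
# R' = title.date.temp.exhibit*
R_PRIME = re.compile(r"title date temp (?:exhibit )*")

safe_for_r = safely_rewrites(W, OUTPUTS, R, 1)
safe_for_r_prime = safely_rewrites(W, OUTPUTS, R_PRIME, 1)
```

For R the search succeeds by invoking Get_Temp and keeping TimeOut. For R' it fails, because TimeOut would have to be invoked and might return performance.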

37 Safe Rewriting (cont'd)
[Figure 7: the complement automaton $\bar{A}$ for schema τ'(newspaper) = title.date.temp.exhibit*.]

38 Safe Rewriting (cont'd)
• The cartesian product automaton $A_\times = A_w^1 \times \bar{A}$, built from $A_w^1$ and the complement automaton $\bar{A}$.
[Figure 8: the product automaton, with states [q, p] starting at [q0, p0] and transitions labelled title, date, Get_Temp, temp, TimeOut, exhibit, performance and ε.]

39 Outline
• Introduction
• The Model and The Problem
• Exchanging Intensional Data
• Safe Rewriting
• Possible Rewriting
• Implementation
• Conclusion and Related Work

40 Possible Rewriting
• The algorithm
1. Build finite state automata for the following languages:
  1.1. An automaton $A_w^k$
  1.2. An automaton A accepting the regular language R

41 Possible Rewriting (cont'd)
[Figure 10: an automaton A for schema τ''(newspaper) = title.date.temp.exhibit*, with states p0 to p4 and transitions labelled title, date, temp and exhibit.]

42 Possible Rewriting (cont'd)
2. Construct the cartesian product automaton $A_\times = A_w^k \times A$.
[Figure 11: the product automaton, with states [q, p] starting at [q0, p0] and transitions labelled title, date, temp, exhibit and ε.]

43 Possible Rewriting (cont'd)
• The cartesian product automaton for possible rewriting.
[Figure 11: the product automaton, with states [q, p] starting at [q0, p0] and transitions labelled title, date, temp, exhibit and ε.]
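Possible rewriting asks only whether some outcome of the invocations lands in R. It can be illustrated by enumerating the words reachable through k-depth left-to-right rewriting and testing each against R. This is a minimal brute-force sketch rather than the product-automaton construction of the slides; it assumes finite sets of output words per function, and `reachable_words` and `possibly_rewrites` are hypothetical names.

```python
import re


def reachable_words(word, outputs, k):
    """Yield every word obtainable from `word` by k-depth left-to-right rewriting."""
    def expand(rest):
        if not rest:
            yield ()
            return
        (label, depth), tail = rest[0], rest[1:]
        for suffix in expand(tail):
            yield (label,) + suffix                    # the letter is kept
        if label in outputs and depth < k:
            for out in outputs[label]:                 # the letter is invoked
                yield from expand(tuple((b, depth + 1) for b in out) + tail)

    yield from expand(tuple((a, 0) for a in word))


def possibly_rewrites(word, outputs, target, k):
    return any(
        target.fullmatch("".join(a + " " for a in w)) is not None
        for w in reachable_words(word, outputs, k)
    )


W = ("title", "date", "Get_Temp", "TimeOut")
OUTPUTS = {"Get_Temp": [("temp",)], "TimeOut": [("exhibit",), ("performance",)]}
# R'' = title.date.temp.exhibit*
R_SECOND = re.compile(r"title date temp (?:exhibit )*")

possible = possibly_rewrites(W, OUTPUTS, R_SECOND, 1)
```

Here a possible rewriting exists (invoke both functions and hope that TimeOut returns exhibit), even though a brute-force safety check on the same inputs would fail.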

44 Outline
• Introduction
• The Model and The Problem
• Exchanging Intensional Data
• Safe Rewriting
• Possible Rewriting
• Implementation
• Conclusion and Related Work

45 Implementation
• In the implementation, an intensional XML document is a well-formed XML document.
• To distinguish intensional parts from the rest of the document, the namespace http://www.activexml.com/ns/int is used: it is the namespace defined for function (service) calls.

46 Implementation (cont'd)
[Figure: the newspaper document]
  newspaper
    title: "The Sun"
    date: "04/10/2002"
    Get_Temp
      city: "Paris"
    TimeOut
      "Exhibits"

47 Implementation (cont'd)
[Figure: XML source of the newspaper document, with the following callouts]
• Namespace defined for function (service) calls
• Data nodes title and date
• Three attributes of the function nodes provide the information necessary to call the SOAP service:
  1. URL of the server
  2. Method name
  3. Associated namespace
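Locating the function nodes by namespace can be sketched with Python's standard `xml.etree.ElementTree`. Only the namespace URI comes from the slides. The element name `fun`, the attribute names `endpointURL`, `methodName` and `namespaceURI`, and the example.org values are assumptions made for illustration, since the XML shown on the slide is not preserved in the transcript.

```python
import xml.etree.ElementTree as ET

INT_NS = "http://www.activexml.com/ns/int"   # namespace for function (service) calls

# Hypothetical serialisation of the newspaper document.
SAMPLE = """\
<newspaper xmlns:int="http://www.activexml.com/ns/int">
  <title>The Sun</title>
  <date>04/10/2002</date>
  <int:fun endpointURL="http://example.org/soap" methodName="Get_Temp"
           namespaceURI="urn:example-weather">
    <city>Paris</city>
  </int:fun>
  <int:fun endpointURL="http://example.org/soap" methodName="TimeOut"
           namespaceURI="urn:example-timeout">Exhibits</int:fun>
</newspaper>
"""


def function_calls(xml_text):
    """Return (server URL, method name, namespace) for each intensional node."""
    root = ET.fromstring(xml_text)
    calls = []
    for elem in root.iter():
        # ElementTree writes qualified names as "{namespace}localname".
        if elem.tag.startswith("{" + INT_NS + "}"):
            calls.append((elem.get("endpointURL"),
                          elem.get("methodName"),
                          elem.get("namespaceURI")))
    return calls


calls = function_calls(SAMPLE)
method_names = [c[1] for c in calls]
```

Everything outside the `int` namespace (here title and date) is ordinary extensional data and is left untouched by this scan.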

48 Implementation (cont'd)
• Function TimeOut:
  1. URL of the server
  2. Method name
  3. Associated namespace

49 Implementation (cont'd)
• Newspaper element with structure title.date.(Forecast|temp).(TimeOut|exhibit*)

50 Implementation (cont'd)
• The role of the Schema Enforcement Module:
  1. Verify whether the call parameters conform to the WSDL_int description of the service.
  2. If not, try to rewrite them into the required structure.
  3. If step 2 fails, report an error.
• Note: similarly, before an ActiveXML service returns its answer, the Schema Enforcement Module performs the same three steps on the returned data.
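The three steps above can be sketched as a small wrapper. This is a minimal sketch, not the module's actual interface: `enforce`, `SchemaError`, and the `conforms` and `rewrite` callbacks are hypothetical, and the toy callbacks below work on words of labels rather than on real XML.

```python
class SchemaError(Exception):
    """Raised when the data neither conforms nor can be rewritten."""


def enforce(data, conforms, rewrite):
    # Step 1: verify whether the data already conforms.
    if conforms(data):
        return data
    # Step 2: otherwise try to rewrite it into the required structure.
    rewritten = rewrite(data)
    if rewritten is not None and conforms(rewritten):
        return rewritten
    # Step 3: report an error if the rewriting failed.
    raise SchemaError("data cannot be rewritten into the required structure")


# Toy callbacks: the required structure is the single-letter word "temp".
def conforms(word):
    return word == ("temp",)


def rewrite(word):
    # Stand-in for invoking Get_Temp; returns None when no rewriting is known.
    return ("temp",) if word == ("Get_Temp",) else None


try:
    enforce(("city",), conforms, rewrite)
    failed = False
except SchemaError:
    failed = True
```

The same wrapper would apply to call parameters on the way in and to returned data on the way out, with different `conforms` and `rewrite` callbacks.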

51 Outline
• Introduction
• The Model and The Problem
• Exchanging Intensional Data
• Safe Rewriting
• Possible Rewriting
• Implementation
• Conclusion and Related Work

52 Conclusion and Related Work
• XML documents with embedded calls to Web services are already present in several existing products (ActiveXML system).
• What's new? The proposed extension of XML Schema with function types is a first step towards a more precise description of XML documents embedding computation.
• Main problem: whether safe rewriting remains decidable when the k-depth restriction is removed.

