1 Incremental Validation of XML Databases Yannis Papakonstantinou Victor Vianu Computer Science & Eng, UCSD.


1 Incremental Validation of XML Databases. Yannis Papakonstantinou, Victor Vianu. Computer Science & Eng, UCSD

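2 Incremental Validation of XML Databases: an XML database with n nodes receives updates and must be revalidated after each one. Against a Document Type Definition (DTD), incremental validation costs O(log n) per update; against XML Schema / the XQuery type system, O(log^2 n).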

3 XML As Labeled Ordered Trees. [Figure: example document tree with root cars, children used and new, car nodes beneath them, and year and model leaves holding values such as 92, Civic, 96, Acura, Maxima, 03.]

4 Document Type Definitions (DTDs): Abstraction & Example. root: cars; cars → used new; used → car*; new → car*; car → (year | ε) model. [Figure: the example cars tree, including a node marked dummy.]
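The DTD check on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the rules read cars → used new, used → car*, new → car*, car → (year | ε) model, treats year and model as leaves with empty content, and encodes each content model as a Python regex over space-terminated child labels. The names `DTD`, `satisfies`, and `is_valid` are made up for the example.

```python
import re

# Assumed reading of the slide's rules; each content model is a regex
# over the node's child labels, each label followed by one space.
ROOT = "cars"
DTD = {
    "cars": r"used new ",
    "used": r"(car )*",
    "new": r"(car )*",
    "car": r"(year )?model ",
    "year": r"",   # assumption: leaves carry only text, no child elements
    "model": r"",
}

def satisfies(node):
    """A node is a (label, children) pair; check it and its whole subtree."""
    label, children = node
    if label not in DTD:
        return False
    word = "".join(child[0] + " " for child in children)
    if re.fullmatch(DTD[label], word) is None:
        return False
    return all(satisfies(child) for child in children)

def is_valid(tree):
    """The tree is valid if its root has the root label and every node obeys its rule."""
    return tree[0] == ROOT and satisfies(tree)
```

Validating from scratch like this visits every node, which is the cost that incremental validation tries to avoid after each update.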

5 Tree Satisfying DTD, General Case. [Figure: a node with children 1, 2, …, k whose label sequence is checked against the regular expression r of its rule.]

6 XML Schemas/XQuery Types as Specialized DTDs. root: cars^T; cars^T → used^T new^T; used^T → car^U*; new^T → car^N*; car^U → year^T model^T; car^N → (year^T | ε) model^T. Each label maps to a set of types: car: {car^U, car^N}; cars: {cars^T}; used: {used^T}; … [Figure: the cars tree with each node annotated by its type.]
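A bottom-up typing pass makes the specialized DTD concrete. The sketch below is an illustration under assumptions, not the paper's algorithm: type names are flattened to plain strings (`carU` for car^U, and so on), leaves are given empty content models, and the search over child type choices is brute force, which is exponential in the worst case and only meant for small examples.

```python
import re
from itertools import product

# Assumed transcription of the slide's rules: type -> (label, content regex
# over space-terminated child types).
RULES = {
    "carsT": ("cars", r"usedT newT "),
    "usedT": ("used", r"(carU )*"),
    "newT": ("new", r"(carN )*"),
    "carU": ("car", r"yearT modelT "),
    "carN": ("car", r"(yearT )?modelT "),
    "yearT": ("year", r""),
    "modelT": ("model", r""),
}
ROOT_TYPE = "carsT"

def types(node):
    """Return every type that the (label, children) node can be assigned."""
    label, children = node
    child_types = [types(child) for child in children]
    result = set()
    for t, (t_label, regex) in RULES.items():
        if t_label != label:
            continue
        # Try each way of picking one candidate type per child.
        for choice in product(*child_types):
            word = "".join(c + " " for c in choice)
            if re.fullmatch(regex, word):
                result.add(t)
                break
    return result

def is_valid(tree):
    return ROOT_TYPE in types(tree)
```

With these rules a car holding both year and model can be typed as either `carU` or `carN`, while a car without year can only be `carN` and is therefore rejected under used.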

7 Tree Automata Specialized DTDs. [Figure: the cars tree annotated bottom-up with candidate types: year^T and model^T at the leaves, {car^U, car^N} at the car nodes, then used^T, new^T, and cars^T at the root.]

8 Incremental Validation Problem Statement. For each valid tree T, maintain an auxiliary structure A(T) so that, given a series of update commands, one can (1) efficiently decide whether the updated tree is valid and (2) efficiently update A(T) and T.

9 Types of Updates: Node Renaming u(v, ). [Figure: node v, one of the children 1, …, k of a node with rule r, receives a new label.]

10 Types of Updates: Deletion d(v). [Figure: node v, child i of a node with rule r, is removed, leaving children 1, …, i-1, i+1, …, k.]

11 Types of Updates: Insertion insert_after(v_{i-1}, ). [Figure: a new node is inserted as child i, immediately after v_{i-1}.]

12 Validating a Renaming u(i, ) on a Regular String of N: Take One. Validation of one update takes O(1) given precomputed Pre and Post: only Pre(i-1) and Post(i+1) are consulted. However, u(i, ) then requires recomputation of Pre(i), Pre(i+1), … and of Post(i), Post(i-1), … [Figure: the automaton run from q_0 over positions 1, …, i-1 and the run into q_F over positions i+1, …, n.]
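A small sketch of this first attempt follows. It assumes definitions the slide does not spell out: Pre(i) is the set of states reachable from q_0 after reading the first i symbols, and Post(i) is the set of states from which the remaining suffix leads to a final state. Positions are 0-based in the code, and the automaton representation (`delta` as a dict from (state, symbol) to a set of states) is chosen for the example.

```python
def step(states, symbol, delta):
    """All states reachable from `states` by reading one symbol."""
    return {r for q in states for r in delta.get((q, symbol), ())}

def precompute(word, delta, start, finals, all_states):
    n = len(word)
    # pre[i]: states reachable from start after reading word[:i]
    pre = [{start}]
    for a in word:
        pre.append(step(pre[-1], a, delta))
    # post[i]: states from which reading word[i:] can end in a final state
    post = [set() for _ in range(n + 1)]
    post[n] = set(finals)
    for i in range(n - 1, -1, -1):
        post[i] = {q for q in all_states
                   if step({q}, word[i], delta) & post[i + 1]}
    return pre, post

def rename_is_valid(i, symbol, pre, post, delta):
    """Would the word still be accepted if position i were relabeled?"""
    # Independent of the word length: one step between two stored sets.
    return bool(step(pre[i], symbol, delta) & post[i + 1])
```

The check itself never looks at the rest of the word, but once a renaming is applied every `pre` entry to its right and every `post` entry to its left may be stale, so refreshing them can cost time linear in the string length.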

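13 Transition Relation Definition. For positions $i \le j$ of the string, $T_{i,j} = \{(q, q') \mid \text{reading positions } i, \dots, j \text{ can take the automaton from state } q \text{ to state } q'\}$. For any $m$ with $i \le m < j$, $T_{i,j} = T_{i,m} \circ T_{m+1,j}$.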

14 Transition Relation Trees. [Figure: a balanced binary tree over string positions 1 to 8: leaves T_{1,1}, …, T_{8,8}; above them T_{1,2}, T_{3,4}, T_{5,6}, T_{7,8}; then T_{1,4} and T_{5,8}; root T_{1,8}.]

15 Maintenance of the Structure and Validation in O(log n). The renaming u(6, ) changes only the relations on the path from leaf 6 to the root: T_{6,6}, T_{5,6}, T_{5,8}, T_{1,8}. If (q_0, q_F) ∈ T_{1,8} then the string is valid. [Figure: the transition relation tree over positions 1 to 8.]
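The path recomputation can be sketched with an array-backed balanced binary tree. This is an illustrative layout, not the structure from the talk: relations are Python sets of state pairs, unused padding leaves hold `None` as an identity relation, positions are 0-based, and the class and function names are invented for the example.

```python
def leaf_relation(symbol, delta):
    """Transition relation of a single symbol: all (q, r) with r in delta(q, symbol)."""
    return {(q, r) for (q, a), targets in delta.items()
            if a == symbol for r in targets}

def compose(left, right):
    """Relational composition; None stands for the identity on padding leaves."""
    if left is None:
        return right
    if right is None:
        return left
    return {(p, r) for (p, q) in left for (q2, r) in right if q == q2}

class TransitionRelationTree:
    def __init__(self, word, delta):
        self.delta = delta
        self.size = 1
        while self.size < len(word):
            self.size *= 2
        # node[1] is the root; node[size + i] is the leaf for position i.
        self.node = [None] * (2 * self.size)
        for i, a in enumerate(word):
            self.node[self.size + i] = leaf_relation(a, delta)
        for k in range(self.size - 1, 0, -1):
            self.node[k] = compose(self.node[2 * k], self.node[2 * k + 1])

    def rename(self, i, symbol):
        """Relabel position i and recompute only the leaf-to-root path."""
        k = self.size + i
        self.node[k] = leaf_relation(symbol, self.delta)
        k //= 2
        while k >= 1:
            self.node[k] = compose(self.node[2 * k], self.node[2 * k + 1])
            k //= 2

    def is_valid(self, start, finals):
        root = self.node[1]
        if root is None:  # empty word
            return start in finals
        return any((start, f) in root for f in finals)
```

Each `rename` touches about log2(n) nodes, and every composition depends only on the number of automaton states, which is why the per-update cost is logarithmic in the string length for a fixed automaton.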

16 Transition B-Trees (2-3 Trees) for O(log n) Insertions and Deletions. [Figure: a 2-3 tree with leaves T_1, T_2, T_3, T_5, T_6, T_7, T_9 under internal nodes T_a, T_b, T_c, where T_a = T_1 ∘ T_2.] If (q_0, q_F) ∈ T_a ∘ T_b ∘ T_c then valid.

17 Transition B-Trees (2-3 Trees) for O(log n) Insertions and Deletions. [Figure: the same tree after position 8 is inserted: leaves T_1, T_2, T_3, T_5, T_6, T_7, T_8, T_9 under T_a, T_b, T_c.]

18 Transition B-Trees (2-3 Trees) for O(log n) Insertions and Deletions. [Figure: position 4 is inserted among the leaves T_3, T_5, T_6.]

19 Transition B-Trees (2-3 Trees) for O(log n) Insertions and Deletions. [Figure: the affected node is split into one node over T_3, T_4 and one over T_5, T_6.]

20 Transition B-Trees (2-3 Trees) for O(log n) Insertions and Deletions. [Figure: after the split, internal nodes T_a, T_d, T_e, T_c sit under two new nodes T_f and T_g.]

21 Auxiliary Structures for Incremental DTD Validation. [Figure: a tree in which each node's children 1, …, k carry an auxiliary structure for the parent's regular expression r; the renaming u(v_i, ) affects the structure at the parent of v_i.]

22 Specialized DTD Incremental Validation: Take One. [Figure: sibling sequences a_1, …, a_k and b_1, …, b_k; the renaming u(v_i, ) changes the set types(v_i), shown before and after the update.]

23 Inefficient for Deep Trees: Apply Divide-and-Conquer in Vertical Direction. Turn the specialized DTD into an NFA that validates a vertical line. Fuse the vertical and horizontal directions using a binary tree, and split the work in both.

24 Tree Satisfying Specialized DTD Transformed into Binary Tree Accepted by Tree Automaton. [Figure: an unranked tree with nodes a, b, c, d, e, f, g, h, i, j, k and its binary encoding, in which # marks each missing child.]
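One common way to obtain such a binary tree is the first-child / next-sibling encoding, sketched below. That this is the encoding used on the slide is an assumption, although it is consistent with the figure's eleven labeled nodes and twelve # markers. The function names and the tuple representation are invented for the example.

```python
def encode(forest):
    """Encode an ordered forest, a list of (label, children) pairs, as a binary tree.

    The left child of a node encodes its first child, the right child encodes
    its next sibling, and "#" marks a missing child.
    """
    if not forest:
        return "#"
    (label, children), siblings = forest[0], forest[1:]
    return (label, encode(children), encode(siblings))

def count_hashes(binary):
    """Number of "#" leaves in an encoded tree."""
    if binary == "#":
        return 1
    _, left, right = binary
    return count_hashes(left) + count_hashes(right)
```

For a tree with n labeled nodes this encoding always yields n + 1 markers, since every labeled node has exactly two child slots and n - 1 of the 2n slots hold labeled nodes.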

25 Designate Lines in Binary Trees Size( ) > 2 Size( ) Size( ) > 4 Size( )

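26 Example Line Structure. [Figure: the binary tree over nodes a to k, with # for missing children, shown next to its partition into lines.]

27 From Tree Automaton to Validating Lines with NFA. [Figure: the lines of the example tree over nodes a to k.]

28 [Figure: the same lines, with nodes b, d, and f annotated by transition relations: (b, T_c), (d, T_j), (f, T_g).]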

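29 Incremental Validation of the Line Structure in O(log^2 |T|). Example: insert m after k. Number of updated lines < 1 + log |T|; cost of each line update O(log |T|). [Figure: the line structure with nodes a to k and the new node m, with annotations (b, T_c), (d, T_j), (f, T_g).]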
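The O(log^2 |T|) bound in the title follows by multiplying the two quantities stated on the slide. Spelled out, with no assumptions beyond those two figures:

```latex
\[
\underbrace{\bigl(1 + \log |T|\bigr)}_{\text{bound on updated lines}}
\cdot
\underbrace{O\bigl(\log |T|\bigr)}_{\text{cost per line update}}
\;=\; O\bigl(\log^{2} |T|\bigr)
\]
```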

30 Validating Insertions and Deletions: the Non-Line-Preserving Case Insertion

31 Key Complexity Results. Given m updates on a tree of size n: incrementally validate DTDs in O(m log n); taking into account the alphabet and the size d of the largest regular expression, O(m · |alphabet| · d^2 · log d · log n), with a data structure of size O(d^2 n). Specialized DTDs in O(m log^2 n); taking into account the set of types, O(m · |types|^2 · d^2 · (log d + log |types|) · log^2 n), with a data structure of size O(|types|^2 · d^2 · log^2 n). Lower complexity for 1-unambiguous regular expressions.

32 Ongoing and Future Work (with Andrey Balmin). Incorporate transition relation trees in a B-tree structure. Exploit locality: in an experimental evaluation on a set of 65 DTDs, in 96% of type definitions an update may only affect transition relations of length < 4, so the common case is much more efficient than the worst case; detect the property and employ algorithms that do not build transition relation trees in such cases. Optimization over multiple updates. More complex updates and edit operations.

