Ken C. K. Lee, Baihua Zheng, Huajing Li, Wang-Chien Lee VLDB 07 Approaching the Skyline in Z Order 1.


Outline: Introduction; Preliminaries; Skyline Processing in Z-Order; ZSearch for Skyline Query; ZUpdate for Skyline Updates; Experiment; Conclusion 2

Introduction: Finding the skyline points of a very large dataset in a high-dimensional space is an expensive operation. Most of the work in the literature targets improving the performance of skyline queries in high-dimensional spaces. 3

Preliminaries Skyline Problems and Properties 4

(Cont.) Skyline Query Processing: 1. Divide-and-Conquer Algorithm. 2. Sorting-based Algorithm. 3. Hybrid Algorithm. 5

(Cont.) Divide-and-Conquer Algorithm: D&C divides a dataset into several small partitions and computes the partial skyline of each. The complete skyline is obtained by merging all partial skylines and removing dominated data points. Sorting-based Algorithm: SFS is based on the observation that, once a dataset is presorted according to a monotone scoring function such as the sum of attributes, no point can be dominated by a point that follows it in the sorted order. SFS sequentially scans the sorted dataset and keeps a set of skyline candidates. Dominance tests in SFS are based on an exhaustive scan of the existing skyline candidates. 6
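The sorting-based scan described above can be sketched in a few lines of Python. This is a minimal illustration and not the original SFS implementation: it assumes that smaller values are better in every dimension, and the names `dominates`, `sfs_skyline` and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, ...]


def dominates(p: Sequence[int], q: Sequence[int]) -> bool:
    """True if p is no worse than q everywhere and strictly better somewhere.

    Assumption: smaller values are better in every dimension.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def sfs_skyline(points: List[Point]) -> List[Point]:
    """Presort by the sum of attributes, then scan once, keeping candidates."""
    skyline: List[Point] = []
    for p in sorted(points, key=sum):
        # Exhaustive scan over the current candidates. A point that comes
        # later in the sorted order can never dominate an earlier one.
        if not any(dominates(s, p) for s in skyline):
            skyline.append(p)
    return skyline


# Illustrative data only (not taken from the slides).
sample = [(3, 3), (1, 6), (2, 4), (3, 7), (5, 2), (7, 1), (6, 2), (4, 5), (6, 6)]
result = sfs_skyline(sample)
```

Because the sum of attributes is monotone, every candidate kept during the scan is already a final skyline point; the cost lies in the exhaustive comparison against the growing candidate set.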

(Cont.) Hybrid Algorithm: Includes Index, NN, and BBS. BBS is based on NN search and adopts an R-tree as its underlying index. 7

(Cont.) The nearest neighbor (NN) of the origin is a skyline point, and the points it dominates can be discarded. 8

(Cont.) The second NN, found among the points not dominated by the first, is another skyline point, and the points it dominates can be discarded as well. 9

(Cont.) Repeating this search yields the skyline points. 10

(Cont.) BBS Deletion and Insertion: Deletion: the EDR (Exclusive Dominance Region) of the deleted skyline point needs to be found. Insertion: the new point needs to be compared with the other skyline points. 11

Skyline Processing in Z-Order Skyline and Z-Order Curve: For a d-dimensional space with [0, 2^v - 1] as the coordinate value domain range, the Z-address of a data point contains dv bits, which can be considered as v d-bit groups. The i-th bit of a Z-address is contributed by the (i/d)-th bit of the (i%d)-th coordinate. 12
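The bit-interleaving rule can be sketched as follows. This is an illustration under stated assumptions: bits are counted from the most significant one starting at 0, i/d is integer division, and the coordinate order as well as the names `z_address` and `z_value` are choices made here, not taken from the slides.

```python
from typing import Sequence


def z_address(coords: Sequence[int], v: int) -> str:
    """Interleave d coordinates of v bits each into a d*v-bit Z-address.

    Bit i of the result (0-based, most significant first) is bit i // d of
    coordinate i % d, also counted from the most significant bit.
    """
    d = len(coords)
    if any(c < 0 or c >= (1 << v) for c in coords):
        raise ValueError("each coordinate must lie in [0, 2^v - 1]")
    bits = []
    for i in range(d * v):
        coord = coords[i % d]
        shift = v - 1 - (i // d)  # position of the wanted bit inside coord
        bits.append(str((coord >> shift) & 1))
    return "".join(bits)


def z_value(coords: Sequence[int], v: int) -> int:
    """The same Z-address as an integer, convenient for sorting."""
    return int(z_address(coords, v), 2)


example = z_address((3, 7), 3)  # 2 dimensions, 3 bits per coordinate
```

Under this convention the 2-dimensional point (3, 7) with v = 3 maps to 011111: the coordinates are 011 and 111 in binary, and the three 2-bit groups are 01, 11 and 11.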

(Cont.) Instance : Z-address : (011111) 13

(Cont.) P1(00 11 11), P2(01 01 10), P3(01 10 00), P4(01 11 11), P5(10 01 10), P6(10 10 11), P7(10 11 00), P8(11 00 01), P9(11 11 00) 14
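A useful property of the Z-order curve is that a point can only be dominated by points with smaller Z-addresses, so a scan in Z-address order only ever compares a point against the skyline found so far. The sketch below decodes the nine Z-addresses listed above and scans them in that order. It assumes two dimensions, that smaller values are better, and that the first bit of each group belongs to the first coordinate; swapping the two coordinates would not change which points are reported. The helper names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def decode(z_bits: str, d: int) -> Tuple[int, ...]:
    """Undo the interleaving: bit i of the Z-address goes to coordinate i % d."""
    coords = [0] * d
    for i, bit in enumerate(z_bits):
        coords[i % d] = (coords[i % d] << 1) | int(bit)
    return tuple(coords)


def dominates(p: Sequence[int], q: Sequence[int]) -> bool:
    """Assumption: smaller is better in every dimension."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def z_order_skyline(z_addresses: Dict[str, str], d: int) -> List[str]:
    """Scan points in ascending Z-address order, keeping the undominated ones."""
    skyline: List[Tuple[str, Tuple[int, ...]]] = []
    # Equal-length bit strings sort in the same order as their numeric values.
    for name, z in sorted(z_addresses.items(), key=lambda kv: kv[1]):
        p = decode(z, d)
        if not any(dominates(s, p) for _, s in skyline):
            skyline.append((name, p))
    return [name for name, _ in skyline]


raw = {
    "P1": "00 11 11", "P2": "01 01 10", "P3": "01 10 00",
    "P4": "01 11 11", "P5": "10 01 10", "P6": "10 10 11",
    "P7": "10 11 00", "P8": "11 00 01", "P9": "11 11 00",
}
points = {name: z.replace(" ", "") for name, z in raw.items()}
skyline_names = z_order_skyline(points, d=2)
```

With these assumptions the scan reports P1, P2, P3, P5 and P6; the remaining four points are each dominated by a point that precedes them in Z-order.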


(Cont.) ZBtree Index Structure 16

(Cont.) The index has two goals: to facilitate data processing along the Z-order address sequence, and to preserve data points in regions so that the search space can be pruned efficiently. A Z-address is assigned to every point, and the points are stored in a B+-tree. 17

ZSearch for Skyline Query RZ-Region Based Dominance Test 18
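The slide only names the test, so the following is a hedged sketch of how a region-level dominance check can work rather than the authors' exact procedure. It assumes that a region is described by the smallest and largest Z-address it covers, that its corners follow from the common prefix of those two addresses, and that smaller values are better. The names `rz_region`, `region_dominance`, `minpt` and `maxpt` are used here for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[int, ...]
Region = Tuple[Point, Point]  # (minpt, maxpt) corners of a region


def decode(z_bits: str, d: int) -> Point:
    """Bit i of the Z-address goes to coordinate i % d."""
    coords = [0] * d
    for i, bit in enumerate(z_bits):
        coords[i % d] = (coords[i % d] << 1) | int(bit)
    return tuple(coords)


def dominates(p: Sequence[int], q: Sequence[int]) -> bool:
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def rz_region(z_low: str, z_high: str, d: int) -> Region:
    """Corners of the smallest prefix region covering [z_low, z_high]."""
    n = len(z_low)
    k = 0
    while k < n and z_low[k] == z_high[k]:
        k += 1
    prefix = z_low[:k]
    minpt = decode(prefix + "0" * (n - k), d)  # free bits all 0
    maxpt = decode(prefix + "1" * (n - k), d)  # free bits all 1
    return minpt, maxpt


def region_dominance(r: Region, other: Region) -> str:
    """Classify whether points of r dominate points of other."""
    (r_min, r_max), (o_min, o_max) = r, other
    if dominates(r_max, o_min):
        return "all"    # every point of r dominates every point of other
    if not dominates(r_min, o_max):
        return "none"   # no point of r can dominate any point of other
    return "maybe"      # the regions must be examined in more detail


single = rz_region("001111", "001111", d=2)   # the point (3, 3)
upper = rz_region("110001", "111100", d=2)    # corners (4, 4) and (7, 7)
left = rz_region("010110", "011000", d=2)     # corners (0, 4) and (3, 7)
right = rz_region("100110", "101100", d=2)    # corners (4, 0) and (7, 3)
```

The point of such a test is that a whole block of points can be accepted or discarded with two corner comparisons, and individual points are compared only in the "maybe" case.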

(Cont.) ZSearch Algorithm: uses the RZ-region-based dominance test. Input: ZBtree for the source dataset. Local: a stack s. Output: skyline points. 19


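ZUpdate for Skyline Updates Insertion: The inserted point Pinsert is first compared with the skyline points whose Z-addresses are smaller than that of Pinsert. If Pinsert is dominated, the skyline stays the same. Otherwise, Pinsert joins the skyline and is compared with the skyline points whose Z-addresses are larger than that of Pinsert; any of them that it dominates are removed. 21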
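The insertion rule can be sketched on a skyline kept as a list sorted by Z-address. This is a simplified stand-in for the index-based procedure, assuming that smaller values are better and that Z-addresses are stored as integers; `skyline_insert` and `z_value` are hypothetical names.

```python
import bisect
from typing import List, Sequence, Tuple

Point = Tuple[int, ...]
Entry = Tuple[int, Point]  # (Z-address as an integer, point)


def z_value(p: Sequence[int], v: int) -> int:
    """Interleave the v-bit coordinates of p, most significant bits first."""
    z = 0
    for bit in range(v - 1, -1, -1):
        for c in p:
            z = (z << 1) | ((c >> bit) & 1)
    return z


def dominates(p: Sequence[int], q: Sequence[int]) -> bool:
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline_insert(skyline: List[Entry], p: Point, v: int) -> bool:
    """Update a Z-sorted skyline in place; return True if p joined it."""
    entry = (z_value(p, v), p)
    pos = bisect.bisect_left(skyline, entry)
    # Only skyline points with smaller Z-addresses can dominate p.
    if any(dominates(s, p) for _, s in skyline[:pos]):
        return False
    # Only skyline points with larger Z-addresses can be dominated by p.
    survivors = [e for e in skyline[pos:] if not dominates(p, e[1])]
    skyline[pos:] = [entry] + survivors
    return True


# Illustrative skyline with 3-bit coordinates.
current = sorted((z_value(p, 3), p) for p in [(3, 3), (1, 6), (2, 4), (5, 2), (7, 1)])
joined = skyline_insert(current, (4, 1), 3)
rejected = skyline_insert(current, (4, 5), 3)
```

In this example (4, 1) joins the skyline and evicts (5, 2) and (7, 1), while (4, 5) is rejected after a comparison with the points that precede it in Z-order.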

(Cont.) Deletion: Find the points that are dominated only by Pdel and add them to the skyline set. Only the points whose Z-addresses are larger than that of Pdel need to be compared. 22
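A list-based sketch of the deletion rule follows. It is an assumption-laden illustration rather than the index-based algorithm: the dataset is a plain list of distinct points, smaller values are better, and among the points dominated only by Pdel the sketch keeps those not dominated by one another. The names `skyline_delete` and `z_value` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, ...]


def z_value(p: Sequence[int], v: int) -> int:
    """Interleave the v-bit coordinates of p, most significant bits first."""
    z = 0
    for bit in range(v - 1, -1, -1):
        for c in p:
            z = (z << 1) | ((c >> bit) & 1)
    return z


def dominates(p: Sequence[int], q: Sequence[int]) -> bool:
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline_delete(data: List[Point], skyline: List[Point],
                   p_del: Point, v: int) -> List[Point]:
    """Return the skyline after the skyline point p_del is deleted."""
    rest = [s for s in skyline if s != p_del]
    z_del = z_value(p_del, v)
    # Candidates lie after p_del in Z-order, are dominated by p_del,
    # and are dominated by no other skyline point.
    candidates = sorted(
        (q for q in data
         if z_value(q, v) > z_del
         and dominates(p_del, q)
         and not any(dominates(s, q) for s in rest)),
        key=lambda q: z_value(q, v),
    )
    promoted: List[Point] = []
    for q in candidates:  # Z-order scan among the candidates themselves
        if not any(dominates(c, q) for c in promoted):
            promoted.append(q)
    return sorted(rest + promoted, key=lambda q: z_value(q, v))


# Illustrative data with 3-bit coordinates.
data = [(3, 3), (1, 6), (2, 4), (3, 7), (5, 2), (7, 1), (6, 2), (4, 5), (6, 6)]
old_skyline = [(3, 3), (1, 6), (2, 4), (5, 2), (7, 1)]
new_skyline = skyline_delete(data, old_skyline, (5, 2), 3)
```

Deleting (5, 2) promotes (6, 2), which no other skyline point dominates, whereas (6, 6) stays out because (3, 3) still dominates it.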

Experiment 23


Conclusion: In this paper, we analyze the skyline problem and exploit the ordering and clustering properties of the Z-order curve, which match the skyline processing strategies very well. The ZSearch algorithm scales very well in both dimensionality and cardinality. 26
