The σ-neighborhood skyline queries
Chen, Yi-Chung; Lee, Chiang. The σ-neighborhood skyline queries. Information Sciences, 2015, 322: 92-114.
張天彥, 2015/12/05



Outline
Introduction to skyline queries
The σ-Neighborhood Skyline Queries
k-dominant Skyline
Conclusions

Introduction to skyline queries: The concept of domination
[Figure: hotels A, B, and C plotted by distance to the beach and price of a hotel room; the accompanying table lists C at 1.8 km and $13.]
A dominates B and C.
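The domination test on this slide can be written as a short predicate. This is a minimal sketch, assuming smaller values are better in every dimension (closer to the beach, cheaper). The coordinates of hotels A and B are hypothetical, since the transcript preserves only C's values (1.8 km, $13).

```python
from typing import Sequence


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q, assuming smaller is better everywhere.

    p dominates q when p is no worse than q in every dimension
    and strictly better in at least one.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


# (distance to the beach in km, price in $)
hotel_a = (1.0, 10.0)  # hypothetical values
hotel_b = (1.5, 12.0)  # hypothetical values
hotel_c = (1.8, 13.0)  # values shown on the slide

a_beats_b = dominates(hotel_a, hotel_b)
a_beats_c = dominates(hotel_a, hotel_c)
```

A point never dominates itself under this definition, because the strict improvement in at least one dimension is missing.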

Introduction to skyline queries: Definition of the skyline points
Find all points that are not dominated by any other point.


Introduction to skyline queries: Definition of the skyline points
[Figure: points A to H plotted by distance and price; the points that are not dominated by any other point form the skyline.]
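The definition of the skyline translates directly into a quadratic-time filter. The sketch below is only an illustration of the definition, not one of the indexed algorithms discussed later. It assumes smaller is better in both dimensions, and the sample points are hypothetical because the coordinates of A to H are not preserved.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """p dominates q: no worse everywhere, strictly better somewhere."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (O(n^2) comparisons)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (distance, price) pairs.
sample = [(1, 9), (2, 7), (4, 4), (7, 2), (3, 8), (5, 6), (8, 5), (9, 9)]
sample_skyline = skyline(sample)
```

Here `(3, 8)` drops out because `(2, 7)` is both closer and cheaper, while `(1, 9)` survives because nothing is at least as close and at least as cheap.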


The σ-Neighborhood Skyline Queries: Problem to be solved by σ-N skyline queries
Unquantifiable attributes
[Figure: points A to I plotted by distance and price, contrasting an unquantifiable attribute with a quantifiable attribute.]

The σ-Neighborhood Skyline Queries: σ-N skyline queries
User requirement: "I can tolerate a 10% error: $2 in price, 0.2 km in distance."
[Figure: points A to H plotted by distance and price, marking the σ-N skyline region (0.2 km, $2) and the σ-N skyline points.]

The σ-Neighborhood Skyline Queries: Applying the σ-N skyline query to a dataset with an unquantifiable attribute
The user can tolerate a 10% error ($2 in price, 0.2 km in distance).
[Figure: points A to H plotted by distance and price.]


The σ-Neighborhood Skyline Queries: Difficulties of answering σ-N skyline queries
Naïve algorithm:
1. Find the skyline points with an existing skyline algorithm → first scan
2. Find the σ-N skyline points from the skyline points → second scan
Assume there are 1M data points in the dataset → 2M data points need to be checked.
The cost can be too high to afford when the dataset is large.
Can we solve the σ-N skyline query in one scan?
Can we solve the σ-N skyline query without scanning all data points in the dataset?
[Figure: points A to H plotted by distance and price.]
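The two-scan naïve algorithm above can be sketched as follows. This is an illustration under stated assumptions, not the paper's definition: a σ-N skyline point is taken here to be a non-skyline point that lies within a per-dimension tolerance of some skyline point, which is one reading of the "$2 in price, 0.2 km in distance" example. The function names, the `tolerance` parameter, and the sample data are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """p dominates q: no worse everywhere, strictly better somewhere."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points: List[Point]) -> List[Point]:
    """First scan: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def within_tolerance(p: Point, s: Point, tolerance: Sequence[float]) -> bool:
    """Assumed neighborhood test: p is inside a box of half-widths
    `tolerance` centered on the skyline point s."""
    return all(abs(a - b) <= t for a, b, t in zip(p, s, tolerance))


def naive_sigma_n_skyline(points: List[Point], tolerance: Sequence[float]):
    """Return (skyline points, assumed sigma-N skyline points) in two scans."""
    sky = skyline(points)                      # first scan over all points
    sky_set = set(sky)
    neighbors = [p for p in points             # second scan over all points
                 if p not in sky_set
                 and any(within_tolerance(p, s, tolerance) for s in sky)]
    return sky, neighbors


# Hypothetical (distance in km, price in $) data; tolerance from the slide.
hotels = [(1.0, 20.0), (1.1, 21.0), (2.0, 15.0), (3.0, 30.0)]
sky_points, sigma_points = naive_sigma_n_skyline(hotels, (0.2, 2.0))
```

Both scans touch every point, which is the 2M-check cost the slide objects to for a dataset of 1M points.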

The σ-Neighborhood Skyline Queries: The algorithms for σ-N skyline queries
Rσ-N algorithm (based on the R-tree): existing index structure
Mσ-N algorithm (based on the M+-tree): newly developed index structure
Same searching idea
Two pruning mechanisms

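The σ-Neighborhood Skyline Queries: R-tree
The R-tree is constructed based on the size of the area.
Example:
[Figure: points s1 to s3 and p1 to p5 enclosed by rectangles e1 to e7, with the corresponding tree: e7 contains e5 and e6; e5 contains e1 and e2; e6 contains e3 and e4; e1 holds s1 and s2; e2 holds p1 and p2; e3 holds p3 and p4; e4 holds s3 and p5.]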

The σ-Neighborhood Skyline Queries: Searching idea
Objective: find skyline points and σ-N skyline points in one scan.
Example (one step per slide, on the R-tree with entries e1 to e7):
1. Start from the root entry e7.
2. Expand e7 into e5 and e6.
3. Expand e5 into e1 and e2.
4. Expand e1 to obtain s1 and s2. Range query: e2 vs. s1 and s2.
5. Expand e2 to obtain p1 and p2. Range queries: p1 vs. s1 and s2; p2 vs. s1 and s2; e2 vs. s1 and s2; e6 vs. s1 and s2.

The σ-Neighborhood Skyline Queries: Disadvantages of the Rσ-N algorithm
1. Too many redundant points (caused by a property of the R-tree): the R-tree is constructed based on the size of the area.
[Figure: two insertion examples asking "Insert into e1 or e2?"; the answer is e1 in the first and e2 in the second. Label: unrelated points.]

The σ-Neighborhood Skyline Queries: Disadvantages of the Rσ-N algorithm
Too many redundant points:
Additional I/O cost
Additional range queries
[Figure: skyline points s1 and s2.]

The σ-Neighborhood Skyline Queries: Disadvantages of the Rσ-N algorithm
Too many range queries
[Figure: skyline points s1 to s6 and an entry e1 containing points A and B.]
Each of e1, A, and B is checked against each of s1, s2, s3, s4, s5, and s6.

The σ-Neighborhood Skyline Queries: Using the M+-tree to solve the problems
The R-tree is constructed based on area.
The M+-tree is constructed based on distance.
[Figure labels: related points; number of redundant points.]

The σ-Neighborhood Skyline Queries: Using the M+-tree to solve the problems
Triangle inequality
[Figure: a skyline point s with radius σ; A is the center of the M+BR, which also contains B and C.]
Only C needs a further check.
[Figure label: number of redundant points <7.]
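The pruning on this slide relies on the triangle inequality: for a skyline point s, a node center A, and a member point p, d(s, p) ≥ |d(s, A) − d(A, p)|. If that lower bound already exceeds σ, p cannot be within σ of s and needs no distance computation. The sketch below shows this generic metric-index bound only. It is not the M+-tree's actual pruning procedure, and the coordinates, the Euclidean metric, and the function name are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def candidates_after_pruning(s: Sequence[float],
                             center: Sequence[float],
                             members: List[Tuple[Point, float]],
                             sigma: float) -> List[Point]:
    """Return the members that may lie within sigma of s.

    `members` pairs each point with its precomputed distance to `center`,
    as a distance-based index would store it. Only one new distance,
    d(s, center), is computed here.
    """
    d_s_center = math.dist(s, center)
    survivors = []
    for point, d_center_point in members:
        lower_bound = abs(d_s_center - d_center_point)
        if lower_bound > sigma:
            continue  # d(s, point) >= lower_bound > sigma: safely pruned
        survivors.append(point)  # still needs an exact distance check
    return survivors


# Hypothetical layout: A is the node center, B and C are the other members.
A = (0.0, 0.0)
B = (1.0, 0.0)
C = (9.5, 0.0)
s = (10.0, 0.0)
node = [(A, 0.0), (B, 1.0), (C, 9.5)]
to_check = candidates_after_pruning(s, A, node, sigma=1.0)
```

Surviving the bound does not guarantee membership in the neighborhood, so each survivor still needs an exact distance computation, which matches the slide's remark that only C needs a further check.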

The σ-Neighborhood Skyline Queries: Using the M+-tree to solve the problems
Number of range queries
[Figure: skyline points s1 to s6, an entry e1 containing points A and B, and a summation line.]
Original: e1 vs. s1, s2, s3, s4, s5, s6; A vs. s1, s2, s3, s4, s5, s6; B vs. s1, s2, s3, s4, s5, s6 → 18 range queries
Using the summation: e1 vs. s1, s2, s3, s4, s5, s6; A vs. s1 and s6; B vs. s1 and s6 → 10 range queries
It is easy to get the summation of e1.

The σ-Neighborhood Skyline Queries: Simulations
Number of data points: 1M
Number of dimensions: 2, 3, 4, 5, 6
Datasets: independent and anti-correlated
[Figures: results on the independent and anti-correlated datasets, including the selection of σ.]

The σ-Neighborhood Skyline Queries: Conclusion
Using the Mσ-N algorithm for σ-N skyline queries is far more efficient than using the Rσ-N algorithm.


k-dominant skylines
k-dominant skylines help reduce the number of skyline points in high-dimensional data.
[Figure: points A to L plotted by distance and price, illustrating k-dominant skylines.]
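The slide does not spell out k-dominance, so the sketch below uses the commonly cited definition as an assumption: p k-dominates q if there are k dimensions in which p is no worse than q, with p strictly better in at least one of them. Smaller is taken to be better, and the three-dimensional sample points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_dominates(p: Sequence[float], q: Sequence[float], k: int) -> bool:
    """p k-dominates q: no worse in at least k dimensions, and strictly
    better in at least one (a strict dimension is also a no-worse one,
    so it can always be included among the k chosen dimensions)."""
    no_worse = sum(1 for a, b in zip(p, q) if a <= b)
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse >= k and strictly_better


def k_dominant_skyline(points: List[Point], k: int) -> List[Point]:
    """Return the points that no other point k-dominates."""
    return [p for p in points
            if not any(k_dominates(q, p, k) for q in points)]


# Hypothetical three-dimensional data.
data = [(1, 1, 9), (2, 2, 1), (3, 3, 3)]
full_skyline = k_dominant_skyline(data, k=3)   # k = d: ordinary skyline
two_dominant = k_dominant_skyline(data, k=2)   # relaxed dominance
```

With k equal to the number of dimensions the result is the ordinary skyline; lowering k makes domination easier, which is how the result set shrinks in high-dimensional data.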