Junqi Zhang+ Xiangdong Zhou+ Wei Wang+ Baile Shi+ Jian Pei*

Using High Dimensional Indexes to Support Relevance Feedback Based Interactive Images Retrieval

+Fudan University, China *Simon Fraser University, Canada

Motivation

K-means clustering has been widely used to improve the performance of high-dimensional indexes. However, some problems still need further discussion, such as how to preset the query radius and the number K of K-means clusters. In this demo system, we present a new cluster-splitting-based B+-tree index that addresses these problems, and we apply the index to support relevance feedback for content-based image retrieval.

Index Structure

Background

The central idea of iDistance is to cluster the objects and find a reference point for each cluster. The distance between each object and the reference point of the cluster to which the object belongs can then be indexed in a B+-tree. It has been widely observed that, in high-dimensional real data spaces, the majority of clusters intersect one another. A query region therefore often covers many clusters, which lowers query efficiency. To improve query performance, the iDistance search algorithm starts with a small preset search radius and enlarges the radius gradually if necessary.

Experiment results

[Figures not preserved in the transcript.]

Challenges

However, in existing work, the initial query radius and the enlarging step must be preset by experiment or from the user's experience. There is no theory to guide the estimation of these parameters.

Demo system

1. Based on the query cost model of metric space, we developed formulas to compute the "optimal" cluster splitting number M.
2. During interactive relevance feedback, the query distance is updated using the users' feedback, and the index distance is guaranteed to be a lower bound of the query distance. Thus, the index structure does not need to be changed.

Notation:
Nc: the number K of K-means clusters
N: size of the dataset
H: height of the internal nodes
u: fanout of a node

Approach

We present a new cluster-splitting-based B+-tree index to deal with the above problems:
1. The optimal KNN search algorithm is adopted, so no initial query radius has to be selected.
2. Through cluster splitting, the data space is partitioned more finely, which reduces the intersection between the query region and the data clusters.
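To make the iDistance idea from the Background section more concrete, here is a minimal Python sketch. It maps each object to the one-dimensional key `i * c + dist(object, reference_i)` and answers a fixed-radius range query by scanning one key interval per cluster. Several things are assumptions made for illustration and do not come from the demo system: the class name `IDistanceSketch`, the constant `c`, the sorted Python list that stands in for the B+-tree, and assigning each object to its nearest reference point in place of K-means clustering. The sketch does not implement the optimal KNN search or the cluster splitting described above.

```python
import bisect
import math


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class IDistanceSketch:
    """Illustrative iDistance-style index; a sorted list stands in for the B+-tree."""

    def __init__(self, points, refs, c=1000.0):
        # c is assumed to be larger than any object-to-reference distance,
        # so the key ranges of different clusters never overlap.
        self.refs = refs
        self.c = c
        self.max_radius = [0.0] * len(refs)
        entries = []
        for p in points:
            # Assumption: assign each object to its nearest reference point.
            i = min(range(len(refs)), key=lambda j: dist(p, refs[j]))
            d = dist(p, refs[i])
            self.max_radius[i] = max(self.max_radius[i], d)
            entries.append((i * c + d, p))
        entries.sort(key=lambda e: e[0])
        self.keys = [k for k, _ in entries]
        self.points = [p for _, p in entries]

    def range_query(self, q, r, eps=1e-9):
        """Return all indexed points within distance r of q."""
        hits = []
        for i, ref in enumerate(self.refs):
            dq = dist(q, ref)
            # The query sphere cannot reach this cluster: skip it.
            if dq - r > self.max_radius[i] + eps:
                continue
            # By the triangle inequality, any hit p in cluster i satisfies
            # dq - r <= dist(p, ref) <= dq + r.
            lo = i * self.c + max(0.0, dq - r) - eps
            hi = i * self.c + min(self.max_radius[i], dq + r) + eps
            start = bisect.bisect_left(self.keys, lo)
            end = bisect.bisect_right(self.keys, hi)
            # Refine the candidates with the real distance.
            hits.extend(p for p in self.points[start:end] if dist(p, q) <= r)
        return hits
```

In this sketch, a larger radius `r` widens the scanned key interval in every cluster that the query sphere touches. This is the cost that a finer partitioning of the data space aims to reduce.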
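The lower-bound property mentioned for relevance feedback can be illustrated with a small filter-and-refine sketch. The poster does not say how the query distance is updated, so the code below assumes, purely for illustration, a weighted Euclidean query distance whose per-dimension weights are all at least 1. Under that assumption the unweighted Euclidean distance used by the index never exceeds the query distance, so candidates fetched with the index distance cannot miss a true result. The function names are hypothetical, and a linear scan stands in for the index lookup.

```python
import math


def index_distance(a, b):
    """Unweighted Euclidean distance (assumed to be what the index stores)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_distance(a, b, weights):
    """Weighted Euclidean distance (assumed form of the feedback-updated distance)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def feedback_range_query(points, q, r, weights):
    """Filter with the index distance, then refine with the query distance."""
    if any(w < 1.0 for w in weights):
        # With a weight below 1 the index distance is no longer a lower bound.
        raise ValueError("this sketch assumes every weight is >= 1")
    # Filter step: stands in for a range lookup on the unchanged index.
    candidates = [p for p in points if index_distance(p, q) <= r]
    # Refine step: apply the updated query distance to the candidates only.
    return [p for p in candidates if query_distance(p, q, weights) <= r]
```

Because only the refine step depends on the weights, the same index can be reused across feedback rounds in this sketch.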