Multi Feature Indexing Network (MUFIN): Similarity Search Platform for Many Applications. Pavel Zezula, Faculty of Informatics, Masaryk University, Brno, 23.1.2012



Outline of the talk
– Why similarity
– Principles of metric similarity searching
– The MUFIN approach
– Demo applications
– Future directions

Real-Life Motivation: The social psychology view
– Any event in the history of an organism is, in a sense, unique.
– Recognition, learning, and judgment presuppose an ability to categorize stimuli and classify situations by similarity.
– Similarity (proximity, resemblance, communality, representativeness, psychological distance, etc.) is fundamental to theories of perception, learning, judgment, etc.

Contemporary Networked Media: The digital data view
– Almost everything that we see, read, hear, write, measure, or observe can be digital.
– Users autonomously contribute to the production of global media, and the growth is exponential.
– Sites like Flickr, YouTube, and Facebook host user-contributed content for a variety of events.
– The elements of networked media are related by numerous multi-faceted links of similarity.

Examples with Similarity
– Does the computer disk of a suspected criminal contain illegal multimedia material?
– What are the stocks with similar price histories?
– Which companies advertise their logos in the live TV broadcast of a football match?
– Is the situation on the web getting close to any of the network attacks that resulted in significant damage in the past?

Challenge
– Networked media is getting close to the human “fact-bases”: the gap between physical and digital has blurred.
– Similarity data management is needed to connect, search, filter, merge, relate, rank, cluster, classify, identify, or categorize objects across various collections.
– WHY? It is the similarity which is in the world revealing.

Limitations: Data Types
We have
– Attributes: numbers, strings, etc.
– Text (text-based): documents, annotations
We need
– Multimedia: image, video, audio
– Security: biometrics
– Medicine: EKG, EEG, EMG, EMR, CT, etc.
– Scientific data: biology, chemistry, physics, life sciences, economics
– Others: motion, emotion, events, etc.

Limitations: Models of Similarity
We have
– Simple geometric models, typically vector spaces
We need
– More complex models
– Non-metric models
– Asymmetric similarity
– Subjective similarity
– Context-aware similarity
– Complex similarity
– Etc.

Limitations: Queries
We have
– Simple queries: nearest neighbor, range
We need
– More query types: reverse NN, distinct NN, similarity join
– Other similarity-based operations: filtering, classification, event detection, clustering, etc.
– Similarity algebra: may become the basis of a “Similarity Data Management System”

Limitations: Implementation Strategies
We have
– Centralized or parallel processing
We need
– Scalable and distributed architectures
– MapReduce-like approaches
– P2P architectures
– Cloud computing
– Self-organized architectures
– Etc.

Search Strategy Evolution
Scalability
● data volume: exponential
● number of users (queries)
● variety of data types
● multi-lingual, multi-feature, multi-modal queries
Determinism
– exact match ► similarity
– precise ► approximate
– same answer ► good answer; recommendation
– fixed query ► personalized; context aware
– fixed infrastructure ► dynamic mapping; mobile devices
[Figure: architectures (centralized, parallel, distributed, peer-to-peer, self-organized) placed on a grade axis from high to low, ranging from well established through cutting-edge to research]

Similarity Data Management System
[Figure: diagram with the labels similarity, effectiveness, efficiency, stimuli, matching, extraction, evaluation, execution, algebra]

Metric Search Grows in Popularity
– Hanan Samet: Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, 2006
– P. Zezula, G. Amato, V. Dohnal, and M. Batko: Similarity Search: The Metric Space Approach. Springer

The MUFIN Approach
MUFIN: MUlti-Feature Indexing Network
[Diagram: search engine connecting data & queries, index structure, and infrastructure]
– Scalability: P2P structure
– Extensibility: metric space
– Independence: infrastructure as a service

Extensibility: Metric Abstraction of Similarity
Metric space: M = (D,d)
– D: domain
– d(x,y): distance function
For all x, y, z ∈ D:
– d(x,y) ≥ 0 (non-negativity)
– d(x,y) = 0 ⇔ x = y (identity)
– d(x,y) = d(y,x) (symmetry)
– d(x,y) ≤ d(x,z) + d(z,y) (triangle inequality)
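
The four postulates can be spot-checked by brute force on a small sample of objects. The sketch below is only an illustration: the function name and the tolerance parameter are assumptions, and passing the check on a sample does not prove that a function is a metric on the whole domain.

```python
from itertools import product


def violates_metric_postulates(points, d, tol=1e-9):
    """Return the name of the first postulate violated on `points`, or None."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            return "non-negativity"
        if (d(x, y) == 0) != (x == y):
            return "identity"
        if abs(d(x, y) - d(y, x)) > tol:
            return "symmetry"
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return "triangle inequality"
    return None


sample = [0, 1, 2, 5]
print(violates_metric_postulates(sample, lambda a, b: abs(a - b)))    # None
print(violates_metric_postulates(sample, lambda a, b: (a - b) ** 2))  # triangle inequality
```

The squared difference fails because d(0,2) = 4 exceeds d(0,1) + d(1,2) = 2, so it cannot be used directly with a metric index.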

Examples of Distance Functions
– L_p Minkowski distance (for vectors)
  – L_1: city-block distance
  – L_2: Euclidean distance
  – L_∞: infinity (maximum) distance
– Edit distance (for strings)
  – minimal number of insertions, deletions and substitutions
  – d(‘application’, ‘applet’) = 6
– Jaccard’s coefficient (for sets A, B)
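
A minimal sketch of the three distances on this slide, in plain Python. The function names are illustrative. The Jaccard formula used here, 1 − |A ∩ B| / |A ∪ B|, is the usual distance form of the coefficient and is an assumption, since the slide's own formula did not survive extraction.

```python
def minkowski(x, y, p):
    """L_p distance between equal-length vectors; p = float('inf') gives L_inf."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == float("inf"):
        return max(diffs)
    return sum(v ** p for v in diffs) ** (1.0 / p)


def edit_distance(s, t):
    """Minimal number of insertions, deletions and substitutions turning s into t."""
    prev = list(range(len(t) + 1))          # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,                 # delete cs
                            curr[j - 1] + 1,             # insert ct
                            prev[j - 1] + (cs != ct)))   # substitute or match
        prev = curr
    return prev[-1]


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for sets; two empty sets are at distance 0."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


print(minkowski([0, 0], [3, 4], 2))             # 5.0
print(edit_distance("application", "applet"))   # 6
print(jaccard_distance({1, 2, 3}, {2, 3, 4}))   # 0.5
```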

Examples of Distance Functions
– Mahalanobis distance: for vectors with correlated dimensions
– Hausdorff distance: for sets with elements related by another distance
– Earth mover’s distance: primarily for histograms (sets of weighted features)
– and many others

Similarity Search Problem
– For X ⊆ D in metric space M, pre-process X so that the similarity queries are executed efficiently.
– No total ordering exists!

Similarity Queries
– Range query
– Nearest neighbor query
– Similarity join
– Combined queries
– Complex queries

Similarity Range Query
– range query: R(q,r) = { x ∈ X | d(q,x) ≤ r }
… all museums up to 2 km from my hotel …
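
The definition of R(q,r) translates directly into a sequential scan, which is the baseline that every metric index tries to beat. This is a sketch only; the coordinates below are made-up kilometre offsets from the hotel.

```python
import math


def range_query(X, d, q, r):
    """R(q, r): all objects of X within distance r of the query object q."""
    return [x for x in X if d(q, x) <= r]


hotel = (0.0, 0.0)
museums = [(1.0, 1.0), (3.0, 0.0), (0.0, 1.5)]   # hypothetical positions in km
print(range_query(museums, math.dist, hotel, 2.0))
# [(1.0, 1.0), (0.0, 1.5)]
```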

Nearest Neighbor Query
– the nearest neighbor query: NN(q) = x
  – x ∈ X, ∀y ∈ X: d(q,x) ≤ d(q,y)
– k-nearest neighbor query: k-NN(q,k) = A
  – A ⊆ X, |A| = k
  – ∀x ∈ A, y ∈ X − A: d(q,x) ≤ d(q,y)
… five closest museums to my hotel …
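
A k-NN query can likewise be answered by a scan that keeps the k smallest distances. The sketch below uses `heapq.nsmallest` from the standard library; the function name is illustrative, and ties are broken by input order rather than by any rule from the slides.

```python
import heapq


def knn_query(X, d, q, k):
    """k-NN(q, k): the k objects of X closest to q, nearest first."""
    return heapq.nsmallest(k, X, key=lambda x: d(q, x))


positions = [1, 4, 9, 12, 20]
print(knn_query(positions, lambda a, b: abs(a - b), 10, 2))   # [9, 12]
```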

Similarity Join Queries
– similarity join of two data sets
– similarity self join ⇔ X = Y
… pairs of hotels and museums which are five minutes’ walk apart …
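
The slide's formula was lost in extraction, so the sketch below assumes the common definition: a join returns all pairs (x, y) from X × Y whose distance is at most a threshold, here called `mu` (a name chosen for illustration). It is a plain nested-loop join, not an indexed algorithm.

```python
def similarity_join(X, Y, d, mu):
    """All pairs (x, y) from X x Y with d(x, y) <= mu (nested-loop baseline)."""
    return [(x, y) for x in X for y in Y if d(x, y) <= mu]


def similarity_self_join(X, d, mu):
    """Self join (X = Y), reporting each unordered pair of distinct positions once."""
    return [(X[i], X[j])
            for i in range(len(X))
            for j in range(i + 1, len(X))
            if d(X[i], X[j]) <= mu]


def walk(a, b):
    """Hypothetical walking time in minutes between two positions on a street."""
    return abs(a - b)


hotels, museums = [0, 10], [1, 4, 11]
print(similarity_join(hotels, museums, walk, 1))      # [(0, 1), (10, 11)]
print(similarity_self_join([0, 1, 5, 6], walk, 1))    # [(0, 1), (5, 6)]
```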

Combined Queries
– Range + nearest neighbors
– Nearest neighbor + similarity joins: by analogy

Complex Queries
– Find the best matches of circular shape objects with red color
– The best match for circular shape or red color need not be the best match combined
– A_0 algorithm
– Threshold algorithm

Partitioning Principles
Given a set X ⊆ D in M = (D,d), basic partitioning principles have been defined:
– Ball partitioning
– Generalized hyper-plane partitioning
– Excluded middle partitioning
– Clustering

Ball Partitioning
– Inner set: { x ∈ X | d(p,x) ≤ d_m }
– Outer set: { x ∈ X | d(p,x) > d_m }
[Figure: ball of radius d_m around pivot p]

Generalized Hyper-plane
– { x ∈ X | d(p_1,x) ≤ d(p_2,x) }
– { x ∈ X | d(p_1,x) > d(p_2,x) }
[Figure: hyper-plane between pivots p_1 and p_2]

Excluded Middle Partitioning
– Inner set: { x ∈ X | d(p,x) ≤ d_m − ρ }
– Outer set: { x ∈ X | d(p,x) > d_m + ρ }
– Excluded set: otherwise
[Figure: ball of radius d_m around pivot p with an excluded ring of width 2ρ]
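
The three set definitions from the ball, generalized hyper-plane and excluded middle slides can be written as small partitioning functions. This is a sketch with illustrative names; `rho` stands for the half-width of the excluded ring.

```python
def ball_partition(X, d, p, dm):
    """Split X into (inner, outer) by the ball of radius dm around pivot p."""
    inner = [x for x in X if d(p, x) <= dm]
    outer = [x for x in X if d(p, x) > dm]
    return inner, outer


def hyperplane_partition(X, d, p1, p2):
    """Split X by which of the two pivots is closer (ties go to p1)."""
    left = [x for x in X if d(p1, x) <= d(p2, x)]
    right = [x for x in X if d(p1, x) > d(p2, x)]
    return left, right


def excluded_middle_partition(X, d, p, dm, rho):
    """Split X into (inner, outer, excluded) with a ring of width 2*rho left out."""
    inner = [x for x in X if d(p, x) <= dm - rho]
    outer = [x for x in X if d(p, x) > dm + rho]
    excluded = [x for x in X if dm - rho < d(p, x) <= dm + rho]
    return inner, outer, excluded


def dist(a, b):
    return abs(a - b)


X = list(range(11))
print(ball_partition(X, dist, 0, 5))                 # ([0, ..., 5], [6, ..., 10])
print(hyperplane_partition(X, dist, 0, 10))          # ([0, ..., 5], [6, ..., 10])
print(excluded_middle_partition(X, dist, 0, 5, 1))   # ([0, ..., 4], [7, ..., 10], [5, 6])
```

With the excluded middle variant, a range query of radius at most ρ can never need both the inner and the outer set, which is the property the D-Index builds on.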

Clustering
Cluster data into sets
– bounded by a ball region
– { x ∈ X | d(p_i,x) ≤ r_i^c }

Scalability: Peer-to-Peer Indexing
– Local search: M-tree, D-Index, M-Index
– Native metric techniques: GHT*, VPT*
– Transformation techniques: M-CAN, M-Chord

The M-tree [Ciaccia, Patella, Zezula, VLDB 1997]
1) Paged organization
2) Dynamic
3) Suitable for arbitrary metric spaces
4) I/O and CPU optimization: computing d can be time-consuming

The M-tree Idea
Depending on the metric, the “shape” of index regions changes.
Metric:
– L_2 (Euclidean)
– L_1 (city-block)
– L_∞ (max-metric)
– weighted Euclidean
– quadratic form
[Figure: index regions over objects A to F under different metrics]

M-tree: Example
[Figure: example M-tree over objects o_1 to o_11, annotated with the labels covering radius, distance to parent, and leaf entries]

M-tree family
– Bulk loading
– Slim-tree
– Multi-way insertion
– PM-tree
– M²-tree
– etc.

D-Index [Dohnal, Gennaro, Zezula, MTA 2002]
– 4 separable buckets at the first level
– 2 separable buckets at the second level
– exclusion bucket of the whole structure

D-index: Insertion

D-index: Range Search
[Figure: range queries with center q and radius r at different positions relative to the buckets]

Implementation Postulates of Distributed Indexes
– dynamism: nodes can be added and removed
– no hot-spots: no centralized nodes, no flooding by messages (transactions)
– update independence: a network update at one site does not require an immediate change propagation to all the other sites

Distributed Similarity Search Structures
Native metric structures:
– GHT* (Generalized Hyperplane Tree)
– VPT* (Vantage Point Tree)
Transformation approaches:
– M-CAN (Metric Content Addressable Network)
– M-Chord (Metric Chord)

GHT* Address Search Tree
– Based on the Generalized Hyperplane Tree [Uhl91]
  – two pivots for binary partitioning
[Figure: binary tree with pivot pairs (p_1, p_2), (p_3, p_4), (p_5, p_6) and the corresponding partitioning of the space]

GHT* Address Search Tree
– Inner node
  – two pivots (reference objects)
– Leaf node
  – BID: pointer to a bucket if data is stored on the current peer
  – NNID: pointer to a peer if data is stored on a different peer
[Figure: AST with leaves BID 1, BID 2, BID 3 and NNID 2 pointing to Peer 2]

GHT* Address Search Tree

GHT* Range Query
Range query R(q,r):
– traverse peer’s own AST
– search buckets for all BIDs found
– forward query to all NNIDs found
[Figure: ASTs of two peers and the query ball (q, r) intersecting the partitions]
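
The traversal step can be sketched as follows. The slides do not spell out the pruning conditions, so this sketch assumes the standard generalized hyper-plane rule that follows from the triangle inequality: the left subtree can hold answers only if d(q,p_1) − r ≤ d(q,p_2) + r, and the right subtree only if d(q,p_1) + r ≥ d(q,p_2) − r. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Leaf:
    bid: Optional[int] = None    # local bucket identifier
    nnid: Optional[int] = None   # identifier of the peer holding the data


@dataclass
class Inner:
    p1: object                   # first pivot
    p2: object                   # second pivot
    left: "Node"                 # objects with d(p1, x) <= d(p2, x)
    right: "Node"                # objects with d(p1, x) >  d(p2, x)


Node = Union[Inner, Leaf]


def traverse_ast(node, d, q, r, bids, nnids):
    """Collect the BIDs to search locally and the NNIDs to forward R(q, r) to."""
    if isinstance(node, Leaf):
        if node.bid is not None:
            bids.add(node.bid)
        else:
            nnids.add(node.nnid)
        return
    d1, d2 = d(q, node.p1), d(q, node.p2)
    if d1 - r <= d2 + r:         # query ball may reach the left partition
        traverse_ast(node.left, d, q, r, bids, nnids)
    if d1 + r >= d2 - r:         # query ball may reach the right partition
        traverse_ast(node.right, d, q, r, bids, nnids)


ast = Inner(0, 10, Leaf(bid=1), Inner(8, 20, Leaf(bid=2), Leaf(nnid=2)))
bids, nnids = set(), set()
traverse_ast(ast, lambda a, b: abs(a - b), 12, 3, bids, nnids)
print(bids, nnids)   # {2} {2}
```

In this toy run the peer searches its own bucket 2 and forwards the query to peer 2, while bucket 1 is pruned.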

AST: Logarithmic Replication
– Full AST on every peer is space consuming
  – replication of pivots grows in a linear way
– Store only a part of the AST:
  – all paths to local buckets
– Deleted sub-trees:
  – replaced by the NNID of the leftmost peer
[Figure: full AST with pivots p_1 to p_14 and leaves BID 1, NNID 2 to NNID 8, next to the reduced AST with pivots p_1, p_2, p_3, p_4, p_7, p_8]

AST: Logarithmic Replication (cont.)
– Resulting tree
  – replication of pivots grows in a logarithmic way
[Figure: reduced AST with pivots p_1, p_2, p_3, p_4, p_7, p_8 and leaves NNID 2, NNID 3, BID 1, NNID 5]

VPT* Structure
– Similar to the GHT*, but ball partitioning is used for the AST
– Based on the Vantage Point Tree [Yia93]
  – inner nodes have one pivot and a radius
  – different traversing conditions
[Figure: nested balls with pivots p_1, p_2, p_3 and radii r_1, r_2, r_3, and the tree with nodes p_1(r_1), p_2(r_2), p_3(r_3)]

M-Chord: The Metric Chord
– Transform the metric space to a one-dimensional domain
  – use M-Index, a generalized version of iDistance
– Divide the domain into intervals
  – assign each interval to a peer
– Use the Chord P2P protocol for navigation
– Alternatively, the Skip graphs distributed protocol can be used

M-Chord: Indexing the Distance
iDistance: indexing technique for vector domains
– cluster analysis = centers = reference points p_i
– assign iDistance keys to objects
– range query R(q,r): identify intervals of interest
Generalization to metric spaces
– select pivots
– then partition: Voronoi-style
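
A minimal sketch of the key assignment and interval selection, assuming the usual iDistance formulation: an object o whose closest pivot is p_i gets the key i·c + d(p_i, o), where c is a constant larger than any distance in the data. The function names and the constant `c` are illustrative; the uniforming function h used by M-Chord is not modelled.

```python
def idistance_key(o, pivots, d, c):
    """Key of o: index of its closest pivot times c, plus the distance to that pivot."""
    i, dist = min(((i, d(p, o)) for i, p in enumerate(pivots)), key=lambda t: t[1])
    return i * c + dist


def query_intervals(q, r, pivots, d, c):
    """Key intervals that may contain answers to R(q, r), one per cluster."""
    intervals = []
    for i, p in enumerate(pivots):
        dq = d(p, q)
        # By the triangle inequality, any answer o in cluster i satisfies
        # dq - r <= d(p, o) <= dq + r; clip to the key range of the cluster.
        lo = i * c + max(dq - r, 0)
        hi = i * c + min(dq + r, c)
        if lo <= hi:
            intervals.append((lo, hi))
    return intervals


def dist(a, b):
    return abs(a - b)


pivots, c = [0, 100], 1000
print(idistance_key(98, pivots, dist, c))       # 1002
print(query_intervals(5, 2, pivots, dist, c))   # [(3, 7), (1093, 1097)]
```

Objects whose keys fall into a returned interval are only candidates; each still has to be checked with d(q,o) ≤ r.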

M-Chord: Chord Protocol
Chord
– Peer-to-peer navigation protocol
– Peers are responsible for intervals of keys
– O(log n) hops to locate the node storing a key
M-Chord
– set the iDistance domain
– make it uniform: function h
– use Chord on this domain
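
How peers become responsible for key intervals can be illustrated with Chord's successor rule: a key belongs to the first peer whose identifier is equal to or follows the key on the ring. The sketch below resolves this on a sorted list held in one place; it deliberately omits finger tables and message routing, and the function name and ring size are assumptions.

```python
from bisect import bisect_left


def responsible_peer(peer_ids, key, ring_size):
    """Identifier of the peer responsible for `key` (its successor on the ring)."""
    ids = sorted(peer_ids)
    i = bisect_left(ids, key % ring_size)   # first peer id >= key
    return ids[i % len(ids)]                # wrap around past the largest id


peers = [10, 20, 30]
print(responsible_peer(peers, 15, 64))   # 20
print(responsible_peer(peers, 30, 64))   # 30
print(responsible_peer(peers, 35, 64))   # 10
```

Here peer 20 covers the key interval (10, 20], and peer 10 covers the interval that wraps around the end of the ring.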

M-Chord: Range Query
– Node N_q initiates the search
– Determine intervals
  – generalized iDistance
– Forward requests to peers on intervals
– Search in the nodes
  – using local organization
– Merge the received partial answers

M-CAN: The Metric CAN
– Based on the Content-Addressable Network (CAN)
  – a DHT navigating in an N-dimensional vector space
– The idea:
  1. Map the metric space to a vector space
     – given N pivots p_1, p_2, …, p_N, transform every o into a vector F(o)
  2. Use CAN to
     – distribute the vector space zones among the nodes
     – navigate in the network

CAN: Principles & Navigation
– CAN: the principles
  – the space is divided into zones
  – each node “owns” a zone
  – nodes know their neighbors
– CAN: the navigation
  – greedy routing
  – in every step, move to the neighbor closer to the target location
[Figure: 2-dimensional vector space (x, y) divided into zones]
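
Greedy routing can be sketched on a toy network in which every node is represented by the center of its zone. This is a simplification: real CAN nodes route on zone boundaries and handle joins and failures. The node names, the dictionary layout and the 2×2 grid are made up for illustration.

```python
import math


def greedy_route(nodes, start, target):
    """Follow neighbors that are strictly closer to `target`; return the path taken.

    `nodes` maps a node name to (zone_center, list_of_neighbor_names).
    """
    path = [start]
    current = start
    while True:
        center, neighbors = nodes[current]
        best = min(neighbors, key=lambda n: math.dist(nodes[n][0], target))
        if math.dist(nodes[best][0], target) >= math.dist(center, target):
            return path              # no neighbor is closer: we have arrived
        current = best
        path.append(current)


# Unit square split into four equal zones A, B (bottom row) and C, D (top row).
grid = {
    "A": ((0.25, 0.25), ["B", "C"]),
    "B": ((0.75, 0.25), ["A", "D"]),
    "C": ((0.25, 0.75), ["A", "D"]),
    "D": ((0.75, 0.75), ["B", "C"]),
}
print(greedy_route(grid, "A", (0.9, 0.6)))   # ['A', 'B', 'D']
```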

M-CAN: Contractiveness & Filtering
– Use L_∞ as the distance measure
  – the mapping F is contractive
– More pivots ⇒ better filtering
  – but CAN routing works better with fewer dimensions
– Additional filtering
  – some pivots are only used for filtering data (inside the explored nodes)
  – they are not used for mapping into the CAN vector space
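
A sketch of the pivot mapping and the filter it enables, assuming F(o) = (d(o,p_1), …, d(o,p_N)). By the triangle inequality, L_∞(F(x), F(y)) ≤ d(x,y), so an object whose mapped vector is farther than r from F(q) can be discarded without computing d(q,o). The function names and the sample data are illustrative.

```python
def pivot_map(o, pivots, d):
    """F(o): the vector of distances from o to each pivot."""
    return tuple(d(o, p) for p in pivots)


def l_inf(u, v):
    """Maximum coordinate-wise difference between two vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


def may_qualify(Fq, Fo, r):
    """Necessary condition for d(q, o) <= r; False means o is safely discarded."""
    return l_inf(Fq, Fo) <= r


def city_block(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


pivots = [(0, 0), (10, 0)]
q, o1, o2 = (2, 2), (3, 3), (9, 9)
Fq = pivot_map(q, pivots, city_block)
print(may_qualify(Fq, pivot_map(o1, pivots, city_block), 3))   # True
print(may_qualify(Fq, pivot_map(o2, pivots, city_block), 3))   # False
```

Objects that pass the filter are only candidates and must still be verified with the original distance.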

Infrastructure Independence: MESSIF
Metric Similarity Search Implementation Framework
– Metric space (D,d)
  – Vectors: L_p and quadratic form
  – Strings: (weighted) edit and protein sequence
– Operations: insert, delete, range query, k-NN query, incremental k-NN
– Storage: volatile memory, persistent memory
– Centralized index structures
– Distributed index structures
– Communication (Net)
– Performance statistics

MUFIN Overlays
Metric index structures:
– Object
– Bucket: sequential scan
– Index structure: M-Tree, D-Index, M-Index
– Distributed index structure: GHT*, VPT*, M-Chord, M-CAN
Operations: insert, delete, queries

MUFIN Overview
– Peer-to-Peer Networks: multi-overlay structure
– Forms: range, k-nearest, complex
– Strategies: precise, approximate, social
– Data operations: insert, delete, features
– External index, feature extraction
– Web service
  – Universal: batch, telnet, GUI
  – Specialized: image web interface

Applications: a Word Cloud

Concepts of the Image Search
[Figure: a query image compared against the image base: similar?]

Images and their Descriptors
[Figure: image level (R, G, B) mapped to the descriptor level]

CoPhIR: Content-based Photo Image Retrieval
– Largest publicly available collection of high-quality image metadata: 106 million images
– Each image contains:
  – Five MPEG-7 visual descriptors (VDs): Scalable Color, Color Structure, Color Layout, Edge Histogram, Homogeneous Texture
  – Other textual information: title, tags, comments, etc.
– Photos have been crawled from the Flickr photo-sharing site.
– images + metadata + MPEG-7 VDs

Image Search Demo
[Diagram: MUFIN search engine connecting data & queries, index structure, and infrastructure]
– Scalability: M-Chord + M-Index
– Extensibility: CoPhIR descriptors (edge histogram, color structure, scalable color, homogeneous texture, color layout)
– Infrastructure: 6 x IBM server x3400, 2 servers used

MUFIN demos

MUFIN Future Research Directions
MUFIN: a universal similarity search technology
Research directions in:
– Core technology
– Applications
– A style of computing
[Diagram: MUFIN search engine connecting data & queries, index structure, and infrastructure; labels: scalability (P2P structures), extensibility (metric space), performance tuning]

MUFIN Future Research Directions (October 28, 2011)
[Diagram: MUFIN search engine connecting data & queries, index structure, and infrastructure]
– More scalable, reliable, robust
  – Multi-layer architectures
  – Self-organizing architectures
– New query types
  – Flexible sub-sequence matching
  – Efficient multi-feature processing
– New style of computing
  – Cloud computing
  – Similarity search as a service

Major Applications
– Images: sub-image retrieval, ranking, annotation, categorization, benchmarking
– Biometrics: face recognition, fingerprint recognition, gait recognition
– Signals: audio recognition, time series similarity
– Videos: event detection

A New Style of Computing
From the project-oriented approach towards a similarity cloud for multimedia findability through similarity searching.
Advantages:
– The cloud makes similarity search accessible to common users
– Computational resources are shared: users don’t need to maintain any hardware infrastructure
– Users don’t need to care about the OS, security, software platform, etc.