FATIH CAKIR MELIHCAN TURK F. SUKRU TORUN AHMET CAGRI SIMSEK Content-Based Image Retrieval using the Bag-of-Words Concept.

Outline
Introduction; Bag-of-Words Concept (Dictionary Formation); Content-Based Image Retrieval using BoW; Results; Conclusion; References

Introduction: Motivation
CBIR motivation: the huge amount of multimedia content demands more sophisticated analysis than simple textual processing of metadata such as annotations or keywords. Traditional methods for retrieving images are not very satisfactory and may not meet user demand. For example, typing ‘Apple’ into Google Images returns Apple products as well as the apple fruit. The main reason is the ambiguity of language. There are several other limitations as well.

Introduction: Motivation
CBIR systems compensate for such issues by analyzing the actual ‘content’ of the image, which yields a more effective feature for describing the image than user-defined metadata. Content may be texture, color, or any other information that can be derived from the image itself. One promising idea is to represent images as ‘words’, analogous to text retrieval solutions: document ~ image, term (word) ~ visual word. This was first introduced in [3].

Bag of ‘words’ Concept
[Figure: an object and its bag of ‘words’]

Bag-of-Words Concept: Analogy to documents

China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value.

Keywords: China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, foreign, increase, trade, value

Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted point by point to visual centers in the brain; the cerebral cortex was a movie screen, so to speak, upon which the image in the eye was projected. Through the discoveries of Hubel and Wiesel we now know that behind the origin of the visual perception in the brain there is a considerably more complicated course of events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a step-wise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image.

Keywords: sensory, brain, visual, perception, retinal, cerebral cortex, eye, cell, optical nerve, image, Hubel, Wiesel

Bag-of-Words Concept: Analogy to documents
Each image can be represented as a histogram, where each bin corresponds to a visual word in the dictionary and the value of the bin is the frequency of occurrence of that visual word in the image.
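The histogram representation described above can be sketched in a few lines. This is a minimal illustration and not the presenters' implementation: the function names `nearest_word` and `bow_histogram`, the toy 2-D descriptors, and the three-word dictionary are all hypothetical (real SIFT descriptors are 128-dimensional). Each descriptor is assigned to its nearest visual word and the assignments are counted.

```python
import math

def nearest_word(descriptor, dictionary):
    """Return the index of the visual word closest to the descriptor (Euclidean)."""
    return min(
        range(len(dictionary)),
        key=lambda i: math.dist(descriptor, dictionary[i]),
    )

def bow_histogram(descriptors, dictionary):
    """Count how often each visual word occurs among an image's descriptors."""
    hist = [0] * len(dictionary)
    for d in descriptors:
        hist[nearest_word(d, dictionary)] += 1
    return hist

# Toy data (assumed for illustration): 3 visual words, 4 descriptors, all 2-D.
dictionary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
descriptors = [(0.5, 0.2), (9.0, 1.0), (0.1, 0.3), (1.0, 9.5)]

histogram = bow_histogram(descriptors, dictionary)
print(histogram)  # one bin per visual word
```

Raw counts are used here; whether the presenters normalized the histograms is not stated in the slides.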

Bag of ‘words’ Concept
Hence, we consider an image as a document. Just as words (terms) define a document, visual words define an image. The words of a text are known in advance, but what are ‘visual words’? We need to define a dictionary.

Bag of ‘words’ Concept: Construct a dictionary
[Figure labels: feature detection & representation; codewords dictionary; image representation]

Dictionary Formation: Feature Extraction
Represent each patch/interest point with SIFT descriptors [1, Lowe ‘99].

Dictionary Formation: Vector Quantization
[Figure: vector quantization]

Example Dictionary
[Figure]

Example ‘Visual words’
[Figure]

Image Representation
[Figure: histogram; axes: frequency, codewords]

Content-Based Image Retrieval using BoW
We saw how to represent images using the BoW concept: with histograms. This maps the classical text representation onto the image domain. Hence, given a query image, we can return ranked results based on the similarity of histograms. The task is category search: retrieving an arbitrary image representative of a specific class. We used a subset of the Caltech 101 dataset [2].
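Ranking by histogram similarity can be sketched as follows. This is an illustrative example under stated assumptions: the function name `rank_by_similarity`, the image names, and the three-bin histograms are hypothetical, and Euclidean distance is used as the dissimilarity (smaller distance means more similar).

```python
import math

def rank_by_similarity(query_hist, database):
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = [(name, math.dist(query_hist, hist)) for name, hist in database.items()]
    return sorted(scored, key=lambda pair: pair[1])

# Toy database of BoW histograms (assumed values, 3 visual words each).
database = {
    "img_a": [2, 1, 1],
    "img_b": [0, 4, 0],
    "img_c": [2, 1, 0],
}
query = [2, 1, 1]

ranking = rank_by_similarity(query, database)
for name, distance in ranking:
    print(name, round(distance, 3))
```

Taking the first k entries of `ranking` gives the top-k result list for the query.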

Content-Based Image Retrieval using BoW
Given a query image, return the top k most similar results. A ‘positive’ or ‘true’ match is a returned image that belongs to the same category as the query. The mean average precision (MAP) is computed for each category using 10 query images.
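The evaluation measure can be made concrete with a small sketch. The slides do not spell out the exact formula, so this assumes one common variant: average precision is the mean of the precision values at each rank where a true match occurs, normalized by the number of true matches in the returned list, and MAP is the mean over queries. The function names and the toy labels are hypothetical.

```python
def average_precision(ranked_labels, query_label):
    """Average precision of one ranked result list.

    ranked_labels: category labels of the returned images, best match first.
    Assumption: normalize by the number of true matches that were returned.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(results):
    """results: list of (query_label, ranked_labels) pairs, one per query."""
    aps = [average_precision(labels, query) for query, labels in results]
    return sum(aps) / len(aps)

# Two toy queries (assumed data).
results = [
    ("bike", ["bike", "camera", "bike", "bike"]),
    ("camera", ["face", "camera", "face"]),
]
map_value = mean_average_precision(results)
print(round(map_value, 4))
```

Another common variant divides by the total number of relevant images in the database instead; the two agree only when every relevant image is returned.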

Content-Based Image Retrieval using BoW: Details
For vector quantization, K-means is used with K = 3000. Hence the dictionary contains 3000 visual words, and each histogram has 3000 bins, one per visual word. The L2 norm (Euclidean distance) is used as the similarity measure. Visual words are represented using Lowe’s SIFT descriptors. Interest points are extracted using DoG (Difference of Gaussians). For each of the 18 categories, 10 query images are used, and the average MAP value is taken as that category’s success rate.
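The dictionary-formation step described above can be sketched with a plain Lloyd-style k-means. This is a toy illustration, not the presenters' code: the function name `kmeans`, the fixed iteration count, the seed, and the six 2-D points are assumptions, and K is 2 here rather than 3000. The returned cluster centers play the role of the visual words.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Cluster points into k groups and return the cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers

# Toy descriptors (assumed data): two well-separated groups in 2-D.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = sorted(kmeans(points, k=2))
print(centers)
```

For a dictionary of thousands of words over 128-dimensional descriptors, an optimized library implementation would normally be preferred over this pure-Python loop.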

Results: MAP of category-based queries
[Figure: MAP per category]

Results: MAP of varied dictionary sizes
[Figure: MAP versus dictionary size]

Results
The ‘Motorbikes’ category has the highest MAP (0.70); the lowest is the ‘camera’ category (0.07). The average of the MAP values is 0.25. As the dictionary size gets larger (i.e. more visual words), images are represented more accurately, hence MAP values increase. Performance seems to converge after K > …

Conclusion
Content-Based Image Retrieval systems have gained considerable interest among researchers, since multimedia files such as images and videos have entered our lives dramatically over the last decade. Textual analysis alone is not sufficient for effective retrieval systems. Analogous to document representation, an image can be described by ‘visual words’: the BoW concept. Using only this feature, the results are highly satisfying.

References
[1] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[2] Caltech101/
[3] J. Sivic and A. Zisserman. Video Google: A text retrieval approach to object matching in videos. In Proc. ICCV, 2003.

Thank You! Questions and Demo!