Multimedia Information Retrieval


Multimedia Information Retrieval
- Unlike alphanumeric data, multimedia data do not have any inherent semantic structure
- Achieving symmetry between annotation and query is difficult
- Retrieval is based on similarity between the query and the stored information instead of exact matching
- Stored information is represented using indexing

IR Model
- Information is preprocessed to extract features and semantic content
- It is indexed based on these features and semantics
- The user's query is processed and its main features are extracted
- The query's features are then compared with the features or index of each information item in the database
- Information items whose features are similar to those of the query are retrieved and presented to the user

Design Issues
- Indexing: a mechanism that reduces the search space of a search operation without losing any relevant information
- Similarity computation: should be easy to compute and should conform to human judgement

Performance Measures
- Retrieval speed, recall, precision
- Recall measures the ability to retrieve relevant information items from the database
  - defined as the ratio between the number of retrieved relevant items and the total number of relevant items in the database
- Precision measures retrieval accuracy
  - defined as the ratio between the number of retrieved relevant items and the total number of retrieved items
- Recall and precision are usually considered together: high recall with low precision, or high precision with low recall
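
The two ratios translate directly into code; a minimal sketch, assuming the retrieved and relevant items are available as sets of identifiers:

```python
# A minimal sketch of the recall/precision definitions above.
# 'retrieved' and 'relevant' are assumed to be sets of item identifiers.

def recall(retrieved, relevant):
    """Ratio of retrieved relevant items to all relevant items in the database."""
    return len(retrieved & relevant) / len(relevant)

def precision(retrieved, relevant):
    """Ratio of retrieved relevant items to all retrieved items."""
    return len(retrieved & relevant) / len(retrieved)

# Example: 8 relevant items in the database, 10 items retrieved, 6 of them relevant
# -> recall = 6/8 = 0.75, precision = 6/10 = 0.6
```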

Text Retrieval
- Text may be used to annotate other media such as audio, images and video, so that conventional IR techniques can be used to retrieve multimedia information
- Boolean IR systems or text-pattern search systems
- Substantial effort is spent in analyzing the contents of the documents and in generating keywords and indices
- Boolean queries are keywords connected with logical operators (AND, OR, NOT)

File Structures
- Flat files
- Inverted files
  - for each term, a separate index entry stores the document identifiers of all documents containing the term
  - each term and the document IDs containing it are organized into one row
  - searching and retrieval are fast because only the rows containing the query terms need to be retrieved; there is no need to scan the whole database
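
A minimal sketch of such an inverted file, with made-up document IDs and contents, showing how a Boolean query touches only the rows of its query terms:

```python
from collections import defaultdict

# A small sketch of an inverted file: one row per term, holding the IDs of all
# documents that contain the term. Document IDs and contents are illustrative.

documents = {
    "R99":  "multimedia information retrieval systems",
    "R155": "information indexing for text databases",
    "R166": "content based image retrieval",
}

inverted_file = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.split():
        inverted_file[term].add(doc_id)

# Answering the Boolean query "information AND retrieval" only touches two rows,
# not the whole collection:
hits = inverted_file["information"] & inverted_file["retrieval"]
print(hits)   # {'R99'}
```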

Extensions
- Nearness parameters used in query specification help define the topic more precisely and therefore increase the probable relevance of the retrieved items
- "Within sentence" and "adjacency" specifications in queries
- Term location information is included in the inverted file
  - term i : document id, paragraph no., sentence no., word no.
- For example, an inverted file may have the following entries:
  - information: R99, 10, 8, 3; R155, 15, 3, 6; R166, 2, 3, 1
  - retrieval: R77, 9, 7, 2; R99, 10, 8, 4; R166, 10, 2, 5
  - here the two terms are adjacent only in document R99 (paragraph 10, sentence 8, words 3 and 4), so a query for the phrase "information retrieval" would match R99
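
A small sketch of how the positional entries above support an adjacency check (the posting layout follows the "term i" format on this slide; the data are the example entries):

```python
# Each posting is (document id, paragraph no., sentence no., word no.),
# copied from the example entries above.

postings = {
    "information": [("R99", 10, 8, 3), ("R155", 15, 3, 6), ("R166", 2, 3, 1)],
    "retrieval":   [("R77", 9, 7, 2),  ("R99", 10, 8, 4),  ("R166", 10, 2, 5)],
}

def adjacent(term1, term2):
    """Documents where term2 immediately follows term1 in the same sentence."""
    result = set()
    for doc1, par1, sen1, word1 in postings[term1]:
        for doc2, par2, sen2, word2 in postings[term2]:
            if (doc1, par1, sen1) == (doc2, par2, sen2) and word2 == word1 + 1:
                result.add(doc1)
    return result

print(adjacent("information", "retrieval"))   # {'R99'}
```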

Indexing
- Stop words -- grammatical function words, such as "of," "the," and "a"
- Stemming -- reducing words to a common root form
- Thesaurus -- list of synonyms
- Weighting -- term significance derived from occurrence frequency within a document and among different documents
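
A minimal sketch of stop-word removal and frequency-based weighting (the stop-word list, documents, and the tf-idf style formula are illustrative; stemming is omitted):

```python
import math
from collections import Counter

STOP_WORDS = {"of", "the", "a", "and", "in", "for"}

documents = {
    "d1": "retrieval of multimedia information",
    "d2": "indexing and retrieval of text",
    "d3": "a study of image features",
}

def tokens(text):
    # drop stop words; a real system would also stem the remaining words
    return [w for w in text.lower().split() if w not in STOP_WORDS]

doc_terms = {d: Counter(tokens(t)) for d, t in documents.items()}
doc_freq = Counter(term for terms in doc_terms.values() for term in terms)
n_docs = len(documents)

def weight(term, doc_id):
    """Weight grows with frequency in the document, shrinks with spread across documents."""
    tf = doc_terms[doc_id][term]
    return tf * math.log(n_docs / doc_freq[term])

print(weight("multimedia", "d1"))   # rare term -> higher weight (log 3)
print(weight("retrieval", "d1"))    # term in two of three documents -> lower weight (log 1.5)
```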

Relevance Feedback
- Query modification
  - terms occurring in documents previously identified as relevant are added to the original query, or the weight of such terms is increased
  - terms occurring in documents previously identified as irrelevant are deleted from the query, or the weight of such terms is reduced
- Document modification
  - terms in the query, but not in the user-judged relevant documents, are added to the document index list with an initial weight
  - weights of index terms in the query and also in relevant documents are increased by a certain amount
  - weights of index terms not in the query but in the relevant documents are decreased by a certain amount
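
A hedged sketch of the query-modification side, with illustrative adjustment amounts (ALPHA, BETA) that are not prescribed by the slide:

```python
# Query and document vectors are plain term -> weight dicts.

ALPHA, BETA = 0.5, 0.25   # illustrative adjustment amounts

def modify_query(query, relevant_docs, irrelevant_docs):
    new_query = dict(query)
    # increase (or add) weights of terms occurring in relevant documents
    for doc in relevant_docs:
        for term, w in doc.items():
            new_query[term] = new_query.get(term, 0.0) + ALPHA * w
    # reduce (or remove) weights of terms occurring in irrelevant documents
    for doc in irrelevant_docs:
        for term, w in doc.items():
            if term in new_query:
                new_query[term] = max(0.0, new_query[term] - BETA * w)
    return {t: w for t, w in new_query.items() if w > 0.0}

query = {"image": 1.0, "retrieval": 1.0}
relevant = [{"image": 0.8, "colour": 0.6}]
irrelevant = [{"retrieval": 0.9, "text": 0.7}]
print(modify_query(query, relevant, irrelevant))
```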

Problems with Annotation
- Automatic generation of descriptive keywords, or extraction of semantic information to build classification hierarchies, is hard for broad varieties of images
- Involving human operators makes the process time-consuming and subjective
  - retrieval fails if the user forms a query based on keywords not employed by the operator
  - retrieval fails if the query refers to elements of image content that were not described
  - certain visual properties, such as textures and shapes, are difficult or nearly impossible to describe with text for general-purpose usage

Content-based IR
- Retrieve visual data using queries based on the visual content of an image/video: patterns, colors, textures, shapes, layout and location information
  - e.g., verifying that a trademark or logo has not been used by another company
  - e.g., comparing fabric patterns
- Search is driven by first establishing one or more sample images and then identifying specific features of those sample images which need to match images from the database

Audio Search and Retrieval
- Keywords can be highly subjective because of different perspectives or even different taxonomies
- Hard to browse directly, since audio must be auditioned in real time (unlike video, which can be keyframed)
- Two categories: speech and non-speech
  - with speech, indexing and retrieval are based on obtaining the spoken words, either manually or by speech recognition techniques
  - with non-speech, indexing and retrieval may be based on text annotation (but will it help a query like "find the first occurrence of the note G-sharp"?)

Image Database Issues
- Selection, derivation, and computation of image features and objects that provide useful query expressiveness
- Retrieval methods based on similarity, as opposed to exact matching
- A user interface that supports the visual expression of queries and allows query refinement and navigation of results
- Indexing that is compatible with the expressiveness of the queries
- A system architecture that supports this approach

Color Analysis
- Color distribution is represented as a histogram of intensity values, each of whose bins corresponds to a range of pixel values
- Histograms are compared by an intersection operation
  - the resulting sum may be interpreted as the number of pixels common to both histograms
  - this value may be normalized by the total number of pixels in one of the two histograms
- Computationally expensive -- O(NM), where N is the number of histogram bins and M is the total number of images in the database
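
A minimal sketch of histogram intersection, assuming both images are already reduced to intensity histograms with the same binning:

```python
import numpy as np

def histogram(image, bins=16):
    # intensity histogram; each bin covers a range of pixel values
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist

def intersection(h_query, h_image):
    """Number of pixels common to both histograms, normalized by the query size."""
    common = np.minimum(h_query, h_image).sum()
    return common / h_query.sum()

query_img = np.random.randint(0, 256, (64, 64))
db_img = np.random.randint(0, 256, (64, 64))
print(intersection(histogram(query_img), histogram(db_img)))
```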

Color Analysis (contd.)
- Reduce search time by reducing the number of histogram bins
  - transform the RGB representation (coarse segmentation of the color space)
  - apply a clustering technique to determine the K best colors in a given color space (the clustering process takes into account the color distribution of images over the entire database)
  - a small number of histogram bins tends to capture the majority of pixels of an image; only the largest bins in terms of pixel counts need be selected to represent any histogram
  - as long as the bins of the query and image histograms are appropriately matched, the intersection may be computed over this reduced set
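
A hedged sketch of intersecting only the largest bins of the query histogram (the choice of K is illustrative):

```python
import numpy as np

def largest_bins(hist, k):
    # indices of the K most populated bins
    return np.argsort(hist)[-k:]

def reduced_intersection(h_query, h_image, k=8):
    # match the image histogram on the query's largest bins, then intersect
    bins = largest_bins(h_query, k)
    common = np.minimum(h_query[bins], h_image[bins]).sum()
    return common / h_query[bins].sum()
```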

Color Analysis (contd.)
- Disadvantages:
  - histogram-based similarity computation lacks information about location (this problem may be addressed by dividing an image into sub-areas and calculating a histogram for each sub-area)
  - image representations in the image database as well as in queries have to be the same

Texture Analysis
- Statistical methods are used to characterize texture in terms of the spatial distribution of image intensity
- Tamura features:
  - contrast: quantification is based on the statistical distribution of pixel intensities
  - coarseness: a measure of the granularity of the texture
  - directionality: to compute this measure, a gradient vector is calculated at each pixel
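
A hedged sketch of the contrast feature only, using one common formulation (standard deviation divided by the fourth root of the kurtosis); coarseness and directionality need neighbourhood and gradient computations and are omitted:

```python
import numpy as np

def tamura_contrast(image):
    # one common formulation: contrast = sigma / kurtosis**(1/4),
    # where kurtosis = mu4 / sigma**4 (an assumption, not taken from the slide)
    pixels = image.astype(np.float64).ravel()
    mean = pixels.mean()
    sigma2 = pixels.var()
    if sigma2 == 0:
        return 0.0
    mu4 = ((pixels - mean) ** 4).mean()
    kurtosis = mu4 / sigma2 ** 2
    return np.sqrt(sigma2) / kurtosis ** 0.25
```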

Shape Analysis
- Histogram of significant edges
- Ordered list of interest points
- Chain-code-based shape representation and similarity measure

Chain Code-based Shape Analysis
- 4-directional or 8-directional codes
- Grid spacing
- Normalization process -- starting point, rotation, scale

Starting Point Normalization
- Treat the chain code generated by an arbitrary starting point as a circular sequence of direction numbers
- Redefine the starting point such that the resulting sequence of numbers forms an integer of minimum magnitude
  - 0303332221211010 (arbitrary starting point)
  - 0030333222121101 (after normalizing)
- After normalizing, the shape boundary has a unique chain code (for a fixed orientation and grid size)
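
A minimal sketch of the normalization: generate every circular rotation of the code and keep the one that forms the smallest integer:

```python
def normalize_start(code):
    # treat the code as circular and pick the rotation of minimum magnitude
    rotations = [code[i:] + code[:i] for i in range(len(code))]
    return min(rotations)

print(normalize_start("0303332221211010"))   # '0030333222121101', as in the example above
```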

Shape Number
- Rotation normalization is needed because a boundary has a different chain code after rotation; rotation changes the spatial relationships between the grid and the boundary
- The first difference of the chain code reflects spatial relationships between boundary segments, which are independent of rotation
  - the difference is computed by counting (in a counterclockwise direction) the number of directions that separate two adjacent elements of the code
- The shape number of a boundary is defined as the first difference of smallest magnitude
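
A minimal sketch of the first difference and the shape number for a 4-directional code (the example code 10103322 is illustrative):

```python
def first_difference(code, directions=4):
    # each element counts, counterclockwise, the directions separating two
    # adjacent code elements; the code is treated as circular
    digits = [int(c) for c in code]
    prev = digits[-1:] + digits[:-1]          # previous element of each position
    return "".join(str((a - b) % directions) for a, b in zip(digits, prev))

def shape_number(code, directions=4):
    """First difference rotated to its smallest magnitude."""
    diff = first_difference(code, directions)
    return min(diff[i:] + diff[:i] for i in range(len(diff)))

print(first_difference("10103322"))   # '33133030'
print(shape_number("10103322"))       # '03033133'
```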

Unique Shape Number
- Need to make the shape boundary invariant to rotation and scale
- Solution -- orient the resampling grid along the principal axes of the shape boundary; then the grid and the boundary have fixed spatial relationships
- The major axis is defined as the line segment between the two farthest points on the boundary
- The minor axis is perpendicular to the major axis, and its length is such that the rectangle formed by the two axes encloses the shape boundary

Scale Normalization
- Eccentricity of the boundary -- ratio of the major axis to the minor axis
- Basic rectangle -- the rectangle formed by the major and minor axes of a boundary
- The shape number obtained using the basic rectangle will be unique

Unique Chain Code Algorithm
- Select the first digit as any number within the chain code direction range, say 0
- The second digit differs from the first digit by an amount determined by the first digit of the shape number
- Use the shape number to determine the rest of the digits in the unique chain code
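
A hedged sketch of this reconstruction, reading the shape-number digits as the successive counterclockwise steps around the boundary (the last digit only closes the loop; the starting digit 0 is the arbitrary choice mentioned above):

```python
def chain_code_from_shape_number(shape_num, first_digit=0, directions=4):
    # each new code digit is the previous digit plus the corresponding
    # shape-number step, modulo the number of directions
    code = [first_digit]
    for d in shape_num[:-1]:                  # the final digit only closes the loop
        code.append((code[-1] + int(d)) % directions)
    return "".join(str(c) for c in code)

print(chain_code_from_shape_number("03033133"))   # '00332121', whose shape number is again '03033133'
```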

Similarity Measurement
- The distance d between two boundaries is defined as the number of grid cells not commonly covered by the two boundaries
  - boundaries with the same unique chain code have distance 0
- Obtain a binary number for each boundary
- XOR the binary numbers of the two boundaries; the number of 1s in the result is the distance d
- Similarity is 1 - (d/N)
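
A minimal sketch of this distance, assuming N is the total number of grid cells and each boundary is given as the set of cells it covers:

```python
import numpy as np

def boundary_similarity(cells_a, cells_b, n_cells):
    # encode each boundary as a binary vector over the grid, XOR them,
    # and count the cells covered by only one boundary
    a = np.zeros(n_cells, dtype=bool)
    b = np.zeros(n_cells, dtype=bool)
    a[list(cells_a)] = True
    b[list(cells_b)] = True
    d = np.count_nonzero(a ^ b)
    return 1 - d / n_cells

# Two boundaries covering overlapping sets of cells on a 16-cell grid
print(boundary_similarity({1, 2, 3, 4}, {2, 3, 4, 5}, 16))   # 0.875
```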

Indexing and Retrieval of Video
- Video is normally made up of a number of logical units or segments (video shots)
  - frames depicting the same scene
  - frames signifying a single camera operation
  - frames containing a distinct event or action (signifying the presence of the same object)
- Consecutive frames on either side of a camera break generally display a significant quantitative change in content (other camera operations, such as dissolve, wipe, fade-in, and fade-out, require more sophisticated measures to quantify the change)

Shot Detection
- Difference metrics between frames are based on the comparison of pixel intensity histograms
- Difference thresholds are chosen such that all shot boundaries are detected while false detections are minimized
- Dealing with gradual changes requires more sophisticated techniques
- Indexing is done by finding a representative frame; the features of this frame are extracted and indexed based on text, color, shape, and/or texture
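
A hedged sketch of cut detection on grayscale frames; the bin count and threshold are illustrative and would be tuned so that all camera breaks are caught with few false detections:

```python
import numpy as np

def frame_histogram(frame, bins=64):
    # normalized intensity histogram of one grayscale frame
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / hist.sum()

def detect_cuts(frames, threshold=0.4):
    """Return indices i where the histogram difference between frame i-1 and frame i exceeds the threshold."""
    cuts = []
    prev = frame_histogram(frames[0])
    for i, frame in enumerate(frames[1:], start=1):
        curr = frame_histogram(frame)
        if np.abs(curr - prev).sum() > threshold:
            cuts.append(i)
        prev = curr
    return cuts
```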