Image Retrieval and Annotation via a Stochastic Modeling Approach

Slides:

Advertisements

Similar presentations

LEARNING SEMANTICS OF WORDS AND PICTURES TEJASWI DEVARAPALLI.

Advertisements

Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki

Context-based object-class recognition and retrieval by generalized correlograms by J. Amores, N. Sebe and P. Radeva Discussion led by Qi An Duke University.

Carolina Galleguillos, Brian McFee, Serge Belongie, Gert Lanckriet Computer Science and Engineering Department Electrical and Computer Engineering Department.

ICIP 2000, Vancouver, Canada IVML, ECE, NTUA Face Detection: Is it only for Face Recognition?  A few years earlier  Face Detection Face Recognition 

Young Deok Chun, Nam Chul Kim, Member, IEEE, and Ick Hoon Jang, Member, IEEE IEEE TRANSACTIONS ON MULTIMEDIA,OCTOBER 2008.

HMM-BASED PATTERN DETECTION. Outline  Markov Process  Hidden Markov Models Elements Basic Problems Evaluation Optimization Training Implementation 2-D.

Morris LeBlanc.  Why Image Retrieval is Hard?  Problems with Image Retrieval  Support Vector Machines  Active Learning  Image Processing ◦ Texture.

1 Learning to Detect Objects in Images via a Sparse, Part-Based Representation S. Agarwal, A. Awan and D. Roth IEEE Transactions on Pattern Analysis and.

Image Search Presented by: Samantha Mahindrakar Diti Gandhi.

CS335 Principles of Multimedia Systems Content Based Media Retrieval Hao Jiang Computer Science Department Boston College Dec. 4, 2007.

Support Vector Machines Pattern Recognition Sergios Theodoridis Konstantinos Koutroumbas Second Edition A Tutorial on Support Vector Machines for Pattern.

Image Categorization by Learning and Reasoning with Regions Yixin Chen, University of New Orleans James Z. Wang, The Pennsylvania State University Published.

Novelty Detection and Profile Tracking from Massive Data Jaime Carbonell Eugene Fink Santosh Ananthraman.

1 An Empirical Study on Large-Scale Content-Based Image Retrieval Group Meeting Presented by Wyman

Presented by Zeehasham Rasheed

Smart Traveller with Visual Translator for OCR and Face Recognition LYU0203 FYP.

Multiple Object Class Detection with a Generative Model K. Mikolajczyk, B. Leibe and B. Schiele Carolina Galleguillos.

A fuzzy video content representation for video summarization and content-based retrieval Anastasios D. Doulamis, Nikolaos D. Doulamis, Stefanos D. Kollias.

Jia Li, Ph.D. The Pennsylvania State University Image Retrieval and Annotation via a Stochastic Modeling Approach.

Content-based Image Retrieval (CBIR)

SIEVE—Search Images Effectively through Visual Elimination Ying Liu, Dengsheng Zhang and Guojun Lu Gippsland School of Info Tech,

Content-Based Video Retrieval System Presented by: Edmund Liang CSE 8337: Information Retrieval.

From Pixels to Semantics – Research on Intelligent Image Indexing and Retrieval James Z. Wang PNC Technologies Career Dev. Professorship School of Information.

Wavelet-Based Multiresolution Matching for Content-Based Image Retrieval Presented by Tienwei Tsai Department of Computer Science and Engineering Tatung.

Multimedia Databases (MMDB)

Research on Intelligent Image Indexing and Retrieval James Z. Wang School of Information Sciences and Technology The Pennsylvania State University Joint.

PageRank for Product Image Search Kevin Jing (Googlc IncGVU, College of Computing, Georgia Institute of Technology) Shumeet Baluja (Google Inc.) WWW 2008.

Content-Based Image Retrieval

INDEPENDENT COMPONENT ANALYSIS OF TEXTURES based on the article R.Manduchi, J. Portilla, ICA of Textures, The Proc. of the 7 th IEEE Int. Conf. On Comp.

Report on Intrusion Detection and Data Fusion By Ganesh Godavari.

COLOR HISTOGRAM AND DISCRETE COSINE TRANSFORM FOR COLOR IMAGE RETRIEVAL Presented by 2006/8.

ALIP: Automatic Linguistic Indexing of Pictures Jia Li The Pennsylvania State University.

Video Google: A Text Retrieval Approach to Object Matching in Videos Josef Sivic and Andrew Zisserman.

80 million tiny images: a large dataset for non-parametric object and scene recognition CS 4763 Multimedia Systems Spring 2008.

Automatic Image Annotation by Using Concept-Sensitive Salient Objects for Image Content Representation Jianping Fan, Yuli Gao, Hangzai Luo, Guangyou Xu.

IEEE Int'l Symposium on Signal Processing and its Applications 1 An Unsupervised Learning Approach to Content-Based Image Retrieval Yixin Chen & James.

A Model for Learning the Semantics of Pictures V. Lavrenko, R. Manmatha, J. Jeon Center for Intelligent Information Retrieval Computer Science Department,

21/11/20151Gianluca Demartini Ranking Clusters for Web Search Gianluca Demartini Paul–Alexandru Chirita Ingo Brunkhorst Wolfgang Nejdl L3S Info Lunch Hannover,

Algorithmic Detection of Semantic Similarity WWW 2005.

1 A Compact Feature Representation and Image Indexing in Content- Based Image Retrieval A presentation by Gita Das PhD Candidate 29 Nov 2005 Supervisor:

Intelligent Database Systems Lab N.Y.U.S.T. I. M. Externally growing self-organizing maps and its application to database visualization and exploration.

Semi-Automatic Image Annotation Liu Wenyin, Susan Dumais, Yanfeng Sun, HongJiang Zhang, Mary Czerwinski and Brent Field Microsoft Research.

Image Classification for Automatic Annotation

Effective Automatic Image Annotation Via A Coherent Language Model and Active Learning Rong Jin, Joyce Y. Chai Michigan State University Luo Si Carnegie.

GENDER AND AGE RECOGNITION FOR VIDEO ANALYTICS SOLUTION PRESENTED BY: SUBHASH REDDY JOLAPURAM.

Digital Video Library Network Supervisor: Prof. Michael Lyu Student: Ma Chak Kei, Jacky.

Yixin Chen and James Z. Wang The Pennsylvania State University

Towards Total Scene Understanding: Classiﬁcation, Annotation and Segmentation in an Automatic Framework N 工科所錢雅馨 2011/01/16 Li-Jia Li, Richard.

1 Chapter 8: Model Inference and Averaging Presented by Hui Fang.

Query by Image and Video Content: The QBIC System M. Flickner et al. IEEE Computer Special Issue on Content-Based Retrieval Vol. 28, No. 9, September 1995.

Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,

1 Dongheng Sun 04/26/2011 Learning with Matrix Factorizations By Nathan Srebro.

School of Computer Science & Engineering

Multimedia Content-Based Retrieval

Content-Based Image Retrieval Readings: Chapter 8:

Project Implementation for ITCS4122

Advanced Techniques for Automatic Web Filtering

Local Feature Extraction Using Scale-Space Decomposition

Advanced Techniques for Automatic Web Filtering

Ying Dai Faculty of software and information science,

Brief Review of Recognition + Context

Interactive Visual System

Ying Dai Faculty of software and information science,

Ying Dai Faculty of software and information science,

Ying Dai Faculty of software and information science,

Nonparametric Bayesian Texture Learning and Synthesis

Text Categorization Berlin Chen 2003 Reference:

Advisor: Dr.vahidipour Zahra salimian Shaghayegh jalali Dec 2017

Presentation transcript:

Image Retrieval and Annotation via a Stochastic Modeling Approach Jia Li, Ph.D. The Pennsylvania State University

Outline Introduction Image retrieval: SIMPLIcity Automatic annotation: ALIP A stochastic modeling approach Conclusions and future work

Image Retrieval The retrieval of relevant images from an image database on the basis of automatically-derived image features Applications: biomedicine, defense, commercial, cultural, education, entertainment, Web, …… Approaches: Color layout Region based User feedback

Can a computer do this? “Building, sky, lake, landscape, Europe, tree”

Outline Introduction Image retrieval: SIMPLIcity Automatic annotation: ALIP A stochastic modeling approach Conclusions and future work

The SIMPLIcity System Semantics-sensitive Integrated Matching for Picture LIbraries Major features Sensitive to semantics: combine semantic classification with image retrieval Region based retrieval:wavelet-based feature extraction and k-means clustering Reduced sensitivity to inaccurate segmentation and simple user interface: Integrated Region Matching (IRM)

Wavelets

Fast Image Segmentation Partition an image into 4×4 blocks Extract wavelet-based features from each block Use k-means algorithm to cluster feature vectors into ‘regions’ Compute the shape feature by normalized inertia

IRM: Integrated Region Matching IRM defines an image-to-image distance as a weighted sum of region-to-region distances Weighting matrix is determined based on significance constrains and a ‘MSHP’ greedy algorithm

A 3-D Example for IRM

IRM: Major Advantages Reduces the influence of inaccurate segmentation Helps to clarify the semantics of a particular region given its neighbors Provides the user with a simple interface

Experiments and Results Speed 800 MHz Pentium PC with LINUX OS Databases: 200,000 general-purpose image DB (60,000 photographs + 140,000 hand-drawn arts) 70,000 pathology image segments Image indexing time: one second per image Image retrieval time: Without the scalable IRM, 1.5 seconds/query CPU time With the scalable IRM, 0.15 second/query CPU time External query: one extra second CPU time

RANDOM SELECTION

Query Results Current SIMPLIcity System

External Query

Robustness to Image Alterations 10% brighten on average 8% darken Blurring with a 15x15 Gaussian filter 70% sharpen 20% more saturation 10% less saturation Shape distortions Cropping, shifting, rotation

Status of SIMPLIcity Researchers from more than 40 institutions/government agencies requested and obtained SIMPLIcity We applied SIMPLicity to: Automatic image classification Searching of pathological images Searching of art and cultural images

Outline Introduction Image retrieval: SIMPLIcity Automatic annotation: ALIP A stochastic modeling approach Conclusions and future work

Image Database The image database contains categorized images. Each category is annotated with a few words. Landscape, glacier Africa, wildlife Each category of images is referred to as a concept.

A Category of Images Annotation: “man, male, people, cloth, face”

ALIP: Automatic Linguistic Indexing for Pictures Learn relations between annotation words and images using the training database. Profile each category by a statistical image model: 2-D Multiresolution Hidden Markov Model (2-D MHMM). Assess the similarity between an image and a category by its likelihood under the profiling model.

Training Process

Automatic Annotation Process

Model: 2-D MHMM Represent images by local features extracted at multiple resolutions. Model the feature vectors and their inter- and intra-scale dependence. 2-D MHMM finds “modes” of the feature vectors and characterizes their spatial dependence.

2D HMM Regard an image as a grid. A feature vector is computed for each node. Each node exists in a hidden state. The states are governed by a Markov mesh (a causal Markov random field). Given the state, the feature vector is conditionally independent of other feature vectors and follows a normal distribution. The states are introduced to efficiently model the spatial dependence among feature vectors. The states are not observable, which makes estimation difficult.

2D HMM The underlying states are governed by a Markov mesh. (i’,j’)<(i,j) if i’<i; or i’=i & j’<j

2D MHMM An image is a pyramid grid. A Markovian dependence is assumed across resolutions. Given the state of a parent node, the states of its child nodes follow a Markov mesh with transition probabilities depending on the parent state.

2D MHMM First-order Markov dependence across resolutions.

2D MHMM The child nodes at resolution r of node (k,l) at resolution r-1: Conditional independence given the parent state:

Annotation Process Rank the categories by the likelihoods of an image to be annotated under their profiling 2-D MHMMs. Select annotation words from those used to describe the top ranked categories. Statistical significance is computed for each candidate word. Words that are unlikely to have appeared by chance are selected. Favor the selection of rare words.

Initial Experiment 600 concepts, each trained with 40 images 15 minutes Pentium CPU time per concept, train only once highly parallelizable algorithm

Preliminary Results Computer Prediction: people, Europe, man-made, water Building, sky, lake, landscape, Europe, tree People, Europe, female Food, indoor, cuisine, dessert Snow, animal, wildlife, sky, cloth, ice, people

More Results

Results: using our own photographs P: Photographer annotation Underlined words: words predicted by computer (Parenthesis): words not in the learned “dictionary” of the computer

Systematic Evaluation 10 classes: Africa, beach, buildings, buses, dinosaurs, elephants, flowers, horses, mountains, food.

600-class Classification Task: classify a given image to one of the 600 semantic classes Gold standard: the photographer/publisher classification This procedure provides lower-bounds of the accuracy measures because: There can be overlaps of semantics among classes (e.g., “Europe” vs. “France” vs. “Paris”, or, “tigers I” vs. “tigers II”) Training images in the same class may not be visually similar (e.g., the class of “sport events” include different sports and different shooting angles) Result: with 11,200 test images, 15% of the time ALIP selected the exact class as the best choice I.e., ALIP is about 90 times more intelligent than a system with random-drawing system

More Information J. Li, J. Z. Wang, ``Automatic linguistic indexing of pictures by a statistical modeling approach,'' IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(9):1075-1088,2003.

Conclusions SIMPLIcity system Automatic Linguistic Indexing of Pictures Highly challenging Much more to be explored Statistical modeling has shown some success.

Future Work Explore new methods for better accuracy refine statistical modeling of images learning from 3D medical images refine matching schemes Apply these methods to special image databases very large databases Integration with large-scale information systems ……