ALIP: Automatic Linguistic Indexing of Pictures Jia Li The Pennsylvania State University.


“Building, sky, lake, landscape, Europe, tree” Can a computer do this?

Outline Background Statistical image modeling approach The system architecture The image model Experiments Conclusions and future work

Image Database The image database contains categorized images, and each category is annotated with a few words (e.g., "landscape, glacier"; "Africa, wildlife"). Each category of images is referred to as a concept.

A Category of Images Annotation: “man, male, people, cloth, face”

ALIP: Automatic Linguistic Indexing of Pictures Learn relations between annotation words and images using the training database. Profile each category by a statistical image model: the 2-D multiresolution hidden Markov model (2-D MHMM). Assess the similarity between an image and a category by the image's likelihood under the category's profiling model.
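The likelihood-based ranking step can be sketched as follows. This is a toy illustration, not the paper's implementation: the independent-Gaussian scorer and the two concept parameter sets are stand-ins for the real 2-D MHMM likelihood.

```python
import numpy as np

# Hypothetical sketch: score an image under each concept's profiling model
# and rank concepts by log-likelihood. A diagonal-Gaussian density stands in
# for the 2-D MHMM likelihood used by ALIP.
def log_likelihood(features, mean, var):
    # log of an independent-Gaussian density, a stand-in for P(image | concept)
    return float(np.sum(-0.5 * np.log(2 * np.pi * var)
                        - 0.5 * (features - mean) ** 2 / var))

concepts = {
    "landscape": (np.zeros(6), np.ones(6)),      # illustrative parameters
    "wildlife":  (np.full(6, 2.0), np.ones(6)),
}

rng = np.random.default_rng(0)
image = rng.normal(0.0, 1.0, size=6)             # toy feature vector

ranked = sorted(concepts,
                key=lambda c: log_likelihood(image, *concepts[c]),
                reverse=True)
print(ranked[0])
```

The top-ranked concepts would then supply the candidate annotation words.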

Outline Background Statistical image modeling approach The system architecture The image model Experiments Conclusions and future work

Training Process

Automatic Annotation Process

Training Training images used to train a concept with description “man, male, people, cloth, face”

Outline Background Statistical image modeling approach The system architecture The image model Experiments Conclusions and future work

2D HMM Regard an image as a grid; a feature vector is computed for each node. Each node exists in a hidden state. The states are governed by a Markov mesh (a causal Markov random field). Given its state, a node's feature vector is conditionally independent of the other feature vectors and follows a normal distribution. The states are introduced to model the spatial dependence among feature vectors efficiently; because they are not observable, estimation is difficult.

2D HMM The underlying states are governed by a Markov mesh. Define the raster-scan order: (i', j') < (i, j) if i' < i, or i' = i and j' < j. The context of node (i, j) is the set of states at nodes (i', j') with (i', j') < (i, j).
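The Markov-mesh factorization can be sketched in code. Under the causal mesh, the state of node (i, j) is conditioned on its top and left neighbors; mapping out-of-grid neighbors to a dummy boundary state 0, and the transition table itself, are simplifying assumptions of this sketch, not details from the paper.

```python
import numpy as np

def precedes(a, b):
    # Raster-scan order: (i', j') < (i, j) iff i' < i, or i' == i and j' < j.
    (i1, j1), (i2, j2) = a, b
    return i1 < i2 or (i1 == i2 and j1 < j2)

def mesh_log_prob(states, trans):
    # Joint log-probability of a state grid under the causal mesh:
    #   P(s) = prod_{i,j} P(s[i,j] | s[i-1,j], s[i,j-1]).
    # trans[top, left, s] = P(s | top-neighbor state, left-neighbor state);
    # out-of-grid neighbors are mapped to a dummy boundary state 0 here.
    logp = 0.0
    for i in range(states.shape[0]):
        for j in range(states.shape[1]):
            top = states[i - 1, j] if i > 0 else 0
            left = states[i, j - 1] if j > 0 else 0
            logp += np.log(trans[top, left, states[i, j]])
    return logp

print(precedes((0, 3), (1, 0)))  # True: row 0 precedes row 1
```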

2-D MHMM Incorporate features at multiple resolutions. Provide more flexibility for modeling statistical dependence. Reduce computation by representing context information hierarchically. Coarser resolutions are obtained by filtering, e.g., by a wavelet transform.
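A one-level Haar transform illustrates the filtering step; this is a generic wavelet example, assumed for illustration rather than taken from ALIP's feature pipeline. Each 2x2 block yields an average (the next, coarser resolution) plus three detail coefficients.

```python
import numpy as np

# One level of a Haar wavelet decomposition of a grayscale image.
def haar_level(img):
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    ll = (a + b + c + d) / 4.0   # approximation -> next (coarser) resolution
    lh = (a + b - c - d) / 4.0   # horizontal detail
    hl = (a - b + c - d) / 4.0   # vertical detail
    hh = (a - b - c + d) / 4.0   # diagonal detail
    return ll, lh, hl, hh

img = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_level(img)
print(ll.shape)  # half the resolution in each dimension
```

Applying `haar_level` recursively to `ll` builds the pyramid of resolutions the 2-D MHMM is defined over.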

2D MHMM An image is a pyramid grid. A Markovian dependence is assumed across resolutions. Given the state of a parent node, the states of its child nodes follow a Markov mesh with transition probabilities depending on the parent state.

2D MHMM First-order Markov dependence across resolutions.

2D MHMM The child nodes at resolution r of node (k, l) at resolution r-1 are the four nodes (2k, 2l), (2k, 2l+1), (2k+1, 2l), (2k+1, 2l+1). Conditional independence given the parent states: once the states at resolution r-1 are given, the child states of different parents are independent, so the distribution of the states at resolution r factors into a product over the parent nodes.
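The quadtree parent-child indexing across resolutions can be written down directly:

```python
# Quadtree indexing between adjacent resolutions of the pyramid grid:
# node (k, l) at resolution r-1 has four children at resolution r.
def children(k, l):
    return [(2 * k, 2 * l), (2 * k, 2 * l + 1),
            (2 * k + 1, 2 * l), (2 * k + 1, 2 * l + 1)]

def parent(i, j):
    # Inverse map: every child (i, j) recovers its parent by halving.
    return (i // 2, j // 2)

print(children(0, 0))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```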

2-D MHMM Statistical dependence among the states of sibling blocks is characterized by a 2-D HMM. The transition probability depends on: The neighboring states in both directions The state of the parent block

2-D MHMM (Summary) 2-D MHMM finds “modes” of the feature vectors and characterizes their inter- and intra-scale spatial dependence.

Estimation of 2-D HMM Parameters to be estimated: the transition probabilities, and the mean and covariance matrix of each Gaussian distribution. The EM algorithm is applied for ML estimation.
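A drastically simplified EM sketch: if the spatial transitions of the hidden states are ignored, estimating the Gaussian emission parameters reduces to fitting a Gaussian mixture with the classic EM updates below. This is an assumption-laden toy, not the paper's 2-D estimation algorithm.

```python
import numpy as np

def em_gmm(x, k=2, iters=50):
    # Spread-out initialization of the means via quantiles of the data.
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))
    var = np.full(k, np.var(x))
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibility of each hidden state per point.
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
                  / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood re-estimates.
        n = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / n
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / n
        pi = n / len(x)
    return mu, var, pi

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 300), rng.normal(3, 1, 300)])
mu, var, pi = em_gmm(x)
print(np.sort(mu))  # recovers means near -3 and 3
```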

EM Iteration

Computation Issues An approximation to the classification EM approach

Annotation Process Rank the categories by the likelihood of the image to be annotated under each category's profiling 2-D MHMM. Select annotation words from those used to describe the top-ranked categories. A statistical significance is computed for each candidate word, and words that are unlikely to have appeared by chance are selected. This favors the selection of rare words.
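The significance idea can be sketched with a hypergeometric tail probability: a word appearing in j of the top m ranked categories is scored by the chance of seeing at least j occurrences if m categories were drawn at random, given the word annotates k of the n categories overall. The numbers below are illustrative, not the paper's thresholds.

```python
from math import comb

def p_value(j, m, k, n):
    # P(X >= j) for X ~ Hypergeometric(population n, successes k, draws m):
    # the chance a random draw of m categories contains >= j with the word.
    return sum(comb(k, i) * comb(n - k, m - i)
               for i in range(j, min(k, m) + 1)) / comb(n, m)

n, m = 600, 10                          # 600 concepts, top 10 ranked
common = p_value(3, m, k=300, n=n)      # frequent word, 3 hits: unsurprising
rare = p_value(3, m, k=10, n=n)         # rare word, 3 hits: very unlikely
print(common > rare)
```

Because a rare word needs far fewer appearances among the top categories to look significant, this scoring naturally favors rare (more informative) words.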

Outline Background Statistical image modeling approach The system architecture The image model Experiments Conclusions and future work

Initial Experiment 600 concepts, each trained with 40 images. About 15 minutes of Pentium CPU time per concept, and training is done only once. The algorithm is highly parallelizable.

Preliminary Results Computer predictions for sample images (the images themselves are not reproduced here): people, Europe, man-made, water; building, sky, lake, landscape, Europe, tree; people, Europe, female; food, indoor, cuisine, dessert; snow, animal, wildlife, sky, cloth, ice, people.

More Results

Results: using our own photographs. P: photographer annotation. Underlined words: words predicted by the computer. (Parentheses): words not in the computer's learned "dictionary".

Systematic Evaluation 10 classes: Africa, beach, buildings, buses, dinosaurs, elephants, flowers, horses, mountains, food.

600-class Classification Task: classify a given image to one of the 600 semantic classes. Gold standard: the photographer/publisher classification. This procedure provides lower bounds on the accuracy measures because: There can be overlap of semantics among classes (e.g., "Europe" vs. "France" vs. "Paris", or "tigers I" vs. "tigers II"). Training images in the same class may not be visually similar (e.g., the class of "sport events" includes different sports and different shooting angles). Result: with 11,200 test images, ALIP selected the exact class as the best choice 15% of the time, i.e., about 90 times the 1/600 accuracy expected from random guessing.
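The arithmetic behind the "about 90 times" comparison:

```python
# Random guessing over 600 classes succeeds 1/600 of the time;
# ALIP's top-1 accuracy on the 11,200 test images was 15%.
random_acc = 1 / 600
alip_acc = 0.15
print(round(alip_acc / random_acc))  # 90
```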

More Information J. Li and J. Z. Wang, "Automatic linguistic indexing of pictures by a statistical modeling approach," IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(9), 2003.

Conclusions Automatic linguistic indexing of pictures is highly challenging, and much more remains to be explored. Statistical modeling has shown some success. To be explored: training when the image database is not categorized; better modeling techniques; real-world applications.