Understanding the Semantics of Media Lecture Notes on Video Search & Mining, Spring 2012 Presented by Jun Hee Yoo Biointelligence Laboratory School of.

Presentation transcript:

Understanding the Semantics of Media. Lecture Notes on Video Search & Mining, Spring 2012. Presented by Jun Hee Yoo, Biointelligence Laboratory, School of Computer Science and Engineering, Seoul National University.

Semantic Understanding: Problem Statement. Some tools attempt to segment video at a higher level, but that level of analysis tells us little about the meaning represented in the media. © 2012, SNU CSE Biointelligence Lab.

Approach. Segmentation literature: use latent semantic indexing (LSI) because it allows us to quantify the position of a portion of the document in a multi-dimensional semantic space. The proposal is to summarize the text with LSI and analyze the signal with smooth Gaussians. Semantic retrieval literature: use mixtures of probability experts for semantic-audio retrieval (MPESAR), a more sophisticated model connecting words and media.
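
As a reminder of what "position in a semantic space" means, LSI is usually formulated as a truncated singular value decomposition of a term-by-document matrix. The notation below is the standard textbook one and is an assumption here, not something taken from the slides: A is the term-by-document count matrix, k is the number of retained dimensions, d is the term-count vector of one portion of the document, and \hat{d} is its k-dimensional position.

```latex
A \approx A_k = U_k \Sigma_k V_k^{\top},
\qquad
\hat{d} = \Sigma_k^{-1} U_k^{\top} d
```

Tracking \hat{d} over successive portions of the transcript gives a vector-valued signal over time, which is the kind of signal the Gaussian analysis can then be applied to.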

Analysis Tools

Segmenting Video: Temporal Properties of Video. Color: it provides robust evidence for a shot change in a video signal, but it cannot tell us the global structure of the video. Random words from a transcript: the words indicate a lot about the overall structure of the story.

Segmenting Video: Test Material. CNN Headline News (30-minute TV show). 21st Century Jet (documentary). Automatic speech recognition (ASR) is used to provide a transcript of the audio.

Segmenting Video: Scale Space. Convert the original signal into scale space. In scale space, we analyze a signal with many different kernels. [Figure panels: With Low Pass Filter; Histogram]
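
A minimal sketch of the scale-space idea, assuming Gaussian low-pass kernels of increasing width applied to a one-dimensional signal. The function names, the 3-sigma truncation, the edge handling, and the toy signal are all illustrative assumptions rather than details from the lecture.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Low-pass filter a signal with a Gaussian, replicating edge samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(t + k - radius, 0), last)  # clamp to valid range
            acc += w * signal[idx]
        out.append(acc)
    return out

def scale_space(signal, sigmas):
    """Map each kernel width (scale) to the signal smoothed at that scale."""
    return {sigma: smooth(signal, sigma) for sigma in sigmas}

# Hypothetical toy signal: a step between two "scenes".
toy_signal = [0.0] * 10 + [1.0] * 10
space = scale_space(toy_signal, sigmas=[1.0, 2.0, 4.0])
```

Wider kernels blur the step more, so fine detail survives only at small scales while large-scale structure remains visible at every scale.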

Segmenting Video: Combined Image and Audio Data. Color, words, and scale-space analysis are combined. The result is a 20-dimensional vector function of time and scale.

Segmenting Video: Hierarchical Segmentation Results. Color and word autocorrelations for the Boeing 777 video.

Segmenting Video: Hierarchical Segmentation Results. Grouping 4-8 sentences produces a larger semantic autocorrelation.
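
One way to see why grouping sentences can raise the autocorrelation is the toy sketch below. It assumes each sentence is already represented as a vector (for example, an LSI projection) and measures autocorrelation as the mean cosine similarity between vectors a fixed lag apart. That definition, the function names, and the data are assumptions for illustration, not the lecture's exact computation.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def group_vectors(vectors, size):
    """Sum consecutive, non-overlapping runs of `size` vectors."""
    groups = []
    for start in range(0, len(vectors) - size + 1, size):
        chunk = vectors[start:start + size]
        groups.append([sum(col) for col in zip(*chunk)])
    return groups

def autocorrelation(vectors, lag):
    """Mean cosine similarity between vectors `lag` positions apart."""
    pairs = [(vectors[t], vectors[t + lag]) for t in range(len(vectors) - lag)]
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Hypothetical sentence vectors: one topic, but neighbouring sentences
# happen to use disjoint vocabulary.
sentences = [[1, 0, 0], [0, 1, 0]] * 4

single = autocorrelation(sentences, lag=1)                     # sentence level
grouped = autocorrelation(group_vectors(sentences, 2), lag=1)  # pairs of sentences
```

In this toy case neighbouring sentences share no terms, so the sentence-level value is zero, while the grouped vectors all cover the same vocabulary and correlate strongly.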

Segmenting Video: Intermediate Results. A scale-space segmentation algorithm produces a boundary map showing the edges in the signal.
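
A boundary map of this kind can be sketched as follows, assuming an edge is a local maximum in the absolute first difference of the signal at each scale. The threshold, the function names, and the two hand-written "smoothed" signals are hypothetical; the slides do not specify the edge detector.

```python
def edge_strength(signal):
    """Absolute first difference between neighbouring samples."""
    return [abs(b - a) for a, b in zip(signal, signal[1:])]

def local_maxima(values, threshold):
    """Indices whose value exceeds the threshold and peaks over its neighbours."""
    peaks = []
    for i, v in enumerate(values):
        left = values[i - 1] if i > 0 else float("-inf")
        right = values[i + 1] if i + 1 < len(values) else float("-inf")
        if v > threshold and v >= left and v > right:
            peaks.append(i)
    return peaks

def boundary_map(space, threshold=0.2):
    """Map each scale to the boundary positions found at that scale."""
    return {scale: local_maxima(edge_strength(sig), threshold)
            for scale, sig in space.items()}

# Hypothetical versions of one signal at two scales: the fine scale still
# contains a brief dip that the coarse scale has smoothed away.
toy_space = {
    "fine":   [0.0, 0.0, 1.0, 1.0, 0.25, 1.0, 1.0, 1.0],
    "coarse": [0.0, 0.125, 0.5, 0.75, 0.75, 0.875, 1.0, 1.0],
}
edges = boundary_map(toy_space)
```

Here the fine scale reports both the main change and the brief dip, while the coarse scale keeps only the main change, which is what makes the segmentation hierarchical.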

Segmenting Video: A comparison with ground truth. Left: estimated result. Right: ground truth.

Segmenting Video: Shot Boundary Segmentation. Uses a commercial product designed by YesVideo.

Segmenting Video: Manual segmentation result.

Semantic Retrieval: MPESAR process.

Semantic Retrieval: Acoustic signal processing chain. Acoustic-to-semantic lookup.

Semantic Retrieval: Testing.

Retrieval Results. Histogram of true label ranks based on likelihoods from audio-to-semantic tests. Histogram of true label ranks based on likelihoods from semantic-to-acoustic tests.
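
A histogram of true label ranks can be computed roughly as below: for each test item, sort the candidate labels by model likelihood and record where the correct label lands. The labels, likelihood values, and function names are hypothetical and only illustrate the evaluation idea.

```python
from collections import Counter

def true_label_rank(likelihoods, true_label):
    """1-based rank of the true label when labels are sorted by likelihood."""
    ordered = sorted(likelihoods, key=likelihoods.get, reverse=True)
    return ordered.index(true_label) + 1

def rank_histogram(test_cases):
    """Count how often the true label lands at each rank."""
    return Counter(true_label_rank(scores, label)
                   for scores, label in test_cases)

# Hypothetical likelihoods for three test sounds over three labels.
cases = [
    ({"dog": 0.7, "rain": 0.2, "car": 0.1}, "dog"),
    ({"dog": 0.5, "rain": 0.3, "car": 0.2}, "rain"),
    ({"dog": 0.2, "rain": 0.1, "car": 0.7}, "car"),
]
histogram = rank_histogram(cases)
```

A histogram concentrated at rank 1 means the model usually places the correct label first; a flat histogram would indicate performance close to chance.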