1
DL: Lesson 11: Multimedia Search. Luca Dini (dini@celi.it)
2
MPEG-4: Content-based Encoding. Encodes objects that can be tracked from frame to frame. Video frames are built from layers of video object planes (VOPs). Each VOP is segmented and coded separately throughout the shot; the background is encoded only once. Objects are not defined by what they represent, only by their motion, shape, color, and texture, which allows them to be tracked through time. Objects and their backgrounds are recomposed by the decoder.
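To make the decoder-side recomposition concrete, here is a minimal NumPy sketch of pasting one decoded object plane onto the shared background using its shape mask. It is purely illustrative: the function, array names, and the use of a simple binary mask are assumptions for the example, not part of any MPEG-4 codec.

```python
import numpy as np

def composite_frame(background, object_plane, object_mask, top_left):
    """Illustrative sketch (not MPEG-4 itself): paste a decoded video object
    plane (VOP) onto the background that was decoded once for the whole shot.

    background   : H x W x 3 uint8 array (shared background)
    object_plane : h x w x 3 uint8 array (object texture for this frame)
    object_mask  : h x w bool array (the object's binary shape mask)
    top_left     : (row, col) position of the object in this frame
    """
    frame = background.copy()
    r, c = top_left
    h, w = object_mask.shape
    region = frame[r:r + h, c:c + w]
    # Where the mask is set, the object's texture replaces the background.
    region[object_mask] = object_plane[object_mask]
    return frame

# Toy example: a grey background and one 40x40 object placed in this frame.
bg = np.full((240, 320, 3), 128, dtype=np.uint8)
obj = np.full((40, 40, 3), 255, dtype=np.uint8)
mask = np.ones((40, 40), dtype=bool)
frame = composite_frame(bg, obj, mask, (100, 150))
```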
3
MPEG-4: Content-based encoding. Figure from Ghanbari, M. (1999), Video Coding: An Introduction to Standard Codecs, showing video object planes (VOPs), with the background encoded only once.
4
AMOS: Tracking Objects Beyond the Frame http://www.ctr.columbia.edu/~dzhong/rtrack/demo.htm
5
“Are We Doing Multimedia?”* Multimodal Indexing Ramesh Jain: “To solve multimedia problems, we should use as much context as we can.” – Visual (frames, shots, scenes) – Audio (soundtrack: speech recognition) – Text (closed captions, subtitles) – Context—hyperlinks, etc. *IEEE Multimedia. Oct-Nov. 2003 http://jain.faculty.gatech.edu/media_vision/doing_mm.pdf
6
Snoek, C. & Worring, M. (2005). Multimodal Video Indexing: A Review of the State-of-the-art. Multimedia Tools and Applications, January 2005. Figure labels: settings, objects, people; modalities: video, audio, text.
7
Building Video Indexes Same as any indexing process…decide: – What to index: granularity – How to index: modalities (images, audio, etc.) – Which features? Discover spatial and temporal structure: deconstructing the authoring process Construct data models for access
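As one concrete (purely illustrative) way to capture these decisions in a data model, the sketch below defines an index over videos, scenes, shots, and keyframes with fields for the textual, audio-derived, and visual modalities. The class and field names are assumptions for the example, not part of any standard.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Keyframe:
    timecode: str                  # e.g. SMPTE timecode of the sampled frame
    color_histogram: List[float]   # low-level visual feature
    concepts: List[str]            # high-level labels (settings, objects, people)

@dataclass
class Shot:                        # indexing granularity: the shot
    start_frame: int
    end_frame: int
    keyframes: List[Keyframe] = field(default_factory=list)
    transcript: str = ""           # text modality: speech recognition / captions

@dataclass
class Scene:                       # temporal structure recovered from the video
    shots: List[Shot] = field(default_factory=list)

@dataclass
class VideoDocument:
    title: str
    scenes: List[Scene] = field(default_factory=list)
```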
8
Building Video Indexes: Structured Modeling. Predict relationships between shots using pattern recognition: Hidden Markov Models, SVMs (support vector machines), neural networks; relevance feedback via machine learning.
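As an illustration of one of these techniques, the sketch below trains a support vector machine (scikit-learn's SVC) to predict whether two adjacent shots belong to the same scene from a few hand-picked difference features. The features and labels are invented for the example; a real system would extract them from the video.

```python
import numpy as np
from sklearn.svm import SVC

# Each row describes a pair of adjacent shots (toy values, not real data):
# [color-histogram distance, audio-energy difference, shared transcript terms]
X = np.array([
    [0.10, 0.05, 8],   # visually and acoustically similar pair
    [0.15, 0.10, 5],
    [0.80, 0.70, 0],   # very different pair (likely a scene boundary)
    [0.90, 0.60, 1],
])
y = np.array([1, 1, 0, 0])  # 1 = same scene, 0 = scene boundary

clf = SVC(kernel="rbf")
clf.fit(X, y)

print(clf.predict([[0.20, 0.10, 6]]))            # expected: same scene (1)
print(clf.decision_function([[0.85, 0.50, 0]]))  # signed distance from the margin
```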
9
Data Models for Video IR
– Based on text: DBMS, MARC
– Semi-structured (video + XML or hypertext): MPEG-7, SMIL
– Based on context: Yahoo Video, Blinkx, Truveo
– Multimodal: Marvel, Virage
10
Virage VideoLogger™: SMPTE timecode; keyframes; text or audio extracted automatically; mark & annotate clips.
11
Annotation: Metadata Schemes MPEG-7 MPEG-21 METS SMIL
12
IBM MPEG-7 Annotation Tool
13
MPEG-7 Output from IBM Annotation Tool (excerpt)
– Media time points: T00:00:27:20830F30000 and T00:00:31:23953F30000
– Duration of shot in frames: 248
– Annotation: Indoors
– Location and dimensions of the spatial locator in pixels: 14 15 351 238
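To show how such a description might be consumed programmatically, here is a small Python sketch that parses a simplified, MPEG-7-style XML fragment modeled on the values above. The element names and nesting are illustrative assumptions, not the IBM tool's literal output schema.

```python
import xml.etree.ElementTree as ET

# Simplified MPEG-7-style fragment (assumed structure, not the tool's exact schema).
DESCRIPTION = """
<VideoSegment>
  <MediaTime>
    <MediaTimePoint>T00:00:27:20830F30000</MediaTimePoint>
    <MediaDuration>248</MediaDuration>
  </MediaTime>
  <TextAnnotation>
    <FreeTextAnnotation>Indoors</FreeTextAnnotation>
  </TextAnnotation>
  <SpatialLocator>
    <Box>14 15 351 238</Box>
  </SpatialLocator>
</VideoSegment>
"""

segment = ET.fromstring(DESCRIPTION)
start = segment.findtext("MediaTime/MediaTimePoint")
frames = int(segment.findtext("MediaTime/MediaDuration"))
label = segment.findtext("TextAnnotation/FreeTextAnnotation")
box = [int(v) for v in segment.findtext("SpatialLocator/Box").split()]

print(f"Shot at {start}, {frames} frames, labeled '{label}', box {box}")
```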
14
The MPEG Group. Moving Picture Experts Group, founded by ISO (International Organization for Standardization) in 1988. Four standards: MPEG-1, MPEG-2, MPEG-4, and MPEG-7.
15
MPEG-1. Standardized in 1992. Gave good-quality audio and video. Usually low-resolution video at around 30 frames per second. Three audio layers.
16
MPEG-2. Standardized in 1996. The codec of DVD. Very good quality audio and video. Uses high resolution and high bit rates.
17
MPEG-4. Standardized in 1998. Based on MPEG-1, MPEG-2, and QuickTime. The first real multimedia representation standard. Initially intended for videoconferencing. Several different versions.
18
MPEG-7. Standardized in 2001. Not a video codec: it is called the "Multimedia Content Description Interface". Utilizes the earlier MPEG standards. Developed to simplify search for media elements.
19
Standardization Progress
Standards and bodies: ITU-T – H.261 (1990), H.263 (1995), H.26L (2001); ISO/IEC, some jointly with ITU-T – JPEG (1992), MPEG-1 (1992), MPEG-2 (1994), MPEG-4 (1999), MPEG-7 (2001).
Application areas and features:
– H.261 (1990): videophone over PSTN/B-ISDN; low quality, 64 kbps ~ 1.5 Mbps
– MPEG-1 (1992): Video CD, Internet; VHS quality, < 1.5 Mbps, stereo audio
– MPEG-2 (1994): digital broadcasting, DVD, digital camcorders; high quality, 1.5 ~ 80 Mbps, 5.1-channel audio
– MPEG-4 (1999): content production, Internet, multimedia broadcast; various qualities, synthetic audio/video, user interactivity
– MPEG-7 (2001): content search, Internet, DSM, broadcasting; user interactivity
Overall progression: from data compression toward content manipulation.
20
MPEG-7 Scope
Diversity of Applications – Multimedia, Music/Audio, Graphics, Video
Descriptors (Ds) – describe basic characteristics of audiovisual content – examples: Shape, Color, Texture, …
Description Schemes (DSs) – describe combinations of descriptors – example: Spoken Content
21
Scope: description production (extraction), the standard description itself, and description consumption. Only the standard description is the normative part of the MPEG-7 standard. MPEG-7 does not specify:
– how to extract descriptions
– how to use descriptions
– how to measure similarity between contents
22
Descriptions
Annotations – information that cannot be deduced from the content: recording date & conditions, author, copyright, viewing age, etc.
Features – information that is present in the content:
– low-level features: color, texture, shape, key, mood, tempo, etc.
– high-level features: composition, event, action, situation, etc.
23
MPEG-7 Terminology
Data – audiovisual information that will be described using MPEG-7
Feature – a distinctive part or characteristic of the data (e.g. color, shape, ...)
Descriptor – associates a representation value with one or more features
Description Scheme – defines the structure and semantics of descriptors and their relationships to model data content
Description Definition Language (DDL) – a language to specify Description Schemes
Coded description – a representation of a description allowing efficient storage and transmission
24
Components 1) MPEG-7 Systems 2) MPEG-7 Description Definition Language 3) MPEG-7 Visual 4) MPEG-7 Audio 5) MPEG-7 Multimedia DSs 6) MPEG-7 Reference Software 7) MPEG-7 Conformance
25
Visual Descriptors
– Color descriptors
– Texture descriptors
– Shape descriptors
– Motion descriptors for video
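MPEG-7 standardizes these descriptors but, as noted earlier, not how they are extracted or matched. The sketch below computes a coarse HSV color histogram as a stand-in for a color descriptor and compares two of them with an L1 distance; the binning and the distance function are illustrative choices, not any normative MPEG-7 extraction or matching procedure.

```python
import numpy as np

def hsv_color_histogram(hsv_image, bins=(16, 4, 4)):
    """Quantize an HSV image (H in [0, 360), S and V in [0, 1]) into a
    normalized joint histogram; a rough stand-in for a color descriptor."""
    hist, _ = np.histogramdd(
        hsv_image.reshape(-1, 3),
        bins=bins,
        range=((0, 360), (0, 1), (0, 1)),
    )
    hist = hist.ravel()
    return hist / hist.sum()

def l1_distance(d1, d2):
    """One possible matching function; MPEG-7 leaves similarity to the application."""
    return float(np.abs(d1 - d2).sum())

# Toy query: compare descriptors of two random "images".
rng = np.random.default_rng(0)
img_a = rng.uniform((0, 0, 0), (360, 1, 1), size=(120, 160, 3))
img_b = rng.uniform((0, 0, 0), (360, 1, 1), size=(120, 160, 3))
print(l1_distance(hsv_color_histogram(img_a), hsv_color_histogram(img_b)))
```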
26
Colors
27
Etc… http://mp7.watson.ibm.com/marvel/