1
Video summarization by video structure analysis and graph optimization
M.Phil. 2nd Term Presentation
Lu Shi
Dec 5, 2003
2
Outline
- Motivation
- Video structure
- Video skim length distribution
- Spatial-temporal graph modeling
- Optimization-based video shot selection
- Experimental results
3
Motivation
- A huge volume of video data is distributed over the Web
- Browsing and managing such a huge video database is time-consuming
- Help the user quickly grasp the content of a video
- Two kinds of applications:
  - Video skimming (dynamic)
  - Video static summary (static)
4
Goals
- Conciseness
- Content coverage
  - Spatial and temporal
- Coherency
  - Not too jumpy
5
Flowchart
6
Video structure
- A video narrates a story just like an article does
  - Video (story)
  - Video scenes (paragraphs)
  - Video shot groups
  - Video shots (sentences)
  - Video frames
7
Video structure
- Graphical example
8
Video structure
- Can be built in a bottom-up manner
  - Video shot detection
  - Video shot grouping
  - Video scene formation
9
Video structure
- Video shot detection
  - Video slice image [1]
  - Column-pairwise distance
  - Filtering and thresholding
  - …
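The three steps on this slide can be illustrated with a minimal sketch. It assumes the slice image is given as one column of pixel intensities per frame, that the column-pairwise distance is a mean absolute difference, and that "filtering" means subtracting a local median; the slide does not specify any of these, and the names `column_distances`, `detect_cuts`, `half_window`, and `threshold` are hypothetical.

```python
from statistics import median


def column_distances(slice_image):
    """Mean absolute difference between each pair of adjacent columns.

    slice_image is a list of columns, one per frame; each column is a
    list of pixel intensities (an assumed representation).
    """
    return [
        sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        for prev, cur in zip(slice_image, slice_image[1:])
    ]


def detect_cuts(slice_image, half_window=2, threshold=50.0):
    """Return the indices of frames that start a new shot.

    Filtering (assumed): subtract the median of a local window so that
    slow changes are suppressed. Thresholding: keep the peaks that
    remain above `threshold`.
    """
    distances = column_distances(slice_image)
    cuts = []
    for i, value in enumerate(distances):
        lo = max(0, i - half_window)
        hi = min(len(distances), i + half_window + 1)
        baseline = median(distances[lo:hi])
        if value - baseline > threshold:
            cuts.append(i + 1)  # frame i + 1 is the first frame of the new shot
    return cuts
```

For example, five dark columns followed by five bright columns yield a single cut at the sixth frame (index 5).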
10
Video structure
- Video shot grouping
  - Window-sweeping algorithm [2]
  - Spatial similarity
  - Temporal distance
- Intersecting video shot groups form loop scenes
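The last point, merging intersecting shot groups into scenes, can be sketched as an interval merge. This assumes that two groups "intersect" when their temporal spans (first to last shot index) overlap, which is one reading of the slide and not confirmed by it. The function name `merge_intersecting_groups` is hypothetical, and the window-sweeping grouping of [2] is not reproduced here.

```python
def merge_intersecting_groups(groups):
    """Merge shot groups whose temporal spans overlap.

    groups is a list of lists of shot indices. Returns a list of scenes,
    each a sorted list of shot indices, in temporal order.
    """
    ordered = sorted((g for g in groups if g), key=min)
    merged = []
    for group in ordered:
        if merged and min(group) <= merged[-1]["end"]:
            # The group starts before the current scene ends: same scene.
            merged[-1]["shots"].update(group)
            merged[-1]["end"] = max(merged[-1]["end"], max(group))
        else:
            merged.append({"end": max(group), "shots": set(group)})
    return [sorted(scene["shots"]) for scene in merged]
```

With groups `[[0, 2, 4], [1, 3], [5], [6, 8], [7]]`, the interleaved groups collapse into the scenes `[0, 1, 2, 3, 4]`, `[5]`, and `[6, 7, 8]`. The sketch does not decide which of the resulting scenes count as loop scenes.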
11
Video structure
- Summarize each video scene separately
- Loop scenes and progressive scenes
  - Loop scenes depict an event that happens at one place
  - Progressive scenes: "transition" between events, or dynamic events
12
Video structure
- Scene importance: length and complexity
- Content entropy for loop scenes
  - Measures the complexity of a loop scene
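The entropy formula did not survive on this slide. As a rough illustration only, the sketch below assumes that content entropy is the Shannon entropy of how a loop scene's duration is split among its shot groups; the function name `content_entropy` and this choice of distribution are assumptions, not the presenter's definition.

```python
import math


def content_entropy(group_lengths):
    """Shannon entropy (in bits) of the share of scene time per shot group.

    group_lengths holds the total duration of each shot group in one
    loop scene (assumed input). A scene dominated by one group scores
    near 0; a scene split evenly among many groups scores higher.
    """
    total = sum(group_lengths)
    if total <= 0:
        raise ValueError("scene must have positive total length")
    return -sum(
        (length / total) * math.log2(length / total)
        for length in group_lengths
        if length > 0
    )
```

Under this assumption, a scene that alternates evenly between two groups has an entropy of 1 bit, and a scene with a single group has an entropy of 0.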
13
Video structure
- Determine each video scene's target skim length
- Determine each progressive scene's skim length
  - If …, discard it; else …
- Determine each loop scene's skim length
  - If …, discard it
- Redistribute to remaining scenes
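The discard conditions on this slide were lost, so the following is only a sketch of the allocate, discard, and redistribute loop. It assumes each scene has a positive importance score, that a scene is discarded when its share falls below a minimum usable length, and that loop and progressive scenes are treated alike; `distribute_skim_length`, `importances`, and `min_length` are hypothetical names.

```python
def distribute_skim_length(importances, total_length, min_length):
    """Split total_length among scenes in proportion to importance.

    A scene whose share is below min_length is discarded (assumed rule),
    and its share is redistributed to the remaining scenes. Returns one
    length per scene, 0.0 for discarded scenes.
    """
    active = {i for i, weight in enumerate(importances) if weight > 0}
    while active:
        weight_sum = sum(importances[i] for i in active)
        shares = {i: total_length * importances[i] / weight_sum for i in active}
        too_short = [i for i in active if shares[i] < min_length]
        if not too_short:
            return [shares.get(i, 0.0) for i in range(len(importances))]
        # Discard only the smallest share, then recompute the others.
        active.remove(min(too_short, key=lambda i: shares[i]))
    return [0.0] * len(importances)
```

With importances `[6, 3, 1]`, a 20-second target, and a 3-second minimum, the third scene is dropped and the full 20 seconds are split 2:1 between the first two.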
14
Graph modeling
- Spatial-temporal dissimilarity function
  - Linear in visual dissimilarity
  - Exponential in temporal distance
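The exact function is not shown on the slide. One form consistent with the two stated properties, given purely as an assumption, is the visual dissimilarity multiplied by an exponential of the temporal distance; the name `st_dissimilarity` and the rate parameter `k` are hypothetical.

```python
import math


def st_dissimilarity(visual_dissimilarity, temporal_distance, k=0.01):
    """Assumed spatial-temporal dissimilarity between two shots.

    Linear in the visual dissimilarity and exponential in the temporal
    distance; k controls how fast temporal separation is rewarded.
    """
    return visual_dissimilarity * math.exp(k * temporal_distance)
```

Under this form, two shots that look alike but lie far apart in time can still count as dissimilar, which favours temporal coverage when shots are selected.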
15
Graph modeling
- The spatial-temporal relation graph
  - Each vertex corresponds to a video shot
  - Each edge corresponds to the dissimilarity function between shots
  - Directed and complete
16
Skim generation
- The goal of video summarization
  - Conciseness: given the target skim length
  - Content coverage
- The spatial-temporal dissimilarity function
- The spatial-temporal relation graph
  - A path corresponds to a series of video shots
  - Vertex weight summation
  - Path length is the sum of the dissimilarities between consecutive shot pairs
17
Skim generation
- Objectives:
  - Search for a path in the graph such that:
    - The path length (dissimilarity summation) is maximized
    - The vertex weight summation is close to, but does not exceed, the target skim length
- The objective function
18
Skim generation
- Global optimal solution
  - Let … denote the paths that begin with …, whose vertex weight summation is upper-bounded by …
  - The optimal path is denoted by …
  - The target is …
19
Skim generation
- Optimal substructure
- Dynamic programming
  - Effective way to compute the global optimal solution
  - Trace back to find the optimal path
- Time complexity, space complexity
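The dynamic program can be sketched as follows, under several assumptions that the slides do not confirm: vertex weights are integer shot durations, edges only go from an earlier shot to a later one, and the objective is simply the largest path length within the duration budget. The names `best_skim`, `durations`, `dissim`, and `budget` are hypothetical.

```python
def best_skim(durations, dissim, budget):
    """Pick shots in temporal order to maximize summed dissimilarity.

    durations: positive integer duration (vertex weight) of each shot.
    dissim[i][j]: edge weight from shot i to a later shot j (i < j).
    budget: integer upper bound on the total duration of the skim.
    Returns (path, path_length).
    """
    n = len(durations)
    neg_inf = float("-inf")
    # best[i][b]: best path length over paths that start at shot i and
    # whose total duration is at most b; next_shot allows the trace back.
    best = [[neg_inf] * (budget + 1) for _ in range(n)]
    next_shot = [[None] * (budget + 1) for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for b in range(durations[i], budget + 1):
            best[i][b] = 0.0  # the path consisting of shot i alone
            remaining = b - durations[i]
            for j in range(i + 1, n):
                if best[j][remaining] == neg_inf:
                    continue  # shot j does not fit in what is left
                candidate = dissim[i][j] + best[j][remaining]
                if candidate > best[i][b]:
                    best[i][b] = candidate
                    next_shot[i][b] = j
    start = max(range(n), key=lambda i: best[i][budget], default=None)
    if start is None or best[start][budget] == neg_inf:
        return [], 0.0
    # Trace back from the best starting shot.
    path, i, b = [], start, budget
    while i is not None:
        path.append(i)
        j = next_shot[i][b]
        b -= durations[i]
        i = j
    return path, best[start][budget]
```

For n shots and a budget of L time units, this sketch fills an n by (L + 1) table and scans up to n successors per entry, so it runs in O(n²L) time and O(nL) space. These figures describe the sketch and are not taken from the slide.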
20
Experiments
- Key frames of selected video shots
21
Experiments
- There is no ground truth, so it is hard to evaluate a video skim objectively
- Subjective experiment
- Parameters:
22
Conclusion
- Video structure analysis
  - Scene boundaries, sub-skim length determination
- Graph scene modeling
- Optimization-based sub-skim generation
- Generate a video skim
23
References
[1] C. W. Ngo, Analysis of spatial temporal slices for video content representation, Ph.D. thesis, HKUST, Aug. 2000.
[2] Y. Rui, T. S. Huang, and S. Mehrotra, Constructing table-of-content for videos, ACM Multimedia Systems Journal, Special Issue Multimedia Systems on Video Libraries, vol. 7, no. 5, pp. 359-368, Sept. 1999.
24
Q & A
Thank you!!