Video Summarization Using Mutual Reinforcement Principle and Shot Arrangement Patterns Lu Shi Oct. 4, 2004.


1 Video Summarization Using Mutual Reinforcement Principle and Shot Arrangement Patterns Lu Shi Oct. 4, 2004

2 Outline: Background; Video semantics and annotation; Mutual reinforcement; Shot arrangement analysis; Video skim selection; Preliminary experiments.

3 Background. Why video summarization: to help the user quickly grasp the content of a video. Video summary targets: conciseness, content coverage, coherency. Types: static and dynamic.

4 Background. Two kinds of video summarization. Unconstrained: generate a preview that tries to cover all the content of the video, constrained only by the time limit L; this can be helped by the mutual reinforcement result. Constrained: the user may have preferences for specific content, such as a specific time range or a specific person.

5 Background. 4-level hierarchical video structure.

6 System overview

7 Video semantics. Low-level features and high-level concepts: the semantic gap. A summary based on low-level features cannot ensure the perceived quality. Solution: obtain video semantic information by manual or semi-automatic annotation. Usage: retrieval, summary.

8 Video semantics. Semantic content template for a video shot: who, where, what action, what other, when, dialog script. Concept term and video shot description (user editable).

9 Video semantics. Concept term and video shot description. Term: denotes an entity, e.g. “Joe”, “talking”, “in the bank”. Context: “who”, “what action”, ... Shot description: the set comprising all the concept terms that are related to the shot. Obtained by manual or semi-automatic video annotation.

10 Video shot annotation. Annotation interface.

11 Video Edit Process. Shoot a set of video shot groups with similar semantic content (takes). Select video shots from the takes, then arrange the shots from different video shot groups to depict the story scene.

12 Video summarization. Video summarization can be viewed as an “inversion” of video editing: recover the semantic video shot groups, then select the important parts.

13 Mutual Reinforcement. Given the annotated video shots, how do we measure the priority of a set of concept terms and a set of descriptions? Who is the most important person? Which shot is the most important one? A more important description contains more important terms; a more important term is contained in more important descriptions. This is the mutual reinforcement principle [1].

14 Mutual Reinforcement. Let W be the weight matrix that describes the relationship between the terms and the shot descriptions (W can have various definitions, e.g. the number of occurrences of a term in a description). Let U and V be the vectors of importance values of the video shot description set and the concept term set. Taking the rows of W to index descriptions and the columns to index terms, we have $U \propto W V$ and $V \propto W^{T} U$. U and V can be calculated by the SVD of W.

15 Mutual Reinforcement. For each semantic context: we choose the singular vectors corresponding to W’s largest singular value as the importance vectors for concept terms and shot descriptions. Since W is non-negative, the first singular vectors can be taken to be non-negative. The importance score vector can be used to group semantically similar video shots.
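A minimal sketch of the computation described on slides 14 and 15, assuming NumPy is available. The function name `mutual_reinforcement`, the orientation of W (rows as shot descriptions, columns as concept terms), and the toy matrix are illustrative assumptions, not taken from the slides. The sketch takes the leading left and right singular vectors of W as the description and term importance scores, which satisfy W v = sigma u and W^T u = sigma v.

```python
import numpy as np

def mutual_reinforcement(W):
    """Return (u, v, sigma): description scores, term scores, top singular value.

    Assumed layout: W[i, j] is the weight of term j in shot description i.
    """
    W = np.asarray(W, dtype=float)
    left, singular_values, right_t = np.linalg.svd(W, full_matrices=False)
    u = left[:, 0]       # leading left singular vector: description importance
    v = right_t[0, :]    # leading right singular vector: term importance
    # The sign of a singular vector pair is arbitrary; flip both so that
    # the scores of a non-negative W come out non-negative.
    if u.sum() < 0:
        u, v = -u, -v
    return u, v, singular_values[0]

# Hypothetical "who" context: columns are the terms Joe, Terry, background people.
W = np.array([
    [1, 0, 0],   # shot 0: Joe
    [1, 1, 0],   # shot 1: Joe and Terry
    [0, 1, 0],   # shot 2: Terry
    [0, 0, 1],   # shot 3: background people
    [1, 0, 0],   # shot 4: Joe
])
u, v, sigma = mutual_reinforcement(W)
print("description scores:", np.round(u, 3))
print("term scores:", np.round(v, 3))
```

In this toy matrix the term "Joe" receives the highest term score and the "Joe and Terry" shot receives the highest description score, because that shot contains the two terms that occur most often.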

16 Experiments. Priority calculation on one video scene, based on the context “who”.

17 Experiments. Shot groups: Joe; Joe and Terry; Terry; background people.

18 Experiments. Priority calculation based on the context “what action”.

19 Experiments. Shot groups: fight; quarrel; background.

20 Shot arrangement patterns. The way the director arranges the video shots conveys his intention. Minimal content redundancy and visual coherence. The semantic video shot group labels form a string. K-non-repetitive strings (k-nrs). String coverage: {3124} covers {312, 124, 31, 12, 24, 3, 1, 2, 4}.
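The coverage example on slide 20 is consistent with reading "covers" as "contains as a proper contiguous substring". The sketch below works under that reading. The helper names are hypothetical, and `is_non_repetitive` assumes that a non-repetitive string is one in which no shot group label occurs twice, which the slides do not spell out.

```python
def covered_substrings(s: str) -> set:
    """All proper contiguous substrings of s (every substring except s itself)."""
    n = len(s)
    return {s[i:j] for i in range(n) for j in range(i + 1, n + 1) if j - i < n}

def is_non_repetitive(s: str) -> bool:
    """Assumed definition: no shot group label appears more than once in s."""
    return len(set(s)) == len(s)

# Each character stands for one shot group label, as in the slide's example.
print(sorted(covered_substrings("3124"), key=lambda t: (-len(t), t)))
print(is_non_repetitive("3124"), is_non_repetitive("3123"))
```

Encoding each label as one character keeps the substring test trivial; with more than ten shot groups a tuple of integer labels would be the safer representation.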

21 Shot arrangement patterns. Several detected nrs strings.

22 Video skim selection. Do: select the most important k-nrs string into the skim shot set; remove those nrs strings covered by the selected string. Until: the target skim length is reached.
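The loop on slide 22 can be sketched as a greedy selection. Everything named here is an assumption for illustration: the `NrsString` record, its `importance` and `duration` fields, and the rule that a string covers another when it contains it as a contiguous substring. The slides do not say how string importance is scored or whether the final string may overshoot the target length; this sketch simply stops once the accumulated duration reaches the target or no candidates remain.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NrsString:
    labels: str        # shot group labels, one character per shot (assumed encoding)
    importance: float  # hypothetical score, e.g. derived from shot importance
    duration: float    # total length of the string's shots, in seconds

def covers(selected: str, other: str) -> bool:
    """True if `other` is a contiguous substring of `selected` (or equal to it)."""
    return other in selected

def select_skim(candidates, target_length):
    """Greedily pick the most important strings until target_length is reached."""
    remaining = list(candidates)
    skim, total = [], 0.0
    while remaining and total < target_length:
        best = max(remaining, key=lambda c: c.importance)
        skim.append(best)
        total += best.duration
        # Drop the selected string and every string it covers.
        remaining = [c for c in remaining if not covers(best.labels, c.labels)]
    return skim

candidates = [
    NrsString("3124", 0.9, 8.0),
    NrsString("312", 0.8, 6.0),
    NrsString("24", 0.5, 4.0),
    NrsString("56", 0.4, 4.0),
    NrsString("7", 0.1, 2.0),
]
skim = select_skim(candidates, target_length=12.0)
print([s.labels for s in skim])
```

With these made-up values, "3124" is selected first and removes "312" and "24" from the pool, so the next pick is "56", at which point the 12-second target is met.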

23 Experiments. We conducted a subjective test. Compared with the previous graph-based algorithm, the proposed method achieves better coherency.

24 Future work. A more efficient way to annotate video shots. Augment the semantic template. Personalized video summary.

25 Q & A. Thank you!!

