2003.09.23 – SLIDE 1 | IS 202 – Fall 2003
Lecture 10: Multimedia Information
Prof. Ray Larson & Prof. Marc Davis
UC Berkeley SIMS
Tuesday and Thursday, 10:30 am – 12:00 pm, Fall 2003
http://www.sims.berkeley.edu/academics/courses/is202/f03/
SIMS 202: Information Organization and Retrieval

SLIDE 2: Today's Agenda
Problem Setting
New Solutions
–Media Streams
–Active Capture
–Adaptive Media
–Mobile Media Metadata Project
Discussion Questions
Action Items for Next Time

SLIDE 3: Today's Agenda
Problem Setting
New Solutions
–Media Streams
–Active Capture
–Adaptive Media
–Mobile Media Metadata Project
Discussion Questions
Action Items for Next Time

SLIDE 4: Global Media Network
Digital media produced anywhere by anyone, accessible to anyone, anywhere
Today's media users become tomorrow's media producers
Not 500 channels, but 500,000,000 multimedia web sites

SLIDE 5: Media Asset Management and Reuse
Media asset management domains:
–Corporate: media companies, media archives, training, sales, catalogs, etc.
–Government: military, surveillance, law enforcement, etc.
–Academia: libraries, research, instruction, etc.
–Consumer: home video and photos, fan reuse of popular content, etc.

SLIDE 6: Applications of Analysis and Retrieval
Professional and educational applications:
–Automated authoring of Web content
–Searching and browsing large video archives
–Easy access to educational materials
–Indexing and archiving multimedia presentations
–Indexing and archiving multimedia collaborative sessions
Consumer domain applications:
–Video overview and access
–Video content filtering
–Enhanced access to broadcast video

SLIDE 7: The Media Opportunity
Vastly more media will be produced
Without ways to manage it (metadata creation and use), we lose the advantages of digital media
Most current approaches are insufficient and perhaps misguided
Great opportunity for innovation and invention
Need interdisciplinary approaches to the problem

SLIDE 8: What Is the Problem?
Today people cannot easily find, edit, share, and reuse media
Computers don't understand media content:
–Media is opaque and data-rich
–We lack structured representations
Without content representation (metadata), manipulating digital media will remain like word-processing with bitmaps

SLIDE 9: The Semantic Gap
"[…] the semantic gap between the rich meaning that users want when they query and browse media and the shallowness of the content descriptions that we can actually compute is weakening today's automatic content-annotation systems."

SLIDE 10: Traditional vs. Metadata-Centric Production Chain
[Diagram: the traditional media production chain (pre-production, production, post-production, distribution) redrawn as a metadata-centric chain, with metadata flowing through every stage]

SLIDE 11: Chang: Content-Based Media Analysis
"Traditional views of content-based technologies focus on search and retrieval—which is important but relatively narrow."
"[…] emphasizing the end-to-end content chain and the many issues evolving around it. What's the best way to integrate manual and automatic solutions in different parts of the chain?"

SLIDE 12: Computational Media Aesthetics
"[…] the algorithmic study of a variety of image and aural elements in media (based on their use in film grammar). It is also the computational analysis of the principles that have emerged underlying their manipulation in the creative art of clarifying, intensifying, and interpreting an event for an audience."
"Our research systematically uses film grammar to inspire and underpin an automated process of analyzing, characterizing, and structuring professionally produced videos."

SLIDE 13: Chang: Content-Based Media Technology
Practical-impact criteria for evaluating multimedia research directions:
–Generating metadata not available from production
–Providing metadata that humans aren't good at generating
–Focusing on content with large volume and low individual value
–Adopting well-defined tasks and performance metrics

SLIDE 14: Chang: Content-Based Media Technology
Areas of research:
–Reverse-engineering the media capture and editing processes
–Extracting and matching objects
–Meaning decoding and automatic annotation
–Analysis and retrieval with user feedback
–Generating time-compressed skims
–Efficient indexing for large databases
–Content adaptation for accessing multimedia over heterogeneous devices
–Standards for specifying content description languages and schemes, such as MPEG-7

SLIDE 15: CMA Challenges
Can we dynamically detect successful aesthetic principles with accuracy and consistency using computational analysis?
Can we build new post-production tools based on this analysis for rapid, cost-efficient, and effective moviemaking and consistent evaluation?
How can we use these successful audio-visual strategies for improved training and education in mass communication?
How do we raise the quality of media annotation and improve the usability of content-based video search and retrieval systems?

SLIDE 16: Automated Media Production Process
[Diagram: an automated media production pipeline of four numbered stages: (1) Active Capture, (2) Adaptive Media Engine, (3) Automatic Editing, (4) Personalized/Customized Delivery. Annotation of media assets feeds a reusable online asset database for retrieval and reuse; Web integration and streaming media services deliver to Flash Generator, WAP, HTML, email, and print/physical media]
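The four numbered stages on this slide can be sketched as a chain of functions. This is a toy illustration only: the stage names come from the slide, but every function, record shape, and string format below is hypothetical.

```python
# Hypothetical four-stage pipeline mirroring the slide's numbered stages.
def active_capture(script):
    """Stage 1: produce an annotated shot for each scripted slot."""
    return [{"slot": slot, "clip": f"{slot}.mov"} for slot in script]

def adaptive_media_engine(shots, template):
    """Stage 2: bind captured shots to an adaptive template's slots."""
    by_slot = {s["slot"]: s["clip"] for s in shots}
    return [by_slot[slot] for slot in template]

def automatic_editing(clips):
    """Stage 3: assemble the bound clips into an edit decision list."""
    return {"edl": clips}

def deliver(cut, channel):
    """Stage 4: render the cut for one delivery channel (HTML, WAP, email, ...)."""
    return f"{channel}:" + "+".join(cut["edl"])

script = ["intro", "jump", "outro"]
shots = active_capture(script)
cut = automatic_editing(adaptive_media_engine(shots, script))
print(deliver(cut, "html"))  # html:intro.mov+jump.mov+outro.mov
```

The point of the composition is the reusable asset database in the middle: stage 1's annotated output can be re-bound by stage 2 into many different templates.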

SLIDE 17: Technology Summary
Media Streams provides a framework for creating metadata throughout the media production cycle to make media assets searchable and reusable
Active Capture automates direction and cinematography, using real-time audio-video analysis in an interactive control loop to create reusable media assets
Adaptive Media uses adaptive media templates and automatic editing functions to mass-customize and personalize media, eliminating the need for editing by end users
Together, these technologies will automate, personalize, and speed up media production, distribution, and reuse

SLIDE 18: Today's Agenda
Problem Setting
New Solutions
–Media Streams
–Active Capture
–Adaptive Media
–Mobile Media Metadata Project
Discussion Questions
Action Items for Next Time

SLIDE 19: New Solutions for Creating Metadata
–After Capture
–During Capture

SLIDE 20: New Solutions for Creating Metadata
–After Capture
–During Capture

SLIDE 21: After Capture: Media Streams

SLIDE 22: Media Streams Features
Key features:
–Stream-based representation (better segmentation)
–Semantic indexing (what things are similar to)
–Relational indexing (who is doing what to whom)
–Temporal indexing (when things happen)
–Iconic interface (designed visual language)
–Universal annotation (standardized markup schema)
Key benefits:
–More accurate annotation and retrieval
–Global usability and standardization
–Reuse of rich media according to content and structure
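The stream-based, relational, and temporal indexing ideas on this slide can be illustrated with a few lines of Python. This is not Media Streams itself (which is an iconic visual language), just a minimal sketch with hypothetical class and field names.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    """One descriptor attached to a time interval within a video stream."""
    start: float      # seconds from stream start
    end: float
    category: str     # e.g. "character", "action", "object"
    value: str        # e.g. "Marc", "screaming"

class MediaStream:
    """A video treated as a continuous stream with layered temporal annotations,
    rather than as pre-cut clips."""
    def __init__(self, uri: str):
        self.uri = uri
        self.annotations: list[Annotation] = []

    def annotate(self, start, end, category, value):
        self.annotations.append(Annotation(start, end, category, value))

    def query(self, category=None, value=None, at=None):
        """Relational/temporal retrieval: filter by descriptor and/or time point."""
        hits = self.annotations
        if category is not None:
            hits = [a for a in hits if a.category == category]
        if value is not None:
            hits = [a for a in hits if a.value == value]
        if at is not None:
            hits = [a for a in hits if a.start <= at < a.end]
        return hits

# Usage: annotate a shot, then ask "who is doing what" at second 3.
stream = MediaStream("godzilla_scene.mov")
stream.annotate(0.0, 5.0, "character", "Marc")
stream.annotate(2.0, 4.0, "action", "screaming")
print([(a.category, a.value) for a in stream.query(at=3.0)])
```

Because annotations are intervals over one continuous stream, any segmentation can be recomputed later, which is what makes the annotated footage reusable.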

SLIDE 23: Today's Agenda
Problem Setting
New Solutions
–Media Streams
–Active Capture
–Adaptive Media
–Mobile Media Metadata Project
Discussion Questions
Action Items for Next Time

SLIDE 24: New Solutions for Creating Metadata
–After Capture
–During Capture

SLIDE 25: Creating Metadata During Capture
Current capture paradigm: multiple captures to get one good capture
New capture paradigm: one good capture drives multiple uses

SLIDE 26: Active Capture
[Diagram: Active Capture at the intersection of capture, processing, and interaction, combining computer vision/audition, human-computer interaction, and direction/cinematography]

SLIDE 27: Active Capture
Active engagement and communication among the capture device, agent(s), and the environment
Re-envision capture as a control system with feedback
Use multiple data sources and communication to simplify the capture scenario
Use HCI to support "human-in-the-loop" algorithms for computer vision and audition

SLIDE 28: Human-in-the-Loop Algorithms
Leverage what humans and computers are respectively good at
–Example: object recognition and tracking
Leverage interaction with the situated human agent
–Examples: activity recognition (jump detector with "Simon Says" interaction); object recognition (car finder with "Treasure Hunt" interaction)
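The "Simon Says" jump detector above is, at heart, a control loop with feedback: direct the agent, analyze the take, re-prompt until the desired action is captured. A minimal sketch, with a stand-in detector instead of real computer vision (all names and thresholds are hypothetical):

```python
def jump_detected(frame) -> bool:
    """Stand-in for a vision-based jump detector; a real system would
    analyze vertical motion energy between successive frames."""
    return frame["vertical_motion"] > 0.5

def active_capture_jump(frames, max_takes=5):
    """'Simon Says' control loop: direct the human agent, analyze the
    capture, and re-prompt until a usable take is recorded."""
    for take, frame in zip(range(1, max_takes + 1), frames):
        print(f"Take {take}: Simon says JUMP!")
        if jump_detected(frame):
            return take, frame   # a usable, annotated-by-construction shot
        print("No jump detected; let's try that again.")
    return None

# Simulated captures: the agent jumps on the third prompt.
takes = [{"vertical_motion": 0.1}, {"vertical_motion": 0.2}, {"vertical_motion": 0.9}]
result = active_capture_jump(iter(takes))
print(result)  # (3, {'vertical_motion': 0.9})
```

The human carries the hard semantics (understanding "jump!"), so the vision problem collapses to verifying one known action, which is what makes human-in-the-loop algorithms tractable.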

SLIDE 29: Active Capture

SLIDE 30: Active Capture: Reusable Shots

SLIDE 31: Marc Davis in Godzilla Scene

SLIDE 32: Today's Agenda
Problem Setting
New Solutions
–Media Streams
–Active Capture
–Adaptive Media
–Mobile Media Metadata Project
Discussion Questions
Action Items for Next Time

SLIDE 33: Evolution of Media Production
Customized production
–Skilled creation of one media product
Mass production
–Automatic replication of one media product
Mass customization
–Skilled creation of adaptive media templates
–Automatic production of customized media

SLIDE 34: The Editing Paradigm Has Not Changed

SLIDE 35: Computational Media
More intimately integrate two great 20th-century inventions

SLIDE 36: Central Idea: Movies as Programs
Movies change from being static data to programs
Shots are inputs to a program that computes new media based on content representation and functional dependency (US Patents 6,243,087 & 5,969,716)
[Diagram: a Parser extracts content representations from media; a Producer computes new media from those content representations]
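The movies-as-programs idea can be made concrete with a toy "producer": shots carry content representations, and a function computes a new cut from them under a functional dependency. This is a sketch only, not the patented system; the shot fields and the mood/length constraint are invented for illustration.

```python
# Shots as inputs: each carries a content representation, not just pixels.
shots = [
    {"id": "s1", "mood": "calm",  "duration": 4.0},
    {"id": "s2", "mood": "tense", "duration": 2.5},
    {"id": "s3", "mood": "tense", "duration": 3.0},
    {"id": "s4", "mood": "calm",  "duration": 5.0},
]

def produce(shots, want_mood, max_length):
    """Compute a cut from content representations, subject to a
    functional dependency (here, total running time)."""
    cut, total = [], 0.0
    for shot in shots:
        if shot["mood"] == want_mood and total + shot["duration"] <= max_length:
            cut.append(shot["id"])
            total += shot["duration"]
    return cut, total

print(produce(shots, "tense", 6.0))   # (['s2', 's3'], 5.5)
```

The same footage yields different movies for different arguments, which is the sense in which the movie is a program rather than static data.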

SLIDE 37: Adaptive Media Design Space
[Diagram: a design space organized by whether content and structure are author-generated, locating compilation moviemaking, traditional moviemaking, and historical documentary moviemaking]

SLIDE 38: Adaptive Media Design Space
[Diagram: the design space of slide 37 with two points added: Video Lego (structure is constrained) and Video MadLibs (structure is determined)]

SLIDE 39: The Blank Page Approach

SLIDE 40: Captain Zoom IV MadLib™

SLIDE 41: Constructing with Lego™ Blocks

SLIDE 42: Video MadLibs and Video Lego
Video MadLibs
–Adaptive media template with open slots
–Structure is fixed
–Content can be varied
Video Lego
–Reusable media components that know how to fit together
–Structure is constrained
–Content can be varied
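A Video MadLibs template, as described above, is a fixed shot structure whose open slots are filled from an annotated asset database. A minimal sketch; the slot names, file names, and `prefer` parameter are all hypothetical.

```python
# A Video MadLibs template: fixed structure, open content slots.
template = ["establishing:city", "hero:closeup", "action:jump", "reaction:crowd"]

# Hypothetical asset database keyed by annotation (slot type).
assets = {
    "establishing:city": ["tokyo_wide.mov", "sf_wide.mov"],
    "hero:closeup":      ["marc_closeup.mov"],
    "action:jump":       ["marc_jump_take3.mov"],
    "reaction:crowd":    ["crowd_scream.mov"],
}

def fill_template(template, assets, prefer=None):
    """Fill each slot with an annotated clip: structure stays fixed,
    content varies with the available (or preferred) assets."""
    prefer = prefer or {}
    cut = []
    for slot in template:
        choices = assets[slot]
        cut.append(prefer.get(slot, choices[0]))
    return cut

print(fill_template(template, assets, prefer={"establishing:city": "sf_wide.mov"}))
```

Video Lego would replace the fixed `template` list with composition constraints between components, so structure is constrained rather than fully determined.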

SLIDE 43: Today's Agenda
Problem Setting
New Solutions
–Media Streams
–Active Capture
–Adaptive Media
–Mobile Media Metadata Project
Discussion Questions
Action Items for Next Time

SLIDE 44: Moore's Law for Cameras
2000: Kodak DC40 ($400); Nintendo Game Boy Camera ($40)
2002: Kodak DX4900 ($400); SiPix StyleCam Blink ($40)

SLIDE 45: Capture + Processing + Interaction + Network

SLIDE 46: Mobile Media Metadata Project
Leverage the context and community of media capture on mobile devices:
–Gather all automatically available information at the point of capture (time, spatial location, phone user, etc.)
–Use metadata similarity and media analysis algorithms to find similar media that has been annotated before
–Take advantage of this previously annotated media to make educated guesses about the content of the newly captured media
–Interact in a simple and intuitive way with the phone user to confirm and augment system-supplied metadata for captured media
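The metadata-similarity step above can be sketched in a few lines: score previously annotated photos by shared capture context, then propose the best match's label for the user to confirm. The similarity features, weights, and threshold below are invented for illustration, not the MMM project's actual algorithm.

```python
def context_similarity(a, b):
    """Crude similarity over capture context: shared cell location,
    time-of-day bucket, and social connection (all weights hypothetical)."""
    score = 0
    score += 2 if a["cell_id"] == b["cell_id"] else 0          # same place
    score += 1 if a["hour"] // 6 == b["hour"] // 6 else 0      # same time of day
    score += 1 if a["user"] in b.get("community", []) else 0   # known community
    return score

def guess_annotation(new_photo, annotated_photos, threshold=2):
    """Rank previously annotated media by contextual similarity and
    propose the best match's label for user confirmation."""
    best = max(annotated_photos, key=lambda p: context_similarity(new_photo, p))
    if context_similarity(new_photo, best) >= threshold:
        return best["label"]   # shown to the phone user to confirm or correct
    return None                # no confident guess; ask the user directly

history = [
    {"cell_id": 42, "hour": 14, "user": "ray",
     "community": ["marc"], "label": "Campanile"},
    {"cell_id": 7,  "hour": 9,  "user": "ray",
     "community": [], "label": "Sather Gate"},
]
new = {"cell_id": 42, "hour": 15, "user": "marc"}
print(guess_annotation(new, history))  # Campanile
```

The confirm-or-correct interaction is what closes the loop: each confirmed guess becomes new annotated history, so later guesses improve.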

SLIDE 47: Campanile Scenario

SLIDE 48: MMM Architecture
[Diagram: a phone browser (via a phone gateway, XHTML) and a desktop browser (via a web server, HTML) connect to the application server; a facade routes annotation process handling and dialog logic to a metadata service (metadata processing, object factory, database), with a Flamenco dumper feeding the Flamenco browser]

SLIDE 49: Today's Agenda
Problem Setting
New Solutions
–Media Streams
–Active Capture
–Adaptive Media
–Mobile Media Metadata Project
Discussion Questions
Action Items for Next Time

SLIDE 50: Discussion Questions (Davis)
Ryan Shaw on Davis, "Editing Out Video Editing":
–There are obviously a lot of legal and political barriers to overcome in order to realize the type of media production and reuse that Marc envisions. Current metadata schemes, if they have elements related to intellectual property at all, seem to be focused on ownership and restrictions; i.e., they spell out what we can't do. How might we design metadata that tell us and our media production machines what we *can* do legally? Are things like Creative Commons licenses a step in the right direction?

SLIDE 51: Discussion Questions (Davis)
Ryan Shaw on Davis, "Editing Out Video Editing":
–To what extent can computer games like Grand Theft Auto: Vice City be considered machines for mass customization of narrative media?
–A couple of weeks ago we discussed the issue of authors categorizing and annotating their works themselves vs. having professionals such as librarians categorize and annotate them. If we look at what librarians do as being similar to annotation of media assets in postproduction, how might Marc's ideas about using software to capture metadata throughout the production process apply to textual information assets like books, articles, and web pages?

SLIDE 52: Discussion Questions (CMA)
Margaret Spring on "Computational Media Aesthetics":
–The article discusses a means of relating film grammar (shot, lighting, recording distance, etc.) to visual, aural, and thematic patterns in order to develop media analysis. Does this seem like it would be exclusively a production and annotation tool? If a catalog of film techniques were created, could a legitimate and comprehensible film be made purely by utilizing technique references? While it might not be useful for dramatic films, could such technology be effective for training films, annotating events with rules (such as sports), or user-guide-type vignettes?

SLIDE 53: Discussion Questions (CMA)
Margaret Spring on "Computational Media Aesthetics":
–There is a reference to "reverse-engineering intent and meaning from available content." What would this require? Do people believe that such technology could at some point be applied to summarizing a television program? If such a tool were created, would it be able to keep up with pop-culture and connotative references in order to remain effective? Can you teach an annotation system to understand sarcasm?

SLIDE 54: Discussion Questions (Chang)
Andrea La Pietra on "The Holy Grail of Content-Based Media Analysis":
–Chang argues that one criterion for content-based analysis to have a "high impact" is "providing metadata that humans aren't good at generating." But what is the right balance of human perception/annotation and automatic annotation in each domain? And can it be dangerous to use more automatic elements in some domains than in others?

SLIDE 55: Discussion Questions (Chang)
Andrea La Pietra on "The Holy Grail of Content-Based Media Analysis":
–In the medical video indexing example, he claims each piece of content "does not have significant value." But isn't this subjective? Medical knowledge is dynamic. Is it possible that they may stumble upon a medical case similar to a prior case but fail to make the connection, because they chose to throw out some element common to both cases that seemed unimportant at the time? Chang himself states, "A significant skill threshold exists for personnel qualified for the annotation process."

SLIDE 56: Discussion Questions (Dimitrova)
Dan Perkel on "Applications of Video-Content Analysis and Retrieval":
–"We perceive a video program as a document" (p. 42). How does this assertion facilitate the development of useful video-content analysis and retrieval applications? How does it limit those applications? Are there other ways of treating a video (or multimedia) program that would lead to different types of applications than the ones described in the article?

SLIDE 57: Today's Agenda
Problem Setting
New Solutions
–Media Streams
–Active Capture
–Adaptive Media
–Mobile Media Metadata Project
Discussion Questions
Action Items for Next Time

SLIDE 58: Next Time
Metadata for Motion Pictures: Media Streams
Readings for next time:
–"Media Streams: An Iconic Visual Language for Video Representation" (M. Davis)

