
1 Sensible Visual Search Shih-Fu Chang Digital Video and Multimedia Lab Columbia University www.ee.columbia.edu/dvmm June 2008 (Joint Work with Eric Zavesky and Lyndon Kennedy)

2 digital video | multimedia lab User Expectation for Web Search. “…type in a few words at most, then expect the engine to bring back the perfect results. More than 95 percent of us never use the advanced search features most engines include, …” – The Search, J. Battelle, 2003. Keyword search is still the primary search method; visual search is a straightforward extension.

3 Keyword-based Visual Search Paradigm

4 Web Image Search. Text query: “Manhattan Cruise” on Google Images. What is in the results? Why are these images returned? How can better search terms be chosen?

5 Minor Changes in Keywords → Big Difference. Text query: “Cruise around Manhattan”

6 When metadata are unavailable: Automatic Image Classification. Audio-visual features; geo and social features; SVM or graph models; context fusion. Statistical models yield semantic indexes (e.g., Anchor, Snow, Soccer, Building, Outdoor) and a rich semantic description based on content analysis.

7 A few good detectors for LSCOM concepts: waterfront, bridge, crowd, explosion/fire, US flag, military personnel. Remember there are many not-so-good detectors.

8 Keyword Search over Statistical Detector Scores (www.ee.columbia.edu/cuvidsearch). Columbia374: objects, people, locations, scenes, events, etc. Concepts defined by expert analysts over news video.
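The search-over-detector-scores idea above can be sketched in a few lines. This is a hypothetical illustration, not the CuVid implementation: the shot IDs, concept names, and score values are made up. Each shot carries a confidence score per concept detector, and a query is answered by ranking shots on the summed scores of its mapped concepts.

```python
# detector_scores[shot_id][concept] = detector confidence in [0, 1]
# (illustrative values, not actual Columbia374 outputs)
detector_scores = {
    "shot_01": {"car": 0.92, "snow": 0.10, "car_crash": 0.75},
    "shot_02": {"car": 0.40, "snow": 0.85, "car_crash": 0.05},
    "shot_03": {"car": 0.88, "snow": 0.80, "car_crash": 0.60},
}

def search(query_concepts, scores):
    """Rank shots by the sum of detector scores for the queried concepts."""
    return sorted(
        scores,
        key=lambda shot: sum(scores[shot].get(c, 0.0) for c in query_concepts),
        reverse=True,
    )

print(search(["car", "snow", "car_crash"], detector_scores))
```

A real system would, of course, normalize and weight detector outputs before summing; the uniform sum here is just the simplest possible combination rule.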

9 Query “car crash snow” over TRECVID video using LSCOM concepts. How are keywords mapped to concepts? Which classifiers work, and which don't? How can the search terms be improved?

10 Relevant detectors: car, car crash, snow. How did individual detectors influence the search results?

11 Frustration of Uninformed Users of Keyword Search Difficult to choose meaningful words/concepts without in-depth knowledge of entire vocabulary

12 Pains of Uninformed Users. Forced to take “one-shot” searches, iterating queries with a trial-and-error approach…

13 Challenge: user frustration in visual search. There is a lot of work on content analytics, but research is needed to address user frustration.

14 Proposal: Sensible Search. Make the search experience more sensible. Help users stay “informed” in selecting effective keywords/concepts, in understanding the search results, and in manipulating the search criteria rapidly and flexibly. Keep users engaged: instant feedback with minimal disruption, as opposed to trial-and-error.

15 A prototype CuZero: Zero-Latency Informed Search & Navigation

16 Informed User: Instant Informed Query Formulation

17 Informed User for Visual Search: instant visual concept suggestion via query-time concept mining.

18 Lexical mapping: mapping keywords to concept definitions, synonyms, sense context, etc. (LSCOM).
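A minimal sketch of lexical mapping, assuming a hand-built synonym table: each LSCOM-style concept carries a small lexicon, and a keyword matches a concept if it appears in that concept's lexicon. The concept names and synonym lists below are invented for illustration, not the actual LSCOM definitions.

```python
# Hypothetical concept lexicon: concept name -> set of matching terms
CONCEPT_LEXICON = {
    "car": {"car", "automobile", "vehicle"},
    "snow": {"snow", "snowfall"},
    "waterfront": {"waterfront", "harbor", "pier"},
}

def map_keywords_to_concepts(keywords):
    """Return the set of concepts whose lexicon contains any query keyword."""
    matched = set()
    for kw in keywords:
        for concept, synonyms in CONCEPT_LEXICON.items():
            if kw.lower() in synonyms:
                matched.add(concept)
    return matched

print(map_keywords_to_concepts(["automobile", "snowfall"]))
```

The slide also mentions sense context; a fuller version would disambiguate word senses (e.g., with WordNet glosses) before matching, rather than doing exact string lookup.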

19 Co-occurrent concepts (e.g., road, car) mined from images and their associated text, such as noisy machine-translated speech transcripts of basketball news coverage.

20 Query-Time Concept Mining: visually mine the dominant concepts (e.g., person, suits) from the search results.
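The mining step above reduces, in its simplest form, to tallying which concepts dominate the top-ranked results and suggesting the most frequent ones back to the user. A sketch under that assumption (the per-result concept annotations are illustrative):

```python
from collections import Counter

# Each top-ranked result is annotated with the concepts detected in it.
top_results = [
    {"person", "suits", "indoor"},
    {"person", "suits"},
    {"person", "outdoor"},
]

def dominant_concepts(results, k=2):
    """Return the k concepts occurring most often among the top results."""
    counts = Counter(c for result in results for c in result)
    return [concept for concept, _ in counts.most_common(k)]

print(dominant_concepts(top_results))
```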

21 CuZero Real-Time Query Interface (demo): instant concept suggestion and auto-complete over speech transcripts.

22 A prototype CuZero: Zero-Latency Informed Search & Navigation (Zavesky and Chang, MIR2008)

23 Informed User: Intuitive Exploration of Results. The CMU Informedia Concept Filter (e.g., only outdoor, only people) is a linear browser that restricts inspection flexibility.

24 Informed User: Rapid Exploration of Results Media Mill Rotor Browser

25 Revisit the user struggle… Query: {car, snow, car_crash} with car, car-crash, and snow detectors. How did each concept influence the results?

26 CuZero: Real-Time Multi-Concept Navigation Map. Create a multi-concept gradient map (e.g., “boat”, “sky”, “water”). Direct user control: “nearness” = “more influence”. Instant display for each location, without a new query.
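One way to realize "nearness = more influence" is to anchor each concept at a fixed map position, weight concepts by inverse distance from the cursor, and re-score images with those weights on every cursor move, with no new query. The sketch below follows that reading; the anchor positions, image scores, and the inverse-distance weighting are all illustrative assumptions, not CuZero's actual formula.

```python
# Hypothetical concept anchors on a 2-D navigation map, and per-image
# concept detector scores (all values illustrative).
anchors = {"boat": (0.0, 0.0), "sky": (1.0, 0.0), "water": (0.5, 1.0)}
image_scores = {
    "img_a": {"boat": 0.9, "sky": 0.2, "water": 0.6},
    "img_b": {"boat": 0.1, "sky": 0.95, "water": 0.3},
}

def cursor_weights(cursor, anchors):
    """Weight each concept by inverse distance from cursor to its anchor."""
    weights = {}
    for concept, (ax, ay) in anchors.items():
        dist = ((cursor[0] - ax) ** 2 + (cursor[1] - ay) ** 2) ** 0.5
        weights[concept] = 1.0 / (dist + 1e-6)  # avoid division by zero
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def rank_at(cursor):
    """Instantly re-rank cached images for the current cursor position."""
    w = cursor_weights(cursor, anchors)
    return sorted(
        image_scores,
        key=lambda img: sum(w[c] * image_scores[img][c] for c in w),
        reverse=True,
    )

print(rank_at((0.05, 0.05)))  # cursor near the "boat" anchor
```

Because the per-image scores are already cached at the client, only the weighted sum changes as the cursor moves, which is what makes the display instant.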

27 Achieve Breadth-Depth Flexibility by Dual-Space Navigation (demo). Breadth: quick scan of many query permutations. Depth: instant, deep exploration of a single permutation with fixed weights.

28 Content planning to remove redundancy: place concepts (e.g., boat, water) on the navigation map and merge their scored result lists; traditional list planning vs. guaranteed unique list planning.
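A minimal sketch of what "guaranteed unique" list planning could mean: when several concept result lists are laid out on the map, walk each list in rank order and skip any image that has already been placed, so no image occupies two positions. The image IDs and the round-robin policy are assumptions for illustration.

```python
def plan_unique(ranked_lists, per_list=2):
    """Place up to per_list images from each ranked list, never repeating."""
    placed = []
    seen = set()
    for ranked in ranked_lists:
        taken = 0
        for img in ranked:
            if taken == per_list:
                break
            if img not in seen:   # guarantee: each image placed only once
                seen.add(img)
                placed.append(img)
                taken += 1
    return placed

boat = ["img1", "img2", "img3"]
water = ["img2", "img1", "img4", "img5"]  # overlaps with the boat list
print(plan_unique([boat, water]))
```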

29 Latency Analysis: Workflow Pipeline. Execute the query and download the ranked concept list; package results with scores; transmit to the client; unpackage results at the interface; score images by concept weights and guarantee unique positions; download images to the interface in cached mode. The time to execute is disproportionate across stages!

30 Pipelined processing for low latency. Overlap concept formulation (“car”, “snow”) with map rendering; hide rendering latency during user interaction; coarse-to-fine concept map planning/rendering. Speed optimization is ongoing…

31 Challenge: user frustration in visual search. Research is needed to address it. Sensible search: (1) query, (2) visualize, (3) analyze.

32 Frequent reuse and dissemination of media objects on Web 2.0. Informed User: understand the trend/history in search results across channels and time (e.g., ABC, CNN, MSNBC, FOX, CCTV, TVBS, LBC, Al-Jazeera, Google.cn, Google.com, Yahoo News).

33 DVMM Lab, Columbia University Help Users Make Sense of Image Trends. Query: “John Kennedy”. Much reused content is found. How did it occur? What manipulations were applied? What was the distribution path? Does it correlate with perspective change?

34 Manipulation Correlated with Perspective. Raising the Flag on Iwo Jima (Joe Rosenthal, 1945) vs. an anti-Vietnam War adaptation (Ronald and Karen Bowen, 1969).

35 Reused Images Over Time

36 Question for Sensible Search: Insights from Plain Search Results? Issue a text query; get the top 1000 results from a web search engine; find duplicate images and merge them into clusters; rank the clusters (by size? by original rank?); explore the history/trend.

37 Duplicate Clusters Reveal Image Provenance: the biggest clusters contain iconic images; the smallest clusters contain marginal images.

38 A simple reranking application. Detect duplicate pairs across a video/image set; join duplicate pairs so that connected images fall in the same cluster; display only one image or shot per cluster (the others are duplicates/redundant); rank results by ordering the clusters (various approaches to ranking clusters).
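The steps above can be sketched with a union-find structure: duplicate pairs are joined into clusters, then one representative per cluster is kept in the original rank order. The image IDs and duplicate pairs are illustrative; the simple keep-first-ranked policy stands in for the "various approaches to ranking clusters".

```python
def rerank(ranked_images, duplicate_pairs):
    """Collapse duplicate clusters, keeping one image each in rank order."""
    parent = {img: img for img in ranked_images}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in duplicate_pairs:           # join duplicate pairs
        parent[find(a)] = find(b)

    shown_clusters = set()
    output = []
    for img in ranked_images:              # walk in original rank order
        root = find(img)
        if root not in shown_clusters:     # one image per cluster
            shown_clusters.add(root)
            output.append(img)
    return output

ranked = ["a", "b", "c", "d", "e"]
pairs = [("a", "c"), ("c", "e")]           # a, c, e are near-duplicates
print(rerank(ranked, pairs))
```

Because duplicates transitively join (a~c and c~e puts all three in one cluster), only the highest-ranked member of each cluster survives.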

39 Example: Yahoo! Image Search.

40 Example: Reranked Results More Diverse and Relevant

41 Deeper Analysis of Search Results: Visual Migration Map (VMM), from a duplicate cluster to a visual migration map (Kennedy and Chang, ACM Multimedia 2008).

42 Visual Migration Map (VMM): the “most original” image at the root, the “most divergent” at the leaves, with images derived through a series of manipulations. The VMM uncovers the history of image manipulation and plausible dissemination paths among content owners and users.

43 A ground-truth VMM is hard to get. Hypothesis: approximating the history is feasible by visual analysis. Detect manipulation types between two images; derive a large-scale history among a large image set.

44 Basic Image Manipulation Operators: original, scaled, cropped, gray, overlay, insertion. Each is observable by inspecting the pair, and each implies a direction (one image derived from the other). Other possible manipulations: color correction, multiple compression, sharpening, blurring.

45 Detecting Near-Duplicates. Duplicate detection is very useful and relatively reliable: matching SIFT points [Lowe, 1999] and graph matching [Zhang & Chang, 2004]. Remaining challenges: scalability/speed, video duplicates, and object (sub-image) duplicates (TRECVID08).

46 Automatic Manipulation Detectors. Objective: automatically detect various types of image manipulations. Context-free detectors rely only on the two images: copy, scaling, grayscale. Context-dependent detectors rely on cues from a plurality of images: cropping, insertion, overlay.

47 Simple Copy Detection: extract SIFT descriptors (local shape features), detect correspondences between the features of the two images, and count the total number of matched points.
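The matched-point count at the heart of copy detection can be sketched without a vision library. Real systems extract 128-dimensional SIFT descriptors (e.g., with OpenCV); here toy 2-D descriptors stand in, matched by nearest neighbor with a Lowe-style ratio test. All coordinates and the ratio threshold are illustrative.

```python
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def count_matches(desc_a, desc_b, ratio=0.8):
    """Count descriptors in A whose best match in B passes the ratio test."""
    matches = 0
    for d in desc_a:
        dists = sorted(euclidean(d, e) for e in desc_b)
        # accept only if the best match is clearly better than the runner-up
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            matches += 1
    return matches

img_a = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
img_b = [(0.1, 0.0), (1.0, 1.1), (9.0, 9.0)]  # first two features recur
print(count_matches(img_a, img_b))
```

A high match count suggests the two images share content (a copy or near-duplicate); an ambiguous feature, whose two best matches are similarly distant, is discarded by the ratio test.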

48 Scale Detection: draw a bounding box around the matching points in each image and compare the heights/widths of the boxes. The relative difference in box size can be used to normalize scales.
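The bounding-box comparison reduces to a few lines once the matched keypoint coordinates are known. The point sets below are illustrative; averaging the width and height ratios is one simple way to get a single scale factor.

```python
def bounding_box(points):
    """Width and height of the axis-aligned box around the matched points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(xs) - min(xs), max(ys) - min(ys)

def relative_scale(points_a, points_b):
    """Scale of image A relative to image B, from matched-point boxes."""
    wa, ha = bounding_box(points_a)
    wb, hb = bounding_box(points_b)
    return ((wa / wb) + (ha / hb)) / 2  # average width and height ratios

a = [(0, 0), (100, 0), (100, 50)]
b = [(0, 0), (50, 0), (50, 25)]   # same layout at half the size
print(relative_scale(a, b))
```

Dividing image A's coordinates by the returned factor normalizes both images to a common scale before the other detectors run.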

49 Color Removal. Simple case: the image is stored in a single-channel file. Other cases: the image is grayscale but stored in a 3-channel file; expect little difference in values across the channels of each pixel.
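The 3-channel case above amounts to checking that the R, G, and B values agree at (almost) every pixel. A sketch with illustrative pixel tuples; the tolerance absorbs small compression noise and is an assumed parameter.

```python
def looks_grayscale(pixels, tolerance=2):
    """True if every (R, G, B) pixel has near-identical channel values."""
    for r, g, b in pixels:
        if max(r, g, b) - min(r, g, b) > tolerance:
            return False
    return True

gray_stored_as_rgb = [(10, 10, 11), (200, 199, 200), (55, 55, 55)]
true_color = [(10, 80, 11), (200, 40, 200)]
print(looks_grayscale(gray_stored_as_rgb), looks_grayscale(true_color))
```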

50 More Challenging: Overlay Detection. Given two images, we can observe that a region differs between the two, but how do we know which is the original?

51 Cropping or Insertion? We can find differences in image area, but is the smaller area due to a crop, or is the larger area due to an insertion? (cropping / original / insertion)

52 Use Context from Many Duplicates: normalize scales and positions, then take the average value of each pixel to form a “composite” image.
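Once the duplicates are aligned, the composite is simply the per-pixel mean over the set. The tiny 2x2 grayscale "images" below are illustrative; real images would be full-size arrays after scale and position normalization.

```python
def composite(images):
    """Per-pixel mean over a set of aligned, same-size grayscale images."""
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [sum(img[r][c] for img in images) / len(images) for c in range(cols)]
        for r in range(rows)
    ]

aligned = [
    [[100, 100], [50, 50]],
    [[102, 98], [50, 50]],
    [[101, 102], [50, 50]],
]
print(composite(aligned))
```

Averaging washes out per-copy edits, so the composite approximates the "typical" content that the context-dependent detectors compare against.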

53 Cropping Detection with Context: in cropping, we expect the content outside the crop area to be consistent with the composite image (compare each image, its composite, and the residue).

54 Overlay Detection with Context: comparing images against the composite reveals portions that differ from the typical content; an image with divergent content may contain an overlay.

55 Insertion Detection with Context: in insertion, we expect the area outside the crop region to differ from the typical content.
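All three context-dependent detectors rest on the same residue computation: the per-pixel difference between an image and the composite localizes where the image departs from typical content. A large divergent region suggests an overlay; residue concentrated outside a smaller image's area points toward a crop or an insertion. The arrays, threshold, and decision rule below are illustrative assumptions, not the paper's exact detectors.

```python
def residue(image, comp):
    """Per-pixel absolute difference between an image and the composite."""
    return [
        [abs(p - q) for p, q in zip(img_row, comp_row)]
        for img_row, comp_row in zip(image, comp)
    ]

def divergent_fraction(image, comp, threshold=20):
    """Fraction of pixels that differ strongly from the typical content."""
    res = residue(image, comp)
    flat = [v for row in res for v in row]
    return sum(v > threshold for v in flat) / len(flat)

comp = [[100, 100], [50, 50]]
faithful = [[101, 99], [50, 52]]
overlaid = [[101, 99], [255, 255]]  # bottom row replaced by an overlay
print(divergent_fraction(faithful, comp), divergent_fraction(overlaid, comp))
```

A detector would then combine this fraction with the region's location and the images' relative areas to label the manipulation as overlay, crop, or insertion.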

56 Evaluation: Manipulation Detection. Context-free detectors have near-perfect performance; context-dependent detectors still have errors, and consistency checking can further improve the accuracy. Are these error-prone results sufficient to build manipulation histories?

57 Inferring Direction from Consistency: some candidate directions are not plausible.

58 Manipulation Direction from Consistency: other candidate directions are plausible.

59 Derive Manipulation among Multiple Images

60 Emerging Migration Map. Individual parent-child relationships give rise to a manipulation history. Relationships are only plausible (we don't know for sure), while absences of relationships are more concrete (we can be more certain). Redundancy arises from plausible derivations from both parents and ancestors of parents.

61 Manipulation Graph Simplification. It is plausible to derive an image from any of its ancestors; simplify the graph by deleting short-cut paths. This preserves the overall structure (source/sink nodes): from all plausible edits to a simplified structure.
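The short-cut deletion step resembles a transitive reduction: drop an edge u → v whenever v is already reachable from u through some other outgoing edge, which preserves sources and sinks. The tiny plausible-edit DAG below is illustrative, and this greedy pass is a sketch of the idea rather than the paper's exact algorithm.

```python
def reachable(graph, start, goal):
    """Depth-first reachability check in an adjacency-list DAG."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return False

def simplify(graph):
    """Delete short-cut edges whose target is reachable another way."""
    simplified = {u: list(vs) for u, vs in graph.items()}
    for u, vs in graph.items():
        for v in vs:
            others = [w for w in simplified[u] if w != v]
            if any(reachable(simplified, w, v) for w in others):
                simplified[u].remove(v)  # v is reachable without this edge
    return simplified

# original -> scaled -> cropped, plus a short-cut original -> cropped
plausible = {"original": ["scaled", "cropped"], "scaled": ["cropped"], "cropped": []}
print(simplify(plausible))
```

After the pass, only the chain original → scaled → cropped remains; the direct original → cropped edge is removed as a short-cut, while the source (original) and sink (cropped) are preserved.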

62 Experiments: select 22 iconic images, mostly of political figures, culled from Google Zeitgeist and TRECVID queries. Generate manipulation histories both through manual annotation and through fully automatic mechanisms.

63 Automatic Visual Migration Map: “originals” at source nodes, “manipulated” images at sink nodes.

64 “Originals” at source nodes “Manipulated” at sink nodes

65 Evaluation: Automatic Histories. High agreement with manually constructed histories: edits detected with 91% precision and 71% recall (automatically constructed vs. manually constructed graphs, differing by a few deleted and inserted edges).

66 Application: Summarizing Changes. Analyze the manipulation history graph structure to extract the most-original and the most highly-manipulated images.

67 Application: Finding Perspective. Survey image type and the corresponding perspective across many examples; find the correlation between heavy manipulation and negative/critical opinion.

68 Application: Finding Perspective. Joke website: “Every time I get stoned, I go and do something stupid!” “Osama Bashed Laden” (http://www.almostaproverb.com/captions2.html). Democratic National Committee site: “Capture Osama Bin Laden!” (http://www.democrats.org/page/petition/osama). Myspace profile from Malaysia: “Osama Bin Laden - My Idol of All Time!” (http://www.myspace.com/mamu_potnoi). Daily Excelsior newspaper: “Further Details of Bin Laden Plot Unearthed: ABC Report.” (http://www.dailyexcelsior.com/00jan31/inter.htm).

69 Application of VMM beyond Search: track the evolution of images along various dimensions. Temporal evolution: how does an image appear over time? Geographical/cultural dispersion: how does it disperse around the world? Reverse profiling: what is the history of a given image?

70 Temporal Evolution: periodic crawls can give us clues about when instances of images first appeared; use this to better inform parent-child relationships and to track changing interest.

71 Geographic/Cultural Dispersion

72 Reverse Profiling

73 Conclusions. Advocate a focus on Sensible Visual Search: address user frustration in interactive keyword search, in addition to work on content analytics. Develop utilities for informed users. Demo: the CuZero prototype, with instant query suggestion and rapid multi-concept result navigation.

74 Conclusions. Explore deeper insight with the Visual Migration Map: explore image reuse patterns to reveal image provenance; approximate image manipulation history from visual content alone; find “interesting” images at the source and sink nodes of the image history. There is a strong correlation with viewpoint change, and a useful role in socio-cultural information dissemination (Web 2.0).

75 References. CuZero: Eric Zavesky and Shih-Fu Chang, “CuZero: Low-Latency Query Formulation and Result Exploration for Concept-Based Visual Search,” ACM Multimedia Information Retrieval Conference, Oct. 2008, Vancouver, Canada. Internet image manipulation history: Lyndon Kennedy and Shih-Fu Chang, “Internet Image Archaeology: Automatically Tracing the Manipulation History of Photographs on the Web,” ACM Multimedia Conference, Oct. 2008, Vancouver, Canada.

