Collective Vision: Using Extremely Large Photograph Collections Mark Lenz CameraNet Seminar University of Wisconsin – Madison January 26, 2010 Acknowledgments:


Collective Vision: Using Extremely Large Photograph Collections Mark Lenz CameraNet Seminar University of Wisconsin – Madison January 26, 2010 Acknowledgments: These slides combine and modify slides provided by Yantao Zheng et al. (National University of Singapore/Google)

Introduction
Distributed Collaboration
Google Goggles
–Personal object recognition
World-Wide Landmark Recognition
Building Rome in a Day
–Distributed matching and reconstruction

Distributed Collaboration
Disaster or emergency
–Time is of the essence
Telecommunication networks down
No maps or GPS
What can we do to help ourselves and those around us?

Mobile Phones for Distributed Collaboration
Camera for collecting visual information
Ad-hoc wireless LAN
–e.g. Bluetooth
Goals:
–Determine location, exits and hazardous paths
Have I or someone else been here before?

Model Scenarios
Firefighters
Trapped miners
Natural disasters
–Large population exodus
–Building collapse
Multiple agents collaborating to traverse an unknown environment

Visual search using a picture as the query
Combination of algorithms
–Object recognition
–Optical character recognition
–Geo-location (GPS & compass)
Identify
–Books and products
–Businesses and landmarks

A World-Wide Landmark Recognition Engine with Web Learning
Goal: Build a landmark recognition engine at earth-scale

Challenge I
No list of the world's landmarks exists
We only have noisy data on the Internet:
–Tourist web articles
–Tourist photos
–Geographical location

Challenge II
How to learn landmark visual models
–Image search engines
–Photo-sharing websites

Challenge III
Efficiency
–Learning from enormous data
–Recognizing from a huge model

Discovering landmarks in the world
Two approaches:
–Photos on photo-sharing websites (geo-tagged)
–Online tourist articles (landmark names)

Learning landmarks from GPS-tagged photos
20M images from picasa.com and panoramio.com
Pipeline:
–Geo-clustering (is each geo-cluster a landmark?)
–Noisy image pool
–Visual clustering: graph clustering based on local features
–Validate by photo authors
–Analyze text tags: compute the frequency of n-grams of the text tags
Premise: landmark photos are geographically adjacent, visually similar, and uploaded by different users
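
The geo-clustering and tag-analysis steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slide does not say which clustering algorithm or distance threshold was used, so the single-link grouping, the 0.5 km radius (`max_km`), the function names, and the sample coordinates and tags are all assumptions.

```python
import math
from collections import Counter

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def geo_clusters(points, max_km=0.5):
    """Single-link grouping (assumed): photos closer than max_km share a cluster."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if haversine_km(points[i], points[j]) <= max_km:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def ngram_counts(tags, max_n=3):
    """Count word n-grams (n = 1..max_n) over all text tags of one cluster."""
    counts = Counter()
    for tag in tags:
        tokens = tag.lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return counts

# Hypothetical photos: two near the Eiffel Tower, two near the Colosseum.
photo_locations = [(48.8584, 2.2945), (48.8580, 2.2950), (41.8902, 12.4922), (41.8905, 12.4925)]
clusters = geo_clusters(photo_locations)

# Hypothetical tags attached to the photos of one cluster.
tag_counts = ngram_counts(["Eiffel Tower at night", "eiffel tower", "Paris Eiffel Tower", "my trip"])
```

The pairwise loop is quadratic, so a collection of millions of photos would need a spatial index or grid bucketing first; the sketch only shows the idea.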

Landmarks from GPS-tagged photos
~20 million GPS-tagged photos
140k geo-clusters and 14k visual clusters
2240 landmarks from 812 cities in 104 countries
–Biased distribution, mostly in Europe

Learning landmarks from tourist web articles
Explore the article corpus in wikitravel.com
Assume a geographical hierarchy
Landmark mining = named entity extraction
HTML is a structure tree
–Node: an HTML tag
–Value: text
Classify each tree node, based on semantic clues embedded in the document structure

Learning landmarks from tourist web articles
Heuristic rules:
–Nodes are in the "To See" or "See" section
–Nodes are children of "bullet list" nodes
–Nodes indicate bold font format
Extract all named entities as landmark candidates
Validate by visual models
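
A rough sketch of how the three heuristic rules could be applied to an article's HTML, using Python's standard `html.parser`. It assumes that all three rules must hold at once, that section titles appear in `h1` to `h3` tags, and that bold text uses `b` or `strong`; the class name and the sample markup are made up for illustration and say nothing about how wikitravel.com pages are actually structured.

```python
from html.parser import HTMLParser

class LandmarkCandidateParser(HTMLParser):
    """Collect bold text found inside list items of a "See" / "To See" section."""

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.heading_text = []
        self.in_see_section = False
        self.li_depth = 0
        self.bold_depth = 0
        self.bold_text = []
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.in_heading = True
            self.heading_text = []
        elif tag == "li":
            self.li_depth += 1
        elif tag in ("b", "strong"):
            self.bold_depth += 1
            if self.bold_depth == 1:
                self.bold_text = []

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self.in_heading = False
            title = "".join(self.heading_text).strip().lower()
            # Rule 1: remember whether the current section is "See" / "To See".
            self.in_see_section = title in ("see", "to see")
        elif tag == "li":
            self.li_depth = max(0, self.li_depth - 1)
        elif tag in ("b", "strong") and self.bold_depth > 0:
            self.bold_depth -= 1
            if self.bold_depth == 0:
                text = "".join(self.bold_text).strip()
                # Rules 2 and 3: bold text that sits inside a bullet-list item.
                if text and self.in_see_section and self.li_depth > 0:
                    self.candidates.append(text)

    def handle_data(self, data):
        if self.in_heading:
            self.heading_text.append(data)
        if self.bold_depth > 0:
            self.bold_text.append(data)

# Hypothetical article fragment.
sample_html = (
    "<h2>See</h2><p><b>Note</b> on opening hours</p>"
    "<ul><li><b>Eiffel Tower</b> - iron tower</li><li><b>Louvre</b></li></ul>"
    "<h2>Eat</h2><ul><li><b>Cafe X</b></li></ul>"
)
parser = LandmarkCandidateParser()
parser.feed(sample_html)
candidates = parser.candidates
```

The candidates are only names at this point; per the slide, they still have to be validated by visual models.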

Learning landmarks from tourist web articles
~7000 landmarks from 787 cities in 145 countries
More evenly distributed

Unsupervised learning of landmark images
Inputs: geo-clusters and landmarks from tour articles
Pipeline:
–Noisy image pool
–Visual clustering: clustering based on local features
–Validate and clean models
Premise: photos of the same landmark should be similar
The visual model validates landmarks!
Photo vs. non-photo classifier to filter out noisy images

Local Feature Detection
Find invariant and robust features
Create distinctive feature descriptions

Laplacian-of-Gaussian (LoG)
Scale-invariant edge detection
Gaussian image filter to remove noise
Laplacian filter to find areas of rapid change
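
To make the two filtering steps concrete, here is a small pure-Python sketch of a single-scale LoG filter. The Gaussian smoothing and the Laplacian are folded into one kernel, which is the usual way to apply them. The kernel radius, sigma value, and the synthetic test image are assumptions chosen for illustration; a scale-invariant detector would repeat this over several sigma values, which the sketch does not do.

```python
import math

def log_kernel(sigma, radius):
    """Sampled Laplacian-of-Gaussian kernel, shifted so its entries sum to zero."""
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            value = (-1.0 / (math.pi * sigma ** 4)) \
                * (1.0 - r2 / (2.0 * sigma ** 2)) \
                * math.exp(-r2 / (2.0 * sigma ** 2))
            row.append(value)
        kernel.append(row)
    size = (2 * radius + 1) ** 2
    mean = sum(sum(row) for row in kernel) / size
    # Zero-sum kernel: perfectly flat image regions give zero response.
    return [[v - mean for v in row] for row in kernel]

def filter2d(image, kernel):
    """Apply a symmetric kernel to a 2D list image, treating outside pixels as zero."""
    height, width = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += image[yy][xx] * kernel[dy + r][dx + r]
            out[y][x] = acc
    return out

# Synthetic 15x15 image: dark background with a bright 3x3 blob in the middle.
image = [[0.0] * 15 for _ in range(15)]
for y in range(6, 9):
    for x in range(6, 9):
        image[y][x] = 1.0

response = filter2d(image, log_kernel(sigma=1.5, radius=4))

# The strongest response (by magnitude) marks the candidate interest point.
peak = max(
    ((y, x) for y in range(15) for x in range(15)),
    key=lambda p: abs(response[p[0]][p[1]]),
)
```

With this sign convention a bright blob on a dark background gives a strongly negative response at its center, which is why the peak is taken on the absolute value.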

Local Feature Description
Invariant and distinctive description
Texture from a 118-dimensional Gabor wavelet descriptor

Object matching based on local features
Image representation:
–Interest points: Laplacian-of-Gaussian (LoG) filter
–Local feature: Gabor wavelets
Sim($I_i$, $I_j$) = image match score
match score = $1 - P_{FP}(I_i, I_j)$, where $P_{FP}(I_i, I_j)$ is the probability that the match of $I_i$ and $I_j$ is a false positive
$P_{FP}$ is based on the probability that at least $m$ out of $n$ features match, if $p$ is the probability of a feature match by chance
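
The chance-match probability mentioned on this slide can be worked out numerically. The sketch below assumes that feature matches are independent, so "at least m out of n features match by chance" is a binomial tail, and it further assumes, as a simplification, that the match score is one minus that tail. The function names and example values are illustrative; the slide does not give the exact model.

```python
from math import comb

def chance_match_probability(m, n, p):
    """P(at least m of n features match) when each matches by chance with probability p."""
    return sum(comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(m, n + 1))

def match_score(m, n, p):
    """Assumed simplification: score = 1 - probability the match arose by chance."""
    return 1.0 - chance_match_probability(m, n, p)

# Hypothetical example: 12 of 50 features match, each with a 5% chance of a random match.
example_score = match_score(12, 50, 0.05)
```

The more features agree relative to what chance would produce, the closer the score gets to 1.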

Constructing the match region graph
Image matching
Node is a match region
2 types of edges:
–Match edge: measures match confidence
–Overlap region edge: measures spatial overlap

Graph clustering on match regions
Match region graph → agglomerative hierarchical clustering → visual clusters
Distance between any two regions = length of the shortest path connecting them
Why hierarchical agglomerative clustering, and not K-means, GMM, etc.?
–Because we have no a priori knowledge of the number of clusters
Intuitively, each cluster should correspond to one aspect of a landmark
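
A minimal sketch of this step: shortest-path distances over a small match region graph, followed by agglomerative merging that stops at a distance threshold instead of a preset cluster count. The linkage rule (single link), the threshold `max_dist`, and the toy graph are assumptions; the slide only states the distance definition and the choice of agglomerative clustering.

```python
import math

def all_pairs_shortest_paths(n, edges):
    """Floyd-Warshall over an undirected weighted graph with nodes 0..n-1."""
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for u, v, w in edges:
        dist[u][v] = dist[v][u] = min(dist[u][v], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def agglomerative(dist, max_dist):
    """Merge the two closest clusters until the closest pair is farther than max_dist."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage (assumed): distance of the closest pair of members.
                link = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or link < best[0]:
                    best = (link, a, b)
        if best[0] > max_dist:
            break
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return sorted(clusters)

# Toy graph: regions 0-2 are tightly linked, regions 3-4 are tightly linked,
# and a single weak (long) edge connects the two groups.
edges = [(0, 1, 0.2), (1, 2, 0.3), (3, 4, 0.1), (2, 3, 2.0)]
dist = all_pairs_shortest_paths(5, edges)
visual_clusters = agglomerative(dist, max_dist=0.6)
```

The number of clusters falls out of the threshold, which is the property the slide gives as the reason for preferring this method over K-means or GMM.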

Visual cluster examples: Corcovado, Rio de Janeiro, Brazil; Acropolis, Athens, Greece

Visual cluster validation and cleaning
Validate by the authors or hosting websites of the images
–These reflect the popular appeal of landmarks
Filter out non-photographic images, like maps and logos
–Train an AdaBoost classifier
–Features: color histogram, Hough transform, etc.
Clean clusters by detecting large-area human faces
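
The author-based validation can be illustrated with a short sketch: a cluster is kept only if its photos come from enough distinct uploaders. The threshold of three authors, the function name, and the sample data are assumptions, since the slide does not state the actual cutoff.

```python
def validate_clusters(clusters, min_authors=3):
    """Keep clusters whose photos come from at least min_authors distinct uploaders."""
    kept = {}
    for name, photos in clusters.items():
        authors = {author for author, _ in photos}
        if len(authors) >= min_authors:
            kept[name] = photos
    return kept

# Hypothetical clusters of (author, photo) pairs.
clusters = {
    "tower": [("ann", "a.jpg"), ("bob", "b.jpg"), ("cho", "c.jpg")],
    "my_backyard": [("dan", "d1.jpg"), ("dan", "d2.jpg"), ("dan", "d3.jpg")],
}
validated = validate_clusters(clusters)
```

A cluster photographed many times by one person is more likely a private scene than a landmark, which is why distinct authors are counted rather than photos.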

Efficiency issues
Issue 1: learning landmark images (21.4M photos)
–Parallel computing to learn true landmark images
–Efficient hierarchical clustering
Issue 2: recognizing the landmark in a query image (recognition engine: ~5000 landmarks)
–Indexing local features for matching: kd-tree indexing
–Query time: ~0.2 sec on a P4 computer
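
For readers unfamiliar with kd-tree indexing, the following sketch builds a tree over feature vectors and answers exact nearest-neighbor queries. It is a textbook version under simplifying assumptions: 2-D toy points stand in for high-dimensional local features, and the search is exact, whereas a system working with descriptors of over a hundred dimensions would likely use an approximate variant. All names and data are illustrative.

```python
def build_kdtree(points, depth=0):
    """Recursively split points on alternating axes; returns nested dicts or None."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, best=None):
    """Exact nearest neighbor: descend toward the query, then backtrack if needed."""
    if node is None:
        return best
    if best is None or squared_distance(query, node["point"]) < squared_distance(query, best):
        best = node["point"]
    axis = node["axis"]
    diff = query[axis] - node["point"][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Only search the far side if the splitting plane is closer than the best match so far.
    if diff ** 2 < squared_distance(query, best):
        best = nearest(far, query, best)
    return best

# Toy 2-D "feature vectors".
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
closest = nearest(tree, (9, 2))
```

The pruning test is what saves time: whole subtrees are skipped whenever they cannot contain anything closer than the current best match.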

Experiments: statistics of learned landmarks
Table columns: From photos, From articles, Total
Table rows: Landmark #, City #, Country #
Small overlap: 174 landmarks shared
China: 101 landmarks. Under-counted! Why?
U.S.: high Internet penetration rate and enormous tour sites

Evaluation of landmark image learning
Randomly select 1000 visual clusters
68 (6.8%) are outliers: maps, logos, human photos
Apply the photographic vs. non-photographic classifier
37 outliers remain: 6.8% => 3.7%

Evaluation of landmark recognition
Positive testing images:
–728 images from 124 landmarks
Negative testing images:
–Caltech-256 (30,524) + Pascal VOC 07 (9,986) = 40,510 images
For positive images:
–417 images detected to be landmarks
–337/417 (80.8%) are correct
–Identification rate: 337/728 (46.3%)
For negative images:
–463 images detected to be landmarks
–False acceptance rate: 1.1%
Landmarks can be similar!
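
The percentages on this slide follow directly from the reported counts, as the short calculation below shows. The variable names are illustrative; the counts are the ones given on the slide.

```python
# Counts reported on the slide.
positive_images = 728
detected_as_landmark = 417
correctly_identified = 337
negative_images = 30524 + 9986      # Caltech-256 + Pascal VOC 07
false_acceptances = 463

# Share of detections on positive images that name the right landmark.
detection_accuracy = correctly_identified / detected_as_landmark

# Share of all positive images that are correctly identified.
identification_rate = correctly_identified / positive_images

# Share of negative images wrongly accepted as landmarks.
false_acceptance_rate = false_acceptances / negative_images

summary = {
    "detection_accuracy_pct": round(100 * detection_accuracy, 1),
    "identification_rate_pct": round(100 * identification_rate, 1),
    "false_acceptance_rate_pct": round(100 * false_acceptance_rate, 1),
}
```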

Falsely detected images
Match is technically correct, but the match region is not a landmark
–A problem of model generation
Match is technically false, due to visual similarity
–A problem of image features and the matching mechanism