Problems in large-scale computer vision
David Crandall
School of Informatics and Computing, Indiana University



Research questions
Given huge collections of images online, how can we analyze images and non-visual metadata to:
– Help users organize, browse, and search?
– Mine information about the state of the world and human behavior?

Common computational problems
1. Image-by-image (e.g. recognition)
   – Large-scale, but easily parallelizable
2. Iterative algorithms (e.g. learning)
   – Sometimes few but long-running iterations
   – Sometimes many lightweight iterations
3. Inference on graphs (e.g. reconstruction, learning)
   – Small graphs with huge label spaces
   – Large graphs with small label spaces
   – Large graphs with large label spaces

Scene classification
E.g.: Find images containing snow in a collection of ~100 million images
– Typical approach: extract features and run a classifier (typically an SVM) on each image
– We use Hadoop, typically with a trivial reducer, images stored in giant HDFS sequence files, and C++ Map-Reduce bindings
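The map-only pattern on this slide can be sketched in a few lines of Python. This is only an illustration, not the C++ Map-Reduce bindings the slide mentions: `extract_features`, the weights, and the record layout are hypothetical, and a plain linear decision function stands in for a trained SVM.

```python
from typing import Iterable, Iterator, List, Tuple

def extract_features(pixels: List[float]) -> List[float]:
    """Hypothetical stand-in for a real feature extractor: mean brightness and value range."""
    return [sum(pixels) / len(pixels), max(pixels) - min(pixels)]

def linear_score(weights: List[float], bias: float, features: List[float]) -> float:
    """Decision value w . x + b, the form a linear SVM evaluates at test time."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def map_images(records: Iterable[Tuple[str, List[float]]],
               weights: List[float], bias: float) -> Iterator[Tuple[str, float]]:
    """Mapper: every (image_id, pixels) record is scored independently of all others."""
    for image_id, pixels in records:
        yield image_id, linear_score(weights, bias, extract_features(pixels))

def identity_reduce(pairs: Iterable[Tuple[str, float]]) -> Iterator[Tuple[str, float]]:
    """Trivial reducer: pass the mapper output through unchanged."""
    yield from pairs

# Toy records and made-up weights, purely for illustration.
records = [("img_a", [0.9, 0.95, 0.92]), ("img_b", [0.1, 0.6, 0.2])]
results = dict(identity_reduce(map_images(records, weights=[2.0, -1.0], bias=-1.0)))
snow_ids = [image_id for image_id, score in results.items() if score > 0]
```

Because no record depends on any other, the job scales by adding mappers, which is what makes this class of problem easy to parallelize.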

Geolocation
Given an image, where on Earth was it taken?
– Match against thousands of place models, or against hundreds of attribute classifiers (e.g. indoor vs. outdoor, city vs. rural, etc.)
– Again use Hadoop with a trivial reducer

Learning these models
Many recognition approaches use “bags-of-words”
– Using a vector space model over “visual words”
To learn, we need to:
1. Generate a vocabulary of visual words (e.g. with k-means)
2. Extract features from training images
3. Learn a classifier
Our computational approach:
1. For k-means, use iterative Map-Reduce (Twister; J. Qiu)
2. For feature extraction, use Map-Reduce with a trivial reducer
3. For learning classifiers, use off-the-shelf packages (can be quite slow)
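Step 1 of the computational approach, k-means as iterative Map-Reduce, can be sketched as follows. This is a single-process illustration of the pattern, not Twister's API: the function names are hypothetical, and the Python loop at the bottom plays the role of the iterative runtime that re-broadcasts centroids each round.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]

def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_map(point: Point, centroids: Sequence[Point]) -> Tuple[int, Point]:
    """Map: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda k: sq_dist(point, centroids[k]))
    return nearest, point

def kmeans_reduce(assigned: List[Point]) -> Point:
    """Reduce: the new centroid is the mean of the points assigned to it."""
    n = len(assigned)
    return tuple(sum(coords) / n for coords in zip(*assigned))

def kmeans_iteration(points: Sequence[Point], centroids: Sequence[Point]) -> List[Point]:
    """One Map-Reduce round: map every point, group by key, reduce each group."""
    groups: Dict[int, List[Point]] = defaultdict(list)
    for p in points:
        key, value = kmeans_map(p, centroids)
        groups[key].append(value)
    # An empty cluster keeps its previous centroid (one simple policy among several).
    return [kmeans_reduce(groups[k]) if groups[k] else centroids[k]
            for k in range(len(centroids))]

# Toy data; in the visual-words setting the points would be local feature descriptors.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids = [(1.0, 1.0), (9.0, 9.0)]
for _ in range(5):  # the driver loop supplies the iteration
    centroids = kmeans_iteration(points, centroids)
```

Only the small centroid list changes between rounds, while the large point set stays fixed, which is why a runtime built for iteration suits this step better than relaunching independent jobs.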

Inference on graphical models
Statistical graphical models are widely used in vision
– Basic idea: vertices are variables, some known and some unknown; edges are probabilistic relationships
– Inference is NP-hard in general
Many approximation algorithms are based on message passing
– E.g. loopy discrete belief propagation
– Number of messages is proportional to the number of edges in the graph
– Messages can be large: size depends on the variable label space
– Number of iterations depends (roughly) on the diameter of the graph
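To make the message-passing bullets concrete, here is a minimal sketch of discrete belief propagation in its min-sum (energy minimization) form. The min-sum variant, the function name, and the toy costs are assumptions chosen for illustration; the slides do not say which variant or potentials were used.

```python
from typing import Callable, Dict, List, Tuple

def min_sum_bp(unary: List[List[float]], edges: List[Tuple[int, int]],
               pairwise: Callable[[int, int], float], iterations: int) -> List[int]:
    """Synchronous min-sum belief propagation; returns one label per vertex."""
    num_labels = len(unary[0])
    neighbors: Dict[int, List[int]] = {i: [] for i in range(len(unary))}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    # Two directed messages per undirected edge, each a vector with one entry per label.
    messages = {(i, j): [0.0] * num_labels for i in neighbors for j in neighbors[i]}
    for _ in range(iterations):
        updated = {}
        for (i, j) in messages:
            # Cost of each label at i, using every incoming message except the one from j.
            h = [unary[i][a] + sum(messages[(k, i)][a] for k in neighbors[i] if k != j)
                 for a in range(num_labels)]
            # Naive minimization over the sender's label: O(|L|^2) work per message.
            updated[(i, j)] = [min(h[a] + pairwise(a, b) for a in range(num_labels))
                               for b in range(num_labels)]
        messages = updated
    labels = []
    for i in range(len(unary)):
        belief = [unary[i][a] + sum(messages[(k, i)][a] for k in neighbors[i])
                  for a in range(num_labels)]
        labels.append(min(range(num_labels), key=belief.__getitem__))
    return labels

# Toy chain 0 - 1 - 2 with two labels and a penalty of 2 for disagreeing neighbors.
unary = [[0.0, 5.0], [2.0, 1.0], [0.0, 5.0]]
labels = min_sum_bp(unary, edges=[(0, 1), (1, 2)],
                    pairwise=lambda a, b: 0.0 if a == b else 2.0, iterations=3)
```

In this toy chain the middle vertex prefers label 1 on its own, but the messages from its neighbors pull it to label 0. The message dictionary has two entries per edge and each entry has one value per label, which matches the scaling noted on the slide.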

Pose and part-based recognition
Represent objects in terms of parts
Can be posed as a graphical model inference problem
– Small number of variables (vertices) and constraints (edges), but large label space (millions++)
– We use a single-node multi-threaded implementation, with barriers between iterations
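The "barriers between iterations" structure can be sketched with Python's `threading.Barrier`. This shows only the synchronization pattern: `run_synchronous` and the toy `rotate` update are hypothetical, and CPython's global interpreter lock means this sketch would not itself speed up CPU-bound work the way a native multi-threaded implementation would.

```python
import threading
from typing import Callable, List

def run_synchronous(update: Callable[[List[float], int], float],
                    initial: List[float], num_workers: int, iterations: int) -> List[float]:
    """Run `iterations` rounds; in each round every entry is recomputed from the previous round."""
    buffers = {"read": list(initial), "write": list(initial)}

    def swap_buffers() -> None:
        # Runs once per round, after all workers arrive and before any is released.
        buffers["read"], buffers["write"] = buffers["write"], buffers["read"]

    barrier = threading.Barrier(num_workers, action=swap_buffers)

    def worker(worker_id: int) -> None:
        for _ in range(iterations):
            read, write = buffers["read"], buffers["write"]
            # Each worker owns a strided slice of the entries.
            for idx in range(worker_id, len(read), num_workers):
                write[idx] = update(read, idx)
            barrier.wait()  # nobody starts the next round until everyone has finished this one

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffers["read"]

# Toy update: each entry takes its left neighbor's previous value (a rotation).
def rotate(values: List[float], idx: int) -> float:
    return values[idx - 1]

final_state = run_synchronous(rotate, [1.0, 2.0, 3.0, 4.0], num_workers=2, iterations=3)
```

The double buffer plus barrier guarantees that every update in a round reads only values from the previous round, which is the property synchronous message passing relies on.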

Fine-grained object recognition
Classify amongst similar objects (e.g. species of birds)
– How can we learn discriminative properties of these objects automatically?
– Model each training image as a node, with edges between all pairs; the goal is to label each image with a feature that is found in all positive examples and no negative examples
We use an off-the-shelf solver
– With some additional multi-threading on a single node; still very slow

Large-scale 3D reconstruction

Pose as inference problem
View reconstruction as statistical inference over a graphical model
– Vertices are cameras and points
– Edges are relative camera/point correspondences (estimated through point matching)
– Inference: label each image with a camera pose and each point with a 3D position, such that constraints are satisfied

Computation
Our graphs have ~100,000 vertices, ~1,000,000 edges, ~100,000 possible discrete labels
– Reduce computation using exact algorithmic tricks (min convolutions) from O(|E| |L|^2) to O(|E| |L|)
– Huge amount of data: total message size >1 TB per iteration
Parallelize using iterative MapReduce
– Hadoop plus shell scripts for iteration
– Mappers take in messages from the last iteration and compute outgoing messages
– Reducers collate and route messages
– Messages live on HDFS between iterations
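The reduction from quadratic to linear time per message can be illustrated with a min convolution under a linear pairwise cost. This is a sketch under an assumption the slides do not state: labels are ordered along one dimension and the pairwise cost is c·|a − b|. Under that assumption, the two-pass distance-transform trick gives exactly the same message as the naive double loop.

```python
from typing import List

def min_convolution_naive(h: List[float], c: float) -> List[float]:
    """m[b] = min over a of h[a] + c * |a - b|, computed directly in O(|L|^2)."""
    n = len(h)
    return [min(h[a] + c * abs(a - b) for a in range(n)) for b in range(n)]

def min_convolution_linear(h: List[float], c: float) -> List[float]:
    """Same result in O(|L|) using a forward and a backward pass."""
    m = list(h)
    # Forward pass: best candidate coming from a lower label.
    for b in range(1, len(m)):
        m[b] = min(m[b], m[b - 1] + c)
    # Backward pass: best candidate coming from a higher label.
    for b in range(len(m) - 2, -1, -1):
        m[b] = min(m[b], m[b + 1] + c)
    return m

h = [5.0, 0.0, 4.0, 9.0, 1.0]  # toy per-label costs
fast = min_convolution_linear(h, c=2.0)
slow = min_convolution_naive(h, c=2.0)
```

With ~100,000 labels the naive form needs on the order of 10^10 operations per message while the two-pass form needs on the order of 10^5, so an exact trick of this kind is what makes the per-iteration cost tractable.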
