Object recognition (part 2) CSE P 576 Larry Zitnick


Support Vector Machines (Nov 23rd, 2001). Modified from the slides by Dr. Andrew W. Moore. Copyright © 2001, 2003, Andrew W. Moore.

Linear Classifiers
f(x, w, b) = sign(w·x − b), mapping input x to an estimated label y (+1 or −1). How would you classify this data? (Repeated for several example datasets.) Any of these would be fine... but which is best?

Classifier Margin
Define the margin of a linear classifier as the width that the boundary could be increased by before hitting a datapoint.

Maximum Margin
The maximum margin linear classifier is the linear classifier with the, um, maximum margin. This is the simplest kind of SVM (called a linear SVM, or LSVM). Support vectors are those datapoints that the margin pushes up against.

Why Maximum Margin?
1. Intuitively this feels safest.
2. If we’ve made a small error in the location of the boundary (it’s been jolted in its perpendicular direction), this gives us the least chance of causing a misclassification.
3. LOOCV is easy, since the model is immune to removal of any non-support-vector datapoints.
4. There’s some theory (using VC dimension) that is related to (but not the same as) the proposition that this is a good thing.
5. Empirically it works very, very well.
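The slides describe the max-margin classifier only pictorially. As a minimal sketch (toy data and hyperparameters are assumptions, not the lecture's code), a linear classifier of the form f(x, w, b) = sign(w·x − b) can be trained by stochastic subgradient descent on the L2-regularized hinge loss, which approximates the max-margin objective:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200, seed=0):
    # Minimize the hinge loss max(0, 1 - y (w.x - b)) plus an L2
    # penalty on w; shrinking ||w|| while keeping the data correctly
    # classified is what maximizes the margin.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            if y[i] * (X[i] @ w - b) < 1:   # inside the margin: hinge is active
                w += lr * (y[i] * X[i] - lam * w)
                b -= lr * y[i]
            else:                           # correct with margin: only shrink w
                w -= lr * lam * w
    return w, b

def predict(w, b, X):
    return np.sign(X @ w - b)
```

On well-separated data this recovers a separating boundary; a real LSVM solver would optimize the constrained margin objective exactly.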

Nonlinear Kernel (I)

Nonlinear Kernel (II)
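The nonlinear-kernel slides are figures only. As a hedged illustration of the idea they show, here is a kernel perceptron (a simpler dual-form relative of the kernel SVM, not the method on the slides) with an RBF kernel; in dual form the classifier only needs kernel evaluations, so it can separate data that no linear classifier can:

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    # Gaussian (RBF) kernel: an inner product in an implicit,
    # very high-dimensional feature space.
    return np.exp(-gamma * np.sum((a - b) ** 2))

def kernel_perceptron(X, y, kernel, epochs=20):
    # Dual form: the decision function is a kernel-weighted sum over
    # training points, so the feature map is never computed explicitly.
    alpha = np.zeros(len(X))
    for _ in range(epochs):
        for i in range(len(X)):
            s = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(len(X)))
            if y[i] * s <= 0:   # mistake (or tie): strengthen this point
                alpha[i] += 1
    return alpha

def kernel_predict(X, y, alpha, kernel, x):
    s = sum(alpha[j] * y[j] * kernel(X[j], x) for j in range(len(X)))
    return 1 if s >= 0 else -1
```

For example, it fits the XOR pattern, which is not linearly separable.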

Caltech-101: Drawbacks
– Smallest category size is 31 images
– Too easy? Left-right aligned, rotation artifacts
– Performance will soon saturate

Antonio Torralba generated these average images of the Caltech 101 categories

Jump to Nicolas Pinto’s slides. (page 29)

Objects in Context. R. Gokberk Cinbis, MIT Object Recognition and Scene Understanding.

Papers
– A. Torralba. Contextual priming for object detection. IJCV.
– A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora and S. Belongie. Objects in Context. ICCV 2007.

Object Detection: Probabilistic Framework
Single-object likelihood: object presence at a particular location/scale, given all image features (local/object and scene/context), decomposes as v = v_Local + v_Contextual.

Contextual Reasoning
– Scene-centered (2D reasoning): contextual priming for object detection.
– Object-centered: Objects in Context.
– 2.5D / 3D reasoning: geometric context from a single image, i.e. surface orientations w.r.t. the camera (support, vertical, sky).

Preview: Contextual Priming for Object Detection
1. Input test image.
2. Correlate with many filters.
3. Using previously collected statistics about filter outputs, predict information about objects:
– Where can I find the objects easily?
– Which objects do I expect to see? (people, car, chair)
– How large do I expect the objects to be?

Contextual Priming for Object Detection: Probabilistic Framework
– Local measurements (a lot in the literature)
– Contextual features

Contextual Priming for Object Detection: Contextual Features
– Gabor filters at 4 scales and 6 orientations.
– Use PCA on the filter output images to reduce the number of features (< 64).
– Use a mixture of Gaussians to model the probabilities (other alternatives include KNN, Parzen windows, logistic regression, etc.).
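The feature pipeline above can be sketched roughly as follows. This is an illustrative stand-in for the paper's gist-style features, not its actual code: the filter parameters, the coarse pooling, and the output dimensions are all assumptions:

```python
import numpy as np

def gabor(size, scale, theta):
    # Real part of a Gabor filter: Gaussian envelope times a cosine
    # grating at orientation theta.
    r = np.arange(size) - size // 2
    xx, yy = np.meshgrid(r, r)
    xr = xx * np.cos(theta) + yy * np.sin(theta)
    return np.exp(-(xx**2 + yy**2) / (2 * scale**2)) * np.cos(2 * np.pi * xr / scale)

def gist_features(img, scales=(4, 8, 16, 32), n_orient=6, grid=4):
    # Filter the image with the Gabor bank (via FFT) and average the
    # response magnitude over a coarse spatial grid (rows only, for brevity).
    feats = []
    F = np.fft.fft2(img)
    for s in scales:
        for k in range(n_orient):
            g = gabor(img.shape[0], s, k * np.pi / n_orient)
            resp = np.abs(np.fft.ifft2(F * np.fft.fft2(g, img.shape)))
            for block in np.array_split(resp, grid):
                feats.append(block.mean())
    return np.array(feats)

def pca(X, k):
    # Project the pooled filter responses onto the top-k principal
    # components to get a low-dimensional context feature.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T
```

The mixture-of-Gaussians density over these PCA features would then be fit separately, per object class.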

Contextual Priming for Object Detection: Object Priming Results (o1 = people, o2 = furniture, o3 = vehicles, and o4 = trees)

Contextual Priming for Object Detection: Focus of Attention Results (heads)

Contextual Priming for Object Detection: Conclusions
– Demonstrates the relation between low-level features and scene/context.
– Can be seen as computational evidence for the (possible) existence of low-level-feature-based biological attention mechanisms.
– Also a warning: it is unclear whether an object recognition system truly understands the object, or simply works off of many background features.

Preview: Objects in Context
1. Input test image.
2. Segment the image.
3. Classify each segment (find label probabilities) using only local information, e.g.: {building, boat, motorbike}, {building, boat, person}, {water, sky}, {road}.
4. Find the most consistent labeling according to object co-occurrences and the local label probabilities: boat, building, water, road.

Objects in Context: Local Categorization
– Extract random patches on zero-padded segments.
– Calculate SIFT descriptors.
– Use a bag-of-features model. Training: cluster the training patches (hierarchical k-means, K = 10x3); build a histogram of words in each segment; use an NN classifier (returns a sorted list of categories).
– Each segment is classified independently.
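The local-categorization step can be sketched as a minimal bag-of-features pipeline. This is a simplification, not the paper's code: it uses flat k-means with deterministic farthest-point initialization instead of hierarchical k-means, and assumes descriptors (e.g. SIFT) are already computed:

```python
import numpy as np

def kmeans(X, k, iters=20):
    # Deterministic farthest-point initialization, then Lloyd's updates.
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(-1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def bof_histogram(descriptors, centers):
    # Quantize each descriptor to its nearest visual word and build a
    # normalized word histogram for the segment.
    words = np.argmin(((descriptors[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    h = np.bincount(words, minlength=len(centers)).astype(float)
    return h / h.sum()

def nn_classify(h, train_hists, train_labels):
    # Nearest-neighbor classification over segment histograms.
    d = ((train_hists - h) ** 2).sum(axis=1)
    return train_labels[np.argmin(d)]
```

Each test segment's histogram is matched independently against the training histograms, exactly as the slide notes.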

Objects in Context: Contextual Refinement
– Contextual model based on co-occurrences.
– Try to find the most consistent labeling, one with high posterior probability and high mean pairwise interaction; a CRF is used for this purpose.
– The objective combines the independent segment classifications with the mean interaction of all label pairs, where Φ(i,j) is basically the observed label co-occurrence in the training set.
– Result: boat, building, water, road.
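The refinement objective can be sketched as follows. This brute-force search over joint labelings is a stand-in for the paper's CRF inference, and the labels and Φ values in the usage example are made up for illustration:

```python
import numpy as np
from itertools import product

def refine(local_probs, phi, alpha=1.0):
    # local_probs: one dict {label: probability} per segment.
    # phi[a][b]: co-occurrence affinity between labels a and b.
    # Score each joint labeling by its local log-likelihood plus the
    # mean pairwise interaction, and return the best labeling.
    best, best_score = None, -np.inf
    candidates = [list(p) for p in local_probs]
    for labeling in product(*candidates):
        score = sum(np.log(p[l]) for p, l in zip(local_probs, labeling))
        pairs = [(a, b) for i, a in enumerate(labeling) for b in labeling[i + 1:]]
        if pairs:
            score += alpha * np.mean([phi[a][b] for a, b in pairs])
        if score > best_score:
            best, best_score = labeling, score
    return best
```

With a strong boat-water affinity, context can flip a segment whose local evidence weakly favors "building" over to "boat", which is the behavior the slides illustrate.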

Objects in Context: Learning Context
– Using labeled image datasets (MSRC, PASCAL).
– Using labeled text-based data (Google Sets), which contains lists of related items. A large set turns out to be useless (anything is related)!

52 Objects in Context: Results

“Objects in Context” – Limitations: context modeling
– Categorization without context: local information only.
– With co-occurrence context: P(person, dog) > P(person, cow).
(Bonus Q: how did it handle the background?)

“Objects in Context” – Limitations: context modeling (continued)
– With co-occurrence context: P(person, horse) > P(person, dog). But why? Isn’t it only a dataset bias? We saw in the previous example that P(person, dog) is common too.

“Objects in Context”: object-object or stuff-object?
“Background”-like stuff-object co-occurrences (such as water-boat) appear to help more than “foreground” object-object co-occurrences (such as person-horse), though car-person-motorbike is still useful in PASCAL. Stuff and stuff-like labels have high co-occurrences with other labels.

“Objects in Context” – Limitations: segmentation
– A few segments or many? How do we select a good segmentation among multiple segmentations?
– A good segmentation can make object recognition and contextual reasoning (due to stuff detection) much easier.

“Objects in Context” – Limitations
– No cue from unknown objects.
– No spatial-relationship reasoning.
– The object detection part depends heavily on good segmentations.
– Improvements from object co-occurrences are demonstrated on images where many labels are already correct. How good is the model, then?

Contextual Priming vs. Objects in Context
– Context source: scene -> object vs. {object, stuff} -> object.
– Training data: simpler (only the target object’s labels are enough) vs. may need a huge amount of labeled data.
– Generality: scene information is view-dependent (due to gist) vs. potentially more generic than scene -> object, given a very good model.
– Detector independence: object-detector independent vs. object-detector independent in theory, but relies on segmentation (a minus: segmentation can be unreliable; a plus: segmentation makes stuff easier to detect).

Finding the Weakest Link in Person Detectors. Larry Zitnick, Microsoft Research; Devi Parikh, TTI Chicago.

Object recognition: we’ve come a long way… (Fischler and Elschlager, 1973; Dollar et al., BMVC 2009)

Still a ways to go… (Dollar et al., BMVC 2009)

Part-based person detector (Felzenszwalb et al., 2005; Hoiem et al., 2006). Four main components:
1. Feature selection (color, intensity, edges)
2. Part detection
3. Spatial model
4. NMS / context
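The part-detection and spatial-model components above combine in a star-shaped scoring function. The following is a hedged sketch in the spirit of Felzenszwalb et al.'s part-based models, not the detector's actual code; the response maps, anchors, and deformation weights are made up for illustration:

```python
import numpy as np

def score_at(root_resp, part_resps, anchors, defo_w, y, x):
    # Score a root placement at (y, x): the root filter response plus,
    # for each part, its best placement -- the part filter response
    # minus a quadratic deformation cost measured from the part's
    # anchor position relative to the root.
    s = root_resp[y, x]
    H, W = root_resp.shape
    for resp, (ay, ax), w in zip(part_resps, anchors, defo_w):
        best = -np.inf
        for py in range(H):
            for px in range(W):
                dy, dx = py - (y + ay), px - (x + ax)
                best = max(best, resp[py, px] - w * (dy * dy + dx * dx))
        s += best
    return s
```

The spatial model is exactly the deformation term: parts are allowed to drift from their anchors, but at a cost, so a high total score requires both strong part responses and a plausible configuration. (Real implementations replace the inner loops with a distance transform.)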

How can we help?
– Humans supply training data… 100,000s of labeled images.
– We design the algorithms… going on 40 years.
– Can we use humans to debug? Help me!

Human debugging: replace one pipeline component at a time (part detection, spatial model, or NMS / context) with humans via Amazon Mechanical Turk, keeping the remaining components fixed.

Human performance (PASCAL VOC dataset): humans ~90% average precision; machines ~46% average precision.

Human debugging: part detection. Humans are shown low-resolution (20x20 pixel) patches and asked: is it a head, torso, arm, leg, foot, hand, or nothing?

Part detections (result figures): humans vs. machine for each part (head, torso, arm, hand, leg, foot, person), at high and low resolution.

AP results

Human debugging: spatial model. Given part detections (at high or low resolution), humans judge: person or not a person?

Spatial model

Context/NMS vs. NMS / context

Conclusion