Presentation transcript: Chapter 14, Recognition
1 Chapter 14: Recognition
- Scene understanding / visual object categorization
- Pose clustering
- Object recognition by local features
- Image categorization
- Bag-of-features models
- Large-scale image search
2 Scene understanding (1)
Scene categorization:
- outdoor/indoor
- city/forest/factory/etc.
8 Visual object categorization (4)
Recognition is all about modeling variability:
- camera position
- illumination
- shape variations
- within-class variations
9 Visual object categorization (5)
Within-class variations (why are they chairs?)
10 Pose clustering (1)
Working in transformation space. Main ideas:
- Generate many hypotheses for the transformation between image and model, each built from a tuple of image and model features
- Correct transformation hypotheses appear many times
Main steps:
- Quantize the space of possible transformations
- For each tuple of image and model features, solve for the optimal transformation that aligns the matched features
- Record a "vote" in the corresponding transformation space bin
- Find the "peak" in transformation space
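The voting steps above can be sketched in a few lines. This is only a minimal illustration, not the method of any cited paper: it assumes the transformation is a pure 2D translation, so that a single image/model point pair is enough to fix a hypothesis. The function name `pose_cluster_translation` and the `bin_size` value are hypothetical.

```python
from collections import Counter
from itertools import product

def pose_cluster_translation(model_pts, image_pts, bin_size=5.0):
    """Vote over quantized 2D translations; return the peak bin centre and its votes."""
    votes = Counter()
    # Every (model, image) feature pair yields one translation hypothesis.
    for (mx, my), (ix, iy) in product(model_pts, image_pts):
        tx, ty = ix - mx, iy - my
        # Quantize the hypothesis and record a vote in that bin.
        key = (round(tx / bin_size), round(ty / bin_size))
        votes[key] += 1
    # The peak in transformation space is the bin with the most votes.
    (bx, by), count = votes.most_common(1)[0]
    return (bx * bin_size, by * bin_size), count

# Hypothetical data: a rectangular model and the same shape shifted by (30, 40),
# plus two clutter points that do not belong to the model.
model = [(0, 0), (10, 0), (10, 20), (0, 20)]
image = [(30, 40), (40, 40), (40, 60), (30, 60), (3, 97), (88, 12)]
translation, n_votes = pose_cluster_translation(model, image)
```

Wrong pairings scatter their votes over many bins, while the four correct pairings all land in the same bin, which is why the peak identifies the pose.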
11 Pose clustering (2)
Example: Rotation only. A pair of one scene segment and one model segment suffices to generate a transformation hypothesis.
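A rough sketch of the rotation-only case: each (scene segment, model segment) pair votes for the angle between the two segment directions. It assumes directed segments (so there is no 180 degree ambiguity) and a hypothetical bin width of 5 degrees; the helper names are illustrative only.

```python
import math
from collections import Counter

def segment_angle(seg):
    """Direction of a directed segment ((x0, y0), (x1, y1)) in degrees."""
    (x0, y0), (x1, y1) = seg
    return math.degrees(math.atan2(y1 - y0, x1 - x0))

def rotate_segment(seg, deg):
    """Rotate both endpoints about the origin by deg degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return tuple((c * x - s * y, s * x + c * y) for x, y in seg)

def vote_rotation(model_segs, scene_segs, bin_deg=5):
    """Each (scene, model) segment pair votes for one quantized rotation."""
    n_bins = 360 // bin_deg
    votes = Counter()
    for m in model_segs:
        for s in scene_segs:
            theta = (segment_angle(s) - segment_angle(m)) % 360.0
            # Wrap the bin index so that 360 degrees falls into bin 0.
            votes[round(theta / bin_deg) % n_bins] += 1
    best_bin, count = votes.most_common(1)[0]
    return best_bin * bin_deg, count

# Hypothetical model with segment directions 0, 90 and 45 degrees,
# and a scene that is the model rotated by 30 degrees.
model_segs = [((0, 0), (4, 0)), ((1, 1), (1, 5)), ((2, 0), (5, 3))]
scene_segs = [rotate_segment(seg, 30) for seg in model_segs]
rotation_deg, n_votes = vote_rotation(model_segs, scene_segs)
```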
12 Pose clustering (3)
C.F. Olson: Efficient pose clustering using a randomized algorithm. IJCV, 23(2), 1997.
2-tuples of image and model corner points are used to generate hypotheses.
(a) The corners found in an image.
(b) The four best hypotheses found, with the edges drawn in. The nose of the plane and the head of the person do not appear because they were not in the models.
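To illustrate how a 2-tuple of point matches can generate one hypothesis, the sketch below solves for a 2D similarity transformation (scale, rotation, translation) from two model points and their two image matches. This is not Olson's randomized algorithm itself; the 2D similarity model and the function name `similarity_from_two_matches` are assumptions made for illustration. Points are treated as complex numbers, so the transformation is z' = a z + b.

```python
import cmath
import math

def similarity_from_two_matches(m1, m2, p1, p2):
    """Return (scale, rotation_rad, translation) mapping m1 -> p1 and m2 -> p2."""
    m1, m2, p1, p2 = (complex(*q) for q in (m1, m2, p1, p2))
    if m1 == m2:
        raise ValueError("model points must be distinct")
    a = (p2 - p1) / (m2 - m1)  # a = scale * exp(i * rotation)
    b = p1 - a * m1            # translation
    return abs(a), cmath.phase(a), (b.real, b.imag)

# Hypothetical match: the model edge (0,0)-(1,0) appears in the image as (2,3)-(2,5),
# i.e. scaled by 2, rotated by 90 degrees and shifted by (2, 3).
scale, rotation, translation = similarity_from_two_matches((0, 0), (1, 0), (2, 3), (2, 5))
```

Each such hypothesis would then cast one vote in the quantized (scale, rotation, translation) space.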
13 Object recognition by local features (1)
D.G. Lowe: Distinctive image features from scale-invariant keypoints. IJCV, 60(2), 2004.
- The SIFT features of training images are extracted and stored
- For a query image:
  - Extract SIFT features
  - Efficient nearest neighbor indexing
  - Pose clustering by Hough transform
  - For clusters with >2 keypoints (object hypotheses): determine the optimal affine transformation parameters by the least squares method; geometry verification
[Figure: input image and stored image]
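The least-squares step for a cluster with at least three matches can be sketched as follows. This is a dependency-free illustration, not Lowe's implementation: it assumes the 2D affine model u = a x + b y + t_x, v = c x + d y + t_y and solves the normal equations with Cramer's rule. The function names are hypothetical, and the geometry verification step (checking residuals against a tolerance) is omitted.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(A, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("degenerate matches (collinear points or fewer than 3)")
    sol = []
    for col in range(3):
        # Replace one column of A by the right-hand side.
        Ac = [[rhs[r] if c == col else A[r][c] for c in range(3)] for r in range(3)]
        sol.append(det3(Ac) / d)
    return sol

def fit_affine(model_pts, image_pts):
    """Least-squares affine fit; returns ([a, b, tx], [c, d, ty])."""
    rows = [(x, y, 1.0) for x, y in model_pts]
    # Normal equations (R^T R) p = R^T u, and likewise for v.
    N = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    bu = [sum(r[i] * u for r, (u, _) in zip(rows, image_pts)) for i in range(3)]
    bv = [sum(r[i] * v for r, (_, v) in zip(rows, image_pts)) for i in range(3)]
    return solve3(N, bu), solve3(N, bv)

# Hypothetical matches generated with u = 2x + 0.5y + 3 and v = -x + y + 1.
model_pts = [(0, 0), (1, 0), (0, 1), (1, 1)]
image_pts = [(3, 1), (5, 0), (3.5, 2), (5.5, 1)]
row_u, row_v = fit_affine(model_pts, image_pts)
```

With exactly three non-collinear matches the fit is exact; additional matches overdetermine the system, and the residuals can then serve as a consistency check.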
14 Object recognition by local features (2)
Cluster of 3 corresponding feature pairs
21 Bag-of-features models (2)
Origin 1: Text recognition by orderless document representation (frequencies of words from a dictionary), Salton & McGill (1983)
[Figure: US Presidential Speeches Tag Cloud]
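A minimal sketch of the orderless representation: a document is reduced to the frequencies of dictionary words, so any reordering of its words gives the same vector. The dictionary, the example text and the function name `bag_of_words` are hypothetical.

```python
import re
from collections import Counter

def bag_of_words(text, dictionary):
    """Return the frequency of each dictionary word in the text, ignoring word order."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return [counts[word] for word in dictionary]

dictionary = ["people", "nation", "freedom", "war"]
text = "The people of the nation and the freedom of the people"
vector = bag_of_words(text, dictionary)

# Reversing the word order leaves the representation unchanged.
shuffled = " ".join(reversed(text.split()))
shuffled_vector = bag_of_words(shuffled, dictionary)
```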
22 Bag-of-features models (3)
Origin 2: Texture recognition
- Texture is characterized by the repetition of basic elements, or textons
- For stochastic textures, it is the identity of the textons, not their spatial arrangement, that matters
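The texton idea can be sketched as a histogram over a small codebook: each local descriptor is assigned to its nearest texton, and only the counts are kept, not the positions. The two-element codebook, the toy 2D descriptors and the function names below are assumptions for illustration; a real system would learn the textons, for example by clustering filter responses.

```python
def nearest_texton(descriptor, textons):
    """Index of the texton with the smallest squared Euclidean distance."""
    return min(
        range(len(textons)),
        key=lambda k: sum((d - t) ** 2 for d, t in zip(descriptor, textons[k])),
    )

def texton_histogram(descriptors, textons):
    """Normalized histogram of texton identities; spatial arrangement is discarded."""
    hist = [0] * len(textons)
    for d in descriptors:
        hist[nearest_texton(d, textons)] += 1
    total = sum(hist) or 1  # avoid division by zero for an empty input
    return [h / total for h in hist]

textons = [(0.0, 0.0), (1.0, 1.0)]  # hypothetical codebook
descriptors = [(0.1, 0.0), (0.9, 1.0), (0.2, 0.1), (1.1, 0.8)]
hist = texton_histogram(descriptors, textons)

# Visiting the same descriptors in a different order gives the same histogram.
hist_reordered = texton_histogram(list(reversed(descriptors)), textons)
```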
39 Large-scale image search (4)
Application: Image auto-annotation
Left: Wikipedia image. Right: closest match from Flickr.
[Figure: example pairs for Moulin Rouge, Old Town Square (Prague), Tour Montparnasse, Colosseum, Viktualienmarkt Maypole]
40 Sources
- K. Grauman, B. Leibe: Visual Object Recognition. Morgan & Claypool Publishers, 2011
- R. Szeliski: Computer Vision: Algorithms and Applications. Springer (Chapter 14 "Recognition")
- Course materials from others (G. Bebis, J. Hays, S. Lazebnik, …)