Spatio-temporal constraints for recognizing 3D objects in videos Nicoletta Noceti Università degli Studi di Genova.

Spatio-temporal constraints for recognizing 3D objects in videos http://slipguru.disi.unige.it Nicoletta Noceti Università degli Studi di Genova

2 Outline of the presentation
• 3D object recognition
  - View-based approaches
  - Local descriptors for object recognition
  - Our approach
• Spatio-temporal models for 3D object recognition
  - Modeling sequences
  - Representation of video sequences w.r.t. the model
  - 2-stage matching procedure
• Recognizing objects: experiments and results

3 Object recognition
• Localisation means determining the pose of each object relative to a sensor
• Categorization means recognising the class to which an object belongs rather than that particular object
• The goal of recognition systems is to identify which objects are present in a scene
Unlike "merely" perceiving a shape, recognising it involves memory, that is, accessing representations of shapes seen in the past [Wittgenstein73]

4 Object recognition

5 View based object recognition
• View-based approaches to 3D object recognition gained attention as a way to deal with appearance variation [Murase et al. 95, Pontil et al. 98]
  - no explicit model is required
• Local approaches produce relatively compact descriptions of the image content and do not suffer from cluttered background and occlusions [Mikolajczyk et al. 03]
• Local object models are often inspired by text categorization [Cristianini et al. 02]
• Many view-based local approaches to recognition have been proposed [Leibe et al. 04, Csurka et al. 04]

6 Our approach to recognition
• We observe an object from slightly different viewpoints and exploit local features that are distinctive in space and stable in time to perform recognition
• Our approach shares some similarities with codebook methods, but it extends the concept to the temporal domain

7 Our approach to recognition
• View-based recognition systems do not need explicit computation of 3D object models
• Local approaches produce compact descriptions and do not suffer from cluttered background and occlusions
• Spatial constraints improve the quality of recognition [Ferrari et al. 06]
• Biological vision systems gather information by means of motion, which provides important cues for depth perception and object recognition [Stringer et al. 06]

9 Drawbacks of locality
"His eyes would dart from one thing to another, picking up tiny features, individual features, as they had done with my face. A striking brightness, a colour, a shape would arrest his attention and elicit comment – but in no case did he get the scene-as-a-whole. He failed to see the whole, seeing only details, which he spotted like blips on a radar screen. He never entered into relation with the picture as a whole - never faced, so to speak, its physiognomy. He had no sense whatever of a landscape or a scene."
"The Man Who Mistook His Wife For A Hat: And Other Clinical Tales", by Oliver Sacks, 1985

10 Ideas
• Obtain a 3D object recognition method based on a compact description of image sequences
• Exploit spatial information on the proximity of features that appear at the same time
• Exploit temporal continuity in both training and test
E. Delponte, N. Noceti, F. Odone and A. Verri, "Spatio temporal constraints for matching view-based descriptions of 3D objects", in WIAMIS 2007

11 Recognizing objects with ST models
Pipeline (diagram): Video sequences → Keypoints detection and description → Keypoints tracking → Cleaning procedure → Building the spatio-temporal model
Spatio-temporal model for training + Spatio-temporal model for test → 2-stage matching procedure → Object recognition

12 From sequence to spatio-temporal model

13 From sequence to spatio-temporal model
• For each image of the sequence
  - extract Harris corners
  - assign them a scale and a principal direction
  - assign them a SIFT descriptor
• Tracking of keypoints with a Kalman filter
  - cleaning procedure based on the length of trajectories and the robustness of descriptors
• Computation of time-invariant features
Pipeline (diagram): Video sequences → Keypoints detection and description → Keypoints tracking → Cleaning procedure → Building the spatio-temporal model
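The cleaning step can be illustrated with a minimal sketch. The slide only states that cleaning depends on trajectory length and descriptor robustness, so the function name `clean_trajectories`, the thresholds `min_length` and `max_spread`, and the use of the maximum distance from the mean descriptor as a robustness measure are all assumptions made for illustration, not the authors' procedure.

```python
from math import sqrt


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def clean_trajectories(trajectories, min_length=3, max_spread=0.5):
    """Keep trajectories that are long enough and whose descriptors stay
    close to their own mean.

    trajectories: list of trajectories, each a list of descriptors
    (one descriptor per frame in which the keypoint was tracked).
    min_length and max_spread are hypothetical thresholds.
    """
    kept = []
    for descriptors in trajectories:
        if len(descriptors) < min_length:
            continue  # tracked for too few frames
        centre = mean_vector(descriptors)
        spread = max(euclidean(d, centre) for d in descriptors)
        if spread <= max_spread:
            kept.append(descriptors)  # descriptor is stable along the track
    return kept


# Toy 2D descriptors (real SIFT vectors have 128 components).
stable = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]]
short = [[0.0, 1.0], [0.0, 1.0]]
drifting = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
cleaned = clean_trajectories([stable, short, drifting])
print(len(cleaned))
```

In this toy run only the `stable` trajectory survives: `short` is rejected for its length and `drifting` because its descriptors move too far from their mean.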

14 From sequence to spatio-temporal model

15 Time invariant feature
• We obtain a set of time-invariant features, each made of:
  - a spatial appearance descriptor, that is, the average of all SIFT vectors of its trajectory
  - a temporal descriptor, which records when the feature first appeared in the sequence and when it was last observed
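As a rough sketch of the two descriptors above, the following builds one time-invariant feature from a single trajectory. The class name `TimeInvariantFeature`, its field names, and the `(frame_index, descriptor)` input layout are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TimeInvariantFeature:
    appearance: List[float]  # average descriptor over the trajectory
    first_frame: int         # frame in which the feature first appeared
    last_frame: int          # frame in which the feature was last observed


def build_feature(trajectory: List[Tuple[int, List[float]]]) -> TimeInvariantFeature:
    """Summarise a trajectory given as (frame_index, descriptor) pairs."""
    frames = [frame for frame, _ in trajectory]
    descriptors = [descriptor for _, descriptor in trajectory]
    dim = len(descriptors[0])
    mean = [sum(d[i] for d in descriptors) / len(descriptors) for i in range(dim)]
    return TimeInvariantFeature(mean, min(frames), max(frames))


# Toy 2D descriptors stand in for 128-dimensional SIFT vectors.
feature = build_feature([(4, [1.0, 3.0]), (5, [2.0, 5.0]), (6, [3.0, 4.0])])
print(feature)
```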

16 The spatio-temporal model
• The collection of time-invariant features constitutes a spatio-temporal model that we use to train our system
• We emphasise the temporal coherence of the model and exploit features that appear simultaneously
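One plausible way to decide that two features appear simultaneously is to check whether their temporal descriptors overlap. The sketch below assumes each temporal descriptor is a `(first_frame, last_frame)` pair; the helper names and the `min_shared` parameter are illustrative, not taken from the slides.

```python
def frames_shared(a, b):
    """Number of frames in which two features, given as inclusive
    (first_frame, last_frame) intervals, are both visible."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def simultaneous_features(intervals, index, min_shared=1):
    """Indices of the features sharing at least min_shared frames with
    the feature at position index."""
    target = intervals[index]
    return [
        i
        for i, other in enumerate(intervals)
        if i != index and frames_shared(target, other) >= min_shared
    ]


intervals = [(0, 10), (5, 15), (11, 20), (30, 40)]
neighbours = simultaneous_features(intervals, 1)
print(neighbours)
```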

17 Matching spatio-temporal models
Pipeline (diagram): Spatio-temporal model for training + Spatio-temporal model for test → 2-stage matching procedure → Object recognition

18 Matching of sequence models
• For each video sequence we compute its spatio-temporal model
• Given a test sequence, we perform a two-stage matching procedure that exploits the spatial and temporal coherence of time-invariant features
  - we compute a first set of matches
  - we reinforce the procedure by analysing the spatial and temporal neighbourhood of the matches
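The slide gives no details of the two stages, so the following is only one possible reading, sketched under stated assumptions: stage 1 accepts nearest-neighbour matches under a strict distance threshold, and stage 2 revisits unmatched test features that co-occur in time with an accepted match, admitting them under a looser threshold if the candidate model feature co-occurs with the matched model feature. The thresholds `strict` and `loose`, the dictionary layout, and the omission of the spatial neighbourhood check are all simplifications introduced here.

```python
from math import sqrt


def distance(a, b):
    """Euclidean distance between two descriptors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def overlaps(f, g):
    """True if two features are visible in at least one common frame."""
    return f["first"] <= g["last"] and g["first"] <= f["last"]


def two_stage_match(model, test, strict=0.2, loose=0.6):
    """Return a dict mapping test feature index -> model feature index.

    strict and loose are hypothetical thresholds.
    """
    matches = {}
    # Stage 1: keep only confident nearest-neighbour matches.
    for ti, t in enumerate(test):
        best = min(range(len(model)), key=lambda mi: distance(t["desc"], model[mi]["desc"]))
        if distance(t["desc"], model[best]["desc"]) <= strict:
            matches[ti] = best
    # Stage 2: use temporal neighbours of stage-1 matches as supporting evidence.
    seeds = list(matches.items())
    for ti, t in enumerate(test):
        if ti in matches:
            continue
        for seed_ti, seed_mi in seeds:
            if not overlaps(t, test[seed_ti]):
                continue
            candidates = [
                mi
                for mi, m in enumerate(model)
                if mi not in matches.values()
                and overlaps(m, model[seed_mi])
                and distance(t["desc"], m["desc"]) <= loose
            ]
            if candidates:
                matches[ti] = min(candidates, key=lambda mi: distance(t["desc"], model[mi]["desc"]))
                break
    return matches


model = [
    {"desc": [0.0, 0.0], "first": 0, "last": 10},
    {"desc": [1.0, 0.0], "first": 5, "last": 15},
    {"desc": [0.0, 1.0], "first": 40, "last": 50},
]
test = [
    {"desc": [0.05, 0.0], "first": 0, "last": 8},   # confident match in stage 1
    {"desc": [1.0, 0.4], "first": 3, "last": 9},    # recovered in stage 2
    {"desc": [0.0, 0.6], "first": 30, "last": 35},  # no temporal support, rejected
]
result = two_stage_match(model, test)
print(result)
```

The second test feature is too far from its nearest model feature to pass stage 1, but it is accepted in stage 2 because it co-occurs with a confident match on both the test and the model side.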

19 Matching of sequence models

21 Experiments and results
• Matching assessment
  - Illumination, scale and background changes
  - Changes in motion
  - Increasing the number of objects
• Object recognition on a 20-object dataset
• Recognition on a video stream

22 3D objects

23 Matching assessment
Matches obtained on sequences with simple changes, compared against the ST models of 4 objects

24 Changing motion
Matches obtained w.r.t. ST models of 4 objects
Matches obtained in the first step of matching

25 Matching assessment
Figure: models of 3D objects and test sequences

26 Recognizing 20 objects
Figure: recognition results with per-object markers (labels: Book, Bambi, Dewey, Sully, Box, Donald, Scrooge)
Number of experiments: 840
TP = 51   FN = 13
FP = 11   TN = 765
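The slide reports only the raw counts. As a small worked example, the standard precision, recall and accuracy figures can be derived from them; these derived values are computed here for convenience and are not stated in the presentation.

```python
# Counts reported on the slide.
tp, fn, fp, tn = 51, 13, 11, 765

total = tp + fn + fp + tn           # should equal the number of experiments
precision = tp / (tp + fp)          # fraction of reported detections that are correct
recall = tp / (tp + fn)             # fraction of true occurrences that are detected
accuracy = (tp + tn) / total        # fraction of all experiments decided correctly

print(f"total={total} precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
```

The four counts sum to 840, which agrees with the number of experiments given on the slide.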

27 Recognition on a video stream

28 Conclusion and future work
• We exploited the compactness and expressiveness of local image descriptions to address the problem of 3D object recognition
• We devised a system based on spatial and temporal information, and we showed that the model of a 3D object benefits from both
• The system could benefit from adding information on the image context [Tor03]

Thanks for your attention!
