3rd Workshop on Semantic Perception, Mapping and Exploration (SPME), Karlsruhe, Germany, 2013: Semantic Parsing for Priming Object Detection in RGB-D Scenes


1 Semantic Parsing for Priming Object Detection in RGB-D Scenes
Cesar Cadena and Jana Kosecka
3rd Workshop on Semantic Perception, Mapping and Exploration (SPME), Karlsruhe, Germany, 2013

2 Motivation 5/5/2013
- Long-term robotic operation.
- Semantic information about the surrounding environment is important for high-level robotic tasks.
- It is difficult to know a priori all the instances or classes of objects that the robot will encounter in real operation.
- Even if we know many of them, running all the specific object detectors at the same time is unreasonable and expensive.

6 Motivation
However:
- There are things we can assume to be present (almost) always.
- Generic “detachable” objects also share some characteristics.
Urban: Ground, Buildings, Sky, Objects
Indoors: Ground, Walls, Ceiling, Objects
Today: Ground, Structure, Furniture, Props
Goal: efficiently segment RGB+3D scenes into these general classes, to be used as a prior for task-specific detectors.

10 Our Problem
Efficiently segment RGB+3D scenes into the general classes Ground, Structure, Furniture, and Props, to be used as a prior for task-specific detectors.

12 NYU Depth v2
- 1449 labeled frames.
- 26 scene classes.
- Labeling spans 894 different classes.
N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, Indoor segmentation and support inference from RGBD images, in ECCV.
Thanks to N. Silberman for providing the mapping from 894 to 4 classes.

13 The System
[Figure: system diagram with blocks Semantic Segmentation, MAP, and Marginals]

14 Different approaches
- N. Silberman et al., ECCV 2012
- C. Couprie et al., CoRR 2013
- X. Ren et al., CVPR 2012
- D. Munoz et al., ECCV 2010
- I. Endres and D. Hoiem, ECCV 2010
Each has at least one of:
- Expensive over-segmentation
- Expensive features
- Expensive inference

15 Our approach
[Figure: pipeline diagram. Semantic segmentation with Conditional Random Fields: Preprocessing, Graph Structure, Potentials, Inference; outputs are MAP and Marginals]

16 Outline
(1) Preprocessing
(2) Graph Structure
(3) Potentials
(4) Inference
(5) Results
(6) Conclusions

17 Preprocessing: Over-segmentation
SLIC superpixels.
R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Susstrunk, SLIC superpixels compared to state-of-the-art superpixel methods, PAMI.

18 Graph Structure: Classical choice on images

19 Graph Structure: Our choice
Minimum Spanning Tree over 3D
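
A minimal sketch of building a minimum spanning tree over 3D positions, using Kruskal's algorithm. The slide only states that the graph is an MST over 3D; the choice of superpixel centroids as nodes, Euclidean distance as edge weight, all pairs as candidate edges, and the names `minimum_spanning_tree` and `centroids` are assumptions made for illustration.

```python
import math

def minimum_spanning_tree(points):
    """Return MST edges (i, j, dist) over 3D points using Kruskal's algorithm."""
    n = len(points)
    # Candidate edges: every pair of points, weighted by Euclidean distance (assumption).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))  # union-find forest

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    tree = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it cannot close a cycle
            parent[ri] = rj
            tree.append((i, j, dist))
    return tree

# Hypothetical superpixel centroids (x, y, z): two near the camera, two farther away.
centroids = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (0.0, 0.1, 3.0), (0.1, 0.1, 3.1)]
mst = minimum_spanning_tree(centroids)
```

In this toy case the two nearby points and the two distant points are each linked first, and a single long edge bridges the depth gap, so neighbors at similar depth stay directly connected.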

21 Potentials: Pairwise CRFs
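
The equations on this slide did not survive in the transcript. For orientation, the standard form of a pairwise CRF is given below; the symbols (labels x, observations z, node set N, edge set E, unary potential ψ_u, pairwise potential ψ_p, partition function Z) are the usual textbook notation and are assumptions here, not a copy of the slide.

```latex
p(\mathbf{x} \mid \mathbf{z}) = \frac{1}{Z(\mathbf{z})}
  \exp\Big( -\sum_{i \in \mathcal{N}} \psi_u(x_i, \mathbf{z})
            -\sum_{(i,j) \in \mathcal{E}} \psi_p(x_i, x_j, \mathbf{z}) \Big)
```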

24 Potentials: unary
The unary potential is built from two quantities:
- the frequency of label j in a k-NN query
- the frequency of label j in the database
The database is a kd-tree of features from training data.
J. Tighe and S. Lazebnik, Superparsing: Scalable nonparametric image parsing with superpixels, ECCV.
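
A rough sketch of a k-NN based unary term. The exact formula on the slide is not recoverable from the transcript, so the sketch assumes the simplest combination of the two named quantities: the ratio of k-NN frequency to database frequency, normalized over labels. A brute-force search stands in for the kd-tree, and `knn_label_scores`, `unary_potential`, and the toy `database` are hypothetical names.

```python
import math
from collections import Counter

def knn_label_scores(query, database, k):
    """Score each label by (frequency among the k nearest neighbors) / (frequency in database)."""
    # Brute-force k-NN for brevity; the slide uses a kd-tree for this lookup.
    neighbors = sorted(database, key=lambda item: math.dist(query, item[0]))[:k]
    knn_freq = Counter(label for _, label in neighbors)
    db_freq = Counter(label for _, label in database)
    ratios = {label: knn_freq[label] / db_freq[label] for label in db_freq}
    total = sum(ratios.values())
    # Normalize so the scores sum to one (assumed normalization).
    return {label: r / total for label, r in ratios.items()}

def unary_potential(scores, label, eps=1e-6):
    """Negative log score, a common way to turn a probability into a CRF cost."""
    return -math.log(max(scores[label], eps))

# Hypothetical 2D feature vectors with labels (the talk uses 12D features).
database = [
    ((0.0, 0.0), "ground"), ((0.1, 0.0), "ground"), ((0.0, 0.1), "ground"),
    ((1.0, 1.0), "props"), ((1.1, 1.0), "props"),
    ((2.0, 2.0), "structure"),
]
scores = knn_label_scores((0.05, 0.05), database, k=3)
cost_ground = unary_potential(scores, "ground")
```

Dividing by the database frequency keeps frequent labels from dominating simply because they have more training samples.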

25 Features (12D)
From image:
- mean of Lab color space (3D)
- vertical pixel location (1D)
- entropy from vanishing points (1D)
From 3D:
- height and depth (2D)
- mean and std of differences on depth (2D)
- local planarity (1D)
- neighboring planarity (1D)
- vertical orientation (1D)

26 Features
From image:
- entropy from vanishing points

27 Features
From 3D:
- mean and std of differences on depth

29 Features
From 3D:
- mean and std of differences on depth
- local planarity
- neighboring planarity
- vertical orientation

30 Potentials: pairwise
Lab color
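
The slide says only that the pairwise potential is based on Lab color; its formula is missing from the transcript. The sketch below shows one common contrast-sensitive form, under the assumption that the cost depends on the Euclidean distance between the mean Lab colors of two connected superpixels. The function name and the exponential form are illustrative, not the authors' definition.

```python
import math

def pairwise_potential(label_i, label_j, lab_i, lab_j):
    """Contrast-sensitive cost for two connected superpixels (assumed form)."""
    # Similarity is 1.0 for identical mean colors and decays with color distance.
    similarity = math.exp(-math.dist(lab_i, lab_j))
    if label_i == label_j:
        return 1.0 - similarity  # same label: cheap when the colors are similar
    return similarity            # different labels: cheap across a color edge

# Hypothetical mean Lab colors of two neighboring superpixels.
same_color_same_label = pairwise_potential("Props", "Props", (50.0, 0.0, 0.0), (50.0, 0.0, 0.0))
same_color_diff_label = pairwise_potential("Props", "Ground", (50.0, 0.0, 0.0), (50.0, 0.0, 0.0))
```

In practice the color distance would likely need rescaling, since raw Lab differences are large enough to drive the exponential close to zero.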

31 Inference
We use belief propagation:
- Exact results for MAP and marginals
- Efficient computation
Thanks to our graph structure choice!
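
Belief propagation is exact on a tree because there are no loops for messages to circulate in, which is why the MST graph matters here. The following is a minimal sum-product sketch that computes exact marginals on a small tree; MAP inference would replace the sums with maximizations. It assumes factors (non-negative scores) rather than costs, one symmetric pairwise factor shared by all edges, and node 0 as root; `tree_marginals` and the toy chain are hypothetical.

```python
import math
from collections import defaultdict

def tree_marginals(n_nodes, n_labels, edges, unary, pairwise):
    """Exact per-node marginals on a tree via sum-product belief propagation.

    unary[i][a] and pairwise(a, b) are non-negative factors (assumed symmetric).
    """
    nbrs = defaultdict(list)
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    # Breadth-first order from the root (node 0), recording each node's parent.
    order, parent = [0], {0: None}
    for node in order:
        for nb in nbrs[node]:
            if nb not in parent:
                parent[nb] = node
                order.append(nb)

    msg = {}  # msg[(i, j)][b]: message from node i to node j about label b

    def send(i, j):
        out = []
        for b in range(n_labels):
            total = 0.0
            for a in range(n_labels):
                incoming = math.prod(msg[(k, i)][a] for k in nbrs[i] if k != j)
                total += unary[i][a] * pairwise(a, b) * incoming
            out.append(total)
        msg[(i, j)] = out

    for node in reversed(order):        # upward pass: leaves to root
        if parent[node] is not None:
            send(node, parent[node])
    for node in order:                  # downward pass: root to leaves
        for nb in nbrs[node]:
            if parent.get(nb) == node:
                send(node, nb)

    marginals = []
    for i in range(n_nodes):
        belief = [unary[i][a] * math.prod(msg[(k, i)][a] for k in nbrs[i])
                  for a in range(n_labels)]
        z = sum(belief)
        marginals.append([b / z for b in belief])
    return marginals

# Toy chain 0 - 1 - 2 with two labels; node 0 strongly prefers label 0.
unary = [[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]]
marginals = tree_marginals(3, 2, [(0, 1), (1, 2)], unary,
                           pairwise=lambda a, b: 2.0 if a == b else 1.0)
```

Each edge carries one message in each direction, so two passes over the tree are enough and no iteration to convergence is needed.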

32 Results: NYU-D v2 Dataset
[Figure: ground truth (GT) and MAP labelings]

33 Results: NYU-D v2 Dataset
- Confusion matrix
- Comparisons

35 Results: NYU-D v2 Dataset
Some failures:
[Figure: ground truth (GT) and MAP labelings]

36 Results: NYU-D v2 Dataset

37 Marginal probabilities
Provide very useful information for specific tasks, e.g.:
- Specific object detection
- Support inference
[Figure: per-class marginal maps P(Ground), P(Structure), P(Furniture), P(Props)]
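
One way the marginals could prime a task-specific detector is to restrict it to superpixels where the relevant class is likely enough. The sketch below illustrates that idea only; the function name, the dictionary layout, and the threshold value are assumptions, and the talk does not specify how the detectors consume the marginals.

```python
def superpixels_to_search(marginals, target="Props", threshold=0.5):
    """Return indices of superpixels whose marginal for `target` exceeds the threshold."""
    return [i for i, p in enumerate(marginals) if p[target] > threshold]

# Hypothetical per-superpixel marginals over the four general classes.
marginals = [
    {"Ground": 0.90, "Structure": 0.05, "Furniture": 0.03, "Props": 0.02},
    {"Ground": 0.05, "Structure": 0.10, "Furniture": 0.25, "Props": 0.60},
    {"Ground": 0.02, "Structure": 0.08, "Furniture": 0.50, "Props": 0.40},
]
candidates = superpixels_to_search(marginals, target="Props", threshold=0.3)
```

Using marginals rather than the MAP labeling keeps superpixel 2 as a candidate even though its most likely class is Furniture.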

38 Conclusions
- We have presented a computationally efficient approach to semantic segmentation for priming object detection in indoor scenes.
- Our approach effectively uses 3D and image cues. Depth discontinuities are evidence for occlusions.
- The MST over 3D keeps intra-class components coherently connected.

39 Discussion
Aspect | Silberman et al. 2012 | Couprie et al. 2013 | Ours
Features | Bunch of engineered features (>1000D) | Learned features (>1000D) | Select meaningful features (12D)
Local classifier | Logistic Regression | Neural Networks | k-NN
Graph structure | Dense connections (image) | None | MST over 3D

40 Thanks!!
Cesar, Jana
Funded by the US Army Research Office Grant W911NF

41 Working on:
- People detection by Shenghui Zhou

42 Multi-view and video

