Christian Siagian Laurent Itti Univ. Southern California, CA, USA

Gist: A Mobile Robotics Application of Context-Based Vision in Outdoor Environment
Christian Siagian, Laurent Itti
Univ. Southern California, CA, USA

Outline
- Mobile robot localization
- Biological approach to vision
- Gist model
- Testing and results
- Discussion and conclusion

Mobile Robot Localization
- Where are we?
- Localization = identifying landmarks

Mobile Robot Localization
- Indoors: strong assumptions of flat walls, narrow hallways, and solid angles
  - Ranging sensors (laser and sonar) for mapping
- Outdoors: less conforming set of surfaces
  - Ranging sensors are less effective, vision is better

Robot Vision Localization: Object-based Vision Localization
- Objects as landmarks
- Accuracy:
  - Based on object observation model
  - Selection of reliable objects
  - Can accommodate metric & topological mapping
- Efficiency: trade-off between efficiency and robustness within the localization framework
- Scalability:
  - Generally, the size of environments scales with the number of objects in the database
  - The task of object selection becomes harder

Robot Vision Localization: Region-based Vision Localization
- Regions as landmarks
- Accuracy:
  - Needs configuration of regions
  - Prone to over/under segmentation
  - Observation model is less sophisticated
- Efficiency: can use lower resolutions, although flexible matching is necessary
- Scalability:
  - Need more expressive region signature and geometry
  - More complex may mean less stable, however

Robot Vision Localization: Scene-based Vision Localization
- Scenes as a whole as landmarks
  - Color histograms [Ulrich and Nourbakhsh 2000]
  - Fourier transform [Oliva & Torralba 2001]
  - Wavelet pyramids [Torralba 2003]
  - Histogram of dominant features [Renninger & Malik 2004]
- Accuracy:
  - Lends itself more to topological mapping
  - Resolution: localization within place is needed
  - Naturally view-invariant
- Efficiency: can be done in lower resolution
- Scalability: stability and uniqueness
  - Learn a smaller set of scene features
  - Addition of new environments presents a uniqueness problem
  - Places can look more and more the same

Gist
- Definition and background
  - Essence, holistic characteristics of an image
  - Context information obtained within an eye saccade (approx. 150 ms)
  - Evidence of place-recognizing cells at the Parahippocampal Place Area (PPA)
  - Biologically plausible models of gist are yet to be proposed
- Nature of tasks done with gist
  - Scene categorization/context recognition
  - Region priming/layout recognition
  - Resolution/scale selection

Human Vision Architecture
- Visual Cortex: low-level filters, center-surround, and normalization
- Saliency Model: attend to pertinent regions
- Gist Model: compute image general characteristics
- High Level Vision:
  - Object recognition
  - Layout recognition
  - Scene understanding

Gist Model
- Utilizes the same Visual Cortex raw features as the saliency model [Itti 2001]
- Gist is theoretically non-redundant with saliency
- Gist vs. Saliency
  - Instead of looking at the most conspicuous locations in the image, gist looks at the scene as a whole
  - Detection of regularities, not irregularities
  - Cooperation (accumulation) vs. competition (WTA) among locations
  - More spatial emphasis in saliency
  - Local vs. global/regional interaction
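The cooperation-versus-competition contrast can be illustrated on a toy feature map. This is only a sketch: the function names and the 3x3 map are made up for illustration and are not taken from the slides or from the authors' software.

```python
def winner_take_all(fmap):
    """Saliency-style competition: return (row, col) of the strongest response."""
    locations = ((r, c) for r, row in enumerate(fmap) for c in range(len(row)))
    return max(locations, key=lambda rc: fmap[rc[0]][rc[1]])


def accumulate(fmap):
    """Gist-style cooperation: pool every location into a single mean value."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


# Hypothetical 3x3 feature map with one conspicuous location in the middle.
feature_map = [
    [0.1, 0.2, 0.1],
    [0.2, 0.9, 0.2],
    [0.1, 0.2, 0.1],
]

peak = winner_take_all(feature_map)  # one location wins the competition
mean = accumulate(feature_map)       # all locations contribute equally
```

The first function discards everything but the most conspicuous location; the second discards location and keeps only the overall response level.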

Gist Model Implementation: V1 Raw Image Feature Maps
- Orientation channel: Gabor filters at 4 angles (0, 45, 90, 135 degrees) on 4 scales = 16 sub-channels
- Color: red-green and blue-yellow center-surround, each with 6 scale combinations = 12 sub-channels
- Intensity: dark-bright center-surround with 6 scale combinations = 6 sub-channels
- Total of 34 sub-channels
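The sub-channel count can be checked by enumerating the channels. In this sketch the scale indices and the center-surround pairing (centers 2, 3, 4 with surround offsets 3 and 4) are assumptions chosen only to give 6 combinations; the slides state the counts but not which scales are paired.

```python
from itertools import product

ORIENTATIONS_DEG = (0, 45, 90, 135)
GABOR_SCALES = range(4)  # 4 scales; the index values are illustrative

# Assumed pairing (center c, surround c + delta), giving 6 combinations.
CENTER_SURROUND = [(c, c + d) for c in (2, 3, 4) for d in (3, 4)]


def list_sub_channels():
    """Enumerate every sub-channel as a (name, parameter, parameter) tuple."""
    channels = [
        ("orientation", angle, scale)
        for angle, scale in product(ORIENTATIONS_DEG, GABOR_SCALES)
    ]
    for name in ("red-green", "blue-yellow"):
        channels += [("color:" + name, c, s) for c, s in CENTER_SURROUND]
    channels += [("intensity", c, s) for c, s in CENTER_SURROUND]
    return channels


sub_channels = list_sub_channels()
print(len(sub_channels))  # 16 + 12 + 6
```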

Gist Model Implementation: Gist Feature Extraction
- Average values over a predetermined grid
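A minimal sketch of grid averaging for one sub-channel follows. The 4x4 grid is inferred from the 16 features per sub-channel quoted on the dimension-reduction slide; the function name and the toy 8x8 map are hypothetical.

```python
def grid_means(fmap, grid=4):
    """Average a 2-D feature map (list of equal-length rows) over grid x grid cells.

    Assumes the map has at least `grid` rows and `grid` columns.
    """
    height, width = len(fmap), len(fmap[0])
    features = []
    for gr in range(grid):
        r0, r1 = gr * height // grid, (gr + 1) * height // grid
        for gc in range(grid):
            c0, c1 = gc * width // grid, (gc + 1) * width // grid
            cell = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(cell) / len(cell))
    return features


# Toy 8x8 map whose value at (r, c) is r * 8 + c.
fmap = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
gist = grid_means(fmap)  # 16 values, one per grid cell, in row-major order
```

Applying the same function to each of the 34 sub-channel maps and concatenating the results would give the 544-dimensional raw gist vector.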

Gist Model Implementation: Dimension Reduction
- Original: 34 sub-channels x 16 features = 544 features
- PCA/ICA reduction: 80 features
  - Kept >95% of variance
- PCA/ICA reduction
  - Too much redundancy
  - Reduction matrix is too random to decipher
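The variance criterion can be sketched with plain PCA via an SVD. This assumes NumPy is available, covers only the PCA half of the PCA/ICA step, and runs on synthetic data, so the number of components it keeps says nothing about the 80 features reported on the slide. All names are illustrative.

```python
import numpy as np


def fit_pca(X, min_variance=0.95):
    """Return (mean, components) with the fewest components reaching min_variance."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    # Index of the first cumulative value >= min_variance, converted to a count.
    k = int(np.searchsorted(np.cumsum(explained), min_variance)) + 1
    k = min(k, len(s))
    return mean, vt[:k]


def project(X, mean, components):
    """Map raw feature vectors onto the retained principal components."""
    return (X - mean) @ components.T


# Synthetic stand-in for 544-dimensional gist vectors: 5 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 544))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 544))

mean, components = fit_pca(X)
Z = project(X, mean, components)  # reduced feature vectors, one row per image
```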

Gist Model Implementation: Place Classification
- Three-layer neural networks
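A forward pass through a small classifier gives a feel for the place-classification step. This is a sketch under several assumptions: "three-layer" is read here as input, one hidden layer, and output; the hidden size (20) and the number of segments (9) are placeholders; and the weights are random, so training is omitted entirely. Only the 80-dimensional input comes from the slides.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases; a stand-in for trained parameters."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(x, layers):
    """Return (index of most likely segment, probability per segment)."""
    (w1, b1), (w2, b2) = layers
    hidden = [math.tanh(v) for v in dense(x, w1, b1)]
    probs = softmax(dense(hidden, w2, b2))
    return max(range(len(probs)), key=probs.__getitem__), probs


rng = random.Random(0)
layers = [init_layer(80, 20, rng), init_layer(20, 9, rng)]  # hypothetical sizes
x = [rng.gauss(0.0, 1.0) for _ in range(80)]                # stand-in reduced gist vector
segment, probs = classify(x, layers)
```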

System Example Run

Testing & Results
- Site selection:
  - Various lighting conditions
  - Different challenges appearance-wise
  - Variability in area covered/path lengths
- Single-view filming
- Clean break between segments
- Scalability: combine all sites

Map of Experiment Sites

Site 1: Building Complex

Site 1 Experiment
[Figure panels: Input Image, Gist Feature-vectors, PCA/ICA reduced features, System Output]

Site 1 Results
[Figure labels: Output Label, Assigned Label]

Site 2: Vegetation-filled Park

Site 2 Result
[Figure labels: Output Label, Assigned Label]

Site 2 Experiment
[Figure panels: Input Image, Gist Feature-vectors, PCA/ICA reduced features, System Output]

Site 3: Open Field Park

Site 3 Experiment
[Figure panels: Input Image, Gist Feature-vectors, PCA/ICA reduced features, System Output]

Site 3 Result
[Figure labels: Output Label, Assigned Label]

Combined Sites Result

Discussion & Conclusion
- Result of current model:
  - Success rate between 82.48% and 87.93%
  - Combined rate of 85.96%
  - 4.73% error in inter-site classification
- Integrating saliency for robot navigation
  - Localization within segment
  - Identifying discriminating cues in the environment
  - Issues in object-based systems still apply
- Bad view detection
  - Foreground objects sometimes occlude the whole view
- Obstacle avoidance, exploration, etc.
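Figures like the success rate and the inter-site error can be computed from pairs of assigned and output labels. The sketch below shows one plausible way to do it; the label names, the site mapping, and the five sample frames are invented for illustration and are not the experiment's data.

```python
def score(pairs, site_of):
    """Compute (success rate, inter-site error rate).

    pairs: one (assigned_label, output_label) tuple per frame.
    site_of: mapping from segment label to the site it belongs to.
    """
    total = len(pairs)
    correct = sum(1 for truth, guess in pairs if truth == guess)
    inter_site = sum(1 for truth, guess in pairs if site_of[truth] != site_of[guess])
    return correct / total, inter_site / total


# Hypothetical segments: A1 and A2 belong to one site, B1 to another.
site_of = {"A1": "site1", "A2": "site1", "B1": "site2"}
pairs = [("A1", "A1"), ("A1", "A2"), ("A2", "A2"), ("B1", "A1"), ("B1", "B1")]

success_rate, inter_site_error = score(pairs, site_of)
```

In this toy case the A1/A2 confusion lowers the success rate but does not count as an inter-site error, while the B1/A1 confusion counts against both.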

Discussion
- Integration of gist and saliency in general
  - Single representation of both models
  - Influence of saliency on gist and vice versa
    - Involvement of saliency in improving gist estimation
    - Gist helpful in identifying/filtering salient locations
- Testing the limits of gist: psychophysics experiments
  - Change blindness test for large-scale layout changes
  - Varying exposure time
  - Isolation of bottom-up and top-down influences