Autonomous Robots Vision © Manfred Huber 2014

Active vs. Passive Sensing

- Active sensors (sensors that transmit a signal) provide a relatively easy way to extract data
  - Looking only for the data they transmitted makes identifying the information simpler
  - Laser scanners need to extract only one light frequency
  - Sonar "listens" only for a specific frequency and modulation
  - Structured light (e.g. Kinect) only looks for specific patterns
- But active sensors interfere with the environment (and with each other)
  - Two identical sensors interfere with each other's sensing
  - Active transmission of sound or light can interfere with objects

Active sensors transmit their own signal and can thus be engineered so that it is easy to detect and extract the information (e.g. transmission in a frequency band or with a modulation that occurs rarely, or never, in nature). Extracting the information is then independent of what is actually in the environment (objects, etc.). All that has to be done is to look for the special signal and compute either time of flight or triangulate the (known) signal. But there are drawbacks: active transmission can interfere both with other sensors and with objects or entities in the environment, e.g. sonar and dogs sometimes do not go together that well.

Active vs. Passive Sensing

- Passive sensors use only the signals present in the actual environment
- Passive sensors do not interfere with each other or with the environment
- Extraction of information is substantially harder
  - The signal that has to be found is less predictable
  - Need to find a "common signature" or common traits that objects of interest share so we can look for them
- Passive sensors are usually less precise but can often be used in a wider range of environments

Passive sensors require extracting naturally occurring patterns, which are generally not as repeatable or as reliable. This means much higher computational effort, but the signals do not get drowned out (e.g. active light-based sensors often do not work in very bright or hot environments).

Vision

- Vision is one of the sensor modalities that provides the most information, if that information can be extracted successfully
- Can provide remote information about existing objects
  - Stereo vision can determine distances to objects
  - Requires solution of the correspondence problem
- Requires finding patterns that indicate the presence and identity of objects

Vision is one of the most important senses for us because it works at a distance and can provide a huge amount of information at once. In exchange, a lot of processing has to be done to work with vision.

Spatial vs. Frequency Space

- As with auditory signals, images can be processed either in the spatial or in the frequency domain
- In the spatial domain, images are 2-dimensional arrays of intensity values
- In the frequency domain, images are represented in terms of frequencies and phases
  - Strong brightness changes correspond to high frequencies, while uniform intensities correspond to low frequencies

As with audio, images can be represented in terms of spatial representations or in the frequency domain.
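Not part of the slides: a minimal NumPy sketch of what the frequency-domain view looks like in practice. It assumes a grayscale image is already available as a 2D float array and simply computes the (log-)magnitude spectrum of the 2D Fourier transform; the synthetic step-edge image is only a stand-in.

```python
import numpy as np

def magnitude_spectrum(img):
    """Return the log-magnitude of the 2D Fourier transform of a grayscale image.

    Low frequencies (uniform regions) end up near the center after fftshift;
    high frequencies (sharp brightness changes) end up toward the edges.
    """
    F = np.fft.fft2(img)            # 2D DFT: complex frequencies and phases
    F = np.fft.fftshift(F)          # move the zero-frequency term to the center
    return np.log1p(np.abs(F))      # log scale so weak frequencies stay visible

# Example with a synthetic image: a vertical step edge (a sharp brightness
# change) produces strong high-frequency content.
img = np.zeros((64, 64))
img[:, 32:] = 255.0
spectrum = magnitude_spectrum(img)
print(spectrum.shape, spectrum.max())
```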

Spatial vs. Frequency Space

- The frequency domain (e.g. Fourier space) makes extracting certain properties easier
- Here, the frequency domain shows that most high-contrast lines (sharp edges) are oriented at 45°

Frequency analysis can make certain properties easier to extract but is generally less intuitive. While it is frequently used in computer vision, in this robotics course we will concentrate on spatial analysis of images since it is more intuitive.

Low Level Vision

- Image processing and computer vision deal with the automatic analysis of images
- For many techniques, images have to be represented as 2D arrays of numbers
- Color images have 3 components:
  - Red, Green, Blue in RGB space – a color is a point in 3D space
  - Hue, Saturation, Value in HSV space – color becomes a single number (hue), while the others represent color intensity and brightness
- For many vision techniques we have to extract patterns in only one component at a time (often intensity)

Image processing techniques often have to work on a single number per pixel. Most often this is intensity (basically treating the image as grey-scale), but it could also be hue or anything else.
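As a concrete illustration (not from the slides) of collapsing a color image to one number per pixel, the sketch below converts an RGB array to an intensity image and to HSV. The Rec. 601 luminance weights and the use of matplotlib's color helper are my own assumptions; a plain per-pixel mean would also serve.

```python
import numpy as np
import matplotlib.colors as mcolors

def to_intensity(rgb):
    """Collapse an (H, W, 3) RGB image to one intensity value per pixel."""
    # Rec. 601 luminance weights; a plain mean over the channels also works.
    return rgb @ np.array([0.299, 0.587, 0.114])

rgb = np.random.rand(4, 4, 3)        # stand-in for a loaded image, values in [0, 1]
gray = to_intensity(rgb)             # (4, 4): one number per pixel
hsv = mcolors.rgb_to_hsv(rgb)        # (4, 4, 3): hue, saturation, value
print(gray.shape, hsv[..., 0].min(), hsv[..., 0].max())   # hue is a single number in [0, 1)
```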

Early Vision and Visual Processing

- Many models for visual processing in animals and humans have been devised
- A very common model sees vision and object recognition as a sequence of processing phases
  - Different information is extracted in each stage
- The final goal is to recognize what objects are in the image and where they are

Early Vision and Visual Processing

Filtering → Feature Extraction → Grouping → Recognition

- Filtering: Used to clean and normalize images
- Feature extraction: Finds locations where simple patterns are present in the image
- Grouping: Puts together local features into object parts
- Recognition: Identifies object identity

A general vision model follows the filtering – feature extraction – grouping – recognition sequence. Starting with an image, filtering normalizes and cleans the image; feature extraction pulls out local patterns, resulting in a set of feature maps (note that these are not really images but rather arrays of pattern strength values – they do not have to look like the features; e.g. a corner feature results not in something that looks like a corner but rather in a single point); grouping puts together local features into larger parts (either boundaries or surfaces); recognition, finally, identifies objects. The first two steps are often called early vision, and there is good evidence that they exist in the brain (e.g. there is evidence of edge detectors in the visual system).

Filtering

- Filtering serves the purpose of making subsequent steps easier
  - Removing noise from the image
  - Normalizing the images
    - Adjusting for different brightness
    - Adjusting for contrast differences
- In nature, some filtering is performed in hardware and some in "software"
  - Iris adjustment to normalize global brightness
  - Local brightness adjustment

Filtering is a first step. It mainly cleans up images and is intended to provide a more uniform/normalized image to simplify processing.

Filtering

- The effect of local filtering can be profound
- Squares A and B are actually exactly the same brightness

Filtering in the human visual system can be seen in certain visual illusions. In order to better judge colors, the human system normalizes based on local neighborhoods (thus making it impossible to see the actual brightness, only what it would represent given the rest of the image).

Filtering

- Histogram equalization
  - Adjusting brightness to be equally distributed
  - While it doesn't add information, it can make information easier to find
  - Stretch and compress intensities to make each intensity interval equally likely

Histogram equalization tries to make sure that each intensity occurs about equally often (i.e. that the cumulative frequency distribution over the intensity values is approximately a straight line).
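A minimal histogram-equalization sketch (my own illustration, assuming an 8-bit grayscale image stored as a 2D uint8 NumPy array): each intensity is mapped through the normalized cumulative histogram, which is what makes the intensity values end up roughly equally likely.

```python
import numpy as np

def equalize_histogram(img):
    """Histogram equalization for an 8-bit grayscale image (2D uint8 array).

    After the mapping, the cumulative frequency distribution over intensities
    is approximately a straight line, i.e. every intensity interval is about
    equally likely.
    """
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]                           # normalize to [0, 1]
    lookup = np.round(255 * cdf).astype(np.uint8)
    return lookup[img]                       # map every pixel through the table

# Example: a dark, low-contrast image gets stretched over the full range.
img = (np.random.rand(100, 100) * 60 + 20).astype(np.uint8)
eq = equalize_histogram(img)
print(img.min(), img.max(), "->", eq.min(), eq.max())
```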

Filtering

- Smoothing
  - Removing noise to reduce the effect of single pixels
  - Replace each pixel with the average of its local neighborhood
  - Convolution provides a mechanism for this
    - Convolution computes a cross-correlation value

Smoothing reduces noise. Smoothing can be uniform (like here) or it can use any other weighting function (e.g. Gaussian).
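A mean (box) filter of the kind described here can be written as a convolution with a uniform kernel. The sketch below is illustrative only (SciPy's ndimage convolution, 'reflect' border handling, and the 3x3 size are my choices); a Gaussian kernel could be substituted for the uniform one.

```python
import numpy as np
from scipy import ndimage

def box_smooth(img, size=3):
    """Replace each pixel by the average of its size x size neighborhood."""
    kernel = np.full((size, size), 1.0 / (size * size))   # e.g. a 3x3 kernel of 1/9
    # 'reflect' padding handles border pixels, where the neighborhood
    # would otherwise fall partly outside the image.
    return ndimage.convolve(img.astype(float), kernel, mode='reflect')

noisy = np.random.rand(64, 64) * 255
smooth = box_smooth(noisy, size=3)
print(noisy.std(), ">", smooth.std())      # pixel-to-pixel variation is reduced
```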

Convolution

- Convolution is a very generic mechanism for filtering and feature extraction
- Cross-correlation computes the "similarity" between a local image region and a template
- Convolution computes a cross-correlation value for each pixel

(The slide illustrates this with a small grid of intensity values, e.g. 112 and 85, convolved with a uniform 1/9 template.)

Convolution is a general mechanism to compute the similarity of each image region with a template. The final value is a measure of similarity: the larger, the more similar. It is important to note that around the boundary no cross-correlation can be computed; either the feature map has to be smaller than the image, or default "padding" values have to be used for the parts that are not on the image (e.g. the average intensity of the image).
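A hand-rolled NumPy sketch of per-pixel cross-correlation, using mean-intensity padding (one of the two boundary options named above). This is a slow reference version for clarity; the function name and the synthetic example are mine, not from the slides.

```python
import numpy as np

def cross_correlate(img, template):
    """Compute the cross-correlation of a template with every image position.

    The image is padded with its average intensity so the output has the
    same size as the input; larger output values mean the local neighborhood
    looks more similar to the template.
    """
    th, tw = template.shape
    ph, pw = th // 2, tw // 2
    padded = np.pad(img.astype(float), ((ph, ph), (pw, pw)),
                    mode='constant', constant_values=img.mean())
    out = np.zeros(img.shape, dtype=float)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            region = padded[r:r + th, c:c + tw]
            out[r, c] = np.sum(region * template)   # per-pixel similarity value
    return out

# Smoothing as a special case: a 3x3 template of 1/9 averages the neighborhood.
img = np.random.rand(32, 32) * 255
mean_filter = np.full((3, 3), 1.0 / 9.0)
print(cross_correlate(img, mean_filter).shape)   # (32, 32)
```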

Feature Extraction

- Feature extraction is aimed at identifying locations in which particular patterns (features) occur
  - Identifying parts that could indicate objects
- Features have to be simple so this can be done fast
  - Edgels (small parts of edges)
  - Corners
  - Texture features
  - Motion features
  - …
- Convolution is often used for feature extraction

There is good evidence that our visual system has specialized feature extractors (lines, motion, etc.).

Edge Detection

- Edges are some of the most prominent low-level features
  - Edges make up the outline of most objects
  - Edges separate geometric shapes
- Edge detection using convolution requires edge templates
  - Edge templates should look like edges
  - Normalized templates average to 0

Edge detection is an important step.

Edge Templates

- Edge templates can be distinguished based on their size and how sensitive to orientation they are
  - Roberts templates (2x2):

        1  0      0  1
        0 -1     -1  0

  - Prewitt templates (3x3), horizontal and vertical:

        -1 -1 -1     -1  0  1
         0  0  0     -1  0  1
         1  1  1     -1  0  1

  - Sobel templates (3x3), horizontal and vertical:

        -1 -2 -1     -1  0  1
         0  0  0     -2  0  2
         1  2  1     -1  0  1

Roberts templates are the simplest, but they are very sensitive to bad pixel values. Prewitt is more robust because it is larger; it is very sensitive to orientation. Sobel is less sensitive to orientation than Prewitt (because of the increased emphasis on the center row/column).
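A sketch of applying such templates with convolution. The kernels are the standard Sobel pair (assumed, not copied from the slide figure), and SciPy's ndimage.convolve is used for brevity instead of the hand-rolled cross-correlation above.

```python
import numpy as np
from scipy import ndimage

# Standard Sobel templates; rows/columns sum to 0, so uniform regions give no response.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)   # responds to vertical edges
SOBEL_Y = SOBEL_X.T                             # responds to horizontal edges

def sobel_edges(img):
    """Return the gradient magnitude of a grayscale image using Sobel templates."""
    gx = ndimage.convolve(img.astype(float), SOBEL_X, mode='reflect')
    gy = ndimage.convolve(img.astype(float), SOBEL_Y, mode='reflect')
    return np.hypot(gx, gy)

# A vertical step edge produces a strong response only in the columns next to it.
img = np.zeros((16, 16)); img[:, 8:] = 255.0
edges = sobel_edges(img)
print(edges[:, 7:10].max(), edges[:, :5].max())   # large near the edge, 0 far from it
```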

Edge Templates

- Using just horizontal and vertical templates it is possible to estimate edge angles
  - The horizontal cross-correlation is proportional to the horizontal component (cosine of the angle)
  - The vertical cross-correlation is proportional to the vertical component (sine of the angle)
- Orientation-independent edge templates
  - Laplacian template:

         0 -1  0
        -1  4 -1
         0 -1  0

Angle estimation can be performed from the horizontal and vertical (e.g. Sobel) template responses. Laplacian templates look for contrast in all directions and are thus orientation independent; however, they are more noise sensitive than Prewitt or Sobel.
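Assuming `gx` and `gy` are the horizontal and vertical template responses (e.g. from the Sobel sketch above), the edge orientation follows from arctan2, since gx is proportional to the cosine and gy to the sine of the angle. The Laplacian kernel used here is the common 4-neighbor version; all names are illustrative.

```python
import numpy as np
from scipy import ndimage

# 4-neighbor Laplacian template: responds to contrast in all directions,
# so it is orientation independent (but more noise sensitive).
LAPLACIAN = np.array([[ 0, -1,  0],
                      [-1,  4, -1],
                      [ 0, -1,  0]], dtype=float)

def edge_orientation(gx, gy):
    """Edge angle (radians) from horizontal and vertical template responses."""
    return np.arctan2(gy, gx)   # gy ~ sin(angle), gx ~ cos(angle)

img = np.zeros((16, 16)); img[:, 8:] = 255.0
gx = ndimage.convolve(img, np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]))
gy = ndimage.convolve(img, np.array([[-1., -2., -1.], [0., 0., 0.], [1., 2., 1.]]))
theta = edge_orientation(gx, gy)      # per-pixel edge orientation estimate
lap = ndimage.convolve(img, LAPLACIAN)
print(theta.shape, lap.shape)
```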

Edge Detection

- Different edge detectors have different strengths
- Roberts vs. Sobel (example output images)

The example shows that Roberts is noisier than Sobel.

Template Matching

- Convolution identifies image regions that look similar to a template
- Can use this to find general patterns by making the template look like the pattern we are looking for
- Can use an image region directly as a template
  - Computing the cross-correlation between an image region and the template (target image piece) gives a measure of how similar they are
  - Similarity is measured per pixel, and thus a region has to be the same size and orientation as the template to be considered similar

Template matching uses an image part as the template.
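A raw template-matching sketch: cut a region out of an image, cross-correlate it against the image, and take the position of the strongest response. The function name, the toy image, and the use of scipy.signal.correlate2d are my own assumptions; note that, as the next slide discusses, the raw score is biased toward bright regions.

```python
import numpy as np
from scipy.signal import correlate2d

def match_template_raw(img, template):
    """Return the (row, col) of the highest raw cross-correlation response."""
    response = correlate2d(img.astype(float), template.astype(float), mode='same')
    return np.unravel_index(np.argmax(response), response.shape)

# Plant a bright blob in a dark image and use a copy of it as the template.
img = np.zeros((40, 40))
img[20:25, 10:15] = 255.0
template = img[20:25, 10:15].copy()
print(match_template_raw(img, template))    # (22, 12): the center of the planted blob
```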

Normalized Template Matching

- Intensity values of an image are all positive
  - Thus templates are not natively normalized
- What influence does the brightness of the image have on the result?
  - Increasing the values of the pixels increases the correlation value
  - Brighter image regions look more "similar"
- Normalization can be used to compensate for this

Increasing the brightness of the template or the image increases the cross-correlation value (twice as bright means twice as high a value). This makes it difficult to determine whether a region actually looks similar to the template. Normalization for brightness differences can reduce this effect: subtract the average of the image from the image and the average of the template from the template (thus making both average to 0).

Normalized Template Matching

- What influence does the contrast of the image have on the result?
  - Contrast is usually measured in terms of the standard deviation of the pixels in the image
  - Increasing contrast scales the correlation
  - Higher contrast yields stronger positive and negative "matches"
- Normalization can be used to compensate for this

Reduced contrast (e.g. resulting from bad lighting conditions, over-exposure, or being out of focus) reduces the size of the positive and negative matches. Contrast (as standard deviation) is also used in most low-cost digital cameras to drive auto-focus (that is why they always run a little too far and then back up – they search for the point where the contrast is maximum).
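Putting the two normalizations from this slide and the previous one together gives normalized cross-correlation. A minimal sketch (my own helper name and example): subtract the means to remove brightness differences, divide by the standard deviations to remove contrast differences, so the score lands in [-1, 1].

```python
import numpy as np

def ncc(region, template):
    """Normalized cross-correlation between one image region and a template.

    Both are shifted to zero mean (removes brightness differences) and scaled
    by their standard deviations (removes contrast differences); 1 means a
    perfect match up to brightness and contrast.
    """
    r = region.astype(float) - region.mean()
    t = template.astype(float) - template.mean()
    denom = r.std() * t.std() * r.size
    if denom == 0:                       # flat region or flat template: undefined
        return 0.0
    return float(np.sum(r * t) / denom)

# Doubling the contrast and adding a brightness offset leaves the score unchanged.
template = np.random.rand(8, 8)
region = 2.0 * template + 50.0           # same pattern, brighter and higher contrast
print(round(ncc(region, template), 6))   # 1.0
```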

Global vs. Local Normalization

- What if there are strong lighting differences across the image?
- Normalization can be performed for each image region separately to address this
  - Higher computational cost, since the mean and standard deviation have to be computed once per pixel

Local normalization is usually better but much more expensive.
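Local normalization amounts to computing the normalized cross-correlation once per window, with the mean and standard deviation taken over each local window rather than the whole image. A slow reference sketch (names and the synthetic lighting example are mine; real implementations use much faster formulations):

```python
import numpy as np

def local_ncc_match(img, template):
    """Locally normalized template matching (slow reference implementation).

    The mean and standard deviation are recomputed for every image window,
    so strong lighting differences across the image no longer bias the score.
    """
    img = img.astype(float)
    t = template.astype(float) - template.mean()
    th, tw = template.shape
    out = np.full((img.shape[0] - th + 1, img.shape[1] - tw + 1), -1.0)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            win = img[r:r + th, c:c + tw]
            w = win - win.mean()                 # per-window brightness normalization
            denom = w.std() * t.std() * t.size   # per-window contrast normalization
            if denom > 0:
                out[r, c] = np.sum(w * t) / denom
    return out

# The template is found even though the right half of the image is much brighter.
img = np.random.rand(30, 30) * 50
img[:, 15:] += 150.0                    # strong lighting difference across the image
img[20:25, 20:25] = np.arange(25).reshape(5, 5) + 150.0
template = np.arange(25).reshape(5, 5).astype(float)
scores = local_ncc_match(img, template)
print(np.unravel_index(np.argmax(scores), scores.shape))   # (20, 20)
```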