Goals of Computer Vision


1 Goals of Computer Vision
To make useful decisions based on sensed images.
To construct 3D structure from 2D images.

2 Related Areas of Computer Vision
Computer Graphics
Image Processing
Pattern Recognition
Artificial Intelligence
Virtual Reality

3 Applications of Computer Vision
Human Face Identification
Image Database Query
Inspecting Products
Examining the Inside of a Human Head (MRI)
Optical Character Recognition (OCR)
Satellite Image Analysis
Unmanned Autonomous Vehicles (UAV)

4 Human Visual Perception

5 Human eye

6 Human eye is more sensitive to luminance than to chrominance
How do we see an object? Light, object, and eye.
Luminance → Lightness → Rods
Chrominance → Color → Cones
The human eye is more sensitive to luminance than to chrominance.

7 Cones & Rods (day & night)

8 Machine Visual Perception

9 Digital Image

10 Sampling & Quantization

11 Types of Images
Analog Image
Digital Image
Binary Image
Gray-scale Image
Color (Multispectral) Image

12 Image Formats
Vector Image
Bitmap Image
RAW → no header
RLE (Run-Length Encoding)
PGM, PPM, PNM (Portable Gray/Pixel/Any Map)
GIF (Graphics Interchange Format) → no more than 256 colors
TIF (Tagged Image File Format) → scanners
EPS (Encapsulated PostScript) → printers
JPEG (Joint Photographic Experts Group) → high compression ratio
MPEG (Moving Picture Experts Group) → video

13 Comparison of Image Formats

14 Chap 3: Binary Image Analysis

15 Counting Foreground Objects

16 Example

17 Counting Object Algorithm

18 Counting Background Objects

19 Connected Component Labeling

20 1. Recursive Connected Components Algorithm

21 Example

22 2. Classical Connected Components Algorithm

23 Size Filter
To remove small-size noise: after connected component labeling, all components whose size is below a threshold T are removed by changing the corresponding pixels to 0.
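A minimal sketch of such a size filter in Python, assuming a binary numpy image and 8-connectivity (the threshold T is a parameter):

```python
import numpy as np
from scipy import ndimage

def size_filter(binary_img: np.ndarray, T: int) -> np.ndarray:
    """Remove connected components smaller than T pixels."""
    # Label connected components (8-connectivity structuring element).
    structure = np.ones((3, 3), dtype=int)
    labels, num = ndimage.label(binary_img, structure=structure)
    # Count the pixels in each component (index 0 is the background).
    sizes = np.bincount(labels.ravel())
    # Zero out every labeled component whose size is below T.
    small = np.isin(labels, np.where(sizes < T)[0]) & (labels > 0)
    out = binary_img.copy()
    out[small] = 0
    return out
```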

24 Pepper & Salt Noise Reduction
Change a pixel from 0 to 1 if all of its neighboring pixels are 1.
Change a pixel from 1 to 0 if all of its neighboring pixels are 0.
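A direct numpy sketch of this rule, assuming a binary image and 8-neighborhoods; border pixels are left unchanged:

```python
import numpy as np

def reduce_salt_pepper(img: np.ndarray) -> np.ndarray:
    """Flip isolated pixels that disagree with all 8 neighbors."""
    out = img.copy()
    for r in range(1, img.shape[0] - 1):
        for c in range(1, img.shape[1] - 1):
            nb = img[r-1:r+2, c-1:c+2].sum() - img[r, c]  # 8-neighbor sum
            if img[r, c] == 0 and nb == 8:    # 0 pixel surrounded by 1s
                out[r, c] = 1
            elif img[r, c] == 1 and nb == 0:  # 1 pixel surrounded by 0s
                out[r, c] = 0
    return out
```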

25 Expanding & Shrinking

26 Example 1

27 Example 2

28 Morphological Filter

29 Example

30 Example

31 Closing & Opening

32 Opening Example

33 Morphological Filter Example 1

34 Structure Element Example 1

35 Morpho-logical Filter Example 2

36 Structure Element Example 2

37 Conditional Dilation

38 Conditional Dilation Example

39 Area & Centroid

40 Perimeter

41 Circularity

42 Second Moment

43 Orientation

44 Bounding Box

45 Thresholding

46 P-Tile Method

47 Mode Method

48 Mode Algorithm

49 Iterative Method

50 Adaptive Method

51 Adaptive Method Example

52 Variable Thresholding Example

53 Double Thresholding Method

54 Double Thresholding Example

55 Recursive Histogram Clustering

56 Chapter 4: Pattern Recognition

57 Classification
Classification is a process that assigns a label to an object according to some representation of the object's properties. A classifier is a device or algorithm that inputs an object representation and outputs a class label. The reject class is a generic class for objects that cannot be placed in any of the designated known classes.

58 Error & Reject rate Empirical error rate of a classification system is the number of errors made on independent test data divided by the number of classifications attempted. Empirical reject rate of a classification system is the number of rejects made on independent test data divided by the number of classifications attempted.

59 False Alarm & Miss Detection
Two-class problem example: whether a person has a disease or not.
False Alarm (False Positive): the system incorrectly says that the person has the disease.
Miss Detection (False Negative): the system incorrectly says that the person does not have the disease.

60 Receiver Operating Curve (ROC)

61 Precision & Recall Example: the objective of document retrieval (image retrieval) is to retrieve interesting objects and not too many uninteresting objects according to features supplied in a user’s query. Precision is the number of relevant documents retrieved divided by the total number of documents retrieved. Recall is the number of relevant documents retrieved by the system divided by the total number of relevant documents in the database.

62 Example Suppose an image database contains 200 sunset images.
Suppose an automatic retrieval system retrieves 150 of those 200 relevant images and 100 other images. Then precision = 150/250 = 60% and recall = 150/200 = 75%.

63 Features used for representation
Area of the character in units of black pixels
Height and width of the bounding box of its pixels
Number of holes inside the character
Number of strokes forming the character
Center (centroid) of the set of pixels
Best axis direction (orientation) through the pixels, as the axis of least inertia
Second moments of the pixels about the axes of least and most inertia

64 Example Features

65 Classification using nearest mean

66 Euclidean Distance

67 Example

68 Classification using nearest neighbors
A brute-force approach computes the distance from x to all samples in the database and remembers the minimum distance. Then x is classified into the same category as its nearest neighbor.
Advantage: new labeled samples can be added to the database at any time.
A better approach is the k-nearest-neighbors rule.
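A minimal numpy sketch of the k-nearest-neighbors rule (the array names and the choice k=3 are illustrative; labels are assumed to be non-negative integers):

```python
import numpy as np

def knn_classify(x, samples, labels, k=3):
    """Classify x by majority vote among its k nearest samples."""
    # Euclidean distance from x to every stored sample.
    dists = np.linalg.norm(samples - x, axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(dists)[:k]
    # Majority vote among the neighbors' labels.
    votes = np.bincount(labels[nearest])
    return np.argmax(votes)
```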

69 Structural Pattern Recognition
Graph matching algorithm can be used to perform structural pattern recognition.

70 Two characters with the same global features but different structure
Lid : a virtual line segment that closes up a bay. Left : specifies that one lid lies on the left of another. Right : specifies that one lid lies on the right of another.

71 Confusion Matrix Reject Rate = Error Rate =

72 Decision Tree 1

73 Decision Tree 2

74 Automatic Construction of a Decision Tree
Information content I(C;F) of the class variable C with respect to the feature variable F is defined by the formula below. The feature variable F with maximum I(C;F) will be selected as the first feature to be tested.
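The formula itself was an image on the original slide; assuming it follows the standard mutual-information form used for decision-tree induction, it would read:

```latex
I(C;F) = \sum_{c}\sum_{f} p(c,f)\,\log_2 \frac{p(c,f)}{p(c)\,p(f)}
```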

75 Example I(C;X) = I(C;Y) = I(C;Z) =

76 General Case At any branch node of the tree when the selected feature does not completely separate a set of training samples into the proper classes  the tree construction algorithm is invoked recursively for the subsets of training samples at each of the child nodes.

77 Bayesian Decision Making

78 Bayesian classifier
A Bayesian classifier classifies an object into the class to which it is most likely to belong, based on the observed features. In other words, it makes the classification decision w_i that maximizes P(w_i|x). Since p(x) is the same for all classes, comparing p(x|w_i)P(w_i) is enough. Poisson, Exponential, and Normal (Gaussian) distributions are commonly used for p(x|w_i).
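A toy sketch of this decision rule with Gaussian class-conditionals; the densities, priors, and values here are illustrative assumptions, not from the slides:

```python
import numpy as np
from scipy.stats import norm

def bayes_classify(x, params, priors):
    """Pick the class w_i maximizing p(x|w_i) * P(w_i)."""
    # params[i] = (mean, std) of the Gaussian class-conditional p(x|w_i).
    scores = [norm.pdf(x, mu, sd) * prior
              for (mu, sd), prior in zip(params, priors)]
    return int(np.argmax(scores))

# Two classes with different means and equal priors; 1.8 is closer to class 1.
print(bayes_classify(1.8, params=[(0.0, 1.0), (2.0, 1.0)], priors=[0.5, 0.5]))
```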

79 Chapter 5. Filtering Images

80 Image Enhancement & Restoration
Image Enhancement operators improve the detectability of important image details or objects by man or machine. Image Restoration attempts to restore a degraded image to an ideal condition.

81 Histogram Stretching

82 Histogram Equalization
The output image should use all available gray levels The output image has approximately the same number of pixels of each gray level.

83 Histogram Equalization Example

84 Steps to perform Histogram Equalization
1. Calculate the cumulative histogram.
2. Normalize by dividing by the total number of pixels.
3. Multiply these values by the maximum gray-level value, then round the result to the closest integer.
4. Map the original values to the results from step 3.
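A compact numpy sketch of these four steps, assuming an 8-bit gray-scale image:

```python
import numpy as np

def equalize(img: np.ndarray) -> np.ndarray:
    """Histogram equalization following the four steps above."""
    hist = np.bincount(img.ravel(), minlength=256)
    cum = np.cumsum(hist)                            # step 1: cumulative histogram
    norm = cum / img.size                            # step 2: normalize by pixel count
    mapping = np.round(norm * 255).astype(np.uint8)  # step 3: scale and round
    return mapping[img]                              # step 4: map original values
```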

85 Practice
Original Gray Level | Number of Pixels | Step 1 | Step 2 | Step 3
0 | 10 | | |
1 | 8  | | |
2 | 9  | | |
3 | 4  | | |
4 | 14 | | |
5 |    | | |
6 |    | | |
7 |    | | |

86 Removal of small regions
Removal of Salt-and-Pepper Noise Removal of Small Components

87 Convolution process Kernel or Mask

88 Convolution

89 Image Smoothing
Mean Filtering
Gaussian Filtering
Median Filtering
Goals: smooth uniform regions while preserving edge structure.
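A brief sketch comparing the three filters with scipy.ndimage (kernel sizes and the stand-in image are illustrative):

```python
import numpy as np
from scipy import ndimage

img = np.random.randint(0, 256, (64, 64)).astype(float)  # stand-in image

mean_f   = ndimage.uniform_filter(img, size=3)    # mean (box) filtering
gauss_f  = ndimage.gaussian_filter(img, sigma=1)  # Gaussian filtering
median_f = ndimage.median_filter(img, size=3)     # median filtering
```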

90 Mean Filtering Example

91 Gaussian Filtering Masks

92 Properties of smoothing masks
The amount of smoothing and noise reduction is proportional to the mask size. Step edges are blurred in proportion to the mask size.

93 Median Filtering Example

94 Detecting Edges

95 Edge Detection Masks

96 Properties of derivative masks
The sum of the coefficients of a derivative mask is zero, so that a zero response is obtained on constant regions. First derivative masks produce high absolute values at points of high contrast. Second derivative masks produce zero-crossings at points of high contrast.
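As an illustration of these properties, a sketch using the Sobel masks (whose coefficients sum to zero) to compute edge magnitude and orientation:

```python
import numpy as np
from scipy import ndimage

# Sobel masks: the coefficients sum to zero,
# so the response on constant regions is zero.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
sobel_y = sobel_x.T

def edge_magnitude_orientation(img):
    gx = ndimage.convolve(img.astype(float), sobel_x)
    gy = ndimage.convolve(img.astype(float), sobel_y)
    mag = np.hypot(gx, gy)    # edge magnitude
    ori = np.arctan2(gy, gx)  # edge orientation (radians)
    return mag, ori
```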

97 Edge Magnitude & Orientation

98 Edge Detection Example

99 Laplacian Of Gaussian (LOG)

100 Two equivalent methods
1. Convolve the image with a Gaussian smoothing filter and compute the Laplacian of the result.
2. Convolve the image with the linear filter that is the Laplacian of the Gaussian filter.
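A sketch checking the equivalence numerically with scipy.ndimage; the value of sigma is illustrative, and the two results agree only up to discretization error:

```python
import numpy as np
from scipy import ndimage

img = np.random.rand(64, 64)

# Method 1: Gaussian smoothing, then a discrete Laplacian.
method1 = ndimage.laplace(ndimage.gaussian_filter(img, sigma=2))

# Method 2: single convolution with the Laplacian-of-Gaussian filter.
method2 = ndimage.gaussian_laplace(img, sigma=2)

print(np.abs(method1 - method2).max())  # small, up to discretization error
```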

101 Zero crossing detection
A zero crossing at a pixel implies that the values of the two opposing neighboring pixels in some direction have different signs. There are four cases to test: up/down, left/right, up-left/down-right, and up-right/down-left.

102 Gaussian Equations

103 Gaussian Plots

104 Gaussian Properties
Symmetric.
95% of the total weight is contained within 2σ of the center.
In the first derivative of the 1D Gaussian, the extreme points are located at −σ and +σ.
In the second derivative of the 1D Gaussian, the zero crossings are located at −σ and +σ.
The LOG filter responds well to:
small blobs coinciding with the center lobe
large step edges very close to the center lobe

105 LOG Masks

106 LOG Example

107 Chapter 6. Color & Shading

108 Perception of objects

109 Perception of objects
The spectrum (energy) of the light source.
The spectral reflectance of the object surface.
The spectral sensitivity of the sensor.

110 Human eye is more sensitive to luminance than to chrominance
How do we see an object? Light, object, and eyes.
Luminance → Lightness
Chrominance → Color
The human eye is more sensitive to luminance than to chrominance.

111 Light Spectrum

112 Chromaticity Diagram

113 RGB Model

114 RGB signals from a video camera

115 RGB Colors
Colors specify a mixture of red, green, and blue light, with values between 0.0 (none) and 1.0 (lots):

Color   | Red | Green | Blue
White   | 1.0 | 1.0   | 1.0
Black   | 0.0 | 0.0   | 0.0
Yellow  | 1.0 | 1.0   | 0.0
Magenta | 1.0 | 0.0   | 1.0
Cyan    | 0.0 | 1.0   | 1.0

116 Normalized RGB r+g+b=1

117 HSI Model

118 Light vs. Pigment

119 CMY Model

120 YIQ Model
TV transmission:
digital space → YCbCr
analog space → YIQ (NTSC), YUV (PAL)

121 YUV & YCBCR Model

122 TV Broadcast

123 Color Histogram
Color histograms are relatively invariant to translation, rotation, and scaling.
Simple methods for color histogram construction:
Concatenate the two high-order bits of each RGB color code → 64 bins.
Compute three separate R, G, and B histograms (4 bits each) and concatenate them into one → 48 bins.
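A sketch of the first construction, assuming an 8-bit RGB numpy image:

```python
import numpy as np

def color_histogram_64(rgb: np.ndarray) -> np.ndarray:
    """64-bin histogram from the two high-order bits of R, G, and B."""
    r = rgb[..., 0] >> 6            # top two bits of each channel
    g = rgb[..., 1] >> 6
    b = rgb[..., 2] >> 6
    code = (r << 4) | (g << 2) | b  # 6-bit color code: rrggbb
    return np.bincount(code.ravel(), minlength=64)
```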

124 Similarity measure for histogram matching
It is common to smooth the histogram before matching, to accommodate minor shifts of the reflectance spectrum.

125 Color Segmentation A plot of pixels (r,g) taken from different images containing faces. (r,g) : normalized red and green values

126 Face Detection
Face region classification (R > G > B).
Connected component labeling.
Select the largest component as the face object, assuming there is only one face in the image.
Discard the remaining components or merge them with the face object.
Compute the locations of the eyes and nose.

127 Shading

128 Three types of Material Reflection
Diffuse → color of reflected light from diffuse reflection (light scattered randomly)
Ambient → amount of background light the surface reflects
Specular → color of reflected light from specular reflection (light reflected in a regular manner)

129 Diffuse Reflection

130 Specular Reflection

131 Darkening with Distance

132 Complications The above models of illumination and reflection are simplified. Some objects reflect light as well as emit light. For example: light bulbs. In uncontrolled scenes, such as outdoor scenes, it is much more difficult to account for the different phenomena.

133 Chapter 7 Texture

134 Texture Analysis Structural approach Statistical approach

135 Structural approach

136 Shape from texture Size Shape Density

137 Statistical approach
Edge density and direction
Local binary pattern
Co-occurrence matrices and features
Laws texture energy measures
Autocorrelation

138 Edge density and direction
Gradient magnitude: Mag(p)
Gradient direction: Dir(p)

139 Histogram based texture description
Magnitude histogram: Hmag(R)
Direction histogram: Hdir(R)

140 Example

141 L1 Distance between two histograms
Another similarity measure for histogram matching

142 Local binary pattern
For each pixel p in the image, the eight neighbors are examined to see if their intensity is greater than that of p. The results from the eight neighbors are used to construct an eight-digit binary number b1b2b3b4b5b6b7b8, where bi = 0 if the intensity of the i-th neighbor is less than or equal to that of p, and bi = 1 otherwise. A histogram of these numbers is used to represent the texture of the image.
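A direct, unoptimized sketch of this descriptor for a gray-scale numpy image; border pixels are skipped for simplicity:

```python
import numpy as np

def lbp_histogram(img: np.ndarray) -> np.ndarray:
    """256-bin histogram of 8-neighbor local binary patterns."""
    # Clockwise offsets of the 8 neighbors, starting at the top-left.
    offsets = [(-1,-1), (-1,0), (-1,1), (0,1), (1,1), (1,0), (1,-1), (0,-1)]
    hist = np.zeros(256, dtype=int)
    for r in range(1, img.shape[0] - 1):
        for c in range(1, img.shape[1] - 1):
            code = 0
            for dr, dc in offsets:
                # b_i = 1 if the neighbor is strictly brighter than p.
                code = (code << 1) | int(img[r+dr, c+dc] > img[r, c])
            hist[code] += 1
    return hist
```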

143 Practice

144 Co-occurrence matrices

145 Practice

146 Two variations of the co-occurrence matrix
Normalized co-occurrence matrix Symmetric co-occurrence matrix

147 Features derived from co-occurrence matrix

148 Laws texture energy measures
Remove the effects of illumination by moving a small window around the image and subtracting the local average from each pixel. Nine different 5x5 masks are applied to the preprocessed image, producing nine filtered images. Smooth the filtered images using an absolute-value mean filter. The output is a single image with a vector of nine texture attributes at each pixel.

149 Laws nine 5x5 masks
L5 (Level)  = [ 1  4  6  4  1]
E5 (Edge)   = [-1 -2  0  2  1]
S5 (Spot)   = [-1  0  2  0 -1]
R5 (Ripple) = [ 1 -4  6 -4  1]
The nine masks: L5E5/E5L5, L5S5/S5L5, L5R5/R5L5, E5S5/S5E5, E5R5/R5E5, S5R5/R5S5, E5E5, S5S5, R5R5.
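A sketch of building one 2D mask (L5E5) and its texture energy image following the steps above; the 15x15 window size is an illustrative choice:

```python
import numpy as np
from scipy import ndimage

L5 = np.array([ 1,  4, 6,  4,  1], dtype=float)
E5 = np.array([-1, -2, 0,  2,  1], dtype=float)

# A 2D Laws mask is the outer product of two 1D vectors, e.g. L5E5.
L5E5 = np.outer(L5, E5)

def laws_energy(img: np.ndarray, mask: np.ndarray) -> np.ndarray:
    # Remove illumination by subtracting a local 15x15 average.
    pre = img - ndimage.uniform_filter(img.astype(float), size=15)
    filtered = ndimage.convolve(pre, mask)
    # Texture energy: smoothed absolute filter response.
    return ndimage.uniform_filter(np.abs(filtered), size=15)
```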

150 Autocorrelation To describe the fineness/coarseness of the texture.

151 Texture Segmentation

152 Chapter 8 Content-Based Image Retrieval

153 Image Database Queries
Query By Keyword: some textual attributes (keywords) should be maintained for each image. The images can be indexed according to these attributes so that they can be rapidly retrieved when a query is issued. This type of query can be expressed in Structured Query Language (SQL).
Query By Example (QBE): the user shows the system a sample image, and the system should return similar images or images containing similar objects.

154 Image Distance & Similarity Measures
Color Similarity Texture Similarity Shape Similarity Object & Relationship similarity

155 Color Similarity
Color percentage matching: e.g. R: 20%, G: 50%, B: 30%.
Color histogram matching: Dhist(I,Q) = (h(I) − h(Q))^T A (h(I) − h(Q)), where A is a similarity matrix; colors that are very similar should have similarity values close to one.
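A one-function sketch of this quadratic-form distance, assuming h(I), h(Q), and A are numpy arrays of compatible shape:

```python
import numpy as np

def hist_distance(h_i: np.ndarray, h_q: np.ndarray, A: np.ndarray) -> float:
    """Quadratic-form histogram distance (h(I)-h(Q))^T A (h(I)-h(Q))."""
    d = h_i - h_q
    return float(d @ A @ d)
```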

156 Color Similarity
Color layout matching compares each grid square of the query to the corresponding grid square of a potential matching image and combines the results into a single image distance, where CI(g) represents the color in grid square g of a database image I and CQ(g) represents the color in the corresponding grid square g of the query image Q. Some suitable representations of color are:
Mean
Mean and standard deviation
Multi-bin histogram

157 IQ based on Color Layout

158 Texture Similarity
Pick and click: suppose T(I) is a texture description vector, a vector of numbers that summarizes the texture in a given image I (for example, Laws texture energy measures); the texture distance measure is then defined in terms of T(I).
Texture layout

159 IQ based on Pick and Click

160 Shape Similarity Shape Histogram Boundary Matching Sketch Matching

161 1. Shape Histogram: Projection Matching
Horizontal & vertical projection: each row and each column becomes a bin in the histogram. The count stored in a bin is the number of 1-pixels that appear in that row or column.
Diagonal projection: an alternative is to define the bins from the top left to the bottom right of the shape.
Size invariant → the number of row bins and the number of column bins in the bounding box can be fixed, and histograms can be normalized before matching.
Translation invariant.
Rotation invariant → compute the axis of the best-fitting ellipse and rotate the shape.

162 Horizontal and vertical projections

163 Diagonal projection

164 1. Shape Histogram: Orientation Histogram
Construct a histogram over the tangent angle at each pixel on the boundary of the shape.
Size invariant → histograms can be normalized before matching.
Translation invariant.
Rotation invariant → choose the bin with the largest count to be the first bin.
Starting point invariant.

165 2. Boundary Matching 1D Fourier Transform on the boundary

166 Fourier Descriptors
If only the first M coefficients (a0, a1, …, aM−1) are used, then the reconstruction is an approximation of un. The coefficients (a0, a1, …, aM−1) are called Fourier descriptors. The Fourier distance measure is defined as shown below.

167 Properties of Fourier Descriptors
Simple geometric transformations of a boundary, such as translation, rotation, and scaling, are related to simple operations of the boundary’s Fourier descriptors.

168 A secret formula A formula for Fourier Descriptor that is invariant to translation, scaling, rotation, and starting point.

169 IQ based on Boundary Matching

170 3. Sketch Matching
Affine transformation to a specified size, followed by median filtering.
Edge detection using a gradient-based edge-finding algorithm → refined edge image.
Thinning and shrinking → abstract image.
The images are divided into grid squares, and matching is performed based on local correlation.

171 3. Sketch Matching
The sketch distance measure is the inverse of the sum of the local correlations, where I(g) refers to grid square g of the abstract image I and Q(g) refers to grid square g of the linear sketch resulting from query image Q.

172 Object and Relational Similarity
Face finding: neural net classifier.
Flesh finding: threshold based on

173 Chapter 9. Motion from 2D Image Sequence

174 Cases of motion
Still camera, still object
Still camera, moving object
Moving camera, still object
Moving camera, moving object
Single object vs. multiple objects

175 Applications
Many safety and security applications, for example:
Automatically switch on a light upon detection of significant motion.
Track the players and ball in a tennis match and provide an analysis of the elements of the game.

176 Uses of motion
Create more observations than are available from a single viewpoint.
Provide information to compute the relative depth of objects, since the images of close objects change faster than the images of remote objects.
Shape from Motion → the multiple viewpoints allow for a triangulating computation similar to binocular stereo.

177 Image Subtraction
Subtract the image I_t from the previous image I_{t-1}.
Subtract the image I from the background image B.
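A minimal sketch covering both variants; the reference frame can be the previous image or a background image, and the threshold value is illustrative:

```python
import numpy as np

def change_mask(curr: np.ndarray, ref: np.ndarray, thresh: float = 25.0):
    """Binary mask of pixels that changed between two gray-scale frames.

    `ref` can be the previous frame I_{t-1} or a background image B.
    """
    diff = np.abs(curr.astype(float) - ref.astype(float))
    return (diff > thresh).astype(np.uint8)
```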

178 Algorithm for Image Subtraction

179 Motion Vectors
Motion field
Focus of expansion (FOE)
Focus of contraction (FOC)
Image flow

180 Decathlete Game

181 Detecting interesting points

182 Search Rectangle

183 Algorithm for Motion Vectors

184 Computing the paths of moving points
Track by motion only; no texture or color information is available for the moving objects.
Assumptions:
The location of a physical object changes smoothly over time.
The velocity of a physical object changes smoothly over time (including both speed and direction).
An object can be at only one location in space at a given time.
Two objects cannot occupy the same location at the same time.

185 Trajectories of two objects

186 Smoothness
If an object i is observed at time instants t = 1, 2, …, n, then the sequence of image points T_i = (p_{i,1}, p_{i,2}, …, p_{i,n}) is called the trajectory of i.

187 Greedy-Exchange Algorithm

188 Example of Greedy-Exchange Algorithm

189 Detecting significant changes in videos
Scene change
Shot change
Camera pan
Camera zoom
Camera effects: fade, dissolve, and wipe
Goal: segment and store video subsequences in digital libraries for random access.

190 Segmenting Video Sequences
The transitions can be used to segment the video and can be detected by large changes in the features of the images over time.

191 Similarity measure by histogram

192 Similarity measure by likelihood ratio
Break the image into larger blocks and test to see if a majority of the blocks are essentially the same in both images.

193 Moving cursor by tracking face

194 Demo

195 Chapter 10. Image Segmentation

196 10.1 Identifying Regions
Clustering
Region Growing

197 Clustering

198 Iterative K-Means Clustering

199 Isodata Clustering

200 Region Growing (Split & Merge Algorithm)
1. Split the image into equally sized regions.
2. Calculate the gray-level variance for each region.
3. If the gray-level variance is larger than a threshold, split the region; otherwise, try to merge the region with its neighbors.
4. Repeat steps 2 and 3.
Gray-level variance:
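The variance formula on the slide was an image; assuming the usual definition over a region R with |R| pixels, it reads:

```latex
\sigma^2 = \frac{1}{|R|} \sum_{(r,c) \in R} \left( I(r,c) - \bar{I}_R \right)^2,
\qquad \bar{I}_R = \frac{1}{|R|} \sum_{(r,c) \in R} I(r,c)
```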

201 Region Growing

202 Recursive Histogram Clustering

203 10.2 Representing Regions
Overlays
Labeled Images
Boundary Coding
Chain Code
Polygonal Approximation
Quadtrees
Property Tables

204 Overlaying

205 Boundary Coding

206 Quadtrees

207 10.3 Identifying Contours
Tracking Existing Region Boundaries (labeled image as input)
Canny Edge Detector (gray-scale image as input)
Aggregating Consistent Neighboring Edges into Curves (binary edge image as input)
Hough Transform (gray-scale image as input)

208 Finding the Borders of Labeled Regions

209 Canny Edge Detector

210 Canny Edge Detector Example

211 Tracking Edges of a Binary Edge Image

212 Hough Transform

213 Accumulator array for Hough Transform

214 Hough Transform for Accumulating Straight Lines

215 Hough Transform Example

216 Hough Transform for Extracting Straight Lines

217 Chap 12. Perceiving 3D from 2D Images

218 Geometric model

219 Epipolar geometry (trivial)

220 Epipolar geometry (general)

221 Epipolar lines and Epipolar curves
Figure 10: Epipolar constraint, showing epipoles e1 and e2 in panels (a) and (b).

