1
Goals of Computer Vision
To make useful decisions based on sensed images. To construct 3D structure from 2D images.
2
Related Areas of Computer Vision
Computer Graphics Image Processing Pattern Recognition Artificial Intelligence Virtual Reality
3
Applications of Computer Vision
Human Face Identification, Image Database Query, Inspecting Products, Examining the Inside of a Human Head (MRI), Optical Character Recognition (OCR), Satellite Image Analysis, Unmanned Autonomous Vehicles (UAV)
4
Human Visual Perception
5
Human eye
6
Human eye is more sensitive to luminance than to chrominance
How do we see an object? Light reflects off an object and enters the eye. Luminance (lightness) is sensed by the rods; chrominance (color) is sensed by the cones. The human eye is more sensitive to luminance than to chrominance.
7
Cones & Rods ( day & night )
8
Machine Visual Perception
9
Digital Image
10
Sampling & Quantization
11
Types of Images: Analog Image, Digital Image. Digital images include Binary Images, Gray-scale Images, and Color (Multispectral) Images.
12
Image Formats: Vector Image vs. Bitmap Image.
RAW: no header.
RLE (Run-Length Encoding).
PGM, PPM, PNM (Portable Gray Map / Pixel Map / Any Map).
GIF (Graphics Interchange Format): no more than 256 colors.
TIF (Tag Image File Format): scanners.
EPS (Encapsulated PostScript): printers.
JPEG (Joint Photographic Experts Group): high compression ratio.
MPEG (Motion Picture Experts Group): video.
13
Comparison of Image Formats
14
Chap 3 : Binary Image Analysis
15
Counting Foreground Objects
16
Example
17
Counting Object Algorithm
18
Counting Background Objects
19
Connected Component Labeling
20
1. Recursive Connected Components Algorithm
21
Example
22
2. Classical Connected Components Algorithm
23
Size Filter: to remove small-size noise.
After connected component labeling, all components whose size is below a threshold T are removed by changing the corresponding pixels to 0.
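A minimal Python sketch of such a size filter, assuming a binary image stored as a list of lists and 4-connectivity (the function name and connectivity choice are illustrative):

```python
from collections import deque

def size_filter(img, T):
    """Label 4-connected components of 1-pixels; zero out components smaller than T."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if out[r][c] == 1 and not seen[r][c]:
                # collect this component with a breadth-first search
                comp, q = [], deque([(r, c)])
                seen[r][c] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and out[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) < T:          # component below T in size: treat as noise
                    for y, x in comp:
                        out[y][x] = 0
    return out
```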
24
Pepper & Salt Noise Reduction
Change a pixel from 0 to 1 if all of its neighboring pixels are 1. Change a pixel from 1 to 0 if all of its neighboring pixels are 0.
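This rule can be sketched in Python as follows (an 8-neighborhood is assumed, and border pixels are left unchanged in this sketch):

```python
def remove_salt_and_pepper(img):
    """Flip isolated pixels: 0 -> 1 if every 8-neighbor is 1, 1 -> 0 if every 8-neighbor is 0."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            neighbors = [img[r + dy][c + dx]
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                         if (dy, dx) != (0, 0)]
            if img[r][c] == 0 and all(v == 1 for v in neighbors):
                out[r][c] = 1   # pepper noise inside a white region
            elif img[r][c] == 1 and all(v == 0 for v in neighbors):
                out[r][c] = 0   # salt noise inside a black region
    return out
```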
25
Expanding & Shrinking
26
Example 1
27
Example 2
28
Morphological Filter
29
Example
30
Example
31
Closing & Opening
32
Opening Example
33
Morphological Filter Example 1
34
Structure Element Example 1
35
Morpho-logical Filter Example 2
36
Structure Element Example 2
37
Conditional Dilation
38
Conditional Dilation Example
39
Area & Centroid
40
Perimeter
41
Circularity
42
Second Moment
43
Orientation
44
Bounding Box
45
Thresholding
46
P-Tile Method
47
Mode Method
48
Mode Algorithm
49
Iterative Method
50
Adaptive Method
51
Adaptive Method Example
52
Variable Thresholding Example
53
Double Thresholding Method
54
Double Thresholding Example
55
Recursive Histogram Clustering
56
Chapter 4: Pattern Recognition
57
Classification Classification is a process that assigns a label to an object according to some representation of the object’s properties. Classifier is a device or algorithm that inputs an object representation and outputs a class label. Reject class is a generic class for objects that cannot be placed in any of the designated known classes.
58
Error & Reject rate Empirical error rate of a classification system is the number of errors made on independent test data divided by the number of classifications attempted. Empirical reject rate of a classification system is the number of rejects made on independent test data divided by the number of classifications attempted.
59
False Alarm & Miss Detection
Two-class problem example: whether a person has a disease or not. False Alarm (False Positive): the system incorrectly says that the person has the disease. Miss Detection (False Negative): the system incorrectly says that the person does not have the disease.
60
Receiver Operating Curve (ROC)
61
Precision & Recall Example: the objective of document retrieval (image retrieval) is to retrieve interesting objects and not too many uninteresting objects according to features supplied in a user’s query. Precision is the number of relevant documents retrieved divided by the total number of documents retrieved. Recall is the number of relevant documents retrieved by the system divided by the total number of relevant documents in the database.
62
Example Suppose an image database contains 200 sunset images.
Suppose an automatic retrieval system retrieves 150 of those 200 relevant images and 100 other images.
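With the numbers above, the two measures come out directly:

```python
relevant_retrieved = 150      # sunset images the system returned
total_retrieved = 150 + 100   # all returned images, relevant or not
total_relevant = 200          # sunset images in the database

precision = relevant_retrieved / total_retrieved   # 150/250
recall = relevant_retrieved / total_relevant       # 150/200
print(precision, recall)   # 0.6 0.75
```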
63
Features used for representation
Area of the character in units of black pixels Height and Width of the bounding box of its pixels Number of holes inside the character Number of strokes forming the character Center (Centroid) of the set of pixels Best axis direction (Orientation) through the pixels as the axis of least inertia Second moments of the pixels about the axis of least inertia and most inertia
64
Example Features
65
Classification using nearest mean
66
Euclidean Distance
67
Example
68
Classification using nearest neighbors
A brute force approach computes the distance from x to all samples in the database and remembers the minimum distance. Then, x is classified into the same category as its nearest neighbor. Advantage: new labeled samples can be added to the database at any time. A better approach is the k-nearest-neighbors rule.
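The brute-force scan can be sketched in a few lines of Python (the `(feature_vector, label)` sample format is an illustrative choice):

```python
def nearest_neighbor_classify(x, samples):
    """Brute force: scan every labeled sample, return the closest one's label.
    `samples` is a list of (feature_vector, label) pairs."""
    best_label, best_d2 = None, float("inf")
    for vec, label in samples:
        d2 = sum((a - b) ** 2 for a, b in zip(x, vec))   # squared Euclidean distance
        if d2 < best_d2:
            best_d2, best_label = d2, label
    return best_label
```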
69
Structural Pattern Recognition
Graph matching algorithm can be used to perform structural pattern recognition.
70
Two characters with the same global features but different structure
Lid : a virtual line segment that closes up a bay. Left : specifies that one lid lies on the left of another. Right : specifies that one lid lies on the right of another.
71
Confusion Matrix. Reject Rate = number of rejects / number of classifications attempted. Error Rate = number of errors / number of classifications attempted.
72
Decision Tree 1
73
Decision Tree 2
74
Automatic Construction of a Decision Tree
Information content I(C;F) of the class variable C with respect to the feature variable F is defined by I(C;F) = H(C) - H(C|F), the reduction in uncertainty about C obtained by knowing F. The feature variable F with maximum I(C;F) will be selected as the first feature to be tested.
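Assuming I(C;F) is computed as the standard mutual information H(C) - H(C|F) over discrete training samples, a Python sketch (function names are illustrative):

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy H of a list of discrete labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_content(classes, feature_values):
    """I(C;F) = H(C) - H(C|F): how much knowing feature F reduces class uncertainty."""
    n = len(classes)
    h_cond = 0.0
    for v in set(feature_values):
        subset = [c for c, f in zip(classes, feature_values) if f == v]
        h_cond += len(subset) / n * entropy(subset)   # weighted conditional entropy
    return entropy(classes) - h_cond
```

A feature that perfectly separates the classes yields the full H(C); a feature independent of the class yields 0.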
75
Example I(C,X) = I(C,Y) = I(C,Z) =
76
General Case: at any branch node of the tree, when the selected feature does not completely separate a set of training samples into the proper classes, the tree construction algorithm is invoked recursively for the subsets of training samples at each of the child nodes.
77
Bayesian Decision Making
78
Bayesian classifier classifies an object into the class to which it most likely belongs based on the observed features. In other words, it makes the classification decision wi that maximizes P(wi|x) = p(x|wi)P(wi)/p(x). Since p(x) is the same for all the classes, comparing p(x|wi)P(wi) is enough. Poisson, Exponential, and Normal (Gaussian) distributions are commonly used for p(x|wi).
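A minimal sketch of this decision rule with Gaussian class-conditional densities (Python; the class models and priors below are made-up numbers for illustration):

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mean, var):
    """1-D normal density p(x | class) with the given mean and variance."""
    return exp(-(x - mean) ** 2 / (2 * var)) / sqrt(2 * pi * var)

def bayes_classify(x, classes):
    """Pick the class w maximizing p(x|w) P(w); p(x) is constant and can be dropped.
    `classes` maps a label to a (prior, mean, variance) triple."""
    return max(classes,
               key=lambda w: classes[w][0] * gaussian_pdf(x, classes[w][1], classes[w][2]))
```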
79
Chapter 5. Filtering Images
80
Image Enhancement & Restoration
Image Enhancement operators improve the detectability of important image details or objects by man or machine. Image Restoration attempts to restore a degraded image to an ideal condition.
81
Histogram Stretching
82
Histogram Equalization
The output image should use all available gray levels. The output image has approximately the same number of pixels at each gray level.
83
Histogram Equalization Example
84
Steps to perform Histogram Equalization
Calculate the cumulative histogram. Normalize by dividing by the total number of pixels. Multiply these values by the maximum gray-level value, then round the result to the closest integer. Map the original values to the results from step 3.
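The four steps above can be sketched as a function mapping each original gray level to its equalized level (Python; the histogram is a list indexed by gray level):

```python
def equalization_map(hist, max_gray):
    """Histogram equalization lookup table: cumulative histogram, normalize, scale, round."""
    total = sum(hist)
    mapping, cum = [], 0
    for count in hist:
        cum += count                                   # step 1: cumulative histogram
        mapping.append(round(cum / total * max_gray))  # steps 2-3: normalize, scale, round
    return mapping                                      # step 4: old level g -> mapping[g]
```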
85
Practice: perform Steps 1-3 of histogram equalization for an image whose gray-level histogram (number of pixels per gray level) is: 10, 10, 1, 8, 2, 9, 3, 4, 14, 5, 6, 7.
86
Removal of small regions
Removal of Salt-and-Pepper Noise Removal of Small Components
87
Convolution process Kernel or Mask
88
Convolution
89
Image Smoothing: Mean Filtering, Gaussian Filtering, Median Filtering.
Goals: smooth uniform regions while preserving edge structure.
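A sketch of mean and median filtering over a 3x3 neighborhood (Python; borders are left unchanged in this sketch, and the integer mean is an illustrative choice). Note how the median suppresses an isolated noise spike that the mean only spreads out:

```python
def filter3x3(img, reduce_fn):
    """Apply a 3x3 neighborhood reduction (mean or median) to interior pixels."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [img[r + dy][c + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[r][c] = reduce_fn(window)
    return out

mean = lambda w: sum(w) // len(w)           # integer mean of the 9 values
median = lambda w: sorted(w)[len(w) // 2]   # middle of the 9 sorted values
```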
90
Mean Filtering Example
91
Gaussian Filtering Masks
92
Properties of smoothing masks
The amount of smoothing and noise reduction is proportional to the mask size. Step edges are blurred in proportion to the mask size.
93
Median Filtering Example
94
Detecting Edges
95
Edge Detection Masks
96
Properties of derivative masks
The sum of the coefficients of a derivative mask is zero, so that a zero response is obtained on constant regions. First derivative masks produce high absolute values at points of high contrast. Second derivative masks produce zero-crossings at points of high contrast.
97
Edge Magnitude & Orientation
98
Edge Detection Example
99
Laplacian Of Gaussian (LOG)
100
Two equivalent methods
1. Convolve the image with a Gaussian smoothing filter and compute the Laplacian of the result. 2. Convolve the image with the linear filter that is the Laplacian of the Gaussian filter.
101
Zero crossing detection
A zero crossing at a pixel implies that the values of the two opposing neighboring pixels in some direction have different signs. There are four cases to test: up/down, left/right, up-left/down-right, up-right/down-left.
102
Gaussian Equations
103
Gaussian Plots
104
Gaussian Properties: the mask is symmetric.
95% of the total weight is contained within 2σ of the center. In the first derivative of the 1D Gaussian, the extreme points are located at -σ and +σ. In the second derivative of the 1D Gaussian, the zero crossings are located at -σ and +σ. The LOG filter responds well to: small blobs coinciding with the center lobe; large step edges very close to the center lobe.
105
LOG Masks
106
LOG Example
107
Chapter 6. Color & Shading
108
Perception of objects
109
Perception of objects depends on: the spectrum (energy) of the light source; the spectral reflectance of the object surface; the spectral sensitivity of the sensor.
110
Human eye is more sensitive to luminance than to chrominance
How do we see an object? Light reflects off an object and enters the eyes. Luminance is perceived as lightness; chrominance is perceived as color. The human eye is more sensitive to luminance than to chrominance.
111
Light Spectrum
112
Chromaticity Diagram
113
RGB Model
114
RGB signals from a video camera
115
RGB Colors specify a mixture of red, green, and blue light, with values between 0.0 (none) and 1.0 (lots).
Color: (Red, Green, Blue)
White: (1.0, 1.0, 1.0)
Black: (0.0, 0.0, 0.0)
Yellow: (1.0, 1.0, 0.0)
Magenta: (1.0, 0.0, 1.0)
Cyan: (0.0, 1.0, 1.0)
116
Normalized RGB r+g+b=1
117
HSI Model
118
Light vs. Pigment
119
CMY Model
120
YIQ Model: TV transmission color spaces.
Digital space: YCbCr. Analog space: YIQ (NTSC), YUV (PAL).
121
YUV & YCBCR Model
122
TV Broadcast
123
Color Histograms are relatively invariant to translation, rotation, and scaling.
Simple methods for color histogram construction: (1) concatenate the two highest-order bits of each of the R, G, B color codes, giving 64 bins; (2) compute three separate R, G, B histograms (4 bits, i.e. 16 bins, each) and concatenate them into one 48-bin histogram.
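The 64-bin scheme can be sketched in Python, assuming 8-bit channel values (the function name is illustrative):

```python
def color_histogram_64(pixels):
    """Concatenate the two highest-order bits of R, G and B into a 6-bit bin index."""
    hist = [0] * 64
    for r, g, b in pixels:                  # 8-bit channel values, 0..255
        idx = (r >> 6) << 4 | (g >> 6) << 2 | (b >> 6)
        hist[idx] += 1
    return hist
```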
124
Similarity measure for histogram matching
It is common to smooth the histogram before matching, to accommodate minor shifts of the reflectance spectrum.
125
Color Segmentation A plot of pixels (r,g) taken from different images containing faces. (r,g) : normalized red and green values
126
Face Detection Face region classification. (R>G>B)
Connected component labeling. Select the largest component as the face object, assuming there is only one face in the image. Discard the remaining components or merge them with the face object. Compute the location of the eyes and nose.
127
Shading
128
Three types of Material Reflection
Diffuse Color of reflected light from diffuse reflection (light scattered randomly) Ambient Amount of background light the surface reflects Specular Color of reflected light from specular reflection (light reflected in a regular manner)
129
Diffuse Reflection
130
Specular Reflection
131
Darkening with Distance
132
Complications The above models of illumination and reflection are simplified. Some objects reflect light as well as emit light. For example: light bulbs. In uncontrolled scenes, such as outdoor scenes, it is much more difficult to account for the different phenomena.
133
Chapter 7 Texture
134
Texture Analysis Structural approach Statistical approach
135
Structural approach
136
Shape from texture Size Shape Density
137
Statistical approach: Edge density and direction; Local binary pattern;
Co-occurrence matrices and features; Laws texture energy measures; Autocorrelation.
138
Edge density and direction
Gradient magnitude : Mag(p) Gradient direction : Dir(p)
139
Histogram based texture description
Magnitude histogram: Hmag(R) Direction histogram: Hdir(R)
140
Example
141
L1 Distance between two histograms
Another similarity measure for histogram matching
142
Local binary pattern
For each pixel p in the image, the eight neighbors are examined to see if their intensity is greater than that of p. The results from the eight neighbors are used to construct an eight-digit binary number b1b2b3b4b5b6b7b8, where bi = 0 if the intensity of the i-th neighbor is less than or equal to that of p, and bi = 1 otherwise. A histogram of these numbers is used to represent the texture of the image.
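Computing the eight-digit code for one pixel can be sketched as follows (Python; the clockwise-from-top-left neighbor order is an illustrative choice):

```python
def lbp_code(img, r, c):
    """Eight-digit binary number b1..b8 for pixel (r, c): a bit is 1 when the
    neighbor is brighter than the center, 0 otherwise (clockwise from top-left)."""
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dy, dx in offsets:
        code = code << 1 | (1 if img[r + dy][c + dx] > center else 0)
    return code
```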
143
Practice
144
Co-occurrence matrices
145
Practice
146
Two variations of the co-occurrence matrix
Normalized co-occurrence matrix Symmetric co-occurrence matrix
147
Features derived from co-occurrence matrix
148
Laws texture energy measures
Remove the effects of illumination by moving a small window around the image and subtracting the local average from each pixel. Nine different 5x5 masks are applied to the preprocessed image, producing 9 images. Smooth the images using an absolute-value mean filter. The output is a single image with a vector of nine texture attributes at each pixel.
149
Laws nine 5x5 masks are built from the 1-D vectors L5 (Level) = [1 4 6 4 1], E5 (Edge) = [-1 -2 0 2 1],
S5 (Spot) = [-1 0 2 0 -1], R5 (Ripple) = [1 -4 6 -4 1]. The nine masks: L5E5/E5L5, L5S5/S5L5, L5R5/R5L5, E5S5/S5E5, E5R5/R5E5, S5R5/R5S5, E5E5, S5S5, R5R5.
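Each 5x5 mask is the outer product of two of the 1-D vectors; for example, using the standard L5 and E5 values (Python sketch):

```python
def laws_mask(v1, v2):
    """5x5 Laws mask as the outer product of two 1-D vectors (e.g. L5 and E5)."""
    return [[a * b for b in v2] for a in v1]

L5 = [1, 4, 6, 4, 1]        # Level
E5 = [-1, -2, 0, 2, 1]      # Edge
```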
150
Autocorrelation To describe the fineness/coarseness of the texture.
151
Texture Segmentation
152
Chapter 8 Content-Based Image Retrieval
153
Image Database Queries
Query By Keyword: Some textual attributes (keywords) should be maintained for each image. The image can be indexed according to these attributes, so that they can be rapidly retrieved when a query is issued. This type of query can be expressed in Structured Query Language (SQL). Query By Example (QBE): The user just shows the system a sample image, and the system should return similar images or images containing similar objects.
154
Image Distance & Similarity Measures
Color Similarity Texture Similarity Shape Similarity Object & Relationship similarity
155
Color Similarity Color percentages matching: R:20%, G:50%, B:30%
Color histogram matching: Dhist(I,Q) = (h(I) - h(Q))^T A (h(I) - h(Q)), where A is a similarity matrix; colors that are very similar should have similarity values close to one.
156
Color Similarity Color layout matching: compares each grid square of the query to the corresponding grid square of a potential matching image and combines the results into a single image distance where CI(g) represents the color in grid square g of a database image I and CQ(g) represents the color in the corresponding grid square g of the query image Q. some suitable representations of color are Mean Mean and standard deviation Multi-bin histogram
157
IQ based on Color Layout
158
Texture Similarity Pick and click
Suppose T(I) is a texture description vector which is a vector of numbers that summarizes the texture in a given image I (for example: Laws texture energy measures), then the texture distance measure is defined by Texture layout
159
IQ based on Pick and Click
160
Shape Similarity Shape Histogram Boundary Matching Sketch Matching
161
1. Shape Histogram Projection matching
Horizontal & vertical projection: each row and each column becomes a bin in the histogram; the count stored in a bin is the number of 1-pixels that appear in that row or column. Diagonal projection: an alternative is to define the bins from the top left to the bottom right of the shape. Size invariant: the number of row bins and the number of column bins in the bounding box can be fixed, and histograms can be normalized before matching. Translation invariant. Rotation invariant: compute the axis of the best-fitting ellipse and rotate the shape.
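Computing the row and column projection histograms of a binary shape can be sketched as:

```python
def projections(img):
    """Horizontal and vertical projection histograms of a binary shape:
    one bin per row and one per column, counting the 1-pixels in each."""
    row_bins = [sum(row) for row in img]
    col_bins = [sum(col) for col in zip(*img)]
    return row_bins, col_bins
```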
162
Horizontal and vertical projections
163
Diagonal projection
164
1. Shape Histogram Orientation histogram
Construct a histogram over the tangent angle at each pixel on the boundary of the shape. Size invariant: histograms can be normalized before matching. Translation invariant. Rotation invariant: choose the bin with the largest count to be the first bin. Starting-point invariant.
165
2. Boundary Matching 1D Fourier Transform on the boundary
166
Fourier Descriptors: If only the first M coefficients (a0, a1, ..., aM-1) are used, then the reconstructed boundary is an approximation of un. The coefficients (a0, a1, ..., aM-1) are called the Fourier Descriptors. The Fourier distance measure is the distance between the descriptor vectors of the two boundaries.
167
Properties of Fourier Descriptors
Simple geometric transformations of a boundary, such as translation, rotation, and scaling, are related to simple operations of the boundary’s Fourier descriptors.
168
A secret formula A formula for Fourier Descriptor that is invariant to translation, scaling, rotation, and starting point.
169
IQ based on Boundary Matching
170
3. Sketch Matching: 1. Affine transformation to a specified size, followed by a median filter. 2. Edge detection using a gradient-based edge-finding algorithm, yielding a refined edge image. 3. Thinning and shrinking, yielding an abstract image. 4. The images are divided into grid squares, and matching is performed based on local correlation.
171
3. Sketch Matching: The sketch distance measure is the inverse of the sum of the local correlations, where I(g) refers to grid square g of the abstract image I, and Q(g) refers to grid square g of the linear sketch resulting from query image Q.
172
Object and Relational Similarity
Face finding: Neural net classifier Flesh finding: then threshold based on
173
Chapter 9. Motion from 2D Image Sequence
174
Cases of motion Still camera, still object Still camera, moving object
Moving camera, still object Moving camera, moving object Single object Multiple objects
175
Applications: many safety and security applications.
Automatically switch on a light upon detection of significant motion. Track the players and ball in a tennis match and provide an analysis of the elements of the game.
176
Uses of motion: Create more observations than available from a single viewpoint. Provide information to compute the relative depth of objects, since the images of close objects change faster than the images of remote objects. Shape from Motion: the multiple viewpoints allow for a triangulating computation similar to binocular stereo.
177
Image Subtraction Subtract the image It from the previous image It-1
Subtract the image I from the background image B
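Both variants reduce to thresholding an absolute per-pixel difference; a Python sketch (the function name and threshold are illustrative):

```python
def motion_mask(curr, prev, threshold):
    """Mark pixels whose absolute intensity change exceeds a threshold.
    `prev` may be the previous frame I(t-1) or a fixed background image B."""
    return [[1 if abs(c - p) > threshold else 0 for c, p in zip(crow, prow)]
            for crow, prow in zip(curr, prev)]
```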
178
Algorithm for Image Subtraction
179
Motion Vectors Motion field Focus of expansion (FOE)
Focus of contraction (FOC) Image flow
180
Decathlete Game
181
Detecting interesting points
182
Search Rectangle
183
Algorithm for Motion Vectors
184
Computing the paths of moving points
Track by motion only, no texture or color information for moving objects Assumptions The location of a physical object changes smoothly over time. The velocity of a physical object changes smoothly over time (including both speed and direction) An object can be at only one location in space at a given time. Two objects cannot occupy the same location at the same time.
185
Trajectories of two objects
186
Smoothness If an object i is observed at time instants t=1,2,…,n, then the sequence of image points Ti=(pi,1, pi,2, …, pi,n) is called the trajectory of i.
187
Greedy-Exchange Algorithm
188
Example of Greedy-Exchange Algorithm
189
Detecting significant changes in videos
Scene change Shot change Camera pan Camera zoom Camera effects: fade, dissolve, and wipe Segment and store video subsequences in digital libraries for random access.
190
Segmenting Video Sequences
The transitions can be used to segment the video and can be detected by large changes in the features of the images over time.
191
Similarity measure by histogram
192
Similarity measure by likelihood ratio
Break the image into larger blocks and test to see if a majority of the blocks are essentially the same in both images.
193
Moving cursor by tracking face
194
Demo
195
Chapter 10. Image Segmentation
196
10.1 Identifying Regions Clustering Region Growing
197
Clustering
198
Iterative K-Means Clustering
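The iterative K-means procedure can be sketched as follows (Python, on 1-D samples for brevity; caller-supplied initial means and a fixed iteration count are illustrative simplifications):

```python
def kmeans(points, means, iterations=10):
    """Iterative K-means: assign each point to its nearest mean, recompute
    each mean as the average of its assigned points, and repeat."""
    for _ in range(iterations):
        clusters = [[] for _ in means]
        for p in points:
            nearest = min(range(len(means)), key=lambda i: abs(p - means[i]))
            clusters[nearest].append(p)
        # empty clusters keep their old mean in this sketch
        means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
    return means
```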
199
Isodata Clustering
200
Region Growing (Split & Merge Algorithm)
Split the image into equally sized regions. Calculate the gray-level variance for each region. If the gray-level variance is larger than a threshold, split the region; otherwise, an effort is made to merge the region with its neighbors. Repeat steps 2 and 3. Gray-level variance: σ² = (1/N) Σ (I(r,c) - μ)², where μ is the mean gray level of the region.
201
Region Growing
202
Recursive Histogram Clustering
203
10.2 Representing Regions Overlays Labeled Images Boundary Coding
Chain Code Polygonal Approximation Quadtrees Property Tables
204
Overlaying
205
Boundary Coding
206
Quadtrees
207
10.3 Identifying Contours Tracking Existing Region Boundaries (labeled image as input) Canny Edge Detector (gray-scale image as input) Aggregating Consistent Neighboring Edges into Curves (binary edge image as input) Hough Transform (gray-scale image as input)
208
Finding the Borders of Labeled Regions
209
Canny Edge Detector
210
Canny Edge Detector Example
211
Tracking Edges of a Binary Edge Image
212
Hough Transform
213
Accumulator array for Hough Transform
214
Hough Transform for Accumulating Straight Lines
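A minimal sketch of the accumulation step (Python; the coarse 10-degree theta step and integer rho rounding are illustrative quantization choices):

```python
from math import cos, sin, radians

def hough_lines(edge_points, rho_max, d_theta=10):
    """Accumulate votes in (rho, theta) space: each edge point (x, y) votes for
    every line x*cos(theta) + y*sin(theta) = rho passing through it."""
    acc = {}
    for x, y in edge_points:
        for t in range(0, 180, d_theta):
            rho = round(x * cos(radians(t)) + y * sin(radians(t)))
            if -rho_max <= rho <= rho_max:
                acc[(rho, t)] = acc.get((rho, t), 0) + 1
    return acc
```

Peaks in the accumulator correspond to straight lines; e.g. three points on the vertical line x = 3 all vote for the cell (rho=3, theta=0).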
215
Hough Transform Example
216
Hough Transform for Extracting Straight Lines
217
Chap 12. Perceiving 3D from 2D Images
218
Geometric model
219
Epipolar geometry (trivial)
220
Epipolar geometry (general)
221
Epipolar lines and Epipolar curves
Figure 10: the epipolar constraint, showing epipoles e1 and e2 (panels a and b).