Published by Stephen Carson. Modified over 6 years ago.
1
CSE 484: Computer Vision Fatoş T. Yarman Vural Atıl İşçen Güzelyurt, Ankara Spring 2009
2
Textbook: L. Shapiro and G. Stockman, Computer Vision
Recommended Books:
R. Szeliski, Computer Vision: Algorithms and Applications, Dec 23, 2008
D. Forsyth, J. Ponce, Computer Vision: A Modern Approach
B. Jähne, H. Haußecker, Computer Vision and Applications
3
Grading: Midterm: 30% Final: 40% Homework: 30% with your partner
4
What is Computer Vision?
Make the computer SEE
SEE: extracting visual information from any sensed data
Goal: Make useful decisions about objects and scenes based on sensed data
5
OBJECT perceptible material thing vision
6
Object According to Plato
Things consisting of forms and matter
Forms are proper subjects of philosophical investigation, for they have the highest degree of reality.
Matter is the ordinary substance.
7
OBJECTS
[Figure: object taxonomy. OBJECTS divide into ANIMALS, PLANTS, and INANIMATE (NATURAL, MAN-MADE, ...). ANIMALS include VERTEBRATE, which includes MAMMALS and BIRDS. Leaf examples: TAPIR, BOAR, GROUSE, CAMERA.]
8
How many object categories are there?
~10,000 to 30,000 Biederman 1987
9
SCENE Consists of multiple objects
Goal : Make useful decisions about objects and scenes based on sensed data
10
Bruegel, 1564
11
Sensed Data: Images
All sorts of sensor data carrying visual information: optic, thermal, IR, MR, SAR, ...
Goal: Make useful decisions about objects and scenes based on sensed data
12
IMAGES: Satellite, CT, SAR, Thermal, scientific
13
Useful Decisions
Recognize, classify, detect, localize, retrieve, annotate, verify
Goal: Make useful decisions about objects and scenes based on sensed data
14
So what does recognition involve?
15
Verification: is that a lamp?
16
Detection: are there people?
17
Identification: is that Potala Palace?
18
Object categorization
mountain tree building banner street lamp vendor people
19
Scene and context categorization
outdoor city …
20
Application Domains of Computer Vision
21
Traffic
Assisted driving
Pedestrian and car detection
Lane detection
Collision warning systems with adaptive cruise control
Lane departure warning systems
Rear object detection systems
22
Retrieval: Improving online search
Query: STREET Digital Album
23
Similarity Retrieval of Brain Data
24
Image Databases: Content-Based Retrieval
Images from my Ground-Truth collection. What categories of image databases exist today?
25
Abstract Regions for Object Recognition
Original Images Color Regions Texture Regions Line Clusters Caption!
26
Insect Identification for Ecology Studies
Doroneuria (Dor) Calineuria (Cal) Yoraperla (Yor)
27
Document Analysis
28
Surveillance: Object and Event Recognition in Aerial Videos
Original Video Frame Color Regions Structure Regions
29
Video Analysis What are the objects? What are the events?
30
3D Reconstruction of the Blood Vessel Tree
31
Recognition of 3D Object Classes from Range Data
32
3D Scanning Scanning Michelangelo’s “The David”
The Digital Michelangelo Project - UW Prof. Brian Curless, collaborator 2 BILLION polygons, accuracy to .29mm
33
The Digital Michelangelo Project, Levoy et al.
37
Tasks in Computer Vision
Segment an image into useful regions
Perform measurements on certain areas
Determine what object(s) are in the scene
Calculate the precise location(s) of objects
Visually inspect a manufactured object
Construct a 3D model of the imaged object
Find “interesting” events in a video
[Figure labels: liver, kidney, spleen]
38
HISTORY OF COMPUTER VISION
39
1970
40
1980s
41
1990
42
2000s
43
Why is it Difficult? What are the Challenges
44
Challenges 1: view point variation
Michelangelo
45
Challenges 2: illumination
slide credit: S. Ullman
46
Challenges 3: occlusion
Magritte, 1957
47
Challenges 4: scale
48
Challenges 5: deformation
Xu, Beihong 1943
49
Challenges 6: background clutter
Klimt, 1913
50
Challenges 7: intra-class variation
51
The Three Stages of Computer Vision
low-level: image → image
mid-level: image → features
high-level: features → analysis
52
Low-Level sharpening blurring
53
Low-Level and Mid-Level
Low-level: Canny, original image → edge image
Mid-level: ORT, edge image → data structure of circular arcs and line segments
54
Mid-level
K-means clustering (followed by connected component analysis)
original color image → regions of homogeneous color → data structure
55
Low- to High-Level
low-level: edge image
mid-level: consistent line clusters
high-level: Building Recognition
56
Recognition Scale / orientation range to search over Speed Context
57
Course content
Image representation: matrices, functions, image file formats
Binary image analysis: pixel and neighborhood, masks and convolution, counting and labeling, morphological operations
58
Thresholding
Object recognition concepts: representation, classification, measures
Gray-level image analysis: gray-level mapping, noise removal, smoothing
59
Color and shading: color spaces, shades
Texture: texels, texture description, texture measure
Segmentation: clustering, region growing
Content-based image retrieval
60
Imaging and Image Representation Ch:2 Shapiro et al.
61
Classical Imaging Process
Light reaches surfaces in 3D Surfaces reflect Sensor element receives light energy Intensity counts Angles count Material counts What are radiance and irradiance?
62
Radiometry and Computer Vision*
Radiometry is a branch of physics that deals with the measurement of the flow and transfer of radiant energy.
Radiance is the power of light that is emitted from a unit surface area into some spatial angle; the corresponding photometric term is brightness.
Irradiance is the amount of energy that an image-capturing device gets per unit of an effective sensitive area of the camera. Quantizing it gives image gray tones.
From Sonka, Hlavac, and Boyle, Image Processing, Analysis, and Machine Vision, ITP, 1999.
63
Sensors: Image acquisition Devices
CCD (Charge-Coupled Device)
X-Ray Devices
Microwave Devices
UV Devices
Thermal Cameras
IR Devices
3-D scanners
64
CCD type camera: Commonly used in industrial applications
Array of small fixed elements Each element converts the light energy to electric charge 1x1 cm Can add refracting elements to get color in 2x2 neighborhoods 8-bit intensity common
65
Computer Vision Algorithms Main concern of CV is to develop Algorithms
66
LIDAR also senses surfaces
Single sensing element scans scene Laser light reflected off surface and returned Phase shift codes distance Brightness change codes albedo (surface reflectance) Stockman MSU/CSE Fall 2008
67
2.5D face image from Minolta Vivid 910 scanner
A rotating mirror scans a laser stripe across the object. 320x240 rangels obtained in about 2 seconds. [x,y,z,R,G,B] image. Stockman MSU/CSE Fall 2008
68
3D scanning technology 3D image of voxels obtained
Usually computationally expensive reconstruction of 3D from many 2D scans (CAT computer-aided-tomography) Stockman MSU/CSE Fall 2008
69
Magnetic Resonance Imaging
Sense density of certain chemistry S slices x R rows x C columns Volume element (voxel) about 2mm per side At left is shaded 2D image created by “volume rendering” a 3D volume: darkness codes depth Stockman MSU/CSE Fall 2008
70
Single slice through human head
MRIs are computed structures, computed from many views. At left is MRA (angiograph), which shows blood flow. CAT scans are computed in much the same manner from X-ray transmission data. Stockman MSU/CSE Fall 2008
71
Problems in Image Acquisition
73
Human eye as a spherical camera
... million rods sense intensity
6-7 million cones sense color
Fovea has tightly packed area, more cones
Periphery has more rods
Focal length is about 20mm
Pupil/iris controls light entry
Eye scans, or saccades, to image details on fovea
100M sensing cells funnel to 1M optic nerve connections to the brain
Stockman MSU/CSE Fall 2008
74
RODS AND CONES
75
Cones
76
Image Formation
77
Problems in HVS Mach Band Effect
78
Contrast
79
Illusions
81
Images: 2D projections of 3D
The 3D world has color, texture, surfaces, volumes, light sources, temperature, reflectance, … A 2D image is a projection of a scene from a specific viewpoint.
82
Digital Images form arrays
83
Digitizing: Sampling
84
Quantization
85
Digital Image: Sampled and quantized
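The two digitizing steps can be illustrated with a small sketch. This is only an illustration under stated assumptions: the image is a list of rows of 8-bit gray values, and the helper names `sample` and `quantize` are hypothetical, not taken from the slides.

```python
def sample(image, step):
    """Spatial sampling: keep every `step`-th pixel in each direction."""
    return [row[::step] for row in image[::step]]


def quantize(image, levels, max_val=255):
    """Quantization: map gray values in 0..max_val onto `levels` tones (0..levels-1)."""
    bin_width = (max_val + 1) / levels
    return [[int(v // bin_width) for v in row] for row in image]


image = [[0, 64, 128, 255],
         [32, 96, 160, 224],
         [16, 80, 144, 208],
         [48, 112, 176, 240]]

coarse = sample(image, 2)       # half the resolution in each direction
four_tone = quantize(image, 4)  # only 4 gray tones remain
```

A coarser sampling step loses spatial detail, while fewer quantization levels lose tonal detail; the two choices are independent.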
86
Sampling at different resolution
87
Sampling
88
Quantization
89
What are the appropriate sampling and quantization rates?
90
Resolution
resolution: precision of the sensor
nominal resolution: size of a single pixel in scene coordinates (e.g. meters, mm)
common use of resolution: num_rows X num_cols (e.g. 515 x 480)
field of view (FOV): size of the scene a sensor can sense
92
Images as Functions
A gray-tone image is a function: g(x,y) = val or f(row, col) = val
A color image is just three functions or a vector-valued function: f(row,col) = (r(row,col), g(row,col), b(row,col))
Multi-spectral image: f(row,col) = (f1(row,col), f2(row,col), …, fn(row,col))
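As a rough sketch of the "image as function" view, an image stored as a nested list can be wrapped in a function of (row, col). The pixel values and the names `gray`, `f`, and `f_color` are made up for illustration.

```python
# A tiny 2x3 gray-tone image: one value per (row, col).
gray = [[10, 20, 30],
        [40, 50, 60]]


def f(row, col):
    """Gray-tone image as a function f(row, col) = val."""
    return gray[row][col]


# A tiny 2x2 color image stored as three separate gray-tone planes.
red = [[255, 0], [0, 0]]
green = [[0, 255], [0, 0]]
blue = [[0, 0], [255, 0]]


def f_color(row, col):
    """Color image as a vector-valued function returning (r, g, b)."""
    return (red[row][col], green[row][col], blue[row][col])
```

A multi-spectral image follows the same pattern with n planes instead of three.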
93
Gray-tone Image as Function
94
Image vs Matrix There are many different file formats.
95
Digital Image Terminology:
pixel (with value 94) its 3x3 neighborhood region of medium intensity resolution (7x7) binary image gray-scale (or gray-tone) image color image multi-spectral image range image labeled image
96
Image File Formats
Portable Gray Map (PGM): older form
GIF was an early commercial version
JPEG (JPG) is the modern version
MPEG for motion
Many others exist: header plus data
Do they handle color?
Do they provide for compression?
Are there good packages that use them or at least convert between them?
97
Compression: Reduce the redundancy
Lossy Lossless
98
Run Coding
Code 1: 3(0) 1(1) 2(0) 1(1) 6(0)
or
Code 2: (4,4) (7,7)
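Both run codes on this slide can be reproduced with a short sketch. The 13-pixel row below is reconstructed from Code 1 and is therefore an assumption, as are the function names and the reading of Code 2 as 1-based (start, end) columns of each run of 1s.

```python
def run_lengths(row):
    """Code 1: (run_length, value) pairs that cover the whole row."""
    runs = []
    for v in row:
        if runs and runs[-1][1] == v:
            runs[-1] = (runs[-1][0] + 1, v)  # extend the current run
        else:
            runs.append((1, v))              # start a new run
    return runs


def one_runs(row):
    """Code 2: (start, end) columns of each run of 1s, 1-based and inclusive."""
    spans = []
    start = None
    for col, v in enumerate(row, start=1):
        if v == 1 and start is None:
            start = col
        elif v != 1 and start is not None:
            spans.append((start, col - 1))
            start = None
    if start is not None:
        spans.append((start, len(row)))
    return spans


row = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(run_lengths(row))  # [(3, 0), (1, 1), (2, 0), (1, 1), (6, 0)]
print(one_runs(row))     # [(4, 4), (7, 7)]
```

Code 2 stores only the foreground runs, so it is more compact when the 1s are sparse.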
99
PGM image with ASCII info.
P2 means ASCII gray
Comments
W=16; H=8
192 is max intensity
Can be made with editor
Large images are usually not stored as ASCII
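The ASCII PGM layout (magic number P2, optional comments, width and height, maximum intensity, then the pixel values) is simple enough to read and write by hand. The following is a minimal sketch, not a full implementation of the format; the function names and the tiny 3x2 image are assumptions made for illustration.

```python
def write_pgm_ascii(pixels, max_val, comment=None):
    """Return the text of an ASCII (P2) PGM file for a list of pixel rows."""
    height, width = len(pixels), len(pixels[0])
    lines = ["P2"]
    if comment:
        lines.append("# " + comment)
    lines.append(f"{width} {height}")
    lines.append(str(max_val))
    for row in pixels:
        lines.append(" ".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


def read_pgm_ascii(text):
    """Parse ASCII (P2) PGM text into (pixel rows, max intensity)."""
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.split("#", 1)[0].split())  # drop comments
    if not tokens or tokens[0] != "P2":
        raise ValueError("not an ASCII PGM (P2) file")
    width, height, max_val = int(tokens[1]), int(tokens[2]), int(tokens[3])
    values = [int(t) for t in tokens[4:4 + width * height]]
    pixels = [values[r * width:(r + 1) * width] for r in range(height)]
    return pixels, max_val


pixels = [[0, 96, 192],
          [192, 96, 0]]
text = write_pgm_ascii(pixels, 192, comment="tiny example")
```

Because every value is spelled out as decimal text, a P2 file is easy to create in an editor but much larger than the byte-coded P5 form.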
100
PBM/PGM/PPM Codes
P1: ASCII binary (PBM)
P2: ASCII grayscale (PGM)
P3: ASCII color (PPM)
P4: byte binary (PBM)
P5: byte grayscale (PGM)
P6: byte color (PPM)
101
JPG current popular form
Public standard Allows for image compression; often 10:1 or :1 are easily possible 8x8 intensity regions are fit with basis of cosines Error in cosine fit coded as well Parameters then compressed with Huffman coding Common for most digital cameras
103
From 3D Scenes to 2D Images
Object World Camera Real Image Pixel Image
104
Binary Image Analysis
105
Binary image analysis consists of a set of image analysis operations that are used to produce or process binary images, usually images of 0’s and 1’s. 0 represents the background 1 represents the foreground
106
Binary Image Analysis is used in a number of practical applications, e.g. part inspection riveting fish counting document processing
107
What kinds of operations?
Separate objects from background and from one another Aggregate pixels for each object Compute features for each object
108
Example: red blood cell image
Many blood cells are separate objects Many touch – bad! Salt and pepper noise from thresholding How useable is this data?
109
Results of analysis 63 separate objects detected
Single cells have area about 50 Noise spots Gobs of cells
110
Useful Operations
1. Thresholding a gray-tone image
2. Determining good thresholds
3. Connected components analysis
4. Binary mathematical morphology
5. All sorts of feature extractors (area, centroid, circularity, …)
111
1. Thresholding Convert gray level or color image into binary image
Use histogram
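A minimal sketch of the idea: count how often each gray tone occurs, then map every pixel to 0 or 1 by comparing it with a threshold. Treating values at or above the threshold as foreground is an assumption here (the opposite polarity is equally valid), and the function names are hypothetical.

```python
def histogram(image, levels=256):
    """Count the number of pixels at each gray-tone value."""
    counts = [0] * levels
    for row in image:
        for v in row:
            counts[v] += 1
    return counts


def threshold(image, t):
    """Binary image: 1 where the gray value is >= t (assumed foreground), else 0."""
    return [[1 if v >= t else 0 for v in row] for row in image]


image = [[10, 200],
         [220, 30]]
hist = histogram(image)
binary = threshold(image, 128)  # [[0, 1], [1, 0]]
```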
112
Histogram
Background is black
Healthy cherry is bright
Bruise is medium dark
Histogram shows two cherry regions (black background has been removed)
[Figure: histogram of pixel counts over 256 gray-tone values]
113
Histogram-Directed Thresholding
How can we use a histogram to separate an image into 2 (or several) different regions? Is there a single clear threshold? 2? 3?
114
Automatic Thresholding: Otsu’s Method
Assumption: the histogram is bimodal.
Method: find the threshold t that minimizes the weighted sum of within-group variances for the two groups (Grp 1 and Grp 2) that result from separating the gray tones at value t.
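The criterion described above can be sketched as a brute-force search over all candidate thresholds. This is an illustrative version, not an optimized one: it assumes group 1 holds the tones less than or equal to t and group 2 the tones above t, and it recomputes both variances from scratch for every t.

```python
def otsu_threshold(hist):
    """Return the t minimizing the weighted sum of within-group variances.

    hist[i] is the pixel count for gray tone i.
    Group 1 = tones <= t, group 2 = tones > t (an assumed convention).
    """
    total = sum(hist)
    best_t, best_score = None, None
    for t in range(len(hist) - 1):
        g1 = list(enumerate(hist[:t + 1]))
        g2 = [(i + t + 1, c) for i, c in enumerate(hist[t + 1:])]
        n1 = sum(c for _, c in g1)
        n2 = sum(c for _, c in g2)
        if n1 == 0 or n2 == 0:
            continue  # one group is empty, so t does not split the tones
        mean1 = sum(i * c for i, c in g1) / n1
        mean2 = sum(i * c for i, c in g2) / n2
        var1 = sum(c * (i - mean1) ** 2 for i, c in g1) / n1
        var2 = sum(c * (i - mean2) ** 2 for i, c in g2) / n2
        score = (n1 / total) * var1 + (n2 / total) * var2
        if best_score is None or score < best_score:
            best_t, best_score = t, score
    return best_t


# A clearly bimodal 8-tone histogram: dark tones 0-1, bright tones 6-7.
bimodal = [5, 5, 0, 0, 0, 0, 5, 5]
t = otsu_threshold(bimodal)
```

For this histogram any threshold in the empty valley between the two modes gives the same minimal score, so the returned t falls somewhere between tones 1 and 5.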
115
Thresholding Example original gray tone image binary thresholded image
116
2. Connected Components Labeling
Once you have a binary image, you can identify and then analyze each connected set of pixels.
The connected components operation takes in a binary image and produces a labeled image in which each pixel has the integer label of either the background (0) or a component.
[Figure: binary image after morphology and its connected components]
117
Methods for CC Analysis
Recursive Tracking (almost never used) Parallel Growing (needs parallel hardware) Row-by-Row (most common) Classical Algorithm (see text) Efficient Run-Length Algorithm (developed for speed in real industrial applications)
118
Equivalent Labels Original Binary Image
119
Equivalent Labels The Labeling Process
1 2 1 3
120
Run-Length Data Structure
[Figure: a binary image with its Row Index table (Rstart, Rend for each row) and its Runs table (row, scol, ecol, label); entry 0 of the Runs table is unused]
121
Run-Length Algorithm
Procedure run_length_classical
{
  initialize Run-Length and Union-Find data structures
  count <- 0
  /* Pass 1 (by rows) */
  for each current row and its previous row
    move pointer P along the runs of current row
    move pointer Q along the runs of previous row
122
Case 1: No Overlap
[Figure: runs of the previous row (pointer Q) and of the current row (pointer P) that do not overlap]
  /* new label */
  count <- count + 1
  label(P) <- count
  P <- P + 1
  /* check Q’s next run */
  Q <- Q + 1
123
Case 2: Overlap
[Figure: runs of the previous row (pointer Q) overlapping a run of the current row (pointer P)]
Subcase 1: P’s run has no label yet
  label(P) <- label(Q)
  move pointer(s)
Subcase 2: P’s run has a label that is different from Q’s run
  union(label(P),label(Q))
  move pointer(s)
}
124
Pass 2 (by runs)
/* Relabel each run with the name of the equivalence class of its label */
For each run M
{
  label(M) <- find(label(M))
}
where union and find refer to the operations of the Union-Find data structure, which keeps track of sets of equivalent labels.
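The two passes can be sketched in Python as follows. This is a simplified illustration rather than the industrial algorithm referred to on the slides: it assumes runs are connected when they overlap in at least one column (4-connectivity), stores each run as a `[start_col, end_col, label]` list, and all function names are hypothetical. Final labels are the smallest label of each equivalence class, so they are not necessarily consecutive.

```python
def find(parent, x):
    """Return the representative of x's equivalence class."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the equivalence classes of labels a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def runs_of_row(row):
    """[start_col, end_col, label] for each run of 1s; label 0 means unlabeled."""
    runs, start = [], None
    for c, v in enumerate(row):
        if v and start is None:
            start = c
        elif not v and start is not None:
            runs.append([start, c - 1, 0])
            start = None
    if start is not None:
        runs.append([start, len(row) - 1, 0])
    return runs


def label_runs(image):
    rows = [runs_of_row(r) for r in image]
    parent = [0]  # parent[label]; index 0 stands for the background
    prev = []
    for cur in rows:                      # Pass 1 (by rows)
        p = q = 0
        while p < len(cur):
            if q < len(prev) and prev[q][1] < cur[p][0]:
                q += 1                    # Q's run ends before P's run starts
            elif q < len(prev) and prev[q][0] <= cur[p][1]:
                if cur[p][2] == 0:        # overlap, P has no label yet
                    cur[p][2] = prev[q][2]
                else:                     # overlap, P already labeled
                    union(parent, cur[p][2], prev[q][2])
                if prev[q][1] <= cur[p][1]:
                    q += 1                # advance the run that ends first
                else:
                    p += 1
            else:                         # no (further) overlap for P
                if cur[p][2] == 0:
                    parent.append(len(parent))
                    cur[p][2] = len(parent) - 1   # new label
                p += 1
        prev = cur
    for cur in rows:                      # Pass 2 (by runs)
        for run in cur:
            run[2] = find(parent, run[2])
    return rows


def labeled_image(image):
    """Expand the labeled runs back into a labeled image."""
    out = [[0] * len(r) for r in image]
    for r, runs in enumerate(label_runs(image)):
        for start, end, lab in runs:
            for c in range(start, end + 1):
                out[r][c] = lab
    return out
```

On a U-shaped region the two arms first receive different labels, which are merged when the bottom row is reached; Pass 2 then rewrites both with the class representative.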
125
Labeling shown as Pseudo-Color
connected components of 1’s from thresholded image connected components of cluster labels
126
Mathematical Morphology
Binary mathematical morphology consists of two basic operations dilation and erosion and several composite relations closing and opening conditional dilation . . .
127
Dilation Dilation expands the connected sets of 1s of a binary image.
It can be used for 1. growing features 2. filling holes and gaps
128
Erosion Erosion shrinks the connected sets of 1s of a binary image.
It can be used for 1. shrinking features 2. Removing bridges, branches and small protrusions
129
Structuring Elements
A structuring element is a shape mask used in the basic morphological operations.
Structuring elements can be any shape and size that is digitally representable, and each has an origin.
Examples: box, disk, hexagon, something; box(length,width), disk(diameter)
130
Dilation with Structuring Elements
The arguments to dilation and erosion are
a binary image B
a structuring element S
dilate(B,S) takes binary image B, places the origin of structuring element S over each 1-pixel, and ORs the structuring element S into the output image at the corresponding position.
[Figure: binary image B, structuring element S (1 1 1) with its origin marked, and the dilated result]
131
Erosion with Structuring Elements
erode(B,S) takes a binary image B, places the origin of structuring element S over every pixel position, and ORs a binary 1 into that position of the output image only if every position of S (with a 1) covers a 1 in B.
[Figure: binary image B, structuring element S with its origin marked, and the eroded result]
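The definitions of dilate(B,S) and erode(B,S) translate almost directly into code. In this sketch the structuring element is represented as a list of (row, col) offsets of its 1-positions relative to the origin, and positions outside the image are treated as 0; both choices are assumptions made for illustration.

```python
def dilate(B, S):
    """Place S's origin over each 1-pixel of B and OR S into the output."""
    rows, cols = len(B), len(B[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if B[r][c] == 1:
                for dr, dc in S:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1
    return out


def erode(B, S):
    """Output 1 at a position only if every 1 of S covers a 1 in B."""
    rows, cols = len(B), len(B[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            fits = all(
                0 <= r + dr < rows and 0 <= c + dc < cols
                and B[r + dr][c + dc] == 1
                for dr, dc in S
            )
            if fits:
                out[r][c] = 1
    return out


# A 1x3 horizontal structuring element with its origin at the center.
S = [(0, -1), (0, 0), (0, 1)]
grown = dilate([[0, 0, 1, 0, 0]], S)   # [[0, 1, 1, 1, 0]]
shrunk = erode(grown, S)               # [[0, 0, 1, 0, 0]]
```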
132
Example to Try
[Figure: binary image B and structuring element S (1 1 1)]
erode
dilate with same structuring element
133
Opening and Closing
Closing is the compound operation of dilation followed by erosion (with the same structuring element).
Opening is the compound operation of erosion followed by dilation (with the same structuring element).
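The two compound operations can be sketched by composing dilation and erosion. As assumptions for this illustration, the structuring element is a list of (row, col) offsets from its origin, pixels outside the image count as 0, and the one-row test images are made up.

```python
def dilate(B, S):
    """OR the structuring element S into the output at every 1-pixel of B."""
    rows, cols = len(B), len(B[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if B[r][c]:
                for dr, dc in S:
                    if 0 <= r + dr < rows and 0 <= c + dc < cols:
                        out[r + dr][c + dc] = 1
    return out


def erode(B, S):
    """Keep a 1 only where every offset of S lands on a 1 of B."""
    rows, cols = len(B), len(B[0])

    def covered(r, c):
        return all(
            0 <= r + dr < rows and 0 <= c + dc < cols and B[r + dr][c + dc]
            for dr, dc in S
        )

    return [[1 if covered(r, c) else 0 for c in range(cols)]
            for r in range(rows)]


def closing(B, S):
    """Dilation followed by erosion with the same structuring element."""
    return erode(dilate(B, S), S)


def opening(B, S):
    """Erosion followed by dilation with the same structuring element."""
    return dilate(erode(B, S), S)


S = [(0, -1), (0, 0), (0, 1)]  # 1x3 horizontal element, origin at center
closed = closing([[0, 1, 1, 1, 0, 1, 1, 1, 0]], S)  # the one-pixel gap is filled
opened = opening([[0, 1, 0, 0, 1, 1, 1, 0]], S)     # the isolated pixel is removed
```

Closing fills gaps narrower than the structuring element, while opening removes features too small to contain it.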
134
Use of Opening Original Opening Corners
What kind of structuring element was used in the opening? How did we get the corners?
135
Gear Tooth Inspection original binary image How did they do it?
detected defects
136
Some Details
137
Region Properties Properties of the regions can be used to recognize objects. geometric properties (Ch 3) gray-tone properties color properties texture properties shape properties (a few in Ch 3) motion properties relationship properties (1 in Ch 3)
138
Geometric and Shape Properties
area
centroid
perimeter
perimeter length
circularity
elongation
mean and standard deviation of radial distance
bounding box
extremal axis length from bounding box
second order moments (row, column, mixed)
lengths and orientations of axes of best-fit ellipse
Which are statistical? Which are structural?
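A few of these properties can be computed directly from a labeled image. The sketch below covers area, centroid, bounding box, and the three second-order moments, with the moments normalized by the area (an assumed convention); the function name and the returned dictionary keys are hypothetical.

```python
def region_properties(labeled, label):
    """Compute simple geometric properties of the region with the given label."""
    pts = [(r, c)
           for r, row in enumerate(labeled)
           for c, v in enumerate(row) if v == label]
    area = len(pts)
    if area == 0:
        raise ValueError("label not present in image")
    r_bar = sum(r for r, _ in pts) / area
    c_bar = sum(c for _, c in pts) / area
    return {
        "area": area,
        "centroid": (r_bar, c_bar),
        # (min_row, min_col, max_row, max_col)
        "bounding_box": (min(r for r, _ in pts), min(c for _, c in pts),
                         max(r for r, _ in pts), max(c for _, c in pts)),
        # second-order moments: row, column, mixed
        "mu_rr": sum((r - r_bar) ** 2 for r, _ in pts) / area,
        "mu_cc": sum((c - c_bar) ** 2 for _, c in pts) / area,
        "mu_rc": sum((r - r_bar) * (c - c_bar) for r, c in pts) / area,
    }


labeled = [[0, 1, 1],
           [0, 1, 1],
           [2, 0, 0]]
props = region_properties(labeled, 1)
```

For the 2x2 square labeled 1, the mixed moment is zero because the region is symmetric about its centroid.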
139
Region Adjacency Graph
A region adjacency graph (RAG) is a graph in which each node represents a region of the image and an edge connects two nodes if the regions are adjacent.
[Figure: an image with regions 1, 2, 3, 4 and its region adjacency graph]
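A RAG can be built from a labeled image by scanning neighboring pixel pairs. This sketch assumes that two regions are adjacent when some pair of their pixels are horizontal or vertical neighbors (4-adjacency), and it treats every label, including 0, as a region; the function name and the small labeled image are made up for illustration.

```python
def region_adjacency_graph(labeled):
    """Map each region label to the set of labels of its adjacent regions."""
    rows, cols = len(labeled), len(labeled[0])
    graph = {}
    for r in range(rows):
        for c in range(cols):
            a = labeled[r][c]
            graph.setdefault(a, set())
            # Looking only right and down visits every neighboring pair once.
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < rows and cc < cols:
                    b = labeled[rr][cc]
                    if b != a:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph


labeled = [[1, 1, 2],
           [3, 3, 2],
           [3, 4, 4]]
rag = region_adjacency_graph(labeled)
```

Here regions 1 and 4 never touch, so the graph has no edge between them, while every other pair of regions is connected.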