CSE 484: Computer Vision Fatoş T. Yarman Vural Atıl İşçen


1 CSE 484: Computer Vision Fatoş T. Yarman Vural Atıl İşçen Güzelyurt, Ankara Spring 2009

2 Textbook: L. Shapiro and G. Stockman, Computer Vision. Recommended Books: R. Szeliski, Computer Vision: Algorithms and Applications, Dec 23, 2008; D. Forsyth, J. Ponce, Computer Vision: A Modern Approach; B. Jähne, H. Haußecker, Computer Vision and Applications

3 Grading: Midterm: 30% Final: 40% Homework: 30% with your partner

4 What is Computer Vision?
Make the computer SEE. SEE: extracting visual information from any sensed data. Goal: make useful decisions about objects and scenes based on sensed data

5 OBJECT perceptible material thing vision

6 Object According to Plato
Things consisting of forms and matter. Forms are proper subjects of philosophical investigation, for they have the highest degree of reality. Matter is the ordinary substance

7 OBJECTS
[Figure: object taxonomy. OBJECTS: ANIMALS, PLANTS, INANIMATE, ….. ANIMALS: VERTEBRATE, ….. VERTEBRATE: MAMMALS (TAPIR, BOAR), BIRDS (GROUSE). INANIMATE: NATURAL, MAN-MADE (CAMERA).]

8 How many object categories are there?
~10,000 to 30,000 Biederman 1987

9 SCENE Consists of multiple objects
Goal : Make useful decisions about objects and scenes based on sensed data

10 Bruegel, 1564

11 Sensed Data: Images
All sorts of sensor data carrying visual info: Optic, Thermal, IR, MR, SAR, …. Goal : Make useful decisions about objects and scenes based on sensed data

12 IMAGES: Satellite, CT, SAR, Thermal, scientific

13 Useful Decisions: Recognize, classify, detect, localize, retrieve, annotate, verify. Goal : Make useful decisions about objects and scenes based on sensed data

14 So what does recognition involve?

15 Verification: is that a lamp?

16 Detection: are there people?

17 Identification: is that Potala Palace?

18 Object categorization
mountain tree building banner street lamp vendor people

19 Scene and context categorization
outdoor city

20 Application Domains of Computer Vision

21 Traffic
Assisted driving: pedestrian and car detection, lane detection. Collision warning systems with adaptive cruise control, lane departure warning systems, rear object detection systems.

22 Retrieval: Improving online search
Query: STREET Digital Album

23 Similarity Retrieval of Brain Data

24 Image Databases: Content-Based Retrieval
Images from my Ground-Truth collection. What categories of image databases exist today?

25 Abstract Regions for Object Recognition
Original Images Color Regions Texture Regions Line Clusters Caption!

26 Insect Identification for Ecology Studies
Doroneuria (Dor) Calineuria (Cal) Yoraperla (Yor)

27 Document Analysis

28 Surveillance: Object and Event Recognition in Aerial Videos
Original Video Frame Color Regions Structure Regions

29 Video Analysis What are the objects? What are the events?

30 3D Reconstruction of the Blood Vessel Tree

31 Recognition of 3D Object Classes from Range Data

32 3D Scanning Scanning Michelangelo’s “The David”
The Digital Michelangelo Project - UW Prof. Brian Curless, collaborator 2 BILLION polygons, accuracy to .29mm

33 The Digital Michelangelo Project, Levoy et al.

34

35

36

37 Tasks in Computer Vision
Segment an image into useful regions. Perform measurements on certain areas. Determine what object(s) are in the scene. Calculate the precise location(s) of objects. Visually inspect a manufactured object. Construct a 3D model of the imaged object. Find “interesting” events in a video. [Figure labels: liver, kidney, spleen.]

38 HISTORY OF COMPUTER VISION

39 1970

40 1980s

41 1990

42 2000s

43 Why is it Difficult? What are the Challenges

44 Challenges 1: view point variation
Michelangelo

45 Challenges 2: illumination
slide credit: S. Ullman

46 Challenges 3: occlusion
Magritte, 1957

47 Challenges 4: scale

48 Challenges 5: deformation
Xu, Beihong 1943

49 Challenges 6: background clutter
Klimt, 1913

50 Challenges 7: intra-class variation

51 The Three Stages of Computer Vision
low-level: image -> image; mid-level: image -> features; high-level: features -> analysis

52 Low-Level sharpening blurring

53 Low-Level and Mid-Level
Low-level: Canny turns the original image into an edge image. Mid-level: ORT turns the edge image into a data structure of circular arcs and line segments.

54 Mid-level
K-means clustering (followed by connected component analysis) turns the original color image into regions of homogeneous color, stored in a data structure.

55 Low- to High-Level
low-level: edge image; mid-level: consistent line clusters; high-level: Building Recognition

56 Recognition Scale / orientation range to search over Speed Context

57 Course content Image representation Matrices, functions
Image file formats Binary Image Analysis Pixel and neighborhood Masks and convolution Counting and labeling Morphological operations

58 Thresholding Object Recognition concepts Representation Classification Measures Gray-level Image Analysis Gray level mapping Noise removal, Smoothing

59 Color and shading Color spaces Shades Texture Texels, texture description Texture measure Segmentation Clustering Region Growing Content Based Image retrieval

60 Imaging and Image Representation Ch:2 Shapiro et al.

61 Classical Imaging Process
Light reaches surfaces in 3D Surfaces reflect Sensor element receives light energy Intensity counts Angles count Material counts What are radiance and irradiance?

62 Radiometry and Computer Vision*
Radiometry is a branch of physics that deals with the measurement of the flow and transfer of radiant energy. Radiance is the power of light that is emitted from a unit surface area into some spatial angle; the corresponding photometric term is brightness. Irradiance is the amount of energy that an image-capturing device gets per unit of an efficient sensitive area of the camera. Quantizing it gives image gray tones. From Sonka, Hlavac, and Boyle, Image Processing, Analysis, and Machine Vision, ITP, 1999.

63 Sensors: Image acquisition Devices
CCD (Charge-Coupled Device), X-Ray Devices, Microwave Devices, UV Devices, Thermal Cameras, IR Devices, 3-D scanners

64 CCD type camera: Commonly used in industrial applications
Array of small fixed elements Each element converts the light energy to electric charge 1x1 cm Can add refracting elements to get color in 2x2 neighborhoods 8-bit intensity common

65 Computer Vision Algorithms Main concern of CV is to develop Algorithms

66 LIDAR also senses surfaces
Single sensing element scans scene Laser light reflected off surface and returned Phase shift codes distance Brightness change codes albedo (surface reflectance) Stockman MSU/CSE Fall 2008

67 2.5D face image from Minolta Vivid 910 scanner
A rotating mirror scans a laser stripe across the object. 320x240 rangels obtained in about 2 seconds. [x,y,z,R,G,B] image. Stockman MSU/CSE Fall 2008

68 3D scanning technology 3D image of voxels obtained
Usually computationally expensive reconstruction of 3D from many 2D scans (CAT computer-aided-tomography) Stockman MSU/CSE Fall 2008

69 Magnetic Resonance Imaging
Sense density of certain chemistry S slices x R rows x C columns Volume element (voxel) about 2mm per side At left is shaded 2D image created by “volume rendering” a 3D volume: darkness codes depth Stockman MSU/CSE Fall 2008

70 Single slice through human head
MRIs are computed structures, computed from many views. At left is MRA (angiograph), which shows blood flow. CAT scans are computed in much the same manner from X-ray transmission data. Stockman MSU/CSE Fall 2008

71 Problems in Image Acquisition

72

73 Human eye as a spherical camera
million Rods sense intensity. 6-7 million Cones sense color. Fovea has tightly packed area, more cones. Periphery has more rods. Focal length is about 20mm. Pupil/iris controls light entry. Eye scans, or saccades, to image details on fovea. 100M sensing cells funnel to 1M optic nerve connections to the brain. Stockman MSU/CSE Fall 2008

74 RODS AND CONES

75 Cones

76 Image Formation

77 Problems in HVS Mach Band Effect

78 Contrast

79 Illusions

80

81 Images: 2D projections of 3D
The 3D world has color, texture, surfaces, volumes, light sources, temperature, reflectance, … A 2D image is a projection of a scene from a specific viewpoint.

82 Digital Images form arrays

83 Digitizing: Sampling

84 Quantization

85 Digital Image: Sampled and quantized

86 Sampling at different resolution

87 Sampling

88 Quantization

89 What are the appropriate sampling and quantization rates?
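The two digitizing steps named on the slides above can be sketched as follows. This is a minimal illustration, not the course's code: the function names `subsample` and `quantize` are hypothetical, the image is assumed to be a list of rows of integers in 0..255, and `levels` is assumed to be at least 2.

```python
def subsample(image, step):
    """Coarser sampling: keep every `step`-th pixel in each direction."""
    return [row[::step] for row in image[::step]]


def quantize(image, levels, max_val=255):
    """Coarser quantization: map values in 0..max_val onto `levels` gray tones."""
    bin_size = (max_val + 1) / levels
    out = []
    for row in image:
        new_row = []
        for v in row:
            index = min(int(v / bin_size), levels - 1)            # which bin v falls in
            new_row.append(int(index * max_val / (levels - 1)))   # tone representing that bin
        out.append(new_row)
    return out
```

Lower sampling rates lose spatial detail and fewer quantization levels lose tonal detail, so the appropriate rates depend on what the later analysis has to distinguish.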

90 Resolution resolution: precision of the sensor
nominal resolution: size of a single pixel in scene coordinates (e.g., meters, mm). Common use of resolution: num_rows X num_cols (e.g., 515 x 480). Field of view (FOV): size of the scene a sensor can sense
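A small worked example of nominal resolution, assuming the sensor samples the field of view uniformly (the slide gives only the definitions, so this relation and the function name `nominal_resolution` are illustrative assumptions):

```python
def nominal_resolution(fov_size, num_pixels):
    """Scene size covered by one pixel, in the same unit as fov_size."""
    return fov_size / num_pixels


# A 1000 mm wide field of view imaged across 500 columns
# gives 2 mm of scene per pixel in the column direction.
mm_per_pixel = nominal_resolution(1000.0, 500)
```

Under this assumption, halving the field of view or doubling the pixel count both halve the scene size covered by one pixel.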

91

92 Images as Functions
A gray-tone image is a function: g(x,y) = val or f(row, col) = val. A color image is just three functions or a vector-valued function: f(row,col) = (r(row,col), g(row,col), b(row,col)). Multi-spectral image: f(row,col) = (f1(row,col), f2(row,col), …, fn(row,col))
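The function view can be made concrete with nested lists. This is only a sketch: the tiny arrays and the names `gray`, `f`, and `f_color` are made up for illustration.

```python
# A gray-tone image stored as a list of rows; f(row, col) = val.
gray = [
    [10, 20, 30],
    [40, 50, 60],
]


def f(row, col):
    """Gray-tone image as a function: one intensity per pixel."""
    return gray[row][col]


# A color image as three gray-tone planes of the same size.
red = [[255, 0, 0], [0, 0, 0]]
green = [[0, 255, 0], [0, 0, 0]]
blue = [[0, 0, 255], [0, 0, 0]]


def f_color(row, col):
    """Color image as a vector-valued function: (r, g, b) per pixel."""
    return (red[row][col], green[row][col], blue[row][col])
```

A multi-spectral image extends the same idea to n planes, returning an n-tuple per pixel.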

93 Gray-tone Image as Function

94 Image vs Matrix There are many different file formats.

95 Digital Image Terminology:
pixel (with value 94) its 3x3 neighborhood region of medium intensity resolution (7x7) binary image gray-scale (or gray-tone) image color image multi-spectral image range image labeled image

96 Image File Formats Portable Gray Map (PGM) older form
GIF was early commercial version JPEG (JPG) is modern version MPEG for motion Many others exist: header plus data Do they handle color? Do they provide for compression? Are there good packages that use them or at least convert between them?

97 Compression: Reduce the redundancy
Lossy Lossless

98 Run Coding: each row is coded by its runs. Code 1: 3(0)1(1)2(0)1(1)6(0) (run length and value), or Code 2: (4,4)(7,7) (start and end column of each run of 1s)
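Both codes can be sketched in a few lines. The example row below is reconstructed from Code 1 (the transcript does not preserve the original pixel rows), and the 1-indexed columns are an assumption chosen so that the output matches Code 2; the function names are hypothetical.

```python
def run_code_1(row):
    """Code 1: (run_length, value) pairs covering the whole row."""
    runs = []
    for v in row:
        if runs and runs[-1][1] == v:
            runs[-1] = (runs[-1][0] + 1, v)   # extend the current run
        else:
            runs.append((1, v))               # start a new run
    return runs


def run_code_2(row):
    """Code 2: (start, end) columns of each run of 1s, counted from 1."""
    runs = []
    start = None
    for col, v in enumerate(row, start=1):
        if v == 1 and start is None:
            start = col
        elif v != 1 and start is not None:
            runs.append((start, col - 1))
            start = None
    if start is not None:                     # run reaches the end of the row
        runs.append((start, len(row)))
    return runs


row = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
code1 = run_code_1(row)   # [(3, 0), (1, 1), (2, 0), (1, 1), (6, 0)]
code2 = run_code_2(row)   # [(4, 4), (7, 7)]
```

Code 2 stores only the foreground runs, which is usually the more compact choice when 1-pixels are sparse.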

99 PGM image with ASCII info.
P2 means ASCII gray Comments W=16; H=8 192 is max intensity Can be made with editor Large images are usually not stored as ASCII

100 PBM/PGM/PPM Codes P1: ascii binary (PBM) P2: ascii grayscale (PGM)
P3: ascii color (PPM) P4: byte binary (PBM) P5: byte grayscale (PGM) P6: byte color (PPM)
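As a sketch of the ASCII grayscale variant (P2), the following builds the text of a small PGM file: magic code, an optional comment, width and height, the maximum intensity, then the pixel values. The function name `to_pgm_ascii` and the sample pixel values are illustrative assumptions.

```python
def to_pgm_ascii(image, max_val=255, comment="example image"):
    """Return the contents of an ASCII (P2) PGM file for a list-of-rows image."""
    height = len(image)
    width = len(image[0])
    lines = [
        "P2",                    # magic code: ASCII grayscale
        "# " + comment,          # comment line
        f"{width} {height}",     # width first, then height
        str(max_val),            # maximum intensity value
    ]
    for row in image:
        lines.append(" ".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


pgm_text = to_pgm_ascii([[0, 192], [96, 0]], max_val=192)
```

Because the result is plain text it can be inspected or written with any editor, which is also why large images are normally stored in the byte variants (P4 to P6) instead.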

101 JPG current popular form
Public standard Allows for image compression; often 10:1 or :1 are easily possible 8x8 intensity regions are fit with basis of cosines Error in cosine fit coded as well Parameters then compressed with Huffman coding Common for most digital cameras
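To illustrate what "fit with basis of cosines" means, here is a one-dimensional orthonormal DCT-II and its inverse in plain Python. This is a simplified sketch under stated assumptions: JPEG applies a two-dimensional transform to 8x8 blocks and then quantizes and entropy-codes the coefficients, none of which is shown here, and the function names are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II: coefficients of x in a cosine basis."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct_1d(coeffs):
    """Inverse of dct_1d: rebuild the samples from the cosine coefficients."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1.0 / n)
        s += sum(
            coeffs[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out
```

A constant run of 8 pixels puts all of its energy into the first coefficient, which is why smooth regions compress well once the small coefficients are coarsely quantized.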

102

103 From 3D Scenes to 2D Images
Object World Camera Real Image Pixel Image

104 Binary Image Analysis

105 Binary image analysis consists of a set of image analysis operations that are used to produce or process binary images, usually images of 0’s and 1’s. 0 represents the background 1 represents the foreground

106 Binary Image Analysis is used in a number of practical applications, e.g. part inspection riveting fish counting document processing

107 What kinds of operations?
Separate objects from background and from one another Aggregate pixels for each object Compute features for each object

108 Example: red blood cell image
Many blood cells are separate objects Many touch – bad! Salt and pepper noise from thresholding How useable is this data?

109 Results of analysis 63 separate objects detected
Single cells have area about 50 Noise spots Gobs of cells

110 Useful Operations 1. Thresholding a gray-tone image
2. Determining good thresholds 3. Connected components analysis 4. Binary mathematical morphology 5. All sorts of feature extractors (area, centroid, circularity, …)

111 1. Thresholding Convert gray level or color image into binary image
Use histogram
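A minimal sketch of both steps, assuming an 8-bit gray-tone image stored as a list of rows and a bright foreground (the function names and the "greater than t" convention are assumptions, not taken from the slides):

```python
def histogram(image, levels=256):
    """Count how many pixels have each gray-tone value."""
    counts = [0] * levels
    for row in image:
        for v in row:
            counts[v] += 1
    return counts


def threshold(image, t):
    """Binary image: 1 (foreground) where value > t, else 0 (background)."""
    return [[1 if v > t else 0 for v in row] for row in image]


image = [
    [10, 12, 200],
    [11, 210, 205],
]
counts = histogram(image)
binary = threshold(image, 100)   # [[0, 0, 1], [0, 1, 1]]
```

If the objects are darker than the background, the comparison is simply reversed.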

112 Histogram
Background is black. Healthy cherry is bright. Bruise is medium dark. Histogram shows two cherry regions (black background has been removed). [Axes: pixel counts vs. 256 gray-tone values.]

113 Histogram-Directed Thresholding
How can we use a histogram to separate an image into 2 (or several) different regions? Is there a single clear threshold? 2? 3?

114 Automatic Thresholding: Otsu’s Method
Assumption: the histogram is bimodal. Method: find the threshold t that minimizes the weighted sum of within-group variances for the two groups (Grp 1, Grp 2) that result from separating the gray tones at value t.
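The method statement translates directly into a brute-force search over t. This sketch assumes group 1 holds the gray tones up to and including t and group 2 the rest; it recomputes both variances for every candidate, which is simple but slower than the incremental formulation usually used in practice. The name `otsu_threshold` is hypothetical.

```python
def otsu_threshold(hist):
    """Return t minimizing the weighted sum of within-group variances.

    hist[i] is the number of pixels with gray tone i.
    Group 1 = tones 0..t, group 2 = tones t+1..len(hist)-1.
    """
    total = sum(hist)
    best_t, best_score = None, None
    for t in range(len(hist) - 1):
        g1 = list(enumerate(hist[: t + 1]))               # (tone, count) pairs
        g2 = list(enumerate(hist[t + 1:], start=t + 1))
        n1 = sum(c for _, c in g1)
        n2 = sum(c for _, c in g2)
        if n1 == 0 or n2 == 0:
            continue                                       # one group is empty
        mean1 = sum(i * c for i, c in g1) / n1
        mean2 = sum(i * c for i, c in g2) / n2
        var1 = sum(c * (i - mean1) ** 2 for i, c in g1) / n1
        var2 = sum(c * (i - mean2) ** 2 for i, c in g2) / n2
        score = (n1 / total) * var1 + (n2 / total) * var2
        if best_score is None or score < best_score:
            best_t, best_score = t, score
    return best_t
```

For a clearly bimodal histogram the returned t falls in the valley between the two modes; when the assumption fails, the result can be arbitrary.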

115 Thresholding Example original gray tone image binary thresholded image

116 2. Connected Components Labeling
Once you have a binary image, you can identify and then analyze each connected set of pixels. The connected components operation takes in a binary image and produces a labeled image in which each pixel has the integer label of either the background (0) or a component. [Figure: binary image after morphology and its connected components.]

117 Methods for CC Analysis
Recursive Tracking (almost never used) Parallel Growing (needs parallel hardware) Row-by-Row (most common) Classical Algorithm (see text) Efficient Run-Length Algorithm (developed for speed in real industrial applications)

118 Equivalent Labels Original Binary Image

119 Equivalent Labels The Labeling Process
1 ≡ 2, 1 ≡ 3

120 Run-Length Data Structure
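[Figure: a binary image and its run-length encoding. Row Index table: Rstart and Rend give the first and last run of each row (0 for a row with no runs). Runs table: one record per run with fields row, scol, ecol, label; record 0 is unused.]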

121 Run-Length Algorithm
Procedure run_length_classical { initialize Run-Length and Union-Find data structures; count <- 0; /* Pass 1 (by rows) */ for each current row and its previous row: move pointer P along the runs of current row; move pointer Q along the runs of previous row

122 Case 1: No Overlap
[Figure: two diagrams of non-overlapping runs in the previous row (pointer Q) and the current row (pointer P).] Either /* new label */ count <- count + 1; label(P) <- count; P <- P + 1, or /* check Q’s next run */ Q <- Q + 1

123 Case 2: Overlap
[Figure: runs in the previous row (pointer Q) overlapping runs in the current row (pointer P).] Subcase 1: P’s run has no label yet: label(P) <- label(Q); move pointer(s). Subcase 2: P’s run has a label that is different from Q’s run: union(label(P),label(Q)); move pointer(s) }

124 Pass 2 (by runs)
/* Relabel each run with the name of the equivalence class of its label */ For each run M { label(M) <- find(label(M)) } where union and find refer to the operations of the Union-Find data structure, which keeps track of sets of equivalent labels.
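The two passes can be put together in a short runnable sketch. It follows the structure of the pseudocode (runs of the current row compared against runs of the previous row, union-find for equivalent labels) but is not the textbook's code: the list-based run records, the function names, and the use of 4-connectivity are assumptions made for illustration.

```python
def find(parent, x):
    """Return the representative label of x's equivalence class."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]   # path halving keeps trees shallow
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the equivalence classes of labels a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def runs_of_row(row):
    """[start_col, end_col, label] for each run of 1s; label 0 means unlabeled."""
    runs, start = [], None
    for col, v in enumerate(row):
        if v == 1 and start is None:
            start = col
        elif v != 1 and start is not None:
            runs.append([start, col - 1, 0])
            start = None
    if start is not None:
        runs.append([start, len(row) - 1, 0])
    return runs


def label_runs(image):
    """Two-pass run-length connected components labeling (4-connectivity)."""
    all_runs = [runs_of_row(row) for row in image]
    parent = [0]                         # parent[label]; index 0 is background
    count = 0
    # Pass 1 (by rows): P walks the current row's runs, Q the previous row's.
    for r, current in enumerate(all_runs):
        previous = all_runs[r - 1] if r > 0 else []
        q = 0
        for p_run in current:
            while q < len(previous) and previous[q][1] < p_run[0]:
                q += 1                   # Q's run ends before P's starts
            k = q
            while k < len(previous) and previous[k][0] <= p_run[1]:
                if p_run[2] == 0:
                    p_run[2] = previous[k][2]              # inherit Q's label
                else:
                    union(parent, p_run[2], previous[k][2])  # record equivalence
                k += 1
            if p_run[2] == 0:            # no overlap at all: new label
                count += 1
                parent.append(count)
                p_run[2] = count
    # Pass 2 (by runs): replace each label by its class representative.
    for runs in all_runs:
        for run in runs:
            run[2] = find(parent, run[2])
    return all_runs


def to_labeled_image(image, all_runs):
    """Expand labeled runs back into a labeled image (0 = background)."""
    out = [[0] * len(row) for row in image]
    for r, runs in enumerate(all_runs):
        for start, end, label in runs:
            for c in range(start, end + 1):
                out[r][c] = label
    return out
```

The final labels are class representatives and need not be consecutive; a renumbering step can be added if consecutive labels are required.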

125 Labeling shown as Pseudo-Color
connected components of 1’s from thresholded image connected components of cluster labels

126 Mathematical Morphology
Binary mathematical morphology consists of two basic operations, dilation and erosion, and several composite relations: closing and opening, conditional dilation, . . .

127 Dilation Dilation expands the connected sets of 1s of a binary image.
It can be used for 1. growing features 2. filling holes and gaps

128 Erosion Erosion shrinks the connected sets of 1s of a binary image.
It can be used for 1. shrinking features 2. Removing bridges, branches and small protrusions

129 Structuring Elements
A structuring element is a shape mask used in the basic morphological operations. They can be any shape and size that is digitally representable, and each has an origin. [Figure: example structuring elements: box, disk, hexagon, something; box(length,width), disk(diameter).]

130 Dilation with Structuring Elements
The arguments to dilation and erosion are a binary image B and a structuring element S. dilate(B,S) takes binary image B, places the origin of structuring element S over each 1-pixel, and ORs the structuring element S into the output image at the corresponding position. [Figure: B, S with its origin marked, and the result B ⊕ S.]
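The definition of dilate(B,S) can be sketched directly. Here the structuring element is represented as a list of (row, column) offsets of its 1-pixels relative to its origin, and pixels that would fall outside the image are dropped; both choices are assumptions of this sketch.

```python
def dilate(B, S_offsets):
    """OR the structuring element into the output at every 1-pixel of B."""
    rows, cols = len(B), len(B[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if B[r][c] == 1:
                for dr, dc in S_offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1
    return out


# A horizontal 1x3 element with its origin at the center.
S = [(0, -1), (0, 0), (0, 1)]
B = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
grown = dilate(B, S)   # [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
```

Each 1-pixel of B is replaced by a copy of S, so the single pixel grows into the shape of the element.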

131 Erosion with Structuring Elements
erode(B,S) takes a binary image B, places the origin of structuring element S over every pixel position, and ORs a binary 1 into that position of the output image only if every position of S (with a 1) covers a 1 in B. [Figure: B, S with its origin marked, and the result B ⊖ S.]
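A matching sketch of erode(B,S), again with the structuring element given as (row, column) offsets from its origin. Treating positions outside the image as 0 is an assumption of this sketch; the slide does not say how borders are handled.

```python
def erode(B, S_offsets):
    """Output 1 only where every 1 of S, placed at that pixel, covers a 1 in B."""
    rows, cols = len(B), len(B[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            fits = all(
                0 <= r + dr < rows and 0 <= c + dc < cols and B[r + dr][c + dc] == 1
                for dr, dc in S_offsets
            )
            if fits:
                out[r][c] = 1
    return out


# A 3x3 box with its origin at the center.
box = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
B = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]
shrunk = erode(B, box)   # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```

Only the center pixel survives because it is the only position where the whole box fits inside the set of 1s.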

132 Example to Try: [Figure: binary image B and structuring element S.] Erode and dilate B with the same structuring element

133 Opening and Closing Closing is the compound operation of dilation followed by erosion (with the same structuring element) Opening is the compound operation of erosion followed by dilation (with the same structuring element)
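The two compound operations can be sketched by composing minimal versions of dilation and erosion. The helper implementations, the offset-list representation of the structuring element, and the treatment of out-of-image positions as 0 are assumptions of this sketch.

```python
def dilate(B, S):
    """Stamp the structuring element S (offsets from origin) at every 1 of B."""
    rows, cols = len(B), len(B[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if B[r][c] == 1:
                for dr, dc in S:
                    if 0 <= r + dr < rows and 0 <= c + dc < cols:
                        out[r + dr][c + dc] = 1
    return out


def erode(B, S):
    """Keep a pixel only if S, placed there, lies entirely on 1s of B."""
    rows, cols = len(B), len(B[0])
    return [
        [
            1 if all(
                0 <= r + dr < rows and 0 <= c + dc < cols and B[r + dr][c + dc] == 1
                for dr, dc in S
            ) else 0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def opening(B, S):
    """Erosion followed by dilation with the same structuring element."""
    return dilate(erode(B, S), S)


def closing(B, S):
    """Dilation followed by erosion with the same structuring element."""
    return erode(dilate(B, S), S)


S = [(0, -1), (0, 0), (0, 1)]                      # horizontal 1x3, origin at center
opened = opening([[1, 0, 0, 1, 1, 1, 0]], S)       # isolated pixel removed
closed = closing([[0, 1, 1, 0, 1, 1, 0]], S)       # one-pixel gap filled
```

Opening removes features smaller than the structuring element while leaving larger ones intact; closing fills gaps and holes smaller than the element.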

134 Use of Opening Original Opening Corners
What kind of structuring element was used in the opening? How did we get the corners?

135 Gear Tooth Inspection original binary image How did they do it?
detected defects

136 Some Details

137 Region Properties Properties of the regions can be used to recognize objects. geometric properties (Ch 3) gray-tone properties color properties texture properties shape properties (a few in Ch 3) motion properties relationship properties (1 in Ch 3)

138 Geometric and Shape Properties
area, centroid, perimeter, perimeter length, circularity, elongation, mean and standard deviation of radial distance, bounding box, extremal axis length from bounding box, second order moments (row, column, mixed), lengths and orientations of axes of best-fit ellipse. Which are statistical? Which are structural?
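A few of the listed properties can be computed directly from a labeled image. This sketch covers area, centroid, bounding box, and the three second-order moments; the dictionary keys and the function name `region_properties` are illustrative choices, and the remaining properties (perimeter, circularity, best-fit ellipse) are not shown.

```python
def region_properties(labeled, label):
    """Area, centroid, bounding box and second-order moments of one region."""
    pixels = [
        (r, c)
        for r, row in enumerate(labeled)
        for c, v in enumerate(row)
        if v == label
    ]
    area = len(pixels)
    if area == 0:
        raise ValueError("label not present in image")
    row_mean = sum(r for r, _ in pixels) / area
    col_mean = sum(c for _, c in pixels) / area
    return {
        "area": area,
        "centroid": (row_mean, col_mean),
        # (min_row, min_col, max_row, max_col)
        "bounding_box": (
            min(r for r, _ in pixels),
            min(c for _, c in pixels),
            max(r for r, _ in pixels),
            max(c for _, c in pixels),
        ),
        "mu_rr": sum((r - row_mean) ** 2 for r, _ in pixels) / area,
        "mu_cc": sum((c - col_mean) ** 2 for _, c in pixels) / area,
        "mu_rc": sum((r - row_mean) * (c - col_mean) for r, c in pixels) / area,
    }


labeled = [
    [0, 1, 1],
    [0, 1, 1],
    [2, 0, 0],
]
props = region_properties(labeled, 1)
```

For the 2x2 square labeled 1, the area is 4, the centroid is (0.5, 1.5), and the mixed moment is 0 because the region is symmetric about its centroid.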

139 Region Adjacency Graph
A region adjacency graph (RAG) is a graph in which each node represents a region of the image and an edge connects two nodes if the regions are adjacent. [Figure: an image with regions 1-4 and its region adjacency graph.]
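A region adjacency graph can be built from a labeled image by checking each pixel against its right and lower neighbors. In this sketch two regions count as adjacent when they share a pixel edge (4-adjacency), and the background label is treated as a region like any other; both are assumptions, as is the function name.

```python
def region_adjacency_graph(labeled):
    """Map each region label to the set of labels of its adjacent regions."""
    rows, cols = len(labeled), len(labeled[0])
    graph = {}
    for r in range(rows):
        for c in range(cols):
            a = labeled[r][c]
            graph.setdefault(a, set())
            # Looking only right and down visits every neighboring pair once.
            for rr, cc in ((r + 1, c), (r, c + 1)):
                if rr < rows and cc < cols:
                    b = labeled[rr][cc]
                    if b != a:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph


labeled = [
    [1, 1, 2],
    [3, 3, 2],
]
rag = region_adjacency_graph(labeled)   # {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
```

The adjacency sets are symmetric, so the dictionary describes an undirected graph with one node per region.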

