Segmentation, contour based (Theo Schouten, 2007)


1 Segmentation, contour based
A segmented image contains groupings of parts of an image that are homogeneous in one or more properties:
- intensity or color
- texture (the fine structure in intensity)
- movement (a vector value per pixel)
We want the groupings to coincide with (parts of) objects or situations in the portrayed scene. The goal is often to divide the entire image into disjoint connected regions:
$\text{Image} = \bigcup_k R_k$ with $R_i \cap R_j = \emptyset$ for $i \neq j$
$R$ is a connected region if for each $x_i$ and $x_j$ in $R$ there is a sequence $\{x_i, \ldots, x_k, x_{k+1}, \ldots, x_j\}$ in $R$ where each consecutive pair $(x_k, x_{k+1})$ is connected (4-connected, 8-connected or mixed).
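The definition of a connected region can be illustrated with a small labeling sketch. It assumes that "homogeneous" simply means "equal pixel value" and uses a breadth-first flood fill; the function name `label_regions` and the tiny test image are illustrative choices, not part of the slides.

```python
from collections import deque

def label_regions(image, connectivity=4):
    """Label connected regions of equal pixel value.

    Returns (labels, count): labels[r][c] is the region number (1-based).
    """
    rows, cols = len(image), len(image[0])
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:  # 8-connected: the diagonal neighbours count as well
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue  # pixel already belongs to a region
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not labels[nr][nc]
                            and image[nr][nc] == image[cr][cc]):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count

# A checkerboard shows the difference between 4- and 8-connectivity.
image = [[1, 0],
         [0, 1]]
labels4, n4 = label_regions(image, connectivity=4)
labels8, n8 = label_regions(image, connectivity=8)
```

With 4-connectivity every pixel of the checkerboard is its own region; with 8-connectivity the diagonal pixels join, leaving two regions.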

2 Boundary and regions
We can try to find both the boundaries of the regions and the regions themselves. Perfect boundaries and regions are redundant: from one you can derive the other. The methods for finding them, however, differ considerably in character and in suitability for particular concrete cases. Boundary- and region-finding techniques can be combined (hybrid segmentation) to yield a more reliable segmented image.
In this chapter "knowledge" becomes important. It can be defined as implicit or explicit limits on the probability of a given grouping in an image. This knowledge can be domain dependent, for example:
- this is an image of blocks
- there is an airplane at the top left, etc.
It can also be general, physical or heuristic knowledge:
- most humans have two arms
- the maximum velocity or acceleration of a movement
- preference for the shortest edge between two points

3 Edges
Edges of objects are important for the human visual system; objects can often be recognized from just a rough contour. It is difficult to detect the contours of objects directly from an intensity image. It is better to first convert the image to one that shows local discontinuities (edges) in the intensity.
An edge is a vector that gives the position, size and direction of a discontinuity. Sometimes only the size is determined. The "direction" of the edge is perpendicular to the "direction" of the contour of the object, so pay close attention to which direction is meant.
An edge can be determined per pixel, but also between connected pixels, the so-called crack edges. Sometimes the position of an edge is determined with a precision higher than one pixel.

4 Edge operator
An edge operator is a mathematical function that detects local discontinuities within a limited neighborhood. Edge operators can be classified into:
- approximations of the gradient operator
- template matching: check whether edge models fit
- fits with parameterized edge models, when more is known about the edges one wants to find
All edge operators have an underlying model of the discontinuities they detect. They yield numbers for the size and direction of the discontinuities, regardless of how well the local image piece satisfies the model. The quality of the "match" is often hidden in the size, but is sometimes also given as separate quality or threshold values.

5 Parameterized edge model operators
These operators cost a lot of calculation time and their benefit is fairly limited, especially as general edge operators that are used without much a priori information about the image scene. They can yield more information about the discontinuity than direction and size alone, such as the width of an edge and the size of the intensity transitions to the left and right of the edge.

6 Points and lines
Isolated pixels are often detected with masks that approximate the Laplacian. These operations are very sensitive to noise. Thresholding yields the pixels that deviate drastically from their neighborhood.
Lines that are one pixel wide can be found using one mask per line direction (the masks were shown as a figure on the slide). Select direction $i$ if $|R_i| > |R_j|$ for all $j \neq i$, possibly averaging the directions (weighted with $|R_k|$) when two directions close to each other yield almost the same $R$. Thresholding (absolute and relative) is used to remove irrelevant line elements.

7 Gradient
Using the image function $f(x,y)$ one can determine the vector gradient image:
$\nabla f(x,y) = (\partial f/\partial x,\ \partial f/\partial y)$
- direction: $\theta = \operatorname{arctan2}(\partial f/\partial x,\ \partial f/\partial y)$
- size: $\sqrt{(\partial f/\partial x)^2 + (\partial f/\partial y)^2}$
- $|\partial f/\partial x| + |\partial f/\partial y|$ is often used as an approximation of the size
- "crack" edges: $\partial f/\partial x = f(x+1,y) - f(x,y)$ and $\partial f/\partial y = f(x,y+1) - f(x,y)$
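As a minimal sketch of the crack-edge differences, the function below computes the gradient components, the exact size, the absolute-value approximation and the direction at one pixel. It assumes the image is stored as `f[y][x]` and uses Python's `math.atan2(fy, fx)` argument order, which may differ from the argument order written on the slide; the name `gradient` is illustrative.

```python
import math

def gradient(f, x, y):
    """Forward-difference ("crack edge") gradient at pixel (x, y).

    f is indexed as f[y][x]; (x + 1, y) and (x, y + 1) must lie inside the image.
    """
    fx = f[y][x + 1] - f[y][x]        # df/dx
    fy = f[y + 1][x] - f[y][x]        # df/dy
    size = math.hypot(fx, fy)         # sqrt(fx^2 + fy^2)
    approx = abs(fx) + abs(fy)        # cheaper approximation of the size
    direction = math.atan2(fy, fx)    # radians, atan2(y, x) convention (assumed)
    return fx, fy, size, approx, direction

# A vertical step edge: intensity jumps from 0 to 10 between x = 1 and x = 2.
f = [[0, 0, 10],
     [0, 0, 10],
     [0, 0, 10]]
fx, fy, size, approx, direction = gradient(f, 1, 0)
```

For an edge aligned with an image axis the approximation equals the exact size; for diagonal edges it overestimates by up to a factor of $\sqrt{2}$.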

8 Roberts, Prewitt, Sobel masks
Prewitt and Sobel take more pixels into account and are therefore less sensitive to noise. Variants with $\sqrt{2}$ are also used a lot. Larger masks, for example 5 by 5, can be used if the edges are approximately straight over such a large area.
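The masks themselves were shown as a figure and are not part of this transcript. The sketch below uses the commonly published 3x3 Sobel masks; their sign and orientation convention, the zero border and the names `apply_mask` and `sobel_size` are assumptions made for illustration.

```python
import math

# Commonly used 3x3 Sobel masks (sign convention assumed).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [ 0,  0,  0],
           [ 1,  2,  1]]

def apply_mask(image, mask, r, c):
    """Weighted sum of the 3x3 neighbourhood around interior pixel (r, c)."""
    return sum(mask[i][j] * image[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))

def sobel_size(image):
    """Edge size per pixel; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    size = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = apply_mask(image, SOBEL_X, r, c)
            gy = apply_mask(image, SOBEL_Y, r, c)
            size[r][c] = math.hypot(gx, gy)
    return size

# Vertical step edge between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(4)]
size = sobel_size(image)
```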

9 Example Sobel
[Figure panels: original; edge size, 3x3 Sobel; x and y components of Sobel]

10 Laplacian example
[Figure panels: Landsat image (channel 5); 4-connected Laplacian; part of the Laplacian with zero-crossings]

11 Laplacian of Gaussian (LoG)
Marr and Hildreth used the Laplacian of Gaussian function:
$h(x,y) = \exp(-(x^2+y^2)/2\sigma^2)$
$\nabla^2 h(r) = ((r^2 - \sigma^2)/\sigma^2)\exp(-r^2/2\sigma^2)$
the "Mexican hat" function, and determined its convolution with an image. This is the same as first convolving the image with the Gaussian (smoothing) and then taking the Laplacian of the result.
The convolution matrices are large (9x9 for $\sigma = 1$, 43x43 for $\sigma = 5$), but the calculations can be made faster because the LoG can be written as a sum of two separable filters: $LoG(x,y) = h_{12}(x,y) + h_{21}(x,y)$ with $h_{12}(x,y) = h_1(x)h_2(y)$ and $h_{21}(x,y) = h_2(x)h_1(y)$.
The LoG can also be approximated with a DoG (Difference of Gaussians with different $\sigma$'s). There are indications that biological systems also do this.
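The separability can be checked numerically. The sketch below assumes that $h_1$ is the second derivative of a 1-D Gaussian and $h_2$ the 1-D Gaussian itself (the slide does not define them), uses an unnormalized Gaussian, and differentiates analytically, so its constant factors are not necessarily those of the expression on the slide. All function names are illustrative.

```python
import math

def gauss(t, sigma):
    """Unnormalised 1-D Gaussian (assumed to play the role of h2)."""
    return math.exp(-t * t / (2 * sigma ** 2))

def gauss_dd(t, sigma):
    """Second derivative of gauss with respect to t (assumed role of h1)."""
    return (t * t - sigma ** 2) / sigma ** 4 * gauss(t, sigma)

def log_direct(x, y, sigma):
    """Laplacian of exp(-(x^2 + y^2) / (2 sigma^2)), differentiated analytically."""
    r2 = x * x + y * y
    return (r2 - 2 * sigma ** 2) / sigma ** 4 * math.exp(-r2 / (2 * sigma ** 2))

def log_separable(x, y, sigma):
    """The same value as a sum of two products of 1-D functions."""
    return (gauss_dd(x, sigma) * gauss(y, sigma)
            + gauss(x, sigma) * gauss_dd(y, sigma))

sigma = 1.5
max_difference = max(abs(log_direct(x, y, sigma) - log_separable(x, y, sigma))
                     for x in range(-4, 5) for y in range(-4, 5))
```

In practice this means an N x N convolution can be replaced by four 1-D convolutions of length N (two per separable term).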

12 Example LoG
[Figure panels: original image; Sobel gradient; Gaussian smoothing; Laplacian; LoG; thresholded LoG; zero-crossings]

13 Canny
Canny (1986) uses a first-order derivative. Starting with a 1-D step edge around 0 with white Gaussian noise and a convolution with an antisymmetric function $I(x)$, the maxima of the following response yield the 1-D edges:
$\theta(x_0) = \int_{-\infty}^{+\infty} I(x)\, f(x - x_0)\, dx$
He first determined the best $I(x)$ for edge detection by formulating criteria and expressing them as mathematical functions:
- good detection: small chance of missing real edges and of finding false ones
- good localization: small distance between found and real edges
- just one position per edge
His best $I(x)$ can be approximated (20% worse) by the first derivative of a Gaussian: $(x/\sigma^2)\exp(-x^2/2\sigma^2)$
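A minimal 1-D sketch of this idea: sample the first derivative of a Gaussian, slide it over a signal and take the maximum of the response as the edge position. Several things here are assumptions for illustration: the response is computed as a correlation, $\sum_k I(k) f(x_0 + k)$, so that a rising step gives a positive peak; the filter is truncated at $3\sigma$; the signal border is replicated; and the test signal is noise free.

```python
import math

def derivative_of_gaussian(sigma):
    """Sampled taps I(k) = (k / sigma^2) exp(-k^2 / (2 sigma^2)), |k| <= 3 sigma."""
    radius = int(3 * sigma)
    return {k: (k / sigma ** 2) * math.exp(-k * k / (2 * sigma ** 2))
            for k in range(-radius, radius + 1)}

def edge_response(signal, sigma):
    """Response sum_k I(k) * f(x0 + k) for every position x0."""
    taps = derivative_of_gaussian(sigma)
    n = len(signal)

    def sample(i):
        # Replicate the border value outside the signal (assumption).
        return signal[min(max(i, 0), n - 1)]

    return [sum(w * sample(x0 + k) for k, w in taps.items())
            for x0 in range(n)]

# Noise-free step edge between positions 9 and 10.
signal = [0.0] * 10 + [1.0] * 10
response = edge_response(signal, sigma=1.5)
peak = max(range(len(response)), key=lambda i: response[i])
```

Because the step lies between two samples, the two positions next to it share the maximal response; `max` reports the first of them.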

14 Canny 2D
In 2-D we want to execute a convolution with the first derivative of a 2-D Gaussian in a direction $n$ perpendicular to the edge:
$G_n = \partial G/\partial n = n \cdot \nabla G$ with $\nabla G = (\partial G/\partial x,\ \partial G/\partial y)$
$n = \nabla(G * Im) / |\nabla(G * Im)|$ (this holds as an approximation)
$\partial(G_n * Im)/\partial n = 0$, thus $\partial^2(G * Im)/\partial n^2 = 0$ (local maximum)
In his implementation Canny used simple masks to calculate $n$ and a simple peak determination with one threshold in the direction of $n$. Better methods to approximate this now exist.
Deriche (1987) found an $I(x)$ that was 90% better than the derivative of the Gaussian and that can also be implemented efficiently. In 2-D the derivatives can be found by convolution with separable masks (13 multiplications and 12 additions per pixel).

15 Example Canny
[Figure panels: Landsat image; Canny edges; edge directions after thinning]

16 Templates
Often motivated by the Kirsch operator:
$S(x) = \max_k \sum_{j=k-1}^{k+1} |f(x_j) - f(x)|$
$\theta(x) = k_{max} \cdot 45°$
where $k$ walks around $x$:
4 3 2
5 x 1
6 7 8
Possible implementation:
k = 1         k = 2         k = 3         ...    k = 8
|-3 -3  5|    |-3  5  5|    | 5  5  5|           |-3 -3 -3|
|-3  0  5|    |-3  0  5|    |-3  0 -3|           |-3  0  5|
|-3 -3  5|    |-3 -3 -3|    |-3 -3 -3|           |-3  5  5|
This uses 8 templates, so 8 values are calculated for each pixel in the image. The template with the highest value defines the edge strength (equal to that value) and the edge direction (quantized in steps of 45°).
Edges with a small magnitude are often caused by noise or small fluctuations. Thresholding is then used to remove weak edges:
$S'(x) = 0$ if $S(x) \le$ Threshold, otherwise $S'(x) = S(x)$
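The eight templates differ only by a rotation of the ring of neighbours, which the sketch below exploits: weight 5 for the three neighbours $k-1$, $k$, $k+1$ and weight -3 for the other five, with a centre weight of 0. The neighbour order follows the slide's numbering (1 = right, counter-clockwise); the names `RING`, `kirsch` and `threshold` are illustrative.

```python
# Offsets (d_row, d_col) of neighbours k = 1..8: 1 is to the right of x,
# numbering continues counter-clockwise (3 is above, 5 is left, 7 is below).
RING = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def kirsch(image, r, c):
    """Return (strength, k_max) for interior pixel (r, c).

    Template k weights neighbours k-1, k, k+1 with 5 and the others with -3.
    The edge direction on the slide is k_max * 45 degrees.
    """
    ring = [image[r + dr][c + dc] for dr, dc in RING]
    best_value, best_k = None, None
    for k in range(8):
        value = sum((5 if (j - k) % 8 in (7, 0, 1) else -3) * ring[j]
                    for j in range(8))
        if best_value is None or value > best_value:
            best_value, best_k = value, k + 1
    return best_value, best_k

def threshold(strength, limit):
    """S'(x) = 0 if S(x) <= limit, otherwise S(x)."""
    return 0 if strength <= limit else strength

# Bright column on the right of the centre pixel.
image = [[0, 0, 10],
         [0, 0, 10],
         [0, 0, 10]]
strength, k_max = kirsch(image, 1, 1)
```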

17 Frei and Chen
The image function around point $x_0$ is expanded as a sum over 9 basis functions:
$f(x) = \sum_{k=0}^{8} (f, h_k)\, h_k(x - x_0) / (h_k, h_k)$ around $x_0$, with $(f, h_k) = \sum_{d} f(x)\, h_k(x - x_0)$, where the sum runs over the 3x3 neighborhood $d$ of $x_0$.
Frei and Chen took the following basis functions:
no structure   gradient      ripple        line          point
| 1  1  1|     |-1 -2 -1|    | 0 -1  2|    | 0  1  0|    | 1 -2  1|
| 1  1  1|     | 0  0  0|    | 1  0 -1|    |-1  0 -1|    |-2  4 -2|
| 1  1  1|     | 1  2  1|    |-2  1  0|    | 0  1  0|    | 1 -2  1|

               |-1  0  1|    | 2 -1  0|    |-1  0  1|    |-2  1 -2|
               |-2  0  2|    |-1  0  1|    | 0  0  0|    | 1  4  1|
               |-1  0  1|    | 0  1 -2|    | 1  0 -1|    |-2  1 -2|
Every basis function corresponds to a certain local shape in the image; the corresponding coefficient indicates its strength.

18 Frei and Chen, thresholding
How much the image around $x_0$ looks like an edge is determined as
$E = \sum_{k=1}^{2} (f, h_k)^2$
and compared with how much it looks like a non-edge (uniform + ripple + line + point):
$NE = \sum_{k \neq 1,2} (f, h_k)^2$
The Frei-Chen threshold then becomes an angle in the NonEdge-Edge space instead of only a threshold value in the Edge direction.
Another way of removing noise and double edges is:
$S'(x) = S(x)$ if $S(x)$ is a local maximum, else 0
To determine a local maximum one can look at the 4-connected or 8-connected neighboring pixels.

19 Edge thinning
A simple way of thinning is to compare the strength of each edge pixel with that of its neighboring pixels in the gradient direction (perpendicular to the edge). An edge that does not have the maximal strength is removed. Problems often arise where boundaries come together (↑: arrow pointing upwards, ↗: arrow pointing to the top right, .: no direction, +: edge kept):
pixels           direction    magnitude    thinned edges
0 0 0 0 0 0 0
0 0 0 0 0 0 0    ↑ ↑ ↑ ↑ ↑    5 4 3 3 3    0 0 0 + +
2 2 1 1 1 1 1    ↑ ↑ ↑ ↑ ↑    6 5 4 3 3    + + + + +
2 2 2 1 1 1 1    ↗ ↗ ↑ ↑ ↗    1 3 3 2 1    0 0 0 0 0
2 2 2 2 2 1 1    . ↗ ↑ ↑ ↑    0 1 2 3 3    0 0 0 + +
2 2 2 2 2 2 2    . . . ↑ ↑    0 0 0 1 2    0 0 0 0 0
2 2 2 2 2 2 2
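The thinning rule can be sketched as a non-maximum suppression along the gradient direction. The magnitudes and directions below are a reading of the slide's 5x5 example; where the slide shows no arrow the direction is taken to be absent. Treating pixels outside the array as strength 0 and keeping ties are assumptions of this sketch, as are the names `thin`, `UP` and `UP_RIGHT`.

```python
UP, UP_RIGHT = (-1, 0), (-1, 1)   # gradient directions as (d_row, d_col)

def thin(magnitude, direction):
    """Keep an edge pixel only if it is at least as strong as both
    neighbours along its gradient direction (ties are kept)."""
    rows, cols = len(magnitude), len(magnitude[0])

    def value(r, c):
        # Pixels outside the array count as strength 0 (assumption).
        return magnitude[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    kept = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            d = direction[r][c]
            if magnitude[r][c] == 0 or d is None:
                continue
            dr, dc = d
            kept[r][c] = (magnitude[r][c] >= value(r + dr, c + dc)
                          and magnitude[r][c] >= value(r - dr, c - dc))
    return kept

magnitude = [[5, 4, 3, 3, 3],
             [6, 5, 4, 3, 3],
             [1, 3, 3, 2, 1],
             [0, 1, 2, 3, 3],
             [0, 0, 0, 1, 2]]
direction = [[UP, UP, UP, UP, UP],
             [UP, UP, UP, UP, UP],
             [UP_RIGHT, UP_RIGHT, UP, UP, UP_RIGHT],
             [None, UP_RIGHT, UP, UP, UP],
             [None, None, None, UP, UP]]
kept = thin(magnitude, direction)
picture = ["".join("+" if k else "." for k in row) for row in kept]
```

On this data the sketch leaves a double edge in the upper right and a separate fragment in the lower right, the kind of problem the slide points out where boundaries meet.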

20 Lacroix LBE thinning
Lacroix (1988) determines an LBE (likelihood of being an edge) per pixel. Every pixel has two counters: v (visited) and m (maximum). While scanning the image, a 3x1 window is placed over every pixel in the gradient direction. Every pixel in the window gets its v incremented by 1; only the pixel(s) with the highest value in the window get their m incremented by 1. After the scan, LBE = m / v:
v            m            LBE
2 2 2 2 2    0 0 0 2 2    0 0 0   1 1
1 2 3 2 1    1 2 3 2 1    1 1 1   1 1
2 2 4 3 3    0 0 2 0 0    0 0 1/2 0 0
1 1 2 4 2    0 0 0 4 2    0 0 0   1 1
1 0 1 2 2    0 0 0 0 0    0 0 0   0 0
Pixels with an LBE of 0 are clearly not edges. Pixels with an LBE of 1 are used to start following new contours, and lower LBEs are only used to continue already existing contours. During contour following, different thresholds can of course also be applied to the edge strength.
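A minimal sketch of the two counters, under a few assumptions: gradient directions are given as `(d_row, d_col)` offsets, window positions that fall outside the image are ignored, pixels without a direction place no window, and an unvisited pixel gets LBE 0. The function name `lbe` and the small ridge example are illustrative and do not reproduce the slide's table.

```python
def lbe(magnitude, direction):
    """Likelihood of being an edge, m / v, for every pixel."""
    rows, cols = len(magnitude), len(magnitude[0])
    v = [[0] * cols for _ in range(rows)]   # times a pixel was in a window
    m = [[0] * cols for _ in range(rows)]   # times it was the window maximum
    for r in range(rows):
        for c in range(cols):
            if direction[r][c] is None:
                continue
            dr, dc = direction[r][c]
            # 3x1 window centred on (r, c) along the gradient direction.
            window = [(r + i * dr, c + i * dc) for i in (-1, 0, 1)]
            window = [(p, q) for p, q in window
                      if 0 <= p < rows and 0 <= q < cols]
            top = max(magnitude[p][q] for p, q in window)
            for p, q in window:
                v[p][q] += 1
                if magnitude[p][q] == top:
                    m[p][q] += 1
    return [[m[r][c] / v[r][c] if v[r][c] else 0.0 for c in range(cols)]
            for r in range(rows)]

# A horizontal ridge of strength 3 with weaker responses above and below.
UP = (-1, 0)
magnitude = [[1, 1, 1],
             [3, 3, 3],
             [1, 1, 1]]
direction = [[UP] * 3 for _ in range(3)]
likelihood = lbe(magnitude, direction)
```

The ridge pixels are the maximum of every window that visits them and get LBE 1; the weaker rows are visited but never maximal and get LBE 0.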

21 Edge relaxation
An iterative method to improve edge values by adjusting them depending on the measured edges in the neighborhood. The confidence we have in a detected edge becomes dependent on the strengths of the edges in its neighborhood:
0 initial confidence $C_0(e)$, e.g. magnitude / maximal magnitude
1 $k = 1$
2 for each edge, use the confidences of the neighboring edges to calculate a type
3 calculate $C_k(e) = \text{function}\{\text{type}, C_{k-1}(e)\}$
4 evaluate the convergence criteria (e.g. all confidences are near 0 or 1, or the maximal number of iterations has been reached); stop, or increment $k$ and go back to 2
With type = (strong edges left, strong edges right), $C_k(e)$ is:
- $C_{k-1}(e) + \Delta C$ for types (1,1), (1,2), (1,3) and their reverses
- $C_{k-1}(e) - \Delta C$ for types (0,0), (0,2), (0,3) and their reverses
- $C_{k-1}(e)$ in all other cases
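The iteration can be sketched as follows. How the type is derived from the neighboring confidences is not specified on the slide, so the sketch takes it from a caller-supplied function `type_of` (hypothetical); clamping the confidence to [0, 1], the step size 0.1 and the tolerance `eps` are also assumptions.

```python
def update_confidence(confidence, edge_type, delta=0.1):
    """One relaxation step for a single edge.

    edge_type = (strong edges on one side, strong edges on the other side).
    """
    low, high = sorted(edge_type)          # makes (1, 2) and (2, 1) equivalent
    if low == 1:                           # (1,1), (1,2), (1,3) and reverses
        confidence += delta
    elif low == 0 and high != 1:           # (0,0), (0,2), (0,3) and reverses
        confidence -= delta
    return min(1.0, max(0.0, confidence))  # clamp to [0, 1] (assumption)

def relax(confidences, type_of, delta=0.1, max_iterations=50, eps=1e-6):
    """Iterate until all confidences are near 0 or 1, or max_iterations is hit.

    type_of(edge, confidences) must return the type of an edge; it stands in
    for step 2 of the algorithm.
    """
    current = dict(confidences)
    for _ in range(max_iterations):
        current = {edge: update_confidence(c, type_of(edge, current), delta)
                   for edge, c in current.items()}
        if all(c <= eps or c >= 1.0 - eps for c in current.values()):
            break
    return current

# Toy example with fixed types instead of types derived from neighbours.
fixed_types = {"a": (1, 2), "b": (0, 3), "c": (0, 1)}
result = relax({"a": 0.5, "b": 0.5, "c": 0.5},
               lambda edge, confidences: fixed_types[edge])
```

Edge "c" has a type from the "all other cases" group, so its confidence never moves and the loop ends on the iteration limit rather than on convergence.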

22 Edge linking
Edges of neighboring pixels can be combined if they appear similar:
$|\nabla f(x,y) - \nabla f(x',y')| < T$ (edge size)
$|\theta(x,y) - \theta(x',y')| < A$ (edge direction)
The comparison can be made with the first or last edge of each contour, possibly taking an average $\nabla f$ and $\theta$ and adjusting the thresholds to what one already knows about the contour. The method can be adapted to detect circles.
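The similarity test might look like the sketch below, where each edge is represented as a `(size, direction)` pair with the direction in radians. Comparing directions modulo $2\pi$ is an addition of this sketch, not something stated on the slide; the name `similar` is illustrative.

```python
import math

def similar(edge_a, edge_b, size_limit, angle_limit):
    """True if two edges (size, direction in radians) may be linked."""
    size_a, theta_a = edge_a
    size_b, theta_b = edge_b
    # Smallest angle between the two directions, handling wrap-around.
    difference = abs(theta_a - theta_b) % (2 * math.pi)
    difference = min(difference, 2 * math.pi - difference)
    return abs(size_a - size_b) < size_limit and difference < angle_limit

close = similar((10.0, 0.1), (12.0, 2 * math.pi - 0.1), 5.0, 0.5)
too_strong = similar((10.0, 0.0), (20.0, 0.0), 5.0, 0.5)
wrong_way = similar((10.0, 0.0), (10.0, math.pi / 2), 5.0, 0.5)
```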

23 Graph methods
Construct a graph from edge values and directions. Use graph algorithms to link edges to contours.
Example: a noisy chromosome silhouette determined by graph search.

24 Hough transform
Look at all the possible lines that can go through an image point $(s,t)$: $t = m s + c$. The parameters of all these lines form a straight line in the parameter space $(m,c)$. Both $m$ and $c$ can attain any value from $-\infty$ to $+\infty$, which causes problems. A better way to parameterize the line is therefore:
$x\cos\theta + y\sin\theta = r$
with $\theta$ from -90° to +90° and $r$ within $\pm\frac{1}{2}D$, where $D$ is the diagonal of the image.
We have the following Hough algorithm to determine lines:
- initialize $A(r_d, \theta_d) = 0$ for all $r_d$ and $\theta_d$ (the accumulator matrix is discrete)
- for every point $(x,y)$ having a value > Threshold: calculate the $r$'s and $\theta$'s for all the possible lines through $(x,y)$, discretize the values to $r_d$ and $\theta_d$, then set $A(r_d, \theta_d) := A(r_d, \theta_d) + 1$ for all these $r_d$ and $\theta_d$
- the local maxima in $A$ yield the parameters of the lines on which many points lie
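The accumulator algorithm can be sketched as below. It assumes that the points have already been thresholded, that coordinates are measured from the image centre (so that $r$ stays within $\pm\frac{1}{2}D$), and that 180 direction bins and 101 distance bins are fine enough; the function name `hough_lines` and these bin counts are illustrative.

```python
import math

def hough_lines(points, diagonal, n_theta=180, n_r=101):
    """Accumulate votes A[r_d][theta_d] for x cos(theta) + y sin(theta) = r.

    theta runs from -90 to +90 degrees (exclusive), r from -D/2 to +D/2.
    """
    accumulator = [[0] * n_theta for _ in range(n_r)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.radians(-90 + 180 * t / n_theta)
            r = x * math.cos(theta) + y * math.sin(theta)
            # Map r from [-D/2, +D/2] to a bin index 0 .. n_r - 1.
            r_bin = round((r + diagonal / 2) / diagonal * (n_r - 1))
            if 0 <= r_bin < n_r:
                accumulator[r_bin][t] += 1
    return accumulator

# Eleven points on the vertical line x = 3 (coordinates relative to the centre).
points = [(3, y) for y in range(-5, 6)]
accumulator = hough_lines(points, diagonal=20)
votes_for_line = accumulator[65][90]   # bin of r = 3, theta = 0 degrees
highest = max(max(row) for row in accumulator)
```

With bins this coarse, neighbouring direction bins can collect the same number of votes as the true line, which is why the slide speaks of searching for local maxima in $A$ rather than reading off a single cell.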

25 Hough on points

26 Hough on edges
For every point $(x,y)$ with edge size $G(x,y)$ > Threshold and angle $\theta$: $m = \tan(\theta - \pi/2)$ and $c = y - m x$.
The angle $\theta$ is not exact: take a range, e.g. $\pm 45°$. The same holds for $x$ and $y$: e.g. $\pm 1$.

27 Hough transform for circles
Circular figures: $x = a + r\cos\theta$, $y = b + r\sin\theta$
A fixed $r$ gives a 2-D parameter space $A(a,b)$; a variable $r$ gives a 3-D parameter space $A(a,b,r)$. If we want to find both light and dark circles, both sides of every edge must be considered.
If we look at two edges in an image at a time, the number of possible $(a,b,r)$ values decreases strongly. The local maxima in the parameter space are then easier to find. With $n$ edge points (stronger than the threshold) in the image, there are $n(n-1)/2$ pairs to be considered. Bounds on $r$ and tests on the $\theta$'s can restrict the number of $(a,b,r)$ values to be calculated.
In general, any work done in the parameter space (calculating and tracking down the local maxima) can be replaced by work in the image space. In recent years the Hough methods have attracted much interest because of the development of efficient data structures to store sparsely filled $A$ matrices and to find the local maxima in them.
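For a fixed radius the 2-D parameter space $A(a,b)$ can be sketched with a sparse accumulator, in the spirit of the remark on storing sparsely filled matrices. Each edge point votes for every centre $(a,b)$ at distance $r$ from it, using $a = x - r\cos\theta$ and $b = y - r\sin\theta$. The use of `collections.Counter`, the 360 sampled angles, rounding to integer cells and counting each point at most once per cell are choices of this sketch.

```python
import math
from collections import Counter

def hough_circle_centres(points, radius, n_angles=360):
    """Sparse accumulator A(a, b) for circles of a fixed radius."""
    votes = Counter()
    for x, y in points:
        cells = set()
        for i in range(n_angles):
            theta = 2 * math.pi * i / n_angles
            a = round(x - radius * math.cos(theta))
            b = round(y - radius * math.sin(theta))
            cells.add((a, b))
        votes.update(cells)   # one vote per point per cell
    return votes

# Four edge points on a circle with centre (10, 10) and radius 5.
points = [(15, 10), (10, 15), (5, 10), (10, 5)]
votes = hough_circle_centres(points, radius=5)
centre_votes = votes[(10, 10)]
```

Only cells that receive at least one vote are stored, so the memory use grows with the number of edge points rather than with the size of the parameter space.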

