
1 Interactive Image Segmentation

2 Image Segmentation Problem
Segmentation is the process of partitioning an image into multiple non-overlapping segments.
The goal of segmentation is to simplify and/or change the representation of an image into something more meaningful and easier to use.
Image segmentation is a traditional computer vision problem. However, it is also an “unsolved” problem.
Problems of automatic segmentation:
–Different segmentation methods favor different criteria
–Different applications and different people have their own segmentation criteria
–A universal segmentation algorithm does not exist

3 Interactive Image Segmentation
Different users have different preferences for segmentation.
Why don’t we ask the user to guide the segmentation process?
The goal of interactive image segmentation is to provide an intelligent tool that minimizes the amount of work the user has to do.
Topics in this class
–Image Snapping [Siggraph’95]
–Intelligent Scissors [Siggraph’95]
–Lazy Snapping [Siggraph’04]
Extra reading materials
–Grabcut [Siggraph’04]
–Paint Selection [Siggraph’09]
–Video Snap Cut [Siggraph’09]

4 Image Snapping
SIGGRAPH, 1995
–Michael Gleicher, Apple Computer in 95, now Assoc. Prof. at U. Wisconsin
Idea
Cursor snapping is a technique that helps the mouse cursor “snap” to a widget.
The cursor is the pointer used in “direct manipulation” user interfaces (Windows is a direct manipulation interface).
In many graphics applications (esp. CAD/CAM), the cursor “snaps” when it gets close to a button or object.
This means the user does not have to be so accurate.
Image snapping
Have the mouse cursor snap to image features
–Same idea: you don’t have to be very precise

5 Image Snapping – figures from paper
[Figure: mouse position vs. snapped position]

6 Basic Idea?
The user clicks on an image point. How do we find a “feature” to snap to?

7 Basic Idea?
Naïve approach?
Search in a “spiral” pattern, increasing in distance from the point, until “something” is found.
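The naïve search can be sketched as follows. This is only an illustration under assumptions that are not in the slides: the feature map is a list of rows, “something” means a value at or above a `threshold`, and the spiral is approximated by square rings of growing radius. The name `naive_snap` and its parameters are hypothetical.

```python
def naive_snap(feature, x, y, threshold, max_radius):
    """Return the first (x, y) found with feature >= threshold, or None.

    Rings are visited in order of increasing Chebyshev distance from the
    click, so the search stops at the first feature it meets.
    """
    h, w = len(feature), len(feature[0])
    for r in range(max_radius + 1):
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                if max(abs(dx), abs(dy)) != r:
                    continue  # only pixels on the ring of radius r
                px, py = x + dx, y + dy
                if 0 <= px < w and 0 <= py < h and feature[py][px] >= threshold:
                    return (px, py)
    return None  # preset stopping criterion reached: nothing found


grid = [[0] * 5 for _ in range(5)]
grid[1][3] = 9  # a single strong feature
print(naive_snap(grid, 1, 1, threshold=5, max_radius=3))
```

Note how the sketch needs both a preset threshold and a preset radius, and accepts whatever it meets first; these are exactly the weaknesses listed on the next slide.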

8 Problems with Naïve Approach
No method for rejecting noise
Needs a preset stopping criterion
No way to trade off certainty and size against distance
Stops at the first feature it finds; a better feature may be an equal distance away
CAN WE DO BETTER?

9 Image Snapping Approach
Consider the “feature map” as a height field.
Follow the feature map gradient (1st derivative) from the point position until we find a suitable feature.
Analogy: the point is a ball. Drop it and let it roll down the hill until it finds a stopping place.
1D example: http://mathworld.wolfram.com/MethodofSteepestDescent.html
Similar in idea to numerical “gradient descent” methods.

10 Snapping Approach
The search follows the gradient.
The stopping criterion is the point where we can no longer search in a downward direction.
Blur the image to make the gradients smooth. Sharp images can cause problems.

11 Blurring Image First
A blurred image helps remove noise.
We can blur by different amounts; this helps control the level of detail we want to snap to.
[Figure: input, blurred, blurred more]

12 Basic Algorithm
Input image -> feature map
–Blur image -> apply Sobel mask -> get gradient map
Finding a feature (snapping)
When the mouse is clicked
–Follow the gradient path until we find a “minimum”
–A minimum is where there are either no more gradients (constant region) or the gradients are very different (line)
–If no minimum is found within a certain distance, stop the search too (nothing to snap to)
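The snapping step above can be sketched as a discrete “ball rolling downhill”. This is a simplified illustration, not the paper’s implementation: it assumes a precomputed `cost` map (for example, the negated gradient magnitude of the blurred image, so features are low points), moves to the lowest of the 8 neighbors at each step, and uses a hypothetical `max_steps` as the “certain distance”.

```python
def snap(cost, x, y, max_steps=20):
    """Follow the steepest downhill neighbor from (x, y).

    Returns the local minimum reached, or None if no minimum is found
    within max_steps moves (nothing to snap to).
    """
    h, w = len(cost), len(cost[0])
    for _ in range(max_steps):
        best = (cost[y][x], x, y)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if 0 <= nx < w and 0 <= ny < h and cost[ny][nx] < best[0]:
                    best = (cost[ny][nx], nx, ny)
        if (best[1], best[2]) == (x, y):
            return (x, y)  # no neighbor is lower: we have stopped rolling
        x, y = best[1], best[2]
    return None


valley = [[5, 4, 3, 2, 1, 2, 3]]  # 1D height field with its minimum at x = 4
print(snap(valley, 0, 0))               # rolls down to the minimum
print(snap(valley, 0, 0, max_steps=2))  # gives up before reaching it
```

Because this sketch moves pixel by pixel, it always stops on a pixel; following the continuous gradient instead is what leads to the sub-pixel stopping positions discussed on the following slides.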

13 Image Snapping
[Figure: gradient map] Click here and follow the gradient.
One problem is that the stopping position may lie between two pixels (that is, not aligned on a pixel).
Note that the original image has been blurred: the 2-pixel black line is now gray.

14 Alignment issues We can compute “sub-pixel” accurate snapping. Or we can force the snapping to be on a pixel.

15 Sub-pixel Snapping
Green line: what the user draws
Yellow line: sub-pixel snapping

16 Adding Hysteresis
One problem is that the snapping has no “memory” of where it last snapped.
If you are tracing, this can cause the cursor to jump around.
A kind of “hysteresis” can be introduced that gives preference to the last feature snapped to.
Hysteresis def: “History dependence of a physical system”

17 Example
[Figure: without hysteresis effect; with hysteresis effect]

18 How to Add Hysteresis?
The idea is to pull the snapped cursor toward the input cursor instead of snapping from the input cursor.
–Potential problem
If you pull the cursor too much, it may jump to the wrong place.
If you pull too little, it will snap back to the starting place.
–Solution
Initialize: Δp = 1, initial snapped cursor position (x, y)
Step 1: Pull Δp pixels towards the user cursor.
If it snaps back to (x, y): Δp = Δp + 1, go to Step 1.
Else: you have the new (x, y) position.
[Figure: old position, move by Δp, new position]
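The pull loop can be sketched as below. The names are hypothetical: `snap_fn` stands for any snapping routine that maps a start position to a snapped position. Stopping the pull once it reaches the user cursor is an assumption added here so that the loop always terminates.

```python
import math


def pull_with_hysteresis(snap_fn, snapped, cursor):
    """Pull the snapped cursor toward the user cursor in growing steps.

    Returns the first snapped position that differs from `snapped`; if the
    pull reaches the user cursor first, snaps from the cursor itself.
    """
    sx, sy = snapped
    cx, cy = cursor
    dist = math.hypot(cx - sx, cy - sy)
    dp = 1  # pull distance in pixels
    while dp <= dist:
        t = dp / dist
        probe = (sx + t * (cx - sx), sy + t * (cy - sy))
        new = snap_fn(probe)
        if new != snapped:
            return new  # escaped the old feature: this is the new (x, y)
        dp += 1  # snapped back, so pull a little further
    return snap_fn(cursor)


# Toy snapping routine with two features, at x = 0 and x = 10.
def nearest_feature(p):
    return (0, 0) if p[0] < 5 else (10, 0)


print(pull_with_hysteresis(nearest_feature, (0, 0), (8, 0)))
print(pull_with_hysteresis(nearest_feature, (0, 0), (3, 0)))
```

In the toy example, a cursor at x = 8 pulls the snap over to the second feature, while a cursor at x = 3 stays attached to the feature it last snapped to.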

19 Image Snapping Summary
An interesting idea for aiding the user’s image manipulation
Reduces the level of accuracy required from the user
Aids in “tracing” objects (hysteresis + snapping)
This is one of the early “Graphics meets Vision” papers
–I recommend you read it; you’ll see how the author describes many of the vision terms
–Now (more than 10 years later), SIGGRAPH readers are assumed to have a much better background in vision/image processing

20 Intelligent Scissors
SIGGRAPH, 1995
–Eric Mortensen and William A. Barrett
Idea
–Treat segmentation as a graph-search problem
–Between two points (which change dynamically)
–Find the shortest path on the graph
–Path edge costs are related to image content

21 Segmentation as a graph path problem
Treat the entire image as a graph.
Pixels are nodes, with eight edges, e, from each pixel to its neighbors.
After clicking on two points, A and B, you want to find the minimum-cost path between them.

22 Edge costs?
Edge costs are based on the image content, in particular gradient (edge) information:
–Thresholded 2nd derivative
–1st derivative magnitude (i.e. edge strength)
–Similarity of gradient direction between two nodes

23 Pre-processing
Edge costs are pre-computed when the image is loaded.
Requires one 3x3 Laplacian filter and two 3x3 gradient (Sobel) filters to be applied.
Some additional calculations are needed to compute the angle cost (see paper).
As with “Image Snapping”, the authors mention blurring the image first.
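One way to write the three ingredients as a single edge cost from pixel p to a neighboring pixel q is a weighted sum. The notation below is a sketch: the symbols and the exact form of each term are assumptions based on the three features listed above, and the weights are left symbolic (see the paper for the authors’ values).

```latex
l(p, q) = w_Z \, f_Z(q) + w_G \, f_G(q) + w_D \, f_D(p, q)
% f_Z(q): thresholded 2nd derivative (Laplacian); 0 on a zero-crossing, 1 otherwise
% f_G(q): 1st-derivative term, e.g. 1 - G(q) / \max G, so strong edges are cheap
% f_D(p, q): penalty for dissimilar gradient directions at p and q
```

Each term is small where the image looks like an edge, so a minimum-cost path prefers to run along object boundaries.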

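24 Shortest path graph algorithm.
Note that this produces a spanning tree from the seed point s to all nodes in the graph!
It is a shortest-path tree: every node’s path back to s has minimum cost, which in general is not the same thing as a minimum spanning tree.
For comparison, also see: http://en.wikipedia.org/wiki/Minimum_spanning_tree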
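The expansion can be sketched with Dijkstra’s algorithm on the pixel grid. This is a minimal illustration, not the authors’ code: it assumes a per-pixel `cost` map (the cost of stepping onto a pixel), weights diagonal steps by sqrt(2), and uses a binary heap in place of the sorted list L.

```python
import heapq
import math


def shortest_path_tree(cost, seed):
    """Return (dist, parent) for every pixel reachable from `seed`.

    dist[p] is the minimum path cost from seed to p; parent[p] points one
    step back toward the seed along that path.
    """
    h, w = len(cost), len(cost[0])
    dist = {seed: 0.0}
    parent = {seed: None}
    done = set()
    heap = [(0.0, seed)]  # plays the role of the sorted list L
    while heap:
        d, (x, y) = heapq.heappop(heap)  # remove the lowest-cost node
        if (x, y) in done:
            continue  # stale entry: this pixel was already expanded
        done.add((x, y))
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                q = (x + dx, y + dy)
                if (dx, dy) == (0, 0) or q in done:
                    continue
                if not (0 <= q[0] < w and 0 <= q[1] < h):
                    continue
                scale = math.sqrt(2) if dx and dy else 1.0
                new_d = d + cost[q[1]][q[0]] * scale
                if new_d < dist.get(q, math.inf):
                    dist[q] = new_d  # a cheaper route was found
                    parent[q] = (x, y)
                    heapq.heappush(heap, (new_d, q))
    return dist, parent


costs = [[1, 1, 1],
         [9, 9, 1],
         [1, 1, 1]]
dist, parent = shortest_path_tree(costs, (0, 0))
print(dist[(1, 1)], parent[(1, 1)])
```

In the small example, the center pixel is first reached diagonally from the seed and later re-routed through its cheaper upper neighbor, the same kind of cost update the expansion example on the next slides points out.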

25 Example of spanning tree expansion
Initial input (with costs). Technically, costs should be on edges, not pixels.
First step of the algorithm: all edge costs to the seed point are computed.
[Note that diagonals are weighted by sqrt(2), thus 8 becomes 8*sqrt(2) ≈ 11]
List L now has 8 nodes on it (see algorithm):
L = {1, 2, 4, 7, 7, 7, 11, 13}
// these represent costs, but they are also linked back
// to the pixels that they represent. (list is sorted!)

26 Example of spanning tree expansion
The minimum-cost node is at the top of the list; let’s call it “r”. Now all costs through “r” are computed (recompute all costs).
L = {1, 2, 4, 7, 7, 7, 11, 13}
Expansion 1: Remove the lowest value, and expand to its unvisited neighbors.
Expansion 2: L = {2, 4, 5, 6, 6, 7, 7, 9, 13, 14} // new list afterwards. Look very carefully.
Notice that some pixels changed their cost: it is cheaper for them to go this way than along the diagonal (see the first expansion). Their costs have to change in list L too.

27 Next Several Expansions
Next several expansions by “2”, “4”, “5”.
Final expansions. Note that we now have a path from every pixel back to the seed “s”.

28 Live Wire
The user clicks a seed point.
–The spanning tree is quickly calculated.
The user then moves the mouse to “free points”.
The “wire” follows the path from the free point back to the seed, read directly from the spanning tree.
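Reading the wire off the tree needs no new search, which is why it can follow the mouse in real time. A minimal sketch, assuming the tree is stored as a hypothetical `parent` dictionary that maps each pixel to the next pixel toward the seed (and the seed to `None`):

```python
def trace_path(parent, free_point):
    """Walk parent pointers from the free point back to the seed."""
    path = []
    node = free_point
    while node is not None:
        path.append(node)
        node = parent[node]
    return path  # ordered from the free point to the seed


# Tiny hand-made tree: seed (0, 0) -> (1, 0) -> (2, 1)
parent = {(0, 0): None, (1, 0): (0, 0), (2, 1): (1, 0)}
print(trace_path(parent, (2, 1)))
```

The cost of each update is proportional to the length of the wire, not to the size of the image.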

29 Have you seen this before? Photoshop “magnetic lasso”

30 Some tricks for improvement
It is hard to specify the initial seed point exactly on the object’s boundary
–So “snapping” is used to snap the first seed point
Periodically, a new seed point is automatically introduced (which re-computes the spanning tree)
–This is called “cooling”

31 More tricks
Dynamic training (or feature cost adjustment)
The wire keeps snapping to the man’s shoulder because it is darker (stronger gradient).
Adjust (normalize) all gradients close to the live wire (in a local region) to strengthen them.
This allows the wire to snap back to the face.

32 Some results (final output)

33 Intelligent Scissors Summary
Effective tool for object extraction
Combines real-time graph search with user interaction (and updating)
Adopted by Photoshop (with variations, I’m sure)
Is it better than Image Snapping?
–Notice they were published at the same conference in the same year.

34 Lazy Snapping
SIGGRAPH, 2004
–Yin Li, Jian Sun, Chi-Keung Tang, Harry Shum
Idea:
–Problem with Image Snapping and Intelligent Scissors?
You still need to trace the object.
This takes time.
–Can we do better?
How about just supplying very rough scribbles to denote the background and foreground? (this is lazy)
[Figure: input, markup, output]

35 Treat object extraction as a 2-class classification problem
Each pixel in the image should be labeled as either “background” or “foreground”.
[Figure: background, foreground, result]

36 Training examples via markup
Pixels along the user’s scribbles provide “supervised” RGB training data.
Blue = background (B), yellow = foreground (F)
Result: n F pixels and m B pixels
[Figure: F and B samples plotted in R, G, B color space]

37 Simple 1-NN Classifier
Given an unlabeled pixel, C(i), decide whether it is background or foreground.
Compute the new pixel’s RGB Euclidean distance (L2 norm) to all labeled B pixels and all labeled F pixels. Select the nearest from each.
[Figure: unlabeled pixel “?” among F and B samples in R, G, B color space]
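A minimal sketch of the 1-NN step, assuming colors are RGB tuples and the scribble samples are plain lists. The function names are hypothetical, and breaking ties in favor of foreground is an arbitrary choice made here.

```python
import math


def nearest_distance(color, samples):
    """Smallest Euclidean (L2) distance from `color` to any sample."""
    return min(math.dist(color, s) for s in samples)


def classify_1nn(color, fg_samples, bg_samples):
    """Return ('F' or 'B', d_F, d_B) for one unlabeled pixel color."""
    d_f = nearest_distance(color, fg_samples)
    d_b = nearest_distance(color, bg_samples)
    label = "F" if d_f <= d_b else "B"
    return label, d_f, d_b


fg = [(250, 220, 30), (240, 210, 40)]  # pixels under the foreground scribble
bg = [(20, 30, 200), (30, 40, 180)]    # pixels under the background scribble
print(classify_1nn((235, 200, 50), fg, bg))
```

A brute-force scan over every scribble pixel is slow for large images; clustering the samples first would reduce the number of comparisons, at the cost of some precision.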

38 Assigning a score
For each pixel, we can assign a score for it being foreground and a score for it being background.
We will use these scores to “minimize” a function, so the lower the score, the more confident we are that a pixel belongs to a class.
Notation for the following equation:
–x_i is a pixel label (not its color): 1 = foreground, 0 = background
–E1 is the score
–F, B represent the training data (already labeled); U are unlabeled/uncertain pixels
Read carefully: a pixel that is already labeled as foreground has a score of 0 for being foreground and an infinite score for being labeled background. Background pixels are defined similarly.
[Equation: E1 for pixels in F, in B, and in the uncertain regions U]
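The scoring rule can be sketched as a small function. The two fixed cases follow the description above. The uncertain case is an assumption: it normalizes the two nearest-neighbor distances d_F and d_B so that the closer class gets the lower score, which is how the Lazy Snapping paper is usually summarized; check the paper for the exact definition.

```python
import math


def data_cost(kind, d_f=0.0, d_b=0.0):
    """Return (E1(x_i = 0), E1(x_i = 1)), indexed by label.

    kind is 'F' or 'B' for scribbled pixels and 'U' for uncertain ones;
    d_f and d_b are the distances to the nearest F and B samples.
    """
    if kind == "F":
        return (math.inf, 0.0)  # must be foreground
    if kind == "B":
        return (0.0, math.inf)  # must be background
    total = d_f + d_b
    if total == 0:
        return (0.5, 0.5)  # equally close to both: no preference
    # Assumed form: E1(1) = d_F / (d_F + d_B), E1(0) = d_B / (d_F + d_B)
    return (d_b / total, d_f / total)


print(data_cost("U", d_f=1.0, d_b=3.0))  # closer to foreground samples
```

In the example, the pixel is three times closer to a foreground sample, so labeling it foreground (index 1) costs 0.25 and labeling it background costs 0.75.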

39 Adding a Markov Random Field
The per-pixel score is not enough.
To perform the final labeling, an MRF is used; this enforces spatial constraints.
Excellent MRF code: http://vision.middlebury.edu/MRF/
This is a graph {V, E}: V = nodes (vertices), E = edges.
MRF nodes: the cost for labeling a node is E1(x_i) (as defined on the previous slide). The node cost is often called the “data cost” or “likelihood energy”.
MRF edges: each edge has two vertices, x_i and x_j. The cost of an edge depends on what labels are assigned to x_i and x_j. We will call this cost E2(x_i, x_j) (defined on the next slide). The edge cost is often called the “smoothness term”, “smoothness prior”, or “prior energy”.

40 Edge Costs
E2(x_i, x_j) = |x_i - x_j| * 1/[Color Difference]
Small color difference = large cost; large color difference = small cost. (Ask yourself why?)
Possible edge configurations and costs (Configurations 1 to 3 in the figure):
–F, F: |1-1| = 0, so cost = 0
–B, B: |0-0| = 0, so cost = 0
–F, B: cost = 1/[Color Difference]
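A minimal sketch of this edge cost. Two details are assumptions rather than slide content: the color difference is taken as the squared RGB distance, and 1 is added to the denominator so that identical colors do not divide by zero.

```python
def edge_cost(label_i, label_j, color_i, color_j):
    """Smoothness cost for two neighboring pixels.

    Zero when the labels agree; otherwise inversely related to how
    different the two colors are.
    """
    diff = sum((a - b) ** 2 for a, b in zip(color_i, color_j))
    return abs(label_i - label_j) / (diff + 1.0)


print(edge_cost(1, 1, (10, 10, 10), (200, 0, 0)))     # same label
print(edge_cost(1, 0, (10, 10, 10), (10, 10, 10)))    # cut through flat color
print(edge_cost(1, 0, (0, 0, 0), (3, 4, 0)))          # cut across a color change
```

This also answers the “why”: placing the foreground/background boundary between two similar pixels is expensive, so the minimization prefers to put the boundary where the colors change sharply, which is usually the true object edge.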

41 How Edge Costs Work
If the color difference is large, then cost(B, F) = small.
If the color difference is small, then cost(B, F) = large.
For an unknown node x_10 between x_9 and x_11, consider E(x_9, x_10) and E(x_10, x_11), and label x_10 with the value that results in the smallest cost.
[Figure: chains of nodes labeled B, F, and “?”]

42 Solving MRF
Put all of these costs together and find the optimal labeling for the whole network.
Remember, some points are already labeled (from markup), so they are fixed.
The solution is the label set that minimizes the cost function E(X).
The solution is often an approximation.
There are many approaches to minimizing the MRF energy E(X).
[Figure: grid of nodes, a few labeled B or F, the rest “?”]
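To make “the label set that minimizes E(X)” concrete, the sketch below tries every labeling of a tiny three-node chain and keeps the cheapest. The energy form E(X) = sum of node costs + lam * sum of edge costs, the weight `lam`, and all numbers are illustrative assumptions. Exhaustive search only works for a handful of nodes; Lazy Snapping itself uses a graph cut (max-flow) solver.

```python
import itertools
import math


def total_energy(labels, data, edges, lam=1.0):
    """E(X): node costs plus lam times the costs of edges whose labels differ."""
    energy = sum(data[i][x] for i, x in enumerate(labels))
    energy += lam * sum(w * abs(labels[i] - labels[j]) for i, j, w in edges)
    return energy


def brute_force_labeling(data, edges, lam=1.0):
    """Try all 2^n labelings and return the one with minimum energy."""
    n = len(data)
    return min(itertools.product((0, 1), repeat=n),
               key=lambda X: total_energy(X, data, edges, lam))


# data[i] = (cost of label 0 = background, cost of label 1 = foreground)
data = [(0.0, math.inf),   # scribbled as background: fixed
        (0.5, 0.5),        # uncertain pixel with no color preference
        (math.inf, 0.0)]   # scribbled as foreground: fixed
# (i, j, weight): weight is large when the two colors are similar
edges = [(0, 1, 1.0),      # node 1 looks like node 0
         (1, 2, 0.01)]     # strong color change between nodes 1 and 2
print(brute_force_labeling(data, edges))
```

The uncertain middle node has no preference of its own, so the edge costs decide: it takes the label of the neighbor it resembles, and the boundary falls on the strong color change.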

43 Speeding up the computation
Labeling an MRF pixel by pixel is slow
–Will not allow for interactive rates
Speed up?
–First segment the image into larger regions
–Use the “watershed algorithm” to pre-segment
–Nodes are the centers of the watershed regions
Per-pixel graph: here, each pixel is a V, and there are edges between all neighboring pixels.
Region graph: here, only the centers of the segmented regions are V, and the edges connect adjacent regions.
The presumption is that the pre-segmentation preserves boundaries well.
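Building the smaller graph can be sketched as follows, assuming the watershed step has already produced a `labels` map that gives a region id for every pixel (the watershed itself is not shown). Regions become nodes, and two regions are connected when any of their pixels touch; using 4-connectivity here is an arbitrary choice.

```python
def region_graph(labels):
    """Return (nodes, edges) of the region adjacency graph.

    nodes is the set of region ids; edges is a set of (a, b) pairs with
    a < b for regions that share a horizontal or vertical pixel border.
    """
    h, w = len(labels), len(labels[0])
    nodes = set()
    edges = set()
    for y in range(h):
        for x in range(w):
            a = labels[y][x]
            nodes.add(a)
            # Looking right and down visits every neighboring pair once.
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if nx < w and ny < h:
                    b = labels[ny][nx]
                    if a != b:
                        edges.add((min(a, b), max(a, b)))
    return nodes, edges


labels = [[0, 0, 1],
          [0, 2, 1],
          [2, 2, 1]]
print(region_graph(labels))
```

The nine pixels collapse to three nodes and three edges. On a real image the reduction is far larger, which is what makes interactive rates possible.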

44 Clean up
The classification using the MRF is not perfect.
Any mistakes can be corrected manually.
Li et al. introduce a “boundary editing” approach to help
–It allows the user to draw the boundary; pixels near the boundary are re-labeled
–The edge energy is modified to incorporate the drawn boundary
Only pixels within the “yellowish” region are processed.
(See the paper for more details; you should be able to understand it based on these notes.)

45 Final Results Demo.

46 Lazy Snapping Summary
Scribble-based segmentation
–Very fast and intuitive
–Avoids having to get too close to the edge
This scribble user interface has been used in many later applications.
Extended to Video Snapping at the following year’s Siggraph.
A follow-up work, “Paint Selection”, based on instant feedback and multi-core computation, was published at this year’s Siggraph.

47 Summary
We’ve discussed some computer-aided/user-assisted approaches for interactive image segmentation.
They combine computer vision/image processing with user assistance
–Some people are calling this “Interactive Computer Vision”
Greatly helps the processing of photos
