1 Local Search (part 2)
Haim Kaplan and Uri Zwick, Algorithms in Action, Tel Aviv University. Last updated: March.

2 The general scheme
Need to define a neighborhood relation on solutions. Need to find a starting solution; we can choose it at random or with another clever algorithm. Need to decide which neighbor to move to when several are “improving”; we can pick one at random or take the best improving move. We then “walk” from the starting solution to a local optimum (see the sketch below).
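A minimal sketch of this scheme, assuming the caller supplies problem-specific `initial`, `neighbors`, and `cost` functions (hypothetical names, not from the slides):

```python
import random

def local_search(initial, neighbors, cost, best_improving=True):
    """Walk from a starting solution to a local optimum."""
    current = initial()
    while True:
        improving = [n for n in neighbors(current) if cost(n) < cost(current)]
        if not improving:
            return current                       # local optimum reached
        if best_improving:
            current = min(improving, key=cost)   # best improving move
        else:
            current = random.choice(improving)   # random improving move
```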

3 Neighborhood size
The neighborhood relation induces a tradeoff between time and quality: more neighbors ⇒ fewer local optima, but more time per step.

4 Back to max-cut
Can improve the solution by considering swaps of 2 items. The neighborhood size increases from $O(n)$ to $O(n^2)$, so an efficient implementation for finding an improving neighbor may be crucial (see the sketch below).
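A sketch of the swap gains for max-cut, assuming an adjacency-set representation `adj` and a side assignment `side[v]` in {0, 1} (illustrative names). Computing the pair gain from the two single-vertex gains plus a correction for the shared edge keeps each evaluation at $O(\deg(u)+\deg(v))$:

```python
def flip_gain(u, side, adj):
    """Change in cut size if u alone switches sides."""
    same = sum(1 for w in adj[u] if side[w] == side[u])
    return same - (len(adj[u]) - same)

def pair_gain(u, v, side, adj):
    """Change in cut size if u and v both switch sides simultaneously."""
    g = flip_gain(u, side, adj) + flip_gain(v, side, adj)
    if v in adj[u]:  # the edge (u, v) was counted by both single-vertex gains
        g += -2 if side[u] == side[v] else 2
    return g
```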

5 Restarts
If the procedure is fast we can run it from many different starting solutions and pick the best solution found. Starting solutions can be random or perturbations of previous local optima (see the sketch below).
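A sketch of the restart loop, assuming a hypothetical `local_search_once` that runs one local search from a fresh (random or perturbed) start and returns the local optimum it reaches:

```python
def with_restarts(local_search_once, cost, tries=1000):
    # run the fast procedure many times and keep the best local optimum
    return min((local_search_once() for _ in range(tries)), key=cost)
```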

6 Example: Min-Bisection
Partition $G$ into 2 sets of equal size, minimizing the number of edges crossing the cut. We’ll see a local search procedure by Kernighan and Lin with a clever choice of neighborhoods. Kernighan and Lin, “An efficient heuristic procedure for partitioning graphs”, Bell System Technical Journal, 1970 (~5000 citations on Google Scholar).

7 Bisection
$(S, \bar S)$: $|S| = n/2$, $\bar S = V \setminus S$.

8 Bisection
$(S, \bar S)$: $|S| = n/2$, $\bar S = V \setminus S$. Find the pair with the best “improvement”.

9 Bisection
$(S, \bar S)$: $|S| = n/2$, $\bar S = V \setminus S$. Find the pair with the best “improvement” and swap it.

10–17 Bisection
$(S, \bar S)$: $|S| = n/2$, $\bar S = V \setminus S$. Iterate this. (These slides animate a sequence of such swaps.)

18 Bisection
(Figure: the changes $\Delta_1, \Delta_2, \Delta_3, \ldots, \Delta_n$ recorded for the successive swaps.)

19 The neighborhood
Swap the blue pair; the change is $\Delta_1$.

20 The neighborhood
Swap the blue pair and the red pair; the change is $\Delta_1 + \Delta_2$.

21 The neighborhood
Swap the blue pair, the red pair, and the pink pair; the change is $\Delta_1 + \Delta_2 + \Delta_3$, and so on.

22 The neighborhood
Let $j$ be the index that minimizes $\sum_{i=1}^{j} \Delta_i$. If $\sum_{i=1}^{j} \Delta_i \le 0$, swap the first $j$ pairs (see the sketch below).
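A sketch of one pass with this neighborhood, assuming hypothetical helpers `swap_delta(u, v)` (the change in cut size if u and v are swapped in the current configuration) and `apply_swap(u, v)` (performs a swap; it is its own inverse):

```python
import itertools

def kl_pass(S, T, swap_delta, apply_swap):
    """One Kernighan-Lin pass: greedily take the best swap (even a worsening
    one), lock the pair, then keep the best prefix of swaps if it helps."""
    S, T = set(S), set(T)
    deltas, swaps = [], []
    while S and T:
        u, v = min(itertools.product(S, T),
                   key=lambda pair: swap_delta(*pair))
        deltas.append(swap_delta(u, v))
        swaps.append((u, v))
        S.remove(u); T.remove(v)          # lock the pair for this pass
        apply_swap(u, v)                  # tentatively perform the swap
    prefix = list(itertools.accumulate(deltas))
    j = min(range(len(prefix)), key=prefix.__getitem__)
    if prefix[j] > 0:                     # no prefix improves the cut
        j = -1
    for u, v in reversed(swaps[j + 1:]):  # undo swaps beyond the best prefix
        apply_swap(u, v)
    return prefix[j] if j >= 0 else 0     # total change in cut size
```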

23 Kernighan-Lin
Unfortunately, little is known about it theoretically, but it is quite effective in experiments.

24 Kernighan-Lin
Histogram of 1000 runs (random start) on a random graph with 500 nodes and $p = 0.01$. Johnson, Aragon, McGeoch, Schevon, “Optimization by simulated annealing: an experimental evaluation, Part I, graph partitioning”, 1989.

25 What is the `local search’ in the experiment?
Use single swaps (as for max-cut) and allow unbalanced partitions. Minimize the cost $|\{(u,v) \mid u \in S,\ v \in \bar S\}| + c\,(|S| - |\bar S|)^2$; typically $c = 0.05$ in their experiments. If the partition is not balanced at the end, greedily fix it: repeatedly move a vertex from the large side to the small side so that the cut increases by the least amount, until the partition is a bisection (see the sketch below).
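A sketch of this penalized local search’s ingredients, assuming adjacency sets `adj` (illustrative names):

```python
def penalized_cost(S, Sbar, adj, c=0.05):
    """Cut size plus the imbalance penalty c * (|S| - |Sbar|)**2."""
    cut = sum(1 for u in S for w in adj[u] if w in Sbar)
    return cut + c * (len(S) - len(Sbar)) ** 2

def greedy_fix(S, Sbar, adj):
    """Move vertices from the large side to the small side, each time picking
    the move that increases the cut the least, until the sides are equal."""
    S, Sbar = set(S), set(Sbar)
    while len(S) != len(Sbar):
        big, small = (S, Sbar) if len(S) > len(Sbar) else (Sbar, S)
        # moving x changes the cut by |N(x) in big| - |N(x) in small|
        u = min(big, key=lambda x: sum(1 for w in adj[x] if w in big)
                                   - sum(1 for w in adj[x] if w in small))
        big.remove(u); small.add(u)
    return S, Sbar
```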

26 Is this the whole story?
Average running times: annealing 6 min, local search 1 sec, KL 3.7 sec. Johnson, Aragon, McGeoch, Schevon, 1989.

27 Take time into account
The best cut in 3.6 million runs of local search was 232, whereas the worst annealing cut among the 1000 runs was 225. So for local search, even when we take running time into account, the results are not as good.

28 Examples that you have seen
The simplex algorithm for linear programming, where the only local optimum is the global one (a nice feature to have). Bellman-Ford single-source shortest paths.

29 Applications in graphics
Image segmentation (2D, 3D), stereo depth estimation, and many others.

30 Image segmentation
Separate foreground (object) from background.

31 Computer graphics
Using tools from computer graphics we compute, for each pixel $v$: $c_b(v)$, the penalty to pay if $v$ is background, and $c_f(v)$, the penalty to pay if $v$ is foreground. For each pair of neighboring pixels $v, w$: $p(v,w)$, the penalty to pay if $v$ and $w$ are of different types.

32 The problem
Find a partition $(F, B)$ of the pixels that achieves
$\min_{F,B} \sum_{v \in B} c_b(v) + \sum_{v \in F} c_f(v) + \sum_{\substack{v \in B,\ w \in F \\ (v,w)\ \text{neighbors}}} p(v,w)$.

33 Use graph cuts
Define a graph in which each pixel is a vertex, connected to the vertices of its neighboring pixels.

34 A grid-like graph
Add two special vertices $s$ and $t$, and connect $s$ and $t$ to all pixel vertices.

35 Capacities
(Figure: edge $(s,v)$ has capacity $c_b(v)$, the penalty if $v$ belongs to the background; edge $(w,t)$ has capacity $c_f(w)$, the penalty if $w$ belongs to the foreground; edges between neighboring pixels have capacity $p(v,w)$.)

36 A minimum cut
Gives the classification of the pixels.

37 A minimum cut
$\min_{S,T} \sum_{v \in T} c_b(v) + \sum_{v \in S} c_f(v) + \sum_{\{(v,w) \mid v \in S,\ w \in T\}} p(v,w)$ (see the sketch below).
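A sketch of the whole construction, using networkx (a library choice, not something the slides prescribe). The s-side of the cut is the foreground, the t-side the background, matching the objective above:

```python
import networkx as nx

def segment(pixels, neighbor_pairs, c_b, c_f, p):
    G = nx.DiGraph()
    for v in pixels:
        G.add_edge('s', v, capacity=c_b(v))  # paid when v lands in T (background)
        G.add_edge(v, 't', capacity=c_f(v))  # paid when v lands in S (foreground)
    for v, w in neighbor_pairs:
        G.add_edge(v, w, capacity=p(v, w))   # paid when v and w are separated
        G.add_edge(w, v, capacity=p(v, w))
    cut_value, (S, T) = nx.minimum_cut(G, 's', 't')
    return S - {'s'}, T - {'t'}              # foreground pixels, background pixels
```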

38 Stereo vision – computing depths
(Figure: real-world point 1 projects to $x_1$ in image 1, taken by the left camera, and to $y_1$ in image 2, taken by the right camera; $Disparity_1 = x_1 - y_1$.)

39 Stereo vision – computing depths
(Figure: real-world point 2 likewise projects to $x_2$ and $y_2$; $Disparity_2 = x_2 - y_2$.)

40 Multi-label problem
Disparity is usually a small number of pixels. We want to label each pixel with its disparity: a multi-label problem.

41 Compute penalties
For each pixel $v$ compute $c_d(v)$, $d = -2, -1, 0, 1, 2$: the penalty we pay if we assign disparity $d$ to $v$. For each pair of neighboring pixels $v, w$ compute $p(v,w)$: the penalty we pay if we assign different disparities to $v$ and $w$.

42 The task
$\min_f \sum_d \sum_{\{v \mid f(v) = d\}} c_d(v) + \sum_{\{(v,w) \mid f(v) \neq f(w)\}} p(v,w)$. (Figure: a pixel grid labeled with disparities.) A cost evaluator is sketched below.
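A sketch of the objective as code, assuming per-label penalty tables `c[d][v]` and a separation-penalty function `p` (illustrative names):

```python
def labeling_cost(f, c, neighbor_pairs, p):
    """Cost of a labeling f: label penalties plus separation penalties."""
    data = sum(c[f[v]][v] for v in f)
    smooth = sum(p(v, w) for v, w in neighbor_pairs if f[v] != f[w])
    return data + smooth
```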

43 Local search
We can try to improve by relabeling individual pixels. We can also restrict ourselves to relabeling an arbitrary subset of the pixels labeled with one of a pair of disparities $\alpha, \beta$.

44 $\alpha\beta$-swap
For each pair of disparities $\alpha, \beta$, compute the best relabeling of the pixels labeled $\alpha$ or $\beta$. Perform the relabeling that improves the most (see the sketch below).
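A sketch of the outer loop, assuming a hypothetical `best_swap(f, a, b)` that returns the best relabeling of the pixels currently labeled a or b (computable with a min cut over just those pixels, analogously to the binary segmentation above):

```python
import itertools

def alpha_beta_swap(f, labels, cost, best_swap):
    """Local search over alpha-beta swaps: stop when no pair improves."""
    while True:
        candidates = [best_swap(f, a, b)
                      for a, b in itertools.combinations(labels, 2)]
        g = min(candidates, key=cost)
        if cost(g) >= cost(f):
            return f            # local optimum for every pair of disparities
        f = g                   # perform the most improving relabeling
```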

45 $\alpha\beta$-swap
(Figure: the disparity grid, highlighting the pixels labeled $-1$ or $0$.)

46 $\alpha\beta$-swap
(Figure: the min-cut graph over the pixels labeled $-1$ or $0$: terminal capacities $c_{-1}(v)$ and $c_0(w)$, neighbor edges of capacity $p(v,w)$.)

47 Theory?

48 Bad example for $\alpha\beta$-swap
(Figure: a small instance with elements labeled $t$ and $z$.)

49 Bad example for $\alpha\beta$-swap
(Figure: the optimal labeling $OPT$.)

50 Bad example for $\alpha\beta$-swap
(Figure: a local minimum of the $\alpha\beta$-swap neighborhood.)

51 $\alpha$-expansion
For each disparity $\alpha$, compute the best relabeling obtained by flipping pixels not labeled $\alpha$ to $\alpha$. Perform the relabeling that improves the most.

52 0-expansion
(Figure: terminal edges of capacity $c_0(w)$, and edges of capacity $\infty$ that keep the pixels already labeled 0 on the 0 side.)

53 0-expansion
(Figure: terminal capacities $c_2(v)$, $c_{-1}(u)$, $c_0(w)$ added for the pixels’ current labels.)

54–55 0-expansion
(Figure: edges of capacity $p(v,w)$ added between neighboring pixels.)

56 0-expansion
Not good enough: we must also pay $p(v,w)$ when $v$ and $w$ both end up in $T$ (keeping their different labels).

57 0-expansion
(Figure: fixed by adding an auxiliary vertex between neighbors whose current labels differ, with edges of capacity $p(v,w)$.)

58 Recap, 𝛼-expansion For a fixed 𝛼 we defined a graph in which the min st-cut corresponds to the best relabeling of vertices with label different from 𝛼 to label 𝛼

59 Recap, $\alpha$-expansion
Let $A$ be the set of vertices labeled $\alpha$. Any subset $B$ of vertices not labeled $\alpha$ corresponds to an $st$-cut $(S,T)$ with $S = \{s\} \cup A \cup B$. The capacity of this cut is exactly the cost of the labeling after changing the labels of the vertices in $B$ to $\alpha$; all other cuts have larger cost. A sketch of the construction follows.
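A sketch of the construction for the uniform separation penalty used on these slides, with networkx (a library choice) and penalty tables `c[d][v]`; the auxiliary vertex implements the fix from slide 57:

```python
import networkx as nx

def best_expansion(f, alpha, pixels, neighbor_pairs, c, p):
    """Best alpha-expansion of the labeling f, via a minimum st-cut."""
    G = nx.DiGraph()
    for v in pixels:
        # cut when v lands in S, i.e. v is (re)labeled alpha
        G.add_edge(v, 't', capacity=c[alpha][v])
        # cut when v lands in T and keeps its old label; infinite capacity
        # forces the vertices already labeled alpha onto the s-side
        G.add_edge('s', v,
                   capacity=float('inf') if f[v] == alpha else c[f[v]][v])
    for v, w in neighbor_pairs:
        if f[v] == f[w]:
            G.add_edge(v, w, capacity=p)  # pay p only if the cut separates them
            G.add_edge(w, v, capacity=p)
        else:
            a = ('aux', v, w)             # pay p unless both move to alpha
            G.add_edge('s', a, capacity=p)
            G.add_edge(a, v, capacity=p)
            G.add_edge(a, w, capacity=p)
    _, (S, _) = nx.minimum_cut(G, 's', 't')
    return {v: alpha if v in S else f[v] for v in pixels}
```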

60 Theory?

61 Analysis
(Figure: an optimal labeling $OPT$ and a local minimum $L$, shown as disparity grids.)

62 Analysis
For each label $d$, let $L_d$ be the labeling obtained from $L$ by relabeling to $d$ exactly the pixels that $OPT$ labels $d$ (a $d$-expansion move). (Figure: $L_1$ next to $L$.)

63 Analysis
Since $L$ is a local minimum with respect to $\alpha$-expansion, $0 \le cost(L_1) - cost(L)$.

64–66 Analysis
Bound $cost(L_1) - cost(L)$ from above: on the relabeled pixels the $c$-terms change from those of $L$ to those of $OPT$, each edge on the boundary of the relabeled region contributes at most $p(v,w)$, and the separation penalties $L$ paid inside the region are saved.

67 Analysis
The same bound holds for $L_2$: $0 \le cost(L_2) - cost(L) \le$ the corresponding $c$- and $p$-terms.

68–69 Analysis
Summing these inequalities over all labels $d$:
$0 \le \sum_d \sum_{\{v \mid opt(v) = d\}} c_d(v) - \sum_d \sum_{\{v \mid L(v) = d\}} c_d(v) + 2 \sum_{\{(v,w) \mid opt(v) \neq opt(w)\}} p(v,w) - 2 \sum_{\{(v,w) \mid L(v) \neq L(w),\ opt(v) \neq opt(w)\}} p(v,w) - \sum_{\{(v,w) \mid L(v) \neq L(w),\ opt(v) = opt(w)\}} p(v,w) \le 2\,OPT - L$.
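Spelling out the last step (an added derivation consistent with the inequality above; $OPT_c$, $OPT_p$ and $L_c$, $L_p$ denote the label-penalty and separation-penalty parts of the two costs):

```latex
\begin{align*}
0 &\le (OPT_c - L_c) + 2\,OPT_p
      - 2\sum_{\substack{L(v)\neq L(w)\\ opt(v)\neq opt(w)}} p(v,w)
      - \sum_{\substack{L(v)\neq L(w)\\ opt(v)= opt(w)}} p(v,w) \\
  &\le (OPT_c - L_c) + 2\,OPT_p - L_p \;\le\; 2\,OPT - L.
\end{align*}
```

The two subtracted sums together cover every edge with $L(v) \neq L(w)$ at least once, so they total at least $L_p$; hence $cost(L) \le 2\,cost(OPT)$: a local optimum of $\alpha$-expansion is a 2-approximation for this uniform-penalty objective.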

