CSCI 256 Data Structures and Algorithm Analysis Lecture 14 Some slides by Kevin Wayne copyright 2005, Pearson Addison Wesley all rights reserved, and some.


1 CSCI 256 Data Structures and Algorithm Analysis Lecture 14 Some slides by Kevin Wayne copyright 2005, Pearson Addison Wesley all rights reserved, and some by Iker Gondra

2 Elements of DP
From an engineering perspective, when should we look for a DP solution to a problem?
–Optimal substructure: The first step in solving an optimization problem by DP is to characterize the structure of an optimal solution. A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems
–Overlapping subproblems: The space of subproblems must be “small” in the sense that a recursive algorithm for the problem solves the same subproblems over and over again, rather than always generating new subproblems. Typically, the total number of distinct subproblems is polynomial in the input size. DP algorithms take advantage of this by solving each subproblem once and storing the solution in a table

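3 Least Squares
Least squares
–Foundational problem in statistics and numerical analysis
–Given n points in the plane: (x_1, y_1), (x_2, y_2), ..., (x_n, y_n)
–Find a line y = ax + b that minimizes the sum of the squared error
$$SSE = \sum_{i=1}^{n} (y_i - a x_i - b)^2$$
–Solution: Calculus ⇒ min error is achieved when
$$a = \frac{n \sum_i x_i y_i - \left(\sum_i x_i\right)\left(\sum_i y_i\right)}{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2}, \qquad b = \frac{\sum_i y_i - a \sum_i x_i}{n}$$
[Figure: plot with axes x and y]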
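The closed-form fit can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard least squares formulas for a and b; the name `fit_line` and the horizontal-line fallback for a zero denominator (a single point, or all x equal) are assumptions made here, not part of the slides.

```python
def fit_line(points):
    """Return (a, b, sse) for the least squares line y = a*x + b.

    points is a list of (x, y) pairs. Hypothetical helper for illustration.
    """
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)

    denom = n * sxx - sx * sx
    if denom == 0:
        # Assumption: with no spread in x, fall back to a horizontal line
        # through the mean of the y values.
        a = 0.0
    else:
        a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n

    # Sum of squared vertical deviations from the fitted line.
    sse = sum((y - a * x - b) ** 2 for x, y in points)
    return a, b, sse
```

For example, `fit_line([(0, 1), (1, 3), (2, 5)])` recovers the line y = 2x + 1 with zero error, since the three points are collinear.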

4 Least Squares Solution? Sensible?? [Figure: plot with axes x and y]

5 Segmented Least Squares
Segmented least squares (first attempt)
–Points lie roughly on a sequence of several line segments
–Given n points in the plane (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) with x_1 < x_2 < ... < x_n, find a sequence of lines that minimizes the SSE, which we could call the error
[Figure: plot with axes x and y]

6 Segmented Least Squares: how many line segments should we choose?
To optimize, we want to assign a greater penalty for a larger number of segments as well as for the error (the squared deviations of the points from their corresponding lines).
Penalty of a partition is the sum of:
–the number of segments into which we partition the points times a given multiplier, c
–for each segment, the error value of the optimal line through that segment
This is a partitioning problem. It is an important problem in data mining and statistics known as change detection: given a sequence of data points, identify a few points in the sequence at which a discrete change occurs (in this case a change from one linear approximation to another)

7 Segmented Least Squares Goal in segmented Least Squares Problem: find a partition of minimal penalty

8 What is the optimal linear interpolation with two line segments?

9 Optimal interpolation with two segments Give an equation for the minimum total least squares error of a fit to p_1, …, p_n that uses two line segments. Let E_{i,j} be the least squares error for the optimal line through p_i, ..., p_j (DONE IN CLASS)

10 What is the optimal linear interpolation with three line segments?

11 Optimal interpolation with three segments Give an equation for the minimum total least squares error of a fit to p_1, …, p_n that uses three line segments. Let E_{i,j} be the least squares error for the optimal line through p_i, ..., p_j. Need to find i and j, with 1 ≤ i < j < n, which minimize (E_{1,i} + E_{i+1,j} + E_{j+1,n}) (Note we haven’t included a penalty term accounting for the number of segments) Can we do this recursively?

12 What is the optimal linear interpolation with n line segments?

13 Segmented Least Squares
Segmented least squares
–Points lie roughly on a sequence of several line segments
–Given n points in the plane (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) with x_1 < x_2 < ... < x_n, find a sequence of lines that minimizes f(x)
Question: What's a reasonable choice for f(x) to balance accuracy (goodness of fit) and parsimony (number of lines)?
[Figure: plot with axes x and y]

14 Segmented Least Squares
Segmented least squares
–Points lie roughly on a sequence of several line segments
–Given n points in the plane (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) with x_1 < x_2 < ... < x_n, find a sequence of lines that minimizes a tradeoff between:
  –E, the sum of the sums of the squared errors in each segment
  –L, the number of lines
Tradeoff function: E + cL, for some constant c > 0
[Figure: plot with axes x and y]
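To make the tradeoff concrete, here is a small Python sketch that evaluates E + cL for one given partition of the points. The helper names `segment_error` and `partition_penalty`, and the convention of describing a partition as inclusive 0-based (start, end) index pairs, are assumptions chosen for illustration.

```python
def segment_error(points):
    """Sum of squared errors of the least squares line through points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    denom = n * sxx - sx * sx
    # Assumption: a single point (denom == 0) is fit by a horizontal line.
    a = (n * sxy - sx * sy) / denom if denom != 0 else 0.0
    b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in points)


def partition_penalty(points, segments, c):
    """Tradeoff E + c*L for a partition given as (start, end) index pairs."""
    total_error = sum(segment_error(points[s:e + 1]) for s, e in segments)
    return total_error + c * len(segments)


# Points that lie exactly on two different lines.
pts = [(0, 0), (1, 1), (2, 2), (3, 10), (4, 8), (5, 6)]
one_line = partition_penalty(pts, [(0, 5)], c=1.0)
two_lines = partition_penalty(pts, [(0, 2), (3, 5)], c=1.0)
```

With c = 1, the two-segment partition has zero fitting error and pays only 2c, while the single line pays c plus a large fitting error, so `two_lines` is smaller than `one_line`. A much larger c would reverse the preference.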

15 Optimal substructure property Optimal solution with k line segments extends an optimal solution of k-1 line segments on a smaller problem

16 DP: Multiway Choice
Notation
–OPT[j] = minimum cost for points p_1, p_2, ..., p_j
–E_{i,j} = minimum sum of squares for points p_i, p_{i+1}, ..., p_j
Give a recursive definition for OPT[j]

17 Notation. OPT[j] = minimum cost for points p_1, p_2, ..., p_j. E_{i,j} = minimum sum of squares for points p_i, p_{i+1}, ..., p_j.
To compute OPT[j]:
–Last segment uses points p_i, p_{i+1}, ..., p_j for some i.
–Cost = E_{i,j} + c + OPT[i-1].
–Which i ???
$$OPT[j] = \min_{1 \le i \le j} \left( E_{i,j} + c + OPT[i-1] \right), \qquad OPT[0] = 0$$

18 Segmented Least Squares: Algorithm
(Running time can be improved to O(n^2) by pre-computing various statistics)
INPUT: n, p_1, …, p_n, c
Segmented-Least-Squares() {
   Opt[0] = 0
   for j = 1 to n
      for i = 1 to j
         compute the least squares error E_{i,j} for the segment p_i, …, p_j
      endfor
   endfor
   for j = 1 to n
      Opt[j] = min over 1 ≤ i ≤ j of (E_{i,j} + c + Opt[i-1])
   endfor
   return Opt[n]
}
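The pseudocode translates fairly directly into Python. The sketch below keeps the 1-based indexing of the slides for E and Opt; the function names and the single-point fallback in `segment_error` are illustrative assumptions. It computes every E_{i,j} from scratch, so it is the straightforward O(n^3) version.

```python
def segment_error(points):
    """Sum of squared errors of the least squares line through points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    denom = n * sxx - sx * sx
    # Assumption: a single point (denom == 0) is fit by a horizontal line.
    a = (n * sxy - sx * sy) / denom if denom != 0 else 0.0
    b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in points)


def segmented_least_squares(points, c):
    """Return Opt[n], the minimum of E + c*L over all partitions.

    points must be sorted by x; c > 0 is the per-segment penalty.
    """
    n = len(points)

    # E[i][j] = least squares error for p_i..p_j (1-based, i <= j).
    E = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            E[i][j] = segment_error(points[i - 1:j])

    # opt[j] = minimum cost for p_1..p_j, with opt[0] = 0.
    opt = [0.0] * (n + 1)
    for j in range(1, n + 1):
        opt[j] = min(E[i][j] + c + opt[i - 1] for i in range(1, j + 1))
    return opt[n]
```

On points that lie exactly on two lines, such as `[(0, 0), (1, 1), (2, 2), (3, 10), (4, 8), (5, 6)]`, a small penalty like c = 1 yields a cost of 2 (two perfect segments), while a large penalty like c = 100 makes a single line cheaper.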

19 Total Running time: O(n^3)
–Computing E_{i,j} for O(n^2) pairs, O(n) per pair using the previous formula; this gives O(n^3) to compute all E_{i,j} values
–Following this, the algorithm has n iterations for values j = 1, …, n; for each value of j we have to compute the minimum of the recurrence to fill the array entry Opt[j]; this takes O(n) for each j, so this part gives O(n^2)
Remark: there is an exercise in the text which shows how to reduce the total running time from O(n^3) to O(n^2)
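One way to reach O(n^2), sketched below, is to pre-compute prefix sums of x, y, xy, x^2 and y^2 so that each E_{i,j} costs O(1). This may or may not match the approach intended by the textbook exercise; the function name `all_segment_errors` and the clamping of tiny negative round-off values to zero are assumptions made here. The error is obtained by expanding the square: Σ(y − ax − b)^2 = Σy^2 − 2aΣxy − 2bΣy + a^2Σx^2 + 2abΣx + m·b^2 for a segment of m points.

```python
from itertools import accumulate


def all_segment_errors(points):
    """Return E with E[i][j] = least squares error for p_i..p_j (1-based)."""
    n = len(points)

    def prefix(values):
        # prefix(v)[k] = sum of the first k values, so a range sum is O(1).
        return [0.0] + list(accumulate(values))

    px = prefix(x for x, _ in points)
    py = prefix(y for _, y in points)
    pxy = prefix(x * y for x, y in points)
    pxx = prefix(x * x for x, _ in points)
    pyy = prefix(y * y for _, y in points)

    E = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            m = j - i + 1
            sx = px[j] - px[i - 1]
            sy = py[j] - py[i - 1]
            sxy = pxy[j] - pxy[i - 1]
            sxx = pxx[j] - pxx[i - 1]
            syy = pyy[j] - pyy[i - 1]
            denom = m * sxx - sx * sx
            # Assumption: a single point is fit by a horizontal line.
            a = (m * sxy - sx * sy) / denom if denom != 0 else 0.0
            b = (sy - a * sx) / m
            err = (syy - 2 * a * sxy - 2 * b * sy
                   + a * a * sxx + 2 * a * b * sx + m * b * b)
            # Clamp small negative values caused by floating-point round-off.
            E[i][j] = max(err, 0.0)
    return E
```

The expanded formula is more exposed to cancellation error than summing residuals directly, so for data with large coordinates it may be worth centering the points first.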

20 Determining the solution When Opt[j] is computed, record the value of i that minimized the sum Store this value in an auxiliary array Use to reconstruct solution

21 Determining the solution
Find-Segments(j)
   If j = 0 then
      Output nothing
   Else
      Get i that minimizes E_{i,j} + c + Opt[i-1]
      Output the segment {p_i, …, p_j} and the result of Find-Segments(i-1)
   Endif
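A Python sketch of the reconstruction step follows. It records the minimizing i for each j in an auxiliary array while filling Opt, then walks back from j = n; the walk is written as a loop rather than the recursion of Find-Segments, which is an implementation choice made here. The names `find_segments` and `choice` are hypothetical, and ties are broken toward the smaller i.

```python
def segment_error(points):
    """Sum of squared errors of the least squares line through points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    denom = n * sxx - sx * sx
    # Assumption: a single point (denom == 0) is fit by a horizontal line.
    a = (n * sxy - sx * sy) / denom if denom != 0 else 0.0
    b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in points)


def find_segments(points, c):
    """Return (Opt[n], segments) with segments as 1-based (i, j) pairs."""
    n = len(points)
    opt = [0.0] * (n + 1)
    choice = [0] * (n + 1)  # choice[j] = the i that minimized Opt[j]

    for j in range(1, n + 1):
        # Tuples compare by cost first, then by i, so ties pick the smaller i.
        opt[j], choice[j] = min(
            (segment_error(points[i - 1:j]) + c + opt[i - 1], i)
            for i in range(1, j + 1)
        )

    # Walk back from j = n, following the recorded choices.
    segments = []
    j = n
    while j > 0:
        i = choice[j]
        segments.append((i, j))
        j = i - 1
    segments.reverse()
    return opt[n], segments
```

For the points `[(0, 0), (1, 1), (2, 2), (3, 10), (4, 8), (5, 6)]` with c = 1, the sketch returns the segments p_1..p_3 and p_4..p_6.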

