
1 Lecture 4: The L2 Norm and Simple Least Squares

2 Syllabus
Lecture 01 Describing Inverse Problems
Lecture 02 Probability and Measurement Error, Part 1
Lecture 03 Probability and Measurement Error, Part 2
Lecture 04 The L2 Norm and Simple Least Squares
Lecture 05 A Priori Information and Weighted Least Squares
Lecture 06 Resolution and Generalized Inverses
Lecture 07 Backus-Gilbert Inverse and the Trade-off of Resolution and Variance
Lecture 08 The Principle of Maximum Likelihood
Lecture 09 Inexact Theories
Lecture 10 Nonuniqueness and Localized Averages
Lecture 11 Vector Spaces and Singular Value Decomposition
Lecture 12 Equality and Inequality Constraints
Lecture 13 L1, L∞ Norm Problems and Linear Programming
Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15 Nonlinear Problems: Newton's Method
Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17 Factor Analysis
Lecture 18 Varimax Factors, Empirical Orthogonal Functions
Lecture 19 Backus-Gilbert Theory for Continuous Problems; Radon's Problem
Lecture 20 Linear Operators and Their Adjoints
Lecture 21 Fréchet Derivatives
Lecture 22 Exemplary Inverse Problems, incl. Filter Design
Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems

3 Purpose of the Lecture
Introduce the concept of prediction error and the norms that quantify it
Develop the Least Squares Solution
Develop the Minimum Length Solution
Determine the covariance of these solutions

4 Part 1: Prediction Error and Norms

5 The Linear Inverse Problem Gm = d

6 The Linear Inverse Problem: Gm = d, with d the data, m the model parameters, and G the data kernel.

7 Gm^est = d^pre: an estimate of the model parameters can be used to predict the data, but the prediction may not match the observed data (e.g. due to observational error): d^pre ≠ d^obs.

8 This mismatch leads us to define the prediction error e = d^obs − d^pre; e = 0 when the model parameters exactly predict the data.

9 [Figure: example of prediction error for a line fit to data; at each z_i the error is e_i = d_i^obs − d_i^pre.]

10 A "norm" is a rule for quantifying the overall size of the error vector e; there are lots of possible ways to do it.

11 The Ln family of norms: ||e||_n = [Σ_i |e_i|^n]^(1/n).

12 The L2 norm is the Euclidean length of e: ||e||_2 = [Σ_i e_i^2]^(1/2).

13 Higher norms give increasing weight to the largest element of e.

14 The limiting case is the L∞ norm: ||e||_∞ = max_i |e_i|.
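
As a quick illustration, here is a minimal MatLab sketch of how the different norms weight the elements of an error vector; the vector e below is invented for the example:

e = [0.1; -0.2; 0.1; 1.0];         % hypothetical error vector with one large element
L1norm   = sum(abs(e));            % L1 norm: sum of absolute values
L2norm   = sqrt(sum(e.^2));        % L2 norm: Euclidean length
L10norm  = sum(abs(e).^10)^(1/10); % a high norm, dominated by the largest element
Linfnorm = max(abs(e));            % limiting case: the L-infinity norm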

15 Guiding principle for solving an inverse problem: find the m^est that minimizes E = ||e||, with e = d^obs − d^pre and d^pre = Gm^est.

16 But which norm to use? It makes a difference!

17 [Figure: straight-line fits to data containing an outlier, comparing the L1, L2, and L∞ norms.]

18 The answer is related to the distribution of the error. Are outliers common or rare?
Long tails: outliers common, outliers unimportant; use a low norm, which gives low weight to outliers.
Short tails: outliers uncommon, outliers important; use a high norm, which gives high weight to outliers.

19 As we will show later in the class, use the L2 norm when the data have Gaussian-distributed error.

20 Part 2: Least Squares Solution to Gm = d

21 The L2 norm of the error is its Euclidean length, so E = e^T e is the square of the Euclidean length. Principle of Least Squares: minimize E.

22 Least Squares Solution to Gm = d: minimize E with respect to each m_q, that is, set ∂E/∂m_q = 0.

23 So, multiply out: E = e^T e = (Gm − d)^T (Gm − d) = m^T [G^T G] m − d^T Gm − m^T G^T d + d^T d.

24 First term: ∂/∂m_q [m^T [G^T G] m] = 2 Σ_k [G^T G]_qk m_k.

25 ∂m_j/∂m_q = δ_jq, since m_j and m_q are independent variables.

26 Kronecker delta (elements of the identity matrix): [I]_ij = δ_ij, so a = Ib = b means a_i = Σ_j δ_ij b_j = b_i.

27 Second and third terms: ∂/∂m_q [−d^T Gm − m^T G^T d] = −2 [G^T d]_q.

28 Putting it all together: ∂E/∂m_q = 2 Σ_k [G^T G]_qk m_k − 2 [G^T d]_q = 0, or [G^T G] m = G^T d.

29 Presuming [G^T G] has an inverse, the Least Squares Solution is m^est = [G^T G]^-1 G^T d.

30 Presuming [G^T G] has an inverse: Least Squares Solution m^est = [G^T G]^-1 G^T d (memorize).

31 Example: the straight-line problem, d_i = m_1 + m_2 z_i, written as Gm = d with rows of G equal to [1, z_i].

32 [G^T G] = [ N, Σ_i z_i ; Σ_i z_i, Σ_i z_i^2 ]

33 [G^T d] = [ Σ_i d_i ; Σ_i z_i d_i ]

34 m^est = [G^T G]^-1 [G^T d], giving the intercept and slope of the best-fit line

35 In practice there is no need to multiply the matrices analytically; just use MatLab: mest = (G'*G)\(G'*d);
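
A slightly fuller MatLab sketch of the straight-line fit, with invented z values, true model, and noise level, might look like this:

N = 20;                               % number of observations (made up)
z = (1:N)';                           % auxiliary variable
dobs = 2.0 + 0.5*z + 0.3*randn(N,1);  % synthetic data: intercept 2, slope 0.5, Gaussian noise
G = [ones(N,1), z];                   % data kernel, rows [1, z_i], for d_i = m1 + m2*z_i
mest = (G'*G)\(G'*dobs);              % least squares solution
dpre = G*mest;                        % predicted data
e = dobs - dpre;                      % prediction error
E = e'*e;                             % total L2 error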

36 Another example: fitting a plane surface, d_i = m_1 + m_2 x_i + m_3 y_i, i.e. Gm = d with rows of G equal to [1, x_i, y_i].

37 [Figure: data points and fitted plane surface, with axes x (km), y (km), z (km).]
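
A corresponding MatLab sketch for the plane-surface example; the (x, y) positions and the surface coefficients are made up:

N = 30;                                        % number of observations (made up)
x = 10*rand(N,1);                              % x coordinates, km
y = 10*rand(N,1);                              % y coordinates, km
dobs = 1.0 + 0.2*x - 0.3*y + 0.1*randn(N,1);   % synthetic surface data with noise
G = [ones(N,1), x, y];                         % rows [1, x_i, y_i] for d_i = m1 + m2*x_i + m3*y_i
mest = (G'*G)\(G'*dobs);                       % least squares estimate of the plane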

38 Part 3: Minimum Length Solution

39 But Least Squares will fail when [G^T G] has no inverse.

40 [Figure: example of fitting a line to a single data point in the (z, d) plane; many different lines fit it equally well.]

41 For a single data point, G = [1, z_1], so [G^T G] = [ 1, z_1 ; z_1, z_1^2 ].

42 Zero determinant, hence no inverse.
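
The failure is easy to see numerically; a MatLab sketch for the single-point case, with an arbitrary data point (z1, d1):

z1 = 3; d1 = 2;                  % one arbitrary data point
G = [1, z1];                     % single row: d1 = m1 + m2*z1
GTG = G'*G;                      % the 2x2 matrix [1, z1; z1, z1^2]
detGTG = det(GTG);               % equals z1^2 - z1^2 = 0, so no inverse exists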

43 Least Squares will fail when more than one solution minimizes the error; the inverse problem is then "underdetermined".

44 [Figure: a simple example of an underdetermined problem, involving S, R, and two regions labeled 1 and 2.]

45 What to do? Use another guiding principle: "a priori" information about the solution.

46 In this case, choose a solution that is small: minimize ||m||_2.

47 Simplest case: "purely underdetermined," meaning more than one solution has zero error.

48 Minimize L = ||m||_2^2 subject to the constraint that e = 0.

49 Method of Lagrange Multipliers: minimizing L with constraints C_1 = 0, C_2 = 0, … is equivalent to minimizing Φ = L + λ_1 C_1 + λ_2 C_2 + … with no constraints; the λs are called "Lagrange multipliers."

50 [Figure: contours of L(x, y) and the constraint curve e(x, y) = 0; the constrained minimum lies at (x_0, y_0).]

51 Here Φ = m^T m + λ^T (d − Gm); setting ∂Φ/∂m_q = 0 gives 2m_q − Σ_i λ_i G_iq = 0.

52 From 2m = G^T λ and Gm = d: substituting, ½ G G^T λ = d, so λ = 2 [GG^T]^-1 d and m = G^T [GG^T]^-1 d.

53 Presuming [GG^T] has an inverse, the Minimum Length Solution is m^est = G^T [GG^T]^-1 d.

54 Presuming [GG^T] has an inverse: Minimum Length Solution m^est = G^T [GG^T]^-1 d (memorize).
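
A minimal MatLab sketch of the minimum length solution for a purely underdetermined toy problem (one equation, two unknowns, both made up):

G = [1, 1];                      % single equation: m1 + m2 = d
d = 4;                           % the one datum (made up)
mest = G'*((G*G')\d);            % minimum length solution, m = G'*[G*G']^-1*d
% mest is [2; 2]: of all models with zero error, the one of smallest length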

55 Part 4: Covariance

56 Least Squares Solution: m^est = [G^T G]^-1 G^T d. Minimum Length Solution: m^est = G^T [GG^T]^-1 d. Both have the linear form m = Md.

57 But if m = Md, then [cov m] = M [cov d] M^T. When the data are uncorrelated with uniform variance σ_d^2, [cov d] = σ_d^2 I, so…

58 Least Squares Solution: [cov m] = [G^T G]^-1 G^T σ_d^2 G [G^T G]^-1 = σ_d^2 [G^T G]^-1. Minimum Length Solution: [cov m] = G^T [GG^T]^-1 σ_d^2 [GG^T]^-1 G = σ_d^2 G^T [GG^T]^-2 G.

59 Least Squares Solution: [cov m] = [G^T G]^-1 G^T σ_d^2 G [G^T G]^-1 = σ_d^2 [G^T G]^-1. Minimum Length Solution: [cov m] = G^T [GG^T]^-1 σ_d^2 [GG^T]^-1 G = σ_d^2 G^T [GG^T]^-2 G. (memorize)
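
Continuing the hypothetical straight-line example, the least squares covariance might be computed in MatLab as:

N = 20; z = (1:N)';              % same made-up straight-line setup as before
G = [ones(N,1), z];
sigmad = 0.3;                    % assumed standard deviation of each datum
covm = sigmad^2 * inv(G'*G);     % [cov m] = sigma_d^2 [G'G]^-1 for least squares
sigmam = sqrt(diag(covm));       % standard deviations of m1est and m2est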

60 Where to obtain the value of σ_d^2: either an a priori value, based on knowledge of the accuracy of the measurement technique (my ruler has 1 mm divisions, so σ_d ≈ ½ mm), or an a posteriori value, based on the prediction error.
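
One common a posteriori estimate, not written out on the slide, divides the total prediction error by the number of degrees of freedom, N − M; a MatLab sketch with made-up data:

N = 20; M = 2;                          % number of data and of model parameters (example values)
z = (1:N)';
dobs = 2.0 + 0.5*z + 0.3*randn(N,1);    % hypothetical observations
G = [ones(N,1), z];
mest = (G'*G)\(G'*dobs);
e = dobs - G*mest;                      % prediction error
sd2post = (e'*e)/(N-M);                 % a posteriori estimate of sigma_d^2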

61 The variance is critically dependent on experiment design (the structure of G). [Figure: two alternative schemes for weighing a set of boxes in different groupings.] Which is the better way to weigh a set of boxes?

62 [Figure: estimated model parameters m_i^est and their standard deviations σ_mi versus index i for the two weighing schemes, A and B.]

63 Relationship between [cov m] and the Error Surface. [Figure: data d versus z and the corresponding error surfaces E(m_1, m_2) for two experiments.]

64 Taylor series expansion of the error about its minimum: E(m) ≈ E(m^est) + ½ (m − m^est)^T [∂^2E/∂m_i∂m_j] (m − m^est), since the linear term vanishes at the minimum.

65 The curvature matrix has elements ∂^2E/∂m_i∂m_j.

66 For a linear problem the curvature is related to G^T G: E = (Gm − d)^T (Gm − d) = m^T [G^T G] m − d^T Gm − m^T G^T d + d^T d, so ∂^2E/∂m_i∂m_j = 2 [G^T G]_ij.

67 And since [cov m] = σ_d^2 [G^T G]^-1, we have [cov m] = 2 σ_d^2 [∂^2E/∂m_i∂m_j]^-1: the covariance is proportional to the inverse of the curvature of the error surface at its minimum.

68 The sharper the minimum, the higher the curvature, and the smaller the covariance.
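
A quick numerical check of this relationship in MatLab, reusing the hypothetical straight-line G and an assumed sigma_d:

N = 20; z = (1:N)';
G = [ones(N,1), z];
sigmad = 0.3;                            % assumed data standard deviation
D = 2*(G'*G);                            % curvature matrix, elements d2E/dmi dmj
covm = sigmad^2 * inv(G'*G);             % least squares covariance
covm_check = 2*sigmad^2 * inv(D);        % identical: covariance from the curvature of E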

