# Instabilities of SVD

## Presentation transcript

Instabilities of SVD Small singular values make the generalized-inverse solution m† highly sensitive to small amounts of noise, and may be indistinguishable from 0. It is possible to remove small singular values to stabilize the solution -> Truncated SVD (TSVD). Condition number: cond(G) = s_1/s_k.
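
The effect of a small singular value on cond(G) can be seen directly with numpy; the 3x3 matrix here is a made-up illustration, not a matrix from the slides:

```python
import numpy as np

# Hypothetical matrix with one tiny singular value (illustrative only).
G = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 0.0],
              [0.0, 0.0, 1e-6]])

s = np.linalg.svd(G, compute_uv=False)  # singular values, in descending order
cond = s[0] / s[-1]                     # cond(G) = s_1 / s_k
print(cond)                             # approximately 1e6
```

A condition number of ~10^6 means data noise can be amplified by up to a factor of a million in the recovered model.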

TSVD Example: removing instrument response. Instrument response: g(t) = g_0 t exp(-t/T_0) (t ≥ 0); g(t) = 0 (t < 0). Recorded acceleration: v(t) = ∫ g(t - τ) m_true(τ) dτ. Problem: deconvolving g(t) from v(t) to get m_true. Discretized: d = Gm, with G_i,j = (t_i - t_j) exp[-(t_i - t_j)/T_0] Δt (t_i ≥ t_j); G_i,j = 0 (t_i < t_j).

TSVD Time interval [-5, 100] s, Δt = 0.5 s -> G with m = n = 210. Singular values range from 25.3 down to 0.017, cond(G) ≈ 1480, so e.g. noise at the 1/1000 level creates instability. True signal: m_true(t) = exp[-(t-8)^2/(2σ^2)] + 0.5 exp[-(t-25)^2/(2σ^2)].

TSVD Noise-free case: d_true = G m_true, and m = V S^-1 U^T d_true.

TSVD Noisy case: d_true = G m_true, and m = V S^-1 U^T (d_true + η), with η = N(0, (0.05 V)^2). The solution fits the data perfectly, but is worthless…

TSVD Truncated solution: m = V_p S_p^-1 U_p^T (d_true + η), with η = N(0, (0.05 V)^2). Solution for p = 26 (184 of the 210 singular values removed!).
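
A minimal numpy sketch of the TSVD idea, using a made-up ill-conditioned system rather than the slides' seismometer G:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ill-conditioned 50x50 system (illustrative, not from the slides).
n = 50
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -8, n)               # singular values spanning 8 decades
G = U @ (s[:, None] * V.T)

m_true = np.sin(np.linspace(0, np.pi, n))
d = G @ m_true + 1e-4 * rng.standard_normal(n)   # noisy data

def tsvd_solve(G, d, p):
    """Truncated-SVD solution keeping only the p largest singular values."""
    U, s, Vt = np.linalg.svd(G)
    return Vt[:p].T @ ((U[:, :p].T @ d) / s[:p])

m_full = tsvd_solve(G, d, n)    # full inverse: division by tiny s_i blows up noise
m_tsvd = tsvd_solve(G, d, 20)   # truncated: small singular values removed

err_full = np.linalg.norm(m_full - m_true)
err_tsvd = np.linalg.norm(m_tsvd - m_true)
```

With full inversion the noise components divided by the smallest singular values dominate the solution; truncation trades a small bias for a large reduction in variance.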

Nonlinear Regression Linear regression we now know how to solve (e.g., LS). Assume instead a nonlinear system of m equations in m unknowns, F(x) = 0. What are we going to do? We will try to find a sequence of vectors x_0, x_1, … that converges toward a solution x*. Linearize (assuming F is continuously differentiable): F(x_0 + Δx) ≈ F(x_0) + ∇F(x_0) Δx, where ∇F(x_0) is the Jacobian.

Nonlinear Regression Assume that the step Δx puts us at the unknown solution x*: F(x_0 + Δx) ≈ F(x_0) + ∇F(x_0) Δx = F(x*) = 0, so ∇F(x_0) Δx ≈ -F(x_0) = Newton's Method! Given F(x) = 0 and an initial solution x_0, generate a sequence of solutions x_1, x_2, … and stop if the sequence converges to a solution with F(x) = 0. 1. Solve ∇F(x_k) Δx = -F(x_k) (e.g., using Gaussian elimination). 2. Let x_{k+1} = x_k + Δx. 3. Let k = k + 1.
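
The iteration above can be sketched in a few lines of numpy; the 2x2 test system (a circle intersected with a line) is a made-up example:

```python
import numpy as np

def newton(F, J, x0, tol=1e-10, max_iter=50):
    """Newton's method for F(x) = 0: solve J(x_k) dx = -F(x_k), then update x."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        dx = np.linalg.solve(J(x), -F(x))   # step from the linearization
        x = x + dx
        if np.linalg.norm(F(x)) < tol:      # stop when F(x) is (numerically) zero
            break
    return x

# Hypothetical system: x0^2 + x1^2 = 2 and x0 = x1, with a root at (1, 1).
F = lambda x: np.array([x[0]**2 + x[1]**2 - 2.0, x[0] - x[1]])
J = lambda x: np.array([[2*x[0], 2*x[1]], [1.0, -1.0]])
x_star = newton(F, J, x0=[2.0, 0.5])
```

Note the system also has a root at (-1, -1); which root Newton finds depends on the starting point x_0.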

Properties of Newton’s Method If x_0 is close enough to x*, F(x) is continuously differentiable in a neighborhood of x*, and ∇F(x*) is nonsingular, Newton’s method will converge to x*. The convergence rate is quadratic: ||x_{k+1} - x*||_2 ≤ c ||x_k - x*||_2^2.

Newton’s Method applied to a scalar function Problem: minimize f(x). If f(x) is twice continuously differentiable, f(x_0 + Δx) ≈ f(x_0) + ∇f(x_0)^T Δx + (1/2) Δx^T ∇^2 f(x_0) Δx, where ∇f(x_0) is the gradient and ∇^2 f(x_0) is the Hessian.

Newton’s Method applied to a scalar function A necessary condition for x* to be a minimum of f(x) is that ∇f(x*) = 0. In the vicinity of x_0 we can approximate the gradient as ∇f(x_0 + Δx) ≈ ∇f(x_0) + ∇^2 f(x_0) Δx (eq. 9.8, ignoring higher-order terms). Setting the gradient to zero (assuming x_0 + Δx puts us at x*) we get ∇^2 f(x_0) Δx ≈ -∇f(x_0), which is Newton’s method for minimizing f(x): given a twice continuously differentiable function f(x) and an initial solution x_0, generate a sequence of solutions x_1, x_2, … and stop if the sequence converges to a solution with ∇f(x) = 0. 1. Solve ∇^2 f(x_k) Δx = -∇f(x_k). 2. Let x_{k+1} = x_k + Δx. 3. Let k = k + 1.
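
The minimization variant differs from the root-finding version only in that the Jacobian of F is replaced by the Hessian of f and F by the gradient. A sketch, with a made-up test function f(x) = (x0 - 3)^4 + (x1 + 1)^2 whose minimum is at (3, -1):

```python
import numpy as np

def newton_min(grad, hess, x0, tol=1e-10, max_iter=50):
    """Newton's method for minimization: solve H(x_k) dx = -grad(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        dx = np.linalg.solve(hess(x), -grad(x))
        x = x + dx
        if np.linalg.norm(grad(x)) < tol:   # necessary condition: gradient = 0
            break
    return x

# Hypothetical objective: f(x) = (x0 - 3)^4 + (x1 + 1)^2, minimum at (3, -1).
grad = lambda x: np.array([4*(x[0] - 3)**3, 2*(x[1] + 1)])
hess = lambda x: np.array([[12*(x[0] - 3)**2, 0.0], [0.0, 2.0]])
x_star = newton_min(grad, hess, x0=[0.0, 0.0])
```

On the quadratic coordinate x1 Newton lands exactly in one step; on the quartic coordinate the minimum is degenerate (the Hessian vanishes at x*), so convergence there is only linear, a reminder that the quadratic-rate guarantee needs ∇^2 f(x*) positive definite.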

Newton’s Method applied to a scalar function This is the same as solving the nonlinear system of equations ∇f(x) = 0, so therefore: if f(x) is twice continuously differentiable in a neighborhood of x*, there is a constant λ such that ||∇^2 f(x) - ∇^2 f(y)||_2 ≤ λ ||x - y||_2 for every y in the neighborhood (i.e., the Hessian is Lipschitz continuous), ∇^2 f(x*) is positive definite, and x_0 is close enough to x*, then Newton’s method will converge quadratically to x*.

Newton’s Method applied to LS Newton's method is not directly applicable to most nonlinear regression and inverse problems (there is not an equal number of model parameters and data points, and no exact solution to G(m) = d). Instead we will use it to minimize a nonlinear LS problem, e.g., fit a vector of n parameters to a data vector d: f(m) = Σ_{i=1}^{m} [(G(m)_i - d_i)/σ_i]^2. Let f_i(m) = (G(m)_i - d_i)/σ_i, i = 1, 2, …, m, and F(m) = [f_1(m) … f_m(m)]^T, so that f(m) = Σ_{i=1}^{m} f_i(m)^2 and ∇f(m) = Σ_{i=1}^{m} ∇[f_i(m)^2].

Newton’s Method applied to LS  f(m) j =∑ 2  f i (m) j F(m) j ]  f(m)=2J(m) T F(m), where J(m) is the Jacobian   f(m) j =∑    f i (m) 2 ] = ∑ H i (m), where H i (m) is the Hessian of f i (m) 2 H i j,k (m)= m i=i m i=i m i=i

Newton’s Method applied to LS   f(m)=2J(m) T J(m)+Q(m), where Q(m)=∑ f i (m)   f i (m) Gauss-Newton (GN) method ignores Q(m),   f(m)≈2J(m) T J(m), assuming f i (m) will be reasonably small as we approach m*. That is, NM: Solve -  f(x k ) ≈  2 f(x k )  x  f(m) j =∑ 2  f i (m) j F(m) j, i.e. J(m k ) T J(m k )  m=-J(m k ) T F(m k ) m i=i

Newton’s Method applied to LS The Levenberg-Marquardt (LM) method uses [J(m_k)^T J(m_k) + λI] Δm = -J(m_k)^T F(m_k). As λ -> 0 this becomes GN; as λ -> large it approaches steepest descent (SD), moving down-gradient most rapidly. SD provides slow but certain convergence. Which value of λ to use? Small values when GN is working well; switch to larger values in problem areas. Start with a small value of λ, then adjust.
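
The λ-adjustment strategy can be sketched as follows; the accept/reject rule and factor-of-10 updates are one common choice, and the exponential-decay fit is a made-up test problem:

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, m0, lam=1e-3, max_iter=100):
    """Basic LM iteration: solve (J^T J + lam*I) dm = -J^T F.

    lam is decreased when a step reduces the misfit (drifting toward GN)
    and increased when it does not (drifting toward steepest descent)."""
    m = np.asarray(m0, dtype=float)
    cost = np.sum(residual(m)**2)
    for _ in range(max_iter):
        J, F = jacobian(m), residual(m)
        dm = np.linalg.solve(J.T @ J + lam * np.eye(m.size), -J.T @ F)
        new_cost = np.sum(residual(m + dm)**2)
        if new_cost < cost:            # step accepted: trust the linearization more
            m, cost, lam = m + dm, new_cost, lam / 10
        else:                          # step rejected: damp harder, stay put
            lam *= 10
    return m

# Hypothetical noise-free fit of y = m0 * exp(-m1 * t), true parameters (2, 0.7).
t = np.linspace(0, 4, 20)
d = 2.0 * np.exp(-0.7 * t)
res = lambda m: m[0] * np.exp(-m[1] * t) - d
jac = lambda m: np.column_stack([np.exp(-m[1] * t),
                                 -m[0] * t * np.exp(-m[1] * t)])
m = levenberg_marquardt(res, jac, [1.0, 1.0])
```

Rejected steps cost an extra residual evaluation but never increase the misfit, which is what makes LM more robust than plain GN far from m*.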

Statistics of iterative methods For linear problems, Cov(Ad) = A Cov(d) A^T (when d has a multivariate normal distribution), so Cov(m_L2) = (G^T G)^-1 G^T Cov(d) G (G^T G)^-1, and with Cov(d) = σ^2 I this reduces to Cov(m_L2) = σ^2 (G^T G)^-1. However, we don't have a linear relationship between data and estimated model parameters for nonlinear regression, so we cannot use these formulas. Instead, linearize about m*: F(m* + Δm) ≈ F(m*) + J(m*) Δm, giving Cov(m*) ≈ (J(m*)^T J(m*))^-1. With r_i = G(m*)_i - d_i and s^2 = Σ r_i^2 / (m - n), this becomes Cov(m*) = s^2 (J(m*)^T J(m*))^-1.
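
The final formula is a two-line computation once J and the residuals at m* are available; the Jacobian and residual values below are made-up illustrative numbers:

```python
import numpy as np

# Hypothetical Jacobian J at a converged m* (4 data points, 2 parameters)
# and hypothetical residuals r_i = G(m*)_i - d_i.
J = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
r = np.array([0.1, -0.1, 0.1, -0.1])
m_pts, n = J.shape

s2 = np.sum(r**2) / (m_pts - n)        # s^2 = sum r_i^2 / (m - n)
cov = s2 * np.linalg.inv(J.T @ J)      # Cov(m*) = s^2 (J^T J)^{-1}
sigma = np.sqrt(np.diag(cov))          # approximate parameter standard errors
```

The diagonal of Cov(m*) gives the squared standard errors of the individual parameters; off-diagonal entries show how errors in the parameters trade off against each other.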

Implementation Issues 1. Explicit (analytical) expressions for the derivatives. 2. Finite-difference approximation of the derivatives. 3. When to stop iterating?
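
When analytical derivatives are impractical, point 2 can be implemented with forward differences; the helper below and its test function are illustrative, not from the slides:

```python
import numpy as np

def fd_jacobian(F, x, h=1e-7):
    """Forward-difference Jacobian of F at x: J[:, j] ≈ (F(x + h e_j) - F(x)) / h."""
    x = np.asarray(x, dtype=float)
    F0 = np.atleast_1d(F(x))
    J = np.empty((F0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += h                       # perturb one parameter at a time
        J[:, j] = (np.atleast_1d(F(xp)) - F0) / h
    return J

# Hypothetical test function with known Jacobian [[2*x0, 0], [x1, x0]].
F = lambda x: np.array([x[0]**2, x[0] * x[1]])
J = fd_jacobian(F, [2.0, 3.0])           # should be close to [[4, 0], [3, 2]]
```

The step h trades truncation error (too large) against rounding error (too small); a common rule of thumb is h on the order of the square root of machine precision times the parameter scale.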

