
1 Chapter 5 – Orthogonality and Least Squares
Outline
5.1 Orthonormal Bases and Orthogonal Projections
5.2 Gram-Schmidt Process and QR Factorization
5.3 Orthogonal Transformations and Orthogonal Matrices
5.4 Least Squares and Data Fitting
5.5 Inner Product Spaces

2 5.1 Orthonormal Bases and Orthogonal Projections
Orthogonality, length, unit vectors
– Two vectors v and w in R^n are called perpendicular or orthogonal if v · w = 0.
– The length (or magnitude or norm) of a vector v in R^n is ||v|| = √(v · v).
– A vector u in R^n is called a unit vector if its length is 1 (i.e., ||u|| = 1, or u · u = 1).
Orthonormal vectors
– The vectors u_1, u_2, ..., u_m in R^n are called orthonormal if they are all unit vectors and orthogonal to one another: u_i · u_j = 1 if i = j, and u_i · u_j = 0 if i ≠ j.
Orthogonal complement
– Consider a subspace V of R^n. The orthogonal complement V^⊥ of V is the set of those vectors x in R^n that are orthogonal to all vectors in V: V^⊥ = { x in R^n : v · x = 0 for all v in V }.
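
As a quick illustration (not part of the original slides), a minimal NumPy check of these definitions; the two vectors below are hypothetical examples.

```python
import numpy as np

# Hypothetical orthonormal pair in R^3: unit length and mutually perpendicular.
u1 = np.array([3.0, 4.0, 0.0]) / 5.0
u2 = np.array([-4.0, 3.0, 0.0]) / 5.0

print(np.dot(u1, u2))          # 0.0 -> orthogonal
print(np.linalg.norm(u1))      # 1.0 -> unit vector
print(np.linalg.norm(u2))      # 1.0 -> unit vector
```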

3 Orthonormal Vectors
Orthonormal vectors are linearly independent.
n orthonormal vectors in R^n form a basis of R^n.
(Example 2) For any scalar θ, the vectors [cos θ, sin θ] and [−sin θ, cos θ] are orthonormal.

4 Orthogonal Projections
If V is a subspace of R^n, then its orthogonal complement V^⊥ is a subspace of R^n as well.
Consider a subspace V of R^n with orthonormal basis u_1, ..., u_m. For any vector x in R^n, there is a unique vector w in V such that x − w is in V^⊥. This vector w is called the orthogonal projection of x onto V, denoted by proj_V(x). We have the formula proj_V(x) = (u_1 · x)u_1 + ··· + (u_m · x)u_m.
The transformation T(x) = proj_V(x) from R^n to R^n is linear.
Consider an orthonormal basis u_1, ..., u_n of R^n. Then x = (u_1 · x)u_1 + ··· + (u_n · x)u_n for all x in R^n.
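
A minimal NumPy sketch of the projection formula above; the helper proj and the orthonormal basis of a plane in R^3 are hypothetical, chosen for illustration.

```python
import numpy as np

def proj(x, basis):
    """Orthogonal projection of x onto span(basis), where basis is a list
    of orthonormal vectors: proj_V(x) = sum_i (u_i . x) u_i."""
    return sum(np.dot(u, x) * u for u in basis)

# Hypothetical orthonormal basis of the x1-x2 plane in R^3.
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
x  = np.array([2.0, 3.0, 5.0])

p = proj(x, [u1, u2])
print(p)          # [2. 3. 0.]
print(x - p)      # [0. 0. 5.], orthogonal to both u1 and u2
```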

5 Example 4 (sol)

6 Pythagorean Theorem
Consider two vectors x and y in R^n. The equation ||x + y||^2 = ||x||^2 + ||y||^2 holds if (and only if) x and y are orthogonal.
Example 6 (sol)

7 Projections
Consider a subspace V of R^n and a vector x in R^n. Then ||proj_V(x)|| ≤ ||x||. The statement is an equality if (and only if) x is in V.

8 Cauchy–Schwarz Inequality
If x and y are vectors in R^n, then |x · y| ≤ ||x|| ||y||. This statement is an equality if (and only if) x and y are parallel.

9 Angle Between Two Vectors
The angle θ between two nonzero vectors x and y in R^n is defined by cos θ = (x · y) / (||x|| ||y||), with 0 ≤ θ ≤ π; this is well defined by the Cauchy–Schwarz inequality.

10 Example 7 (sol)

11 Correlation Coefficient
The correlation coefficient r of two data vectors x = (x_1, ..., x_n) and y = (y_1, ..., y_n) is the cosine of the angle between them: r = (x · y) / (||x|| ||y||).

12 Correlation
The correlation coefficient r is always between −1 and 1; the cases when r = 1 (representing a perfect positive correlation) and r = −1 (perfect negative correlation) are of particular interest. (See Figure 15.) In both cases, the data points (x_i, y_i) will be on the straight line y = mx. (See Figure 16.)
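
A short NumPy sketch of the correlation coefficient as a cosine; the helper correlation and the data vectors below are hypothetical (in practice the data are often taken as deviations from the mean).

```python
import numpy as np

def correlation(x, y):
    """Correlation coefficient as the cosine of the angle between x and y."""
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

# Hypothetical data vectors (e.g., deviations of two characteristics from their means).
x = np.array([-1.5, -0.5, 0.5, 1.5])
y = np.array([-2.9, -1.1, 0.9, 3.1])
print(correlation(x, y))   # close to 1: strong positive correlation
```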

13 5.2 Gram–Schmidt Process and QR Factorization
Goal: convert any basis v_1, ..., v_m of a subspace V of R^n into an orthonormal basis u_1, ..., u_m of V.
– If V is a line with basis v_1, we can find an orthonormal basis simply by dividing v_1 by its length: u_1 = v_1 / ||v_1||.
– If V is a plane with basis v_1, v_2, we first set u_1 = v_1 / ||v_1||; we then have to find a vector in V orthogonal to u_1. The natural choice is v_2 − proj_L(v_2) = v_2 − (u_1 · v_2)u_1, where L is the line spanned by v_1.
– We divide this vector by its length to get the second vector u_2 of an orthonormal basis.

14 Example 1
Find an orthonormal basis of the subspace of R^4 with the given basis. (sol)

15 Matrix Form
Writing each basis vector v_j in terms of the orthonormal vectors u_1, ..., u_j expresses the Gram–Schmidt process in matrix form, leading to the QR factorization M = QR discussed on the following slides.

16 The Gram–Schmidt Process
Consider a subspace V of R^n with basis v_1, ..., v_m. We wish to construct an orthonormal basis u_1, ..., u_m of V.
Let u_1 = v_1 / ||v_1||. As we define u_j, for j = 2, 3, ..., m, we may assume that an orthonormal basis u_1, ..., u_{j−1} of V_{j−1} = span(v_1, ..., v_{j−1}) has already been constructed. Let
u_j = (v_j − proj_{V_{j−1}}(v_j)) / ||v_j − proj_{V_{j−1}}(v_j)||, where proj_{V_{j−1}}(v_j) = (u_1 · v_j)u_1 + ··· + (u_{j−1} · v_j)u_{j−1}.
Note that u_1, ..., u_j is then an orthonormal basis of V_j = span(v_1, ..., v_j).
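
A compact NumPy sketch of this process; the helper gram_schmidt and the test vectors are hypothetical, not taken from the slides.

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn linearly independent vectors v_1, ..., v_m into an orthonormal
    basis u_1, ..., u_m of their span."""
    basis = []
    for v in vectors:
        # Subtract the projection of v onto the span of the vectors found so far.
        w = v - sum(np.dot(u, v) * u for u in basis)
        basis.append(w / np.linalg.norm(w))
    return basis

# Hypothetical basis of a 2-dimensional subspace of R^3.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])
u1, u2 = gram_schmidt([v1, v2])
print(np.dot(u1, u2))                             # ~0: orthogonal
print(np.linalg.norm(u1), np.linalg.norm(u2))     # 1.0 1.0
```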

17 QR Factorization
Consider an n × m matrix M with linearly independent columns v_1, ..., v_m. Then there exists an n × m matrix Q whose columns u_1, ..., u_m are orthonormal and an upper triangular m × m matrix R with positive diagonal entries such that M = QR. This representation is unique. Furthermore, r_11 = ||v_1||, r_jj = ||v_j − proj_{V_{j−1}}(v_j)|| for j = 2, ..., m, and r_ij = u_i · v_j for i < j.
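
For comparison, NumPy's built-in reduced QR factorization applied to a hypothetical matrix (note that numpy.linalg.qr may flip the signs of some columns relative to the convention of positive diagonal entries in R):

```python
import numpy as np

# Hypothetical matrix with linearly independent columns.
M = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

Q, R = np.linalg.qr(M)   # reduced QR: Q is 3x2 with orthonormal columns, R is 2x2 upper triangular
print(np.allclose(Q @ R, M))              # True
print(np.allclose(Q.T @ Q, np.eye(2)))    # True: columns of Q are orthonormal
```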

18 Example 2
Find the QR factorization of the shear matrix. (sol)
– We can compute the columns of Q and the entries of R step by step, following the Gram–Schmidt process.
– Now, M = QR.

19 5.3 Orthogonal Transformations and Orthogonal Matrices
A linear transformation T from R^n to R^n is called orthogonal if it preserves the length of vectors: ||T(x)|| = ||x|| for all x in R^n.
If T(x) = Ax is an orthogonal transformation, we say that A is an orthogonal matrix.

20 Example 2
Consider a subspace V of R^n. For a vector x in R^n, the vector ref_V(x) = 2 proj_V(x) − x is called the reflection of x in V. Show that reflections are orthogonal transformations.
(sol) – We can write x = proj_V(x) + (x − proj_V(x)) and ref_V(x) = proj_V(x) − (x − proj_V(x)). By the Pythagorean theorem,
||ref_V(x)||^2 = ||proj_V(x)||^2 + ||x − proj_V(x)||^2 = ||x||^2.

21 Orthogonal Transformations Preserve Orthogonality
Consider an orthogonal transformation T from R^n to R^n. If the vectors v and w in R^n are orthogonal, then so are T(v) and T(w).
Orthogonal transformations and orthonormal bases
– A linear transformation T from R^n to R^n is orthogonal if (and only if) the vectors T(e_1), ..., T(e_n) form an orthonormal basis of R^n.
– An n × n matrix A is orthogonal if (and only if) its columns form an orthonormal basis of R^n.

22 Products and Inverses of Orthogonal Matrices
The product AB of two orthogonal n × n matrices A and B is orthogonal.
– The linear transformation T(x) = ABx preserves length, because ||ABx|| = ||A(Bx)|| = ||Bx|| = ||x||.
The inverse A^{-1} of an orthogonal n × n matrix A is orthogonal.
– The linear transformation T(x) = A^{-1}x preserves length, because ||A^{-1}x|| = ||A(A^{-1}x)|| = ||x||.

23 Example 4
Consider an orthogonal matrix A. Find the matrix B whose ijth entry is the jith entry of A. Compute BA, and explain the result.
(sol) – We find BA = I_n. This result is no coincidence: the ijth entry of BA is the dot product of the ith row of B and the jth column of A. By definition of B, this is just the dot product of the ith column of A and the jth column of A. Since A is orthogonal, this product is 1 if i = j and 0 otherwise.

24 The Transpose of a Matrix
Consider an m × n matrix A.
– The transpose A^T of A is the n × m matrix whose ijth entry is the jith entry of A: the roles of rows and columns are reversed.
– We say that a square matrix A is symmetric if A^T = A, and A is called skew-symmetric if A^T = −A.
If v and w are two (column) vectors in R^n, then v · w = v^T w.
Consider an n × n matrix A. The matrix A is orthogonal if (and only if) A^T A = I_n or, equivalently, if A^{-1} = A^T.
The symmetric 2 × 2 matrices are those of the form [[a, b], [b, c]]. The symmetric 2 × 2 matrices form a three-dimensional subspace of R^{2×2}, with basis [[1, 0], [0, 0]], [[0, 1], [1, 0]], [[0, 0], [0, 1]].
The skew-symmetric 2 × 2 matrices are those of the form [[0, b], [−b, 0]]. These form a one-dimensional space with basis [[0, 1], [−1, 0]].

25 Properties of the Transpose
Consider an n × n matrix A. Then the following statements are equivalent:
– A is an orthogonal matrix.
– The transformation T(x) = Ax preserves length, that is, ||Ax|| = ||x|| for all x in R^n.
– The columns of A form an orthonormal basis of R^n.
– A^T A = I_n.
– A^{-1} = A^T.
Properties of the transpose
– If A is an m × n matrix and B an n × p matrix, then (AB)^T = B^T A^T. Note the order of the factors.
– If an n × n matrix A is invertible, then so is A^T, and (A^T)^{-1} = (A^{-1})^T.
– For any matrix A, rank(A) = rank(A^T).
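
A quick NumPy check of these equivalent conditions for a hypothetical rotation matrix (not from the slides):

```python
import numpy as np

theta = 0.3
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a rotation, hence an orthogonal matrix

x = np.array([3.0, 4.0])
print(np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x)))   # True: length preserved
print(np.allclose(A.T @ A, np.eye(2)))                        # True: A^T A = I
print(np.allclose(np.linalg.inv(A), A.T))                     # True: A^{-1} = A^T
```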

26 Orthogonal Projection
Consider a subspace V of R^n with orthonormal basis u_1, ..., u_m. The matrix of the orthogonal projection onto V is AA^T, where A = [u_1 ... u_m] is the matrix with columns u_1, ..., u_m. Pay attention to the order of the factors (AA^T as opposed to A^T A).
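
A small NumPy sketch of this formula, using a hypothetical orthonormal basis as the columns of A:

```python
import numpy as np

# Hypothetical orthonormal basis of a plane in R^3 (columns of A).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

P = A @ A.T                      # matrix of the orthogonal projection onto V
x = np.array([2.0, 3.0, 5.0])
print(P @ x)                     # [2. 3. 0.]
print(np.allclose(P @ P, P))     # True: projecting twice changes nothing
```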

27 Example 7
Find the matrix of the orthogonal projection onto the subspace of R^4 spanned by two given vectors u_1 and u_2.
(sol) – Note that the vectors u_1 and u_2 are orthonormal. Therefore, the matrix is AA^T with A = [u_1 u_2].

28 5.4 Least Squares and Data Fitting
Consider a subspace V = im(A) of R^n, where A = [v_1 ... v_m]. Then V^⊥ is the kernel of the matrix A^T: for any matrix A, (im A)^⊥ = ker(A^T).
For example, consider the line L in R^3 spanned by the vector [1, 2, 3]; then L^⊥ is the plane with equation x_1 + 2x_2 + 3x_3 = 0.

29 Properties of the Orthogonal Complement
Consider a subspace V of R^n. Then,
– dim(V) + dim(V^⊥) = n,
– (V^⊥)^⊥ = V,
– V ∩ V^⊥ = {0}.
If A is an m × n matrix, then ker(A) = ker(A^T A).
If A is an m × n matrix with ker(A) = {0}, then A^T A is invertible.
Consider a vector x in R^n and a subspace V of R^n. Then the orthogonal projection proj_V(x) is the vector in V closest to x, in that ||x − proj_V(x)|| < ||x − v|| for all v in V different from proj_V(x).

30 Least-Squares Approximations
Consider a linear system Ax = b, where A is an m × n matrix. A vector x* in R^n is called a least-squares solution of this system if ||b − Ax*|| ≤ ||b − Ax|| for all x in R^n.

31 The Normal Equation
The least-squares solutions of the system Ax = b are the exact solutions of the (consistent) system A^T A x = A^T b. The system A^T A x = A^T b is called the normal equation of Ax = b.
If ker(A) = {0}, then the linear system Ax = b has the unique least-squares solution x* = (A^T A)^{-1} A^T b.
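
A minimal NumPy sketch of solving the normal equation for a hypothetical overdetermined system; numpy.linalg.lstsq is shown for comparison and is the numerically preferable routine.

```python
import numpy as np

# Hypothetical inconsistent system Ax = b with ker(A) = {0}.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

x_star = np.linalg.solve(A.T @ A, A.T @ b)        # solve the normal equation A^T A x = A^T b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)   # library routine for comparison
print(x_star, x_lstsq)                            # both give the same least-squares solution
```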

32 Example 1
Find the least-squares solution x* of the system Ax = b for the given matrix A and vector b. What is the geometric relationship between Ax* and b?
(sol) – We compute A^T A and A^T b, and solve the normal equation to obtain x*.
– Recall that Ax* is the orthogonal projection of b onto the image of A.
– b − Ax* is indeed perpendicular to the two column vectors of A.

33 The Matrix of an Orthogonal Projection
Consider a subspace V of R^n with basis v_1, ..., v_m (not necessarily orthonormal). Let A = [v_1 ... v_m]. Then the matrix of the orthogonal projection onto V is A(A^T A)^{-1} A^T.
(Example 2) Find the matrix of the orthogonal projection onto the subspace of R^4 spanned by two given vectors v_1 and v_2.
(sol) – Let A = [v_1 v_2] and compute A(A^T A)^{-1} A^T.

34 Data Fitting
(Example 3) Find a cubic polynomial whose graph passes through the points (1, 3), (−1, 13), (2, 1), (−2, 33).
(sol) – We are looking for a function f(t) = c_0 + c_1 t + c_2 t^2 + c_3 t^3 such that f(1) = 3, f(−1) = 13, f(2) = 1, f(−2) = 33; that is, we have to solve the linear system
c_0 + c_1 + c_2 + c_3 = 3
c_0 − c_1 + c_2 − c_3 = 13
c_0 + 2c_1 + 4c_2 + 8c_3 = 1
c_0 − 2c_1 + 4c_2 − 8c_3 = 33.
– This linear system has the unique solution c_0 = 5, c_1 = −4, c_2 = 3, c_3 = −1.
– Thus, the cubic polynomial whose graph passes through the four given data points is f(t) = 5 − 4t + 3t^2 − t^3.

35 Example 4
Fit a quadratic function to the four data points (a_1, b_1) = (−1, 8), (a_2, b_2) = (0, 8), (a_3, b_3) = (1, 4), and (a_4, b_4) = (2, 16).
(sol) – We are looking for a function f(t) = c_0 + c_1 t + c_2 t^2 such that f(a_i) = b_i for i = 1, ..., 4; that is, Ac = b, where
A = [[1, −1, 1], [1, 0, 0], [1, 1, 1], [1, 2, 4]], c = [c_0, c_1, c_2], b = [8, 8, 4, 16].
– We have four equations, corresponding to the four data points, but only three unknowns, the three coefficients of a quadratic polynomial. Check that this system is indeed inconsistent.

36 Example 4 (II)
– The least-squares solution is c* = (A^T A)^{-1} A^T b = [5, −1, 3].
– The least-squares approximation is f*(t) = 5 − t + 3t^2, as shown in Figure 7.
– This quadratic function f*(t) fits the data points best, in that the vector Ac* is as close as possible to b.
– This means that ||b − Ac*||^2 = (b_1 − f*(a_1))^2 + (b_2 − f*(a_2))^2 + (b_3 − f*(a_3))^2 + (b_4 − f*(a_4))^2
is minimal: the sum of the squares of the vertical distances between graph and data points is minimal. (See Figure 8.)
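
The same fit can be reproduced numerically; a short NumPy sketch using the data of Example 4:

```python
import numpy as np

a = np.array([-1.0, 0.0, 1.0, 2.0])
b = np.array([ 8.0, 8.0, 4.0, 16.0])

# Columns 1, t, t^2 evaluated at the data points.
A = np.column_stack([np.ones_like(a), a, a**2])

c, *_ = np.linalg.lstsq(A, b, rcond=None)
print(c)    # approximately [5., -1., 3.], i.e. f*(t) = 5 - t + 3t^2
```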

37 Example 5
Find the linear function c_0 + c_1 t that best fits the data points (a_1, b_1), (a_2, b_2), ..., (a_n, b_n), using least squares. Assume that a_1 ≠ a_2.
(sol) – We attempt to solve the system c_0 + c_1 a_i = b_i for i = 1, ..., n; that is, Ac = b, where A is the n × 2 matrix with rows [1, a_i], c = [c_0, c_1], and b = [b_1, ..., b_n].
– Note that rank(A) = 2, since a_1 ≠ a_2.

38 Example 5 (II)
– The least-squares solution is c* = (A^T A)^{-1} A^T b, where A^T A = [[n, Σ a_i], [Σ a_i, Σ a_i^2]] and A^T b = [Σ b_i, Σ a_i b_i].
– We have found that c_1 = (n Σ a_i b_i − (Σ a_i)(Σ b_i)) / (n Σ a_i^2 − (Σ a_i)^2) and c_0 = ((Σ b_i)(Σ a_i^2) − (Σ a_i)(Σ a_i b_i)) / (n Σ a_i^2 − (Σ a_i)^2).
– These formulas are well known to statisticians. There is no need to memorize them.

39 Example 6
In the accompanying table, we list the scores of five students in the three exams given in a class. Find the function of the form f = c_0 + c_1 h + c_2 m that best fits these data, using least squares. What score f does your formula predict for Marlisa, another student, whose scores in the first two exams were h = 92 and m = 72?
(sol) – We attempt to solve the system c_0 + c_1 h_i + c_2 m_i = f_i for each of the five students, in the least-squares sense.

40 Example 6 (II)
– The least-squares solution is c_0 ≈ −42.4, c_1 ≈ 0.639, c_2 ≈ 0.799.
– The function which gives the best fit is approximately f = −42.4 + 0.639h + 0.799m.
– This formula predicts the score f ≈ −42.4 + 0.639 · 92 + 0.799 · 72 ≈ 74 for Marlisa.

41 5.5 Inner Product Spaces
An inner product in a linear space V is a rule that assigns a real scalar (denoted by ⟨f, g⟩) to any pair f, g of elements of V, such that the following properties hold for all f, g, h in V, and all c in R:
– ⟨f, g⟩ = ⟨g, f⟩
– ⟨f + h, g⟩ = ⟨f, g⟩ + ⟨h, g⟩
– ⟨cf, g⟩ = c⟨f, g⟩
– ⟨f, f⟩ > 0, for all nonzero f in V
A linear space endowed with an inner product is called an inner product space.
The inner product ⟨f, g⟩ = ∫_a^b f(t)g(t) dt for functions in C[a, b] is a continuous version of the dot product: sampling f and g at equally spaced points t_1, ..., t_N in [a, b] gives ⟨f, g⟩ ≈ (f(t_1)g(t_1) + ··· + f(t_N)g(t_N)) · (b − a)/N. The more subdivisions you choose, the better the dot product on the right will approximate the inner product.
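
A small Python sketch of this "continuous dot product" idea, approximating ⟨f, g⟩ on the interval [0, 1] by a Riemann sum; the helper inner and the sample functions are hypothetical, chosen for illustration.

```python
import numpy as np

def inner(f, g, a=0.0, b=1.0, n=10_000):
    """Approximate <f, g> = integral_a^b f(t) g(t) dt by a Riemann sum:
    a scaled dot product of the sampled function values."""
    t = np.linspace(a, b, n, endpoint=False)
    dt = (b - a) / n
    return np.dot(f(t), g(t)) * dt

f = lambda t: t
g = lambda t: 1.0 + 0.0 * t     # the constant function 1, vectorized
print(inner(f, g))              # approx 1/2 = integral_0^1 t dt
```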

42 Example 3
The trace of a square matrix is the sum of its diagonal entries. In R^{n×m}, the space of n × m matrices, we can define the inner product ⟨A, B⟩ = trace(A^T B).
– We will verify the first and the fourth axioms.
– ⟨A, B⟩ = trace(A^T B) = trace((A^T B)^T) = trace(B^T A) = ⟨B, A⟩.
– To check that ⟨A, A⟩ > 0 for nonzero A, write A in terms of its columns, A = [v_1 ... v_m]; then ⟨A, A⟩ = trace(A^T A) = v_1 · v_1 + ··· + v_m · v_m = ||v_1||^2 + ··· + ||v_m||^2.

43 Example 3 (II)
– If A is nonzero, then at least one of the columns v_i is nonzero, so that the sum ||v_1||^2 + ··· + ||v_m||^2 is positive, as desired.

44 Norm, orthogonality
The norm (or magnitude) of an element f of an inner product space is ||f|| = √⟨f, f⟩.
Two elements f and g of an inner product space are called orthogonal (or perpendicular) if ⟨f, g⟩ = 0.
We can define the distance between two elements of an inner product space as the norm of their difference: dist(f, g) = ||f − g||.
In physics, the quantity ||f||^2 can often be interpreted as energy. For example, it describes the acoustic energy of a periodic sound wave f(t) and the elastic potential energy of a uniform string with vertical displacement f(x). (See Figure 3.) The quantity ||f||^2 may also measure thermal or electric energy.

45 Examples 4, 5, 6
(Example 4) In the inner product space C[0, 1] with ⟨f, g⟩ = ∫_0^1 f(t)g(t) dt, find ||f|| for the given function f. (sol)
(Example 5) Show that f(t) = sin(t) and g(t) = cos(t) are perpendicular in the inner product space C[0, 2π] with ⟨f, g⟩ = ∫_0^{2π} f(t)g(t) dt.
(sol) – ⟨f, g⟩ = ∫_0^{2π} sin(t) cos(t) dt = (1/2)∫_0^{2π} sin(2t) dt = 0.
(Example 6) Find the distance between f(t) = t and g(t) = 1 in C[0, 1].
(sol) – dist(f, g) = ||f − g|| = √(∫_0^1 (t − 1)^2 dt) = √(1/3) = 1/√3.

46 Orthogonal Projections
If g_1, ..., g_m is an orthonormal basis of a subspace W of an inner product space V, then
proj_W f = ⟨g_1, f⟩g_1 + ··· + ⟨g_m, f⟩g_m, for all f in V.

47 Example 7
Find the linear function of the form g(t) = a + bt that best fits the function f(t) = e^t over the interval from −1 to 1, in a continuous least-squares sense.
(sol) – We need to find proj_{P_1} f with respect to the inner product ⟨f, g⟩ = ∫_{−1}^{1} f(t)g(t) dt. We first find an orthonormal basis of P_1 for this inner product. In general, we have to use the Gram–Schmidt process to find an orthonormal basis of an inner product space. Because the two functions 1, t in the standard basis of P_1 are already orthogonal, ⟨1, t⟩ = ∫_{−1}^{1} t dt = 0, we merely need to divide each function by its norm: ||1|| = √2 and ||t|| = √(2/3).
– An orthonormal basis of P_1 is g_1(t) = 1/√2 and g_2(t) = √(3/2)·t.
– Now, proj_{P_1} f = ⟨g_1, f⟩g_1 + ⟨g_2, f⟩g_2 = (1/2)∫_{−1}^{1} e^t dt + (3/2)t ∫_{−1}^{1} t e^t dt = (e − e^{−1})/2 + (3/e)t ≈ 1.18 + 1.10t.

48 Fourier Analysis
The space T_n consists of all functions of the form f(t) = a + b_1 sin(t) + c_1 cos(t) + ··· + b_n sin(nt) + c_n cos(nt), called trigonometric polynomials of order ≤ n. We use the inner product ⟨f, g⟩ = (1/π)∫_{−π}^{π} f(t)g(t) dt.
Consider the Euler identities: for distinct positive integers p and m,
∫_{−π}^{π} sin(pt) cos(mt) dt = 0, ∫_{−π}^{π} sin(pt) sin(mt) dt = 0, ∫_{−π}^{π} cos(pt) cos(mt) dt = 0.
These equations tell us that the functions 1, sin(t), cos(t), ..., sin(nt), cos(nt) are orthogonal to one another.
Another of Euler's identities tells us that ∫_{−π}^{π} sin^2(mt) dt = ∫_{−π}^{π} cos^2(mt) dt = π. This means that the functions sin(t), cos(t), ..., sin(nt), cos(nt) all have norm 1 with respect to the given inner product.

49 Orthonormal Basis in Fourier Analysis
Let T_n be the space of all trigonometric polynomials of order ≤ n, with the inner product ⟨f, g⟩ = (1/π)∫_{−π}^{π} f(t)g(t) dt. Then the functions
1/√2, sin(t), cos(t), ..., sin(nt), cos(nt)
form an orthonormal basis of T_n.

50 Fourier Approximation
If f is a piecewise continuous function defined on the interval [−π, π], then its best approximation f_n in T_n is
f_n(t) = proj_{T_n} f = a_0/√2 + b_1 sin(t) + c_1 cos(t) + ··· + b_n sin(nt) + c_n cos(nt),
where
a_0 = ⟨f, 1/√2⟩ = (1/π)∫_{−π}^{π} f(t)·(1/√2) dt,
b_k = ⟨f, sin(kt)⟩ = (1/π)∫_{−π}^{π} f(t) sin(kt) dt,
c_k = ⟨f, cos(kt)⟩ = (1/π)∫_{−π}^{π} f(t) cos(kt) dt.
The b_k, the c_k, and a_0 are called the Fourier coefficients of the function f. The function f_n is called the nth-order Fourier approximation of f.
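
A small numerical sketch of these coefficient formulas, with the integrals approximated by Riemann sums; the helper fourier_coefficients is hypothetical, and the test function f(t) = t matches Example 8 below.

```python
import numpy as np

def fourier_coefficients(f, n, samples=20_000):
    """Fourier coefficients a0, b_k, c_k of f on [-pi, pi], using the inner
    product <f, g> = (1/pi) * integral of f*g, approximated by a Riemann sum."""
    t = np.linspace(-np.pi, np.pi, samples, endpoint=False)
    dt = 2 * np.pi / samples
    inner = lambda g: np.sum(f(t) * g) * dt / np.pi
    a0 = inner(np.full_like(t, 1 / np.sqrt(2)))
    b = [inner(np.sin(k * t)) for k in range(1, n + 1)]
    c = [inner(np.cos(k * t)) for k in range(1, n + 1)]
    return a0, b, c

a0, b, c = fourier_coefficients(lambda t: t, 3)
print(a0, c)   # approximately zero (f is odd)
print(b)       # approximately [2, -1, 2/3], i.e. b_k = 2*(-1)^(k+1)/k
```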

51 Example 8
Find the Fourier coefficients for the function f(t) = t on the interval −π ≤ t ≤ π.
All c_k and a_0 are zero, since the integrands are odd functions. For the b_k we find
b_k = (1/π)∫_{−π}^{π} t sin(kt) dt = 2(−1)^{k+1}/k.
The first few Fourier polynomials are
f_1 = 2 sin(t),
f_2 = 2 sin(t) − sin(2t),
f_3 = 2 sin(t) − sin(2t) + (2/3) sin(3t).

52 Fourier Analysis (II)
The infinite series a_0^2 + b_1^2 + c_1^2 + b_2^2 + c_2^2 + ··· of the squares of the Fourier coefficients of a piecewise continuous function f converges to ||f||^2.

