1 Last lecture summary. Independent vectors and rank: the rank is the number of independent columns (equivalently, independent rows) of a matrix. The rank of this matrix is 2, so the matrix is noninvertible (singular). The column space and the row space always have the same dimension. Here row 2 = row 1 + row 3, so only two rows are independent and the rank is 2.
2 Column space: the space given by the columns of the matrix and all their linear combinations. The columns of a matrix span the column space. We are highly interested in a set of vectors that spans a space and is independent. Such a set of vectors is called a basis for the vector space. A basis is not unique, but every basis has the same number of vectors, and that number is the dimension of the space. The rank is the dimension of the column space.
3 For an m x n matrix A of rank r: column space C(A), dim C(A) = r; null space N(A), dim N(A) = n - r; row space C(A^T), dim C(A^T) = r; left null space N(A^T), dim N(A^T) = m - r. C(A) ┴ N(A^T) and C(A^T) ┴ N(A): the row space and the null space are orthogonal complements.
5 Orthogonal = perpendicular: the dot product is zero, a^T b = a1 b1 + a2 b2 + … = 0. The length of a vector: |a| = √(|a|^2) = √(a^T a). If subspace S is orthogonal to subspace T, then every vector in S is orthogonal to every vector in T.
6 Four possibilities for Ax = b (A is m × n with rank r):
r = m and r = n: square and invertible, 1 solution
r = m and r < n: short and wide, ∞ solutions
r < m and r = n: tall and thin, 0 or 1 solution
r < m and r < n: not full rank, 0 or ∞ solutions
7 Least squares problem: introduction. Based on the excellent video lectures by Gilbert Strang, MIT, Lecture 15.
8 I want to solve Ax = b when there is no solution. WHAT??
9 So b is not in the column space. This problem is not rare; it is actually quite typical. It appears when the number of equations is bigger than the number of unknowns (i.e. m > n for an m x n matrix A). So what can you tell me about the rank? What can the rank be? It can't be m; it can be n or even less. So there will be a lot of right-hand sides (RHS) with no solution! - show the example: a 3 x 2 matrix with two independent columns, then Ax lies only in the plane given by these two columns, however
10 In many problems we've got too many equations with noisy right-hand sides (b). Example: you measure the position of a satellite buzzing around. There are six parameters giving the position. You measure the position 1000 times, and you want to solve Ax = b, where A is 1000 x 6. So I can't expect to solve Ax = b exactly, because there's measurement error in b. But there's information too: there's a lot of information about x in there. So I'd like to separate the noise from the information.
11 One way to solve the problem is to throw away some measurements until we get a nice square, non-singular matrix. That's not satisfactory: there's no reason to say that these measurements are perfect and those measurements are useless. We want to use all the measurements to get the best information. But how?
12 Now I want you to jump ahead to the matrix that will play a key role: the matrix A^T A. What can you tell me about this matrix? Shape? Square. Dimension? n x n. Symmetric or not? Symmetric. Now we can ask more about the matrix; the answers will come later in the lecture. Is it invertible? If not, what is its null space? Now let me tell you in advance what equation to solve when you can't solve Ax = b: multiply both sides by A^T from the left, and you get A^T A x = A^T b. But this x is not the same as the x in Ax = b, so let's call it x̂: A^T A x̂ = A^T b. I am hoping this one will have a solution, and I will say it's my best solution. This is going to be my plan.
13 So you see why I am so interested in the matrix A^T A and its invertibility. Now let's ask ourselves when A^T A is invertible, and do it by example: a 3 x 2 matrix, i.e. 3 equations in 2 unknowns, with rank = 2. Does Ax equal b? When can we solve it? Only if b is in the column space of A, i.e. b is a combination of the columns of A. The combinations just fill up a plane, but most vectors b will not be on that plane.
14 So I am saying I will work with the matrix A^T A. Help me, what is A^T A for this A? Is this A^T A invertible? Yes. However, A^T A is not always invertible! Propose an A such that A^T A is not invertible. Generally, if I have two matrices, each with rank r, their product can't have rank higher than r. And in our case rank(A) = 1, so rank(A^T) = 1 as well, and rank(A^T A) can't be more than 1.
15 This always happens: rank(A^T A) = rank(A). N(A) is contained in N(A^T A), and equal ranks give the two null spaces the same dimension n - r, so N(A^T A) = N(A). So A^T A is invertible exactly if N(A) = {0}, which means exactly when the columns of A are independent.
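A quick check of this rule, as a minimal sketch in plain Python. The first matrix is the 3 x 2 matrix of the line-fitting example in these notes; the second one, with dependent columns, is a made-up example. The helper names `gram` and `det2` are only illustrative. For a 2 x 2 matrix, "invertible" is the same as "nonzero determinant".

```python
def gram(A):
    """Return A^T A for a matrix A given as a list of rows."""
    n = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(n)]
            for i in range(n)]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A_independent = [[1, 1], [1, 2], [1, 3]]   # independent columns, rank 2
A_dependent = [[1, 2], [1, 2], [1, 2]]     # column 2 = 2 * column 1, rank 1 (made up)

print(gram(A_independent), det2(gram(A_independent)))   # [[3, 6], [6, 14]] 6
print(gram(A_dependent), det2(gram(A_dependent)))       # [[3, 6], [6, 12]] 0
```

With independent columns the determinant of A^T A is nonzero, so A^T A is invertible; with dependent columns it is zero.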
16 Projections. Based on the excellent video lectures by Gilbert Strang, MIT, Lecture 15.
17 I want to find the point on the line through a that is closest to b. What is my space? The 2D plane. Is the line through a a subspace? Yes, it is, a one-dimensional one. So where is such a point? It is the point p where the error e = b - p, i.e. how much I am wrong by, is perpendicular to a. We know that the projection p is some multiple of a, p = xa, and we want to find the number x. So we say we projected the vector b onto the line through a; we projected b into a subspace. And how did we get it? Orthogonality. [Figure: vectors a and b, the projection p = xa on the line through a, and the error e = b - p.]
18 The key point is that a is perpendicular to e. So I have a^T e = a^T (b - p) = a^T (b - xa) = 0. After some simple math we get x = a^T b / a^T a. I may look at the problem from another point of view: the projection from b to p is carried out by some matrix, called the projection matrix P, so that p = Pb. What is P in our case?
19 Projection matrix. What is its column space? How does the column space of a matrix A behave? If you multiply the matrix A by anything, you always land in the column space. That's what the column space is. So where am I if I do Pb? I am on the line through a. The column space of P is the line through a.
20 What is the rank of P? One. A column times a row is a rank-one matrix: every column of the product is a multiple of the column vector, so the column vector is a basis for its column space.
21 P is symmetric. Show me why. What happens if I do the projection twice, i.e. I multiply by P and then by P again (P × P = P^2)?
22 So if I project b, and then do the projection again, I what? Stay put. So P^2 = P: the projection matrix is idempotent. [Figure: b, its projection p = xa = Pb, and the error e = b - p.]
23 Summary: if I want to project onto a line, there are three formulas to remember: x = a^T b / a^T a, p = ax, P = a a^T / a^T a. And the properties of P: P = P^T, P = P^2.
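The projection onto a line can be tried out on numbers. This is a minimal sketch in plain Python with exact fractions; the vectors a and b are made-up examples, the function names are only illustrative, and integer entries are assumed. It uses x = a^T b / a^T a, p = ax and P = a a^T / a^T a, and then checks P = P^T and P^2 = P.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product u^T v."""
    return sum(ui * vi for ui, vi in zip(u, v))

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[dot(row, col) for col in zip(*Y)] for row in X]

def project_onto_line(a, b):
    """Project b onto the line through a (integer entries assumed).

    Returns the multiple x_hat, the projection p and the matrix P.
    """
    aTa = dot(a, a)
    if aTa == 0:
        raise ValueError("a must be a nonzero vector")
    x_hat = Fraction(dot(a, b), aTa)                         # x = a^T b / a^T a
    p = [x_hat * ai for ai in a]                             # p = a x
    P = [[Fraction(ai * aj, aTa) for aj in a] for ai in a]   # P = a a^T / a^T a
    return x_hat, p, P

a = [1, 2, 2]   # made-up direction of the line
b = [1, 1, 1]   # made-up vector to project
x_hat, p, P = project_onto_line(a, b)
e = [bi - pi for bi, pi in zip(b, p)]        # error e = b - p

print(x_hat)                                 # 5/9
print(dot(a, e))                             # 0: e is perpendicular to a
print(P == [list(col) for col in zip(*P)])   # True: P = P^T
print(matmul(P, P) == P)                     # True: P^2 = P
```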
24 More dimensions. Three formulas again, but different: we won't have a single line, but a plane, a 3D or an nD subspace. You may be asking why I actually project. Because Ax = b may have no solution. I am given a problem with more equations than unknowns, and I can't solve it. The problem is that Ax is in the column space, but b does not have to be. So I change the vector b into the closest vector in the column space of A, and I solve Ax̂ = p instead, where p is the projection of b onto the column space. I write x̂ to indicate that I am not looking for the x from Ax = b (which actually does not exist), but for the x that is the best possible.
25 I must figure out what the good projection is here: what is the good RHS that is in the column space and is as close as possible to b? Let's move into 3D space, where I have a vector b that I want to project onto a plane (i.e. a subspace of 3D space).
26 The vector b is not in the plane, and I want to project b down onto the plane. So I am looking for a nice formula for the projection of b onto the plane. First of all, I've got to say what that plane is. How am I going to tell you a plane? I'll tell you a basis for the plane: two vectors a1 and a2. They don't have to be perpendicular, but I will choose them to be perpendicular. This is the plane of a1 and a2, and it is the column space of the matrix A. The projection p is some combination of the basis vectors, p = x̂1 a1 + x̂2 a2 = Ax̂, and I am looking for x̂. I have an error e = b - p, and e is perpendicular to the plane. So now I've got hold of the problem: find the right combination of the columns so that the error vector (b - Ax̂) is perpendicular to the plane. [Figure: the plane spanned by a1 and a2, the vector b, its projection p, and the error e = b - p.]
27 I write the main point again. The projection is p = Ax̂. The problem is to find x̂. The key is that e = b - Ax̂ is perpendicular to the plane. So I am looking for two equations, because I have x̂1 and x̂2. And e is perpendicular to the plane, which means it must be perpendicular to each vector in the plane; in particular it must be perpendicular to a1 and a2! So which two equations do I have? Help me. a1^T (b - Ax̂) = 0 and a2^T (b - Ax̂) = 0.
28 A word about subspaces. The two equations can be written in matrix form: A^T (b - Ax̂) = 0. In what subspace does (b - Ax̂) lie? Well, this is actually the vector e, so I have A^T e = 0. Thus in which space is e? In N(A^T)! And from the last lecture, what do we know about N(A^T)? It is perpendicular to C(A).
29 e is in N(A^T), so e is ┴ to C(A). It holds perfectly. We are all happy, aren't we? We all have good reason to be. [Figure: the plane spanned by a1 and a2, with b, p, and e = b - p.]
30 OK, we've got the equation, let's solve it. A^T A x̂ = A^T b: these are the normal equations. A^T A is an n by n matrix. As in the line case, we must get answers to three questions: What is x̂? What is the projection p? What is the projection matrix P?
31 x̂ is what? Help me. x̂ = (A^T A)^-1 A^T b. What is the projection p = Ax̂? p = A (A^T A)^-1 A^T b. What is the projection matrix in p = Pb? P = A (A^T A)^-1 A^T.
32 Can I do this: split (A^T A)^-1 into A^-1 (A^T)^-1, so that P = A A^-1 (A^T)^-1 A^T = I? Apparently not, but why not? What did I do wrong? A is not a square matrix, so it does not have an inverse. Of course, this formula works well if A is a square invertible n x n matrix. Then its column space is the whole what? R^n. Then b is already in the column space, the whole of R^n, and I am projecting b there, so P = I.
33 P = P^T and P = P^2 also hold. Prove P^2 = P! So we have all the formulas. And when will I use these equations? When I have more equations (measurements) than unknowns: least squares, fitting by a line.
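The projection matrix P = A (A^T A)^-1 A^T and its two properties can be checked on a small case. This is a minimal sketch in plain Python with exact fractions, assuming an m x 2 matrix A with integer entries and independent columns; the helper names are only illustrative. The matrix A and the vector b are the ones from the line-fitting example in these notes.

```python
from fractions import Fraction

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def inverse2(M):
    """Exact inverse of an invertible 2 x 2 integer matrix."""
    (p, q), (r, s) = M
    det = p * s - q * r
    if det == 0:
        raise ValueError("A^T A is singular: the columns of A are dependent")
    return [[Fraction(s, det), Fraction(-q, det)],
            [Fraction(-r, det), Fraction(p, det)]]

def projection_matrix(A):
    """P = A (A^T A)^-1 A^T for an m x 2 matrix A with independent columns."""
    At = transpose(A)
    return matmul(matmul(A, inverse2(matmul(At, A))), At)

A = [[1, 1], [1, 2], [1, 3]]
b = [1, 2, 2]
P = projection_matrix(A)
p = [sum(Pij * bj for Pij, bj in zip(row, b)) for row in P]   # p = P b

print(transpose(P) == P)               # True: P = P^T
print(matmul(P, P) == P)               # True: P^2 = P
print(", ".join(str(v) for v in p))    # 7/6, 5/3, 13/6
```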
35 Least squares calculation. Based on the excellent video lectures by Gilbert Strang, MIT, Lecture 16.
36 Projection matrix recap. The projection matrix P = A (A^T A)^-1 A^T projects a vector b to the nearest point in the column space (i.e. Pb). Let's have a look at two extreme cases. If b is in the column space, then Pb = b. Why? What does it mean that b is in the column space of A? b is a linear combination of the columns of A, i.e. b has the form Ax. So Pb = PAx = A (A^T A)^-1 A^T A x = Ax = b.
37 If b is ┴ to the column space of A, then Pb = 0. Why? What vectors are perpendicular to the column space? Vectors in N(A^T). So A^T b = 0 and Pb = A (A^T A)^-1 A^T b = 0. In general p + e = b with p = Pb, so b - e = Pb and e = (I - P)b. That's a projection too: the projection onto the ┴ space N(A^T). When P projects onto one subspace, I - P projects onto the perpendicular subspace. [Figure: b splits into p in C(A) and e in N(A^T).]
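The two extreme cases and the role of I - P can be seen on numbers. This is a minimal sketch in plain Python. The matrix P below is what the formula P = A (A^T A)^-1 A^T gives for A = [1 1; 1 2; 1 3] from the line-fitting example; the test vectors [2 3 4] and [1 -2 1] are made-up examples, and the helper name `apply` is only illustrative.

```python
from fractions import Fraction as F

# P = A (A^T A)^-1 A^T for A = [1 1; 1 2; 1 3]
P = [[F(5, 6), F(1, 3), F(-1, 6)],
     [F(1, 3), F(1, 3), F(1, 3)],
     [F(-1, 6), F(1, 3), F(5, 6)]]

# I - P projects onto the perpendicular subspace N(A^T)
I_minus_P = [[(1 if i == j else 0) - P[i][j] for j in range(3)]
             for i in range(3)]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

in_column_space = [2, 3, 4]       # = 1*[1 1 1] + 1*[1 2 3]
in_left_null_space = [1, -2, 1]   # perpendicular to [1 1 1] and [1 2 3]
b = [1, 2, 2]

p = apply(P, b)           # part of b in C(A)
e = apply(I_minus_P, b)   # part of b in N(A^T)

print(apply(P, in_column_space) == in_column_space)   # True: Pb = b
print(apply(P, in_left_null_space) == [0, 0, 0])      # True: Pb = 0
print([pi + ei for pi, ei in zip(p, e)] == b)         # True: p + e = b
```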
38 Points (1,1), (2,2), (3,2). (The points in the picture are shifted for better readability.) I am looking for the best line (smallest overall error) y = a + bx, meaning I am looking for a and b. I have three points; they do not lie exactly on a line, but close to one. And I want to find the best line, with the smallest error. OK, I want to find the matrix A; once we have A, we can do all we need. Equations: a + b = 1, a + 2b = 2, a + 3b = 2. But this system of equations can't be solved.
39 In other words, the best solution is the line with the smallest errors over all points. So I want to minimize the length |Ax - b|, which is the error |e|; actually, I want to minimize the never-negative quantity |Ax - b|^2. So the overall error is the sum of squares |e1|^2 + |e2|^2 + |e3|^2. What are p1, p2, p3? If I put them into the equations, a + b = p1, a + 2b = p2, a + 3b = p3, I can solve them: the vector [p1, p2, p3] is in the column space. [Figure: the points b1, b2, b3, the points p1, p2, p3 on the line, and the errors e1, e2, e3; axes x and y.]
40 Least squares: the traditional way. Least squares problem ("method of least squares"): the sum of squares of the errors is minimized. Points (x, y): (1,1), (2,2), (3,2). I am looking for a line a + bx = y. Equations: a + b = 1, a + 2b = 2, a + 3b = 2.
41 Points (x, y): (1,1), (2,2), (3,2). I'm looking for the line y = a + bx. So if there is a solution, each point lies on that line: a + b = 1, a + 2b = 2, a + 3b = 2. However, there is apparently no solution, no line on which all three points lie. The optimal line a + bx will go somewhere between the points. Thus for each point there will be some error (i.e. the y value of the line at that point will differ from the required y value). What are the errors? e1 = a + b - 1, e2 = a + 2b - 2, e3 = a + 3b - 2.
1. I take the square of each of these errors and add them up: E = (a + b - 1)^2 + (a + 2b - 2)^2 + (a + 3b - 2)^2. I want to find the minimum of this expression. There are two variables (a, b), so I must find the partial derivatives with respect to a and with respect to b, and each of them has to be zero at the minimum.
dE/da = 2(a + b - 1)·1 + 2(a + 2b - 2)·1 + 2(a + 3b - 2)·1 = 6a + 12b - 10, and this must equal zero, i.e. 3a + 6b = 5.
dE/db = 2(a + b - 1)·1 + 2(a + 2b - 2)·2 + 2(a + 3b - 2)·3 = 12a + 28b - 22, and this must equal zero, i.e. 6a + 14b = 11.
2. In matrix form, I must solve A^T A x̂ = A^T b, where the matrix A and the vector b come from a + b = 1, a + 2b = 2, a + 3b = 2, thus A = [1 1; 1 2; 1 3] and b = [1 2 2]. So A^T A (one must calculate this product for the given matrix A) is [3 6; 6 14], and A^T b is [5 11].
Approaches 1 and 2 are EQUIVALENT: they lead to the same equations and the same solution.
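The calculus route can be checked numerically. This is a minimal sketch in plain Python with exact fractions; the function names `E`, `dE_da` and `dE_db` are only illustrative. It evaluates the sum of squares and its two partial derivatives at a = 2/3, b = 1/2, which is the solution of 3a + 6b = 5 and 6a + 14b = 11.

```python
from fractions import Fraction

points = [(1, 1), (2, 2), (3, 2)]

def E(a, b):
    """Sum of squared errors of the line y = a + b*x over the points."""
    return sum((a + b * x - y) ** 2 for x, y in points)

def dE_da(a, b):
    """Partial derivative of E with respect to a."""
    return sum(2 * (a + b * x - y) for x, y in points)

def dE_db(a, b):
    """Partial derivative of E with respect to b."""
    return sum(2 * (a + b * x - y) * x for x, y in points)

a, b = Fraction(2, 3), Fraction(1, 2)   # solves 3a + 6b = 5, 6a + 14b = 11

print(dE_da(a, b), dE_db(a, b))              # 0 0
print(E(a, b))                               # 1/6
print(E(a, b) < E(a + Fraction(1, 10), b))   # True: moving away increases E
```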
42 Least squares: the linear algebra way. [Figure: b splits into the projection p in C(A) and the error e in N(A^T).]
43 And now the computation. Task: find p and x̂ = [a b]. Let's solve the equation A^T A x̂ = A^T b. Help me, what is A^T A? And what is A^T b? So I have to solve (by Gauss elimination) a system of linear equations: 3a + 6b = 5, 6a + 14b = 11. The solution is a = 2/3, b = 1/2.
44 Best line: y = 2/3 + 1/2 x. What is p1? The value of the line at x = 1, i.e. 7/6. And e1? 1 - p1 = -1/6. Similarly p2 = 5/3, e2 = +2/6, p3 = 13/6, e3 = -1/6. So we have the projection vector p = [7/6 5/3 13/6] and the error vector e = [-1/6 2/6 -1/6] for the points (1,1), (2,2), (3,2). Yes, that's right!
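The whole computation for the three points fits in a few lines. This is a minimal sketch in plain Python with exact fractions, assuming integer coordinates; the function name `fit_line` is only illustrative. Note one swap: the 2 x 2 normal equations are solved here with Cramer's rule instead of Gauss elimination, which gives the same solution.

```python
from fractions import Fraction

def fit_line(points):
    """Least-squares line y = a + b*x from the normal equations.

    For A with rows [1, x_i] and right-hand side y:
    A^T A = [[m, sx], [sx, sxx]] and A^T y = [sy, sxy].
    """
    m = len(points)
    sx = sum(x for x, _ in points)
    sxx = sum(x * x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    det = m * sxx - sx * sx
    if det == 0:
        raise ValueError("columns of A are dependent (all x are equal)")
    a = Fraction(sy * sxx - sx * sxy, det)   # Cramer's rule
    b = Fraction(m * sxy - sx * sy, det)
    return a, b

points = [(1, 1), (2, 2), (3, 2)]
a, b = fit_line(points)
p = [a + b * x for x, _ in points]          # values of the best line
e = [y - pi for (_, y), pi in zip(points, p)]   # errors e = b - p

print(a, b)                            # 2/3 1/2
print(", ".join(str(v) for v in p))    # 7/6, 5/3, 13/6
print(", ".join(str(v) for v in e))    # -1/6, 1/3, -1/6
```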
45 p and e should be perpendicular. Verify that: calculate p·e for the given case. However, e is perpendicular not only to p. Give me another vector that e is perpendicular to. Well, e is perpendicular to the column space, so? It must be perpendicular to the columns of the matrix A, i.e. to [1 1 1] and [1 2 3]. Once again: fitting by a straight line means solving the key equation A^T A x̂ = A^T b. But A must have independent columns; then A^T A is invertible. If not, oops, sorry, I am out of luck.
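The perpendicularity check takes three dot products. This is a minimal sketch in plain Python, using the values p = [7/6 5/3 13/6] and e = [-1/6 2/6 -1/6] computed for the points (1,1), (2,2), (3,2); the helper name `dot` is only illustrative.

```python
from fractions import Fraction as F

p = [F(7, 6), F(5, 3), F(13, 6)]    # projection of b onto C(A)
e = [F(-1, 6), F(2, 6), F(-1, 6)]   # error e = b - p

def dot(u, v):
    """Dot product u^T v."""
    return sum(ui * vi for ui, vi in zip(u, v))

# e is perpendicular to p and to both columns of A
print(dot(p, e), dot([1, 1, 1], e), dot([1, 2, 3], e))   # 0 0 0
```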