Regressions and approximation Prof. Graeme Bailey (notes modified from Noah Snavely, Spring 2009)

Object recognition
1. Detect features in two images
2. Match features between the two images
3. Select three matches at random
4. Solve for the affine transformation T
5. Count the number of inlier matches to T
6. If T has the highest number of inliers so far, save it
7. Repeat 3-6 for N rounds, return the best T
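Below is a minimal sketch of steps 3-7 in Python with NumPy (not the course's reference code): src and dst are assumed to be (N, 2) arrays of matched feature locations, and the helper names are my own.

```python
import numpy as np

def affine_design_matrix(src):
    """Stack the two equations each point contributes for the 6 affine unknowns."""
    n = len(src)
    A = np.zeros((2 * n, 6))
    A[0::2, 0:2] = src   # a*x + b*y + c = x'
    A[0::2, 2] = 1
    A[1::2, 3:5] = src   # d*x + e*y + f = y'
    A[1::2, 5] = 1
    return A

def fit_affine(src, dst):
    """Solve for the 2x3 affine T mapping src -> dst (exact for 3 points,
    least squares for more)."""
    A = affine_design_matrix(src)
    rhs = dst.reshape(-1)   # interleaved x', y' values, matching A's row order
    params, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return params.reshape(2, 3)

def ransac_affine(src, dst, rounds=500, thresh=3.0):
    """Steps 3-7: fit T to 3 random matches, keep the T with the most inliers."""
    best_T, best_count = None, -1
    for _ in range(rounds):
        pick = np.random.choice(len(src), 3, replace=False)
        T = fit_affine(src[pick], dst[pick])
        pred = src @ T[:, :2].T + T[:, 2]            # apply T to every source point
        inliers = np.linalg.norm(pred - dst, axis=1) < thresh
        if inliers.sum() > best_count:
            best_T, best_count = T, inliers.sum()
    return best_T, best_count
```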

A slightly trickier problem
• What if we want to fit T to more than three points?
  – For instance, all of the inliers we find?
• Say we found 100 inliers
• Now we have 200 equations, but still only 6 unknowns
• Overdetermined system of equations
• This brings us back to linear regression
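Concretely, with 100 inlier matches the stacked system has 200 rows and only 6 columns, and NumPy's least-squares routine handles that overdetermined case directly. A small sketch, reusing the affine_design_matrix helper from the previous sketch and assuming src_in, dst_in hold the (100, 2) arrays of inlier matches:

```python
# Refit T using *all* inliers rather than just a 3-point sample.
A = affine_design_matrix(src_in)                 # 200 x 6 design matrix
rhs = dst_in.reshape(-1)                         # 200 target values
params, residuals, rank, _ = np.linalg.lstsq(A, rhs, rcond=None)
T_refined = params.reshape(2, 3)                 # minimises the sum of squared residuals
```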

Linear regression, > 2 points
• The 'best' line won't necessarily pass through any data point
[Figure: data points (x_i, y_i) scattered around a candidate line y = mx + b]

Some new definitions
• No line is perfect – we can only find the best model, the line y = mx + b, out of all the imperfect ones. This process is called optimisation.
• We'll define a function Cost(m,b) that measures how far a line is from the data, then use that to find the best line
  – I.e., the model [m,b] that minimizes Cost(m,b)
  – Such a function Cost(m,b) is called an objective function
  – Often, we're looking for a sequence of approximations which successively reduce the value of Cost(m,b)
  – There may or may not be a 'limit' to which these approximations converge.

Line goodness
• What makes a line good versus bad?
  – This is actually a very subtle question
  – In reality, our cost function is a distance function or metric which can measure the distance between two functions in the space of possible functions.
  – There may be several different functions having the same distance from a given target function … how might we choose the 'best'?

Residual errors
• The difference between what the model predicts and what we observe is called a residual error
  – Consider the data point (x,y) and the model (line) y = mx + b
  – The model [m,b] predicts (x, mx + b)
  – The residual is y – (mx + b), the error from using the model to predict the y value for that value of x
  – Each residual is the vertical distance to the proposed line
  – How do we measure the cumulative effect of all these residuals? There are many options here.
  – Once we've decided on our formula, how do we actually compute the least bad model (line)? This is called optimisation.
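As a tiny illustration (the xs, ys values below are made up, not from the lecture):

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0])   # made-up x values
ys = np.array([0.1, 0.9, 2.2, 2.8])   # made-up observed y values
m, b = 1.0, 0.0                        # a candidate model y = mx + b

predicted = m * xs + b                 # what the model says y should be
residuals = ys - predicted             # vertical distance from each point to the line
```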

Linear vertical error fitting
• Sum the sizes of the vertical errors: Cost(m,b) = Σ_i |y_i – (m x_i + b)|
• This is a reasonable cost function, but we usually use something slightly different
• What are the benefits and negatives of this cost metric?

Least squares fitting
• We prefer to make this a squared distance: Cost(m,b) = Σ_i (y_i – (m x_i + b))²
• Called "least squares"
• What about the benefits and negatives of this version?
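A sketch of both cost functions, using the made-up xs, ys arrays from the residuals example above; np.polyfit gives the (m, b) minimising the squared version:

```python
def abs_cost(m, b, xs, ys):
    """Sum of absolute vertical errors (the previous slide's cost)."""
    return np.sum(np.abs(ys - (m * xs + b)))

def squared_cost(m, b, xs, ys):
    """Sum of squared vertical errors ("least squares")."""
    return np.sum((ys - (m * xs + b)) ** 2)

# For the squared cost the best (m, b) has a closed form; np.polyfit computes it.
m_best, b_best = np.polyfit(xs, ys, deg=1)
```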

Measuring distance
• It's interesting to see how we can make precise the notions of distance, length, and angle.
• In particular, we'll see how to articulate our biases (preferences) so that we can compute for them.
• Anything which satisfies the following conditions is allowed to be called a distance metric
  – We'll use d(p,q) to denote the distance between p and q.
  – d(p,q) ≥ 0 always, and d(p,q) = 0 if and only if p = q.
  – d(p,q) = d(q,p) always
  – d(p,q) ≤ d(p,r) + d(r,q) always. This last condition is called the triangle inequality, and says that it's never shorter to go via somewhere else!
[Figure: points p, r, q illustrating the triangle inequality]

Measuring length
• If we had a way of measuring the length of a vector v from p to q, then the length of v would be d(p,q)
• Anything which satisfies the following conditions is allowed to be called a length or norm for vectors
  – We'll use ||v|| to denote the norm of v
  – ||v|| ≥ 0 always, and ||v|| = 0 if and only if v = 0
  – ||av|| = |a| ||v|| always (stretching factors)
  – ||v + w|| ≤ ||v|| + ||w|| always. This last condition is again the triangle inequality!
• If p and q are vectors from the origin to the points p and q respectively, then d(p,q) = ||p – q||
[Figure: origin 0, points p and q, and the vector p – q]

Measuring length
• Examples
  – In 2D, if v = xi + yj then define ||v|| = √(x² + y²). This gives the usual Pythagorean or Euclidean distance.
  – Also in 2D, we could define ||v|| = max(|x|, |y|). Notice that a 'circle' around the origin would look to us like a square!! (A circle is a set of points equidistant from its centre.)
  – In (countably) infinite dimensions, if v = Σ v_n e_n then we can define ||v|| = √(Σ v_n²) (assuming that series converges), or even as max(|v_n|), by analogy with the two previous examples.
  – For functions defined on the interval [a, b] we could again extend the analogies to define ||f|| as either the square root of the integral from a to b of f(x) squared, or the integral of the absolute value of f(x).
  – Notice the different properties emphasised by the Euclidean distances versus the max flavours. There is no 'right answer'; the choice as to which is 'best' depends on the context.
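The two 2D examples translate directly into code; a short sketch of the Euclidean and max norms and the distances they induce:

```python
import numpy as np

def euclidean_norm(v):
    return np.sqrt(np.sum(v ** 2))     # ||v|| = sqrt(x^2 + y^2 + ...)

def max_norm(v):
    return np.max(np.abs(v))           # ||v|| = max(|x|, |y|, ...)

# Any norm gives a distance via d(p, q) = ||p - q||.
p = np.array([3.0, 4.0])
q = np.array([0.0, 0.0])
print(euclidean_norm(p - q))   # 5.0  (the usual Pythagorean distance)
print(max_norm(p - q))         # 4.0  (the max-norm distance; its unit "circle" is a square)
```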

Measuring angle
• There's a curious, yet surprisingly powerful way to define the angle between two vectors. Anything which satisfies the following conditions is allowed to be called a dot, scalar or inner product.
  – We'll use v.w to denote the inner product of v and w.
  – v.v ≥ 0 always, and v.v = 0 if and only if v = 0
  – v.w = w.v always
  – v.(aw) = a(v.w)
  – v.(u + w) = v.u + v.w always.
• If we have an inner product, then we can use that to define a norm (via ||v||² = v.v) and then a metric. We define the angle θ between vectors v and w via the definition v.w = ||v|| ||w|| cos θ (assuming our scalars are reals; if they're complex numbers then we take the complex conjugate of the RHS).

Measuring angle
• Examples
  – In 2D, if v = v_1 i + v_2 j and w = w_1 i + w_2 j then we can define v.w = v_1 w_1 + v_2 w_2. This gives the usual Pythagorean or Euclidean angles and distances.
  – This can be extended to (countably) infinite dimensions and to function spaces in the same way (so the inner product of two functions f and g would be the integral of f(x)g(x) over [a, b]).
  – Perhaps surprisingly, there is no analogous example of inner product from which we can obtain the max norms (we can prove that no such inner product exists – it's not just that we couldn't see how to do it!!).
  – Having a definition of angle allows us to project the component of one vector onto another. So if ŵ is the unit vector in the direction of w, then the component of v in the direction of w is (||v|| cos θ) ŵ = ((v.w) / (w.w)) w
[Figure: vectors v and w with angle θ between them]
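The dot product, angle, and projection formulas above, written out as a short NumPy sketch:

```python
import numpy as np

v = np.array([2.0, 1.0])
w = np.array([3.0, 0.0])

dot = np.dot(v, w)                          # v.w = v_1 w_1 + v_2 w_2
norm = lambda u: np.sqrt(np.dot(u, u))      # ||u||^2 = u.u
cos_theta = dot / (norm(v) * norm(w))       # from v.w = ||v|| ||w|| cos(theta)
theta = np.arccos(cos_theta)

# Component of v in the direction of w: ((v.w) / (w.w)) w
proj = (dot / np.dot(w, w)) * w
```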

Applications
• Now that we know how to formalise distance, we can define it so that we weight differently according to what we regard as important distinctions
  – If it satisfies the definition it can be used as a distance metric!
  – Given a sense of distance, we can ask about convergence (namely getting closer to a limit). So then a sequence (of functions) f_n → f if d(f_n, f) → 0.
  – Notice that if f_n → f then necessarily d(f_n, f_m) → 0, i.e. the terms of the sequence must eventually get closer together. This is nicer since we don't have to know what the limit function is in order to show that a sequence ought to converge!! (Cauchy sequence.)
  – Suppose we wanted to ensure that our target line t was 'close' to the data points and also 'close' to 'the slope' of the data. We could estimate the data's slope by taking the (weighted?) mean µ of the slopes of the piecewise linear path through the data points and then define a corresponding metric.
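As one way to make the last bullet concrete, here is a sketch of a hand-rolled cost that penalises both the squared vertical distance to the points and disagreement with the estimated mean slope µ; the weight lam and the simple slope estimate are illustrative choices of mine, not the lecture's prescription:

```python
import numpy as np

def slope_aware_cost(m, b, xs, ys, lam=1.0):
    """Squared distance to the points plus a penalty for differing from the mean
    slope mu of the piecewise-linear path through the data (distinct xs assumed)."""
    order = np.argsort(xs)
    x_sorted, y_sorted = xs[order], ys[order]
    mu = np.mean(np.diff(y_sorted) / np.diff(x_sorted))   # mean segment slope
    point_term = np.sum((ys - (m * xs + b)) ** 2)
    slope_term = (m - mu) ** 2
    return point_term + lam * slope_term
```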