 # PHYS2020 NUMERICAL ALGORITHM NOTES ROOTS OF EQUATIONS.


## Finding the Roots of Equations

What does finding the roots of an equation mean? It means finding the x values for which $y = 0$, assuming our relationship is $y = f(x)$.

Let's take a small (familiar?) example: find the roots of the equation $y = x^2 + 3x - 4$.

What are the roots? Start with $y = 0$, so $0 = x^2 + 3x - 4$. Factorise to get $y = 0 = (x + 4)(x - 1)$. To make $y = 0$ true, either $x = -4$ or $x = 1$.

[Figure: graph of $y = x^2 + 3x - 4$, crossing the x-axis at $x = -4$ and $x = 1$.]

## Why is finding the roots of an equation important?

Finding when an equation is equal to 0 is very important. If you want to minimise (or maximise) something, you need to find the point where the derivative is equal to zero.

e.g. if $y = x^2 + c$, where $c$ is a constant, then the minimum or maximum of this function is found when $\frac{dy}{dx} = 0$, so in this case $\frac{dy}{dx} = 2x = 0$.

A max/min occurs when $dy/dx = 0$ (i.e. the gradient equals 0). So, we have $2x = 0$, so $x = 0$. The minimum value for this equation occurs when $x = 0$.

How can we tell if it is a minimum or a maximum? Try taking the second derivative! $d^2y/dx^2 = 2$. The value is positive, so the stationary point at $x = 0$ is a minimum.

## Optimisation

There are many examples where we may wish to find the minimum or maximum value of a function. Wishing to maximise or minimise a function is often known as optimisation. This is equivalent to finding the roots of the derivative of that function.

## Finding the Roots Numerically

Optimisation is a common strategy used in experimental physics. A good example is trying to measure the blackbody function of a protostar.

We measure the intensity of the electromagnetic radiation from the star at a number of wavelengths and then try to find the best fit between the measured values and those predicted by the exact blackbody function. Experimental uncertainties usually guarantee that no one blackbody function will fit the measurements exactly, so we try to find the blackbody function (or often range of functions) that minimises the difference between measured and predicted values.

[Figure: hypothetical measured values, marked with an x, plotted against candidate blackbody curves.]

Which curve is the best fit? Finding out takes the use of numerical optimisation.

## Finding Equation Roots with the Bisection Method

The bisection method is what is known as a bracketing method, because we bracket the root by finding a place where the function is positive and a place where it is negative. We successively halve this interval to eventually find the root.

[Figure: graph of y against x; the root lies between an x value where $y > 0$ and an x value where $y < 0$.]

For $y = f(x)$, if we find an x value where y is positive and another where y is negative, then we are guaranteed that a root exists on the interval bracketed by these two values of x, provided of course that y is continuous over this interval.

Once we find $x_1$ and $x_2$ such that $y_1$ and $y_2$ have opposite signs, we then evaluate y for $x_m$, the midpoint between $x_1$ and $x_2$. Then take a new interval bracketed by either $x_1$ and $x_m$, or $x_m$ and $x_2$, depending on which interval is bracketed by a positive and a negative value of y.

[Figure: the root lies between $x_1$ (where $y_1 < 0$) and $x_2$ (where $y_2 > 0$); the midpoint $x_m$ is marked with the question "$y_m > 0$ or $< 0$?". In this diagram, after calculating $x_m$ we know that the root lies between $x_1$ and $x_m$.]
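The bracketing-and-halving procedure described here can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `bisect`, the tolerance `tol`, and the iteration cap `max_iter` are assumptions introduced for the example, and the test function is the quadratic $y = x^2 + 3x - 4$ from the start of these notes.

```python
def bisect(f, x1, x2, tol=1e-10, max_iter=200):
    """Find a root of f between x1 and x2 by repeated halving.

    Assumes f is continuous on [x1, x2] and that f(x1) and f(x2)
    have opposite signs. Names and defaults are illustrative only.
    """
    y1, y2 = f(x1), f(x2)
    if y1 == 0.0:
        return x1
    if y2 == 0.0:
        return x2
    if y1 * y2 > 0.0:
        raise ValueError("f(x1) and f(x2) must have opposite signs")

    for _ in range(max_iter):
        xm = 0.5 * (x1 + x2)          # midpoint of the current bracket
        ym = f(xm)
        if ym == 0.0 or 0.5 * abs(x2 - x1) < tol:
            return xm
        if y1 * ym < 0.0:
            x2 = xm                   # sign change lies in [x1, xm]
        else:
            x1, y1 = xm, ym           # sign change lies in [xm, x2]
    return 0.5 * (x1 + x2)


def quadratic(x):
    return x**2 + 3.0 * x - 4.0


print(bisect(quadratic, 0.0, 3.0))    # bracket around the root at x = 1
print(bisect(quadratic, -6.0, -2.0))  # bracket around the root at x = -4
```

Only the sign of `f` at three points is used at each step, which is why the method needs nothing more than the ability to evaluate (or measure) the function.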

## Some comments on bracketing methods

- ROBUSTNESS: Bracketing is a robust method of root finding. It will always give us a root provided the conditions mentioned are met.
- SPEED OF CONVERGENCE: However, bracketing can converge slowly compared to fixed-point methods (such as Newton’s method). The trade-off is that although fixed-point methods may converge faster, there is no guarantee that they will converge at all.
- FORM OF FUNCTION f(x): One big advantage of bracketing methods is that an analytic form of the function f(x) need not be known. This is particularly useful when you have a series of laboratory measurements that are positive and negative, for example the position of a chaotic oscillator at particular times. Sometimes it may be above the equilibrium position and other times below.

## Convergence of the bisection method

It is easy to see how quickly the bisection method converges, and what the uncertainty in the value of the root we obtain will be after $n$ iterations. The uncertainty in $x_n$ (the value of the root after $n$ iterations) will be related to the size of the interval after $n$ iterations (bisections). The time taken to converge to this level of uncertainty will be approximately the time it takes for $n$ iterations of your code.

Each bisection halves the interval, so if the initial interval has size $I_0$, the size of the interval after $n$ iterations will be:

$$I_n = \frac{I_0}{2^n}$$

So, after $n$ iterations, the uncertainty in $x_n$ (compared to the true unknown value of the root, $x^{*}$) is known to be

$$|x_n - x^{*}| \le \frac{I_0}{2^n}$$
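Because the bracket halves at every step, the number of bisections needed for a given uncertainty can be estimated before running anything. The sketch below assumes an initial bracket of width `initial_width` and a target uncertainty `target`; both names, and the helper `bisections_needed`, are introduced here purely for illustration. It solves $\text{initial width} / 2^n \le \text{target}$ for the smallest whole number $n$.

```python
import math


def bisections_needed(initial_width, target):
    """Smallest n such that initial_width / 2**n <= target."""
    if initial_width <= 0.0 or target <= 0.0:
        raise ValueError("widths must be positive")
    if target >= initial_width:
        return 0                      # the starting bracket is already small enough
    return math.ceil(math.log2(initial_width / target))


# A bracket of width 3 narrowed to an uncertainty of 1e-6:
print(bisections_needed(3.0, 1e-6))   # 22, since 2**21 < 3e6 < 2**22
```

Each extra decimal digit of precision costs a little over three further bisections, because $\log_2 10 \approx 3.32$.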

## The Newton-Raphson Method

The Newton-Raphson method (often just called Newton’s method) is a fixed-point method of finding the roots of an equation. This is an iterative method based on truncating the Taylor series expansion of f(x) at the first-order term. It can also be arrived at by geometric considerations alone, which is what we will use here.

To use Newton’s method we need to know the form of $f(x)$, and we need an initial approximation to the root, $x_0$.

[Figure: the curve $y = f(x)$ with the initial approximation $x_0$ and the value $f(x_0)$ marked.]

We construct the tangent line to $f(x)$ at $x_0$, and then find the point, $x_1$, where the tangent line intercepts the x-axis. We take $x_1$ as the new approximation to the root and then do the same again.

[Figure: the tangent to $y = f(x)$ at $x_0$, meeting the x-axis at $x_1$.]

We now construct the tangent line to $f(x)$ at $x_1$, and then find the point, $x_2$, where the new tangent line intercepts the x-axis. We then take $x_2$ as the next approximation to the root and iterate to find further approximations. In the example illustrated here we are converging quite quickly.

[Figure: tangents to $y = f(x)$ at $x_0$ and $x_1$, giving the successive approximations $x_1$ and $x_2$.]

To obtain a mathematical expression for Newton’s method we need to find an expression for $x_1$ in terms of $x_0$ and $f(x)$. We start by finding the equation for the tangent line at $f(x_0)$, and then finding where this intercepts the x-axis to find $x_1$.

[Figure: the tangent to $y = f(x)$ at $x_0$, meeting the x-axis at $x_1$.]

What do we know about the tangent line to $f(x)$ at $x_0$? Remember that we can only use this method if we know $f(x)$ analytically. So, if we have a first guess at the root, $x_0$, we also know $f(x_0)$. This means that we know one point on the tangent line, $(x_0, f(x_0))$. We also know the slope of the tangent line, which is given by

$$m = f'(x_0) = \left.\frac{df}{dx}\right|_{x = x_0}$$

Remembering that the equation of a straight line through a known point $(x_0, y_0)$ with slope $m$ is given by the expression

$$y - y_0 = m\,(x - x_0)$$

we can substitute in $(x_0, f(x_0))$ and $f'(x_0)$ to get the equation for the tangent line:

$$y - f(x_0) = f'(x_0)\,(x - x_0)$$

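Now that we know the general equation for the tangent line, we want to know the x-value, $x_1$, where this line crosses the x-axis, so we set $y = 0$. This gives

$$0 - f(x_0) = f'(x_0)\,(x_1 - x_0) \quad\Rightarrow\quad x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$$

So, more generally, Newton’s method tells us that

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$$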
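The tangent-line update, in which each new approximation is the old one minus $f/f'$ evaluated there, translates almost directly into code. The following is a minimal sketch under stated assumptions: the name `newton`, the stopping tolerance `tol`, and the iteration cap `max_iter` are choices made for this example, and the derivative is supplied analytically by the caller.

```python
def newton(f, dfdx, x0, tol=1e-12, max_iter=50):
    """Iterate x -> x - f(x)/f'(x) starting from x0.

    Stops when successive approximations differ by less than tol.
    Raises if the tangent is horizontal or the iteration cap is hit,
    since convergence is not guaranteed.
    """
    x = x0
    for _ in range(max_iter):
        slope = dfdx(x)
        if slope == 0.0:
            raise ZeroDivisionError("tangent is horizontal; no x-intercept")
        x_new = x - f(x) / slope      # where the tangent crosses the x-axis
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("Newton's method did not converge")


def quadratic(x):
    return x**2 + 3.0 * x - 4.0


def quadratic_slope(x):
    return 2.0 * x + 3.0


print(newton(quadratic, quadratic_slope, 2.0))    # starts near the root at x = 1
print(newton(quadratic, quadratic_slope, -6.0))   # starts near the root at x = -4
```

Which root is found depends entirely on the starting guess; nothing in the iteration itself keeps it inside a bracket.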

## Convergence of Newton’s Method

Newton’s method converges quadratically. This means that the error at the (n + 1)th step is proportional to the square of the error at the nth step. (This is in contrast to the bisection method, which exhibits linear convergence.)

Quadratic convergence is good if your first approximation is close to the root you are seeking: the square of a small number is a much smaller number. However, in practical terms, if your first approximation is not “close enough” to the root you are seeking, the convergence may be very slow, or the series may not converge at all.

“Close enough” is a physicist’s term rather than a mathematician’s term. I (as a physicist) would merely caution you that you will find Newton’s method particularly useful for solving problems such as finding the correct blackbody function for an observed protostar, but you will need to find a good first approximation to the root you seek.
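Quadratic convergence can be seen directly by tracking the error at each step. The sketch below applies Newton's update to the quadratic $y = x^2 + 3x - 4$ from the start of these notes, whose root at $x = 1$ is known, and prints the ratio of each error to the square of the previous one. The helper name `newton_errors` and the starting guess of 2 are assumptions made for the illustration.

```python
def newton_errors(f, dfdx, x0, true_root, steps):
    """Return |x_n - true_root| for n = 0..steps under Newton's update."""
    errors = [abs(x0 - true_root)]
    x = x0
    for _ in range(steps):
        x = x - f(x) / dfdx(x)
        errors.append(abs(x - true_root))
    return errors


def quadratic(x):
    return x**2 + 3.0 * x - 4.0


def quadratic_slope(x):
    return 2.0 * x + 3.0


errors = newton_errors(quadratic, quadratic_slope, x0=2.0, true_root=1.0, steps=4)
for n in range(len(errors) - 1):
    ratio = errors[n + 1] / errors[n] ** 2
    print(f"n={n}  error={errors[n]:.3e}  next_error/error^2={ratio:.3f}")
```

For this particular function the ratio works out to $1/(2x_n + 3)$, so it settles towards $1/5$ as $x_n$ approaches 1. A roughly constant ratio of this kind is the signature of quadratic convergence; under bisection the error bound only halves at each step.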

## Advantages of Newton’s Method

Newton’s method is not as “robust” as the bisection method, but it does have a very significant advantage: it generalises fairly easily to multi-dimensional problems, where you need to find a root in multi-dimensional space given n simultaneous equations each involving n scalar variables, whereas the bisection method does not.

In multi-dimensional space all the problems of convergence are magnified. Newton’s method is prone to either finding a “local” minimum (i.e. not the root you want), or not converging at all.

Of course, this can be overcome. Just remember that you must use “physical” considerations to find a good first approximation for your desired minimum or root!
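To make the multi-dimensional generalisation concrete, here is a minimal sketch for two equations in two unknowns. It is an illustration under stated assumptions rather than anything taken from the lecture: the slope $f'(x)$ is replaced by the 2×2 matrix of partial derivatives (the Jacobian), and the update solves a small linear system at each step. The function names, the example system ($x^2 + y^2 = 4$ together with $xy = 1$), and the starting guess are all invented for the example.

```python
def newton_2d(residuals, jacobian, x, y, tol=1e-12, max_iter=50):
    """Newton iteration for two equations f1(x, y) = 0 and f2(x, y) = 0."""
    for _ in range(max_iter):
        f1, f2 = residuals(x, y)
        (a, b), (c, d) = jacobian(x, y)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian; cannot take a step")
        # Solve J * (dx, dy) = -(f1, f2) by Cramer's rule.
        dx = (-f1 * d + b * f2) / det
        dy = (-a * f2 + c * f1) / det
        x, y = x + dx, y + dy
        if abs(dx) + abs(dy) < tol:
            return x, y
    raise RuntimeError("Newton's method did not converge")


def circle_and_hyperbola(x, y):
    """Residuals of x^2 + y^2 = 4 and x*y = 1."""
    return x**2 + y**2 - 4.0, x * y - 1.0


def circle_and_hyperbola_jacobian(x, y):
    """Partial derivatives of the two residuals with respect to x and y."""
    return (2.0 * x, 2.0 * y), (y, x)


solution = newton_2d(circle_and_hyperbola, circle_and_hyperbola_jacobian, 2.0, 0.5)
print(solution)
```

This system has four solutions, and a different starting guess can land on a different one or fail to converge, which is where the “physical” considerations mentioned above come in.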

In the case of trying to fit observations of the intensity of a protostar at various wavelengths to a single blackbody function, each measurement gives an equation predicting temperature on the basis of that measurement. If they all suggest the same temperature, life is easy. Generally, however, each one gives a different temperature, so it is a matter of trying to minimise the difference between observed and predicted values.

In my experience of experimental physics this general class of problem is one of the most important we deal with:

1. Postulate an underlying natural law (e.g. a blackbody function to explain emission from a protostar).
2. Make predictions that experiment can test (e.g. what intensity do we expect to observe at a given wavelength?).
3. Get the best match possible between measurements and the underlying law (e.g. minimise the difference between observed values and postulated laws).
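The three steps can be sketched end to end for the blackbody case. Everything specific in the example below is an assumption made for illustration: the "observations" are synthetic values generated from a 4000 K Planck curve with a few percent of made-up scatter, the misfit is taken to be the sum of squared relative differences, and the minimum is located by bisecting on a numerical estimate of the misfit's derivative, so that the minimum is found as a root of the derivative. All function names are hypothetical.

```python
import math

H = 6.62607015e-34    # Planck constant (J s)
C = 2.99792458e8      # speed of light (m/s)
K_B = 1.380649e-23    # Boltzmann constant (J/K)


def planck(wavelength, temperature):
    """Blackbody spectral radiance at a wavelength (m) and temperature (K)."""
    x = H * C / (wavelength * K_B * temperature)
    return 2.0 * H * C**2 / wavelength**5 / math.expm1(x)


def misfit(temperature, wavelengths, observed):
    """Sum of squared relative differences between prediction and data."""
    return sum(((planck(w, temperature) - obs) / obs) ** 2
               for w, obs in zip(wavelengths, observed))


def misfit_slope(temperature, wavelengths, observed, step=0.01):
    """Central-difference estimate of d(misfit)/dT."""
    up = misfit(temperature + step, wavelengths, observed)
    down = misfit(temperature - step, wavelengths, observed)
    return (up - down) / (2.0 * step)


def fit_temperature(wavelengths, observed, t_low, t_high, tol=1e-3):
    """Bisect on the misfit slope to find where it crosses zero."""
    s_low = misfit_slope(t_low, wavelengths, observed)
    s_high = misfit_slope(t_high, wavelengths, observed)
    if s_low * s_high > 0.0:
        raise ValueError("misfit slope does not change sign on this interval")
    while t_high - t_low > tol:
        t_mid = 0.5 * (t_low + t_high)
        s_mid = misfit_slope(t_mid, wavelengths, observed)
        if s_low * s_mid <= 0.0:
            t_high = t_mid
        else:
            t_low, s_low = t_mid, s_mid
    return 0.5 * (t_low + t_high)


# Synthetic data: a 4000 K curve with invented scatter (not real measurements).
wavelengths = [0.5e-6, 1.0e-6, 1.5e-6, 2.0e-6, 3.0e-6]
scatter = [1.02, 0.98, 1.01, 0.99, 1.00]
observed = [planck(w, 4000.0) * s for w, s in zip(wavelengths, scatter)]

best_t = fit_temperature(wavelengths, observed, 3000.0, 5000.0)
print(f"best-fit temperature: {best_t:.1f} K")
```

Bisection is used here because a bracket is easy to choose on physical grounds: the misfit clearly falls with temperature at the cold end of the interval and rises at the hot end. Newton's method could be substituted once a good first approximation is available.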