# Formal Computational Skills

## Presentation on theme: "Formal Computational Skills"— Presentation transcript:

Formal Computational Skills
Numerical Methods for Differential Equations

Today’s lecture: By the end you should know:
How to solve a first order differential equation like: dy/dx = - y + x (eg ctrnn equation) using Euler’s method What affects the accuracy of numerical integration The existence of more advanced methods such as Runge-Kutta etc (briefly) how to solve higher order derivatives and issues affecting their solution Why you need to know it: Lots of applications: building agent simulations, simulating neurons, working out robot speeds etc etc

Rationale Behind Numerical Methods
“God does not care about our mathematical difficulties: he integrates empirically” Einstein Differential equations are everywhere in dynamical systems eg: dx/dt = f(t), what is x at t=2 ie x(2)?? Must integrate f(t) Basically: there’s lots of difficult maths concerned with exactly integrating differential equations, most of which are impossible Instead: Use some very simple maths to work out an approximate answer which works well enough Trade-off: Can take a LONG time to generate the numerical answer (but also takes a long time to learn hard maths …)

Difference Equations Euler’s method
Basic idea is to go back to the definition of the derivative: x(t+h) x(t) t t+h Or: If we assume (pretend) dx/dt is constant between t and t+h get: Thus if we know: dx/dt = f(t) and assume dx/dt is constant for intervals of h, given x at time t, x(t) , we have: x(t+h) = x(t) + h f(t) Euler’s method

Euler’s method estimates x(t+h) by starting at x(t) …
dx/dt|t+h=f(x,t+h) dx/dt|t=f(x,t) x(t+2h) x(t+h) Then evaluating the gradient at t and x(t). x(t) then use x(t+h) to get gradient at t+h and use both to estimate x(t+2h) t t+h t+2h Eg At t=0 have: x(0) = 0 and dx/dt|t=0 = f(0) = 2. Setting h=1: x(1) = x(0) + h dx/dt|t=0 = x(0) + h f(0) = = 2 At t = 1 have x(1) = 2 and dx/dt|t=1 = f(1) = 1 so: x(2) = x(1) + h dx/dt|t=1 = x(1) + h f(1) = = 3 Basically split time up into bits of h and solve iteratively. Note errors may accumulate

eg: Simulated robot controlled by a neural network which specifies its velocity dx/dt = v(x,t) at each time-step. If it starts at x = 0 at t = 0, what positions will it be in at each time-step? Set step-size h=1 and assume constant v between t and t+h so: x(t+1) = x(t) + v(t) t = … v = … x = 0 2 6 5 7 Note x must be defined initially in this case at t=0 Therefore known as Euler’s method for Initial Value Problems (IVPs) However, positions are only estimates. Robot must accelerate etc so estimates will be wrong. What determines the accuracy?

Accuracy Try robot with constant acceleration: evaluate x at t=4 starting with x(0)=0 using: x(t+h) = x(t) + h dx/dt = x(t) + hv 1) h = t = dx/dt = v = t v = dx/dt = x = 0 [Real ans = 0.5 a t2 = 0.5 x 16 = 8] 2) double time-step h h = t = 0 2 4 dx/dt = v = t v = dx/dt = x = [Real ans = 8] 3) double acceleration h = t = dx/dt = v = 2t v = dx/dt = x = [Real ans = 16] What is the error between estimates at t=4 and real answers?

1) t = v = dx/dt = x = Error = 8-6 = 2 2) double time-step (h) t = 0 2 4 v = dx/dt = x = Error = 8 – 4 = 4 3) double acceleration t = v = dx/dt = x = Error = = 4 So accuracy depends on h AND acceleration ie the smoothness of the function x

Makes sense intuitively and graphically:
x(t+h) Error h dx/dt x(t) t t+h Since we assume the gradient is constant over the range h, the accuracy of x(t+h) is increased (in one sense, see later) by making the increase in time-step h (and thus the range of the assumption) small However it also depends on how curvy (non-smooth) x is ie on the size of dx/dt, d2x/dt2, etc as this determines how much changes over h

The reliance of the accuracy of Euler’s method on small h can be made explicit by examining the Taylor expansion of y(t) ie: So if : Error Then the error is: Error = Thus for small h the error is determined by the factor h2 ie O(h2) and the method is said to be first order accurate But error also depends on higher order differentials so normally assume y is sufficiently smooth that these values are low

t t+h y(t) y(t+h) h dy/dt Note that Euler’s method is asymmetric as we assume dy/dt has a constant value evaluated at the beginning of the time interval (ie between t and t+h)

More symmetric (and accurate) to use an estimate from the middle of the interval ie at h/2
y(t) y(t+h) h dy/dt t+h/2 However to do this we will need to use the value of y at t+h/2 which we do not yet know So use the Runge-Kutta method as follows…

Suppose dy/dt= f(y,t) Do:
1. Estimate y(t+h/2) using dy/dt evaluated at t ie f(y(t),t) (estimate 1) 2. Use y(t+h/2) to evaluate dy/dt at t+h/2 ie f(y(t+h/2), t+h/2) (estimate 2) y(t+h/2) 3. Use this estimate of dy/dt to calculate y at t+h starting at y(t) (estimate 3) t t+h/2 t+h y(t+h/2) = y(t) + h/2 f(y(t), t) (estimate 1) (estimate 3) y(t+h) = y(t) + h f(y(t+h/2), t+h/2) (estimate 2) This is the 2nd Order Runge-Kutta method

As the name suggests it is 2nd order accurate (method called n’th order if error term is O(hn+1)
Why? Can show that: So comparing this estimate with the Taylor expansion: We see the error is O(h3) Using a similar procedure but using 4 evaluations of dy/dt we get the commonly used 4th order Runge-Kutta NB This is the basic idea of how Euler’s method is improved: for current/more complicated methods, see Numerical Recipes For MSc, only Runge-Kutta is needed, and usually only Euler

Note that high order does NOT always imply high accuracy: rather it means that the method is more efficient ie quicker to run as larger step-sizes can be used Generally, for difficult functions (non-smooth), simpler methods will be more accurate if small h used However, errors also arise as a consequence of rounding in computer evaluation (round-off errors) and while truncation error reduces as we lower h, round-off errors will increase Thus need constraints on step-sizes used so that algorithm remains stable ie round-off errors do not swamp the solution Alternatively, use a combination of forward/backward differences (explicit/implicit) which are much more computationally expensive to run (see Numerical Recipes chapter 19 for more details)

Note that one can get an estimate of the gradient at B from an advanced timestep (forward differencing): explicit scheme – fast and simple or from a previous timestep (backward difference) implicit scheme. More complex to solve

Higher Order Derivatives
For higher order derivatives, can use Euler twice, eg network outputs robot acceleration rather than velocity: By Euler But: and so However, for as the schemes get more complex and higher order, stability becomes more of an issue Also harder to ensure accuracy and so often check it by eg running with time-step h, then time-step h/2 and check difference