Running a model's adjoint to obtain derivatives, while more efficient and accurate than alternatives such as the finite difference method, is computationally expensive. In addition, writing the adjoint in the first place takes considerable effort and time. Therefore, if we can emulate the derivatives of a model, this would decrease the demand for writing and running adjoints. The derivative of a Gaussian process is itself a Gaussian process, so we can model the derivatives of a model as a Gaussian process. The performance of emulators predicting derivatives, built with and without derivative information, is shown in Figures 10 - 12.
Figure 10 (left). Emulating derivatives based on function output and known derivatives at 5 points.
Figure 10 shows the performance of an emulator predicting the derivatives of the true model, x + cos(x) + sin(x), based on the model output and the derivatives at 5 points.
Figure 11 (left). Emulating derivatives based on function output at 5 points.
Figure 11 shows the performance of an emulator whose training data consists only of the simulator output at the 5 design points. We can see from Figure 11 that, although we can still predict the derivatives of the true function, the uncertainty is much greater and the posterior mean is further from the true value than for the emulator which had derivative information included in the training data. It should be noted, though, that the computational expense required to build the two emulators of Figures 10 and 11 is not equal.
Figure 12 (right). Emulating derivatives based on function output at 8 points.
It would appear from Figure 11 that, without derivative information in the training data, 5 runs of this simulator are not enough for the emulator mean to produce an adequate approximation.
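The statement above that the derivative of a Gaussian process is again a Gaussian process can be made concrete with a short sketch. The Python below is illustrative only and is not the emulator behind Figures 10 - 12. It assumes a zero prior mean, a squared-exponential covariance with hand-picked length scale and variance, and 5 equally spaced design points on [0, 6]; the names `sq_exp`, `d_sq_exp` and `emulate_derivative` are hypothetical. It predicts the derivative of x + cos(x) + sin(x) from function output alone by differentiating the covariance function.

```python
import numpy as np

def true_model(x):
    return x + np.cos(x) + np.sin(x)

def true_derivative(x):
    return 1.0 - np.sin(x) + np.cos(x)

def sq_exp(a, b, length=1.5, var=4.0):
    """Squared-exponential covariance k(a, b); hyperparameters are assumed, not fitted."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def d_sq_exp(a, b, length=1.5, var=4.0):
    """Derivative of k(a, b) with respect to a, i.e. Cov(f'(a), f(b))."""
    d = a[:, None] - b[None, :]
    return -(d / length**2) * sq_exp(a, b, length, var)

def emulate_derivative(x_design, y_design, x_new, length=1.5, var=4.0, jitter=1e-8):
    """Posterior mean and standard deviation of f'(x_new) given function output only."""
    K = sq_exp(x_design, x_design, length, var) + jitter * np.eye(len(x_design))
    dK = d_sq_exp(x_new, x_design, length, var)
    mean = dK @ np.linalg.solve(K, y_design)
    # Prior variance of f' under this kernel is var / length**2.
    post_var = var / length**2 - np.einsum("ij,ji->i", dK, np.linalg.solve(K, dK.T))
    return mean, np.sqrt(np.clip(post_var, 0.0, None))

x_design = np.linspace(0.0, 6.0, 5)   # assumed design; the poster's design is not given
x_new = np.linspace(0.0, 6.0, 13)
mean, sd = emulate_derivative(x_design, true_model(x_design), x_new)
for xi, m, s, t in zip(x_new, mean, sd, true_derivative(x_new)):
    print(f"x={xi:4.1f}  emulated={m:7.3f} +/- {2 * s:5.3f}  true={t:7.3f}")
```

Because no derivative data enter the training set here, the standard deviation of the emulated derivative does not collapse to zero at the design points, which is in line with the wider uncertainty described for Figure 11.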
Continuing with the example, we evaluate 3 more runs of the simulator. This yields an emulator whose mean and standard deviation are shown in Figure 12.
We may not have a model’s adjoint, in which case the derivatives are unknown. We can still, however, emulate the derivatives of the model. Figures 11 and 12 illustrate emulators built only with function output.
Using Derivative Information in the Statistical Analysis of Computer Models
G. Stephenson, P. Challenor, R. Marsh
firstname.lastname@example.org
6. Conclusions
Thanks to Jeremy Oakley, Robin Hankin and all in the MUCM team. www.mucm.group.shef.ac.uk
4. Emulation of Derivatives
1. Introduction to Gaussian Process Emulators
Complex computer models are used in many areas, such as engineering and environmental science, to simulate the behaviour of real-world systems. A complex model may need an appreciable amount of computing time for a single run, and analyses such as sensitivity and uncertainty analysis can require many runs of the simulator. This quickly becomes impractical with a computationally expensive model. Greater efficiency can be achieved by building an emulator, which is a statistical approximation to the simulator. The approach we use is to model the simulator with a Gaussian process. The emulator is built from data collected by running the simulator at a small, specified number of input points. A Gaussian process emulator is illustrated in Figure 1.
Figure 1 (left). The true function is evaluated at 5 points and, training on this output, an emulator is built. The emulator mean is then evaluated at a set of untried input points and the standard deviation at each of these points is calculated. In this example, the posterior mean is close to the true value of the simulator; the predictions become worse and the uncertainty much greater once the emulator is forced to extrapolate.
At the 5 design points the uncertainty pinches in to zero, as the true value of the simulator at these points is known. The uncertainty grows the further we predict from a design point; how quickly it grows between design points depends on the roughness parameter in the emulator. If we increase the number of design points, we would expect the predictions to become closer to the simulator output and the uncertainty to decrease.
Figure 2 (left). Emulator with derivative information at 5 points.
The emulator in Figure 1 is repeated but here, in addition to the function output, we include the derivatives of the output with respect to the input at each of the 5 points. The emulator mean is again evaluated at a set of untried input points and the standard deviation at each of these points is calculated. The resulting emulator is shown in Figure 2.
Figure 3 (above, right). Emulator with derivative information at 3 points.
The posterior mean is still close to the true simulator output. The uncertainty reduces to zero at design points, as expected, but whereas in Figure 1 the uncertainty becomes appreciable once we start predicting away from a design point, here the uncertainty remains very small for predictions close to the design points. It is the derivative information in the model which allows for this reduced uncertainty.
It is necessary to consider the computational cost of using derivative information in computer experiments, as we must determine the point at which the costs outweigh the benefits. For example, if generating the derivatives of the model increases the computational cost substantially, this extra computing time may be better spent evaluating the model at more points instead.
3. Toy Model Investigation
C-GOLDSTEIN is an intermediate complexity climate model whose adjoint exists, enabling the evaluation of derivatives. An example of the type of output C-GOLDSTEIN produces is shown in Figure 13.
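As a rough illustration of how derivative observations can enter an emulator like those described for Figures 2 and 3, the sketch below conditions a Gaussian process jointly on function values and derivatives. It is a minimal example under stated assumptions, not the poster's implementation: a zero prior mean, a squared-exponential covariance with fixed, hand-picked hyperparameters, the toy function x + cos(x) + sin(x), and hypothetical names (`kernel_blocks`, `fit_predict_with_derivatives`).

```python
import numpy as np

def kernel_blocks(a, b, length, var):
    """Covariances between function values and derivatives under a squared-exponential kernel."""
    d = a[:, None] - b[None, :]
    k = var * np.exp(-0.5 * (d / length) ** 2)          # Cov(f(a), f(b))
    k_fd = (d / length**2) * k                           # Cov(f(a), f'(b))
    k_dd = (1.0 / length**2 - d**2 / length**4) * k      # Cov(f'(a), f'(b))
    return k, k_fd, k_dd

def fit_predict_with_derivatives(x, y, dy, x_new, length=1.5, var=4.0, jitter=1e-8):
    """Posterior mean and standard deviation of f(x_new) given outputs y and derivatives dy at x."""
    k, k_fd, k_dd = kernel_blocks(x, x, length, var)
    joint = np.block([[k, k_fd], [k_fd.T, k_dd]]) + jitter * np.eye(2 * len(x))
    ks, ks_fd, _ = kernel_blocks(x_new, x, length, var)
    cross = np.hstack([ks, ks_fd])                       # Cov(f(x_new), [f(x); f'(x)])
    mean = cross @ np.linalg.solve(joint, np.concatenate([y, dy]))
    post_var = var - np.einsum("ij,ji->i", cross, np.linalg.solve(joint, cross.T))
    return mean, np.sqrt(np.clip(post_var, 0.0, None))

def f(t):
    return t + np.cos(t) + np.sin(t)

def df(t):
    return 1.0 - np.sin(t) + np.cos(t)

x = np.linspace(0.0, 6.0, 5)        # assumed design points
x_new = np.linspace(0.0, 6.0, 25)
mean, sd = fit_predict_with_derivatives(x, f(x), df(x), x_new)
print("largest posterior sd between design points:", sd.max())
```

With the hyperparameters held fixed, conditioning on the extra derivative data cannot increase the posterior variance, which matches the tighter uncertainty near design points described above. In practice the hyperparameters would be estimated rather than fixed as they are here.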
Gaussian process emulation can provide a practical solution to running computationally expensive models, and results from investigations on toy models show that using derivative information when building emulators could improve efficiency. This is, however, dependent on the computational cost of obtaining derivatives. Moreover, we can use an emulator to predict the derivatives of a complex model. Further investigation is required, but current work suggests that emulation of derivatives could reduce the demand for writing and running adjoint models. Work will continue with further emulation of both the function output and the derivatives of C-GOLDSTEIN.
It is possible to obtain derivatives of a model's outputs with respect to its inputs. One approach is Automatic Differentiation (AD). Derivatives are generated by repeatedly applying the chain rule to the combinations of elementary operations in the model. The differentiated code then runs alongside the original code in the resulting model, which is termed the adjoint model. The value of learning derivatives when building emulators is being investigated to determine whether additional efficiency can be achieved. Figures 2 and 3 illustrate a Gaussian process emulator which has been built with the additional information provided by derivatives.
The performance of the emulator in Figure 2, while good, is such that it is difficult to identify precisely how and where the derivative information is having an effect. For this reason we now repeat the example but remove two of the simulator runs and their corresponding derivatives from the training data.
We emulate each toy model, with and without derivative information, for a range of numbers of simulator runs. As the simulators here are not complex models, it is possible to run them at all the points at which we test the emulator. An average prediction error is then determined from the difference between the posterior mean and the true value given by the simulator at each of these points.
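The average prediction error described above could be computed along the following lines. This is a sketch under assumptions: the text does not say whether absolute or squared differences are averaged, so the mean absolute difference is used here; the toy simulator is the x + cos(x) + sin(x) example rather than one of the three toy models; and `placeholder_mean` is a hypothetical stand-in (linear interpolation through 5 runs) for an emulator's posterior mean.

```python
import numpy as np

def average_prediction_error(posterior_mean, simulator, x_test):
    """Mean absolute difference between an emulator mean and the simulator at the test points."""
    x_test = np.asarray(x_test, dtype=float)
    return float(np.mean(np.abs(posterior_mean(x_test) - simulator(x_test))))

def toy_simulator(x):
    # Stand-in simulator, cheap enough to run at every test point.
    return x + np.cos(x) + np.sin(x)

def placeholder_mean(x):
    # Hypothetical stand-in for a posterior mean: linear interpolation through 5 runs.
    x_design = np.linspace(0.0, 6.0, 5)
    return np.interp(x, x_design, toy_simulator(x_design))

x_test = np.linspace(0.0, 6.0, 101)
print(average_prediction_error(placeholder_mean, toy_simulator, x_test))
```

Repeating this calculation for emulators built with and without derivatives, over a range of design sizes, would give curves of the kind summarised in Figures 7 - 9.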
We investigate the value of derivative information in Gaussian process emulation with 3 toy models. These are shown in Figures 4 - 6.
Figure 4 - Toy Model 1
Figure 5 - Toy Model 2
Figure 6 - Toy Model 3
Figure 7 - Performance of Emulator 1
Figure 8 - Performance of Emulator 2
Figure 9 - Performance of Emulator 3
Figures 7 - 9 show that, for all toy models tested here, the mean of an emulator built with derivatives provides a closer approximation to the relevant simulator. For models 2 and 3, an emulator without derivative information requires approximately twice as many simulator runs as an emulator with derivatives to achieve similar accuracy.
2. Use of Derivatives in Emulating Model Output
With derivative information, the emulator mean is much closer to the true function output, and the uncertainty much smaller, than for the emulator without derivatives (shown in Figure 1).
5. Application to the C-GOLDSTEIN climate model
C-GOLDSTEIN = Coupled Global-Linear Drag Salt and Temperature Equation Integrator. It is composed of a 3-d ocean model coupled with a 2-d energy moisture balance model of the atmosphere and a simple sea ice model. C-GOLDSTEIN has simplified physics and a low resolution (36x36 grid).
Figure 13. Air temperature at the year 2000. Figure 13 shows some output of C-GOLDSTEIN run to AD 2000 under default parameter settings.
We apply the method of Section 4 to C-GOLDSTEIN. 30 runs of the climate model are performed, varying 3 of the input parameters. An emulator is built with the resulting output of global mean air temperature. Validation data are produced using the finite difference method; 50 derivatives with respect to each input are generated.
Figure 14 (left). A comparison of the derivatives of global mean air temperature with respect to atmospheric moisture diffusivity, at the year 2000. There are some areas where the emulated derivatives match those produced by finite differences quite well.
A number of points show differences between the two methods, though, and additional runs of the simulator are likely to be required to improve the overall performance of the emulator. The corresponding plots for the 2 remaining parameters, ocean vertical diffusivity and atmospheric heat diffusivity, are omitted here but show similar patterns.
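The finite difference validation described above can be sketched as follows. The poster does not state which finite difference scheme or step size was used, so this example assumes central differences with a relative step. The function `finite_difference_gradient` and the stand-in `toy_simulator` are hypothetical and are not part of C-GOLDSTEIN.

```python
import numpy as np

def finite_difference_gradient(simulator, theta, rel_step=1e-4):
    """Central-difference estimate of d(output)/d(theta_i) for a scalar-output simulator."""
    theta = np.asarray(theta, dtype=float)
    grad = np.empty_like(theta)
    for i in range(theta.size):
        h = rel_step * max(abs(theta[i]), 1.0)   # step scaled to the parameter's size
        up, down = theta.copy(), theta.copy()
        up[i] += h
        down[i] -= h
        # Two extra simulator runs per input parameter.
        grad[i] = (simulator(up) - simulator(down)) / (2.0 * h)
    return grad

def toy_simulator(theta):
    # Cheap stand-in with 3 inputs and one scalar output.
    return theta[0] ** 2 + 3.0 * theta[1] + np.sin(theta[2])

theta0 = np.array([1.0, 2.0, 0.5])
print(finite_difference_gradient(toy_simulator, theta0))
```

With central differences, each validation point costs two simulator runs per input. For an expensive model this adds up quickly, which is part of the motivation for emulating derivatives instead.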