How to Emulate: Recipes without Patronising

How to Emulate: Recipes without Patronising. The MUCM Toolkit. Dan Cornford, Aston University

Overview: What is the toolkit, and why does it exist? How is it delivered? Current toolkit contents. A (slightly contrived) tour through parts of the toolkit. What is the future of the toolkit? What would you like to see in the toolkit?

What is the toolkit? A series of linked (web) pages. Threads follow the derivation of a major idea as a series of linked pages; core threads cover the main areas, and variants cover specialisations. Procedures describe an operation or algorithm and provide sufficient information to allow the operation to be implemented. Discussions cover issues that may arise during the implementation of a method, or other optional details. Alternatives present the available options when building a specific part of an emulator (e.g. choosing a covariance function) and provide some guidance for making the selection. Examples show how to use the techniques in practice. Definitions explain a term or a concept. Meta pages are any pages that do not fall into one of the above categories, usually pages about the Toolkit itself.

What are the main threads? ThreadCoreGP: the core model, dealt with by fully Bayesian Gaussian process emulation. ThreadCoreBL: the core model, dealt with by Bayes Linear emulation. And to come: ThreadVariantMultipleOutputs: a variant of the core model in which we emulate more than one output of a simulator. ThreadGenericMultipleEmulators: dealing with multiple outputs from more than one emulator. ThreadVariantMultipleSimulators: a variant of the core model, emulating outputs from more than one related simulator. ThreadVariantDynamic: a special case of multiple outputs as time series. ThreadVariantStochastic: a variant of the core model in which the simulator output is random. ThreadVariantDerivatives: a variant of the core model in which we also model derivatives of outputs.

Example

Do I have to read it linearly? Pages can be accessed individually or as part of a thread. We will add cross-cutting threads, e.g. on design for computer models.

How are we creating it? The toolkit is built using a wiki. All of the MUCM team contribute. Tony O’Hagan is the editor in chief; Yiannis Andrianakis manages the overall technology. We release sections of the toolkit to a web site as they become mature, which gives us control over the quality of the content. We plan further enhancements to the presentation: a more graphical presentation of the structure, and the ability for users to add comments to pages.

How to use the toolkit: I’ll use a scenario to motivate this. A chemical engineer is working on an azoisopropane chemical process simulation. The process involves two key chemicals, which react to produce 39 main chemicals, with 42 reactions possible. Thus the simulator has 39 + 2 × 42 = 123 inputs. For now the chemist is mainly interested in a single output, the main target azoisopropane concentration: just one output. I want to show how the toolkit can help here.

What does the chemist want to know? There are many chemical reactions, but which are the most important for determining the output variation? This is in essence a sensitivity analysis. Not all the reaction rates and activation energies are perfectly known; many are not directly observable. Initial concentrations can be controlled. ThreadCoreGP is relevant here.
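The question "which inputs drive the output variation?" can be made concrete with variance-based main effect indices. The sketch below is only an illustration under stated assumptions: `toy_simulator` is a hypothetical two-input stand-in (not the azoisopropane model), inputs are taken as independent and uniform on [0, 1], and the indices are estimated by plain Monte Carlo on the function itself rather than through an emulator.

```python
import numpy as np

def toy_simulator(x):
    """Hypothetical stand-in for a simulator: input 0 matters far more than input 1."""
    return 4.0 * x[:, 0] + 1.0 * x[:, 1]

def main_effect_indices(f, n_inputs, n_samples, rng):
    """Monte Carlo estimate of first-order (main effect) sensitivity indices.

    Assumes independent inputs, each uniform on [0, 1].
    """
    A = rng.random((n_samples, n_inputs))
    B = rng.random((n_samples, n_inputs))
    fA, fB = f(A), f(B)
    total_var = np.var(np.concatenate([fA, fB]))
    fB_centred = fB - fB.mean()
    indices = np.empty(n_inputs)
    for i in range(n_inputs):
        AB = A.copy()
        AB[:, i] = B[:, i]  # AB shares only input i with B
        # Covariance induced by the shared input estimates Var(E[f | x_i]).
        indices[i] = np.mean(fB_centred * (f(AB) - fA)) / total_var
    return indices

rng = np.random.default_rng(0)
S = main_effect_indices(toy_simulator, n_inputs=2, n_samples=100_000, rng=rng)
print(S)  # for this toy function the exact values are 16/17 and 1/17
```

With a simulator that takes tens of seconds per run and 123 inputs, this many direct evaluations would be impractical, which is one reason to compute such indices from an emulator instead.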

Exploratory analysis, prior judgements: The chemist expects only a few reactions to be important, and wants to know which these are. At present they use local estimates based on simulator Jacobians. The model is not too complex: a typical evaluation takes a few tens of seconds, depending on the target time. Reaction rate parameters within the model are likely to lie in the range 0.5x to 2.0x, where x is the specified value.
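The stated range of 0.5x to 2.0x can be turned into a set of simulator runs. The following is a minimal sketch, not the toolkit's design procedure: it assumes a Latin hypercube design, samples the multiplier uniformly on a log scale (so that halving and doubling are treated symmetrically), and uses three made-up nominal values in `nominal` purely for illustration.

```python
import numpy as np

def latin_hypercube(n_runs, n_inputs, rng):
    """Return an (n_runs, n_inputs) Latin hypercube sample on [0, 1)."""
    # Each column of `ranks` is an independent random permutation of 0..n_runs-1,
    # so every input has exactly one point in each of the n_runs equal strata.
    ranks = np.argsort(rng.random((n_runs, n_inputs)), axis=0)
    return (ranks + rng.random((n_runs, n_inputs))) / n_runs

def to_multipliers(unit_design, low=0.5, high=2.0):
    """Map [0, 1) values to multipliers in [low, high), uniform on the log scale."""
    log_low, log_high = np.log(low), np.log(high)
    return np.exp(log_low + unit_design * (log_high - log_low))

rng = np.random.default_rng(1)
nominal = np.array([1.0e3, 2.5e-2, 7.0])  # hypothetical specified values x
unit = latin_hypercube(n_runs=30, n_inputs=nominal.size, rng=rng)
design = nominal * to_multipliers(unit)   # one row per simulator run
print(design[:3])
```

Each row of `design` is one input configuration to pass to the simulator; the resulting outputs would form the training set for an emulator.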

ThreadCoreGP: how to emulate. ThreadCoreGP discusses all the issues that need to be tackled when undertaking emulation in the following situation: we are concerned with only one simulator; the simulator produces only one output; the output is deterministic; we do not have observations of the real world; we do not make statements about the real-world process; we cannot directly observe derivatives of the simulator. We’ll explore how we can use ThreadCoreGP.

What is in ThreadCoreGP? A definition of what a Gaussian process is. A discussion of the implications of using a Gaussian process. Alternatives to the ‘full Bayesian’ approach: Bayes Linear methods. Technical information and discussion of alternatives for: determining active inputs; mean functions and covariance functions; choice of prior distributions; experimental design of simulator runs; fitting the emulator; using the emulator for prediction, uncertainty analysis and sensitivity analysis.
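To make the fit-then-predict steps listed above concrete, here is a minimal sketch of a Gaussian process emulator for a single deterministic output. It is a simplification and not the toolkit's procedure: the variance and length scale are fixed by hand rather than estimated, the mean is linear with coefficients obtained by generalised least squares, and the predictive variance ignores uncertainty in those coefficients. The function `sin(2πx) + x` is a hypothetical stand-in for a simulator, and all function names are illustrative.

```python
import numpy as np

def sq_exp_cov(a, b, variance, lengthscale):
    """Squared exponential covariance between the rows of a and b."""
    sq_dist = np.sum(((a[:, None, :] - b[None, :, :]) / lengthscale) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dist)

def basis(X):
    """Linear mean function basis h(x) = (1, x)."""
    return np.column_stack([np.ones(len(X)), X])

def fit_emulator(X, y, variance, lengthscale, jitter=1e-8):
    """Condition a GP with a linear mean on training runs (X, y)."""
    K = sq_exp_cov(X, X, variance, lengthscale) + jitter * np.eye(len(X))
    L = np.linalg.cholesky(K)

    def solve(b):
        # Solve K z = b using the Cholesky factor.
        return np.linalg.solve(L.T, np.linalg.solve(L, b))

    H = basis(X)
    beta = np.linalg.solve(H.T @ solve(H), H.T @ solve(y))  # generalised least squares
    alpha = solve(y - H @ beta)
    return {"X": X, "L": L, "beta": beta, "alpha": alpha,
            "variance": variance, "lengthscale": lengthscale}

def predict(em, X_new):
    """Return the emulator mean and (plug-in) variance at new inputs."""
    K_star = sq_exp_cov(X_new, em["X"], em["variance"], em["lengthscale"])
    mean = basis(X_new) @ em["beta"] + K_star @ em["alpha"]
    v = np.linalg.solve(em["L"], K_star.T)
    var = em["variance"] - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)

# Hypothetical one-input "simulator" evaluated at 8 design points.
X_train = np.linspace(0.0, 1.0, 8)[:, None]
y_train = np.sin(2.0 * np.pi * X_train[:, 0]) + X_train[:, 0]
em = fit_emulator(X_train, y_train, variance=1.0, lengthscale=0.15)

mean_train, var_train = predict(em, X_train)          # at the design points
X_mid = (X_train[:-1] + X_train[1:]) / 2.0
mean_mid, var_mid = predict(em, X_mid)                # between design points
```

Because the simulator is deterministic, the emulator reproduces the training outputs almost exactly with near-zero variance at the design points, and reports larger variance between them.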

DiscGaussianAssumption: what is in there? This page discusses issues to do with representing beliefs about the simulator in terms of a Gaussian process. Why we use a Gaussian process: computation and simplicity, although other approaches could be entertained. When a Gaussian process might be inappropriate: outputs constrained to a range (though this is not practically important if we have a good emulator). What to do if a Gaussian process is not appropriate: the main solution is to use transformations, e.g. log. It also mentions Bayes Linear methods.

AltMeanFunction: what is in there? Discussion of the alternatives for the mean function. The mean function should be chosen to represent ‘the general shape of how the analyst expects the simulator output to respond to changes in the inputs’. Typically it is a linear-in-parameters regression, with a prior over the parameters (see AltGPPriors). Other forms are possible, but there is a price.
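A linear-in-parameters mean function can be written as follows. The symbols $h$, $\beta$, $q$ and $p$ are notation assumed here for illustration, and the linear basis shown is only one common choice.

```latex
% Mean function linear in the parameters \beta (notation assumed for illustration)
m(x) = h(x)^{\mathsf{T}} \beta = \sum_{j=1}^{q} \beta_j \, h_j(x),
\qquad
\text{e.g. } h(x) = (1, x_1, \ldots, x_p)^{\mathsf{T}}
\;\Rightarrow\;
m(x) = \beta_1 + \beta_2 x_1 + \cdots + \beta_{p+1} x_p .
```

The basis functions $h_j$ may be nonlinear in $x$; what matters is that $m(x)$ stays linear in $\beta$.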

AltCorrelationFunction: what is in there? Discussion of the alternatives for choosing the covariance function: Gaussian (squared exponential), generalised Gaussian, Matérn. The role of nuggets. Implications of the choices. Other possible choices.
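As a rough illustration of the options named above, the sketch below codes a Gaussian correlation function and two members of the Matérn family (smoothness 3/2 and 5/2), and shows one practical role of a nugget: improving the conditioning of the correlation matrix. The parameterisations are common textbook forms chosen for this sketch and may differ from those used in the toolkit; the generalised Gaussian form is not shown.

```python
import numpy as np

def gaussian_corr(r, length):
    """Gaussian (squared exponential) correlation at distance r."""
    return np.exp(-0.5 * (r / length) ** 2)

def matern32_corr(r, length):
    """Matérn correlation with smoothness 3/2."""
    s = np.sqrt(3.0) * r / length
    return (1.0 + s) * np.exp(-s)

def matern52_corr(r, length):
    """Matérn correlation with smoothness 5/2."""
    s = np.sqrt(5.0) * r / length
    return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

def correlation_matrix(x, corr, length, nugget=0.0):
    """Correlation matrix for 1-D inputs x, with an optional nugget on the diagonal."""
    r = np.abs(x[:, None] - x[None, :])
    return corr(r, length) + nugget * np.eye(len(x))

x = np.linspace(0.0, 1.0, 25)
C_plain = correlation_matrix(x, gaussian_corr, length=0.5)
C_nugget = correlation_matrix(x, gaussian_corr, length=0.5, nugget=1e-6)

# Closely spaced points with a smooth correlation give a nearly singular matrix;
# a small nugget bounds the smallest eigenvalue away from zero.
cond_plain = np.linalg.cond(C_plain)
cond_nugget = np.linalg.cond(C_nugget)
print(cond_plain, cond_nugget)
```

The choice between these functions encodes a belief about how smooth the simulator output is as a function of its inputs: the Gaussian form implies a very smooth response, while the Matérn forms allow rougher behaviour.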

OK, time for you to take over. Rather than presenting this, I now want to get you to do some work. I’d like volunteers to try to use the toolkit: let’s talk about your simulation problems and see if the toolkit has the answers. What problems made you sign up for today? I’ll try to find the answers in the toolkit or from the experts.

Toolkit development, the future: The toolkit is continually developing. By the end of MUCM there will be a complete description of most aspects of building and using emulators. MUCM2 will add more content, particularly accessible introductions and more examples. Have we missed something? Please tell us! Future releases should allow easy commenting.

Summary: The toolkit will distil the combined knowledge of the MUCM team (and beyond). We intend it to become the ‘emulation Wikipedia’: an accessible, free community resource which will outlive the project. We are releasing it in parts, and will continue to improve it within MUCM2.