Bayes' Theorem

P(x|y) = P(y|x) P(x) / P(y)

The most likely value of x derived from this posterior pdf therefore represents our inverse solution. Our knowledge, contained in P(x|y), is explicitly expressed in terms of the forward model and the statistical description of both the error of this model and the error of the measurement. The factor P(y) can be ignored, since in practice it is only a normalizing factor. P(x|y) represents an update to our prior knowledge P(x) given the measurement y; P(y|x), the knowledge of y given x, is the pdf of the forward model.
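As a numerical illustration of this update (not from the slides; the grid, prior, and measurement values below are invented), one can discretize x, multiply a Gaussian prior by a Gaussian measurement likelihood, and normalize to obtain the posterior:

```python
import numpy as np

# Illustrative 1-D Bayes update on a grid: P(x|y) proportional to P(y|x) P(x).
x = np.linspace(0.0, 50.0, 2001)                   # candidate states
prior = np.exp(-0.5 * ((x - 20.0) / 10.0) ** 2)    # P(x): prior guess of 20 +/- 10
y_obs, sigma_y = 28.0, 3.0                         # measurement and its 1-sigma error
# Trivial forward model F(x) = x, so P(y|x) is Gaussian in (y_obs - x):
likelihood = np.exp(-0.5 * ((y_obs - x) / sigma_y) ** 2)
posterior = prior * likelihood                     # P(y) dropped: normalization only
posterior /= posterior.sum() * (x[1] - x[0])       # normalize to unit area
x_map = x[np.argmax(posterior)]                    # most likely x: the inverse solution
print(f"MAP estimate: {x_map:.2f}")
```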

x = {r_eff, τ}; y = {R_0.86, R_2.13, ...}. The forward model must map x to y: Mie theory, a simple cloud droplet size distribution, and a radiative transfer model.

Example: x_true = {r_eff = 15 μm, τ = 30}. Errors are determined by how much change in each parameter (r_eff, τ) causes the χ² to change by one unit.
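A sketch of that error recipe in one dimension (the quadratic χ² below is a stand-in for illustration, not the slide's actual cloud retrieval): scan the parameter and find where χ² exceeds its minimum by one.

```python
import numpy as np

# Hypothetical quadratic chi^2 in r_eff alone, with a minimum at 15 um.
def chi2(r_eff):
    return ((r_eff - 15.0) / 1.5) ** 2

grid = np.linspace(10.0, 20.0, 5001)
vals = chi2(grid)
best = grid[np.argmin(vals)]
inside = grid[vals <= vals.min() + 1.0]   # region where Delta chi^2 <= 1
print(f"r_eff = {best:.2f} +{inside.max() - best:.2f} / -{best - inside.min():.2f} um")
```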

Correlated Errors. Variables: y_1, y_2, y_3, ...; 1-sigma errors: σ_1, σ_2, σ_3, ... The correlation between y_1 and y_2 is c_12 (between −1 and 1), etc. Then the noise covariance matrix is given elementwise by (S_ε)_ij = c_ij σ_i σ_j, with c_ii = 1, so the diagonal elements are the variances σ_i².
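In code, the elementwise recipe above translates directly (the σ and c values here are arbitrary examples):

```python
import numpy as np

# (S_e)_ij = c_ij * sigma_i * sigma_j, with ones on the diagonal of c.
sigma = np.array([0.01, 0.02, 0.015])        # 1-sigma errors for y_1..y_3
c = np.array([[1.0,  0.3,  0.0],
              [0.3,  1.0, -0.2],
              [0.0, -0.2,  1.0]])            # symmetric correlation matrix
S_e = c * np.outer(sigma, sigma)             # noise covariance matrix
print(S_e)
```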

Example: Temperature Profile Climatology for December over Hilo, Hawaii
P = (1000, 850, 700, 500, 400, 300) mbar
T = (22.2, 12.6, 7.6, −7.7, −19.5, −34.1) °C

Correlation matrix and covariance matrix for this climatology (6×6 matrices shown on the slide).
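The matrices themselves are not reproduced in this transcript, but the construction can be sketched: given many December soundings at these six levels, the sample covariance and correlation follow directly (the profile data below are random stand-ins, not the slide's numbers):

```python
import numpy as np

# Sketch: building a climatological covariance/correlation matrix.
rng = np.random.default_rng(0)
mean_T = np.array([22.2, 12.6, 7.6, -7.7, -19.5, -34.1])   # mean profile (deg C)
profiles = mean_T + rng.normal(scale=2.0, size=(100, 6))   # fake soundings
S = np.cov(profiles, rowvar=False)        # 6x6 covariance matrix (deg C^2)
R = np.corrcoef(profiles, rowvar=False)   # 6x6 correlation matrix (unitless)
```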

Prior Knowledge. Prior knowledge about x can come from many different sources: other measurements, a weather or climate model prediction, or climatology. To specify prior knowledge of x, called x_a, we must also specify how well we know x_a; that is, we must specify the errors on x_a. These errors are generally characterized by a probability density function (PDF) with as many dimensions as x. For simplicity, prior errors are often assumed to be Gaussian; then we simply specify S_a, the error covariance matrix associated with x_a.

The χ² with prior knowledge:

χ² = (y − F(x))^T S_ε⁻¹ (y − F(x)) + (x − x_a)^T S_a⁻¹ (x − x_a)
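Written as a function (a direct transcription of the formula above; the forward model and matrices must be supplied by the problem at hand, as none are specified numerically in the slides):

```python
import numpy as np

# chi^2 with prior knowledge; S_e_inv and S_a_inv are the inverse noise and
# prior covariance matrices.
def chi2(x, y, F, S_e_inv, x_a, S_a_inv):
    r = y - F(x)                  # measurement residual
    d = x - x_a                   # departure from the prior
    return r @ S_e_inv @ r + d @ S_a_inv @ d
```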

Minimization Techniques. Minimizing the χ² is hard. In general you can use a look-up table (this still works if you have tabulated values of F(x)), but if the look-up table approach is not feasible (i.e., it's too big), then you have to iterate:
1. Pick a guess for x, called x_0.
2. Calculate (or look up) F(x_0).
3. Calculate (or look up) the Jacobian matrix about x_0: K_ij = ∂F_i(x_0)/∂x_j.
K is the matrix of sensitivities, or derivatives, of each output (y) variable with respect to each input (x) variable. It is not necessarily square.
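When no analytic derivative of F is available, K is often estimated by finite differences. A minimal sketch (the forward-difference scheme and step size are choices of this illustration, not prescribed by the slides):

```python
import numpy as np

def jacobian(F, x0, eps=1e-6):
    """Forward-difference Jacobian K_ij = dF_i/dx_j about x0."""
    x0 = np.asarray(x0, dtype=float)
    y0 = np.atleast_1d(F(x0))
    K = np.empty((y0.size, x0.size))          # rows: y variables, cols: x variables
    for j in range(x0.size):
        dx = np.zeros_like(x0)
        dx[j] = eps * max(1.0, abs(x0[j]))    # scale the step to the parameter
        K[:, j] = (np.atleast_1d(F(x0 + dx)) - y0) / dx[j]
    return K                                  # not necessarily square
```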

How to Iterate in Multi-Dimensions:

x_{i+1} = x_i + S [K^T S_ε⁻¹ (y − F(x_i)) − S_a⁻¹ (x_i − x_a)]

where:

S = (K^T S_ε⁻¹ K + S_a⁻¹)⁻¹

and K is evaluated at x_i.

Iteration in practice:
- Not guaranteed to converge.
- Can be slow; depends on the non-linearity of F(x).
- There are many tricks to make the iteration faster and more accurate.
- Often, only a few iterations are necessary.
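Putting the update formula and the Jacobian sketch above together gives a bare-bones Gauss-Newton loop. This is a sketch only: it has no step control or convergence test, which real retrievals need.

```python
import numpy as np

def gauss_newton(F, y, S_e_inv, x_a, S_a_inv, n_iter=10):
    x = np.asarray(x_a, dtype=float).copy()   # start the iteration from the prior
    for _ in range(n_iter):
        K = jacobian(F, x)                    # finite-difference sketch from above
        S = np.linalg.inv(K.T @ S_e_inv @ K + S_a_inv)
        x = x + S @ (K.T @ S_e_inv @ (y - F(x)) - S_a_inv @ (x - x_a))
    return x, S                               # estimate and its covariance
```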

Connection of χ² to Confidence Limits in Multiple Dimensions
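The standard thresholds (e.g., Δχ² = 1.00 for one fitted parameter at 68.3% confidence, 2.30 for two) are quantiles of the χ² distribution and can be tabulated directly:

```python
from scipy.stats import chi2

# Delta-chi^2 bounding a joint confidence region vs. number of fitted parameters.
for p in (0.683, 0.90, 0.954):
    print(p, [round(chi2.ppf(p, df), 2) for df in (1, 2, 3)])
```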

Error Correlations?

                            Case 1                      Case 2
x_true                      r_eff = 15 μm, τ = 30       r_eff = 12 μm, τ = 8
{R_0.86, R_2.13} true       0.796, …                    …
{R_0.86, R_2.13} measured   0.808, …                    …
x_derived                   r_eff = 14.3 μm, τ = 32.3   r_eff = 11.8 μm, τ = 7.6
Formal 95% errors           ±1.5 μm, ±3.7               ±2.2 μm, ±0.7
{r_eff, τ} correlation      5%                          55%

Degrees of Freedom Example
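The worked example itself is not in the transcript. As a sketch of the standard calculation (e.g., Rodgers), the degrees of freedom for signal is the trace of the averaging kernel matrix, built from the same K, S_ε, and S_a used in the iteration above:

```python
import numpy as np

def degrees_of_freedom(K, S_e_inv, S_a_inv):
    S = np.linalg.inv(K.T @ S_e_inv @ K + S_a_inv)   # posterior covariance
    A = S @ K.T @ S_e_inv @ K                        # averaging kernel matrix
    return np.trace(A)                               # between 0 and len(x)
```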

Geometry / Set-up

State Vector + Measurements: surface albedo parameters, gas columns (CO2, O2, CH4)