Inverse Modeling of Surface Carbon Fluxes Please read Peters et al (2007) and Explore the CarbonTracker website.

Slides:



Advertisements
Similar presentations
Introduction to Data Assimilation NCEO Data-assimilation training days 5-7 July 2010 Peter Jan van Leeuwen Data Assimilation Research Center (DARC) University.
Advertisements

Ordinary Least-Squares
Lesson 10: Linear Regression and Correlation
Geometric Representation of Regression. ‘Multipurpose’ Dataset from class website Attitude towards job –Higher scores indicate more unfavorable attitude.
Statistical Techniques I EXST7005 Simple Linear Regression.
Regresi Linear Sederhana Pertemuan 01 Matakuliah: I0174 – Analisis Regresi Tahun: Ganjil 2007/2008.
Kriging.
Improving Understanding of Global and Regional Carbon Dioxide Flux Variability through Assimilation of in Situ and Remote Sensing Data in a Geostatistical.
Cost of surrogates In linear regression, the process of fitting involves solving a set of linear equations once. For moving least squares, we need to.
Classification and Prediction: Regression Via Gradient Descent Optimization Bamshad Mobasher DePaul University.
Lecture 4 The L 2 Norm and Simple Least Squares. Syllabus Lecture 01Describing Inverse Problems Lecture 02Probability and Measurement Error, Part 1 Lecture.
Motion Analysis (contd.) Slides are from RPI Registration Class.
Estimating parameters in inversions for regional carbon fluxes Nir Y Krakauer 1*, Tapio Schneider 1, James T Randerson 2 1. California Institute of Technology.
Curve-Fitting Regression
Compatibility of surface and aircraft station networks for inferring carbon fluxes TransCom Meeting, 2005 Nir Krakauer California Institute of Technology.
NOTES ON MULTIPLE REGRESSION USING MATRICES  Multiple Regression Tony E. Smith ESE 502: Spatial Data Analysis  Matrix Formulation of Regression  Applications.
Machine Learning CUNY Graduate Center Lecture 3: Linear Regression.
Evaluating the Impact of the Atmospheric “ Chemical Pump ” on CO 2 Inverse Analyses P. Suntharalingam GEOS-CHEM Meeting, April 4-6, 2005 Acknowledgements.
The Terms that You Have to Know! Basis, Linear independent, Orthogonal Column space, Row space, Rank Linear combination Linear transformation Inner product.
Retrieval Theory Mar 23, 2008 Vijay Natraj. The Inverse Modeling Problem Optimize values of an ensemble of variables (state vector x ) using observations:
Modeling approach to regional flux inversions at WLEF Modeling approach to regional flux inversions at WLEF Marek Uliasz Department of Atmospheric Science.
Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. by Lale Yurttas, Texas A&M University Chapter 171 CURVE.
Basic Mathematics for Portfolio Management. Statistics Variables x, y, z Constants a, b Observations {x n, y n |n=1,…N} Mean.
Application of Geostatistical Inverse Modeling for Data-driven Atmospheric Trace Gas Flux Estimation Anna M. Michalak UCAR VSP Visiting Scientist NOAA.
Classification and Prediction: Regression Analysis
Principles of Least Squares
Today Wrap up of probability Vectors, Matrices. Calculus
1 Oceanic sources and sinks for atmospheric CO 2 The Ocean Inversion Contribution Nicolas Gruber 1, Sara Mikaloff Fletcher 2, and Kay Steinkamp 1 1 Environmental.
Collaborative Filtering Matrix Factorization Approach
Machine Learning CUNY Graduate Center Lecture 3: Linear Regression.
Fitting a line to N data points – 1 If we use then a, b are not independent. To make a, b independent, compute: Then use: Intercept = optimally weighted.
Results from the Carbon Cycle Data Assimilation System (CCDAS) 3 FastOpt 4 2 Marko Scholze 1, Peter Rayner 2, Wolfgang Knorr 1 Heinrich Widmann 3, Thomas.
ECE 8443 – Pattern Recognition ECE 8423 – Adaptive Signal Processing Objectives: Deterministic vs. Random Maximum A Posteriori Maximum Likelihood Minimum.
G(m)=d mathematical model d data m model G operator d=G(m true )+  = d true +  Forward problem: find d given m Inverse problem (discrete parameter estimation):
Statistical Methods Statistical Methods Descriptive Inferential
Curve-Fitting Regression
Thomas Knotts. Engineers often: Regress data  Analysis  Fit to theory  Data reduction Use the regression of others  Antoine Equation  DIPPR.
Simple Linear Regression. The term linear regression implies that  Y|x is linearly related to x by the population regression equation  Y|x =  +  x.
Integration of biosphere and atmosphere observations Yingping Wang 1, Gabriel Abramowitz 1, Rachel Law 1, Bernard Pak 1, Cathy Trudinger 1, Ian Enting.
Research Vignette: The TransCom3 Time-Dependent Global CO 2 Flux Inversion … and More David F. Baker NCAR 12 July 2007 David F. Baker NCAR 12 July 2007.
Development of an EnKF to estimate CO 2 fluxes from realistic distributions of X CO2 Liang Feng, Paul Palmer
INVERSE MODELING TECHNIQUES Daniel J. Jacob. GENERAL APPROACH FOR COMPLEX SYSTEM ANALYSIS Construct mathematical “forward” model describing system As.
Atmospheric Inventories for Regional Biogeochemistry Scott Denning Colorado State University.
Seismological Analysis Methods Receiver FunctionsBody Wave Tomography Surface (Rayleigh) wave tomography Good for: Imaging discontinuities (Moho, sed/rock.
Psychology 202a Advanced Psychological Statistics October 22, 2015.
1 Simple Linear Regression and Correlation Least Squares Method The Model Estimating the Coefficients EXAMPLE 1: USED CAR SALES.
Error correlation between CO 2 and CO as a constraint for CO 2 flux inversion using satellite data from different instrument configurations Helen Wang.
ESTIMATION METHODS We know how to calculate confidence intervals for estimates of  and  2 Now, we need procedures to calculate  and  2, themselves.
Earth Observation Data and Carbon Cycle Modelling Marko Scholze QUEST, Department of Earth Sciences University of Bristol GAIM/AIMES Task Force Meeting,
OLS Regression What is it? Closely allied with correlation – interested in the strength of the linear relationship between two variables One variable is.
MathematicalMarketing Slide 5.1 OLS Chapter 5: Ordinary Least Square Regression We will be discussing  The Linear Regression Model  Estimation of the.
Statistics 350 Lecture 2. Today Last Day: Section Today: Section 1.6 Homework #1: Chapter 1 Problems (page 33-38): 2, 5, 6, 7, 22, 26, 33, 34,
Comparison of adjoint and analytical approaches for solving atmospheric chemistry inverse problems Monika Kopacz 1, Daniel J. Jacob 1, Daven Henze 2, Colette.
I. Statistical Methods for Genome-Enabled Prediction of Complex Traits OUTLINE THE CHALLENGES OF PREDICTING COMPLEX TRAITS ORDINARY LEAST SQUARES (OLS)
Carbon Cycle Data Assimilation with a Variational Approach (“4-D Var”) David Baker CGD/TSS with Scott Doney, Dave Schimel, Britt Stephens, and Roger Dargaville.
Biointelligence Laboratory, Seoul National University
Notes on Weighted Least Squares Straight line Fit Passing Through The Origin Amarjeet Bhullar November 14, 2008.
Part 5 - Chapter
Part 5 - Chapter 17.
Ch3: Model Building through Regression
CH 5: Multivariate Methods
Ordinary Least Squares (OLS) Regression
Part 5 - Chapter 17.
Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.
6.5 Taylor Series Linearization
Biointelligence Laboratory, Seoul National University
5.2 Least-Squares Fit to a Straight Line
Regression Statistics
Hartmut Bösch and Sarah Dance
Topic 11: Matrix Approach to Linear Regression
Presentation transcript:

Inverse Modeling of Surface Carbon Fluxes Please read Peters et al (2007) and Explore the CarbonTracker website

Consider Linear Regression Given N measurements y i for different values of the independent variable x i, find a slope ( m ) and intercept ( b ) that describe the “best” line through the observations –Why a line? –What do we mean by “best” –How do we find m and b ? Compare predicted values to observations, and find m and b that fit best Define a total error (difference between model and observations) and minimize it!

Our model is Error at any point is just Could just add the error up: Problem: positive and negative errors cancel … we need to penalize signs equally Define the total error as the sum of the Euclidean distance between the model and observations (sum of square of the errors): Defining the Total Error

Minimizing Total Error (Least Squares) Take partial derivs Set to zero Solve for m and b

Minimizing the Error Solve for b Plug this result into other partial deriv, and solve for m (1) (2) (3) (4)

Minimizing the Error (cont’d) (4) Plug (4) into (2) and simplify: (5) Now have simple “Least Squares” formulae for “best” slope and intercept given a set of observations

Geometric View of Linear Regression Any vector can be written as a linear combination of the orthonormal basis set This is accomplished by taking the dot product (or inner product) of the vector with each basis vector to determine the components in each basis direction Linear regression involves a 2D mapping of an observation vector into a different vector space More generally, this can involve an arbitrary number of basis vectors (dimensions)

Linear Regression Revisited This notation can be rewritten in subscript notation: and applied to a familiar problem. Imagine that there are 2 data points (d 1, d 2 ) and 2 model parameters (m 1, m 2 ). Then the system of equations could be explicitly written as: d 1 = G 11 m 1 + G 12 m 2 d 2 = G 21 m 1 + G 22 m 2 Or in matrix form With two points, this is just two slope-intercept form equations: y 1 = m x 1 + b y 2 = m x 2 + b This is an "even-determined" problem - there is exactly enough information to determine the model parameters precisely, there is only one solution, and there is zero prediction error.

Generalized Least Squares The G's can be thought of as partial derivatives and the whole matrix as a Jacobian.

Linear Regression (again) Take partial derivatives w.r.t. m, set to zero, solve for m. The solution is

Matrix View of Linear Regression Solution

Matrix View of Regression (cont’d) Perform the matrix inversion on G T G: Recall:

Matrix View of Regression (cont’d)

TransCom Inversion Intercomparison Discretize world into 11 land regions (by vegetation type) and 11 ocean regions (by circulation features) Release tracer with unit flux from each region during each month into a set of 16 different transport models Produce timeseries of tracer concentrations at each observing station for 3 simulated years

Synthesis Inversion Decompose total emissions into M “basis functions” Use atmospheric transport model to generate G Observe d and N locations Invert G to find m datatransport fluxes d 1 = G 11 m 1 + G 12 m 2 + … + G 1N m N CO 2 sampled at location 1Strength of emissions of type 2 partial derivative of CO 2 at location 1 with respect to emissions of type 2

Near-Collinearity of Basis Vectors

“Dipoles”

“Best Fit” Inverse Results

Rubber Bands Inversion seeks a compromise between detailed reproduction of the data and fidelity to what we think we know about fluxes The elasticity of these two “rubber bands” is adjustable Prior estimates of regional fluxes Observational Data Solution: Fluxes, Concentrations

MODEL a priori emission a priori emission uncertainty concentration data concentration data uncertainty emission estimate emission estimate uncertainty Bayesian Inversion Technique

MODEL a priori emission a priori emission uncertainty reduction concentration data concentration data uncertainty emission estimate emission estimate uncertainty Bayesian Inversion Technique

Our problem is ill-conditioned. Apply prior constraints and minimize a more generalized cost function: Bayesian Inversion Formalism Solution is given by Inferred flux Prior Guess Data Uncertainty Flux Uncertainty observations data constraintprior constraint

Uncertainty in Flux Estimates A posteriori estimate of uncertainty in the estimated fluxes Depends on transport (G) and a priori uncertainties in fluxes (C m ) and data (C d ) Does not depend on the observations per se!

Covariance Matrices Inverse of a diagonal matrix is obtained by taking reciprocal of diagonal elements How do we set these? (It’s an art!)

Accounting for “Error” Sampling, contamination, analytical accuracy (small) Representativeness error (large in some areas, small in others) Model-data mismatch (large and variable) Transport simulation error (large for specific cases, smaller for “climatological” transport) All of these require a “looser” fit to the data

Individual models: background fields

Annual Mean Results Substantial terrestrial sinks in all northern regions is driven by data (reduced uncertainty) Tropical regions are very poorly constrained (little uncertainty reduction) Southern ocean regions have reduced sink relative to prior, with strong data constraint Neglecting rectifier effect moves terrestrial sink S and W, with much reduced model spread Gurney et al, Nature, 2002

Sensitivity to Priors Flux estimates and a posteriori uncertainties for data-constrained regions (N and S) are very insensitive to priors Uncertainties in poorly constrained regions (tropical land) very sensitive to prior uncertainties As priors are loosened, dipoles develop between poorly constrained regions