1. Combining Observations and Models: A Bayesian View
Mark Berliner, OSU Stat Dept
- Bayesian Hierarchical Models
- Selected Approaches
- Geophysical Examples
- Discussion
2. Bayesian Hierarchical Models: Main Themes
Goal: Develop probability distributions for unknowns of interest by combining information sources: observations, theory, computer model output, past experience, etc.
Approaches:
- Bayesian hierarchical models
- Incorporate various information sources by modeling
  - priors
  - data model or likelihoods
- Emphasis on using model/theory explicitly or quantitatively, as opposed to qualitatively
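3. Bayesian Hierarchical Models
Skeleton:
- Data model: $[Y \mid X, \theta]$
- Process model prior: $[X \mid \theta]$
- Prior on parameters: $[\theta]$
- Bayes’ Theorem gives the posterior distribution: $[X, \theta \mid Y] \propto [Y \mid X, \theta]\,[X \mid \theta]\,[\theta]$
Compare to
- “Statistics”: $[Y \mid \theta]\,[\theta]$
- “Physics”: $[X \mid \theta(Y)]$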
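To make the skeleton concrete, here is a minimal sketch of a toy scalar hierarchy that is not taken from the talk. It assumes a Gaussian data model, a Gaussian process model prior centered on the parameter, and a Gaussian parameter prior, all with known variances. The function names and arguments (`gaussian_update`, `toy_bhm_posterior`, `m0`, `s0_sq`, `process_var`, `obs_var`) are hypothetical. In this special case the marginal posteriors of the process and of the parameter follow from two conjugate updates.

```python
def gaussian_update(prior_mean, prior_var, obs, obs_var):
    """Conjugate update: Gaussian prior on a mean, one Gaussian observation."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var


def toy_bhm_posterior(y, m0, s0_sq, process_var, obs_var):
    """Marginal posteriors of X and theta for an assumed toy hierarchy:

        parameter prior   theta     ~ N(m0, s0_sq)
        process prior     X | theta ~ N(theta, process_var)
        data model        Y | X     ~ N(X, obs_var)
    """
    # Integrating theta out of the process prior gives X ~ N(m0, s0_sq + process_var).
    x_post = gaussian_update(m0, s0_sq + process_var, y, obs_var)
    # Integrating X out of the data model gives Y | theta ~ N(theta, process_var + obs_var).
    theta_post = gaussian_update(m0, s0_sq, y, process_var + obs_var)
    return {"X": x_post, "theta": theta_post}
```

Calling `toy_bhm_posterior(4.0, 0.0, 1.0, 1.0, 2.0)` returns a (mean, variance) pair for each unknown. Richer, non-Gaussian hierarchies generally need simulation methods instead of closed forms.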
4. Approaches
- Stochastic models incorporating science
  - Physical-statistical modeling (Berliner 2003 JGR)
  - From “F=ma” to $[X \mid \theta]$
  - Qualitative use of theory (e.g., Pacific SST model; Berliner et al., J. Climate)
- Incorporating large-scale computer models
  - From model output to priors $[\theta]$
  - Model output as samples from process model prior $[X \mid \theta]$ (almost!)
  - Model output as “observations” (Y)
- Combinations
5. Glacial Dynamics (Berliner et al. 2008 J. Glaciol)
Steady Flow of Glaciers and Ice Sheets
- Flow: gravity moderated by drag (base & sides) & ... stuff ...
- Simple models: flow from geometry
- Data: Program for Arctic Regional Climate Assessment & Radarsat Antarctic Mapping Project
  - surface topography (laser altimetry)
  - basal topography (radar altimetry)
  - velocity data (interferometry)
8. Modeling: surface $s$, thickness $H$, velocity $u$
Physical model
- Basal stress: $\tau = -\rho g H \, ds/dx$ (+ “stuff”)
- Velocities: $u = u_b + b_0 H \tau^n$, where $u_b = k \tau^p + (\rho g H)^{-q}$
Our model
- Basal stress: $\tau = -\rho g H \, ds/dx + \eta$, where $\eta$ is a “corrector process”; $H$, $s$ unknown
- Velocities: $u = u_b + b H \tau^n + \varepsilon$, where $u_b = k \tau^p + (\rho g H)^{-q}$ or a constant; $b$ is unknown, $\varepsilon$ is a noise process
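The deterministic core of these relations can be sketched numerically. The example below is an illustration only, not the authors' code. It assumes a regular grid with spacing `dx`, centered differences for the surface slope, an ice density of 917 kg/m^3, a constant sliding velocity `u_b`, and an integer exponent `n`. The function names and default values are hypothetical. The optional `eta` argument stands in for the corrector process, and the velocity noise term is omitted.

```python
def basal_stress(s, H, dx, rho=917.0, g=9.81, eta=None):
    """Basal stress -rho * g * H * ds/dx (+ optional corrector) at interior points.

    s, H : surface elevation and ice thickness on a regular grid (metres)
    dx   : grid spacing (metres)
    eta  : optional list of corrector values, one per grid point (defaults to zero)
    Centered differences are used, so the first and last points are dropped.
    """
    if eta is None:
        eta = [0.0] * len(s)
    return [
        -rho * g * H[i] * (s[i + 1] - s[i - 1]) / (2.0 * dx) + eta[i]
        for i in range(1, len(s) - 1)
    ]


def surface_velocity(tau, H_interior, b, n=3, u_b=0.0):
    """Velocity u = u_b + b * H * tau**n with a constant sliding term u_b."""
    return [u_b + b * h * t ** n for h, t in zip(H_interior, tau)]
```

For a surface dropping 1 m per km over 1000 m of ice, `basal_stress([100.0, 99.0, 98.0, 97.0], [1000.0] * 4, 1000.0)` gives a stress of roughly 9 kPa at both interior points. In the hierarchical model, `H`, `s`, `b`, and the corrector are unknowns with priors, not fixed inputs.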
13. Paleoclimate (Brynjarsdóttir & Berliner 2009)
- Climate proxies: tree rings, ice cores, corals, pollen, and underground rock provide indirect information on climate
- Inverse problem: proxy = f(climate)
- Boreholes: Earth stores information on surface temperatures
- Model: heat equation
- Borehole data = f(surface temperatures)
- Infer the boundary condition (the initial condition is a nuisance)
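A forward map from surface temperature history to borehole data can be sketched with the textbook half-space solution of the one-dimensional heat equation: a step change at the surface, applied some time ago, decays with depth as a complementary error function. This is an illustration only and not the model of the cited paper. It assumes a piecewise-constant surface history, a homogeneous medium, a diffusivity of 1e-6 m^2/s (a typical rock value), and it ignores the geothermal gradient and the initial condition. The function name and arguments are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def borehole_perturbation(depth_m, steps, kappa=1e-6):
    """Temperature perturbation at a given depth from past surface step changes.

    steps : list of (delta_T, years_before_present) pairs, with years > 0
    kappa : thermal diffusivity in m^2/s (assumed value)
    Each step contributes delta_T * erfc(z / (2 * sqrt(kappa * t))); the heat
    equation is linear, so the contributions add.
    """
    total = 0.0
    for delta_T, years_ago in steps:
        t = years_ago * SECONDS_PER_YEAR
        total += delta_T * math.erfc(depth_m / (2.0 * math.sqrt(kappa * t)))
    return total
```

Evaluating this function over a range of depths gives a synthetic borehole profile. The inverse problem on the slide runs the other way: from a measured profile back to the step sizes, with uncertainty.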
22. Bayesian Hierarchical Models to Augment the Mediterranean Forecast System (MFS)
- Ralph Milliff, CoRA
- Chris Wikle, Univ. Missouri
- Mark Berliner, Ohio State Univ.
- Nadia Pinardi, INGV (Istituto Nazionale di Geofisica e Vulcanologia), Univ. Bologna (MFS Director)
- Alessandro Bonazzi, Srdjan Dobricic, INGV, Univ. Bologna
23. Bayesian Modeling in Support of Massive Forecast Models
- MFS is an ocean model
- A boundary condition/forcing: surface winds
- Approach: produce surface vector winds (SVW) for ensemble data assimilation
- Exploit abundant, “good” satellite wind data (QuikSCAT)
- Samples from our winds posterior: ensemble for MFS
- (Before us: coarse wind field (ECMWF))
24. “Rayleigh Friction Model” for winds (Linear Planetary Boundary Layer Equations)
- (neglect second-order time derivative)
- discretize:
- Theory
- Our model
[Equations shown on the slide were not recovered.]
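For orientation, the textbook Rayleigh friction momentum balance adds a linear drag term to the Coriolis and pressure-gradient terms. The sketch below solves its steady-state form for the wind components. This is a simplification chosen for illustration: it drops all time derivatives, whereas the slide neglects only the second-order one, and it is not the discretized model used for MFS. The function name, the argument names, and the air density of 1.2 kg/m^3 are assumptions.

```python
def steady_rayleigh_wind(dpdx, dpdy, f, gamma, rho=1.2):
    """Steady-state winds (u, v) under an assumed Rayleigh friction balance:

        -f * v = -(1 / rho) * dpdx - gamma * u
         f * u = -(1 / rho) * dpdy - gamma * v

    dpdx, dpdy : pressure gradient components (Pa/m)
    f          : Coriolis parameter (1/s)
    gamma      : Rayleigh friction coefficient (1/s)
    rho        : air density (kg/m^3, assumed value)
    """
    denom = rho * (f * f + gamma * gamma)
    u = -(gamma * dpdx + f * dpdy) / denom
    v = (f * dpdx - gamma * dpdy) / denom
    return u, v
```

With `gamma = 0` the result reduces to the geostrophic wind. A positive `gamma` rotates the wind toward low pressure and weakens it.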
25. BHM Ensemble Winds
[Figure: 10 members selected from the posterior distribution (blue); scale vector: 10 m/s]
A snapshot depiction of MFS-Wind-BHM output is shown here. Ten realizations of the SVW process from the estimate of the posterior distribution are shown for a single time during the analysis period for the Western Mediterranean. At each output grid location, a cluster of 10 blue vectors depicts the realizations from the BHM. Their mean (i.e., a somewhat crude estimate of the posterior mean) is shown in green. The red vectors are the corresponding ECMWF analysis vectors, included here for comparison. As we have seen, we should not expect (or even desire) that the analysis vectors and the realizations from MFS-Wind-BHM agree. The QuikSCAT data stage has endowed the BHM winds with more realistic kinetic energy content.
Note that the spread in each cluster of realizations from the BHM is a function of space (and, as we will see next, a function of time as well). This is a representation of a space-time uncertainty function. In regions where wind speeds are weak, the error model for the QuikSCAT retrievals has larger amplitude, and the ensemble spread can be large (point south of Sicily). In regions with relatively strong winds, the clusters are more aligned and uncertainty is therefore reduced.
27. Part B) Information from Models
Develop prior from model output
- Think of model output runs $O_1, \ldots, O_n$ as samples from some distribution
- Do data analysis on the O’s to estimate that distribution
- Use the result (perhaps with modifications) as a prior for X
- Example: the O’s are spatial fields; estimate the spatial covariance function of X based on the O’s
- Example: Berliner et al. (2003), J. Climate
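The steps above can be sketched for a small spatial field. The example treats each model run as a vector over grid points and estimates a mean field and a sample covariance matrix from the runs. An optional inflation factor stands in for the "perhaps with modifications" step. The function name and the `inflation` argument are hypothetical, and a realistic application would estimate a structured covariance function, not a raw sample matrix.

```python
def prior_from_model_output(runs, inflation=1.0):
    """Estimate a prior mean field and covariance matrix from model runs.

    runs      : list of runs, each a list of values over the same grid points
    inflation : factor applied to the sample covariance (assumed modification)
    Returns (mean, cov), with cov as a list of rows.
    """
    n = len(runs)
    p = len(runs[0])
    mean = [sum(run[i] for run in runs) / n for i in range(p)]
    cov = [
        [
            inflation
            * sum((run[i] - mean[i]) * (run[j] - mean[j]) for run in runs)
            / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]
    return mean, cov
```

With only a handful of runs the sample covariance is rank deficient. That is one reason the result is usually modified before it serves as a prior for X.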
28. Model output as realizations of prior “trends”
- Process model prior: $X = O + \eta_\theta$, where $\eta$ is “model error”, “bias”, “offset”
- $[Y \mid X, \theta]$ is the measurement error model: $Y = X + \varepsilon_\theta$
- Substitution yields $[Y \mid O, \eta_\theta, \theta]$: $Y = O + \eta_\theta + \varepsilon_\theta$
- Modeling $\eta$ is crucial (I have seen $\eta$ set to 0)
- It seems odd to have model output as a condition in the data model, but “reality” is there through $\eta$
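A scalar sketch shows why the model-error term matters. It assumes, for illustration only, that the model error and the measurement error are independent, zero-mean Gaussians with known variances. The posterior for X then shrinks the observation toward the model output. The function and argument names are hypothetical.

```python
def posterior_given_model_trend(y, o, eta_var, eps_var):
    """Posterior mean and variance of X when X = O + eta and Y = X + eps.

    eta_var : variance of the model error (assumed known)
    eps_var : variance of the measurement error (assumed known, > 0)
    """
    weight = eta_var / (eta_var + eps_var)  # weight given to the observation
    post_mean = o + weight * (y - o)
    post_var = weight * eps_var
    return post_mean, post_var
```

For example, `posterior_given_model_trend(14.0, 10.0, 1.0, 3.0)` moves a quarter of the way from the model output toward the observation. Setting `eta_var` to zero returns the model output with zero variance, so the data are ignored entirely. That is the practical cost of setting the model error to 0.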
29. Model output as “observations”
- Data model: $[Y, O \mid X, \theta] = [Y \mid X, \theta]\,[O \mid X, \theta]$
- $[O \mid X, \theta]$ to include “bias, offset, ...”
- Previous approach: start by constructing $[X \mid O, \theta]$
- This approach: construct $[O \mid X, \theta]$
- A model for “bias” is a challenge in both cases
- This is not uncommon, though not always made clear
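When model output enters as a second "measuring device", the factorized data model leads to a precision-weighted combination in the Gaussian case. The sketch below assumes a scalar X with a Gaussian prior, conditionally independent Gaussian errors for the observation and the model output, and a known model bias. In practice the bias would have its own prior. All names are hypothetical.

```python
def posterior_with_model_as_observation(
    y, y_var, o, o_bias, o_var, prior_mean, prior_var
):
    """Posterior mean and variance of X given an observation and a model output.

    Assumed data model: Y = X + error (variance y_var) and
    O = X + o_bias + error (variance o_var), conditionally independent given X.
    Assumed prior: X ~ N(prior_mean, prior_var).
    """
    precision = 1.0 / prior_var + 1.0 / y_var + 1.0 / o_var
    post_var = 1.0 / precision
    post_mean = post_var * (
        prior_mean / prior_var + y / y_var + (o - o_bias) / o_var
    )
    return post_mean, post_var
```

Unlike the construction that conditions on the model output, this one needs an explicit prior for X. The model output contributes only through its bias-corrected value and its error variance.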
30. A Bayesian Approach to Multi-model Analysis and Climate Projection (Berliner and Kim 2008, J. Climate)
Climate projection:
- Future climate depends on future, but unknown, inputs.
- IPCC: construct plausible future inputs, “SRES Scenarios” (CO2 etc.)
- Assume a scenario and get the corresponding projection
31. Hemispheric Monthly Surface Temperatures
- Observations (Y) for [...]
- Data model: Gaussian with mean = true temperature and unknown variance (with a change-point)
- Two models (O): PCM (n=4), CCSM (n=1) for [...], and 3 SRES scenarios (B1, A1B, A2)
- Data model: assumes the O’s are Gaussian with mean = $b_t$ + model bias$_t$ (different for the two models) and unknown, time-varying variances (different for the two models)
- All are assumed conditionally independent
32. Notes (Freeze time)
- Data model for the kth ensemble member from Model j: $O_{jk} = b + b_j + e_{jk}$
  - $b$ is common to both models
  - $b_j$ is the Model j bias
  - $E(e_{jk}) = 0$, and the variances of the e’s depend on j
- Computer model model: $b = X + e$, where $E(e) = 0$
- Priors for biases, variances, and X
- Extensions to different model classes (more b’s) and richer models are feasible.
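A small simulation makes the structure of this data model concrete. The sketch below draws ensemble members as a common term plus a model-specific bias plus zero-mean Gaussian noise with a model-specific standard deviation. The model labels, numerical values, and function names are hypothetical, and Gaussian noise is an assumption.

```python
import random


def simulate_ensembles(b, biases, sds, sizes, seed=0):
    """Draw ensemble members: common term + Model j bias + noise for each Model j.

    biases, sds, sizes : dicts keyed by model name (bias, noise s.d., ensemble size)
    """
    rng = random.Random(seed)
    return {
        j: [b + biases[j] + rng.gauss(0.0, sds[j]) for _ in range(sizes[j])]
        for j in biases
    }


def ensemble_means(ensembles):
    """Average the members of each model's ensemble."""
    return {j: sum(members) / len(members) for j, members in ensembles.items()}
```

Each ensemble mean estimates the sum of the common term and that model's bias, not either piece alone. Model output by itself cannot separate the two, which is where the priors for the biases and the link to X come in.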
33. Temp: mean and ENSO variations plus weather
This is based on biases which may change every 25 years.
34. IPCC (global); Us (NH)
Note that we messed up a color code: our green is IPCC red, our red is IPCC green (our blue is their purple). We didn’t do the constant thing (IPCC’s orange).
35. Both Northern Hemisphere. Left: basis period = 25; right: no change.
36. Discussion: Which approach is best?
- Depends on the form and quality of observations and models, and on practicality
- Developing a prior for X from a scientific model (Part A) offers strong incorporation of theory, but practical limits on the richness of $[X \mid \theta]$ may arise
- Model output as “observations”
  - Combining models: just like different measuring devices
  - Nice for analysis and mixed (observational and computer) design
  - Need a prior $[X \mid \theta]$
- Model output as realizations of prior “trends”
  - Most common among Bayesian statisticians
  - Combining models: like combining experts
37. Discussion: Models versus Reality
- Need for modeling differences between X’s and O’s.
- Model “assessment” (“validation”, “verification”) helps, but is difficult in complicated settings:
  - Global climate models. Virtually no observations at the scales of the models.
  - Tuning. Modify model based on observations.
  - Observations are imperfect, and are often output of other physical models.
  - Massive data. Comparing space-time fields.
38. Discussion, Cont’d
Part C) Combining approaches
- Example: Wikle et al. 2001, JASA. Combined observations and large-scale model output as data with a prior based on some physics
- Usually, many physical models. No best one, so it’s nice to be flexible in incorporating their information
Thank You!