BIOLOGICAL PROXIES IN SEDIMENTARY ARCHIVES – PROGRESS, PROBLEMS, POTENTIALITIES John Birks University of Bergen and University College London NICE Autumn School on Methods of Quantitative Palaeoenvironmental Reconstructions 6-10 October 2008
CONTENTS
Introduction
Indicator-species approach
Assemblage approach
Transfer-function approach
Linear-based transfer-function methods
Unimodal-based transfer-function methods
Pollen-climate response surfaces
Modern analogue-based approaches
Consensus reconstructions and smoothers
Problems of spatial autocorrelation in transfer functions
Use of artificial simulated data-sets
Multi-proxy approaches
Use of methods
Implications for current projects
Potential palaeoecological applications of reconstructions
Acknowledgements
INTRODUCTION
Emphasis on biological climatic proxies in continental sedimentary archives.
Some mention of biological proxies in marine archives in order to put the available methods for the quantitative reconstruction of past climates into a broader context.
Three basic approaches:
- indicator-species approach
- assemblage approach
- transfer-function approach (= calibration in statistics, bioindication in applied ecology)
All can provide quantitative reconstructions of past climate.
Some of the methods were originally developed for the quantitative reconstruction of other environmental variables (e.g. lake-water pH) but are directly applicable to climate reconstructions.
Gradient Analysis and Bioindication
Relation of species to environmental variables or gradients.
Gradient analysis (ecology): environment gradient → community or assemblage. Use environmental conditions or gradient values to explain community composition.
Bioindication (palaeoecology): community or assemblage → environment gradient. Use species optima or indicator values to obtain an estimate of environmental conditions or gradient values. Calibration, bioindication, reconstruction.
Relevant Reviews of Continental Reconstruction Approaches
Birks HJB (1995) Quantitative palaeoenvironmental reconstructions. In: Maddy & Brew (eds) Statistical Modelling of Quaternary Science Data. QRA pp
Birks HJB (1998) Numerical tools in palaeolimnology – progress, potentialities, and problems. J. Paleolimnol. 20:
Birks HJB (2003) Quantitative palaeoenvironmental reconstructions from Holocene biological data. In: Mackay et al. (eds) Global Change in the Holocene. Arnold, pp
Birks HJB & Seppä H (2004) Pollen-based reconstructions of late-Quaternary climate in Europe – progress, problems, and pitfalls. Acta Palaeobotanica 44:
ter Braak CJF (1995) Non-linear methods for multivariate statistical calibration in palaeoecology. Chemo. Intell. Lab. Sys. 28:
ter Braak CJF (1996) Unimodal Models to Relate Species to Environment (2nd edn). DLO-Agricultural Maths Group, Wageningen.
ter Braak CJF & Prentice IC (1988) A theory of gradient analysis. Advances Ecol. Res. 18:
Also for marine environments:
Guiot J & de Vernal A (2007) Transfer functions: methods for quantitative paleoceanography based on microfossils. In: Hillaire-Marcel & de Vernal (eds) Proxies in Late Cenozoic Paleoceanography. Elsevier, pp
Notation
X_f: Past climatic variable to be reconstructed
Y_f: Fossil biological proxies used in the climatic reconstruction
X_m: The same climatic variable as X_f but modern values
Y_m: Modern biological data from the same localities as X_m
Û_m: Modern transfer functions estimated mathematically from Y_m and X_m
Basic Idea of Quantitative Environmental Reconstruction
Fossil biological data ('proxy data'; e.g., pollen, chironomids): Y_f, t samples × m species.
Environmental variable (e.g., temperature): X_f, t samples × 1 variable. Unknown; to be estimated or reconstructed.
To solve for X_f, need modern data about species and climate:
Modern biology (e.g., pollen, chironomids): Y_m, n samples × m species.
Modern environment (e.g., temperature): X_m, n samples × 1 variable.
INDICATOR SPECIES APPROACH
Viscum album (mistletoe), Hedera helix (ivy), and Ilex aquifolium (holly) at or near their northern limits in Denmark (Iversen 1944).
Considered occurrence, growth, and fruit production in relation to mean temperature of the coldest month and of the warmest month to delimit 'thermal limits'.
The thermal-limit curves for Ilex aquifolium, Hedera helix, and Viscum album in relation to the mean temperatures of the warmest and coldest months. 1, 2, and 3 represent samples with pollen of Ilex, Hedera, & Viscum, Hedera & Viscum, and Ilex & Hedera, respectively. Mid-Holocene changes: ca ºC warmer summers ca ºC warmer winters Walther et al. (2005) show shifts in the northern limit of Ilex (holly) in last 50 years in response to climatic changes, confirming its sensitivity to climatic shifts. Viscum album Ilex aquifolium Birks 1981
ASSEMBLAGE APPROACH Compare fossil assemblage with modern assemblages from known environments. Identify the modern assemblages that are most similar to the fossil assemblage and infer the past environment to be similar to the modern environment of the relevant most similar modern assemblages. If done qualitatively, standard approach in Quaternary pollen analysis, etc., since 1950s. If done quantitatively, modern analogue technique or analogue matching.
Modern analogue technique (MAT) = k-nearest neighbours (k-NN), e.g. Hutson (1980), Prell (1985), Guiot (1985, 1987, 1990), ter Braak (1995)
1. Compare fossil sample t (Y_f) with modern sample i (Y_m).
2. Calculate the dissimilarity coefficient (DC, a proximity measure) between t and i.
3. Repeat for all modern samples.
4. Select the k closest analogues for fossil sample t.
5. Estimate the past environment (X_f) for sample t as the (weighted) mean of the modern environment (X_m) of the k analogues.
6. Repeat for all fossil samples.
Value of k is estimated by visual inspection, arbitrary rules (e.g., 10, 20, etc.), or cross-validation.
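The MAT steps can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in any of the cited studies: the squared chord distance as DC, the inverse-distance weights, and the function names are assumptions made for the example.

```python
import numpy as np

def squared_chord(a, b):
    """Squared chord distance between two vectors of proportions (assumed DC)."""
    return float(np.sum((np.sqrt(a) - np.sqrt(b)) ** 2))

def mat_reconstruct(y_modern, x_modern, y_fossil, k=3, weighted=True):
    """Estimate x for each fossil sample from its k closest modern analogues."""
    y_modern = np.asarray(y_modern, dtype=float)
    x_modern = np.asarray(x_modern, dtype=float)
    y_fossil = np.asarray(y_fossil, dtype=float)
    estimates = []
    for fossil in y_fossil:
        # DC between this fossil sample and every modern sample
        dc = np.array([squared_chord(fossil, modern) for modern in y_modern])
        nearest = np.argsort(dc)[:k]          # indices of the k closest analogues
        if weighted:
            # inverse-distance weights; the floor avoids division by zero
            w = 1.0 / np.maximum(dc[nearest], 1e-12)
            estimates.append(float(np.sum(w * x_modern[nearest]) / np.sum(w)))
        else:
            estimates.append(float(np.mean(x_modern[nearest])))
    return np.array(estimates)
```

In practice k would be chosen by cross-validation on the modern samples, as noted on the slide.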
Mutual Climatic Range Method
Grichuk et al., USSR, 1950s–1960s
Atkinson et al., UK, 1986, 1987 (Coleoptera)
TMAX: mean temperature of warmest month
TMIN: mean temperature of coldest month
TRANGE: TMAX − TMIN
Quote median values of mutual overlap and 'limits given by the extremes of overlap'.
Thermal envelopes for hypothetical species A, B, and C. Schematic representation of the Mutual Climatic Range method of quantitative temperature reconstructions (courtesy of Adrian Walking).
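The overlap idea can be illustrated for a single climate variable. This sketch simplifies each species envelope to a one-dimensional (low, high) interval, whereas the published method works with two-dimensional envelopes (e.g. TMAX against TRANGE); the species names and values are hypothetical.

```python
def mutual_climatic_range(envelopes):
    """Intersect per-species (low, high) intervals; return None if no overlap."""
    envelopes = list(envelopes)
    low = max(lo for lo, hi in envelopes)    # warmest lower limit
    high = min(hi for lo, hi in envelopes)   # coolest upper limit
    if low > high:
        return None
    return (low, high)

# Hypothetical TMAX envelopes (degrees C) for species A, B, and C
species_tmax = {"A": (8.0, 18.0), "B": (10.0, 20.0), "C": (12.0, 16.0)}
overlap = mutual_climatic_range(species_tmax.values())
midpoint = None if overlap is None else sum(overlap) / 2.0
```

The limits of the overlap give the range and its midpoint stands in here for the quoted median value.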
ASSUMPTIONS
1. Species distribution is in equilibrium with climate.
2. Distribution data and climatic data are same age.
3. Species distributions are well known, no problems with species introductions, taxonomy or nomenclature.
4. All the suitable climate space is available for species to occur. ? Arctic ocean, ? Truncation of climate space.
5. Climate values used in MCR are the actual values where the beetle species lives in all its known localities. Climate stations tend to be at low altitudes; cold-tolerant beetles tend to be at high altitudes. ? Bias towards warm temperatures. Problems of altitude, lapse rates. 495 climate stations across Palaearctic region from Greenland to Japan.
Deriving reliable climate data is a major problem.
Climate reconstructions from (a) British Isles, (b) western Norway, (c) southern Sweden and (d) central Poland. TMAX refers to the mean temperature of the warmest month (July). The chronology is expressed in radiocarbon years BPx1000 (ka). Each vertical bar represents the mutual climatic range (MCR) of a single dated fauna. The bold lines show the most probable value or best estimate of the palaeotemperature derived from the median values of the MCR estimates and adjusted with the consideration of the ecological preferences of the recorded insect assemblages. Coope & Lemdahl 1995
Probability Density Functions
Kühl et al. (2002) Quaternary Research 58; Kühl (2003) Dissertationes Botanicae 375; 149 pp. Kühl & Litt (2003) Vegetation History & Archaeobotany 12;
Basic idea is to quantify the present-day distribution of plants that occur as Quaternary fossils (pollen and/or macrofossils) in terms of July and January temperature and probability density functions (pdf).
Assuming statistical independence, a joint pdf can be calculated for a fossil assemblage as the product of the pdfs of the individual taxa. Each taxon is weighted by the extent of its climatic response range, so 'narrow' indicators receive 'high' weight.
The maximum of the joint pdf is the most likely past climate and its confidence interval is the range of uncertainty.
Can be used with pollen (+/-) and/or macrofossils (+/-).
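A one-dimensional sketch of the joint-pdf idea follows. It assumes each taxon is described by a normal pdf over a single temperature variable and that taxa are independent; the function names, grid, and parameter values are illustrative and not taken from Kühl et al.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution evaluated on x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def joint_pdf_reconstruction(taxon_params, grid):
    """Multiply per-taxon pdfs on a regular grid; return the mode and the joint pdf."""
    joint = np.ones_like(grid, dtype=float)
    for mean, sd in taxon_params:
        joint *= normal_pdf(grid, mean, sd)
    joint /= joint.sum() * (grid[1] - grid[0])   # rescale to unit area
    return float(grid[np.argmax(joint)]), joint

# Hypothetical January temperature pdfs (mean, sd) for two taxa present in a sample
grid = np.linspace(-10.0, 10.0, 2001)
most_likely, joint = joint_pdf_reconstruction([(0.0, 1.0), (3.0, 2.0)], grid)
```

For normal pdfs the product peaks at the precision-weighted mean, so the taxon with the narrower pdf pulls the estimate towards its own mean, which is the 'narrow indicators receive high weight' behaviour described above.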
Estimated probability density function of Ilex aquifolium as an example for which the parametric normal distribution (solid line) fits well the non-parametric distribution (e.g., kernel function (dashed line), histogram). Distribution of Ilex aquifolium in combination with January temperature. Kühl et al. 2002
Estimated one- and two-dimensional pdfs of four selected species. The histograms (non-parametric pdf) and normal distributions (parametric pdf) on the left represent the one-dimensional pdfs. Crosses in the right-hand plots display the temperature values provided by the 0.5º x 0.5º gridded climatology (New et al., 1999). Black crosses indicate presence, grey crosses absence of the specific taxon. A small red circle marks the mean of the corresponding normal distribution and the ellipses represent 90% of the integral of the normal distribution centred on the mean. Most sample points lie within this range. The interval, however, may not necessarily include 90% of the data points. Carex secalina as an example of an azonally distributed species is an exception. A normal distribution does not appear to be an appropriate estimating function for this species, and therefore no normal distribution is indicated. Kühl et al. 2002
Reconstruction for the fossil assemblage of Gröbern. The thin ellipses indicate the pdfs of the individual taxa included in the reconstruction, and the thick ellipse the 90% uncertainty range of the reconstruction result. Kühl et al. 2002
Simplified pollen diagram for last interglacial from Gröbern (Litt 1994), reconstructed January and July temperature, and δ¹⁸O (after Boettger et al. 2000). Kühl & Litt 2003
Reconstructed most probable mean January (blue) and July (red) temperature and 90% uncertainty range (dotted lines) for three last interglacial sequences Kühl & Litt 2003
Comparison of the reconstructed mean January temperature using the pdf method (green) and the modern analogue technique (blue). Bispingen uncertainty range – 90%; La Grande Pile – 70%. Kühl & Litt 2003
TRANSFER-FUNCTION APPROACH
General Theory (Imbrie & Kipp 1971)
Y: biological responses ("proxy data")
X: set of environmental variables that are assumed to be causally related to Y (e.g. sea-surface temperatures)
B: set of other environmental variables that together with X completely determine Y (e.g. trace nutrients)
If Y is totally explicable as responses to variables represented by X and B, we have a deterministic model (no allowance for random factors, historical influences):
Y = XB
If B = 0 or is constant, we can model Y in terms of X and Re, a set of ecological response functions:
Y = X(Re)
In palaeoecology we need to know Re. We cannot derive Re deductively from ecological studies. We cannot build an explanatory model from our currently poor ecological knowledge. Instead we have to use direct empirical models based on observed patterns of Y in modern surface-samples in relation to X, to derive U, our empirical calibration or transfer functions:
Y = XU
Basic Idea of Quantitative Environmental Reconstructions using Transfer Functions
Modern data ('training set'):
BIOLOGICAL DATA (e.g. diatoms, pollen, chironomids): Y_m, n samples × m taxa
ENVIRONMENTAL DATA (e.g. mean July temperature): X_m, n samples × 1 variable
Y_m + X_m → Û_m
Fossil data:
BIOLOGICAL DATA: Y_f, t samples × m taxa
ENVIRONMENTAL DATA: X_f, t samples × 1 variable. Unknown; to be reconstructed from Û_m
Y_f + Û_m → X_f
In practice, this is a two-step process:
1. Regression, in which we estimate Û_m, the modern calibration functions or regression coefficients, from the training set (Y_m = modern surface-sample data; X_m = associated environmental data):
$Y_m = \hat{U}_m(X_m)$ (classical regression) or
$X_m = \hat{U}_m(Y_m)$ (inverse regression)
2. Calibration, in which we reconstruct X_f, the past environment, from the fossil set (Y_f = fossil core data) using the transfer functions Û_m or their inverse Û_m⁻¹:
$\hat{X}_f = \hat{U}_m^{-1}(Y_f)$ (classical calibration) or
$\hat{X}_f = \hat{U}_m(Y_f)$ (inverse calibration)
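The difference between the classical and inverse routes is easiest to see with one taxon and a straight-line model. The sketch below assumes a single abundance variable y and a linear relation; the function names are hypothetical.

```python
import numpy as np

def classical_calibration(x_modern, y_modern, y_fossil):
    """Classical: regress y on x in the training set, then invert the fitted line."""
    slope, intercept = np.polyfit(x_modern, y_modern, 1)   # y = intercept + slope * x
    return (np.asarray(y_fossil, dtype=float) - intercept) / slope

def inverse_calibration(x_modern, y_modern, y_fossil):
    """Inverse: regress x on y in the training set, then plug in the fossil values."""
    slope, intercept = np.polyfit(y_modern, x_modern, 1)   # x = intercept + slope * y
    return intercept + slope * np.asarray(y_fossil, dtype=float)
```

With noise-free data the two routes agree. With noisy data the inverse estimates are pulled towards the training-set mean, which tends to help near the centre of the gradient and hurt towards its ends.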
TRANSFER FUNCTION
X_m, Y_m → Û_m; Y_f + Û_m → X_f
Based on an unpublished diagram by Steve Juggins
Surface-Sediment Sampling Renberg HON corer Renberg HON corer + sediment core & mud-water interface
304 modern pollen samples Norway, northern Sweden, Finland (Sylvia Peglar, Heikki Seppä, John Birks, Arvid Odland) All from lakes of comparable size and morphometry, all collected in same way, and all with consistent pollen identifications and taxonomy Now extended to Estonia, Latvia, Lithuania, Sweden and western Russia – c. 950 comparable samples (Seppä et al. unpublished) Seppä & Birks 2001
PROXIES
Pollen: good indicators of vegetation and hence indirect indicators of climate.
Betula (birch), Alnus (alder), Quercus (oak), Pinus (pine), Empetrum nigrum (crowberry), Agropyron repens (Gramineae) (grass)
Modern pollen, identical treatment, all at same magnification, all stained with safranin.
Chironomids - good indicators of past lake-water temperatures and hence indirectly of past climate. Common late-glacial chironomid taxa. a: Tanytarsina; b: Sergentia; c: Heterotrissocladius; d: Hydrobaenus/Oliveridia; e: Chironomus; f: Dicrotendipes; g: Microtendipes; h: Polypedilum; i: Cladopelma. Scale bar represents 50 µm.
Freshwater diatoms - excellent indicators of lake-water chemistry (e.g. pH, total P). Not reliable direct climate indicators.
Basic Biological Assumptions
Marine planktonic foraminifera (Imbrie & Kipp 1971)
- Foraminifera are a function of sea-surface temperature (SST)
- Foraminifera can be used to reconstruct past SST
Pollen
- Pollen is a function of vegetation
- Vegetation is a function of climate
- Pollen is an indirect function of climate and can be used to reconstruct past climate
Chironomids (aquatic non-biting midges)
- Chironomids are a function of lake-water temperature
- Lake-water temperature is a function of climate
- Chironomids are an indirect function of climate and can be used to reconstruct past climate
Freshwater diatoms (microscopic algae)
- Diatoms are a function of lake-water chemistry
- Diatoms can be used to reconstruct past lake-water chemistry
- Lake-water chemistry may be a very weak function of climate
- Diatoms may be a very weak function of climate
Biological Proxy Data Properties May have species, expressed as proportions or percentages in samples “Closed” compositional data – difficult statistical properties Multicollinearity Biological data contain many zero values (absences) Species generally show non-linear unimodal responses to their environment, not simple linear responses
Environmental Data Properties
Generally few variables, often show a skewed distribution.
Strong multicollinearity (e.g. July mean temperature, growing season duration, annual mean temperature).
Often difficult to obtain (few modern climate stations, corrections for altitude of sampling sites, etc.).
Strong spatial autocorrelation (tendency of values at sites close to each other to resemble one another more than randomly selected sites; values at one site can be partially predicted from the values at neighbouring sites).
Species Response Models
LINEAR: A straight line displays the linear relation between the abundance value (y) of a species and an environmental variable (x), fitted to artificial data. (a = intercept; b = slope or regression coefficient).
UNIMODAL: A unimodal relation between the abundance value (y) of a species and an environmental variable (x). (u = optimum or mode; t = tolerance; c = maximum).
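The two response models can be written down directly. The sketch below takes the Gaussian curve as the specific unimodal form, with u, t, and c as defined on the slide; the function names are chosen for the example.

```python
import numpy as np

def linear_response(x, a, b):
    """Linear model: expected abundance y = a + b * x."""
    return a + b * np.asarray(x, dtype=float)

def gaussian_response(x, u, t, c):
    """Unimodal (Gaussian) model: y = c * exp(-(x - u)^2 / (2 * t^2)).

    u = optimum, t = tolerance (a standard deviation), c = maximum abundance.
    """
    x = np.asarray(x, dtype=float)
    return c * np.exp(-0.5 * ((x - u) / t) ** 2)
```

At x = u the curve reaches c, and one tolerance either side of the optimum it has fallen to about 61% of c.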
Species Responses
Species nearly always have non-linear unimodal responses along gradients. J. Oksanen 2002
Assumptions in Quantitative Palaeoclimatic Reconstructions (Birks et al. 1990, Telford & Birks 2005)
1. Taxa in training set (Y_m) are systematically related to the climate (X_m) in which they live.
2. Environmental variable (X_f, e.g. summer temperature) to be reconstructed is, or is linearly related to, an ecologically important variable in the system.
3. Taxa in the training set (Y_m) are the same as in the fossil data (Y_f) and their ecological responses (Û_m) have not changed significantly over the timespan represented by the fossil assemblage.
4. Mathematical methods used in regression and calibration adequately model the biological responses (Û_m) to the environmental variable (X_m).
5. Other environmental variables than, say, summer temperature have negligible influence, or their joint distribution with summer temperature in the fossil set is the same as in the training set.
6. In model evaluation by cross-validation, the test data are independent of the training data.
Approaches to Estimating Transfer Functions
1. Basic Numerical Models
Classical approach:
(1) Y = f(X) + error (Y = biology, X = environment)
(2) Estimate f by some mathematical procedure and 'invert' the estimated f to find the unknown past environment X_f from fossil data Y_f: X_f ≈ f⁻¹(Y_f)
Inverse approach:
(3) X = g(Y) + error. In practice, for various mathematical reasons, do an inverse regression or calibration.
(4) X_f = g(Y_f). Obtain a 'plug-in' estimate of the past environment X_f from fossil data Y_f.
f or g are 'transfer functions'.
2. Assumed Species Response Model
- Linear or unimodal
- No response model assumed (linear or non-linear)
3. Dimensionality
- Full (all species considered)
- Reduced (species components used)
4. Estimation Procedure
- Global (estimate parametric functions, extrapolation possible)
- Local (estimate non-parametric functions, extrapolation not possible)
Commonly Used Methods
Key: I = inverse; C = classical; L = linear; U = unimodal; NA = not assumed; R = reduced dimensionality; F = full dimensionality; G = global parametric estimation; Ln = local non-parametric estimation
Principal components regression (PCR): I, L (U), R, G
Segmented linear inverse regression: I, L, F, Ln
Partial least squares (PLS): I, L, R, G
Gaussian logit regression (GLR): C, U, F, G
Two-way weighted averaging (WA): C, U, F, G
WA-PLS: I, U, R, G
Modern analogue technique (MAT): I, NA, F, Ln
Artificial neural networks (ANN): I, NA, F, Ln
Smooth response surfaces: C, NA, F, Ln
Reasons for preferring methods with assumed species response model, full dimensionality, and global parametric estimation
1. Can test statistically if species A has a statistically significant relation to particular climate variables.
2. Can develop 'artificial' simulated data with realistic assumptions for numerical 'experiments'.
3. Such methods have clear and testable assumptions: less of a 'black box' than, for example, artificial neural networks.
4. Can develop model evaluation or diagnostic procedures analogous to regression diagnostics in statistical modelling.
5. Having a statistical basis, can adopt well-established principles of statistical model selection and testing. Minimises 'ad hoc' aspects.
LINEAR-BASED TRANSFER FUNCTIONS
Inverse Multiple Regression Approach = principal components regression
Multiple regression of temperature (X_m) on abundance of taxa in core tops (Y_m) (inverse regression):
$X_m = \hat{U}_m Y_m = b_0 + b_1 y_1 + b_2 y_2 + b_3 y_3 + \dots + b_m y_m$
$\hat{X}_f = \hat{U}_m Y_f = \hat{b}_0 + \hat{b}_1 y_1 + \hat{b}_2 y_2 + \hat{b}_3 y_3 + \dots + \hat{b}_m y_m$
i.e. $\hat{X}_f = \hat{b}_0 + \sum_{k=1}^{m} \hat{b}_k y_{ik}$
Approach most efficient if:
1. relation between each taxon and environment is linear with normal error distribution
2. environmental variable has normal distribution
Usually not usable because:
1. taxon abundances show multicollinearity
2. very many taxa
3. many zero values, hence regression coefficients unstable
4. basically linear model
Consider non-linear model and introduce extra terms:
$X_m = b_0 + b_1 c_1 + b_2 c_1^2 + b_3 c_2 + b_4 c_2^2 + b_5 c_3 + b_6 c_3^2 + \dots$
Can end up with more terms than samples. Cannot be solved.
Hence dimension reduction approach of Imbrie & Kipp (1971).
Location of 61 core top samples (Imbrie & Kipp 1971) 61 core-top samples x 27 taxa Principal components analysis + varimax rotation (‘factor analysis’) 61 samples x 4 varimax assemblages (79%) Will use as illustrative data-set in this lecture
Abundance of the tropical, subtropical, subpolar, and polar assemblages versus winter surface temperature for 61 core top samples (four panels). Data from Tables 4 and 13. Curves fitted by eye. Imbrie & Kipp 1971
Now did inverse regression using 4 varimax assemblages rather than the 27 original taxa.
Linear:
$X_m = b_0 + b_1 A + b_2 B + b_3 C + b_4 D$
where A, B, C and D are varimax assemblages.
Non-linear, quadratic:
$X_m = b_0 + b_1 A + b_2 B + b_3 C + b_4 D + b_5 AB + b_6 AC + b_7 AD + b_8 BC + b_9 BD + b_{10} CD + b_{11} A^2 + b_{12} B^2 + b_{13} C^2 + b_{14} D^2$
CALIBRATION STAGE using the fossil assemblages described as the 4 modern varimax assemblages:
$\hat{X}_f = \hat{b}_0 + \hat{b}_1 A_f + \hat{b}_2 B_f + \hat{b}_3 C_f + \hat{b}_4 D_f$
General abundance trends for four of the varimax assemblages related to winter surface temperatures. Winter surface temperatures "measured" by Defant (1961) versus those estimated from the fauna in 61 core top samples by means of the transfer function. Imbrie & Kipp 1971
Average surface salinities "measured" by Defant (1961) versus those estimated from the fauna in 61 core top samples. Summer surface temperatures "measured" by Defant (1961) versus those estimated from the fauna in 61 core top samples. Imbrie & Kipp 1971
Palaeoclimatic estimates for 110 samples of Caribbean core V12-133, based on palaeoecological equations derived from 61 core tops. T w = winter surface temperature; T s = summer surface temperature; ‰ = average surface salinity. Imbrie & Kipp 1971
Principal components regression (= Imbrie & Kipp (1971) approach)
Y → PC1, PC2, PC3 → X
Multiple linear regression or quadratic regression of X on PC1, PC2, PC3, etc.
PCA components maximise variance within Y.
Selection of components done visually until very recently. Now cross-validation is used to select the model with fewest components, lowest root mean square error of prediction (RMSEP), and lowest maximum bias: the 'minimal adequate model' in statistical modelling.
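Component selection by cross-validation can be sketched as follows. This is a minimal linear PCR with leave-one-out RMSEP, assuming column-centred species data and no varimax rotation or quadratic terms; it is not the Imbrie & Kipp procedure itself, and the function names are hypothetical.

```python
import numpy as np

def pcr_fit(species, env, n_components):
    """Regress env on the first n principal components of centred species data."""
    species = np.asarray(species, dtype=float)
    env = np.asarray(env, dtype=float)
    y_mean, x_mean = species.mean(axis=0), env.mean()
    centred = species - y_mean
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    loadings = vt[:n_components].T                 # taxa x components
    scores = centred @ loadings                    # samples x components
    coef, *_ = np.linalg.lstsq(scores, env - x_mean, rcond=None)
    return y_mean, x_mean, loadings, coef

def pcr_predict(model, species):
    """Project new samples onto the stored components and apply the regression."""
    y_mean, x_mean, loadings, coef = model
    return x_mean + (np.asarray(species, dtype=float) - y_mean) @ loadings @ coef

def loo_rmsep(species, env, n_components):
    """Leave-one-out root mean square error of prediction."""
    species = np.asarray(species, dtype=float)
    env = np.asarray(env, dtype=float)
    n = len(env)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                   # refit without sample i
        model = pcr_fit(species[keep], env[keep], n_components)
        errors[i] = pcr_predict(model, species[i:i + 1])[0] - env[i]
    return float(np.sqrt(np.mean(errors ** 2)))
```

Calling `loo_rmsep` for 1, 2, 3, ... components and choosing the smallest model whose RMSEP is close to the minimum gives one way of answering the 'why 4 assemblages?' question.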
Approach unsatisfactory because:
1. Why 4 assemblages? Why not 3, 5, 6? No cross-validation.
2. Assemblages inevitably unstable, because of many transformation, standardization, and scaling options in PCA.
3. Assumes linear relationships between taxa and their environment, although non-linear models are possible (quadratic terms) but curiously rarely used.
4. Expressing fossil assemblages in terms of modern varimax assemblages. Problems of fossil assemblages not resembling modern assemblages (no-analogue problem).
5. No sound theoretical basis, 'ad hoc'.
Segmented Linear Inverse Regression
Local estimation of parametric functions.
Scatter diagrams of: (A) % birch (Betula); and (B) % oak (Quercus) pollen versus latitude. The thirteen regions for which regression equations were obtained. Bartlein & Webb 1985
Regression equations for mean July temperature from the thirteen calibration regions in eastern North America
Region A: N; W
Pollen sum: Alnus + Betula + Cyperaceae + Forb sum + Gramineae + Picea + Pinus
July T (ºC) = *Pinus *Forb sum *Picea.5 (1.61) (.14) (.05) (.10) *Cyperaceae.5 – 0.37*Gramineae – 0.03*Alnus (.13) (.08) (.01)
R² = 0.80; adj. R² = 0.78; Se = 0.96 ºC; n = 114; F = 69.86; Pr =
Region B: N; W
Pollen sum: Abies + Alnus + Betula + Herb sum + Picea + Pinus
July T (ºC) = *Picea *Betula *Herb sum – 0.01*Alnus (2.27) (.19) (.14) (.01) (.01)
R² = 0.70; adj. R² = 0.70; Se = 1.52 ºC; n = 165; F = 95.48; Pr =
Regression equations used to reconstruct mean July temperature at 6000 yr BP. Bartlein & Webb 1985
"We selected the appropriate equation for each sample by identifying the calibration region that: (1) contains modern pollen data that are analogous to the fossil sample; and (2) has an equation that does not produce an unwarranted extrapolation when applied to the fossil sample."
'Ad hoc'
Isotherms for estimated mean July temperatures (ºC) at 6000 yr BP. Bartlein & Webb 1985
Reconstructions produced by the regression approach using regression equations from five different calibration areas Elk Lake, Minnesota Bartlein & Whitlock 1993
Approaches to Multivariate Calibration
Linear-based methods
Chemometrics: predicting chemical concentrations (responses) from near infra-red spectra (predictors)
Partial Least Squares Regression – PLS
Inverse approach. Form of PC regression developed in chemometrics.
PCR: components are selected to capture maximum variance within the predictor variables (species in our case) irrespective of their predictive value for the environmental response variable.
PLS: components are selected to maximise the covariance between linear combinations of predictor variables with the environmental response variables.
PLS usually requires fewer components and gives a lower prediction error than PCR.
Both are 'biased' inverse regression methods that guard against multicollinearity among predictors by selecting a limited number of uncorrelated orthogonal components (reduced rank or dimensionality). (Biased because some data are discarded; reduced dimensionality).
CONTINUUM REGRESSION (Stone & Brooks 1990)
α = 0: normal inverse least squares regression
α = 0.5: PLS
α = 1.0: PCR
PLS is thus a compromise and performs well by combining desirable properties of inverse regression (high correlation) and PCR (stable predictors of high variance) into one technique.
PLS will always give a better fit (r²) than PCR with same number of components.
Partial least squares regression (PLS)
Y = species; X = environmental variables
Y → PLS1, PLS2, PLS3 → X
Components selected to maximise covariance between linear combinations of species and environmental variables X.
Selection of number of PLS components to include based on cross-validation. Model selected should have fewest components possible and low RMSEP and low maximum bias.
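How PLS components are built can be illustrated with the standard NIPALS recursion for a single environmental variable (often called PLS1). The sketch assumes centred, unscaled species data and is for illustration only; the function names are hypothetical.

```python
import numpy as np

def pls1_fit(species, env, n_components):
    """PLS1 by NIPALS: species are predictors, env is the single response."""
    species = np.asarray(species, dtype=float)
    env = np.asarray(env, dtype=float)
    y_mean, x_mean = species.mean(axis=0), env.mean()
    e = species - y_mean          # residual predictor matrix
    f = env - x_mean              # residual response
    weights, loadings, inner = [], [], []
    for _ in range(n_components):
        w = e.T @ f
        w = w / np.linalg.norm(w)     # direction of maximum covariance with env
        t = e @ w                     # component scores
        tt = t @ t
        p = e.T @ t / tt              # species loadings on this component
        q = f @ t / tt                # regression of env on this component
        e = e - np.outer(t, p)        # deflate predictors
        f = f - q * t                 # deflate response
        weights.append(w)
        loadings.append(p)
        inner.append(q)
    W, P, q = np.array(weights).T, np.array(loadings).T, np.array(inner)
    coef = W @ np.linalg.solve(P.T @ W, q)   # coefficients on the original taxa
    return y_mean, x_mean, coef

def pls1_predict(model, species):
    """Apply the fitted coefficients to new (fossil or test) samples."""
    y_mean, x_mean, coef = model
    return x_mean + (np.asarray(species, dtype=float) - y_mean) @ coef
```

When the number of components equals the number of taxa, the fit coincides with ordinary inverse least squares regression; the benefit of PLS comes from stopping earlier, at the number of components chosen by cross-validation.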
UNIMODAL-BASED TRANSFER FUNCTIONS In late 1980s major environmental issue was so-called ‘acid-rain’ research. Involved establishing causes of recent surface-water acidification in lakes and rivers in NW Europe and parts of N America. Very politically charged with large power-generating companies employing scientists who were well paid to find faults in whatever one did. Led to major developments in palaeolimnology (and later to palaeoclimatology) thanks to the work and doctoral thesis (1987) of Cajo ter Braak, a Dutch statistician, on numerical and statistical methods based on the biologically realistic unimodal response model. Cajo ter Braak, April 1992
Basic Requirements in Quantitative Palaeoenvironmental Reconstructions (Birks et al. 1990)
1. Need biological system with abundant fossils that is responsive and sensitive to environmental variables of interest.
2. Need a large, high-quality training set of modern samples. Should be representative of the likely range of variables, be of consistent taxonomy and nomenclature, be of highest possible taxonomic detail, be of comparable quality (methodology, count size, etc.), and be from the same sedimentary environment.
3. Need fossil set of comparable taxonomy, nomenclature, quality, and sedimentary environment.
4. Need good independent chronological control for fossil set.
5. Need robust statistical methods for regression and calibration that can adequately model taxa and their environment with the lowest possible error of prediction and the lowest bias possible.
6. Need statistical estimation of standard errors of prediction for each reconstructed value.
7. Need statistical and ecological evaluation and validation of the reconstructions.
Linear or Unimodal Methods? (Birks 1995)
Estimate the compositional turnover or gradient length for the environmental variable(s) of interest.
Detrended canonical correspondence analysis with x as the only external or environmental predictor. Detrend by segments, non-linear rescaling.
Estimate of gradient length in relation to x in standard deviation (SD) units of compositional turnover. Length may be different for different climatic variables and the same biological data:
July mean T: 2.62 SD
Jan mean T: 2.76 SD
Annual mean T: 1.52 SD
If gradient length < 2 SD, taxa are generally behaving monotonically along gradient and linear-based methods are appropriate (e.g. PLS).
If gradient length > 2 SD, several taxa have their optima located within the gradient and unimodal-based methods are appropriate.
Unimodal Classical Methods: Maximum Likelihood Prediction of Gradient Values
Gaussian response model (regression):
+ We know observed abundances Y_m
+ We know gradient values X_m
= Estimate or model the species response curves for all species (Û_m)
Bioindication (calibration):
+ We know observed abundances Y_f
+ We know the modelled species response curves for all species (Û_m)
= Estimate the gradient value of X_f
The most likely value of the gradient is the one that maximises the likelihood function given observed and expected abundances of species.
Gaussian logit regression can be parameterised as a generalised linear model (binomial or quasi-binomial error structure, logit link function). Solved by maximum likelihood estimation. Can estimate optimum (u), tolerance (t), and maximum (c) for each species.
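The link between the fitted GLM coefficients and u, t, and c can be made explicit. The sketch assumes the usual quadratic parameterisation of the Gaussian logit model, logit(p) = b0 + b1·x + b2·x², with b2 < 0; the function names are hypothetical and no fitting is done here.

```python
import math

def gaussian_logit_coefficients(u, t, c):
    """Coefficients (b0, b1, b2) of logit(p) = b0 + b1*x + b2*x^2 for given u, t, c."""
    b2 = -1.0 / (2.0 * t ** 2)
    b1 = u / t ** 2
    b0 = math.log(c / (1.0 - c)) - u ** 2 / (2.0 * t ** 2)
    return b0, b1, b2

def optimum_tolerance_maximum(b0, b1, b2):
    """Recover optimum u, tolerance t, and maximum c from fitted coefficients."""
    if b2 >= 0:
        raise ValueError("b2 must be negative for a unimodal response")
    u = -b1 / (2.0 * b2)
    t = 1.0 / math.sqrt(-2.0 * b2)
    c = 1.0 / (1.0 + math.exp(-(b0 + b1 * u + b2 * u ** 2)))
    return u, t, c
```

A taxon whose fitted b2 is not significantly negative would fall into the 'linear logit' or 'no significant relationship' categories rather than the Gaussian one.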
Gaussian logit regression: Imbrie and Kipp core tops, 27 taxa
Columns: Summer SST; Winter SST; Salinity
Rows: Significant Gaussian logit model; Significant increasing linear logit model; Significant decreasing linear logit model; No significant relationship
Maximum likelihood approach to reconstruction (J. Oksanen 2002)
Likelihood is the probability of a given observed value with a certain expected value.
Maximum likelihood estimation: expected or reconstructed values that give the highest likelihood for the observed fossil assemblages
- ML estimates are close to observed values, and the proximity is measured with the likelihood function
- commonly use the negative logarithm for the likelihood, since combined probabilities may be very small
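ML calibration can be sketched as a search for the gradient value that minimises the negative log-likelihood. The example assumes Gaussian logit response curves stored as (b0, b1, b2) per species, abundances expressed as proportions, a binomial-type likelihood, and a simple grid search; real software would normally use a numerical optimiser.

```python
import numpy as np

def expected_abundance(x, coefs):
    """Expected proportions for each grid value (rows) and species (columns)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]
    eta = coefs[:, 0] + coefs[:, 1] * x + coefs[:, 2] * x ** 2
    return 1.0 / (1.0 + np.exp(-eta))

def ml_calibrate(y_fossil, coefs, grid):
    """Return the grid value with the lowest negative log-likelihood, and the profile."""
    p = np.clip(expected_abundance(grid, coefs), 1e-12, 1.0 - 1e-12)
    y = np.asarray(y_fossil, dtype=float)
    neg_log_lik = -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p), axis=1)
    return float(grid[int(np.argmin(neg_log_lik))]), neg_log_lik

# Hypothetical response curves: optima at 8, 12, and 16, tolerance 2, maximum 0.5
coefs = np.array([[-8.0, 2.0, -0.125], [-18.0, 3.0, -0.125], [-32.0, 4.0, -0.125]])
grid = np.linspace(0.0, 25.0, 251)
fossil = expected_abundance(11.0, coefs)[0]      # assemblage generated at x = 11
x_hat, profile = ml_calibrate(fossil, coefs, grid)
```

Plotting `profile` against `grid` gives the -logLik curve whose minimum marks the inferred value.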
Inferring past temperature from multivariate species composition. Panels: modern responses; -logLik (inferred and observed). Axis: temperature (ºC). Modified from J. Oksanen 2002
Gaussian logit regression (GLR) and maximum likelihood (ML) calibration
Modern data: Y_m + X_m → GLR → regression coefficients (b0, b1, b2) for all species
Fossil data: Y_f + GLR regression coefficients → ML calibration → X̂_f (environmental reconstruction)
Root mean squared error for winter SST (ºC), summer SST (ºC), and salinity (‰) using different procedures: Imbrie & Kipp 1971 linear; Imbrie & Kipp 1971 non-linear; maximum likelihood regression and ML calibration
Problems with Gaussian logit regression and maximum likelihood calibration
1. Computationally difficult: a bug existed in our software for nearly 10 years!
2. Computationally demanding and time consuming (and complex) to derive sample-specific prediction errors. Less of a problem as computing power increases and reliable computer-intensive procedures become more available in, for example, R.
ter Braak (1987) showed that simple two-way weighted averaging (WA) closely approximates GLR with artificial data.
ter Braak and van Dam (1989) and Birks et al. (1990) showed that WA performed nearly as well as (or even slightly better than) GLR with real data.
Two-Way Weighted Averaging
The basic idea is very simple. In a lake with a certain temperature range, chironomids with their temperature optima close to the lake's temperature will tend to be the most abundant species present.
A simple estimate of the species' temperature optimum is thus an average of all the temperature values for lakes in which that species occurs, weighted by the species' relative abundance. (WA regression)
Conversely, an estimate of a lake's temperature is the weighted average of the temperature optima of all the species present. (WA calibration)
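Weighted averaging regression: estimating species optima and tolerances from modern data
Optimum: $\hat{u}_k = \sum_{i=1}^{n} y_{ik} x_i / \sum_{i=1}^{n} y_{ik}$
Tolerance: $\hat{t}_k = \left[ \sum_{i=1}^{n} y_{ik} (x_i - \hat{u}_k)^2 / \sum_{i=1}^{n} y_{ik} \right]^{1/2}$
where $\hat{u}_k$ is the WA optimum of taxon k, $\hat{t}_k$ is the WA standard deviation or tolerance of taxon k, $y_{ik}$ is the percentage of taxon k in sample i, and $x_i$ is the environmental variable of interest in sample i. There are i = 1, ..., n samples and k = 1, ..., m taxa.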
Weighted averaging calibration: reconstructing an environmental variable from a fossil assemblage
A lake will tend to be dominated by taxa with temperature optima close to the lake's temperature. An estimate of this temperature is given by averaging the optima of all taxa present in the lake. If species' abundance data are available, these can be used as weights:
WA calibration: $\hat{x}_i = \sum_{k=1}^{m} y_{ik} \hat{u}_k / \sum_{k=1}^{m} y_{ik}$
where $\hat{x}_i$ = estimate of environmental variable for fossil sample i, $y_{ik}$ = abundance of species k in fossil sample i, and $\hat{u}_k$ = optimum of species k.
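Weighted averaging calibration or reconstruction
Simple WA: $\hat{x}_i = \sum_{k} y_{ik} \hat{u}_k / \sum_{k} y_{ik}$
WA with tolerance downweighting (WA tol): $\hat{x}_i = \sum_{k} (y_{ik} \hat{u}_k / \hat{t}_k^2) / \sum_{k} (y_{ik} / \hat{t}_k^2)$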
Two-way weighted averaging
[Diagram: modern data ($Y_m$ + $X_m$) → species optima $\hat{U}_1$, $\hat{U}_2$, ..., $\hat{U}_t$ (‘transfer function’) → applied to fossil data ($Y_f$) → environmental reconstruction ($\hat{X}_f$)]
Two-way weighted averaging. ter Braak & van Dam (1989) and Birks et al. (1990)
(i) Estimate species optima ($\hat{u}$) by weighted averaging of the environmental variable ($x$) of the sites. Species abundant at a site will tend to have their ecological optima close to the environmental variable at that site. (WA regression.)
(ii) Estimate the environmental values ($\hat{x}$) at the sites by weighted averaging of the species optima ($\hat{u}$). (WA calibration.)
(iii) Because averages are taken twice, the range of estimated x-values is shrunken, and a simple 'inverse' or 'classical' deshrinking is required. Usually regress $x$ on the preliminary estimates ($\hat{x}$) and take the fitted values as final estimates of $x$. (Deshrinking regression.)
Can downweight species in step (ii) by their estimated WA tolerances (niche breadths) so that species with wide tolerances have less weight than species with narrow tolerances (WA tol).
[Two-way WA with inverse regression = canonical correspondence analysis regression]
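The three steps of two-way WA with inverse deshrinking can be sketched as follows. This is a minimal illustration rather than the WACALIB or C2 implementation; the function names and the five-sample, three-taxon training set are hypothetical, and every taxon is assumed to occur in at least one training sample.

```python
def wa_optima(Y, x):
    """Step (i), WA regression: optimum of each taxon is the abundance-weighted
    mean of x over the samples. Assumes each taxon has a non-zero total."""
    optima = []
    for k in range(len(Y[0])):
        total = sum(row[k] for row in Y)
        optima.append(sum(row[k] * xi for row, xi in zip(Y, x)) / total)
    return optima

def wa_estimate(row, optima):
    """Step (ii), WA calibration: abundance-weighted mean of the taxon optima."""
    return sum(y * u for y, u in zip(row, optima)) / sum(row)

def inverse_deshrink(x, x_init):
    """Step (iii): least-squares regression of observed x on the initial
    WA estimates; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_init = sum(x) / n, sum(x_init) / n
    sxy = sum((a - mean_init) * (b - mean_x) for a, b in zip(x_init, x))
    sxx = sum((a - mean_init) ** 2 for a in x_init)
    slope = sxy / sxx
    return mean_x - slope * mean_init, slope

def fit_wa(Y, x):
    optima = wa_optima(Y, x)
    x_init = [wa_estimate(row, optima) for row in Y]
    intercept, slope = inverse_deshrink(x, x_init)
    return optima, intercept, slope

def predict_wa(model, row):
    optima, intercept, slope = model
    return intercept + slope * wa_estimate(row, optima)

# Hypothetical training set: percentages of three taxa in five lakes
Y = [[80, 20, 0], [50, 40, 10], [20, 60, 20], [10, 40, 50], [0, 20, 80]]
x = [4.0, 6.0, 8.0, 10.0, 12.0]          # observed temperatures
model = fit_wa(Y, x)
print(model)                              # optima, deshrinking intercept and slope
print(predict_wa(model, [30, 50, 20]))    # reconstruction for a fossil sample
```

In this toy data-set the initial WA estimates span a narrower range than the observed temperatures, so the deshrinking slope is greater than 1, which is the shrinkage effect described in step (iii).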
Estimation of Sample-Specific Errors - basic idea of computer re-sampling procedures
TRAINING SET: modern samples and lake-water temperature (T)
Jack-knifing: Do reconstruction 177 times. Leave out sample 1 and reconstruct T; add sample 1 but leave out sample 2 and reconstruct T. Repeat for all 177 reconstructions using a training set of size 177, leaving out one sample every time. Can derive jack-knifing estimate of T and its variance and hence its standard error.
Bootstrap: Draw at random a training set of 178 samples using sampling with replacement so that the same sample can, in theory, be selected more than once. Any samples not selected form an independent test set. Reconstruct T for both modern test-set samples and for fossil samples. Repeat for 1000 bootstrap cycles.
Mean square error of prediction =
1. error due to variability in estimating species parameters in training set (i.e. s.e. of bootstrap estimates) +
2. error due to variation in species abundances at a given T (i.e. actual prediction error: differences between observed T and the mean bootstrap estimate of T for modern samples when in the independent test set). This component is commonly ignored in simple bootstrapping.
Birks et al
Use of data and the bootstrap distribution to infer a sampling distribution. The bootstrap procedure estimates the sampling distribution of a statistic in two steps. The unknown distribution of population values is estimated from the sample data, then the estimated population is repeatedly sampled to estimate the sampling distribution of the statistic.
Error estimation by bootstrapping (WACALIB 3.1+ & C2)
For a 61-sample training set, draw 61 samples at random with replacement to give a bootstrap training set of size 61. Any samples not selected form a test set.
Mean square error of prediction = $S_1^2 + S_2^2$, where
$S_1$ = error due to variability in estimates of optima and/or tolerances in the training set (s.e. of bootstrap estimates)
$S_2$ = error due to variation in abundances at a given temperature (actual prediction error: differences between observed $x_i$ and the mean bootstrap estimate $\bar{x}_{i,boot}$, where $\bar{x}_{i,boot}$ is the mean of $x_{i,boot}$ for all cycles when sample i is in the test set)
For a fossil sample, RMSE = $(S_1^2 + S_2^2)^{1/2}$
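The two error components can be sketched with a small bootstrap loop. This is an illustrative sketch only: it uses simple WA without deshrinking as the transfer function, the function names and the toy data are hypothetical, and the combination of the two components as a root sum of squares is an assumption consistent with the mean-square-error partition above.

```python
import math
import random

def wa_optima(Y, x):
    """WA optima; None marks a taxon absent from this bootstrap training set."""
    optima = []
    for k in range(len(Y[0])):
        total = sum(row[k] for row in Y)
        if total > 0:
            optima.append(sum(row[k] * xi for row, xi in zip(Y, x)) / total)
        else:
            optima.append(None)
    return optima

def wa_predict(optima, row):
    """Simple WA estimate using only taxa that have an optimum."""
    pairs = [(y, u) for y, u in zip(row, optima) if u is not None]
    total = sum(y for y, _ in pairs)
    return sum(y * u for y, u in pairs) / total if total > 0 else None

def bootstrap_errors(Y, x, fossil, cycles=1000, seed=0):
    rng = random.Random(seed)
    n = len(Y)
    test_preds = [[] for _ in range(n)]   # out-of-bag estimates per modern sample
    fossil_preds = []
    for _ in range(cycles):
        idx = [rng.randrange(n) for _ in range(n)]   # sampling with replacement
        chosen = set(idx)
        optima = wa_optima([Y[i] for i in idx], [x[i] for i in idx])
        for i in range(n):
            if i not in chosen:                       # sample i is in the test set
                p = wa_predict(optima, Y[i])
                if p is not None:
                    test_preds[i].append(p)
        p = wa_predict(optima, fossil)
        if p is not None:
            fossil_preds.append(p)
    # S1: standard deviation of the bootstrap estimates for the fossil sample
    mean_f = sum(fossil_preds) / len(fossil_preds)
    s1 = math.sqrt(sum((p - mean_f) ** 2 for p in fossil_preds) / (len(fossil_preds) - 1))
    # S2: RMS difference between observed x_i and its mean out-of-bag estimate
    diffs = [x[i] - sum(ps) / len(ps) for i, ps in enumerate(test_preds) if ps]
    s2 = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return s1, s2, math.sqrt(s1 ** 2 + s2 ** 2)

# Hypothetical training set and fossil sample
Y = [[80, 20, 0], [50, 40, 10], [20, 60, 20], [10, 40, 50], [0, 20, 80]]
x = [4.0, 6.0, 8.0, 10.0, 12.0]
s1, s2, rmsep = bootstrap_errors(Y, x, fossil=[30, 50, 20], cycles=200, seed=1)
print(s1, s2, rmsep)
```

The first component differs between fossil samples, whereas the second is a property of the training set as a whole, so the combined error is sample-specific.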
Root mean square errors of prediction estimated by bootstrapping, for WA and WA tol
[Table: RMSE total, RMSE $S_1$, and RMSE $S_2$ for the training set, and errors for fossil samples, for summer sea-surface temperature (°C), winter sea-surface temperature (°C), and salinity (‰)]
Transfer Function Assessment
ROOT MEAN SQUARED ERROR (RMSE) of $(x_i - \hat{x}_i)$: $\mathrm{RMSE} = \left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2 \right]^{1/2}$
CORRELATION BETWEEN $x_i$ and $\hat{x}_i$: $r$
COEFFICIENT OF DETERMINATION: $r^2$
$r$ or $r^2$ measures the strength of the relationship between observed and inferred values and allows comparison between transfer functions for different variables.
$\mathrm{RMSE}^2 = \mathrm{error}^2 + \mathrm{bias}^2$
Error = SE $(x_i - \hat{x}_i)$: RANDOM PREDICTION ERROR ABOUT BIAS
Bias = Mean $(x_i - \hat{x}_i)$: SYSTEMATIC PREDICTION ERROR (mean of prediction errors)
Also maximum bias: divide the sampling interval of $x_i$ into equal intervals (usually 10), calculate the mean bias for each interval, and use the largest absolute value of mean bias for an interval as a measure of maximum bias.
Note that in RMSE the divisor is $n$, not $(n - 1)$ as in standard deviation. This is because we are using the known gradient values only.
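These assessment statistics are straightforward to compute. The sketch below is one possible implementation; the function name, the dictionary keys, and the sign convention (observed minus inferred) are assumptions, and the example values are invented.

```python
import math

def performance(observed, inferred, n_intervals=10):
    """RMSE, bias, random error, maximum bias, r and r2 for a transfer function.
    Prediction errors are taken as observed minus inferred (assumed convention).
    Assumes neither series is constant, otherwise r is undefined."""
    n = len(observed)
    resid = [o - p for o, p in zip(observed, inferred)]
    rmse = math.sqrt(sum(r * r for r in resid) / n)          # divisor n, not n - 1
    bias = sum(resid) / n
    error = math.sqrt(sum((r - bias) ** 2 for r in resid) / n)

    # Maximum bias: largest absolute mean error over equal intervals of the gradient
    lo, hi = min(observed), max(observed)
    width = (hi - lo) / n_intervals
    bins = [[] for _ in range(n_intervals)]
    for o, r in zip(observed, resid):
        j = min(int((o - lo) / width), n_intervals - 1) if width > 0 else 0
        bins[j].append(r)
    max_bias = max(abs(sum(b) / len(b)) for b in bins if b)

    # Pearson correlation between observed and inferred values
    mo, mp = sum(observed) / n, sum(inferred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, inferred))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in inferred)
    r = cov / math.sqrt(var_o * var_p)
    return {"rmse": rmse, "bias": bias, "error": error,
            "max_bias": max_bias, "r": r, "r2": r * r}

# Hypothetical observed and inferred temperatures
stats = performance([4, 6, 8, 10, 12], [5, 6.5, 8, 9.5, 11])
print(stats)
```

Because the random error is computed with divisor n, the identity RMSE² = error² + bias² holds exactly. In this invented example the mean bias is zero although the ends of the gradient are clearly biased, which is why maximum bias is reported alongside it.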
Bias and Error
Reliable: Root mean squared error of prediction (RMSEP)
Unreliable: Correlation, as it depends on the range of observations
Root mean squared error: Bias $b$: systematic difference. Error $\varepsilon$: random error about bias. $\mathrm{RMSE}^2 = b^2 + \varepsilon^2$
Must be cross-validated or will be badly biased
J. Oksanen 2002
Leave-one-out ('jack-knife'), each in turn, or divide data into training and test data sets. Leave-one-out changes the data too little, and hence exaggerates the goodness of prediction. K-fold cross-validation leaves out a certain proportion (e.g. 1/10) and evaluates the model for each of the data sets left out. Badly biased unless one does cross-validation J. Oksanen 2002 Cross-validation
Cross-validation statistics
Predicted values (cross-validation): RMSEP, $r_{cv}$, $r^2_{cv}$, mean bias, maximum bias; cf. apparent or estimated values: RMSE, $r$, $r^2$.
TRAINING SET ASSESSMENT AND SELECTION
Lowest RMSEP, highest $r_{cv}$ or $r^2_{cv}$, lowest mean bias, lowest maximum bias. Often a compromise between RMSEP and bias.
PARTITIONING RMSEP
$\mathrm{RMSEP}^2 = \mathrm{ERROR}^2 + \mathrm{BIAS}^2$
$S_1^2$: error due to estimating optima and tolerances
$S_2^2$: error due to variations in abundance of taxa at given environmental value
INITIAL ASSUMPTIONS 1.Taxa related to physical environment. 2.Modern and fossil taxa have same ecological responses. 3.Mathematical methods adequately model the biological responses. 4.Reconstructions have low errors. 5.Training set is representative of the range of variation in the fossil set. Statistical and Ecological Evaluation of Reconstructions
Reconstruction evaluation or diagnostics
1. RMSEP for individual fossil samples: Monte Carlo simulation using leave-one-out initially to estimate standard errors of taxon coefficients and then to derive sample-specific standard errors, or bootstrapping.
2. Goodness-of-fit statistics: Canonical correspondence analysis (= unimodal regression) of calibration set, fit fossil samples passively on the axis (environmental variable of interest), examine squared residual distance to axis, and see if any fossil samples are poorly fitted.
3. Analogue statistics: Good and close analogues. Extreme 5% and 2.5% of modern DCs.
4. Percentages of total fossil assemblage that consist of taxa not represented at all in the calibration data set, and percentages of total assemblage that consist of taxa poorly represented in the training set (e.g. < 10% occurrences) and that have coefficients poorly estimated in the training set (high variance of beta values in cross-validation).
< 5% not present: reliable
< 10% not present: okay
< 25% not present: possibly okay
> 25% not present: not reliable
Transfer-function performance statistics (RMSEP), Imbrie & Kipp 1971 data
Method | Summer SST | Winter SST | Salinity
PC regression | 2.55ºC | 2.57ºC | 0.57 ‰
PC regression with quadratic terms | 2.15ºC | 1.54ºC | 0.57 ‰
GLR (ML) | 1.63ºC | 1.20ºC | 0.54 ‰
WA | 2.02ºC | 1.97ºC | 0.57 ‰
Reconstructions of the pH history of Lysevatten based on historical data and inference from the subfossil diatoms in the sediment. Historical data are pH measurements (thin solid line) and indirect data from fish reports and data from other similar lakes (thin broken line). The insert, showing pH variations from April 1961 to March 1962, is based on real measurements. Diatom- inferred values (thick solid line) were obtained by weighted averaging. ECOLOGICAL VALIDATION Diatoms and pH Renberg & Hultberg 1992
Weighted Averaging – An Assessment
1. Ecologically plausible – based on unimodal species response model.
2. Mathematically simple but has a rigorous mathematical theory. Properties fairly well known now.
3. Empirically powerful:
a. does not assume linear responses
b. not hindered by too many species, in fact helped by many species! Full dimensionality
c. relatively insensitive to outliers
4. Tests with simulated and real data – at its best with noisy, species-rich compositional percentage data with many zero values over long environmental gradients (> 3 standard deviations).
5. Because of its computational simplicity, can derive error estimates for predicted inferred values.
6. Does well in ‘non-analogue’ situations as it is not based on the assemblage as a whole but on INDIVIDUAL species optima and/or tolerances. Global parametric estimation.
7. Ignores absences.
8. Weaknesses.
Species packing model: Gaussian logit curves of the probability (p) that a species occurs at a site, against environmental variable x. The curves shown have equispaced optima (spacing = 1), equal tolerances (t = 1) and equal maximum probabilities occurrence (p max = 0.5). x o is the value of x at a particular site. ter Braak 1987
Diatoms and pH Birks 1994
1.Sensitive to distribution of environmental variable in training set. 2.Considers each environmental variable separately. 3.Disregards residual correlations in species data. Can extend WA to WA-partial least squares to include residual correlations in species data in an attempt to improve our estimates of species optima. Weaknesses of Weighted Averaging
Weighted averages
WA estimate of species optimum ($\tilde{u}$) is good if:
1. Sites are uniformly distributed over species range
2. Sites are close to each other
WA estimates of gradient values ($\tilde{x}$) are good if:
1. Species optima are dispersed uniformly around x
2. All species have equal tolerances
3. All species have equal modal abundances
4. Optima are close together
Notation: y = abundance, i = species, u = optimum, j = site, x = gradient value, ~ = WA estimate
These conditions are only true for infinite species packing conditions!
Weighted averages are good estimates of Gaussian optima, unless the response is truncated. Bias towards the gradient centre: shrinkage, hence the need for deshrinking regression J. Oksanen 2002 pH WA GLR WA GLR Biases and truncation in weighted averages
Chemometrics – predicting chemical concentrations from near infra-red spectra Responses Predictors LINEAR UNIMODAL PCR PLS CAR WA-PLS Inverse Approaches to Multivariate Calibration
Correspondence Analysis Regression
Roux 1979: Reduced Imbrie & Kipp (1971) modern foraminifera data to 3 CA axes. Then used these in inverse regression. Reduced dimensionality.
RMSEP
Method | Summer temp | Winter temp
PC regression | 2.55°C | 2.57°C
PC regression with quadratic terms | 2.15°C | 1.54°C
CA regression | 1.72°C | 1.37°C
GLR (ML) | 1.63°C | 1.20°C
WA | 2.02°C | 1.97°C
WA-PLS | 1.53°C | 1.17°C
Shows importance of using a unimodal-based method
Extend simple WA to WA-PLS to include residual correlations in species data in an attempt to improve our estimates of species optima. Partial least squares (PLS) Form of PCA regression of x on y PLS components selected to show maximum covariance with x, whereas in PCA regression components of y are calculated irrespective of their predictive value for x. Weighted averaging PLS WA = WA-PLS if only first WA-PLS component is used WA-PLS uses further components, namely as many as are useful in terms of predictive power. Uses residual structure in species data to improve our estimates of species parameters (optima) in final WA predictor. Optima of species that are abundant in sites with large residuals are likely to be updated most in WA-PLS. Weighted Averaging Partial Least Squares (WA-PLS)
Weighted averaging partial least squares regression (WA-PLS). ter Braak & Juggins (1993) and ter Braak et al. (1993) YX PLS1 PLS2 PLS3 Components selected to maximise covariance between species weighted averages and environmental variable x Selection of number of PLS components to include based on cross- validation. Model selected should have fewest components possible and low RMSEP and maximum bias – minimal adequate model. Reduced dimensionality. Global parametric estimation.
Weighted averaging partial least squares – WA-PLS
1. Centre the environmental variable by subtracting the weighted mean.
2. Take the centred environmental variable ($x_i$) as initial site scores (cf. WA).
3. Calculate new species scores by WA of site scores.
4. Calculate new site scores by WA of species scores.
5. For component 1, go to 6. For components 2 and more, make site scores uncorrelated with previous axes.
6. Standardise new site scores and use as new component (cf. WA).
7. Regress the environmental variable on the components obtained so far using a weighted (inverse) regression and take the fitted values as the current estimate of the environmental variable. Go to step 2 and use the residuals of the regression as new site scores (hence the name ‘partial’) (cf. WA).
Optima of species that are abundant in sites with large residuals are likely to be updated most.
Calculating inferred environment variable
Using the WA-PLS model, inferred environment is calculated as a weighted sum (normalised by the sample total): $\hat{x}_i = \sum_k y_{ik} \hat{\beta}_k / \sum_k y_{ik}$, where $\hat{x}_i$ is the inferred environment for sample i, $y_{ik}$ is the abundance of fossils of taxon k in sample i, and $\hat{\beta}_k$ is the beta coefficient of species k.
Leave-one-out and test set cross-validation
Performance of WA-PLS in relation to the number of components (s): apparent error (RMSE) and prediction error (RMSEP). The estimated optimum number of components is 3 because three components give the lowest leave-one-out RMSEP in the training set. The last column is not generally available for real data as it is based on independent test data; these test-set values suggest a 2-component model. Problem of ‘overfitting’ the model.
[Table: columns s, training set apparent RMSE, training set leave-one-out RMSEP, test set RMSEP; * = selected number of components]
ter Braak & Juggins 1993
The performance of WA-PLS applied to three diatom data sets for different numbers of components (s) in terms of apparent RMSE and leave-one-out RMSEP (* = selected model).
[Table: RMSE and RMSEP by number of components s for the SWAP, Bergen, and Thames data-sets]
Reduction in prediction error (%): SWAP 0, Bergen 19, Thames 32
ter Braak & Juggins 1993
Leave-one-out cross validation Predicted air temperature. RMSEP = 0.89ºC Bias = 0.61ºC Predicted – observed air temperature 1:1 Norwegian chironomid–climate training set
Brooks & Birks 2000
Inferred mean July air temperature Oxygen isotope ratios Brooks & Birks 2000
Precipitation mm Mean July ºC Mean January ºC Norwegian pollen and climate Birks et al. (unpublished)
Generalised additive models
Root mean squared errors of prediction (RMSEP) based on leave-one-out jack-knifing cross-validation for annual precipitation, mean July temperature, and mean January temperature using five different statistical models.
[Table: columns annual precipitation (mm), mean July temperature (°C), mean January temperature (°C); rows include weighted averaging (WA) (inverse), WA-PLS, and modern analogue technique (MAT)]
Vuoskojaurasj, Abisko, Sweden
Vuoskojaurasj consensus reconstructions
Tibetanus, Abisko Valley Inferred from pollen Hammarlund et al Reconstruction validation Oxygen isotopes
Björnfjelltjörn, N. Norway
Björnfjelltjörn consensus reconstructions
304 modern pollen samples Norway, northern Sweden, Finland (Sylvia Peglar, Heikki Seppä, John Birks, Arvid Odland) Seppä & Birks (2001) Broad-Scale Studies
Performance statistics - WA-PLS - leave-one-out cross-validation. Seppä & Birks (2001)
[Table: RMSEP, R², and maximum bias for July temperature (RMSEP 1.0ºC) and annual precipitation (RMSEP 341 mm)]
[Figure panels: Predicted versus observed; Residuals]
Broad-scale patterns in western Norway using pollen data Changes in July summer temperature relative to present-day reconstructed temperature on a S-N transect west of the Scandes mountains. 16 sites covering all or much of the Holocene. South
Nesje et al. (2005) North
1.Different methods, although they have similar modern model performances, can give very different reconstruction results. Birks (2003) Some Warnings!
Validate using another proxy - macrofossils of tree birch Importance of independent validation but which proxies to use? 2.Use of different proxies - different proxies may give different reconstruction, e.g. mean July temperature at Björnfjell, northern Norway.
3.Covarying environmental variables e.g. temperature and lake trophic status (e.g. total N or P) or temperature and lake depth Brodersen & Anderson (2002)
Anderson 2000 pH and climate Environmental variables often co-vary Is the inferred temperature change real or a result of pH changes that weakly co-vary with temperature?
MODERN ANALOGUE-BASED APPROACHES
Do an analogue-matching between fossil sample i and all available modern samples with associated environmental data. Find the modern sample(s) most similar to i, and infer the past environment for sample i to be the modern environment for those modern samples. Repeat for all fossil samples.
PROBLEMS
1. Assessment of ‘most similar’?
2. 1, 2, 9, 10 most similar?
3. No-analogues for past assemblages.
4. Choice of similarity measure.
5. Require huge set of modern samples of comparable site type, pollen morphological quality, etc, as fossil samples. Must cover vast geographical area.
6. Human impact, especially in Eurasia and North America.
7. Multiple analogues (the Pinus problem).
Reconstructions produced using the analogue approach Elk Lake, Minnesota Bartlein & Whitlock 1993
Modified modern analogue approaches (Joel Guiot)
1. Taxon weighting: Palaeobioclimatic operators (PBO) computed from either a time-series of a fossil sequence or from a PCA of fossil pollen data from a large spatial array of sites. Weights are selected to 'emphasise the climate signal within the fossil data' and to 'highlight those taxa that show the most coherent behaviour in the vegetational dynamics', 'to minimise the human action which has significantly disturbed the pollen spectra', and 'to reduce noise'.
2. Environmental estimates are weighted means of estimates based on the 20, 40 or 50 or so most similar assemblages.
3. Standard deviations of these estimates are used as an approximate standard error.
Reconstruction of variations in annual total precipitation and mean temperature expressed as deviations from the modern values (1080 mm and 9.5 °C for La Grande Pile; 800 mm and 11 °C for Les Echets). The error bars are computed by simulation. The vertical axis is obtained by linear interpolation from the dates indicated in Fig. 2. Cor is the correlation between estimated and actual data. +ME is the mean upper standard deviation associated with the estimates, and -ME is the lower standard deviation. These statistics are calculated on the fossil data and on the modern data. In this case, R must be replaced by C. Guiot et al. 1989
Modern Analogue Techniques for environmental reconstruction = k-nearest neighbours (k-NN)
MAT, ANALOG, C2
1. Modern data and environmental variable(s) of interest. Do analogue matches and environmental prediction for all samples, with cross-validation by jack-knifing. Find the number of analogues that gives the lowest RMSEP for the environmental variable, based on the mean or weighted mean of estimates of the environmental variable. Can calculate bias statistics as well.
2. Reconstruct using fossil data with the ‘optimal’ number of analogues (lowest RMSEP, lowest bias).
Use chord distance or chi-squared distance as dissimilarity measure. Optimises signal to noise ratio with ‘closed’ percentage data.
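The two steps above can be sketched as a small k-nearest-neighbour routine. This is a minimal illustration, not the MAT, ANALOG, or C2 code; the function names and the toy training set are hypothetical, squared chord distance is used as the dissimilarity measure, and inverse-distance weighting is one assumed choice for the weighted mean.

```python
import math

def to_proportions(row):
    total = sum(row)
    return [v / total for v in row]

def sq_chord(a, b):
    """Squared chord distance between two assemblages given as proportions."""
    return sum((math.sqrt(p) - math.sqrt(q)) ** 2 for p, q in zip(a, b))

def mat_predict(train_Y, train_x, sample, k=3, weighted=False):
    """Mean (or inverse-distance weighted mean) of x for the k closest analogues."""
    s = to_proportions(sample)
    nearest = sorted((sq_chord(to_proportions(r), s), xi)
                     for r, xi in zip(train_Y, train_x))[:k]
    if weighted:
        weights = [1.0 / max(d, 1e-12) for d, _ in nearest]  # avoid division by zero
    else:
        weights = [1.0] * len(nearest)
    return sum(w * xi for w, (_, xi) in zip(weights, nearest)) / sum(weights)

def loo_rmsep(Y, x, k, weighted=False):
    """Leave-one-out RMSEP for a given number of analogues k."""
    sq_errors = []
    for i in range(len(Y)):
        pred = mat_predict(Y[:i] + Y[i + 1:], x[:i] + x[i + 1:], Y[i], k, weighted)
        sq_errors.append((pred - x[i]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

def best_k(Y, x, candidates):
    """Step 1: number of analogues giving the lowest leave-one-out RMSEP."""
    return min(candidates, key=lambda k: loo_rmsep(Y, x, k))

# Hypothetical modern training set (percentages) and environmental values
Y = [[80, 20, 0], [50, 40, 10], [20, 60, 20], [10, 40, 50], [0, 20, 80]]
x = [4.0, 6.0, 8.0, 10.0, 12.0]
k_opt = best_k(Y, x, candidates=(1, 2, 3))
# Step 2: reconstruct a fossil sample with the selected number of analogues
print(k_opt, mat_predict(Y, x, [30, 50, 20], k=k_opt))
```

Only RMSEP is used to choose k here; bias statistics could be added to the selection in the same loop.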
POLLEN-CLIMATE RESPONSE SURFACES
Pollen percentages in modern samples plotted in ‘climate space’ (cf. Iversen’s thermal limit species +/– plotted in climate space).
Contoured: trend-surface analysis, R² (Bartlein et al); contoured only by locally weighted regression (Webb et al)
Uses: reconstruction purposes (grid, analogue matching); simulation purposes
PROBLEMS
1. Need large high-quality modern data for large geographical areas.
2. No robust error estimation for reconstruction purposes.
3. Reconstruction procedure ‘ad hoc’ – grid size, etc.
Response surfaces for individual pollen types. Each point is labelled by the abundance of the type. Many points are hidden – only the observation with the highest abundance was plotted at each position. For (a) to (e), ‘+’ denotes 0%, ‘0’ denotes 0-10%, ‘1’ denotes 10-20%, ‘2’ denotes 20-30%, etc. For (f) to (h), ‘+’ denotes 0%, ‘0’ denotes 0-1%, ‘1’ denotes 1-2%, ‘2’ denotes 2-3%, etc. ‘H’ denotes greater than 10%. Bartlein et al. 1986
(A) Percentage of spruce (Picea) pollen at individual sites plotted in climate space along axes for mean July temperature and annual precipitation. (B) Grid laid over the climate data to which the pollen percentages are fitted by local-area regression. The box with the plus sign is the window used for local-area regression. (C) Spruce pollen percentages fitted onto the grid. (D) Contours representing the response surface and pollen percentages shown in part C. Webb et al. 1987
Reconstruction produced by the response surface approach Elk Lake, Minnesota Bartlein & Whitlock 1993
Fossil and simulated isopoll map sequences for Betula. Isopolls are drawn at 5, 10, 25, 50 and 75% using an automatic contouring program. Simulation purposes Huntley 1992
Fossil and simulated isopoll map sequences from Quercus (deciduous). Maps are drawn at year intervals between yr BP and the present. The upper map sequence presents the observed fossil and contemporary pollen values. The lower map sequence presents the pollen values simulated, by means of the pollen-climate response surface from the climate conditions obtained by applying to the measured contemporary climate the palaeoclimate anomalies that Kutzbach & Guetter (1986) simulated using the NCAR CCM, for to 3000 yr BP. The map for the present is simulated from the measured contemporary climate. Isopolls are drawn at 2, 5, 10, 25, and 50% using an automatic contouring program. Huntley 1992
1.Choice of how much or how little smoothing. 2.Choice of scale of grid for reconstructions. 3.No statistical measure of ‘goodness-of-fit’. 4.No reliable error estimation for predicted values. 5.Reconstruction procedure is close to modern analogue technique but with smoothed modern data. Response surfaces - Critique
CONSENSUS RECONSTRUCTIONS AND SMOOTHERS
Plotting of Reconstructed Values
1. Plot the reconstructed values against depth or age, and indicate the observed modern value if known.
2. Plot deviations from the observed modern value or the inferred modern value against depth or age.
3. Plot centred values (subtract the mean of the reconstructed values) against depth or age to give relative deviations.
4. Plot standardised values (subtract the mean of the reconstructed values and divide by the standard deviation of the reconstructed values) against depth or age to give standardised deviations.
5. Add a LOESS smoother to help highlight major trends and to separate signal from noise in reconstructions.
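The transformations in items 2 to 4 can be sketched in a few lines. The function name, the dictionary keys, and the example values are hypothetical, and the sample standard deviation (divisor n - 1) is an assumed choice. The LOESS smoother of item 5 is not implemented here.

```python
import statistics

def plotting_series(reconstructed, modern_value=None):
    """Return the series described in items 2-4 for a list of reconstructed values."""
    mean = statistics.mean(reconstructed)
    sd = statistics.stdev(reconstructed)      # sample standard deviation (n - 1)
    series = {
        "centred": [v - mean for v in reconstructed],
        "standardised": [(v - mean) / sd for v in reconstructed],
    }
    if modern_value is not None:
        # deviations from the observed (or inferred) modern value
        series["from_modern"] = [v - modern_value for v in reconstructed]
    return series

# Hypothetical reconstructed July temperatures and a modern value of 11
series = plotting_series([10.0, 12.0, 14.0], modern_value=11.0)
print(series)
```

Each series would then be plotted against depth or age in the usual way.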
Brooks & Birks 2001 Trends or RMSEP? Chironomids
Elk Lake climate reconstruction summary. The three series plotted with red, green and blue lines show the reconstructions produced by the individual approaches, the series plotted with the thin black line show the envelope of the prediction intervals, and the series plotted with a thick purple line represents the stacked and smoothed reconstruction of each variable (constructed by simple averaging of the individual reconstructions for each level, followed by smoothing [Velleman, 1980]). The modern observed values ( ) for Itasca Park are also shown. Consensus Reconstructions Bartlein & Whitlock 1993
PROBLEMS OF SPATIAL AUTOCORRELATION IN TRANSFER FUNCTIONS
Telford & Birks (2005) Quaternary Science Reviews 24:
Estimating the predictive power and performance of a training set as RMSEP, maximum bias, r², etc., by cross-validation ASSUMES that the test set (one or many samples) is STATISTICALLY INDEPENDENT of the training set. Cross-validation in the presence of spatial autocorrelation seriously violates this assumption as the samples are not spatially and statistically independent.
Positive spatial autocorrelation is the tendency of sites close to each other to resemble one another more than randomly selected sites. The value of an autocorrelated variable at one site can be predicted from its values at neighbouring sites. Property of almost all climate data and much ecological and biological data. Can cause spurious associations (inflated Type I errors) between species assemblages and environmental variables, particularly when the latter are spatially smooth like sea-surface temperature or air temperature.
“The elimination of spurious correlation due to position in time or space” Student 1914
“Spatial autocorrelation: trouble or new paradigm?” Pierre Legendre 1993
Atlantic foraminiferal data-set - 947 samples
Methods: Gaussian logit regression, WA, WA-PLS, artificial neural networks (ANN), MAT
Ten-fold cross-validation repeated 10 times with 80% training set, 10% optimisation set (to find number of WA-PLS components, number of analogues in MAT, to optimise ANN), and 10% independent set.
Cross-validation within the N Atlantic set (80% training, 10% optimisation, 10% independent test) and both N Atlantic (80% training, 20% optimisation) and S Atlantic (independent test set) Cross-validation RMSEP ºC N Atlantic Independent test RMSEP ºC S Atlantic GLR * WA * WA-PLS (5)1.37 *1.95 ANN1.01 *2.01 MAT0.89 *1.87
Cross-validation of N Atlantic data suggests ANN, MAT, and WA-PLS have the best performance. Using the S Atlantic data as a test set where there can be no spatial autocorrelation, ANN, WA-PLS, and MAT are the worst! Similar results if N Atlantic split at 40ºW into western (test set) and eastern (training set) halves, namely ANN, MAT, and WA-PLS are the worst. Why?
Why should MAT, WA-PLS, and ANN perform worse when applied to a spatially independent test set, but appear to perform best when applied to spatially dependent or spatially autocorrelated test set? MAT – selects the k-nearest neighbours in the training set using an appropriate dissimilarity coefficient, and calculates a (weighted) mean of the environmental parameter (e.g. SST) of interest. Dissimilarity measure between sites is a holistic measure of compositional similarities in the assemblages. Best analogues are not only similar in SST but similar in other environmental variables. As most environmental variables are autocorrelated, sites that are similar in ALL environmental variables are likely to be close to the site being tested. As such sites will also have a similar SST, this will strengthen the apparent relationship between assemblages and SST. Local non-parametric estimation in MAT.
ANN – extract unknown structure from data. Cannot distinguish between structure imposed by SST and structure generated by spatial autocorrelation. Uses both to minimise RMSEP. Need careful training to prevent model overfitting. Use training and optimisation sets and then an independent test set. If training and test data are not spatially independent, the ANN model learns features of the training set area. Must have a spatially independent test set to force ANN to use the general relationships only. Local non-parametric estimation. WA-PLS – uses additional components to account for residual correlation in the biological data after fitting a WA model to SST. Improves RMSEP by reducing 'edge effects' implicit in WA and by picking up residual variation in certain areas of the gradient, and this may be spatially autocorrelated to SST. WA and GLR – only model the variation in the assemblages that is correlated with SST. Robust to spatial structure in the data. Global parametric estimation.
Models with tuneable parameters (ANN, MAT, WA-PLS) perform worse when tested with a spatially independent test set. Models without tuneable parameters (WA, GLR) are easier to fit and perform better when tested with a spatially independent test set. These arguments also apply to pollen-climate data-sets using MAT, ANN, WA-PLS, and response-surfaces (= smoothed MAT). How to define a spatially independent test set for pollen? The individualistic nature of lakes may mean that diatom-pH, diatom-total P, diatom-salinity, and possibly chironomid-temperature transfer functions will be less prone to autocorrelation problems.
Lowest autocorrelation in MAT and ANN residuals. Highest autocorrelation in WA and GLR (= ML) residuals. Highlights 'secret assumption' of transfer functions
How widespread is the spatial autocorrelation problem? Telford & Birks 2009 Quat Sci Rev doi: /j.quascirev Several data-sets diatoms and lake-water pH benthic foraminifera and salinity planktonic foraminifera and summer SST arctic pollen and July sunshine
A simple tool to detect autocorrelation is to compare two deletion schemes: (i) delete samples at random and derive the transfer function and its performance statistics, or (ii) delete sites geographically close to the test sample and derive the transfer function and its performance statistics. If strong autocorrelation is present, deleting geographically close sites will preferentially delete the samples most similar to the test sample. Because of autocorrelation, these samples inflate the apparent ‘good’ performance of the transfer function in cross-validation. Thus their deletion should drastically decrease the performance of the transfer function. In contrast, random deletion should have much less effect on model performance.
Telford & Birks 2008 submitted random deletion neighbour deletion Note different y-axis scales
How to cross-validate a transfer function in the presence of spatial autocorrelation? Telford & Birks 2008 submitted
h-block cross-validation. In time-series analysis, an observation is deleted from the training-set along with h observations on either side to remove the influence of temporal autocorrelation. Extend this to spatial autocorrelation by defining radius h in km.
What value to use? Use the range of a variogram of the residuals ($\varepsilon$) from the basic transfer function model $Y = f(X) + \varepsilon$. (A variogram is a standard tool in spatial analysis: a plot of half the squared difference between two observations against their distance in space, averaged for a series of distance classes; Legendre & Fortin 1989.)
What are the MAT and WA-PLS RMSEP for leave-one-out and h-block cross-validation?
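A minimal sketch of spatial h-block cross-validation follows, using a nearest-analogue (k-NN) predictor as the transfer function. Everything named here is hypothetical: the function names, the six sites on a line, and the use of planar coordinates in km with Euclidean distance (real data would normally need great-circle distances). Setting h = 0 reduces the procedure to leave-one-out, provided no two sites share coordinates.

```python
import math

def sq_chord(a, b):
    """Squared chord distance between two assemblages given as proportions."""
    return sum((math.sqrt(p) - math.sqrt(q)) ** 2 for p, q in zip(a, b))

def knn_predict(train_Y, train_x, sample, k):
    nearest = sorted((sq_chord(r, sample), xi) for r, xi in zip(train_Y, train_x))[:k]
    return sum(xi for _, xi in nearest) / len(nearest)

def h_block_rmsep(Y, x, coords, h, k=1):
    """RMSEP when each test sample is predicted from a training set that
    excludes the sample itself and every site within h km of it."""
    sq_errors = []
    for i, (yi, xi, ci) in enumerate(zip(Y, x, coords)):
        keep = [j for j, cj in enumerate(coords)
                if j != i and math.hypot(ci[0] - cj[0], ci[1] - cj[1]) > h]
        if not keep:
            continue                      # no training samples left outside the block
        pred = knn_predict([Y[j] for j in keep], [x[j] for j in keep], yi, k)
        sq_errors.append((pred - xi) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Hypothetical sites 10 km apart along a smooth temperature gradient
Y = [[0.9, 0.1], [0.75, 0.25], [0.6, 0.4], [0.4, 0.6], [0.25, 0.75], [0.1, 0.9]]
x = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
coords = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0), (40.0, 0.0), (50.0, 0.0)]
loo = h_block_rmsep(Y, x, coords, h=0.0)       # ordinary leave-one-out
hblock = h_block_rmsep(Y, x, coords, h=15.0)   # immediate neighbours excluded
print(loo, hblock)
```

In this invented example the environment varies smoothly in space, so removing the nearest sites also removes the best analogues and the h-block RMSEP is larger than the leave-one-out RMSEP.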
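MAT and WA-PLS: RMSEP for leave-one-out (L-O-O) and h-block cross-validation
Data-set | Statistic | MAT | WA-PLS
Diatoms - pH | L-O-O | |
Diatoms - pH | h-block | |
Diatoms - pH | % change: 0%
Benthic forams - salinity | L-O-O | |
Benthic forams - salinity | h-block | |
Benthic forams - salinity | % change | 171% | 100%
Planktonic forams - SST | L-O-O | 1.2ºC | 1.7ºC
Planktonic forams - SST | h-block | 3.0ºC | 2.7ºC
Planktonic forams - SST | % change | 150% | 58%
Pollen - July sunshine | L-O-O | 2.3% | 3.6%
Pollen - July sunshine | h-block | 7.6% | 9.0%
Pollen - July sunshine | % change | 230% | 150%
Increase of RMSEP in h-block cross-validation for all data except diatoms and pH. Greatest increase in RMSEP always in MAT and in training-sets with the highest spatial autocorrelation in the X variable.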
USE OF ARTIFICIAL, SIMULATED DATA-SETS Simulated data-sets Generate many training sets (different numbers of samples and taxa, different gradient lengths, vary extent of noise, absences, etc) and evaluation test sets, all under different species response models to investigate specific problems, e.g. no-analogue problem
No-analogue Problem
1. Probably widespread.
2. Does it matter?
3. Analogue-based techniques for reconstruction - YES! Modern analogue technique, response surfaces, artificial neural networks.
4. GLR, WA, and WA-PLS regression methods: What we need are ‘good’ (i.e. reliable) estimates of $\hat{u}_k$. Apply them to the same taxa but in no-analogue conditions in the past. Assume that the realised niche parameter $\hat{u}_k$ is close to the potential or theoretical niche parameter $u_k^*$. GLR, WA, and WA-PLS are, in reality, additive indicator species approaches rather than strict multivariate analogue-based methods.
5. Simulated data: ter Braak, Chemometrics & Intelligent Laboratory Systems 28, 165–180
[Figure: L-shaped climate configuration of samples (circles) in the training set (Table 3), with x the climate variable to be calibrated and z another climate variable. Also indicated are the regions of the samples in evaluation set A and set C. Axes: x (0 to 100) and z. Labels: Training set, Set A, Set C, No analogue test set.]
Inverse versus classical methods; method-dependent bias in the leave-one-out error estimate. Comparison of the prediction error of inverse (WA-PLS and k-NN) and classical (MLM) approaches in the training set (t) and the three evaluation sets (B, A and C). Set B is a five-fold replication of t, set A is a subset of t and set C is an extrapolation set. The data are from simulation series 3 of Ref  in which species composition is governed by two climate variables (x and z) with an intermediate amount of unimodality (R_x = R_z = 1). Numbers are geometric means of root mean squared errors of prediction of x in four replications. The coefficient of variation of each mean is ca. 10%. The coefficient of variation of the ratio of two means within a column is ca. 15%. The range of x is [0, 100]. The number in superscript is the range of the optimal number of components in WA-PLS and of the optimal number of nearest neighbours in k-NN in the four replicates. k-NN uses Eq. (3) & (5). (a) Significant difference (P<0.01) between leave-one-out validation and validation by the independent evaluation set B.
Table layout (numerical values not preserved in this transcript): columns are Set t, B, A, C; rows are Inverse approach: WA-PLS, k-NN (a); Classical approach: MLM w.r.t. x, MLM w.r.t. x and z. ter Braak 1995
6. General conclusions from simulated data experiments: WA, WA-PLS, maximum likelihood, and MAT can all perform poorly under no-analogue conditions, and no one method performs consistently better than the others. For strong extrapolation, WA performed best. WA-PLS appears to deteriorate more quickly than WA with increasing extrapolation. Hutson (1977): in no-analogue conditions WA outperformed inverse linear regression and PCR. It is therefore important to assess the analogue status of fossil samples as well as to choose the ‘best’ training set in terms of RMSEP, bias, etc.
7. Multiple-analogue problem – try to avoid!
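One common way to assess the analogue status of fossil samples is to compute, for each fossil sample, the dissimilarity to its closest modern sample. The sketch below assumes squared chord distance as the dissimilarity measure, which is a widely used choice but is not specified on these slides; the function name `min_squared_chord` is hypothetical.

```python
import numpy as np

def min_squared_chord(fossil, modern):
    """Smallest squared chord distance from each fossil sample to any modern sample.

    fossil : (n_f, m) array of abundances (counts or percentages)
    modern : (n_m, m) array for the same m taxa in the same column order
    Rows are rescaled to proportions first, so the distance lies in [0, 2].
    """
    f = np.sqrt(fossil / fossil.sum(axis=1, keepdims=True))
    m = np.sqrt(modern / modern.sum(axis=1, keepdims=True))
    # Pairwise distances, shape (n_f, n_m), then the minimum over modern samples
    d = ((f[:, None, :] - m[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1)
```

Fossil samples whose minimum distance is large relative to the distances found among the modern samples themselves are candidates for a no-analogue flag; where to put that cut-off is a judgement call and is deliberately left out of the sketch.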
MULTI-PROXY APPROACHES Swiss surface pollen samples – lake sediments. Selected trees and shrubs
Swiss surface lake sediments Selected herbs and pteridophytes
WA-PLS Modern Swiss pollen-climate RMSEP 1.25ºC 1.03ºC 194 mm
Gerzensee, Bernese Oberland, Switzerland
Lotter et al PB-O YD-PB Tr YD AL-YD Tr G-O Gerzensee
Lotter et al. Pollen, O16/O18, chydorids. Gerzensee
Birks & Ammann 2000
USE OF METHODS
Marine studies (foraminifera, diatoms, coccoliths, radiolaria, dinoflagellate cysts)
- PCR, much MAT plus variants, some ANN and response surface uses, very few WA or WA-PLS uses
Freshwater studies (diatoms, chironomids, ostracods, cladocera)
- WA or WA-PLS, very few MAT or ANN uses, no PCR uses
Terrestrial studies (pollen)
- MAT plus variants, response surfaces, a few ANN uses, increasingly more WA-PLS uses, no PCR uses
In comparisons using simulated and real data, WA and WA-PLS usually, but not always, outperform PCR and MAT.
Classical methods of Gaussian logit regression and calibration are rarely used (freshwater, terrestrial), but this is changing.
Some applications of artificial neural networks and few studies within a Bayesian framework. A Bayesian framework may be an important future research direction, but it presents very difficult and time-consuming computational problems.
Use weighted-averaging (WA) or weighted-averaging partial least squares (WA-PLS) regression. Takes account of % data, ignores zero values, assumes unimodal responses, can handle several hundred species, and gives transfer functions of high precision (0.8ºC), low bias, and high robustness.
Modern data: $X = g(Y_1, Y_2, Y_3, \ldots, Y_m)$
Fossil data: $X_f = g(Y_{f1}, Y_{f2}, Y_{f3}, \ldots, Y_{fm})$
g is our transfer function for X and Y.
Simple, ecologically realistic, and robust. WA is robust to spatial autocorrelation.
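To make the transfer function g concrete, here is a minimal sketch of simple two-way weighted averaging with inverse deshrinking. It is a hedged illustration, not the code behind any result quoted in this talk: the names `wa_fit` and `wa_predict` are hypothetical, tolerance down-weighting and WA-PLS components are omitted, and every taxon is assumed to occur in at least one training sample.

```python
import numpy as np

def wa_fit(Y, x):
    """Fit a simple WA transfer function on modern data.

    Y : (n, m) array of taxon abundances (e.g. percentages)
    x : (n,) array of the environmental variable
    Returns taxon optima and the inverse-deshrinking coefficients (b0, b1).
    """
    if np.any(Y.sum(axis=0) == 0):
        raise ValueError("every taxon must occur in at least one sample")
    # WA regression: optimum of taxon k is the abundance-weighted mean of x;
    # zero abundances contribute nothing to the weighted average
    optima = (Y * x[:, None]).sum(axis=0) / Y.sum(axis=0)
    # WA calibration on the training set: weighted mean of optima per sample
    x_init = (Y * optima).sum(axis=1) / Y.sum(axis=1)
    # Inverse deshrinking: regress observed x on the initial estimates
    b1, b0 = np.polyfit(x_init, x, 1)
    return optima, b0, b1

def wa_predict(Y_new, optima, b0, b1):
    """Apply the fitted transfer function to fossil (or test) assemblages."""
    x_init = (Y_new * optima).sum(axis=1) / Y_new.sum(axis=1)
    return b0 + b1 * x_init
```

A typical use would be `optima, b0, b1 = wa_fit(Y_modern, x_modern)` followed by `wa_predict(Y_fossil, optima, b0, b1)`, with the fossil matrix holding the same taxa in the same column order as the modern one.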
IMPLICATIONS FOR CURRENT PROJECTS
1. We have the statistical tools for transfer function work.
2. Need high-quality training sets and associated environmental data:
i. consistent detailed taxonomy
ii. comparable methods and quality
iii. comparable sedimentary environment
iv. good environmental data
3. Need to minimise ‘noise’ as much as possible (marine vs freshwater systems).
4. Need care in design of sampling along primary environmental gradients. Problems of secondary gradients.
5. Unless one can use linear-based methods (problems of getting enough samples along short gradients), the gradient in the training set should extend about 1–1.5 SD beyond either end of the environmental variable’s range represented in the fossil data (to avoid end-of-gradient problems).
6. Major problems of spatial autocorrelation in evaluating models unless the test data are spatially independent of the training set. This is usually not possible, in which case RMSEP estimates based on cross-validation are over-optimistic and misleading. The same spatial autocorrelation problems arise in deriving sample-specific prediction errors by bootstrapping, as it assumes that all samples are independent. Much work is still needed on the problems of spatial autocorrelation, not only in transfer functions but also in much climate research.
POTENTIAL PALAEOECOLOGICAL APPLICATIONS OF RECONSTRUCTIONS
Palaeoecological data (e.g. pollen, diatom, plant macrofossil, chironomid) can be interpreted in at least two ways:
1. Descriptive – descriptive and narrative approaches in which the emphasis is on reconstruction of past biota, past populations, past communities, past landscapes, and past environment. Primarily a hypothesis-generation approach.
2. Causal – causes or forcing functions of observed changes or observed stability. Analytical approach. Primarily a hypothesis-testing approach.
Analogous to exploratory data analysis and confirmatory data analysis in statistics.
Causal interpretations in palaeoecology are not easy, because it is all too easy to fall into circular arguments, e.g. pollen is used to reconstruct past climate, and the climate reconstruction is then used to explain changes in pollen stratigraphy. Great need for rigorous approaches to both reconstructions and causal interpretation.
Quantitative palaeoenvironmental reconstructions in the context of palaeoecology are not really an end in themselves (in contrast to palaeoclimatology) but they are a means to an end. Use the reconstructions based on one proxy to provide an environmental history against which observed biological changes in another, independent proxy can be viewed and interpreted as biological responses to environmental change.
Minden Bog, Michigan. Booth & Jackson 2003 Major change 1000 years ago towards drier conditions, decline in Fagus and rise in Pinus in charcoal Climate vegetation fire frequency Black portions = wet periods, grey = dry periods
Central New England, eastern USA Environmental proxies – hydrogen isotope ratios as temperature proxy (low values indicate colder temperatures) - lake levels indicate moisture balance See major pollen changes coincide with climatic transitions Climate control of vegetational composition at millennial scales Shuman et al. 2004
These approaches involving environmental reconstructions independent of the main fossil record can be used as a long-term ecological observatory or laboratory to study long-term ecological dynamics under a range of environmental conditions, not all of which exist on Earth today (e.g. lowered CO2 concentrations, low human impact). Can begin to study the Ecology of the Past. Exciting prospect, many potentialities in future research.
ACKNOWLEDGEMENTS Key Figures in Transfer-Function Statistical Research: John Imbrie, Cajo ter Braak, Tom Webb, Svante Wold, Steve Juggins, Richard Telford
One cannot do transfer-function research without high quality data and these need skilled palaeoecologists. Many colleagues have contributed to the development of transfer functions by creating superb modern-environmental data sets: Heikki Seppä, Andy Lotter, Oliver Heiri, Steve Brooks, Viv Jones, Ulrike Herzschuh
Many other people have contributed to the work on which this talk is based or to the preparation of this presentation: Hilary Birks, Sylvia Peglar, Rick Battarbee, John Boyle, Arvid Odland, Gaute Velle, Kathy Willis, Cathy Jenks