State-Space Models for Within-Stream Network Dependence. William Coar, Department of Statistics, Colorado State University. Joint work with F. Jay Breidt.

State-Space Models for Within-Stream Network Dependence. William Coar, Department of Statistics, Colorado State University. Joint work with F. Jay Breidt. This research is funded by the U.S. EPA Science To Achieve Results (STAR) Program, Cooperative Agreement # CR

Disclaimer: The work reported here was developed under the STAR Research Assistance Agreement CR awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the presenter and STARMAP, the Program (s)he represents. EPA does not endorse any products or commercial services mentioned in this presentation.

Outline: Introduction to the problem; evolution of state-space models; likelihood; missing data; Kalman recursions; EM algorithm; simulation example; future work.

Consider a simple stream network: Two upstream reaches merge to create a downstream reach, which suggests a natural dependency on upstream reaches. Autocorrelation can arise from water flowing from reach to reach, and the reaches have a logical ordering in space. [Figure: stream network with reaches Y1 to Y7; flow direction labeled "Downstream".]
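The "logical ordering in space" can be made concrete with a small data structure. The sketch below is illustrative only: the figure did not survive in the transcript, so the topology (reaches 1 and 2 merging into 5, reaches 3 and 4 into 6, reaches 5 and 6 into 7) and the names UPSTREAM and downstream_order are assumptions, not the presenter's notation.

```python
# Hypothetical topology: the slide's figure is not recoverable, so this layout is assumed.
UPSTREAM = {
    1: (), 2: (), 3: (), 4: (),  # first-order (headwater) reaches
    5: (1, 2),                   # reaches 1 and 2 merge into reach 5
    6: (3, 4),                   # reaches 3 and 4 merge into reach 6
    7: (5, 6),                   # reaches 5 and 6 merge into reach 7
}


def downstream_order(upstream):
    """Order reaches so that each one appears after all of its upstream reaches."""
    order, placed = [], set()

    def visit(reach):
        if reach in placed:
            return
        for parent in upstream[reach]:
            visit(parent)  # place upstream contributors first
        placed.add(reach)
        order.append(reach)

    for reach in sorted(upstream):
        visit(reach)
    return order


ORDER = downstream_order(UPSTREAM)
print(ORDER)
```

Any recursion that moves downstream (such as a filter) can walk this list front to back, and any recursion that moves upstream (such as a smoother) can walk it back to front.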

The Beginnings: Express a measurement on a reach in terms of its upstream contributors. [Equation and symbol definitions not preserved in the transcript.]

The Beginnings: This is also the modified Cholesky decomposition of Σ⁻¹. For any Y ~ (µ, Σ), there exists a unit lower triangular matrix T with corresponding diagonal matrix D such that T(Y − µ) = Z, where Z ~ (0, D). Simplifying T can allow for dependencies similar to autoregressive structures in time series; i.e., in the simple example, a measurement depends only on its two immediate upstream neighbors. This is suggestive of a more general state-space model. [Figure: the stream network with reaches Y1 to Y7.]
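To see what the modified Cholesky decomposition produces, here is a minimal pure-Python sketch. It uses a three-variable chain with an AR(1)-type covariance rather than the stream network, purely for readability; the function name modified_cholesky and the example matrix SIGMA are assumptions made for illustration. Each row of T holds the negated coefficients from regressing a variable on its predecessors, and D holds the corresponding prediction error variances.

```python
def modified_cholesky(sigma):
    """Return (T, d): T unit lower triangular with T * sigma * T' = diag(d)."""
    n = len(sigma)
    # Step 1: factor sigma = L * diag(d) * L' with L unit lower triangular.
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            partial = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - partial) / d[j]
    # Step 2: T is the inverse of L, built row by row from L * T = I.
    T = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i):
            T[i][j] = -sum(L[i][k] * T[k][j] for k in range(j, i))
    return T, d


# Illustrative covariance: correlation 0.5 ** |i - j|, unit variances.
SIGMA = [[1.0, 0.5, 0.25],
         [0.5, 1.0, 0.5],
         [0.25, 0.5, 1.0]]
T, D = modified_cholesky(SIGMA)
for row in T:
    print(row)
print(D)
```

For this covariance the third row of T has a zero in its first position, so the third variable depends only on its immediate predecessor. That sparsity is the kind of simplification of T the slide refers to.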

State-Space Model: Define a state-space representation by [equations not preserved in the transcript], with {W(t)} ~ N(0, {R(t)}), {V(t)} ~ N(0, {Q(t)}), and V(s) uncorrelated with W(t) for all s and t. Further assume that W(t) and V(t) are uncorrelated with all X(s1), where s1 is any first-order reach. [Figure: upstream reaches u1 and u2 merging into reach t.]
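The model equations were lost from the transcript, so the following is only a sketch of one form they could take, not the presenter's specification. It assumes scalar states, a state equation X(t) = φ·X(u1) + φ·X(u2) + V(t) with a single shared coefficient, an observation equation Y(t) = X(t) + W(t), constant variances q and r, and the same hypothetical seven-reach topology used for illustration. All names are assumptions.

```python
import random

# Hypothetical topology (assumed; the slide's figure is not recoverable).
UPSTREAM = {1: (), 2: (), 3: (), 4: (), 5: (1, 2), 6: (3, 4), 7: (5, 6)}


def simulate(upstream, phi, q, r, first_order_var, seed=0):
    """Simulate states x and observations y on a network, moving downstream.

    Assumed model (not taken from the slides):
      first-order reach:  X(t) ~ N(0, first_order_var)
      other reaches:      X(t) = phi * X(u1) + phi * X(u2) + V(t),  V(t) ~ N(0, q)
      observation:        Y(t) = X(t) + W(t),                       W(t) ~ N(0, r)
    """
    rng = random.Random(seed)
    x, y = {}, {}
    for t in sorted(upstream):  # labels are assumed to increase downstream
        ups = upstream[t]
        if not ups:
            x[t] = rng.gauss(0.0, first_order_var ** 0.5)
        else:
            x[t] = sum(phi * x[u] for u in ups) + rng.gauss(0.0, q ** 0.5)
        y[t] = x[t] + rng.gauss(0.0, r ** 0.5)
    return x, y


x, y = simulate(UPSTREAM, phi=0.5, q=0.25, r=0.1, first_order_var=1.0)
for t in sorted(x):
    print(t, round(x[t], 3), round(y[t], 3))
```

Because every noise term is drawn independently, V(s) is uncorrelated with W(t) and both are uncorrelated with the first-order states, which matches the stated assumptions.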

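Downstream Filter: Under Normality, the best mean square predictors are obtained in two steps. Predict X(t) given upstream information, then update with the observed information from Y(t). [Prediction and update equations not preserved in the transcript.] [Figure: upstream reaches u1 and u2 merging into reach t.]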
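A scalar version of the predict and update steps might look like the sketch below. It is not the presenter's recursion: it assumes scalar states, an observation equation Y(t) = g·X(t) + W(t), and uncorrelated filtering errors at the two upstream reaches (reasonable when the two upstream subtrees share no reaches and the first-order states are independent). The function and argument names are illustrative.

```python
def predict(phi1, phi2, xf1, pf1, xf2, pf2, q):
    """Predict X(t) from the filtered values at upstream reaches u1 and u2.

    Assumes the two upstream filtering errors are uncorrelated, so their
    variances add without a cross term.
    """
    x_pred = phi1 * xf1 + phi2 * xf2
    p_pred = phi1 ** 2 * pf1 + phi2 ** 2 * pf2 + q
    return x_pred, p_pred


def update(x_pred, p_pred, y, r, g=1.0):
    """Update the prediction with the observation y; y=None means missing."""
    if y is None:
        return x_pred, p_pred, None, None  # nothing to update with
    innovation = y - g * x_pred            # one-step prediction error
    s = g * g * p_pred + r                 # innovation variance
    gain = p_pred * g / s                  # scalar Kalman gain
    x_filt = x_pred + gain * innovation
    p_filt = p_pred - gain * g * p_pred
    return x_filt, p_filt, innovation, s


x_pred, p_pred = predict(0.5, 0.5, xf1=1.0, pf1=1.0, xf2=3.0, pf2=1.0, q=0.5)
print(update(x_pred, p_pred, y=3.0, r=1.0))
```

The innovation and its variance returned by update are the quantities the likelihood is built from.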

Likelihood: Use the innovations and their variances from the downstream filter. When data are available for every reach in the network, the likelihood is easily expressed in terms of these innovations [equation not preserved in the transcript], where n is the total number of reaches in the stream network.
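The likelihood expression did not survive in the transcript. As a hedged illustration, the standard Gaussian innovations form for scalar observations is log L = −(1/2) Σ [log(2π·S(t)) + I(t)²/S(t)], where I(t) is the innovation at reach t and S(t) its variance; the sketch assumes that form and uses illustrative names.

```python
import math


def innovations_loglik(innovations, variances):
    """Gaussian log-likelihood from scalar innovations and their variances.

    Assumes independent Gaussian innovations, one per observed reach.
    """
    return -0.5 * sum(
        math.log(2.0 * math.pi * s) + i * i / s
        for i, s in zip(innovations, variances)
    )


# Three reaches with made-up innovations and innovation variances.
print(innovations_loglik([0.3, -1.1, 0.4], [1.0, 2.0, 0.5]))
```

With complete data the sum runs over all n reaches; passing only the reaches that were actually observed is one simple way to skip missing ones.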

EM Algorithm: The likelihood for missing data can be difficult to express. E-step: predict, update, and smooth based on the current estimates of the model parameters, then form an approximation to the likelihood by filling in the missing values with smoothed estimates. M-step: maximize the approximation to the likelihood to obtain new parameter estimates for the next iteration. Iterate until the revised parameter estimates stabilize. Since the log-likelihood does not decrease from one EM iteration to the next, the estimates should converge to the MLE.
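The fill-in iteration can be illustrated on a much simpler model than the stream network. The sketch below assumes a zero-mean chain Y(t) = φ·Y(t−1) + e(t) with isolated interior missing values; that setting, and the names fill_in_em and filled, are assumptions made for the example. The E-step replaces each missing value with its conditional mean given both neighbors, φ·(Y(t−1) + Y(t+1))/(1 + φ²), and the M-step re-estimates φ by least squares on the filled-in series. A full EM would also carry the conditional variance of each filled-in value into the M-step.

```python
def fill_in_em(y, phi=0.0, tol=1e-10, max_iter=500):
    """Approximate EM for phi in y[t] = phi * y[t-1] + e[t].

    y is a list of floats with None marking isolated interior missing values.
    Returns (phi, filled series, iterations used).
    """
    missing = [t for t, v in enumerate(y) if v is None]
    filled = [0.0 if v is None else v for v in y]
    for iteration in range(1, max_iter + 1):
        # E-step (approximate): smoothed value given both neighbors.
        for t in missing:
            filled[t] = phi * (filled[t - 1] + filled[t + 1]) / (1.0 + phi * phi)
        # M-step: least-squares estimate of phi on the filled-in series.
        num = sum(filled[t] * filled[t - 1] for t in range(1, len(filled)))
        den = sum(filled[t - 1] ** 2 for t in range(1, len(filled)))
        new_phi = num / den
        if abs(new_phi - phi) < tol:  # estimates have stabilized
            return new_phi, filled, iteration
        phi = new_phi
    return phi, filled, max_iter


phi_hat, filled, n_iter = fill_in_em([1.0, None, 0.25, 0.125])
print(phi_hat, filled, n_iter)
```

The stopping rule mirrors the slide's "iterate until revised parameter estimates stabilize".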

Upstream Smoother: Start with the very last reach in the network. Smooth two reaches at a time, using information from the filtered as well as the smoothed downstream values. Each estimate is the conditional expectation given observations from the entire network. The recursive relationship yields the smoothed estimates and their variances. [Smoothing equations not preserved in the transcript.] [Figure: upstream reaches u1 and u2 merging into reach t.]

Other Tree-Type Smoothers: View each reach as a parent that creates two children. Existing work includes Huang & Cressie (1997) and Chou (1994), for uptree filtering (fine to coarse) and downtree smoothing (coarse to fine) when modeling different resolutions. These methods assume that children are independent conditioned on the parent, an assumption that is violated in the stream network model considered here. [Figure: tree with nodes labeled Parent and Child.]

Example: [Figure: example stream network running from first-order reaches up in the mountains to a fifth-order reach on the plains; x = missing value.]

Consider a network that has 39 different reaches: 20 first order and 19 higher order. Let k be the Strahler order of reach t created by two reaches of order i and j. State-space representation: [model equations not preserved in the transcript]. Assumptions about V(t): Cov(V(s), V(t)) = 0 for s ≠ t, and Cov(V(t), X(s1)) = 0 for any first-order reach s1.
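The Strahler order follows a simple rule: a headwater reach has order 1, and a reach formed by two reaches of orders i and j has order i + 1 when i = j and max(i, j) otherwise. A minimal sketch, assuming every non-headwater reach has exactly two upstream reaches; the small network is hypothetical, not the 39-reach example.

```python
def strahler_orders(upstream):
    """Map each reach to its Strahler order.

    upstream maps a reach to () for a headwater, or to its two upstream reaches.
    """
    orders = {}

    def order_of(reach):
        if reach not in orders:
            ups = upstream[reach]
            if not ups:
                orders[reach] = 1  # first-order (headwater) reach
            else:
                i, j = (order_of(u) for u in ups)
                # Equal orders step up by one; otherwise the larger order carries on.
                orders[reach] = i + 1 if i == j else max(i, j)
        return orders[reach]

    for reach in upstream:
        order_of(reach)
    return orders


# Hypothetical network: four headwaters feed reach 7, which then meets headwater 8.
UPSTREAM = {1: (), 2: (), 3: (), 4: (), 5: (1, 2), 6: (3, 4),
            7: (5, 6), 8: (), 9: (7, 8)}
ORDERS = strahler_orders(UPSTREAM)
print(ORDERS)
```

Grouping reaches by the triple (k, i, j) of their own order and their two upstream orders is one way to see how few reaches inform each order-specific parameter.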

Parameter Estimation: A total of 12 parameters must be estimated based on 33 stream segments (6 missing values): 6 different φ parameters, 5 different (conditional) variances, and 1 variance parameter for the first-order reaches. Most parameters will be estimated with few observations, since only a few reaches contribute to estimating each φ. This suggests looking at parametric models for φ. A much larger stream network is needed to achieve more reasonable parameter estimates.

Kalman Recursions: Downstream filter (Y(t) = X(t)): the filtered value is either the observed Y(t) or its conditional expectation given the two immediate upstream filtered values. The variance is either 0 (if Y(t) is observed) or the prediction error variance of Y(t) given the two immediate upstream filtered values. Upstream smoother: smooth two at a time, Y(u1) and Y(u2). Each smoothed value is either the observed value or the conditional expectation of Y(ui) given all reaches with observed measurements. Both recursions require knowing the logical order of flow.
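The downstream filter for the case Y(t) = X(t) can be sketched in a few lines. This is an illustration under assumptions, not the presenter's code: a single shared coefficient phi, a single conditional variance q, uncorrelated filtering errors at the two upstream reaches, reach labels that increase downstream, and a hypothetical seven-reach network with made-up data.

```python
def downstream_filter(upstream, y, phi, q, first_order_mean, first_order_var):
    """Filter a network with Y(t) = X(t), where y[t] is None for a missing reach.

    Assumes the filtering errors at the two upstream reaches are uncorrelated.
    Returns dicts of filtered values and their variances.
    """
    xf, pf = {}, {}
    for t in sorted(upstream):  # labels are assumed to increase downstream
        ups = upstream[t]
        if y.get(t) is not None:
            xf[t], pf[t] = y[t], 0.0  # observed: no remaining uncertainty
        elif not ups:
            xf[t], pf[t] = first_order_mean, first_order_var
        else:
            # Missing: conditional expectation given the upstream filtered values.
            xf[t] = sum(phi * xf[u] for u in ups)
            pf[t] = sum(phi * phi * pf[u] for u in ups) + q
    return xf, pf


# Hypothetical network and data; reaches 3, 5, 6, and 7 are missing.
UPSTREAM = {1: (), 2: (), 3: (), 4: (), 5: (1, 2), 6: (3, 4), 7: (5, 6)}
Y = {1: 1.0, 2: 3.0, 3: None, 4: 2.0, 5: None, 6: None, 7: None}
XF, PF = downstream_filter(UPSTREAM, Y, phi=0.5, q=1.0,
                           first_order_mean=0.0, first_order_var=4.0)
for t in sorted(XF):
    print(t, XF[t], PF[t])
```

The loop relies on the order of flow: a reach can only be filtered after both of its upstream reaches have been.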

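Parameter Estimates: [Table: estimates of φ21, φ31, φ32, φ33, φ43, and φ54 at EM iterations 6, 7, and 8, alongside the true values, followed by a second panel for the same iterations and true values; the numeric entries were not preserved in the transcript.]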

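Smoothed Data Values: [Table: smoothed data values at EM iterations 6, 7, and 8, alongside the true values; the numeric entries were not preserved in the transcript.] More iterations in the EM algorithm. Better model for the coefficient parameters: plot estimates from regression against covariates (regressogram), then re-compute the MLE based on the new parametric model suggested by the regressogram.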

Future Work: Work with real data from larger networks. Obtain better initial estimates. Investigate EM convergence. Use reach-specific covariate information such as location within a reach, inflow from upstream reaches, etc. Consider state-space representations that allow for larger classes of models than the AR structure considered here. Allow for upstream measurements on the same reach.