Modeling Menstrual Cycle Length in Pre- and Peri-Menopausal Women Michael Elliott Xiaobi Huang Sioban Harlow University of Michigan School of Public Health.


Modeling Menstrual Cycle Length in Pre- and Peri-Menopausal Women Michael Elliott Xiaobi Huang Sioban Harlow University of Michigan School of Public Health September 30, 2008

Outline: Has the onset of menopause changed since the early 20th century? Tremin I: U. Minnesota undergraduates, 1930's. Tremin II: U. Minnesota undergraduates, 1960's. Develop a statistical model for menstrual cycle length. Bayesian methods. Apply to complete data from Tremin I. Future work: accounting for missing data (hormone use, dropout); including Tremin II data; relating to existing suggestions for FMP markers (60 days, 90 days, etc.).

Modeling Menstrual Cycle Length: Observed Data. Characterized by: a stable trend during a woman’s 20’s and 30’s; a “breakdown” several months to several years before FMP; an increase in variability; an increase in mean length.

Modeling Menstrual Cycle Length: Observed Data Easier to see on a log scale:

Modeling Menstrual Cycle Length: Linear Changepoint Model. There appears to be a common pattern to how menstrual cycle length changes with age. A linear changepoint model can be implemented as a linear spline with one changepoint, θ.

Modeling Menstrual Cycle Length: Linear Changepoint Model. The slopes and intercepts differ on either side of θ, but the means converge at the “knot”, θ.
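As a minimal sketch of the one-changepoint linear spline described above, the function below evaluates a mean that follows one slope before θ and another after it while staying continuous at the knot. The function name, the argument names, the age origin of 35, and the parametrization itself are assumptions made for illustration (the formula shown on the slide was not preserved), and the example values are hypothetical.

```python
def changepoint_mean(age, intercept, slope_before, slope_after, theta, origin=35.0):
    """Piecewise-linear mean that is continuous at the changepoint theta.

    Assumed parametrization (not taken from the slides):
      age <= theta: intercept + slope_before * (age - origin)
      age >  theta: value at theta + slope_after * (age - theta)
    """
    before = min(age, theta) - origin   # years accumulated at the first slope
    after = max(age - theta, 0.0)       # years accumulated at the second slope
    return intercept + slope_before * before + slope_after * after


# Hypothetical values on the log-days scale.
for age in (35, 40, 47, 48, 50):
    print(age, round(changepoint_mean(age, 3.30, -0.004, 0.26, 47.0), 3))
```

Because both pieces share the value at θ, the curve has an "elbow" at the knot but no jump.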

Modeling Menstrual Cycle Length: Linear Changepoint Model. The variance can be modeled via a linear changepoint model, just like the mean. Note that the changepoint(s) θ are estimated from the data, not fixed in advance.

Modeling Menstrual Cycle Length: Hierarchical Model. Although the general overall pattern is the same, women have unique ages at which their cycle lengths begin to change, as well as different means and variances at “baseline”. This suggests constructing a hierarchical model in which each woman has her own parameters governing the mean and variance changepoint models.

Modeling Menstrual Cycle Length: Hierarchical Model. Start at age 35 and take the log of cycle length to improve the normal approximation. A linear changepoint model is used for both the mean and the variance of the log length in days of the tth menstrual cycle for the ith woman, as a function of her age in years at the start of her tth cycle.
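The data-level model above can be sketched by simulation. This is only an illustration under stated assumptions: the log of each cycle length is drawn from a normal distribution whose mean and log-variance each follow a one-changepoint spline in age. Modeling the variance on the log scale, the helper names `spline` and `simulate_cycles`, and all parameter values are assumptions, not the authors' specification.

```python
import math
import random


def spline(age, intercept, slope_before, slope_after, theta, origin=35.0):
    """Continuous piecewise-linear function of age with one changepoint."""
    return (intercept
            + slope_before * (min(age, theta) - origin)
            + slope_after * max(age - theta, 0.0))


def simulate_cycles(start_age, end_age, mean_par, logvar_par, rng):
    """Simulate consecutive cycle lengths (days) for one hypothetical woman.

    mean_par and logvar_par are (intercept, slope_before, slope_after, theta)
    tuples for the mean of log length and for the log of its variance.
    """
    age, cycles = start_age, []
    while age < end_age:
        mu = spline(age, *mean_par)
        sd = math.exp(0.5 * spline(age, *logvar_par))
        length = math.exp(rng.gauss(mu, sd))   # log length is normal
        cycles.append((age, length))
        age += length / 365.25                 # next cycle starts when this one ends
    return cycles


rng = random.Random(1)
# Hypothetical parameter values for illustration only.
mean_par = (3.30, 0.0, 0.26, 47.0)
logvar_par = (math.log(0.0088), math.log(1.09), math.log(3.05), 45.2)
cycles = simulate_cycles(35.0, 48.0, mean_par, logvar_par, rng)
print(len(cycles), "cycles simulated")
```

Plotting the simulated lengths against age should show the qualitative pattern described earlier: a stable stretch followed by growing variability and then longer cycles.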

Modeling Menstrual Cycle Length: Hierarchical Model. Each of the individual parameters is then assumed to follow a common distribution. This allows information about cycle parameters to be shared across women, accommodates “within-woman” correlation in cycle lengths, and relates cycle parameters to baseline covariates via regression coefficients.
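A rough sketch of the population level of such a hierarchy: each woman's parameters are drawn from a shared distribution whose mean depends on a baseline covariate. For brevity only two parameters are drawn (the mean and variance changepoints), made correlated by hand. The function name, the single binary covariate, and every numeric value are hypothetical; the model in the talk covers all subject-level parameters jointly.

```python
import math
import random


def draw_subject_changepoints(nulliparous, rng,
                              base=(47.0, 45.2), effect=(0.1, 0.2),
                              sd=(2.0, 2.0), rho=0.8):
    """Draw one woman's (mean changepoint, variance changepoint) in years.

    Each population mean is base + effect * covariate, and the two parameters
    are correlated with coefficient rho. All default values are hypothetical.
    """
    m1 = base[0] + effect[0] * nulliparous
    m2 = base[1] + effect[1] * nulliparous
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    theta_mean = m1 + sd[0] * z1
    # Mixing z1 into the second draw induces correlation rho between the two.
    theta_var = m2 + sd[1] * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return theta_mean, theta_var


rng = random.Random(2)
draws = [draw_subject_changepoints(nulliparous=0, rng=rng) for _ in range(5000)]
print(draws[:3])
```

Sharing one distribution across women is what lets sparse information on one subject borrow strength from the others.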

Bayesian Models. Considering a model of this form is easier from a Bayesian perspective. The classical or “frequentist” approach to statistics considers the observed data y to be random, governed by fixed (unknown) parameters θ. Determine the joint distribution of y given θ, f(y | θ), and consider it as a function of θ: the likelihood L(θ | y). A point estimate of θ is made by maximizing L(θ | y) or log L(θ | y). Inference about θ is made by considering the repeated sampling properties of y for different θ. Ex: a 95% confidence interval is constructed so that, across hypothetical repeated samples, 95% of such intervals will contain θ.

Bayesian Models. Bayesian statistics also models f(y | θ), but considers θ to have a probability distribution of its own, the prior p(θ). Prior information contained in p(θ) is updated from the data y to obtain a posterior distribution of θ: p(θ | y) ∝ f(y | θ) p(θ). Hierarchical models in turn model the parameters of the prior with “hyperprior” distributions.
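To make the prior-to-posterior update concrete, here is a small sketch for a toy problem (a normal mean with known standard deviation), not the cycle-length model itself. It multiplies likelihood by prior on a grid of θ values and compares the result with the closed-form answer for this conjugate case. The function names, data values, and prior settings are all hypothetical.

```python
import math


def grid_posterior_mean(y, sigma, prior_mean, prior_sd, lo, hi, steps=2000):
    """Posterior mean of a normal mean theta, computed on a grid.

    Posterior weight at each grid point = likelihood f(y | theta) * prior p(theta),
    evaluated on the log scale for numerical stability.
    """
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    log_post = []
    for theta in grid:
        log_lik = sum(-0.5 * ((v - theta) / sigma) ** 2 for v in y)
        log_prior = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
        log_post.append(log_lik + log_prior)
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


def conjugate_posterior_mean(y, sigma, prior_mean, prior_sd):
    """Closed-form posterior mean for the same normal-normal model."""
    precision = 1.0 / prior_sd ** 2 + len(y) / sigma ** 2
    return (prior_mean / prior_sd ** 2 + sum(y) / sigma ** 2) / precision


y = [3.25, 3.31, 3.35, 3.28]   # hypothetical log cycle lengths
print(grid_posterior_mean(y, 0.1, 3.3, 1.0, 2.5, 4.1))
print(conjugate_posterior_mean(y, 0.1, 3.3, 1.0))
```

With a prior this diffuse relative to the data, the posterior mean sits very close to the sample mean, which is the sense in which a prior can be made weakly informative.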

Bayesian Models. Prior distributions or hyperprior distributions encode prior knowledge about parameters, but can be chosen to be very weakly informative if little prior information is available, or if it is to be ignored. Modern computational techniques such as Markov Chain Monte Carlo allow complex models such as the one used here to be fit (relatively) painlessly. Results are from 3,000 draws of a Gibbs sampler, with 1,000 draws discarded after “burn-in”.
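The mechanics of a Gibbs sampler with burn-in can be sketched on a far simpler model than the one in the talk: normal data with unknown mean and variance. The priors (flat on the mean, proportional to 1/σ² on the variance), the function name, and the simulated data are assumptions for illustration. The sketch reads "3,000 draws, 1,000 discarded" as 3,000 total iterations with 2,000 kept, which is one possible reading of the slide.

```python
import math
import random


def gibbs_normal(y, n_draws=3000, burn_in=1000, seed=0):
    """Gibbs sampler for the mean and variance of normal data.

    Assumes a flat prior on mu and p(sigma2) proportional to 1/sigma2, so that
      mu | sigma2, y  ~ Normal(ybar, sigma2 / n)
      sigma2 | mu, y  ~ Inverse-Gamma(n / 2, sum((y - mu)^2) / 2)
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = sum(y) / n
    mu, sigma2 = ybar, 1.0          # arbitrary starting values
    kept = []
    for it in range(n_draws):
        mu = rng.gauss(ybar, math.sqrt(sigma2 / n))
        ss = sum((v - mu) ** 2 for v in y)
        # 1/sigma2 ~ Gamma(shape=n/2, rate=ss/2); gammavariate takes a scale.
        sigma2 = 1.0 / rng.gammavariate(n / 2.0, 2.0 / ss)
        if it >= burn_in:           # discard the burn-in iterations
            kept.append((mu, sigma2))
    return kept


data_rng = random.Random(3)
y = [data_rng.gauss(3.3, 0.1) for _ in range(50)]   # hypothetical log cycle lengths
draws = gibbs_normal(y)
mus = sorted(m for m, _ in draws)
post_mean = sum(mus) / len(mus)
lower, upper = mus[int(0.025 * len(mus))], mus[int(0.975 * len(mus)) - 1]
print(len(draws), post_mean, (lower, upper))
```

Posterior means and 95% intervals of the kind reported in the results are summaries of retained draws like these.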

Results: Fit the above model to the 106 women with complete data in Tremin I. Pregnancies, abortions. No hormone use or gaps in reporting. Subject-level parameters (random effects). Fit of predicted means and variances. Regression against parity, menarche, and means and standard errors at ages 25-29. Correlation of subject-level parameters.

Results: Subject-level estimates of trends
i: (3.33, 3.39); (-.014, -.006); .089 (.007, .303); 49.9 (47.2, 51.4); -4.8 (-5.3, -4.3); .01 ( ); .76 (.56, .99); 47.3 (46.5, 48.0)
i: (3.06, 3.23); .037 (-.010, .084); .279 (-.136, .727); 42.0 (41.4, 42.1); -3.6 (-4.1, -3.0); .52 (.36, .66); 1.27 (.19, 2.61); 42.0 (41.0, 42.1)

Results: Subject-level estimates of trends. Mean cycle length: posterior mean (95% posterior predictive interval). 2.5 and 97.5 percentiles: posterior mean (95% PPI).

Results: Subject-level estimates of trends. The one-changepoint model provides a reasonably good fit to the data. Subject-level differences in mean and variance trends appear to be captured. Uncertainty in the location of the changepoints is reflected in the smoothness of the “elbow” for the mean and variances.

Results: Subject-level estimates of trends Posterior means and 80% posterior predictive intervals for mean changepoint and variance changepoint. Variability in cycle length increases well in advance of increases in mean cycle length itself. Changepoints for some subjects are well-estimated, while there is a great deal of uncertainty for others.

Results: Population Mean for trend parameters (unadjusted)
Mean intercept: 3.30 (3.27, 3.32)
Mean slope before changepoint: (-.023, .015)
Mean slope after changepoint: .264 (.151, .425)
Changepoint for mean: 47.0 (46.4, 47.6)
Exp(Variance intercept): .0088 (.0067, .0112)
Exp(Variance slope before changepoint): 1.09 (1.02, 1.15)
Exp(Variance slope after changepoint): 3.05 (2.60, 3.68)
Variance changepoint: 45.2 (44.6, 45.8)
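A short worked example may help in reading the unadjusted estimates above. It assumes the mean parameters are on the log-days scale (the model takes the log of cycle length) and that the Exp(variance slope) entries act as per-year multipliers on the variance; the second point is an interpretation, not something the slides state. Variable names are illustrative.

```python
import math

# Estimates quoted in the unadjusted results above.
mean_intercept = 3.30                      # mean log cycle length at age 35
mean_intercept_interval = (3.27, 3.32)
var_intercept = 0.0088                     # already exponentiated in the results
var_ratio_before = 1.09                    # Exp(variance slope before changepoint)
var_ratio_after = 3.05                     # Exp(variance slope after changepoint)

# Back-transform the intercept from log-days to days.
typical_days = math.exp(mean_intercept)
interval_days = tuple(math.exp(b) for b in mean_intercept_interval)

# Standard deviation of log length at age 35.
sd_log_scale = math.sqrt(var_intercept)

print("typical cycle length at 35 (days):", round(typical_days, 1))
print("interval (days):", tuple(round(d, 1) for d in interval_days))
print("SD of log length at 35:", round(sd_log_scale, 3))
print("variance change per year before changepoint: x", var_ratio_before)
print("variance change per year after changepoint:  x", var_ratio_after)
```

On this reading, the intercept corresponds to a typical cycle of roughly 27 days at age 35, with variance growing by about 9% per year before the variance changepoint and roughly tripling per year after it.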

Results: Population Mean for trend parameters (adjusted for parity, age at menarche, and mean and variance of cycles at age 25-29)
Parameter | Intercept | Nulliparous | Menarche | Mean 25-29 | SD
Mean intercept | 3.28 (3.22, 3.31) | -.00 (-.02, .02) | .02 (-.03, .07) | .023 (.015, .029) | (-.007, -.003)
Mean slope before changepoint | (-.044, .028) | .000 (-.016, .012) | (-.050, .035) | (-.008, .004) | .000 (-.002, .002)
Mean slope after changepoint | .307 (.135, .446) | (-.061, .018) | .006 (-.126, .098) | (-.024, .009) | .001 (-.004, .005)
Changepoint for mean | (46.11, 48.03) | .09 (-.31, .38) | -.02 (-1.16, .81) | .15 (-.01, .27) | (-.001, -.000)
Exp(Variance intercept) | .0091 (.0054, .0133) | 1.09 (.87, 1.23) | .920 (.507, 1.395) | (.973, 1.123) | .972 (.928, 1.006)
Exp(Variance slope before changepoint) | 1.00 (.89, 1.09) | 1.00 (.95, 1.03) | (.973, 1.228) | .984 (.969, .997) | (.998, 1.007)
Exp(Variance slope after changepoint) | 3.20 (2.41, 3.95) | .96 (.86, 1.04) | .946 (.691, 1.197) | (.981, 1.054) | .995 (.982, 1.004)
Variance changepoint | (43.65, 45.53) | .172 (-.235, .484) | .58 (-.57, 1.47) | .181 (.018, .302) | (-.087, .005)

Results: Population Mean for trend parameters (adjusted for parity, age at menarche, and mean and variance of cycles at age 25-29). Parity is not associated with cycle structure. There is weak evidence that earlier menarche is associated with increasing variability before the changepoint. Higher historical mean: higher mean at 35; decline in variability before the changepoint; later changepoints for both mean and variance. Higher historical variance: lower mean at 35; earlier changepoints for both mean and variance.

Results: Correlations among random effects. Correlation matrix (rows and columns): Mean intercept; Mean slope before changepoint; Mean slope after changepoint; Changepoint for mean; Exp(Variance intercept); Exp(Variance slope before changepoint); Exp(Variance slope after changepoint); Variance changepoint.

Results: Correlations among random effects Longer cycles at age 35 are associated with somewhat later changepoints in both mean and variance. Highly variable cycles at age 35 are associated with slower increases/declines in variability before their variability changepoint, but more rapid increases thereafter. Later mean changepoints are associated with slower increases or even declines in variability before their variability changepoint, and slower increases thereafter. Later mean changepoints are strongly associated with later variance changepoints.

Next Steps: Modeling Account for missing data (hormone, dropout). Impute missing cycles under model, and then obtain draws from posterior distribution of parameters conditional on observed and imputed data. Use results from alternative non-model-based approaches

Next Steps: Modeling. Model checking: the “eyeball” approach shows good fit. Formalize with posterior predictive checks. Generate predictive data under the model and compute posterior distributions of statistics of interest (chi-square measures, etc.). Compare with the posterior distribution of the statistics using the observed (fixed) data.
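A posterior predictive check along these lines can be sketched as follows. This is a simplified version under stated assumptions: the model is a plain normal distribution for log cycle length, the posterior draws are stand-ins generated for the example rather than output from a fitted model, and the statistic depends on the data only. The function name and all values are hypothetical.

```python
import random


def posterior_predictive_pvalue(y_obs, draws, statistic, rng):
    """Share of replicated data sets whose statistic is >= the observed one.

    draws is a list of (mu, sigma) posterior draws for a normal model; one
    replicated data set of the same size as y_obs is generated per draw.
    """
    observed = statistic(y_obs)
    exceed = 0
    for mu, sigma in draws:
        y_rep = [rng.gauss(mu, sigma) for _ in y_obs]
        if statistic(y_rep) >= observed:
            exceed += 1
    return exceed / len(draws)


rng = random.Random(4)
y_obs = [3.25, 3.31, 3.35, 3.28, 3.90]                      # hypothetical log cycle lengths
draws = [(rng.gauss(3.4, 0.05), 0.25) for _ in range(500)]  # stand-in posterior draws
print(posterior_predictive_pvalue(y_obs, draws, max, rng))
```

A value near 0 or 1 would suggest the model rarely reproduces the chosen feature of the observed data; values in between give no evidence of misfit on that feature.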

Next Steps: Analysis. Include Tremin II data. Add as a covariate to the population model. Assess secular trends pre- and post-birth control use.

Next Steps: Analysis Relate to existing suggestions for FMP markers (60 days, 90 days, etc.). Consider predictive value of model for pre-FMP subjects. “Cross-validation” with Tremin data. Validation with other data sources.

Next Steps: Analysis Look for cycle behavior that might be reflective of disease