Finite Population Inference for Latent Values Measured with Error from a Bayesian Perspective. Edward J. Stanek III, Department of Public Health, University of Massachusetts Amherst.

Presentation transcript:

1 Finite Population Inference for Latent Values Measured with Error from a Bayesian Perspective Edward J. Stanek III Department of Public Health University of Massachusetts Amherst, MA

2 Collaborators
Parimal Mukhopadhyay, Indian Statistical Institute, Kolkata, India
Viviana Lencina, Facultad de Ciencias Economicas, Universidad Nacional de Tucumán, CONICET, Argentina
Luz Mery Gonzalez, Departamento de Estadística, Universidad Nacional de Colombia, Bogotá, Colombia
Julio Singer, Departamento de Estatística, Universidade de São Paulo, Brazil
Wenjun Li, Department of Behavioral Medicine, UMASS Medical School, Worcester, MA
Rongheng Li, Shuli Yu, Guoshu Yuan, Ruitao Zhang, faculty and students in the Biostatistics Program, UMASS, Amherst

3 Outline
Review of Finite Population Bayesian Models
1. Populations, Prior, and Posterior
2. Notation
3. Example
4. Exchangeable distribution
Addition of Measurement Error
1. Latent values and Response
2. Heterogeneous variances
3. Prior distribution of response for a prior point (vector of latent values)
4. Prior and Data: matching the subjects in the data to random variables in the population
5. Subsets of prior points:
   i. for populations not including some subjects in the data
   ii. for populations including subjects in the data, where the sample doesn't include the subjects in the data
   iii. for populations and samples that include subjects in the data
6. Posterior points (corresponding to 5.iii)
7. Marginal posterior points (over measurement error among remaining subjects)

4 Bayesian Model: General Idea (Review)
[Diagram: Populations; Prior, with the number of prior populations and their prior probabilities; Data; Posterior, with the number of posterior populations and their posterior probabilities.]

5 Bayesian Model: Population Notation (Review)
[Notation table: population label, latent value, labels, parameter vector, data vector.]

6 Exchangeable Prior Bayesian Model, Example: H=3, N=3, n=2 (Review)
[Diagram: Populations, Prior, Data, Posterior.]

7 Exchangeable Prior Bayesian Model, Example: H=3, N=3, n=2 (Review)
[Diagram: Populations, Prior, Data, Posterior.]
Suppose the data is […].

8 Exchangeable Prior Bayesian Model, Example: H=3, N=3, n=2 (Review)
[Diagram: Populations, Prior, Data, Posterior; additional label: Prior.]

9 Exchangeable Prior Bayesian Model, Example: H=3, N=3, n=2 (Review)
[Diagram: Populations, Prior, Data, Posterior; additional labels: Prior, Posterior.]

10 Exchangeable Prior Populations: General Idea (Review)
When N=3, each permutation p of the subjects in L (i.e., each different listing) has a joint probability density, and these densities must be identical. Random variables with this property are exchangeable, and the shared density is their common distribution. In general notation, an exchangeable prior assigns (usually) equal probability to each permutation of subjects in the population.

11 Exchangeable Prior Populations, N=3: Potential Response for Each Listing of Subjects (Review)
[Table: listings; latent values for each listing; latent values for permutations of the listing.]

12 Exchangeable Prior Population: Permutations (Review)
[Figure: subjects Rose, Daisy, Lily; listing p=1.]

13 Exchangeable Prior Populations, N=3: Permutations (Review)
[Figure: subjects Rose, Daisy, Lily; listing p=1.]

14 Exchangeable Prior Populations, N=3 (Review)
[Figure: subjects Rose, Daisy, Lily; listing p=2.]

15 Exchangeable Prior Populations, N=3: Permutations of Listings (Review)
[Figure: listings p=1 through p=6.]
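
The six listings for N=3 can be enumerated directly. The following is a minimal Python sketch, assuming the subject labels Rose, Daisy, and Lily from the slides and assuming that the numbering p=1,…,6 follows the order produced by `itertools.permutations` (the slides may number the listings differently). Each listing receives the same prior probability, which is what exchangeability requires here.

```python
from fractions import Fraction
from itertools import permutations

subjects = ("Rose", "Daisy", "Lily")  # subject labels from the slides

# Each ordering of the N = 3 subjects is one listing p = 1, ..., N!.
# The numbering of listings follows itertools order, which is an assumption.
listings = list(permutations(subjects))

# Exchangeable prior: every listing receives the same probability 1/N!.
prior = {listing: Fraction(1, len(listings)) for listing in listings}

for p, listing in enumerate(listings, start=1):
    print(f"p={p}: {listing}  prob={prior[listing]}")
```

Exact fractions are used so that the probabilities sum to exactly one.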

16 Exchangeable Prior Populations, N=3: Potential Response for Each Listing of Subjects (Review)
[Table: listings; latent value vectors for permutations of each listing; potential response for the random variables for listing p.]
Circled points are equal and have equal probability for different listings.

17 Exchangeable Prior Populations, N=3: Permutations of Listings (Review)
[Figure: listings p=1 through p=6, marking the same point in each listing.]

18 Exchangeable Prior Populations, N=3: Potential Response for Each Listing of Subjects (Review)
[Table: listings; latent value vectors for permutations of each listing; potential response for the random variables for listing p; the same point is marked in each listing.]
Circled points are equal, have equal probability, and are the same point for different listings.

19 Bayesian Model: Link between Prior and Data (Review)
[Diagram: Populations, Prior, Data; number of prior populations.]
N=3; suppose n=2. Realizations of […] are the data.

20 Bayesian Model: Exchangeable Prior Populations, N=3 (Review)
[Figure: listing p=1; sample space for n=2 in the prior.]

21 Bayesian Model: Exchangeable Prior Populations, N=3: Sample Point, n=2 (Review)
[Figure: listings p=1 through p=6.]

22 Exchangeable Prior Populations, N=3: Sample Points (Review)
[Figure: subjects Rose, Daisy, Lily; listing p=1.]

23 Exchangeable Prior Populations, N=3: Sample Points When Listing p=1 (Review)
[Figure: sample space for n=2 in the prior when listing p=1.]

24 Exchangeable Prior Populations, N=3: Sample Points, n=2 (Review)
[Figure: listings p=1 through p=6; points with positive probability are marked.]

25 Exchangeable Prior Populations, N=3: Sample Points with Positive Probability, n=2 (Review)
[Figure: listings p=1 through p=6.]
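
The sample points for n=2 can be tabulated from the listings. This is a minimal Python sketch under two assumptions that are not stated in the slides: the latent values assigned to Rose, Daisy, and Lily are hypothetical, and the sample point for a listing is taken to be the latent values in its first n positions. With equally likely listings, each distinct sample point accumulates the probability of the listings that produce it.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Hypothetical latent values; the slides do not give numeric values.
latent = {"Rose": 2, "Daisy": 5, "Lily": 10}
n = 2  # sample size used in the slides

listings = list(permutations(latent))    # all N! orderings of the subjects
prob = Fraction(1, len(listings))        # equal probability per listing

# Assumption: the sample point is the latent values in the first n positions.
sample_points = Counter()
for listing in listings:
    point = tuple(latent[s] for s in listing[:n])
    sample_points[point] += prob

for point, p in sorted(sample_points.items()):
    print(point, p)
```

With N=3 and n=2 every ordered pair of subjects appears in exactly one listing, so each sample point carries the same probability.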

26 Exchangeable Prior Populations, N=3: Posterior Random Variables (Review)
[Diagram: Prior, Data.]
If permutations of subjects in listing p are equally likely:
The random variables representing the data are independent of the remaining random variables.
The expected value of the random variables for the data is the mean of the data.
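
The statement about the expected value can be checked numerically. This is a small Python sketch, assuming hypothetical data values and assuming that every ordering of the data is equally likely: averaging each position over all orderings gives the data mean at every position.

```python
from fractions import Fraction
from itertools import permutations

data = (2, 5)  # hypothetical realized data for n = 2 subjects
n = len(data)

# Every ordering of the data is assumed equally likely.
orderings = list(permutations(data))

# Expected value of the random variable in each position.
expected = [
    sum(Fraction(order[i]) for order in orderings) / len(orderings)
    for i in range(n)
]
data_mean = Fraction(sum(data), n)

print(expected, data_mean)
```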

27 Data without Measurement Error
The data are represented as a set and as vectors of latent values. Let […], k=1,…,n!, be a permutation matrix; the data are then a set of vectors, one for each permutation matrix. For simplicity, denote […] by […].

28 Data with and without Measurement Error
[Table comparing the two cases. No measurement error: latent values; data as vectors and as sets. With measurement error: potential response; data as vectors and as sets; realization at t.]

29 Data with Measurement Error
[…] is the realization of […] on occasion t; the latent value is unobserved. The data are represented as sets.
Assume:
Measurement errors are independent between any two subjects.
Measurement errors are independent when a subject is measured repeatedly.

30 Measurement Error Model
[Definitions: the data, as vectors; potential response; latent values; response error; variance.]
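
A simulation can illustrate the measurement error model. The sketch below is an assumption-laden Python example, not the model as written on the slide: it takes the response on occasion t to be the latent value plus an error, Y_st = y_s + E_st, with errors independent across subjects and occasions as the slides assume. Zero-mean normal errors, the latent values, and the subject-specific standard deviations are all hypothetical choices made for illustration.

```python
import random
from statistics import fmean, pvariance

# Hypothetical latent values and measurement-error standard deviations;
# the standard deviations differ by subject (heterogeneous variances).
latent = {"Rose": 2.0, "Daisy": 5.0, "Lily": 10.0}
error_sd = {"Rose": 0.5, "Daisy": 1.0, "Lily": 3.0}

def simulate_response(subject, occasions, rng):
    """Return responses Y_st = y_s + E_st for t = 1, ..., occasions."""
    # Errors are drawn independently for each subject and each occasion.
    # Zero-mean normal errors are an assumption made for this sketch.
    return [latent[subject] + rng.gauss(0.0, error_sd[subject])
            for _ in range(occasions)]

rng = random.Random(0)  # fixed seed so the run is reproducible
responses = {s: simulate_response(s, 20000, rng) for s in latent}

for s, y in responses.items():
    # The mean response approximates the latent value; the variance of the
    # response approximates the subject's measurement variance.
    print(s, round(fmean(y), 2), round(pvariance(y), 2))
```

Over many occasions the average response for a subject settles near the latent value, which is why repeated measurement carries information about the unobserved latent value.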

31 Measurement Error Model: Prior Random Variables
[Diagram: Populations, Prior; number of prior populations; prior probabilities. For population h: labels, parameter, latent values, measurement variance.]
[…] defines the axes for points in the prior and the measurement variance; […] indicates the initial order.
Assume the random variables representing a population are exchangeable.

32 Exchangeable Prior Populations, N=3
[Figure: subjects Rose, Daisy, Lily; a single point.]

33 Measurement Error Model: Prior Random Variables
[Diagram: population h in the prior; number of prior populations; vectors and sets.]
Assume the random variables representing a population are exchangeable. When p=1, define […].

34 Prior Random Variables and Data with Measurement Error
[Diagram: Prior, Data.]
Assume the random variables representing a population are exchangeable in each population. If permutations of subjects in listing p are equally likely: […], since […] or initial listing p=1.

35 Prior Random Variables and Data with Measurement Error
[Diagram: Prior, Data. The prior random variables are split into those that will correspond to latent values for subjects in the data and the remaining prior random variables.]

36 Prior Random Variables and Data with Measurement Error
[Diagram: Prior, Data. The prior random variables are split into those that will correspond to latent values for subjects in the data and the remaining prior random variables.]

37 Consider Possible Populations in the Posterior
[Diagram: Prior, Data.]
Set of subjects in the population: […]. Set of subjects in the data: […].
Populations possible in the posterior: only populations that include the set of subjects in the data.

38 Possible Points for Possible Populations in the Posterior
[Diagram: Prior, Data; some populations are marked as not in the posterior.]
Set of subjects in the population: […]. Set of subjects in the data: […].
Populations possible in the posterior: only populations that include the set of subjects in the data.
For a possible population, possible points in the posterior: only points where the set of subjects in the data matches the subjects representing the point corresponding to the data in the prior.
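
The two filtering steps can be sketched in Python. Everything concrete here is hypothetical: the three prior populations, the extra subject label Violet, and the convention that the data correspond to the first n positions of a listing. The sketch first keeps the populations that contain every subject in the data, then keeps, within each retained population, the listings whose first n positions hold exactly those subjects.

```python
from itertools import permutations

# Hypothetical prior populations (H = 3), each a tuple of N = 3 subject labels.
# "Violet" is an invented label used only for illustration.
prior_populations = {
    1: ("Rose", "Daisy", "Lily"),
    2: ("Rose", "Daisy", "Violet"),
    3: ("Daisy", "Lily", "Violet"),
}
data_subjects = {"Rose", "Daisy"}  # subjects observed in the data
n = len(data_subjects)

# Step 1: keep only populations that include every subject in the data.
possible_populations = {
    h: subjects
    for h, subjects in prior_populations.items()
    if data_subjects <= set(subjects)
}

# Step 2: within each retained population, keep only the points (listings)
# whose first n positions hold exactly the subjects in the data.
posterior_points = {
    h: [p for p in permutations(subjects) if set(p[:n]) == data_subjects]
    for h, subjects in possible_populations.items()
}

for h, points in posterior_points.items():
    print(h, points)
```

Population 3 is dropped because it lacks Rose; each retained population keeps n! = 2 listings, one for each ordering of the two observed subjects.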

39 Possible Points for Possible Populations with Measurement Error
[Diagram: Prior, Data; possible populations in the posterior; initial order k=1; labels, latent values, measurement variances; data and potential response.]
The set of subjects in the data must match the set of subjects representing the point corresponding to the data in the prior.

40 Possible Points for Possible Populations with Measurement Error
[Diagram: Prior, Data; possible populations in the posterior; initial listing, initial order k=1; labels, latent values, measurement variances; data and potential response.]
Define […].
The set of subjects in the data must match the subjects representing the point corresponding to the data in the prior.

41 Possible Points for Possible Populations with Measurement Error
[Diagram: Prior, Data; possible populations in the posterior; initial order k=1; labels, latent values, measurement variances; data and potential response.]
Define […].
The set of subjects in the data must match the subjects representing the point corresponding to the data in the prior.

42 Possible Points for Possible Populations with Measurement Error
[Diagram: Prior, Data; possible populations in the posterior; initial order k=1; labels, latent values, measurement variances; data and potential response.]
The set of subjects in the data must match the subjects representing the point corresponding to the data in the prior.

43 Possible Points for Possible Populations with Measurement Error
[Diagram: Prior, Data; possible populations in the posterior; possible points in the population; labels, latent values, measurement variances; data and potential response.]
The set of subjects in the data must match the subjects representing the point corresponding to the data in the prior. Points in the prior where the subjects match the data are the points in the posterior.

44 Possible Points for Possible Populations with Measurement Error
[Diagram: possible populations in the posterior; possible points in the population, marked as points in the posterior or points not in the posterior; labels, latent values, measurement variances; data and potential response.]

45 Possible Points for Possible Populations with Measurement Error
[Diagram: possible populations in the posterior; possible points in the population; points in the posterior; labels, latent values, measurement variances; data and potential response.]

46 Possible Points for Possible Populations with Measurement Error
[Diagram: Prior, Data, Posterior; populations in the posterior; points in the posterior.]

47 Possible Points for Possible Populations with Measurement Error
[Diagram: Prior, Data, Posterior; populations in the posterior; points in the posterior.]
Posterior random variables for subject labels (assuming permutations are equally likely in the prior): […].

48 Possible Points for Possible Populations with Measurement Error
[Diagram: Prior, Data, Posterior; populations in the posterior; points in the posterior.]
Posterior random variables for latent values (assuming permutations are equally likely in the prior): […].

49 Possible Points for Possible Populations with Measurement Error
[Diagram: Prior, Data, Posterior; populations in the posterior; points in the posterior.]
Posterior random variables for measurement variance (assuming permutations are equally likely in the prior): […].

50 Posterior Random Variables with Measurement Error
[Table: subjects, latent values, response.]

51 Posterior Random Variables with Measurement Error
[Table: subjects, latent values, response; the response is observed for subjects in the data.]
Define the response as marginal over measurement error for subjects that are not in the data. Since […], condition on […] to obtain the posterior distribution.

52 Posterior Random Variables with Measurement Error (Posterior)
If permutations of subjects in listing p are equally likely: […], where […].

53 Posterior Random Variables with Measurement Error (Posterior)
If permutations of subjects in listing p are equally likely: […], where […].

54 Posterior Random Variables with Measurement Error (Posterior)
If permutations of subjects in listing p are equally likely: […], where […].

55 Posterior Random Variables with Measurement Error (Posterior)
Finite Population Mixed Model for the subjects in the data: […], where […] is a random subject effect.
Use this model to obtain the best linear unbiased predictor of the latent value for a subject in the data (which we call the BLUP for a realized subject).

56 Finite Population Mixed Model (FPMM) for Subjects in the Data, Based on the Posterior Random Variables
What is the latent value for a subject in the data?
[Diagram: data as a set and as a set of vectors; latent values; the FPMM; the latent value for subject s with label […].]

57 Finite Population Mixed Model (FPMM) for Subjects in the Data, Based on the Posterior Random Variables
What is the latent value for a subject in the data?
[Diagram: data; the FPMM; the latent value for subject s with label […]. Example with n=2.]
In the data, the permutation matrices are not random; they just differ. In the posterior distribution, the permutation matrices are random, a consequence of using the prior.

58 Posterior Random Variables with Measurement Error (Posterior)
If permutations of subjects in listing p are equally likely: […], where […].
Finite Population Mixed Model for the subjects in the data: […], where […] is a random subject effect.
Use this model to obtain the best linear unbiased predictor of the latent value for a subject in the data (which we call the BLUP for a realized subject).

59 Posterior Random Variables with Measurement Error: FPMM to Predict Latent Values
Finite Population Mixed Model for the subjects in the data: […], where […] is a random subject effect.
Use this model to obtain the best linear unbiased predictor of the latent value for a subject in the data (which we call the BLUP for a realized subject).
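
The idea of predicting a latent value by shrinkage can be illustrated in Python. This sketch is not the finite population predictor derived in the talk, whose formula is not recoverable from the transcript. It uses the familiar mixed-model shrinkage form instead, under stated assumptions: known variance components, a common measurement variance for all subjects, the same number of repeated measures per subject, and the overall mean estimated by the simple average of subject means. The function name `blup_latent` and all numeric values are hypothetical.

```python
from statistics import fmean

def blup_latent(subject_means, var_subject, var_error, reps=1):
    """Shrink each subject's mean response toward the overall mean.

    Assumes known variance components and a common error variance, so the
    shrinkage constant is k = var_subject / (var_subject + var_error / reps).
    """
    if var_subject < 0 or var_error < 0 or reps < 1:
        raise ValueError("variances must be non-negative and reps at least 1")
    total = var_subject + var_error / reps
    if total == 0:
        raise ValueError("at least one variance component must be positive")
    k = var_subject / total
    overall = fmean(list(subject_means.values()))
    # Predictor: overall mean plus the shrunken deviation of the subject mean.
    return {s: overall + k * (m - overall) for s, m in subject_means.items()}

# Hypothetical mean responses for the n = 2 subjects in the data.
observed = {"Rose": 10.0, "Daisy": 20.0}
predicted = blup_latent(observed, var_subject=4.0, var_error=4.0, reps=1)
print(predicted)
```

With equal subject and error variances the shrinkage constant is one half, so each prediction sits midway between the subject's own mean and the overall mean. Larger measurement variance pulls predictions further toward the overall mean; with no measurement error the prediction is the observed value itself.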