Published by Parker Lefort. Modified about 1 year ago.

1
Bayesian Factor Regression Models in the “Large p, Small n” Paradigm
Mike West, Duke University
Presented by: John Paisley, Duke University

2
Outline
Empirical Factor Regression (SVD)
Latent Factor Regression
Sparse Factor Regression

3
Linear Regression & Empirical Factor Regression
Linear Regression
SVD Regression
D is a diagonal matrix of singular values
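The formulas on this slide did not survive extraction. As a hedged sketch only, the two regressions are commonly written as below. The slide itself confirms only that D is the diagonal matrix of singular values; the symbols A, F, theta and epsilon, and the convention that X is p x n with one column per sample, are assumptions.

```latex
\begin{align*}
y &= X^{\top}\beta + \epsilon, && X \in \mathbb{R}^{p \times n} \\
X &= A D F, && A^{\top}A = I_k,\quad F F^{\top} = I_k \\
y &= F^{\top}\theta + \epsilon, && \theta = D A^{\top}\beta
\end{align*}
```

Under these assumptions the third line follows from substituting the second into the first, so the regression on p predictors becomes a regression on k factors.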

4
Empirical Factor Regression
By definition,
Regression is now done in factor space using generalized shrinkage (ridge regression) priors on, e.g. RVM
Problem of inversion: has many-to-one mapping
is canonical “least-norm” inverse
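A minimal NumPy sketch of ridge regression in factor space followed by the least-norm inverse back to beta. It runs on synthetic data and assumes the notation X = A D F (X is p x n) and theta = D A' beta; neither appears in the surviving slide text. The penalty `tau` and all sizes are hypothetical, and a plain ridge estimate stands in for the Bayesian shrinkage priors the slide mentions.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 200, 30                       # "large p, small n" (synthetic sizes)
X = rng.standard_normal((p, n))      # predictors, one column per sample
beta_true = np.zeros(p)
beta_true[:5] = 1.0                  # sparse ground truth, for illustration only
y = X.T @ beta_true + 0.1 * rng.standard_normal(n)

# Thin SVD: X = A @ diag(d) @ F, with A'A = I and F F' = I
A, d, F = np.linalg.svd(X, full_matrices=False)

# Ridge estimate for y = F' theta + noise.
# Because F F' = I, (F F' + tau I)^{-1} F y simplifies to F y / (1 + tau).
tau = 0.01                           # hypothetical ridge penalty
theta_hat = (F @ y) / (1.0 + tau)

# Many betas map to the same theta; the least-norm choice is A D^{-1} theta.
beta_ln = A @ (theta_hat / d)

fitted_factor = F.T @ theta_hat      # fit computed in factor space
fitted_beta = X.T @ beta_ln          # same fit computed from beta
n_nonzero = int(np.count_nonzero(np.abs(beta_ln) > 1e-12))
```

Both fits agree, since the least-norm inverse reproduces theta exactly. `n_nonzero` shows that the recovered beta is dense even when the generating beta is sparse.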

5
Example: Biscuit Dough Data
NIR spectroscopy reflectance values are the predictors.
The response is the fat content of the dough samples.
39 training, 39 testing: the data are pooled, and the testing responses are treated as missing values to be imputed.
The top 16 factors are used, chosen by the size of their singular values.
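The pooling step can be sketched as follows: take the SVD of the training and testing predictors together, keep the leading factors, fit on the training samples, and predict the held-out responses. This is a plug-in least-squares stand-in, not the Bayesian imputation used in the talk. The data are synthetic, the value of `p` is a placeholder (the slide does not give the number of wavelengths), and mean-centering is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n_train, n_test, k = 300, 39, 39, 16   # 39/39/16 follow the slide; p is hypothetical
n = n_train + n_test

# Synthetic stand-in for the spectra: low-rank structure plus small noise
latent = rng.standard_normal((k, n))
loadings = rng.standard_normal((p, k))
X = loadings @ latent + 0.05 * rng.standard_normal((p, n))
y = latent.T @ rng.standard_normal(k) + 0.05 * rng.standard_normal(n)

train = np.arange(n_train)
test = np.arange(n_train, n)

# SVD of the pooled predictors; singular values come back in descending order
A, d, F = np.linalg.svd(X, full_matrices=False)
Fk = F[:k, :]                             # top-k factors for all 78 samples

# Fit factor-space coefficients on the training samples only
theta_hat, *_ = np.linalg.lstsq(Fk[:, train].T, y[train], rcond=None)

# "Impute" the held-out responses from their factor scores
y_pred = Fk[:, test].T @ theta_hat
rmse = float(np.sqrt(np.mean((y_pred - y[test]) ** 2)))
```

Pooling works here because the factor scores depend only on the predictors, so the test samples contribute to the factors without their responses being seen.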

6
Example: Biscuit Dough Data (2)
Left: fitted and predicted vs. true values
Right: least-norm inverse of beta
The ~1700 nm range is the absorbance region for fat.
As can be seen, the solution is not sparse.

7
Latent Factor Regression
Loosen to
Under proper constraints on B, this finds common structure in X and isolates idiosyncrasies to noise.
Now, variation in X has less effect on y.
The implied prior is
When the variance, Phi, goes to 0, this reverts to empirical linear regression.
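The equations on this slide were lost in extraction. One common way to write a latent factor regression of this kind is sketched below. Only B and Phi are named on the slide; lambda, nu, theta and epsilon are assumed symbols, and the form of the implied prior is not recoverable from the text, so it is not reconstructed here.

```latex
\begin{align*}
x_i &= B\lambda_i + \nu_i, & \nu_i &\sim N(0, \Phi) \\
y_i &= \theta^{\top}\lambda_i + \epsilon_i, & \lambda_i &\sim N(0, I_k)
\end{align*}
```

In this form, letting Phi go to 0 removes the idiosyncratic term, so each x_i is an exact linear combination of the columns of B, as in the empirical (SVD) case.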

8
Sparse Latent Factor Regression
With respect to gene expression profiling, “multiple biological factors underlie patterns of gene expression variation, so latent factor approaches are natural – we imagine that latent factors reflect individual biological functions… This is a motivating context for sparse models.”
Each column of B represents the genes involved in a particular biological factor.
Each row of B represents a particular gene’s involvement across biological factors.
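The column and row reading of B can be illustrated with a small, entirely hypothetical loadings matrix (6 genes, 3 factors). The values are made up for illustration and are not taken from the talk.

```python
import numpy as np

# Hypothetical sparse loadings: rows are genes, columns are factors
B = np.array([
    [0.9, 0.0, 0.0],
    [0.7, 0.0, 0.0],
    [0.0, 1.2, 0.0],
    [0.0, 0.8, 0.5],
    [0.0, 0.0, 1.1],
    [0.0, 0.0, 0.0],
])

# Column view: which genes take part in each factor
genes_in_factor = {j: np.flatnonzero(B[:, j]).tolist() for j in range(B.shape[1])}

# Row view: which factors each gene takes part in
factors_of_gene = {g: np.flatnonzero(B[g, :]).tolist() for g in range(B.shape[0])}
```

Gene 3 loads on two factors, while gene 5 loads on none and so is explained by noise alone in this toy matrix.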

9
Example: Gene Expression Data
p = 6128 genes measured using Affymetrix DNA microarrays
n = 49 breast cancer tumor samples
k = 25 factors
Factor 3 separates the tumors by estrogen receptor (ER) status (red: ER positive, blue: ER negative).

10
Example: Gene Expression Data
Comparison with results obtained using empirical SVD factors

11
Conclusion
Sparse factor regression modeling is a promising framework for dimensionality reduction of predictors.
Only those factors that are relevant (e.g. factor 3) are of interest.
Therefore, only those genes with non-zero values in that column of B are meaningful.
