My first 100 Tb of data STATISTICAL METHODS FOR NEW TECHNOLOGY WORKING GROUP Ciprian M. Crainiceanu Johns Hopkins University


1 My first 100 Tb of data STATISTICAL METHODS FOR NEW TECHNOLOGY WORKING GROUP Ciprian M. Crainiceanu Johns Hopkins University http://www.biostat.jhsph.edu/smnt

2 Members of the group
Key personnel: C.M. Crainiceanu, B.S. Caffo, A.-M. Staicu, S. Greven, D. Ruppert, C.-Z. Di
Senior students: V. Zipunnikov, J.-A. Goldsmith
Other statisticians (>20)
Scientific collaborators: direct collaboration, solving important scientific problems, diverse scientific applications

3 Scientific Collaborators
Susan Bassett – fMRI, Alzheimer's
Danny Reich – DTI, DCE-MRI, MS
Brian Schwartz – lead exposure, VBM, DTI, white matter imaging
Stewart Mostofsky – fMRI, rsfcMRI, autism, ADHD, Tourette's
Naresh Punjabi – EEG, sleep, sleep diseases
Dzung Pham / Pilou Bazin – cortical shape, thickness, lesion detection, MS
Dean Wong – PET, fMRI, substance abuse
Susan Resnick – BLSA
Jerry Prince – BLSA, ADNI
Jim Pekar, Peter Van Zijl – 7T MRI, fMRI, rsfcMRI preprocessing, scanner physics
Christos Davatzikos – RAVENS
Susumu Mori – DTI, tractography
Dana Boatman – ECOG, EEG, epilepsy
Graham Redgrave – fMRI, DTI, Huntington's, anorexia/bulimia
Tudor Badea, Bruno Jedynak – neuron classification, morphometry, 3D structure and shape
Tom Glass – Gizmos
Merck – EEG, neuroimaging
Pfizer – imaging biomarkers?

4 Observational Studies 2.0

5

6

7 Longitudinal Functional Principal Component Analysis (LFPCA)
I=1000, J=4, D=100: 15
I=1000, J=8, D=200: 70
Greven, Crainiceanu, Caffo, Reich, 2010. LFPCA, EJS, to appear

8 A simple regression formula
Data compression via longitudinal PCA
Method-of-moments (MoM) estimators of covariance matrices, smoothing
Need: all covariance operators
Solution: regress $Y_{ij}(d)\,Y_{ik}(d)$ on $1$, $T_{ik}$, $T_{ij}$, $T_{ik}T_{ij}$, $\delta_{jk}$
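The regression on this slide can be sketched in a few lines of NumPy. The sketch assumes centered curves stored in an array `Y` of shape (I, J, D) and visit times `T` of shape (I, J). It also assumes the products are formed for every pair of locations (d, d'), so each regression coefficient becomes a D x D surface. The function name `mom_covariances`, the output keys, and the simulated data are hypothetical. Smoothing is omitted, and this is not the authors' implementation.

```python
import numpy as np

def mom_covariances(Y, T):
    """Method-of-moments covariance estimates by least squares.

    Y : (I, J, D) array of centered curves (subject i, visit j, location d).
    T : (I, J) array of visit times.
    Returns a dict of (D, D) coefficient surfaces, one per regressor.
    """
    I, J, D = Y.shape
    # All ordered visit pairs (j, k), including j == k.
    j_idx, k_idx = np.divmod(np.arange(J * J), J)
    T_ij, T_ik = T[:, j_idx], T[:, k_idx]          # each of shape (I, J*J)
    delta_jk = np.broadcast_to((j_idx == k_idx).astype(float), T_ij.shape)
    # Design matrix with columns 1, T_ik, T_ij, T_ik * T_ij, delta_jk.
    X = np.stack([np.ones_like(T_ij), T_ik, T_ij, T_ik * T_ij, delta_jk],
                 axis=-1).reshape(-1, 5)
    # Responses: products Y_ij(d) * Y_ik(d') for every pair (d, d').
    prod = np.einsum("ipd,ipe->ipde", Y[:, j_idx, :], Y[:, k_idx, :])
    coef, *_ = np.linalg.lstsq(X, prod.reshape(I * J * J, D * D), rcond=None)
    names = ["intercept", "T_ik", "T_ij", "T_ik*T_ij", "delta_jk"]
    return {name: coef[m].reshape(D, D) for m, name in enumerate(names)}

# Synthetic demo (made-up data): random intercept, random slope, visit effect.
rng = np.random.default_rng(0)
I, J, D = 2000, 4, 3
phi0 = np.array([1.0, 0.5, 0.0])
phi1 = np.array([0.0, 1.0, 0.5])
phiU = np.array([0.5, 0.0, 1.0])
T = rng.uniform(-1.0, 1.0, size=(I, J))
a = rng.normal(0.0, 2.0, size=(I, 1, 1))   # random-intercept scores, variance 4
b = rng.normal(0.0, 1.0, size=(I, 1, 1))   # random-slope scores, variance 1
c = rng.normal(0.0, 1.0, size=(I, J, 1))   # visit-specific scores, variance 1
Y = a * phi0 + b * T[:, :, None] * phi1 + c * phiU
est = mom_covariances(Y, T)
# The intercept surface should be roughly 4 * outer(phi0, phi0).
print(np.round(est["intercept"], 2))
```

In this simulated setting the intercept surface estimates the random-intercept covariance, the `T_ik*T_ij` surface the random-slope covariance, and the `delta_jk` surface the visit-specific covariance.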

9 Variance explained (FA, 3 yrs of long. data)

10 Longitudinal Penalized Functional Regression

11 LPFR: recipe and ingredients

12 PASAT/MD (corpus callosum), PD (corticospinal)

13 Functional regression
No paper on longitudinal functional regression
No paper published with this data structure
Longitudinal extensions are not simple
Technical details are hard without the correct recipe for known and published ingredients
No available method that scales up
Goldsmith, Feder, Crainiceanu, Caffo, Reich, 2010. PFR, JCGS, to appear
Goldsmith, Crainiceanu, Caffo, Reich, 2010. LPFR, to appear?

14 Population Value Decomposition (PVD)

15 PVD
$Y_i = P V_i D + E_i$
$P$ is $T \times A$, $V_i$ is $A \times B$, $D$ is $B \times F$
$A \ll T$, $B \ll F$
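To see why the conditions A << T and B << F matter for data compression, one can count the numbers stored with and without the decomposition. The helper `pvd_storage` and the dimensions below are hypothetical and chosen only for illustration; they do not come from the talk.

```python
def pvd_storage(n, T, F, A, B):
    """Count stored numbers for n subjects: raw data versus PVD form."""
    raw = n * T * F                          # n subject matrices Y_i, each T x F
    compressed = T * A + B * F + n * A * B   # shared P and D, plus n matrices V_i
    return raw, compressed

# Illustrative dimensions (assumed, not from the slides).
raw, compressed = pvd_storage(n=100, T=200, F=300, A=10, B=12)
print(raw, compressed)  # 6000000 17600
```

Because P and D are shared by all subjects, the per-subject cost drops from T x F numbers to A x B.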

16 Singular Value Decomposition (SVD) summarizes variance
[Figure: SVD of one subject's data (time by frequency) into eigenvariates, a diagonal matrix, and eigenfrequencies.]

17 Caffo BS, Crainiceanu CM, Verduzco G, Joel SE, Mostofsky SH, Bassett SS, Pekar JJ. Two-stage decompositions for the analysis of functional connectivity for fMRI with application to Alzheimer's disease risk. NeuroImage (in press).
Default PVD
[Figure: flow of the default PVD.]
Start here: subject-specific data
SVD of each subject's data gives a low-rank approximation (eigenvariates, eigenfrequencies)
Eigenvariates and eigenfrequencies are stacked across subjects
Population decomposition of the stacked matrices
Projecting original data onto population bases
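The flow on this slide can be sketched as follows, assuming each subject's data is a time-by-frequency matrix and that both the subject-level and the population-level decompositions are plain SVDs. The function name `default_pvd`, the truncation parameters `L`, `A`, `B`, and the simulated data are assumptions made for illustration; the published method may differ in its details.

```python
import numpy as np

def default_pvd(Ys, L, A, B):
    """Sketch of a two-stage decomposition Y_i ~ P @ V_i @ D.

    Ys : list of (T, F) subject matrices.
    L  : number of subject-level singular vectors kept per subject.
    A, B : number of population-level left and right basis vectors.
    """
    left, right = [], []
    for Y in Ys:
        # Stage 1: subject-specific SVD, keep a low-rank approximation.
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        left.append(U[:, :L])        # eigenvariates, T x L
        right.append(Vt[:L, :].T)    # eigenfrequencies, F x L
    # Stage 2: stack across subjects and decompose at the population level.
    U_stack = np.hstack(left)        # T x (n * L)
    V_stack = np.hstack(right)       # F x (n * L)
    P = np.linalg.svd(U_stack, full_matrices=False)[0][:, :A]      # T x A
    D = np.linalg.svd(V_stack, full_matrices=False)[0][:, :B].T    # B x F
    # Stage 3: project the original data onto the population bases.
    Vs = [P.T @ Y @ D.T for Y in Ys]                                # each A x B
    return P, Vs, D

# Synthetic demo (made-up data) with an exact low-rank population structure.
rng = np.random.default_rng(1)
T_len, F_len, A, B, n = 20, 15, 3, 2, 5
P0 = np.linalg.qr(rng.standard_normal((T_len, A)))[0]       # T x A, orthonormal columns
D0 = np.linalg.qr(rng.standard_normal((F_len, B)))[0].T     # B x F, orthonormal rows
Ys = [P0 @ rng.standard_normal((A, B)) @ D0 for _ in range(n)]
P, Vs, D = default_pvd(Ys, L=2, A=A, B=B)
err = max(np.linalg.norm(Y - P @ V @ D) for Y, V in zip(Ys, Vs))
print(err)  # near zero here because the simulated data has no noise term
```

With real data the reconstruction is only approximate, and the choice of `L`, `A`, and `B` controls the trade-off between compression and fidelity.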

18 Population eigenimages

19 Currently:
Deploying PVD to the 1000 Functional Connectomes Project http://www.nitrc.org/projects/fcon_1000/
Comparing rsfcMRI in stroke versus normal subjects

20 HD-MFPCA/RAVENS Images

21 Multilevel Functional Principal Component Analysis (MFPCA)

22 MFPCA

23 HD-MFPCA

24 HD-MFPCA, Step 1

25 HD-MFPCA, Step 2

26

27 Main message, backed by 100Tb of data
Eventually, good tech makes it into observational studies and clinical trials
Longitudinal/multilevel FDA is the natural next step in FDA
Data is changing the way we do business: availability, size, complexity
Likely: funding will be based much more on relevance than on technical ability

