Published by James Drake. Modified over 3 years ago.

1
Network Inference
Umer Zeeshan Ijaz

2
Overview
- Introduction
- Application Areas
  - cDNA Microarray
  - EEG/ECoG
- Network Inference
  - Pair-wise Similarity Measures
    - Cross-correlation (STATIC)
    - Coherence (STATIC)
  - Autoregressive
    - Granger Causality (STATIC)
  - Probabilistic Graphical Models
    - Directed
      - Kalman-filtering based EM algorithm (STATIC)
    - Undirected
      - Kernel-weighted logistic regression method (DYNAMIC)
      - Graphical Lasso model (STATIC)

3
Introduction

4
cDNA Microarray

5
ECoG/EEG

6
Cross-correlation based (1)
For a pair of time series x_i[t] and x_j[t], each of length n, compute the sample cross-correlation at each lag τ.
Measure of coupling: the maximum cross-correlation over the lags τ.
Use a p-value test to compare z_ij with a standard normal distribution (mean zero, variance 1).
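The coupling measure on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `max_cross_correlation`, the `max_lag` parameter, and the normalization by the number of overlapping samples are assumptions, since the slide's formula was not preserved.

```python
import numpy as np

def max_cross_correlation(xi, xj, max_lag):
    """Return (c, lag): the cross-correlation of largest magnitude and its lag."""
    # Standardize both series so each product is a correlation-like quantity.
    xi = (xi - xi.mean()) / xi.std()
    xj = (xj - xj.mean()) / xj.std()
    n = len(xi)
    best_c, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Pair xi[t] with xj[t + lag] over the overlapping samples.
        if lag >= 0:
            a, b = xi[:n - lag], xj[lag:]
        else:
            a, b = xi[-lag:], xj[:n + lag]
        c = np.dot(a, b) / len(a)  # assumed normalization: overlap length
        if abs(c) > abs(best_c):
            best_c, best_lag = c, lag
    return best_c, best_lag
```

A positive returned lag means that x_j follows x_i by that many samples.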

7
Cross-correlation based (2)
Significance test: ANALYTIC METHOD
Use the Fisher transformation: the resulting distribution is normal with a known standard deviation.
Scaling by this standard deviation gives a value that is expected to behave like the maximum of the absolute values of a sequence of random numbers. Established results for statistics of this form then give an analytic p-value for the observed maximum.
*M. A. Kramer, U. T. Eden, S. S. Cash, and E. D. Kolaczyk. Network inference with confidence from multivariate time series. Physical Review E 79, 061916, 2009
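As a rough illustration of the Fisher transformation step, the sketch below tests a single correlation value against a standard normal distribution. It assumes the classical standard deviation 1/sqrt(n - 3), which holds for independent samples; the standard deviation used on the slide was not preserved and may differ. It also omits the correction for taking a maximum over lags. The function name `fisher_z_pvalue` is hypothetical.

```python
import math

def fisher_z_pvalue(r, n):
    """Scaled Fisher z-value and two-sided p-value for a correlation r from n samples."""
    z = math.atanh(r)                # Fisher transformation of the correlation
    sd = 1.0 / math.sqrt(n - 3)      # assumption: classical i.i.d. standard deviation
    z_scaled = z / sd                # approximately N(0, 1) under the null hypothesis
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z_scaled) / math.sqrt(2.0))
    return z_scaled, p
```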

8
Cross-correlation based (3)
Significance test: FREQUENCY DOMAIN BOOTSTRAP METHOD
1) Compute the power spectrum (Hanning tapered) of each series and average these power spectra over all the time series
2) Compute the standardized and whitened residuals for each time series
3) For each bootstrap replicate, RESAMPLE WITH REPLACEMENT and compute the surrogate data
4) Compute a number of such instances and calculate the maximum cross-correlation for each pair of nodes i and j
5) Finally, compare the observed value with the bootstrap distribution and assign a p-value

9
Cross-correlation based (4)
False Discovery Rate (FDR) Test
1) Order the m = N(N-1)/2 p-values: p_(1) ≤ p_(2) ≤ ... ≤ p_(m)
2) Choose the FDR level q
3) Compare each p_(i) to its critical value q·i/m and find the maximum i, call it k, such that p_(i) ≤ q·i/m
4) Reject the null hypothesis that time series x_i and x_j are uncoupled for the pairs corresponding to p_(1), ..., p_(k)
*M. A. Kramer, U. T. Eden, S. S. Cash, and E. D. Kolaczyk. Network inference with confidence from multivariate time series. Physical Review E 79, 061916, 2009
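The four steps above match the standard Benjamini-Hochberg procedure, which can be sketched in a few lines. The function name `benjamini_hochberg` and the choice to return indices into the input list are illustrative assumptions.

```python
def benjamini_hochberg(pvalues, q):
    """Return the sorted indices of the p-values whose null hypothesis is rejected."""
    m = len(pvalues)
    # Step 1: order the p-values, remembering where each one came from.
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    # Step 3: find the largest rank whose p-value is below its critical value q * rank / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= q * rank / m:
            k = rank
    # Step 4: reject the k smallest p-values.
    return sorted(order[:k])
```

For example, `benjamini_hochberg([0.01, 0.02, 0.03, 0.5], 0.05)` rejects the first three hypotheses, because 0.03 is below its critical value 0.05 · 3/4 = 0.0375.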

10
Coherence based (1)
Coherence: signals are fully coherent when they are correlated with a constant phase shift, although they may differ in amplitude.
Cross-phase spectrum: provides information on the time relationships between two signals as a function of frequency. Phase displacement may be converted into time displacement.
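Both quantities on this slide can be estimated from segment-averaged spectra. The sketch below is one common estimator (non-overlapping Hanning-tapered segments), not necessarily the one used in the presentation; the function name `coherence_and_phase` and the `seg_len` parameter are assumptions.

```python
import numpy as np

def coherence_and_phase(x, y, seg_len):
    """Magnitude-squared coherence and cross-phase spectrum of two equal-length signals."""
    window = np.hanning(seg_len)
    n_freq = seg_len // 2 + 1
    pxx = np.zeros(n_freq)
    pyy = np.zeros(n_freq)
    pxy = np.zeros(n_freq, dtype=complex)
    # Accumulate auto- and cross-spectra over non-overlapping segments.
    for s in range(len(x) // seg_len):
        seg = slice(s * seg_len, (s + 1) * seg_len)
        fx = np.fft.rfft(window * x[seg])
        fy = np.fft.rfft(window * y[seg])
        pxx += np.abs(fx) ** 2
        pyy += np.abs(fy) ** 2
        pxy += np.conj(fx) * fy
    coherence = np.abs(pxy) ** 2 / (pxx * pyy)  # between 0 and 1 at each frequency
    phase = np.angle(pxy)                       # phase of y relative to x, in radians
    return coherence, phase
```

Scaling one signal changes neither output, which illustrates the point that coherent signals may differ in amplitude.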

11
Coherence based (2)
*S. Weiss and H. M. Mueller. The contribution of EEG coherence to the investigation of language. Brain and Language 85(2), 325-343, 2003

12
Granger Causality
Directed Transfer Function: directional influences between any given pair of channels in a multivariate data set.
Bivariate autoregressive process: if the variance of the prediction error of one series is reduced by including the other series, then, in the sense of Granger causality, the first series depends on the second.
Taking the Fourier transform of the autoregressive model gives the Granger causality from channel j to channel i.
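The prediction-error argument can be illustrated in the time domain. Note that this sketch is a swapped-in simplification: it compares least-squares residual variances of two autoregressive fits and does not compute the frequency-domain Directed Transfer Function that the slide refers to. All function names and the `order` parameter are assumptions.

```python
import numpy as np

def lagged(series, order):
    """Matrix whose column k-1 holds the series delayed by k samples."""
    n = len(series)
    return np.column_stack([series[order - k: n - k] for k in range(1, order + 1)])

def residual_variance(target, predictors):
    """Variance of the least-squares prediction error (with an intercept term)."""
    design = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.var(target - design @ coef)

def granger_causality(source, target, order):
    """Log ratio of error variances: restricted model (own past) vs full model."""
    y = target[order:]
    own_past = lagged(target, order)
    full = np.column_stack([own_past, lagged(source, order)])
    return np.log(residual_variance(y, own_past) / residual_variance(y, full))
```

A value near zero indicates that the past of `source` does not help predict `target`; larger values indicate a stronger directed influence.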

13
Kalman Filter
- State Space Model (State Variable Model; State Evolution Model)
  - State Equation
  - Measurement Equation
- Measurement Update (Filtering)
- Time Update (Prediction)
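The two update steps listed above can be sketched for a linear Gaussian state-space model. The symbols `F`, `Q`, `H`, and `R` (state transition, process noise covariance, measurement matrix, measurement noise covariance) follow common textbook notation and are assumptions, since the slide's equations were not preserved.

```python
import numpy as np

def kalman_step(x, P, z, F, Q, H, R):
    """One Kalman filter cycle: time update (prediction), then measurement update."""
    # Time update: propagate the state estimate and its covariance.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Measurement update: correct the prediction with the new observation z.
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

For a constant scalar state observed repeatedly, the estimate moves toward the measured value and the covariance shrinks with each step.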

14
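Probabilistic graphical models (1)
Joint distribution over a set of variables.
Bayesian Networks associate with each variable a conditional probability given its parents. The resulting product is of the form P(X_1, ..., X_n) = ∏_i P(X_i | parents(X_i)).
Example network over nodes A, B, C, D, E, with conditional probability table P(C|A,B):

A  B | C=0   C=1
0  0 | 0.9   0.1
0  1 | 0.2   0.8
1  0 | 0.9   0.1
1  1 | 0.01  0.99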
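The factorization can be made concrete with the slide's table for P(C|A,B). The uniform priors for A and B below are hypothetical values chosen only for illustration; the slide does not give them, and nodes D and E are left out.

```python
from itertools import product

# P(C = c | A = a, B = b) from the slide's table, stored as cpt_c[(a, b)][c].
cpt_c = {
    (0, 0): (0.9, 0.1),
    (0, 1): (0.2, 0.8),
    (1, 0): (0.9, 0.1),
    (1, 1): (0.01, 0.99),
}
# Hypothetical priors (not from the slide): A and B are fair coin flips.
p_a = (0.5, 0.5)
p_b = (0.5, 0.5)

def joint(a, b, c):
    """Bayesian network factorization: P(A, B, C) = P(A) P(B) P(C | A, B)."""
    return p_a[a] * p_b[b] * cpt_c[(a, b)][c]

def marginal_c(c):
    """P(C = c), obtained by summing the joint over the parents A and B."""
    return sum(joint(a, b, c) for a, b in product((0, 1), repeat=2))
```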

15
EM Algorithm: Predicting gene regulatory network
Constructing the network:

16
EM Algorithm: Predicting gene regulatory network (2)
- Conditional distribution of state and observables
- Factorization rule for Bayesian network
- Unknowns in the system

17
EM Algorithm: Predicting gene regulatory network (4)
- Construct the likelihood
- Marginalize with respect to x and introduce a distribution Q

18
Kalman filter based: Inferring network from microarray expression data (5)
Let's say we want to compute C

19
Kalman filter based: Inferring network from microarray expression data (9)
Experimental Results: A standard T-Cell activation model
*C. Rangel, J. Angus, Z. Ghahramani, M. Lioumi, E. Sotheran, A. Gaiba, D. L. Wild, and F. Falciani. Modeling T-cell activation using gene expression profiling and state-space models. Bioinformatics 20(9), 1361-1372, 2004

20
Probabilistic graphical models (2)
Markov Networks represent the joint distribution as a product of potentials.
Example network over nodes A, B, C, D, E, with potential π_1(A,B):

A  B | π_1(A,B)
0  0 | 1.0
0  1 | 0.5
1  0 | 0.5
1  1 | 2.0
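Unlike conditional probabilities, potentials need not sum to one, so a Markov network requires a normalizing constant. The sketch below normalizes the slide's single potential π_1(A,B) on its own; treating A and B as a two-node network is a simplifying assumption, since the other potentials of the example were not preserved.

```python
# Potential table from the slide, keyed by (a, b).
potential = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}

def partition_function(pot):
    """Normalizing constant Z: the sum of the potential over all assignments."""
    return sum(pot.values())

def normalized(pot):
    """Joint distribution obtained by dividing each potential value by Z."""
    z = partition_function(pot)
    return {assignment: value / z for assignment, value in pot.items()}
```

Here Z = 1.0 + 0.5 + 0.5 + 2.0 = 4.0, so the assignment A = 1, B = 1 receives probability 0.5.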

21
Kernel-weighted logistic regression method (1)
[Figure: graph over nodes x1, x2, x3, x4, x5, x6, x8]
- Pair-wise Markov Random Field
- Logistic Function
- Log Likelihood
- Optimization problem
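The optimization problem named on this slide can be sketched as a kernel-weighted, L1-penalized logistic regression for one node at one time point. The details are assumptions rather than the slide's formulas, which were not preserved: binary states in {-1, +1}, a Gaussian kernel over time, and the logistic form P(x_i | rest) = 1 / (1 + exp(-2 x_i θ·x_rest)). All names and parameters are hypothetical.

```python
import numpy as np

def kernel_weights(times, t_star, bandwidth):
    """Gaussian kernel weights centred at t_star, normalized to sum to one."""
    k = np.exp(-0.5 * ((times - t_star) / bandwidth) ** 2)
    return k / k.sum()

def weighted_objective(theta, X, i, weights, lam):
    """Negative kernel-weighted log-likelihood of node i plus an L1 penalty.

    X has one row per time point and one column per node, with entries -1 or +1.
    theta holds one coefficient per node other than i.
    """
    others = np.delete(X, i, axis=1)
    margin = 2.0 * X[:, i] * (others @ theta)
    log_lik = -np.logaddexp(0.0, -margin)   # log of the logistic function
    return -(weights @ log_lik) + lam * np.abs(theta).sum()
```

Minimizing this objective over `theta` for each node and each time point, then keeping the non-zero coefficients as edges, yields a network that changes over time.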

22
Kernel-weighted logistic regression method(2)

23
Kernel-weighted logistic regression method (3)
Interactions between gene ontological groups related to the developmental process, undergoing dynamic rewiring. The weight of an edge between two ontological groups is the total number of connections between genes in the two groups. In the visualization, the width of an edge is proportional to the edge weight. The edge weight is thresholded at 30, so that only interactions exceeding this number are displayed. The average network on the left is produced by averaging the networks on the right; in this case, the threshold is set to 20.
*L. Song, M. Kolar, and E. P. Xing. KELLER: estimating time-varying interactions between genes. Bioinformatics 25, i128-i136, 2009

24
Graphical Lasso Model (1)
*O. Banerjee, L. El Ghaoui, and A. d'Aspremont. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research 101, 2007

25
Graphical Lasso Model (2)
Solve the lasso problem for w_12, one column j at a time.
*O. Banerjee, L. El Ghaoui, and A. d'Aspremont. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research 101, 2007
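The column-by-column scheme can be sketched with plain coordinate descent. This is a minimal sketch in the style of the graphical lasso algorithm, under assumed notation (S: sample covariance, W: estimated covariance, rho: penalty weight), and is not the software mentioned in the presentation. Function names and stopping rules are hypothetical.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def graphical_lasso(S, rho, n_sweeps=100, tol=1e-8):
    """Return (W, Theta): estimated covariance and sparse precision matrix."""
    p = S.shape[0]
    W = S + rho * np.eye(p)      # starting estimate; the diagonal stays fixed
    B = np.zeros((p, p))         # column j stores the lasso coefficients for column j
    for _ in range(n_sweeps):
        W_old = W.copy()
        for j in range(p):
            idx = np.arange(p) != j
            W11 = W[np.ix_(idx, idx)]
            s12 = S[idx, j]
            beta = B[idx, j].copy()
            # Lasso for column j, solved by cyclic coordinate descent.
            for _ in range(100):
                beta_old = beta.copy()
                for k in range(p - 1):
                    partial = s12[k] - W11[k] @ beta + W11[k, k] * beta[k]
                    beta[k] = soft_threshold(partial, rho) / W11[k, k]
                if np.max(np.abs(beta - beta_old)) < tol:
                    break
            B[idx, j] = beta
            w12 = W11 @ beta     # updated off-diagonal block of column j
            W[idx, j] = w12
            W[j, idx] = w12
        if np.max(np.abs(W - W_old)) < tol:
            break
    # Recover the precision matrix column by column from the lasso coefficients.
    Theta = np.zeros((p, p))
    for j in range(p):
        idx = np.arange(p) != j
        theta_jj = 1.0 / (W[j, j] - W[idx, j] @ B[idx, j])
        Theta[j, j] = theta_jj
        Theta[idx, j] = -B[idx, j] * theta_jj
    return W, Theta
```

Zero entries of `Theta` correspond to missing edges in the inferred network, and a larger `rho` produces a sparser graph.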

26
Graphical Lasso Model (3)
*Software under development at the Oxford Complex Systems Group with Nick Jones
*Results shown for the Google Trends dataset

27
THE END
