North Carolina Agricultural and Technical State University Inferring stable gene regulatory networks from steady-state data Presenter: Joy Edward Larvie.


1 Inferring stable gene regulatory networks from steady-state data
North Carolina Agricultural and Technical State University
Presenter: Joy Edward Larvie
Collaborator: Mohammad Gorji
Advisor: Dr. Abdollah Homaifar

2 Outline
- Motivation
- Background
- Introduction
- Existing Techniques
- Objectives
- Methodology
- Results and Discussion
- Conclusion
- Future Work
- References

3 Motivation
- Genetic networks are useful in drug discovery, where they are crucial for identifying targeted pathways
- They promote biological knowledge and medical diagnosis
- Existing genetic network identification techniques have inherent limitations

4 Background
- Traditional techniques for gene expression studies are limited in both breadth and efficiency
- Researchers could only study one or a few genes at a time
- DNA microarray technology lets researchers analyze the expression patterns of tens of thousands of genes at a time
- The multi-step, data-intensive nature of the technology has created a huge informatics and analytical challenge
- It has become a standard tool for genomic research
Fig. 1: Overview of DNA microarray experiment

5 Introduction
- Understanding the nature of cellular functions requires studying gene behavior from a global perspective
- Genes are typically regulated through complex interconnections of cellular components, such as proteins
- These interactions form the basis of several cellular pathways and molecular processes in living cells
- Identifying these gene regulatory networks (GRNs) promotes biological knowledge, medical diagnosis and drug design, and helps to identify the molecular targets of pharmacological compounds

6 Introduction (cont’d)
- The large number of genes involved in GRNs makes network recovery a very complex task
- The advent of high-throughput technologies such as DNA microarrays has provided a powerful tool that allows large amounts of expression data to be collected in a single experiment
- The increasing availability of data has boosted the network recovery task
- Two categories of expression data:
  - Temporal expression data
  - Steady-state expression data
- Several novel machine learning algorithms have been proposed for network identification
- Commonly used approaches include clustering, Bayesian networks, Boolean networks and dynamic Bayesian networks

7 Existing Techniques

Table 1: Comparison of existing network identification techniques

Boolean network
  Strengths:
  - Can analyse large regulatory networks
  - Easier to interpret due to its simplicity
  - Biologically realistic complex phenomena can be represented by a simple Boolean formalism
  Weaknesses:
  - Deterministic in nature
  - Unable to handle incomplete regulatory network data
  - Involves only two representative states for gene expression levels
  - High computing time is needed
  - Most BNs can only work with a small number of genes

Probabilistic Boolean network
  Strengths:
  - Copes with uncertainties
  - Two or more transition functions are allowed for each variable
  - The use of positive feedback and probabilities can make the model work more effectively
  - Compared to DBN, PBN can explain more details of the regulatory roles of different sets of genes
  Weaknesses:
  - Difficult to apply to large-scale networks
  - High computational complexity
  - Cannot cope with instantaneous interactions between variables

Bayesian network
  Strengths:
  - Able to handle noisy data
  - Handles uncertainty
  - Able to work on logically interacting components with a small number of variables
  - Integrates prior knowledge to strengthen the causal relationship
  - Infers the structure of the network statistically
  Weaknesses:
  - Hard to distinguish between the origin and the target of an interaction
  - Feedback loops not allowed
  - Fails to capture the temporal information of time-series microarray data
  - Supports only small gene regulatory networks
  - Combinatorial learning of Bayesian network

Dynamic Bayesian network
  Strengths:
  - Able to model cyclic interactions among genes
  - Can handle stochastic components
  - Well suited to time-series gene expression data
  - Models indirect or direct causal relationships
  - Handles perturbation or structural modification of networks
  Weaknesses:
  - Excessive computation time and cost
  - Performance restricted by missing values in gene expression data
  - Supports only small gene regulatory networks

Ordinary differential equation
  Strengths:
  - Produces directed signed graphs
  - Suited to steady-state and time-series expression profiles
  - Can work entirely in classical category
  Weaknesses:
  - Only applicable to small networks
  - Difficult to find appropriate parameter values that fit the data

Neural network
  Strengths:
  - Able to recognize input patterns
  - Able to model any functional relationship and data structure
  - Captures nonlinear and dynamic interactions
  - Noise resistant
  Weaknesses:
  - Difficult to train efficiently, since the learning rate must be defined for each data situation
  - High computational complexity, so it can only be applied to very small systems

8 Objectives
- Infer a stable, sparse and causal genetic network from steady-state microarray data
- Understand the inherent gene-gene interactions within the inferred network for biological studies/research

9 Methodology

10 Methodology (cont’d)

11 Methodology (cont’d)

12 Methodology (cont’d)

13 Consider the following example. The circles that bound the eigenvalues are:
C1: center (4, 0), radius r1 = |2| + |3| = 5
C2: center (-5, 0), radius r2 = |-2| + |8| = 10
C3: center (3, 0), radius r3 = |1| + |0| = 1
[Figure: union of the circles; the red dots mark the actual locations of the eigenvalues.]
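The disc computation in this example can be sketched in a few lines of NumPy. The matrix itself does not survive in the transcript, so the one below is an assumption: it is a matrix whose diagonal entries and off-diagonal absolute values reproduce the centers and radii listed on the slide. The function name `gershgorin_discs` is likewise only illustrative.

```python
import numpy as np

def gershgorin_discs(A):
    """Return (centers, radii) of the row-wise Gershgorin discs of a real square matrix."""
    A = np.asarray(A, dtype=float)
    centers = np.diag(A).copy()
    # Radius of disc i = sum of |a_ij| over the off-diagonal entries of row i.
    radii = np.abs(A).sum(axis=1) - np.abs(centers)
    return centers, radii

# Assumed matrix, chosen to match the centers (4, -5, 3) and radii (5, 10, 1) on the slide.
A = np.array([[4.0, 2.0, 3.0],
              [-2.0, -5.0, 8.0],
              [1.0, 0.0, 3.0]])

centers, radii = gershgorin_discs(A)
eigvals = np.linalg.eigvals(A)

# Each eigenvalue should fall inside at least one disc (small tolerance for rounding).
in_some_disc = [bool(np.any(np.abs(lam - centers) <= radii + 1e-9)) for lam in eigvals]

for c, r in zip(centers, radii):
    print(f"center ({c:g}, 0), radius {r:g}")
print("every eigenvalue inside the union of discs:", all(in_some_disc))
```

The discs are cheap to compute because they only need row sums, which is why they are attractive as a bound when the eigenvalues themselves are expensive or awkward to constrain.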

14 In the Gerschgorin Circle Theorem the y-axis is interpreted as the imaginary axis. Since the roots of the characteristic polynomial can be complex numbers, they take the form x + iy, where i is the square root of -1.
The circles that bound the eigenvalues are:
C1: center (1, 0), radius r1 = |0| + |7| = 7
C2: center (-5, 0), radius r2 = |2| + |0| = 2
C3: center (-3, 0), radius r3 = |4| + |4| = 8
All the eigenvalues lie inside the union of all the circles.
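For reference, the theorem used in these two examples can be written compactly as below. The notation (a_ij for the entries of an n-by-n matrix A, r_i for the radii, sigma(A) for the set of eigenvalues) is chosen here to match the slides. The last line is one common way the bound is turned into a sufficient stability condition for a real matrix in a continuous-time linear model; whether the presentation uses exactly this form is an assumption, since the Methodology slides carry no text.

```latex
\[
D_i = \{\, z \in \mathbb{C} : |z - a_{ii}| \le r_i \,\}, \qquad
r_i = \sum_{j \ne i} |a_{ij}|, \qquad i = 1, \dots, n
\]
\[
\lambda \in \sigma(A) \;\Longrightarrow\; \lambda \in \bigcup_{i=1}^{n} D_i
\]
\[
a_{ii} + r_i < 0 \ \text{ for all } i
\;\Longrightarrow\;
\operatorname{Re}\lambda < 0 \ \text{ for all } \lambda \in \sigma(A)
\]
```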

15 Methodology (cont’d)

16 Results and Discussion I
- The identification algorithm is applied to a subnetwork of the SOS pathway in E. coli, as shown
- The main pathway depicted in the network is that between single-stranded DNA (ssDNA) and the protein LexA, which represses several other genes
- The protein RecA, activated by single-stranded DNA, cleaves LexA and hence up-regulates the genes described
- Steady-state data consists of 9 genes over 9 time steps
- Maximum of 81 interactions to be identified
Fig. 2: Diagram of interactions in SOS network in E. coli

17 Results and Discussion (cont’d)
- Steady-state data for the SOS pathway in E. coli from a perturbation experiment
Table 2: Steady-state data for SOS network in E. coli

18 Results and Discussion (cont’d)
- The network recovered from the steady-state data is as shown
- Red arrows represent inhibition, while green arrows depict activation
- The network has 4 false activations, 8 false inhibitions and 25 false no-interactions, for 37 false identifications in total
- Penalty parameter = 0.4
- Satisfies the desired constraints (i.e. stability, sparsity and causality)
Fig. 3: Recovered network for SOS pathway in E. coli
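The error counts above can be tallied by comparing the sign pattern of a recovered matrix with that of a reference network. The sketch below shows one plausible way to do so; the exact definitions used in the presentation are not given, so the three categories as coded here, the function name, and the small 3-gene matrices are all assumptions for illustration and are not the SOS data.

```python
import numpy as np

def count_false_identifications(A_true, A_hat, tol=0.0):
    """Count sign mismatches between a reference network and a recovered one.

    Entries with magnitude <= tol are treated as "no interaction".
    """
    A_true = np.asarray(A_true, dtype=float)
    A_hat = np.asarray(A_hat, dtype=float)
    s_true = np.sign(np.where(np.abs(A_true) > tol, A_true, 0.0))
    s_hat = np.sign(np.where(np.abs(A_hat) > tol, A_hat, 0.0))
    # Predicted activation where the reference has inhibition or nothing.
    false_act = int(np.sum((s_hat > 0) & (s_true <= 0)))
    # Predicted inhibition where the reference has activation or nothing.
    false_inh = int(np.sum((s_hat < 0) & (s_true >= 0)))
    # Predicted no interaction where the reference has one.
    false_none = int(np.sum((s_hat == 0) & (s_true != 0)))
    return {
        "false_activations": false_act,
        "false_inhibitions": false_inh,
        "false_no_interactions": false_none,
        "total": false_act + false_inh + false_none,
    }

# Hypothetical 3-gene example (not taken from the presentation).
A_true_toy = np.array([[-1.0, 1.0, 0.0],
                       [0.0, -1.0, -1.0],
                       [1.0, 0.0, -1.0]])
A_hat_toy = np.array([[-0.5, 0.0, 0.2],
                      [0.3, -0.7, 0.4],
                      [0.0, -0.2, -0.1]])

counts = count_false_identifications(A_true_toy, A_hat_toy)
print(counts)
```

With these definitions the three categories partition the mismatched entries, so the total is always their sum, as in the 4 + 8 + 25 = 37 tally on the slide.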

19 Results and Discussion (cont’d)
- Sparse network matrix A:

Genes       recA      lexA       ssb      recF      dinI     umuDC      rpoD      rpoH      rpoS
recA    -0.00108         0         0         0         0         0         0         0         0
lexA    0.677765  -0.43355  -0.30091   0.01902  -0.94018  -0.31369  0.097238  -0.44007  0.696682
ssb            0         0   -0.9311  -1.20523         0         0  -0.01077  -0.20607         0
recF           0         0         0  -0.44527         0         0         0         0  0.444202
dinI    -0.00119         0         0  0.054514   -0.1221         0  -0.05714  -0.00821         0
umuDC          0         0         0         0  0.527116  -0.52819         0         0         0
rpoD           0         0         0         0         0   0.39363   -0.3947         0         0
rpoH           0         0         0         0         0         0  0.558855  -0.55992         0
rpoS           0         0         0         0         0         0         0  0.509771  -0.51084

Table 3: Recovered sparse network as adjacency matrix
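A quick numerical check of sparsity and stability for a matrix like this one might look as follows. The entries are typed in from Table 3 as read from the flattened transcript, so any mis-split number there carries over. Which stability criterion applies depends on the model, which the transcript does not spell out: negative real parts for a continuous-time system, or spectral radius below one for a discrete-time VAR. The sketch therefore simply reports both quantities.

```python
import numpy as np

genes = ["recA", "lexA", "ssb", "recF", "dinI", "umuDC", "rpoD", "rpoH", "rpoS"]

# Entries as read from Table 3 (row order and column order follow `genes`).
A = np.array([
    [-0.00108, 0, 0, 0, 0, 0, 0, 0, 0],
    [0.677765, -0.43355, -0.30091, 0.01902, -0.94018, -0.31369, 0.097238, -0.44007, 0.696682],
    [0, 0, -0.9311, -1.20523, 0, 0, -0.01077, -0.20607, 0],
    [0, 0, 0, -0.44527, 0, 0, 0, 0, 0.444202],
    [-0.00119, 0, 0, 0.054514, -0.1221, 0, -0.05714, -0.00821, 0],
    [0, 0, 0, 0, 0.527116, -0.52819, 0, 0, 0],
    [0, 0, 0, 0, 0, 0.39363, -0.3947, 0, 0],
    [0, 0, 0, 0, 0, 0, 0.558855, -0.55992, 0],
    [0, 0, 0, 0, 0, 0, 0, 0.509771, -0.51084],
])

eigvals = np.linalg.eigvals(A)

n_nonzero = int(np.count_nonzero(A))             # sparsity: nonzero entries out of 81
max_real_part = float(eigvals.real.max())        # continuous-time criterion: should be < 0
spectral_radius = float(np.abs(eigvals).max())   # discrete-time criterion: should be < 1

print(f"nonzero entries: {n_nonzero} of {A.size}")
print(f"largest real part of an eigenvalue: {max_real_part:.6f}")
print(f"spectral radius: {spectral_radius:.6f}")
```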

20 Results and Discussion II
- The identification algorithm is applied to a perturbed data subset from a human cancer cell line (HeLa)
- Data consists of 20 genes over 20 time steps
Table 4: Recovered sparse network of HeLa as adjacency matrix

21 Conclusion
- The Lasso-VAR technique models stable gene regulatory networks from steady-state data
- It naturally allows modeling of networks that are sparse, causal and contain feedback loops, without a priori knowledge of the network structure
- The formulation of the proposed algorithm allows for scalability to large networks
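The Lasso-VAR technique named above is not spelled out in the transcript, so the following is only a minimal sketch of one common formulation: a first-order vector autoregression x[t+1] ≈ A x[t] fitted with an L1 penalty by proximal gradient descent (ISTA). The function names, the solver, the scaling of the penalty `lam`, and the simulated demo data are all assumptions, and the sketch does not impose the stability constraint that the slides mention.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(X, A, lam):
    """Penalised least-squares objective for a VAR(1) model x[t+1] ~ A @ x[t]."""
    Z, Y = X[:-1], X[1:]
    m = Z.shape[0]
    return 0.5 * np.sum((Y - Z @ A.T) ** 2) / m + lam * np.sum(np.abs(A))

def lasso_var1(X, lam, n_iter=500):
    """Estimate a sparse VAR(1) matrix from X (rows = successive observations) with ISTA."""
    Z, Y = X[:-1], X[1:]
    m, n = Z.shape
    # Lipschitz constant of the gradient of the smooth part; step 1/L guarantees descent.
    L = np.linalg.norm(Z, 2) ** 2 / m
    step = 1.0 / L
    B = np.zeros((n, n))  # B holds A transposed
    for _ in range(n_iter):
        grad = Z.T @ (Z @ B - Y) / m
        B = soft_threshold(B - step * grad, step * lam)
    return B.T

# Simulated demo data from a hypothetical sparse 3-gene system (not from the slides).
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.0, 0.0],
                   [0.3, 0.4, 0.0],
                   [0.0, -0.2, 0.6]])
X_demo = np.zeros((200, 3))
X_demo[0] = rng.normal(size=3)
for t in range(199):
    X_demo[t + 1] = A_true @ X_demo[t] + 0.1 * rng.normal(size=3)

A_hat = lasso_var1(X_demo, lam=0.01)
print(np.round(A_hat, 3))
```

Larger values of `lam` drive more entries of the estimate to exactly zero. Because the penalty here is scaled by the number of samples, the value 0.4 quoted on the results slide need not correspond to the same setting of `lam`.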

22 Future Work

23 References
[1] L. E. Chai, S. K. Loh, S. T. Low, M. S. Mohamad, S. Deris, and Z. Zakaria, “A review on the computational approaches for gene regulatory network construction,” Computers in Bio and Med, vol. 48, pp. 55–65, May 2014.
[2] T. S. Gardner, D. Di Bernardo, D. Lorenz, and J. J. Collins, “Inferring genetic networks and identifying compound mode of action via expression profiling,” Science, vol. 301, no. 5629, pp. 102–105, 2003.
[3] M. M. Zavlanos, A. A. Julius, S. P. Boyd, and G. J. Pappas, “Inferring stable genetic networks from steady-state data,” Automatica, vol. 47, no. 6, pp. 1113–1122, 2011.
[4] M. M. Kordmahalleh, M. G. Sefidmazgi, A. Homaifar, A. Karimoddini, A. Guiseppi-Elie, and J. L. Graves, “Delayed and hidden variables interactions in gene regulatory networks,” in 2014 IEEE BIBE, Nov. 2014, pp. 23–29.
[5] A. Fujita, J. R. Sato, H. M. Garay-Malpartida, R. Yamaguchi, S. Miyano, M. C. Sogayar, and C. E. Ferreira, “Modeling gene expression regulatory networks with the sparse vector autoregressive model,” BMC Sys Bio, vol. 1, no. 1, p. 39, Aug. 2007.
[6] G. Michailidis and F. d’Alché-Buc, “Autoregressive models for gene regulatory network inference: Sparsity, stability and causality issues,” Mathematical Biosciences, vol. 246, no. 2, pp. 326–334, 2013.
[7] C. Sima, J. Hua, and S. Jung, “Inference of gene regulatory networks using time-series data: A survey,” Curr Gen, vol. 10, no. 6, pp. 416–429, Sep. 2009.
[8] W. Nicholson, D. Matteson, and J. Bien, “Structured regularization for large vector autoregression,” Ph.D. dissertation, Cornell University, Sep. 2014.
[9] H. Lütkepohl, New introduction to multiple time series analysis. Springer, 2007.

24 Any questions?

