Presentation on theme: "Motif Mining from Gene Regulatory Networks"— Presentation transcript:
1 Motif Mining from Gene Regulatory Networks. Based on the publications of Uri Alon’s group… Presented by Pavlos Pavlidis, Tartu University, December 2005
2 Gene Regulatory Networks. From Wikipedia: A gene regulatory network is a collection of DNA segments in a cell which interact with each other and with other substances in the cell, thereby governing the rates at which genes in the network are transcribed into mRNA. From DOE: Gene regulatory networks (GRNs) are the on-off switches and rheostats… dynamically orchestrate the level of expression for each gene….
3 Why can networks regulate gene expression? U. Alon and his group stress the importance of the building blocks of the network. These building blocks are called motifs.
4 Motifs. They are also called n-node subgraphs in a directed graph (the work has also been extended to undirected graphs). They are characterized by the number n of nodes and by the relations between them, i.e. the directed edges.
6 Feed-Forward Loop. It rapidly regulates the production of Z.
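To make the feed-forward loop concrete, here is a minimal sketch that counts it in a small directed graph. It assumes the usual three-node definition (X regulates Y, and both X and Y regulate Z), which the slide does not spell out. The function name `count_feed_forward_loops` and the edge-list input format are illustrative choices, not something taken from the presentation.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count node triples whose only edges are x->y, x->z and y->z.

    `edges` is an iterable of (source, target) pairs. Self-loops are ignored.
    A triple counts only if no other edge exists among the three nodes,
    so each set of three nodes is counted at most once.
    """
    edge_set = {(u, v) for u, v in edges if u != v}
    nodes = list({n for e in edge_set for n in e})
    count = 0
    for x, y, z in permutations(nodes, 3):
        wanted = {(x, y), (x, z), (y, z)}
        # All edges that actually exist among the three chosen nodes.
        present = {(a, b) for a in (x, y, z) for b in (x, y, z)
                   if (a, b) in edge_set}
        if present == wanted:
            count += 1
    return count


ffl = [("X", "Y"), ("X", "Z"), ("Y", "Z")]
cycle = [("X", "Y"), ("Y", "Z"), ("Z", "X")]
print(count_feed_forward_loops(ffl), count_feed_forward_loops(cycle))
```

The brute-force loop over all ordered triples is cubic in the number of nodes, which is fine for illustration but too slow for a genome-scale network.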
7 Which motifs are they interested in? Not the biologically significant ones: they don’t know a priori whether a motif is biologically significant. What they can calculate is statistical significance. The probability P that a randomized network contains the same number or more instances of a particular motif must be smaller than a cutoff. Here the cutoff is 0.01.
8 Randomized Network. A randomized network is not completely random. It keeps some properties of the real network: it has the same number of nodes as the real network, and for each node the numbers of incoming and outgoing edges equal those in the real network.
9 Representation of the network as a matrix M. Randomization: randomly select two cells that are 1, e.g. A(1,3) and B(2,1). If A’(1,1) and B’(2,3) are 0, then swap, i.e. set A and B to 0 and A’ and B’ to 1. Goal: the randomized network must have the same column and row sums as the real network. Columns: the number of outgoing edges. Rows: the number of incoming edges.
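The swap on the matrix M can be sketched as follows. This is a minimal illustration rather than the group's actual implementation. The function name `randomize_matrix`, the number of swap attempts and the `seed` parameter are assumptions made for the example. As in the slide's own example, a swap is allowed to create a diagonal entry (a self-edge).

```python
import random


def randomize_matrix(matrix, n_swaps, seed=None):
    """Return a copy of a 0/1 matrix randomized by row- and column-sum-preserving swaps."""
    rng = random.Random(seed)
    m = [row[:] for row in matrix]
    # Positions of all cells that are currently 1.
    ones = [(r, c) for r, row in enumerate(m) for c, v in enumerate(row) if v == 1]
    if len(ones) < 2:
        return m
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(ones)), 2)
        (r1, c1), (r2, c2) = ones[i], ones[j]
        # A = (r1, c1), B = (r2, c2); A' = (r1, c2), B' = (r2, c1).
        # If A and B share a row or a column, A' or B' is already 1 and the
        # attempt is skipped automatically.
        if m[r1][c2] == 0 and m[r2][c1] == 0:
            m[r1][c1] = m[r2][c2] = 0
            m[r1][c2] = m[r2][c1] = 1
            ones[i], ones[j] = (r1, c2), (r2, c1)
    return m


real = [[0, 0, 1],
        [1, 0, 0],
        [0, 1, 0]]
rand = randomize_matrix(real, n_swaps=100, seed=1)
print([sum(row) for row in rand], [sum(col) for col in zip(*rand)])
```

Each accepted swap removes one 1 and adds one 1 in each of the two rows and two columns involved, so every row sum and column sum stays the same as in the real network.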
10 One more requirement: If we are looking for n-node subgraphs, then the number of (n-1)-node subgraphs must be the same in the real and the randomized networks. This is done to avoid assigning high significance to a structure only because it includes a highly significant substructure.
11 Significance of a motif. Three requirements: (1) P < 0.01, where P was estimated (or bounded) by using 1000 randomized networks. (2) The number of times it appears in the real network with distinct sets of nodes is at least U = 4. (3) The number of appearances in the real network is significantly larger than in the randomized networks: Nreal - Nrand > 0.1 Nrand (Why??).
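The three requirements can be combined into one check, sketched below under stated assumptions. P is estimated as the fraction of randomized networks with at least as many appearances as the real one, and Nrand is taken to be the mean count over the randomized networks (the slide does not say how Nrand is defined). The function name `is_motif` and its parameter names are hypothetical.

```python
def is_motif(n_real, rand_counts, n_distinct,
             p_cutoff=0.01, min_distinct=4, min_excess=0.1):
    """Apply the three significance requirements to one subgraph type.

    n_real      -- appearances in the real network
    rand_counts -- appearances in each randomized network (e.g. 1000 values)
    n_distinct  -- appearances in the real network with distinct sets of nodes
    """
    n_networks = len(rand_counts)
    # Assumption: Nrand is the mean count over the randomized networks.
    n_rand = sum(rand_counts) / n_networks
    # Empirical estimate of P: share of randomized networks with >= n_real appearances.
    p_value = sum(1 for c in rand_counts if c >= n_real) / n_networks
    return (p_value < p_cutoff
            and n_distinct >= min_distinct
            and n_real - n_rand > min_excess * n_rand)


# Made-up counts, purely for illustration.
print(is_motif(n_real=40, rand_counts=[5] * 1000, n_distinct=10))
```

With only 1000 randomized networks, the empirical estimate cannot resolve values of P below 0.001, which may be why the slide says "estimated (or bounded)".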
13 What did they find? In biological systems such as E. coli or S. cerevisiae, only certain types of motifs are statistically significant. When they studied other systems, such as food webs (a database of seven ecosystem food webs), neuronal networks (the neural system of C. elegans) and the WWW, other kinds of motifs were statistically significant.