1 Part 1: Biological Networks
1. Protein-protein interaction networks
2. Regulatory networks
3. Expression networks
4. Metabolic networks
5. … more biological networks
6. Other types of networks

2 Expression networks [Qian et al., J. Mol. Biol., 314:1053-1066]

3 Regulatory networks [Horak et al., Genes & Development, 16:3017-3033]

4 Expression networks; Regulatory networks

5 Expression networks; Regulatory networks; Interaction networks

6 Metabolic networks [DeRisi, Iyer, and Brown, Science, 278:680-686]

7 Expression networks; Regulatory networks; Interaction networks; Metabolic networks

8 … more biological networks: Hierarchies & DAGs [Enzyme, Bairoch; GO, Ashburner; MIPS, Mewes & Frishman]

9 … more biological networks: Neural networks [Cajal]; Gene order networks; Genetic interaction networks [Boone]

10 Other types of networks: Disease spread [Krebs]; Social networks; Food webs; Electronic circuits; Internet [Burch & Cheswick]

11 Part 2: Graphs, Networks
Graph definition
Topological properties of graphs:
- Degree of a node
- Clustering coefficient
- Characteristic path length
Random networks
Small World networks
Scale Free networks

12 Graph: a pair of sets G = {P, E}, where P is a set of nodes and E is a set of edges, each connecting two elements of P. Graphs can be directed or undirected. Large, complex networks are ubiquitous in the world:
- Genetic networks
- Nervous system
- Social interactions
- World Wide Web
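
As a minimal sketch (not part of the original slides), such a graph can be stored in Python as an adjacency dictionary mapping each node to its neighbour set; the node labels and edge list below are invented for the example:

```python
from collections import defaultdict

def build_graph(edges):
    """Store an undirected graph as a dict mapping each node to its neighbour set."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)  # omit this line for a directed graph
    return adj

# Toy edge list, invented for the example.
G = build_graph([(0, 1), (0, 2), (1, 2), (2, 3)])
print(len(G[2]))  # degree of node 2 -> 3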

13 Degree of a node: the number of edges incident on the node. In the example figure, the degree of node i is 5.

14 Clustering coefficient → a LOCAL property. The clustering coefficient of node i is the ratio of the number of edges that exist among its neighbours to the number of edges that could exist: C_i = 2 E_i / (k_i (k_i − 1)), where k_i is the degree of node i and E_i is the number of edges among its neighbours. In the example figure, the clustering coefficient of node i is 1/6. The clustering coefficient C of the entire network is the average of C_i over all nodes i.
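
Continuing the adjacency-dictionary sketch above (again an illustration, not the slides' own code), both definitions translate directly:

```python
def clustering_coefficient(adj, i):
    """C_i = 2*E_i / (k_i*(k_i - 1)): edges among i's neighbours over possible edges."""
    neighbours = list(adj[i])
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention: C_i is undefined for k < 2, treated here as 0
    e = sum(1 for a in range(k) for b in range(a + 1, k)
            if neighbours[b] in adj[neighbours[a]])
    return 2 * e / (k * (k - 1))

def average_clustering(adj):
    """Network clustering coefficient C: the average of C_i over all nodes."""
    return sum(clustering_coefficient(adj, i) for i in adj) / len(adj)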

15 Characteristic path length → a GLOBAL property. The path length d(i, j) is the number of edges in the shortest path between vertices i and j. The characteristic path length L of a graph is the average of d(i, j) over every possible pair (i, j). Networks with small values of L are said to have the "small world property".
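
A hedged sketch of d(i, j) and L using breadth-first search on the same adjacency-dictionary representation; averaging over all pairs assumes the graph is connected:

```python
from collections import deque

def path_length(adj, i, j):
    """d(i, j): number of edges on a shortest path from i to j, via breadth-first search."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        if u == j:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return float("inf")  # no path: i and j lie in different components

def characteristic_path_length(adj):
    """L: the average of d(i, j) over every possible pair (assumes a connected graph)."""
    nodes = list(adj)
    pairs = [(a, b) for n, a in enumerate(nodes) for b in nodes[n + 1:]]
    return sum(path_length(adj, a, b) for a, b in pairs) / len(pairs)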

16 Models for networks of complex topology: Erdős-Rényi (1960), Watts-Strogatz (1998), Barabási-Albert (1999)

17 The Erdős-Rényi [ER] model (1960)
Start with N vertices and no edges; connect each pair of vertices with probability p_ER.
Important result: many properties of these graphs appear quite suddenly, at a threshold value of p_ER(N):
- If p_ER ~ c/N with c < 1, then almost all vertices belong to isolated trees.
- Cycles of all orders appear at p_ER ~ 1/N.
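
A minimal sketch of the ER construction (the function name, parameters, and the mean-degree check are mine, not the slides'):

```python
import random

def erdos_renyi(n, p, seed=None):
    """Start with n vertices and no edges; connect each pair with probability p."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

# Near the threshold p ~ c/N, compare the mean degree below and above c = 1.
for c in (0.5, 2.0):
    g = erdos_renyi(1000, c / 1000, seed=0)
    print(c, sum(len(g[u]) for u in g) / len(g))  # mean degree ≈ c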

18 The Watts-Strogatz [WS] model (1998)
Start with a regular network with N vertices; rewire each edge with probability p.
For p = 0 (regular networks): high clustering coefficient, high characteristic path length.
For p = 1 (random networks): low clustering coefficient, low characteristic path length.
QUESTION: what happens for intermediate values of p?
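
A rough sketch of the WS construction on the same adjacency-dictionary representation; the rewiring bookkeeping is simplified relative to the original paper:

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Ring lattice (each vertex linked to its k nearest neighbours),
    then each clockwise edge is rewired with probability p."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u in range(n):                          # build the regular ring lattice
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    for u in range(n):                          # rewire each clockwise edge
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            if v in adj[u] and rng.random() < p:
                choices = [w for w in range(n) if w != u and w not in adj[u]]
                if choices:
                    w = rng.choice(choices)
                    adj[u].remove(v)
                    adj[v].remove(u)
                    adj[u].add(w)
                    adj[w].add(u)
    return adj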

19 1) There is a broad interval of p for which L is small but C remains large. 2) Small world networks are common.

20 The Barabási-Albert [BA] model (1999). Look at the distribution of degrees. [Figure panels: ER model, WS model, actors, power grid, WWW.] In the ER and WS models, the probability of finding a highly connected node decreases exponentially with k.

21 Two problems with the previous models: 1. N does not vary; 2. the probability that two vertices are connected is uniform. The BA model addresses both:
GROWTH: starting with a small number of vertices m_0, at every timestep add a new vertex with m ≤ m_0 edges.
PREFERENTIAL ATTACHMENT: the probability Π that a new vertex will be connected to vertex i depends on the connectivity k_i of that vertex: Π(k_i) = k_i / Σ_j k_j.
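
A sketch of the BA construction under these two rules; the repeated-endpoints list for sampling proportionally to degree is a common implementation trick, not something the slides prescribe:

```python
import random

def barabasi_albert(n, m, seed=None):
    """Growth + preferential attachment: Π(k_i) = k_i / Σ_j k_j."""
    rng = random.Random(seed)
    # Seed graph: m + 1 fully connected vertices, so every vertex has degree m.
    adj = {u: set() for u in range(m + 1)}
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
    # A vertex appearing k_u times in this list is drawn with probability k_u / Σ k.
    endpoints = [u for u in adj for _ in adj[u]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:          # m distinct neighbours for the new vertex
            targets.add(rng.choice(endpoints))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            endpoints.extend((new, t))
    return adj

g = barabasi_albert(10000, 3, seed=0)
degrees = sorted((len(g[u]) for u in g), reverse=True)
print(degrees[:5])  # a few heavily connected hubs, the scale-free signature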

22 a) Connectivity distribution with N = m_0 + t = 300,000, for m_0 = m = 1 (circles), m_0 = m = 3 (squares), m_0 = m = 5 (diamonds), and m_0 = m = 7 (triangles). b) P(k) for m_0 = m = 5 and system sizes N = 100,000 (circles), N = 150,000 (squares), and N = 200,000 (diamonds). → Scale Free Networks

23 Part 3: Machine Learning
Artificial Intelligence / Machine Learning
Definition of learning
3 types of learning:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
Classification problems, regression problems
Occam's razor
Estimating generalization
Some important topics:
1. Naïve Bayes
2. Probability density estimation
3. Linear discriminants
4. Non-linear discriminants (Decision Trees, Support Vector Machines)

27 Classification problems
PROBLEM: we are given an observation and we have to decide whether it is an a or a b.
Bayes' rule: minimum classification error is achieved by selecting the class with the largest posterior probability.

28 Regression problems
PROBLEM: we are only given the red points, and we would like to approximate the blue curve (e.g. with polynomial functions).
QUESTION: which solution should I pick? And why?
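
As a hedged illustration of this model-selection question (the curve, the noise level, and the polynomial degrees are all invented for the example), one can compare training error across polynomial fits:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 15)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, x.size)   # the "red points"

for degree in (1, 3, 9):
    coeffs = np.polyfit(x, y, degree)                     # least-squares fit
    err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: training error {err:.4f}")
# Training error always shrinks as the degree grows; Occam's razor says to
# prefer the simplest polynomial whose error is already acceptable.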

32 Naïve Bayes. Example: given a set of features for each gene, predict whether it is essential.

         F1   F2     F3   …   Fn      TARGET
Gene 1   1    1.34   1    …   2.23    1
Gene 2   0    4.24   44   …   2.3     1
Gene 3   1    3.59   34   …   34.42   0
Gene 4   1    0.001  64   …   24.3    0
Gene 5   0    6.87   6    …   6.5     0
…        …    …      …    …   …       …
Gene n   1    4.56   72   …   5.3     1

33 Bayes' rule: select the class with the highest posterior probability. For a problem with two classes this becomes: if P(C_1 | x) > P(C_2 | x), then choose class C_1; otherwise, choose class C_2.

34 Naïve Bayes approximation: P(f_1, …, f_n | C) ≈ Π_i P(f_i | C). For a two-class problem the decision rule becomes: choose class C_1 if Π_i L_i > P(C_2) / P(C_1), where L_i = P(f_i | C_1) / P(f_i | C_2) is called the likelihood ratio for feature i.
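
A small sketch of this decision rule with two discrete features; all the conditional probabilities below are invented for the illustration:

```python
import math

# Invented tables P(f_i = value | class) for two binary features;
# class 1 might be "interacting", class 2 "not interacting".
likelihoods = [
    {1: (0.8, 0.3), 0: (0.2, 0.7)},   # feature 1, e.g. co-expression
    {1: (0.7, 0.4), 0: (0.3, 0.6)},   # feature 2, e.g. same function
]

def log_lr(observation):
    """Naive Bayes: log LR = Σ_i log [ P(f_i | C1) / P(f_i | C2) ]."""
    return sum(math.log(likelihoods[i][f][0] / likelihoods[i][f][1])
               for i, f in enumerate(observation))

# Choose C1 when log LR exceeds log [ P(C2) / P(C1) ] (0 for equal priors).
print(log_lr([1, 1]))   # both features present: evidence in favour of C1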

35 Probability density estimation: assume a certain probabilistic model for each class; learn the parameters for each model (EM algorithm).

36 Linear discriminants: assume a specific functional form for the discriminant function; learn its parameters.

37 Decision Trees (C4.5, CART). ISSUES: how to choose the "best" attribute; how to prune the tree. Trees can be converted into rules!
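
A hedged sketch using scikit-learn's CART implementation (an assumption: the slides do not name a library); export_text prints the fitted tree as readable rules, and capping the depth is one crude answer to the pruning issue:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[0, 1.2], [1, 0.3], [1, 2.5], [0, 0.1]]           # toy feature vectors
y = [0, 1, 1, 0]                                        # toy class labels
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)    # depth cap as crude pruning
print(export_text(tree, feature_names=["f1", "f2"]))    # the tree printed as rules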

38 Part 4: Network Predictions. Naïve Bayes for inferring protein-protein interactions.

39 Network Gold-Standards: the data [Jansen, Yu, et al., Science; Yu, et al., Genome Res.]. [Figure: protein pairs plotted by Feature 1 (e.g. co-expression) against Feature 2 (e.g. same function), each pair labelled gold-standard + or gold-standard −.]

40 Network Gold-Standards. Likelihood ratio for feature i: L_i = P(feature i | gold-standard +) / P(feature i | gold-standard −), i.e. the fraction of gold-standard + pairs with the feature divided by the fraction of gold-standard − pairs with it.

41 Network Gold-Standards. Worked example: Feature 1 (co-expression) is present in 4 of the 4 gold-standard + pairs and in 3 of the 6 gold-standard − pairs, so L_1 = (4/4) / (3/6) = 2.

46 Network Gold-Standards. L_1 = (4/4)/(3/6) = 2; similarly, Feature 2 (same function) gives L_2 = (3/4)/(3/6) = 1.5. For each protein pair the evidence combines multiplicatively: LR = L_1 × L_2, so log(LR) = log(L_1) + log(L_2).
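
A sketch reproducing this toy computation in code; the ten binary feature vectors are chosen to match the counts on the slides (4 gold-standard + pairs, 6 gold-standard − pairs):

```python
def likelihood_ratio(feature, labels):
    """L_i = P(feature present | +) / P(feature present | -)."""
    pos = [f for f, l in zip(feature, labels) if l == "+"]
    neg = [f for f, l in zip(feature, labels) if l == "-"]
    return (sum(pos) / len(pos)) / (sum(neg) / len(neg))

labels = ["+"] * 4 + ["-"] * 6
f1 = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]   # co-expression: present in 4/4 vs 3/6
f2 = [1, 1, 1, 0, 1, 1, 1, 0, 0, 0]   # same function: present in 3/4 vs 3/6
print(likelihood_ratio(f1, labels))    # L1 = 2.0
print(likelihood_ratio(f2, labels))    # L2 = 1.5
print(likelihood_ratio(f1, labels) * likelihood_ratio(f2, labels))  # LR = 3.0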

48 1. Individual features are weak predictors (LR ~ 10). 2. Bayesian integration is much more powerful: at an LR cutoff of 600, ~9000 interactions are predicted.

