Bayesian integration of biological prior knowledge into the reconstruction of gene regulatory networks Dirk Husmeier Adriano V. Werhli.

1 Bayesian integration of biological prior knowledge into the reconstruction of gene regulatory networks Dirk Husmeier Adriano V. Werhli

6 Learning Bayesian networks from data and prior knowledge

7 Bayesian networks Marriage between graph theory and probability theory. Directed acyclic graph (DAG) representing conditional independence relations. It is possible to score a network in light of the data. We can infer how well a particular network explains the observed data. [Figure: example graph with nodes A to F, labelled NODES and EDGES.]

8 Bayesian networks versus causal networks [Figure: the true causal graph over nodes A, B and C, and the graph obtained when node A is unknown.]

9 Bayesian networks versus causal networks Equivalence classes: networks with the same scores. Equivalent networks cannot be distinguished in light of the data. We can only learn the undirected graph, unless we use interventions or prior knowledge. [Figure: equivalent graphs over nodes A, B and C.]

10 Learning Bayesian networks from data P(M|D) = P(D|M) P(M) / Z

15 Use TF binding motifs in promoter sequences

16 Biological prior knowledge matrix Each entry indicates some knowledge about the relationship between genes i and j.

17 Biological prior knowledge matrix Each entry indicates some knowledge about the relationship between genes i and j. Define the energy of a graph G.
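
The energy formula itself did not survive in this transcript. As a minimal sketch, assume the energy is the summed absolute difference between the prior knowledge matrix and the graph's adjacency matrix, with prior entries in [0, 1] and 0.5 meaning "no knowledge". The names `energy`, `prior` and `graph`, and the choice to skip the diagonal, are assumptions made for illustration.

```python
def energy(graph, prior):
    """Assumed energy: sum over ordered pairs i != j of |prior[i][j] - graph[i][j]|.

    graph[i][j] is 1 if the edge i -> j is present and 0 otherwise.
    prior[i][j] is a belief in [0, 1] that the edge i -> j exists.
    """
    n = len(graph)
    return sum(abs(prior[i][j] - graph[i][j])
               for i in range(n) for j in range(n) if i != j)


# Hypothetical 3-gene example: the prior favours 0 -> 1 and disfavours 1 -> 2.
prior = [[0.5, 0.9, 0.5],
         [0.5, 0.5, 0.1],
         [0.5, 0.5, 0.5]]
empty_graph = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
graph_with_edge = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]

# A graph that agrees with the prior gets a lower energy than one that does not.
print(energy(graph_with_edge, prior), energy(empty_graph, prior))
```

Under this assumption, a low energy means the graph agrees well with the prior knowledge, and entries equal to 0.5 contribute the same amount whether or not the edge is present.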

18 Prior distribution over networks Energy of a network
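
The formulas on this slide were lost in the transcript. A Gibbs-type prior consistent with the slide's "energy of a network" wording could be written as below. The notation is assumed here, not taken from the slide: B is the prior knowledge matrix with entries in [0, 1], G is the binary adjacency matrix of the network, β is the trade-off hyperparameter, and Z is the partition function.

```latex
E(G) = \sum_{i \neq j} \left| B_{ij} - G_{ij} \right|, \qquad
P(G \mid \beta) = \frac{e^{-\beta E(G)}}{Z(\beta)}, \qquad
Z(\beta) = \sum_{G'} e^{-\beta E(G')}
```

With β = 0 this prior is flat over networks, and larger β puts more weight on networks that agree with B.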

19 Sample networks and hyperparameters from the posterior distribution. Capture intrinsic inference uncertainty. Learn the trade-off parameters automatically. P(M|D) = P(D|M) P(M) / Z

20 Prior distribution over networks Energy of a network

21 Rewriting the energy Energy of a network

22 Approximation of the partition function
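
The approximation on this slide is not preserved in the transcript. One way such an approximation can work, sketched here under stated assumptions, is to ignore the acyclicity constraint so that the sum over graphs factorises into independent sums over each node's parent sets, limited by a maximum fan-in. The function names, the fan-in limit of 3 and the per-node energy are all assumptions for illustration.

```python
import math
from itertools import combinations


def local_energy(node, parents, prior):
    """Assumed per-node energy: disagreement between the prior and one parent set."""
    total = 0.0
    for m in range(len(prior)):
        if m == node:
            continue
        # A present edge m -> node costs 1 - prior; an absent edge costs prior.
        total += (1.0 - prior[m][node]) if m in parents else prior[m][node]
    return total


def log_partition_approx(prior, beta, max_fan_in=3):
    """Approximate log Z(beta) as a sum over nodes of log-sum-exp over parent sets."""
    n = len(prior)
    log_z = 0.0
    for node in range(n):
        candidates = [m for m in range(n) if m != node]
        terms = []
        for k in range(min(max_fan_in, len(candidates)) + 1):
            for parents in combinations(candidates, k):
                terms.append(-beta * local_energy(node, set(parents), prior))
        top = max(terms)  # subtract the maximum for numerical stability
        log_z += top + math.log(sum(math.exp(t - top) for t in terms))
    return log_z


# Hypothetical uninformative prior over 3 genes (all entries 0.5).
flat_prior = [[0.5] * 3 for _ in range(3)]
print(log_partition_approx(flat_prior, beta=2.0))
```

Because cyclic graphs are included in the sum, this is an approximation to the true partition function rather than the exact value; the cost grows polynomially with the number of nodes for a fixed fan-in instead of super-exponentially.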

23 Multiple sources of prior knowledge

24 Rewriting the energy Energy of a network

25 Approximation of the partition function

26 MCMC sampling scheme

27 Sample networks and hyperparameters from the posterior distribution Metropolis-Hastings scheme Proposal probabilities

28 MCMC with one prior Sample the graph and the parameter β. Split the update into two moves to improve the acceptance rate: 1. Sample the graph with β fixed. 2. Sample β with the graph fixed.

29 MCMC with one prior Sample the graph and the parameter β. Scores: BGe, BDe. Split the update into two moves to improve the acceptance rate: 1. Sample the graph with β fixed. 2. Sample β with the graph fixed.
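
The two alternating moves can be sketched as a toy Metropolis-Hastings sampler. This is an illustration under assumptions, not the authors' implementation: it assumes an absolute-difference energy, a per-node approximation of the partition function that ignores acyclicity, single-edge flips as graph proposals, a flat hyperprior on β over [0, `beta_max`], and a user-supplied `log_likelihood` standing in for a BGe or BDe score. All names and default values are hypothetical.

```python
import math
import random
from itertools import combinations


def energy(graph, prior):
    n = len(graph)
    return sum(abs(prior[i][j] - graph[i][j])
               for i in range(n) for j in range(n) if i != j)


def log_z(prior, beta, max_fan_in=3):
    """Per-node approximation of log Z(beta), ignoring acyclicity (an assumption)."""
    n, total = len(prior), 0.0
    for node in range(n):
        others = [m for m in range(n) if m != node]
        terms = []
        for k in range(min(max_fan_in, len(others)) + 1):
            for parents in combinations(others, k):
                e = sum((1.0 - prior[m][node]) if m in parents else prior[m][node]
                        for m in others)
                terms.append(-beta * e)
        top = max(terms)
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total


def is_acyclic(graph):
    """Repeatedly strip nodes that have no incoming edge from the remaining nodes."""
    remaining = set(range(len(graph)))
    while remaining:
        roots = [v for v in remaining if not any(graph[u][v] for u in remaining)]
        if not roots:
            return False
        remaining -= set(roots)
    return True


def accept(rng, log_ratio):
    return rng.random() < math.exp(min(0.0, log_ratio))


def sample(prior, log_likelihood, steps, beta_max=30.0, width=3.0,
           max_fan_in=3, seed=0):
    rng = random.Random(seed)
    n = len(prior)
    graph = [[0] * n for _ in range(n)]
    beta = rng.uniform(0.0, beta_max)
    trace = []
    for _ in range(steps):
        # Move 1: propose flipping one edge with beta fixed; Z(beta) cancels.
        i, j = rng.sample(range(n), 2)
        proposal = [row[:] for row in graph]
        proposal[i][j] = 1 - proposal[i][j]
        fan_in_ok = sum(row[j] for row in proposal) <= max_fan_in
        if fan_in_ok and is_acyclic(proposal):
            log_ratio = (log_likelihood(proposal) - log_likelihood(graph)
                         - beta * (energy(proposal, prior) - energy(graph, prior)))
            if accept(rng, log_ratio):
                graph = proposal
        # Move 2: propose a new beta with the graph fixed; the likelihood cancels.
        new_beta = beta + rng.uniform(-width, width)
        if 0.0 <= new_beta <= beta_max:
            log_ratio = (-(new_beta - beta) * energy(graph, prior)
                         + log_z(prior, beta) - log_z(prior, new_beta))
            if accept(rng, log_ratio):
                beta = new_beta
        trace.append(([row[:] for row in graph], beta))
    return trace


def flat_log_likelihood(graph):
    """Placeholder score; a real run would use a BGe or BDe marginal likelihood."""
    return 0.0


prior = [[0.5, 0.9, 0.5], [0.5, 0.5, 0.9], [0.5, 0.5, 0.5]]
trace = sample(prior, flat_log_likelihood, steps=200)
```

Splitting the update means each acceptance ratio involves only one changed quantity: the partition function drops out of the graph move, and the likelihood drops out of the β move.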

33 Approximation of the partition function

34 MCMC with two priors Sample the graph and the parameters β1 and β2. Split the update into three moves to improve the acceptance rate: 1. Sample the graph with β1 and β2 fixed. 2. Sample β1 with the graph and β2 fixed. 3. Sample β2 with the graph and β1 fixed.

35 Bayesian networks with biological prior knowledge Biological prior knowledge: information about the interactions between the nodes. We use two distinct sources of biological prior knowledge. Each source is associated with its own trade-off parameter: β1 and β2. The trade-off parameter indicates how much biological prior information is used. The trade-off parameters are inferred. They are not set by the user!
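
With two sources, a natural extension (assumed here, since the slide's formulas are not in the transcript) is to weight each source's energy by its own hyperparameter. The sketch below computes the log prior and the log acceptance ratio for changing one β while the graph and the other β stay fixed. It assumes a flat hyperprior, a symmetric proposal, and a per-node approximation of the partition function; every name is hypothetical.

```python
import math
from itertools import combinations


def energy(graph, prior):
    n = len(graph)
    return sum(abs(prior[i][j] - graph[i][j])
               for i in range(n) for j in range(n) if i != j)


def log_z(priors, betas, max_fan_in=3):
    """Per-node approximation of log Z(beta_1, beta_2, ...), ignoring acyclicity."""
    n, total = len(priors[0]), 0.0
    for node in range(n):
        others = [m for m in range(n) if m != node]
        terms = []
        for k in range(min(max_fan_in, len(others)) + 1):
            for parents in combinations(others, k):
                weighted = 0.0
                for prior, beta in zip(priors, betas):
                    local = sum((1.0 - prior[m][node]) if m in parents
                                else prior[m][node] for m in others)
                    weighted += beta * local
                terms.append(-weighted)
        top = max(terms)
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total


def log_prior(graph, priors, betas):
    weighted = sum(b * energy(graph, p) for p, b in zip(priors, betas))
    return -weighted - log_z(priors, betas)


def log_accept_beta(graph, priors, betas, k, new_value):
    """Log acceptance ratio for moving betas[k] to new_value, all else fixed."""
    proposed = list(betas)
    proposed[k] = new_value
    return log_prior(graph, priors, proposed) - log_prior(graph, priors, betas)


# Hypothetical sources: one informative matrix and one uninformative (all 0.5).
informative = [[0.5, 0.9, 0.5], [0.5, 0.5, 0.9], [0.5, 0.5, 0.5]]
flat = [[0.5] * 3 for _ in range(3)]
graph = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # agrees with the informative source

print(log_accept_beta(graph, [informative, flat], [1.0, 1.0], 0, 2.0))
print(log_accept_beta(graph, [informative, flat], [1.0, 1.0], 1, 5.0))
```

In this toy setting, raising the weight of the source that agrees with the graph is favoured, while the weight of the all-0.5 source has no effect on the prior at all, so its β is left unconstrained.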

36 Bayesian networks with two sources of prior [Diagram: the data and two sources of prior knowledge, Source 1 (β1) and Source 2 (β2), feed into BNs + MCMC, which returns the recovered networks and trade-off parameters.]

39 Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?

40 Application to the Raf regulatory network

41 Raf regulatory network From Sachs et al., Science 2005

42 Evaluation: Raf signalling pathway Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells. Deregulation → carcinogenesis. Extensively studied in the literature → gold standard network.

43 Data Prior knowledge

44 Flow cytometry data and KEGG Intracellular multicolour flow cytometry. Measured protein concentrations. 11 proteins: 1200 concentration profiles. We sample 5 separate subsets with 100 concentration profiles each.

45 Microarray example Spellman et al. (1998): cell cycle, 73 samples. Tu et al. (2005): metabolic cycle, 36 samples. [Figure axes: genes, time.]

46 Data Prior knowledge

47 Flow cytometry data and KEGG KEGG PATHWAYS are a collection of manually drawn pathway maps representing our knowledge of molecular interactions and reaction networks. http://www.genome.jp/kegg/

48 Prior knowledge from KEGG

49 Prior distribution

50 The data and the priors + KEGG + Random

51 Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?

52 Bayesian networks with two sources of prior [Diagram: the data and two sources of prior knowledge, Source 1 (β1) and Source 2 (β2), feed into BNs + MCMC, which returns the recovered networks and trade-off parameters.]

54 Sampled values of the hyperparameters

55 Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?

56 How to compare the recovered networks?

57 Performance evaluation Threshold the predicted network and compare it with the true network.

58 Performance evaluation Compare the predicted network with the true network by counting true positives and false positives.

59 Performance evaluation Compare the predicted network with the true network by counting true positives and false positives. DGE: consider edge directions. UGE: discard the edge directions.
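
The thresholding and counting steps can be sketched as follows, assuming the predicted network is summarised by marginal edge posterior probabilities estimated as the fraction of sampled graphs that contain each edge. The function names and the toy graphs are hypothetical; `directed=True` corresponds to DGE and `directed=False` to UGE.

```python
def edge_posteriors(samples, directed=True):
    """Fraction of sampled graphs containing each edge (direction optional)."""
    n = len(samples[0])
    probs = [[0.0] * n for _ in range(n)]
    for g in samples:
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                # For UGE an edge counts if it is present in either direction.
                present = g[i][j] if directed else max(g[i][j], g[j][i])
                probs[i][j] += present / len(samples)
    return probs


def count_tp_fp(probs, truth, threshold, directed=True):
    """Count true and false positives among edges whose probability exceeds threshold."""
    n = len(truth)
    tp = fp = 0
    for i in range(n):
        for j in range(n):
            if i == j or (not directed and i > j):
                continue  # for UGE, visit each unordered pair once
            predicted = probs[i][j] > threshold
            actual = truth[i][j] if directed else max(truth[i][j], truth[j][i])
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
    return tp, fp


# Toy example: true network 0 -> 1 -> 2 and three hypothetical sampled graphs.
truth = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
samples = [
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],  # 0 -> 1, 1 -> 2
    [[0, 0, 0], [1, 0, 1], [0, 0, 0]],  # 1 -> 0, 1 -> 2
    [[0, 1, 1], [0, 0, 0], [0, 0, 0]],  # 0 -> 1, 0 -> 2
]
dge = count_tp_fp(edge_posteriors(samples, True), truth, 0.3, True)
uge = count_tp_fp(edge_posteriors(samples, False), truth, 0.3, False)
print(dge, uge)  # (2, 2) for DGE, (2, 1) for UGE
```

In the toy example the reversed edge 1 -> 0 is a false positive under DGE but merges with the true edge 0 -> 1 under UGE, which is why the two evaluations can rank methods differently.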

60 Performance evaluation: ROC curves

61 Performance evaluation: ROC curves We use the area under the receiver operating characteristic curve (AUC). [Figure: example ROC curves with AUC = 0.5, AUC = 0.75 and AUC = 1.]
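
As a small illustration, the AUC can be computed without drawing the curve by using its rank interpretation: it equals the probability that a randomly chosen true edge receives a higher score than a randomly chosen non-edge, with ties counted as one half. The function name and the example scores below are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of true and false edges."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical edge scores (posterior probabilities) and gold-standard labels.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(auc(scores, labels))  # 0.75
```

A value of 0.5 corresponds to random guessing and 1 to a perfect ranking of true edges above spurious ones.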

62 Evaluation 2: TP scores We set the threshold such that we obtain 5 spurious edges (5 FPs) and count the corresponding number of true edges (TP count).
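
A sketch of this count, assuming edges are ranked by score and the threshold is lowered until admitting one more edge would produce a sixth false positive. How ties are broken is not stated on the slide, so the convention below (stable sort, stop just before the extra false positive) is an assumption, as are the names.

```python
def tp_at_fp(scores, labels, max_fp=5):
    """Number of true edges recovered at the lowest threshold with at most max_fp FPs."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    for _, is_true in ranked:
        if is_true:
            tp += 1
        elif fp == max_fp:
            break  # admitting this edge would exceed the allowed false positives
        else:
            fp += 1
    return tp


# Hypothetical ranking: scores already in descending order, 1 marks a true edge.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40]
labels = [1, 0, 1, 0, 1, 0, 1]
print(tp_at_fp(scores, labels, max_fp=2))  # 3
```

Unlike the AUC, this score looks only at the high-confidence end of the ranking, which is the region that matters when only a few predicted edges can be followed up.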

64 Flow cytometry data and KEGG

65 Evaluation Can the method automatically evaluate how useful the different sources of prior knowledge are? Do we get an improvement in the regulatory network reconstruction? Is this improvement optimal?

66 Learning the trade-off hyperparameter Repeat MCMC simulations for a large set of fixed hyperparameters β. Obtain AUC scores for each value of β. Compare with the proposed scheme, in which β is inferred automatically. [Figure: mean and standard deviation of the sampled trade-off parameter.]

67 Learning the trade-off hyperparameters on simulated data Repeat MCMC simulations for a large set of fixed hyperparameters β. Obtain AUC scores for each value of β. Compare with the proposed scheme, in which β is inferred automatically. [Figure: mean and standard deviation of the sampled trade-off parameter.]

68 New evidence for the accepted network Dougherty et al., Regulation of Raf-1 by Direct Feedback Phosphorylation. Molecular Cell, Vol. 17, 2005.

69 Conclusion Bayesian scheme for the systematic integration of different sources of biological prior knowledge. The method can automatically evaluate how useful the different sources of prior knowledge are. We get an improvement in the regulatory network reconstruction. This improvement is close to optimal.

70 Thank you

