Inferring gene regulatory networks with non-stationary dynamic Bayesian networks. Dirk Husmeier, Frank Dondelinger, Sophie Lèbre. Biomathematics & Statistics Scotland.


1 Inferring gene regulatory networks with non-stationary dynamic Bayesian networks. Dirk Husmeier, Frank Dondelinger, Sophie Lèbre. Biomathematics & Statistics Scotland

2 Overview: Introduction; Non-homogeneous dynamic Bayesian network for non-stationary processes; Flexible network structure; Open problems

3 Can we learn signalling pathways from postgenomic data? From Sachs et al., Science, 2005

4 Network reconstruction from postgenomic data

5 Friedman et al. (2000), J. Comp. Biol. 7, 601-620. Marriage between graph theory and probability theory

6 Bayes net ODE model

7 [Figure: example graph with nodes A to F, labelled NODES and EDGES] Graph theory: directed acyclic graph (DAG) representing conditional independence relations. Probability theory: it is possible to score a network in light of the data: P(D|M), D: data, M: network structure. We can infer how well a particular network explains the observed data.

8 BGe (linear model): [A] = w1[P1] + w2[P2] + w3[P3] + w4[P4] + noise [Figure: node A with parents P1, P2, P3, P4 and edge weights w1, w2, w3, w4]
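
A minimal sketch of the linear model on this slide, assuming four parent concentrations and Gaussian noise. The weights, sample size and noise level are hypothetical values chosen for illustration, and ordinary least squares stands in for the Bayesian BGe score, which the slide does not spell out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration values (not taken from the slides).
true_w = np.array([1.5, -0.8, 0.3, 2.0])   # w1..w4
n_samples = 200

# Parent concentrations [P1]..[P4] and the child [A] = sum_i w_i [P_i] + noise.
P = rng.normal(size=(n_samples, 4))
A = P @ true_w + rng.normal(scale=0.1, size=n_samples)

# Ordinary least squares estimate of the weights.
w_hat, *_ = np.linalg.lstsq(P, A, rcond=None)
print(np.round(w_hat, 2))
```

With low noise the estimated weights should lie close to the generating ones; a full BGe treatment would instead place a prior on the weights and integrate them out.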

9 BDe (nonlinear discretized model). [Figure: parents P1 and P2 acting as activator and repressor; activation and inhibition] Allow for noise: probabilities. Conditional multinomial distribution

10 Model Parameters q Integral analytically tractable!
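
The integral referred to on this slide is presumably the marginal likelihood of a structure M, obtained by integrating out the parameters q. The formula itself did not survive in the transcript, so the following is the standard form written under that assumption, using the symbols D, M and q that appear on the slides.

```latex
P(D \mid M) = \int P(D \mid q, M)\, P(q \mid M)\, dq
```

For the BDe and BGe scores, conjugate priors on q make this integral available in closed form.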

11 BDe: UAI 1994; BGe: UAI 1995

12 Dynamic Bayesian network

13 Example: 2 genes → 16 different network structures. Best network: maximum score
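
The count of 16 can be reproduced with a short enumeration. This sketch assumes a dynamic Bayesian network in which edges run from a gene at time t to a gene at time t+1 and self-loops are allowed, giving 4 candidate edges for 2 genes; the gene names are placeholders.

```python
from itertools import product

genes = ["X1", "X2"]

# Candidate edges go from a gene at time t to a gene at time t+1,
# including self-loops (assumption: this is how the slide arrives at 16).
candidate_edges = [(src, dst) for src in genes for dst in genes]

# Each structure switches every candidate edge on or off.
structures = [
    frozenset(e for e, on in zip(candidate_edges, mask) if on)
    for mask in product([0, 1], repeat=len(candidate_edges))
]

print(len(structures))  # 2 ** 4 = 16
```

With so few structures, every one can be scored and the maximum picked directly.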

14 Identify the best network structure. Ideal scenario: large data sets, low noise

15 Uncertainty about the best network structure. Limited number of experimental replications, high noise

16 Sample of high-scoring networks

17 Feature extraction, e.g. marginal posterior probabilities of the edges

18 Sample of high-scoring networks. Feature extraction, e.g. marginal posterior probabilities of the edges: high-confidence edge; high-confidence non-edge; uncertainty about edges
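
The feature extraction step can be sketched as follows: given a sample of networks, the marginal posterior probability of an edge is estimated by the fraction of sampled networks that contain it. The sample below is hypothetical; in practice it would come from a sampler over structures.

```python
from collections import Counter

# Hypothetical sample of high-scoring networks, each given as a set of
# directed edges (parent, child).
sampled_networks = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("B", "C"), ("A", "C")},
    {("A", "B")},
    {("A", "B"), ("A", "C")},
]

def edge_posteriors(networks):
    """Fraction of sampled networks that contain each edge."""
    counts = Counter(edge for net in networks for edge in net)
    return {edge: n / len(networks) for edge, n in counts.items()}

posteriors = edge_posteriors(sampled_networks)
for edge, p in sorted(posteriors.items()):
    print(edge, p)
```

Edges with a fraction near 1 are the high-confidence edges, edges that never or rarely appear are the high-confidence non-edges, and fractions near 0.5 indicate uncertainty.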

19 Can we generalize this scheme to more than 2 genes? In principle yes. However …

20 [Plot: number of structures versus number of nodes]
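
To give a feel for how fast the number of structures grows, the sketch below counts labelled directed acyclic graphs with Robinson's recurrence. Whether the slide plots this count or the dynamic-network count of 2^(n^2) is not recoverable from the transcript, so the choice of DAGs here is an assumption.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

for n in range(1, 9):
    print(n, count_dags(n))
```

The growth is super-exponential, which is why exhaustive scoring stops being feasible after a handful of nodes.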

21 Configuration space of network structures. Find the high-scoring structures: sampling from the posterior distribution. Taken from the MSc thesis by Ben Calderhead

22 Madigan & York (1995), Giudici & Castelo (2003)

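23 Configuration space of network structures. MCMC: local change; if [condition lost in extraction]: accept; if [condition lost in extraction]: accept with probability [expression lost in extraction]. Taken from the MSc thesis by Ben Calderhead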
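
A minimal sketch of an MCMC sampler over structures with local changes, assuming a symmetric proposal (flip one edge) and the usual Metropolis acceptance rule. The `log_score` function is a hypothetical stand-in for log P(D|M) + log P(M), and the structures are dynamic-network style edge sets, so no acyclicity check is made.

```python
import math
import random
from collections import Counter

def log_score(structure, target):
    """Hypothetical stand-in for log P(D|M) + log P(M): penalises each
    edge that differs from a fixed target structure."""
    return -2.0 * len(structure ^ target)

def structure_mcmc(candidate_edges, target, n_steps, seed=0):
    rng = random.Random(seed)
    current = frozenset()
    current_score = log_score(current, target)
    samples = []
    for _ in range(n_steps):
        # Local change: add or delete one randomly chosen edge.
        edge = rng.choice(candidate_edges)
        proposal = current ^ {edge}
        proposal_score = log_score(proposal, target)
        # Symmetric proposal: accept with probability min(1, score ratio),
        # so a move to a higher-scoring structure is always accepted.
        accept_prob = math.exp(min(0.0, proposal_score - current_score))
        if rng.random() < accept_prob:
            current, current_score = proposal, proposal_score
        samples.append(current)
    return samples

genes = ["X1", "X2", "X3"]
edges = [(a, b) for a in genes for b in genes]
target = frozenset({("X1", "X2"), ("X2", "X3")})
samples = structure_mcmc(edges, target, n_steps=5000)
most_common, _ = Counter(samples[1000:]).most_common(1)[0]
print(sorted(most_common))
```

The retained samples are exactly the kind of "sample of high-scoring networks" from which edge posterior probabilities can then be computed.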

24 Overview: Introduction; Non-homogeneous dynamic Bayesian networks for non-stationary processes; Flexible network structure; Open problems

25

26 Dynamic Bayesian network

27 Example: 4 genes, 10 time points
        t1     t2     t3     t4     t5     t6     t7     t8     t9     t10
X(1)    X1,1   X1,2   X1,3   X1,4   X1,5   X1,6   X1,7   X1,8   X1,9   X1,10
X(2)    X2,1   X2,2   X2,3   X2,4   X2,5   X2,6   X2,7   X2,8   X2,9   X2,10
X(3)    X3,1   X3,2   X3,3   X3,4   X3,5   X3,6   X3,7   X3,8   X3,9   X3,10
X(4)    X4,1   X4,2   X4,3   X4,4   X4,5   X4,6   X4,7   X4,8   X4,9   X4,10

28 Standard dynamic Bayesian network: homogeneous model [Figure: data matrix of 4 genes X(1) to X(4) by 10 time points t1 to t10, entries X1,1 to X4,10]

29 Limitations of the homogeneity assumption

30 Our new model: heterogeneous dynamic Bayesian network. Here: 2 components [Figure: data matrix of 4 genes X(1) to X(4) by 10 time points t1 to t10, entries X1,1 to X4,10]

31 Our new model: heterogeneous dynamic Bayesian network. Here: 3 components [Figure: data matrix of 4 genes X(1) to X(4) by 10 time points t1 to t10, entries X1,1 to X4,10]

32 Learning with MCMC. Symbols shown: q, k, h. Annotations: number of components (here: 3); allocation vector

33 Non-homogeneous model → non-linear model

34 BGe: linear model: [A] = w1[P1] + w2[P2] + w3[P3] + w4[P4] + noise [Figure: node A with parents P1, P2, P3, P4 and edge weights w1, w2, w3, w4]

35 BDe: nonlinear discretized model. [Figure: parents P1 and P2 acting as activator and repressor; activation and inhibition] Allow for noise: probabilities. Conditional multinomial distribution

36 Pros and cons of the two models. Linear Gaussian model: restriction to linear processes; original data → no information loss. Multinomial model: nonlinear model; discretization → information loss

37 Can we get an approximate nonlinear model without data discretization? [Plot: y versus x]

38 Idea: piecewise linear model [Plot: y versus x]
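
The piecewise linear idea can be illustrated with a small sketch that fits a separate least-squares line to each segment of a nonlinear curve and compares the error with a single global line. The tanh curve and the breakpoint positions are hypothetical choices for illustration only.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for one segment."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def piecewise_sse(xs, ys, breakpoints):
    """Sum of squared errors when a separate line is fitted per segment.

    breakpoints are indices into xs at which a new segment starts.
    """
    edges = [0] + list(breakpoints) + [len(xs)]
    sse = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        slope, intercept = fit_line(xs[lo:hi], ys[lo:hi])
        sse += sum((y - (slope * x + intercept)) ** 2
                   for x, y in zip(xs[lo:hi], ys[lo:hi]))
    return sse

# Hypothetical nonlinear relationship: a saturating response.
xs = [i / 10 for i in range(-30, 31)]
ys = [math.tanh(x) for x in xs]

sse_global = piecewise_sse(xs, ys, [])          # one line for all points
sse_piecewise = piecewise_sse(xs, ys, [20, 41])  # three segments
print(sse_global, sse_piecewise)
```

The data stay continuous throughout, so nothing is lost to discretization, while the segment-wise fit still captures the nonlinearity approximately.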

39 Inhomogeneous dynamic Bayesian network with common changepoints [Figure: data matrix of 4 genes X(1) to X(4) by 10 time points t1 to t10, entries X1,1 to X4,10]

40 Inhomogeneous dynamic Bayesian network with node-specific changepoints [Figure: data matrix of 4 genes X(1) to X(4) by 10 time points t1 to t10, entries X1,1 to X4,10]
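
The difference between common and node-specific changepoints can be sketched by turning changepoint sets into allocation vectors that assign each time point to a segment. The helper name and the changepoint positions below are hypothetical and are not read off the slides.

```python
def allocation_from_changepoints(n_time_points, changepoints):
    """Map each time point (1-based) to a segment index.

    A changepoint c means that a new segment starts at time point c.
    """
    allocation = []
    segment = 0
    for t in range(1, n_time_points + 1):
        if t in changepoints:
            segment += 1
        allocation.append(segment)
    return allocation

# Common changepoints: every gene shares the same segmentation.
common = allocation_from_changepoints(10, {4, 8})

# Node-specific changepoints: each gene has its own segmentation.
node_specific = {
    "X(1)": allocation_from_changepoints(10, {3}),
    "X(2)": allocation_from_changepoints(10, {5, 9}),
    "X(3)": allocation_from_changepoints(10, set()),
    "X(4)": allocation_from_changepoints(10, {7}),
}

print(common)
for gene, allocation in node_specific.items():
    print(gene, allocation)
```

In the node-specific case each gene's regression parameters can change at different times, which is more flexible but requires inferring one changepoint set per node.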

41 NIPS 2009

42 Overview: Introduction; Non-homogeneous dynamic Bayesian network for non-stationary processes; Flexible network structure; Open problems

43 Non-stationarity in the regulatory process

44 Non-stationarity in the network structure

45 ICML 2010

46 Flexible network structure with regularization

47

48

49 Morphogenesis in Drosophila melanogaster. Gene expression measurements over 66 time steps of 4028 genes (Arbeitman et al., Science, 2002). Selection of 11 genes involved in muscle development. Zhao et al. (2006), Bioinformatics 22

50 Transition probabilities: flexible structure with regularization. Morphogenetic transitions: embryo → larva, larva → pupa, pupa → adult

51 Comparison with: Dondelinger, Lèbre & Husmeier; Ahmed & Xing

52

53

54

55

56

57 Collaboration with Frank Dondelinger and Sophie Lèbre. NIPS 2010

58

59 Method based on homogeneous DBNs; method based on differential equations

60

61 Sample of high-scoring networks

62 Feature extraction, e.g. marginal posterior probabilities of the edges

63 Method based on homogeneous DBNs; method based on differential equations

64 Overview: Introduction; Non-homogeneous dynamic Bayesian network for non-stationary processes; Flexible network structure; Open problems

65

66 Exponential versus binomial prior distribution. Exploration of various information sharing options

67 How to deal with static data?

68 Change-point process; free allocation

69 Allocation sampler versus change-point process. Allocation sampler: more flexibility, unrestricted mixture model; not restricted to time series; higher computational costs. Change-point process: incorporates plausible prior knowledge for time series; reduced complexity; less universal, not applicable to static data

70 Acknowledgements: Marco Grzegorczyk (University of Dortmund, Germany); Frank Dondelinger (Biomathematics & Statistics Scotland, United Kingdom); Sophie Lèbre (Université de Strasbourg, France)

71 Further details for discussion during question time

72 Details on exponential prior

73 Hierarchical Bayesian model

74

75

76 MCMC scheme (for symmetric proposal distributions)

77 Details on other priors

78

79

80 where

81

82

83 Partition function. Ignoring the fan-in restriction: [expression lost in extraction]. Annotation: number of genes
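
The expression on this slide did not survive in the transcript. As a hedged illustration, suppose the exponential prior penalises the Hamming distance between a node's parent sets in adjacent segments, with weight proportional to exp(-beta * distance); this form is an assumption, not taken from the slides. Under that assumption, dropping the fan-in restriction lets the per-node partition function factorise into (1 + exp(-beta))^N for N genes, which the sketch checks by brute force.

```python
import math
from itertools import product

def partition_brute_force(n_genes, beta):
    """Sum exp(-beta * Hamming distance) over all 2**n_genes parent sets.

    Flipping bits is a bijection on parent sets, so the reference
    parent set can be taken as all zeros without loss of generality.
    """
    return sum(math.exp(-beta * sum(bits))
               for bits in product([0, 1], repeat=n_genes))

def partition_closed_form(n_genes, beta):
    """Closed form when no fan-in restriction is applied."""
    return (1.0 + math.exp(-beta)) ** n_genes

n_genes, beta = 6, 0.7   # hypothetical illustration values
z_brute = partition_brute_force(n_genes, beta)
z_closed = partition_closed_form(n_genes, beta)
print(z_brute, z_closed)
```

With a fan-in restriction the sum runs only over parent sets up to the maximum size, and the simple product form no longer applies.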

84 Simulation study. We randomly generated 10 networks with 10 nodes each. Number of regulators for each node drawn from a Poisson distribution with mean 3. 5 time series segments. Network changes: number of changes drawn from a Poisson distribution. For each segment: time series of length 50 generated from a linear regression model, with interaction parameters drawn from N(0,1) and iid Gaussian noise from N(0,1).
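
The generation procedure described on this slide can be sketched as below. Several details are assumptions: the mean of the Poisson distribution for the number of network changes (`mean_changes`), the way a change is applied (flipping a random edge), and the initial state are not given in the transcript, and the function name is hypothetical.

```python
import numpy as np

def simulate(n_nodes=10, n_segments=5, seg_len=50, mean_changes=1.0, seed=0):
    rng = np.random.default_rng(seed)

    # Initial network: adj[i, j] = 1 means node i regulates node j.
    adj = np.zeros((n_nodes, n_nodes), dtype=int)
    for j in range(n_nodes):
        k = min(rng.poisson(3), n_nodes)   # number of regulators, mean 3
        adj[rng.choice(n_nodes, size=k, replace=False), j] = 1

    networks, data = [], []
    x = rng.normal(size=n_nodes)           # assumed initial state
    for h in range(n_segments):
        if h > 0:
            # Flip a Poisson-distributed number of edges between segments.
            # mean_changes is an assumption; the slide does not state it.
            adj = adj.copy()
            for _ in range(rng.poisson(mean_changes)):
                i, j = rng.integers(n_nodes, size=2)
                adj[i, j] = 1 - adj[i, j]
        weights = adj * rng.normal(size=adj.shape)   # N(0,1) interactions
        segment = np.empty((seg_len, n_nodes))
        for t in range(seg_len):
            # Linear regression on the previous time point plus N(0,1) noise.
            x = x @ weights + rng.normal(size=n_nodes)
            segment[t] = x
        networks.append(adj)
        data.append(segment)
    return networks, data

networks, data = simulate()
print(len(networks), data[0].shape)
```

Calling `simulate` with 10 different seeds would give 10 networks as in the study. Note that unscaled N(0,1) weights can make the series grow rapidly over time; how the original study handled this is not stated in the transcript.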

85 Synthetic simulation study. Panels: no information sharing between adjacent segments; information sharing between adjacent segments. Frank Dondelinger, Sophie Lèbre, Dirk Husmeier: ICML 2010

