Presentation is loading. Please wait.

Presentation is loading. Please wait.

Learning Multiple Evolutionary Pathways from Cross-sectional Data Niko Beerenwinkel, Jorg Rahnenfuhrer, Martin Daumer, Daniel Hoffmann,Rolf Kaiser, Joachim.

Similar presentations


Presentation on theme: "Learning Multiple Evolutionary Pathways from Cross-sectional Data Niko Beerenwinkel, Jorg Rahnenfuhrer, Martin Daumer, Daniel Hoffmann,Rolf Kaiser, Joachim."— Presentation transcript:

1 Learning Multiple Evolutionary Pathways from Cross-sectional Data Niko Beerenwinkel, Jorg Rahnenfuhrer, Martin Daumer, Daniel Hoffmann,Rolf Kaiser, Joachim Selbig,Thomas Lengauer, RECOMB’04 Ning Wei Texas A&M University

2 Content Introduction Definition Algorithm Discussion Homework

3 Introduction A directed weighted trees, which generates the probability distribution on the set of all patterns of genetic events, to identify directed dependencies between mutational events in the evolutionary processes. Vertices represent the mutational events. Edge weights represent conditional probabilities between events. EM ( Expectation--Maximization )-like Learning Algorithm to generate the mutagenetic trees.

4 Definition Different events l {1, …, l}, initially as “null events”. A pattern xi of events: xi = (xi1,..., xil) A set of observed patterns is represented by the matrix: A mutagenetic tree consists of vertices V, edges E with at most one entering edge, null events or no entering edge vertex r, and a map p(e) =Pr(j2|j1) - >[0,1] (the conditional probability of event j2 given j1 occurred, if p(e)=0, delete e from E).

5 Definition (cont’d) Here is the example of the mutagenetic tree.

6 Definition (cont’d) Since the noisy real world data, they have likelihood zero under estimated mutagenetic tree and do not capture all of the pathways. K-mutagenetic trees mixture model: Given and is a random variable in [0,1], the model: The likelihood of a pattern x where and S is the set of all vertices reachable from r in the sub tree (V, E’)

7 Definition (cont’d) Example for calculating the likelihood between two events 70R and 219Q: L(x|T)=0.46*0.43*(1-0.46)*(1- 0.65)=0.037

8 Definition (cont’d) Under the mixture model M, the tree would be a star with all patterns of events positive likelihood. Assume that in addition to different pathways of accumulation of events, there is a certain probability  of any event occurring spontaneously independent of all other events referred as the noise component of the model. The algorithm assigns the  to T1, and T1 is a star tree with p(e)=  for all eE1.

9 EM-like Learning Algorithm To construct k-mutagenetic mixture model trees that maximize the log-likelihood of the data from the observed patterns X.

10 EM-like Learning Algorithm (cont’d) Initialize the parameters, set the responsibilities for estimating the joint probabilities:

11 EM-like Learning Algorithm (cont’d) Update the model parameters:

12 EM-like Learning Algorithm (cont’d) Compute the weight w from the previous pk and find out the maximum weight tree Tk: Pr(j) the marginal probability of event j; Pr(j1, j2) is the joint probability of events j1 and j2 estimated by the previous steps.

13 EM-like Learning Algorithm (cont’d) Update the responsibilities:

14 EM-like Learning Algorithm (cont’d) Iterate the above steps until the log-likelihood function doe not increase any more. According to the distribution of log-likelihood and the number of trees K, pick K=3.

15 EM-like Learning Algorithm (cont’d) 3-Mutagenetics trees mixture model:

16 Discussion The mixture model maps 87% of the observed patterns into their identified mutagenetic trees. Study the stability of the mutagenetic trees by bootstrap. Applications are the resistance pathways by drugs, designing the treatment protocols, combination therapy pathways and so on. EM learning algorithm could be extended to a model- based clustering algorithm. The mixture model may be adequate to identify a few parallel pathways, while directed acyclic graph (DAG) would be expected to be more appropriate for highly connected network of events aided by the ML (maximum likelihood) estimation of the model parameters.

17 Homework Modify the K-mutagenetic tress mixture model learning algorithm to obtain a clustering algorithm. Explain the reasons and the predefined conditions for applying your clustering algorithm. (Hint: section 6.2 of the paper.)

18 Thanks Any question, please email me at xiaoning76@neo.tamu.edu


Download ppt "Learning Multiple Evolutionary Pathways from Cross-sectional Data Niko Beerenwinkel, Jorg Rahnenfuhrer, Martin Daumer, Daniel Hoffmann,Rolf Kaiser, Joachim."

Similar presentations


Ads by Google