Learning Multiple Evolutionary Pathways from Cross-sectional Data. Niko Beerenwinkel, Jörg Rahnenführer, Martin Daumer, Daniel Hoffmann, Rolf Kaiser, Joachim Selbig, Thomas Lengauer.


Learning Multiple Evolutionary Pathways from Cross-sectional Data. Niko Beerenwinkel, Jörg Rahnenführer, Martin Daumer, Daniel Hoffmann, Rolf Kaiser, Joachim Selbig, Thomas Lengauer, RECOMB'04. Ning Wei, Texas A&M University.

Content: Introduction, Definition, Algorithm, Discussion, Homework.

Introduction. A directed weighted tree generates a probability distribution over the set of all patterns of genetic events and is used to identify directed dependencies between mutational events in the evolutionary process. Vertices represent the mutational events. Edge weights represent conditional probabilities between events. An EM (Expectation-Maximization)-like learning algorithm generates the mutagenetic trees.

Definition. There are l different events {1, ..., l}, all initially "null events". A pattern of events is a binary vector x_i = (x_i1, ..., x_il). A set of N observed patterns is represented by the matrix X = (x_1, ..., x_N)^T. A mutagenetic tree consists of vertices V (the events plus a root r, the null event), edges E such that every vertex has at most one entering edge and the root r has none, and a map p: E -> [0,1] with p(e) = Pr(j2 | j1) for an edge e = (j1, j2), the conditional probability of event j2 given that event j1 has occurred (if p(e) = 0, e is deleted from E).

Definition (cont'd). Here is an example of a mutagenetic tree.
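Since the example tree is shown as a picture in the original slides, a minimal Python sketch of how such a tree can be represented may help; the event labels and probabilities below are hypothetical, not taken from the paper.

# Minimal illustrative representation of a mutagenetic tree.
# Vertex 0 is the null event (the root r); every edge (parent, child)
# carries p(e) = Pr(child occurs | parent has occurred).
# Event numbers and probabilities are made up for illustration.
example_tree = {
    "root": 0,
    "edges": {
        (0, 1): 0.46,   # null event -> event 1
        (1, 2): 0.43,   # event 1 -> event 2
        (0, 3): 0.30,   # null event -> event 3
    },
}

def children(tree, v):
    # Return the children of vertex v together with their edge probabilities.
    return {w: p for (u, w), p in tree["edges"].items() if u == v}

print(children(example_tree, 0))   # {1: 0.46, 3: 0.3}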

Definition (cont'd). Because real-world data are noisy, some observed patterns have likelihood zero under a single estimated mutagenetic tree, and one tree does not capture all of the pathways. K-mutagenetic trees mixture model: given K trees T_1, ..., T_K and mixture weights alpha_1, ..., alpha_K in [0,1] summing to 1, the model is M = alpha_1 T_1 + ... + alpha_K T_K. The likelihood of a pattern x is L(x|M) = sum_k alpha_k L(x|T_k), where L(x|T) is the product of p(e) over all edges e = (j1, j2) with j1, j2 in S, times the product of (1 - p(e)) over all edges with j1 in S and j2 not in S (and zero if some event in x is not in S); here S is the set of all vertices reachable from r in the subtree (V, E'), E' being the edges of E whose target event occurs in x.

Definition (cont'd). Example of computing the likelihood of a pattern containing the two events 70R and 219Q: L(x|T) = 0.46 * 0.43 * (1 - 0.46) * (...) = 0.037.
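To make the likelihood formula concrete, the following sketch evaluates L(x|T) for a single tree and L(x|M) for a mixture, following the definition above: a product of p(e) over edges inside the reachable set S, times (1 - p(e)) over edges leaving S, and zero for incompatible patterns. The tree and pattern are illustrative assumptions, so the result is not the 0.037 value from the slide.

def tree_likelihood(tree, pattern):
    # L(x|T): product of p(e) over edges whose endpoints both lie in S,
    # times (1 - p(e)) over edges leaving S; 0 if an observed event is unreachable.
    root, edges = tree["root"], tree["edges"]
    present = {v for v, bit in pattern.items() if bit == 1}
    S = {root}                        # vertices reachable from r via occurred events
    changed = True
    while changed:
        changed = False
        for (u, v) in edges:
            if u in S and v in present and v not in S:
                S.add(v)
                changed = True
    if not present <= S:              # pattern incompatible with the tree
        return 0.0
    L = 1.0
    for (u, v), p in edges.items():
        if u in S:
            L *= p if v in S else (1.0 - p)
    return L

def mixture_likelihood(trees, alphas, pattern):
    # L(x|M) = sum_k alpha_k * L(x|T_k)
    return sum(a * tree_likelihood(t, pattern) for a, t in zip(alphas, trees))

# Illustrative usage (made-up tree: 0 -> 1 -> 2 and 0 -> 3):
T = {"root": 0, "edges": {(0, 1): 0.46, (1, 2): 0.43, (0, 3): 0.30}}
x = {1: 1, 2: 1, 3: 0}                # events 1 and 2 observed, event 3 absent
print(tree_likelihood(T, x))          # 0.46 * 0.43 * (1 - 0.30) = 0.138...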

Definition (cont'd). To ensure that every pattern of events receives positive likelihood under the mixture model M, one tree component is chosen to be a star. Assume that, in addition to the different pathways of accumulating events, there is a certain probability ε of any event occurring spontaneously, independently of all other events; this is the noise component of the model. The algorithm assigns ε to T_1, and T_1 is a star tree with p(e) = ε for all e in E_1.
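Under the same illustrative representation used above, the noise component is simple to write down: a star in which every event hangs directly off the null event with the same spontaneous probability ε. The value of ε below is an arbitrary placeholder, not a number from the paper.

def star_noise_tree(events, eps):
    # Noise component T_1: a star tree with p(e) = eps for every edge
    # root -> event, so every pattern of events gets positive likelihood.
    return {"root": 0, "edges": {(0, j): eps for j in events}}

T1 = star_noise_tree(events=[1, 2, 3], eps=0.05)   # eps value is a placeholder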

EM-like Learning Algorithm. The goal is to construct a K-mutagenetic trees mixture model that maximizes the log-likelihood of the data, that is, of the observed patterns X.

EM-like Learning Algorithm (cont'd). Initialize the parameters and set the responsibilities used for estimating the joint probabilities.
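The initialization formula itself is not reproduced in the transcript. One simple scheme, stated here only as an assumption about what such an initialization can look like, is to draw random responsibilities and normalize them so that each pattern's responsibilities sum to one:

import numpy as np

def init_responsibilities(N, K, seed=0):
    # Hypothetical initialization (not necessarily the paper's scheme):
    # random responsibilities, one row per pattern, normalized to sum to 1.
    rng = np.random.default_rng(seed)
    R = rng.random((N, K))
    return R / R.sum(axis=1, keepdims=True)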

EM-like Learning Algorithm (cont'd). Update the model parameters:
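The update formulas are also missing from the transcript; the sketch below shows the general shape such an M-step can take for a mixture model (an assumption, not the paper's exact equations): each tree's mixture weight is the average responsibility it receives, and responsibility-weighted marginal and pairwise event frequencies are collected for the subsequent tree reconstruction step.

import numpy as np

def m_step(X, R):
    # X: N x l binary pattern matrix; R: N x K responsibilities.
    # Returns mixture weights alpha and, per tree, weighted marginal Pr(j)
    # and joint Pr(j1, j2) event probabilities (illustrative formulation).
    N, l = X.shape
    K = R.shape[1]
    alpha = R.mean(axis=0)                       # alpha_k = average responsibility
    marg = np.zeros((K, l))
    joint = np.zeros((K, l, l))
    for k in range(K):
        w = R[:, k] / R[:, k].sum()              # normalized responsibilities
        marg[k] = w @ X                          # Pr_k(j)
        joint[k] = (X * w[:, None]).T @ X        # Pr_k(j1, j2)
    return alpha, marg, joint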

EM-like Learning Algorithm (cont'd). Compute the edge weights w from the current p_k and find the maximum weight tree T_k. Here Pr(j) is the marginal probability of event j and Pr(j1, j2) is the joint probability of events j1 and j2, both estimated in the previous steps.
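The weight formula and the branching step are not shown in the transcript. A mutual-information-style weight per directed edge, built from exactly the quantities the slide names (Pr(j) and Pr(j1, j2)), is sketched below as an assumption; T_k would then be obtained as a maximum weight branching rooted at the null event, for example with Edmonds' algorithm, which is omitted here.

import numpy as np

def edge_weights(marg, joint, eps=1e-12):
    # Assumed weight form: w(j1 -> j2) = log Pr(j1, j2) - log Pr(j1) - log Pr(j2).
    # marg: (l,) marginal probabilities Pr(j); joint: (l, l) joint Pr(j1, j2).
    return (np.log(joint + eps)
            - np.log(marg[:, None] + eps)
            - np.log(marg[None, :] + eps))

# T_k = maximum weight branching (arborescence) over these weights,
# rooted at the null event; that step is not sketched here.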

EM-like Learning Algorithm (cont'd). Update the responsibilities:
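The responsibility update is the usual E-step of a mixture model: each observed pattern is reassigned to the tree components in proportion to their weighted likelihoods. A minimal sketch, assuming a precomputed matrix of per-tree pattern likelihoods:

import numpy as np

def e_step(lik, alpha):
    # lik[i, k] = L(x_i | T_k); alpha[k] = mixture weight of tree k.
    # Returns gamma[i, k] = alpha_k L(x_i | T_k) / sum_k' alpha_k' L(x_i | T_k').
    num = lik * alpha[None, :]
    return num / num.sum(axis=1, keepdims=True)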

EM-like Learning Algorithm (cont'd). Iterate the above steps until the log-likelihood no longer increases. Based on how the log-likelihood varies with the number of trees K, K = 3 is chosen.

EM-like Learning Algorithm (cont'd). The resulting 3-mutagenetic trees mixture model.

Discussion. The mixture model maps 87% of the observed patterns onto their identified mutagenetic trees. The stability of the mutagenetic trees is studied by bootstrap analysis. Applications include resistance pathways under drug pressure, designing treatment protocols, combination therapy pathways, and so on. The EM-like learning algorithm can be extended to a model-based clustering algorithm. The mixture model is adequate for identifying a few parallel pathways, while a directed acyclic graph (DAG) model, together with ML (maximum likelihood) estimation of its parameters, would be more appropriate for a highly connected network of events.

Homework. Modify the K-mutagenetic trees mixture model learning algorithm to obtain a clustering algorithm. Explain the reasoning and the predefined conditions for applying your clustering algorithm. (Hint: Section 6.2 of the paper.)

Thanks. Any questions, please email me.