Learning Module Networks. Eran Segal, Stanford University. Joint work with: Dana Pe’er (Hebrew U.), Daphne Koller (Stanford), Aviv Regev (Harvard), Nir Friedman (Hebrew U.)


1 Learning Module Networks Eran Segal Stanford University Joint work with: Dana Pe’er (Hebrew U.) Daphne Koller (Stanford) Aviv Regev (Harvard) Nir Friedman (Hebrew U.)

2 Learning Bayesian Networks
Density estimation: model the data distribution in a population; probabilistic inference: prediction, classification.
Dependency structure: interactions between variables; causality; scientific discovery.
[Diagram: data table and a Bayesian network over INTL, MSFT, MOT, NVLS]

3–5 Stock Market
Learn the dependency of stock prices as a function of: global influencing factors; sector influencing factors; prices of other major stocks.
[Chart: MSFT, DELL, INTL, NVLS, MOT prices, Jan.’02 to Jan.’03, building up to a Bayesian network over the stocks]

6 Stock Market: Fragment of a Learned BN
4411 stocks (variables), 273 trading days (instances), Jan.’02 to Mar.’03.
Problems: statistical robustness; interpretability.

7 Key Observation
Many stocks depend on the same influencing factors in much the same way. Example: Intel, Novellus, Motorola, and Dell all depend on the price of Microsoft.
Many other domains have similar characteristics: gene expression, collaborative filtering, computer network performance, …
[Chart: MSFT, DELL, INTL, NVLS, MOT prices, Jan.’02 to Jan.’03]

8 The Module Network Idea
[Diagram: a Bayesian network over INTL, MSFT, MOT, DELL, AMAT, HPQ with a separate CPD per variable, versus a module network in which Modules I–III each share a single CPD across their variables]

9 Problems and Solutions
Statistical robustness: share parameters and dependencies between variables with similar behavior.
Interpretability: explicit modeling of the modular structure.

10 Outline Module Network Probabilistic model Learning the model Experimental results

11 Module Network Components
Module assignment function:
A(MSFT) = M_I
A(MOT) = A(DELL) = A(INTL) = M_II
A(AMAT) = A(HPQ) = M_III
[Diagram: variables INTL, MSFT, MOT, DELL, AMAT, HPQ grouped into Modules I–III]

12 Module Network Components
Module assignment function.
Set of parents for each module:
Pa(M_I) = ∅
Pa(M_II) = {MSFT}
Pa(M_III) = {DELL, INTL}
[Diagram: Modules I–III with their parent edges]

13 Module Network Components
Module assignment function.
Set of parents for each module.
CPD template for each module.
[Diagram: Modules I–III, each with one shared CPD]
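The three components above can be collected into one small data structure. A minimal Python sketch; the class and method names (`ModuleNetwork`, `module_of`) are illustrative, not from the authors' implementation:

```python
from dataclasses import dataclass, field

@dataclass
class ModuleNetwork:
    assignment: dict  # variable -> module, e.g. "MSFT" -> "I"
    parents: dict     # module -> list of parent variables
    cpd: dict = field(default_factory=dict)  # module -> shared CPD parameters

    def module_of(self, var):
        return self.assignment[var]

# The example from the slides: three modules over six stocks.
mn = ModuleNetwork(
    assignment={"MSFT": "I", "MOT": "II", "DELL": "II", "INTL": "II",
                "AMAT": "III", "HPQ": "III"},
    parents={"I": [], "II": ["MSFT"], "III": ["DELL", "INTL"]},
)
assert mn.module_of("DELL") == "II"
assert mn.parents[mn.module_of("HPQ")] == ["DELL", "INTL"]
```

Note that every variable inherits the parents and the CPD of its module, which is what makes the representation compact.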

14 Ground Bayesian Network
A module network induces a ground BN over X.
A module network defines a coherent probability distribution over X if the ground BN is acyclic.
[Diagram: module network and its ground Bayesian network over INTL, MSFT, MOT, DELL, AMAT, HPQ]
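Grounding can be sketched as giving every variable the parent set of its module; `ground_bn` is a hypothetical helper name, not from the original implementation:

```python
def ground_bn(assignment, parents):
    """Expand a module network into per-variable parent sets (a ground BN)."""
    return {var: list(parents[mod]) for var, mod in assignment.items()}

bn = ground_bn(
    {"MSFT": "I", "MOT": "II", "DELL": "II", "INTL": "II",
     "AMAT": "III", "HPQ": "III"},
    {"I": [], "II": ["MSFT"], "III": ["DELL", "INTL"]},
)
# Every variable in a module gets that module's parents.
assert bn["MOT"] == ["MSFT"] and bn["HPQ"] == ["DELL", "INTL"]
assert bn["MSFT"] == []
```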

15 Module Graph
Nodes correspond to modules.
M_i → M_j if at least one variable in M_i is a parent of M_j.
Theorem: the ground BN is acyclic if the module graph is acyclic.
Acyclicity is checked efficiently on the module graph.
[Diagram: module network and its module graph over M_I, M_II, M_III]
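A sketch of that check: build the module graph from the assignment function and per-module parent sets, then test acyclicity with a topological sort over modules, not variables. All function names are illustrative:

```python
from collections import defaultdict, deque

def module_graph(assignment, parents):
    """Edge M_i -> M_j when some variable assigned to M_i is a parent of M_j."""
    edges = defaultdict(set)
    for mod, pars in parents.items():
        for p in pars:
            edges[assignment[p]].add(mod)
    return edges

def is_acyclic(modules, edges):
    """Kahn's topological sort: acyclic iff every module gets processed."""
    indeg = {m: 0 for m in modules}
    for src, dsts in edges.items():
        for d in dsts:
            indeg[d] += 1
    queue = deque(m for m in modules if indeg[m] == 0)
    seen = 0
    while queue:
        m = queue.popleft()
        seen += 1
        for d in edges.get(m, ()):
            indeg[d] -= 1
            if indeg[d] == 0:
                queue.append(d)
    return seen == len(modules)

assignment = {"MSFT": "I", "DELL": "II", "INTL": "II", "AMAT": "III"}
parents = {"I": [], "II": ["MSFT"], "III": ["DELL", "INTL"]}
assert is_acyclic({"I", "II", "III"}, module_graph(assignment, parents))
# Adding AMAT (in Module III) as a parent of Module II creates a cycle II <-> III:
parents["II"] = ["MSFT", "AMAT"]
assert not is_acyclic({"I", "II", "III"}, module_graph(assignment, parents))
```

The check costs O(modules + edges) regardless of how many variables each module contains, which is the point of the theorem.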

16 Outline Module Network Probabilistic model Learning the model Experimental results

17 Learning Overview
Given data D, find the assignment function A and structure S that maximize the Bayesian score:
score(A, S : D) = log P(D | A, S) + log P(A, S)
where the marginal data likelihood integrates out the parameters, P(D | A, S) = ∫ P(D | A, S, θ) P(θ | A, S) dθ, and P(A, S) is the assignment/structure prior.

18 Likelihood Function
The likelihood function decomposes by modules: parameters θ_{M_I}, θ_{M_II | MSFT}, θ_{M_III | DELL, INTL} are shared within each module and computed from the sufficient statistics of each (variable, parents) pair.
[Diagram: three data instances over INTL, MSFT, MOT, DELL, AMAT, HPQ and the shared per-module parameters]
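The decomposition can be illustrated with a toy shared-CPD likelihood. The Gaussian template below is an illustrative stand-in for the actual CPD template; the point is only that the total log-likelihood splits into independent per-module terms:

```python
import math

def module_loglik(data, assignment, params):
    """Log-likelihood with one shared Gaussian CPD per module."""
    total = 0.0
    for var, values in data.items():
        mu, sigma = params[assignment[var]]  # parameters shared across the module
        for x in values:
            total += -0.5 * math.log(2 * math.pi * sigma ** 2) \
                     - (x - mu) ** 2 / (2 * sigma ** 2)
    return total

data = {"DELL": [1.0, 1.2], "INTL": [0.9, 1.1], "MSFT": [2.0, 2.1]}
assignment = {"DELL": "II", "INTL": "II", "MSFT": "I"}
params = {"I": (2.0, 0.5), "II": (1.0, 0.5)}

# The total decomposes into independent per-module terms:
ll = module_loglik(data, assignment, params)
ll_I = module_loglik({"MSFT": data["MSFT"]}, assignment, params)
ll_II = module_loglik({v: data[v] for v in ("DELL", "INTL")}, assignment, params)
assert abs(ll - (ll_I + ll_II)) < 1e-9
```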

19 Bayesian Score Decomposition
The Bayesian score decomposes by modules, into one local term per module over that module's variables and parents.
Example: deleting the parent INTL from Module III changes only Module III's term.
[Diagram: module network with the edge INTL → Module III deleted]
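In miniature, this is why decomposition makes search cheap: the total score is a sum of per-module terms, so an operator that touches only one module changes one term. Toy scores and hypothetical helper names:

```python
def total_score(module_scores):
    """Total Bayesian score as a sum of per-module local terms."""
    return sum(module_scores.values())

def apply_delete_parent(module_scores, module, rescore):
    """Apply an operator touching one module; only that term is recomputed."""
    new_scores = dict(module_scores)
    new_scores[module] = rescore(module)
    return new_scores

scores = {"I": -10.0, "II": -12.0, "III": -15.0}
new = apply_delete_parent(scores, "III", lambda m: -14.0)
# The total score delta equals the single local delta:
assert total_score(new) - total_score(scores) == 1.0
assert new["I"] == scores["I"] and new["II"] == scores["II"]
```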

20 Bayesian Score Decomposition
The Bayesian score decomposes by modules.
Example: moving MOT between modules (A(MOT) = 2 → A(MOT) = 1) changes only the terms of the two affected modules.
[Diagram: MOT reassigned from Module II to Module I]

21 Algorithm Overview
Find the assignment function A and structure S that maximize the Bayesian score.
[Flowchart: find initial assignment A → improve structure S → improve assignments A, iterated]

22 Initial Assignment Function
Find variables that behave similarly across instances (e.g., MOT, INTL, and DELL track each other across trading days) and assign them to the same module: A(MOT) = A(INTL) = A(DELL) = M_II.
[Diagram: data matrix of variables (stocks) by instances (trading days), with similar rows clustered]
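A minimal stand-in for this initialization: greedily group variables whose value vectors across instances are close. The threshold clustering here is an assumption for illustration, not the authors' clustering procedure:

```python
def initial_assignment(data, threshold):
    """Greedy clustering: data maps variable -> tuple of values across instances."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    reps, assignment = [], {}
    for var, vec in data.items():
        for idx, rep in enumerate(reps):
            if dist(vec, rep) <= threshold:
                assignment[var] = idx  # join an existing module
                break
        else:
            reps.append(vec)           # start a new module
            assignment[var] = len(reps) - 1
    return assignment

data = {"MOT": (1.0, 2.0, 1.5), "INTL": (1.1, 2.1, 1.4),
        "DELL": (0.9, 1.9, 1.6), "MSFT": (5.0, 6.0, 5.5)}
a = initial_assignment(data, threshold=0.5)
assert a["MOT"] == a["INTL"] == a["DELL"]  # similar stocks share a module
assert a["MSFT"] != a["MOT"]
```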

23 Algorithm Overview
Find the assignment function A and structure S that maximize the Bayesian score.
[Flowchart: find initial assignment A → improve structure S → improve assignments A, iterated]

24 Learning Dependency Structure
Heuristic search with operators: add/delete a parent for a module (edges cannot be reversed).
Acyclicity is handled by checking the module graph, which is efficient.
Efficient computation: after applying an operator for module M_j, only the scores of operators for module M_j need updating.
[Diagram: candidate operators, e.g. INTL → Module I rejected as cyclic, INTL → Module III accepted]
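The efficient-update bookkeeping can be sketched as a per-module cache of operator scores, invalidated only for the module that an applied operator touched. `rescore` is a toy stand-in for the Bayesian score delta:

```python
class OperatorCache:
    def __init__(self, rescore):
        self.rescore = rescore
        self.cache = {}       # (module, operator) -> score delta
        self.recomputed = []  # log of rescoring work, for illustration

    def score(self, module, op):
        if (module, op) not in self.cache:
            self.cache[(module, op)] = self.rescore(module, op)
            self.recomputed.append((module, op))
        return self.cache[(module, op)]

    def apply(self, module):
        # An applied operator changes only this module's local score,
        # so only this module's cached operator scores are invalidated.
        self.cache = {k: v for k, v in self.cache.items() if k[0] != module}

cache = OperatorCache(rescore=lambda m, op: 0.1)
cache.score("II", "add MSFT")
cache.score("III", "delete INTL")
cache.apply("III")
cache.score("II", "add MSFT")  # cache hit: no recomputation needed
assert cache.recomputed == [("II", "add MSFT"), ("III", "delete INTL")]
```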

25 Learning Dependency Structure
Structure search is done at the module level: both parent selection (a reduced search space relative to a BN) and acyclicity checking. Individual variables are used only to compute sufficient statistics.

26 Algorithm Overview
Find the assignment function A and structure S that maximize the Bayesian score.
[Flowchart: find initial assignment A → improve structure S → improve assignments A, iterated]

27 Learning Assignment Function
Try A(DELL) = M_I: score 0.7.

28 Learning Assignment Function
A(DELL) = M_I: score 0.7; A(DELL) = M_II: score 0.9.

29 Learning Assignment Function
A(DELL) = M_I: score 0.7; A(DELL) = M_II: score 0.9; A(DELL) = M_III: cyclic, rejected.

30 Learning Assignment Function
DELL is assigned to M_II, the highest-scoring legal module.
[Diagram: DELL moved into Module II]

31 Ideal Algorithm Learn the module assignment of all variables simultaneously

32–33 Problem
Due to acyclicity, assignments cannot be optimized for each variable separately: moving DELL to Module IV and MSFT to Module III may each look legal in isolation, yet together they create a cycle in the module graph.
[Diagram: module network and module graph over M_I–M_IV illustrating the joint cycle]

34 Learning Assignment Function
Sequential update algorithm: iterate over all variables; for each variable, find its optimal assignment given the current assignment of all other variables.
Efficient computation: when changing a variable's assignment from M_i to M_j, only the scores of modules M_i and M_j need recomputing.
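One step of the sequential update, mirroring the DELL example from the earlier slides: try each module, discard moves that would make the module graph cyclic, and keep the best-scoring legal assignment. `score` and `legal` are toy stand-ins for the Bayesian score and the module-graph acyclicity check:

```python
def reassign(var, modules, assignment, score, legal):
    """Assign var to its best-scoring legal module, all else held fixed.
    Assumes at least one legal module exists (the current one always is)."""
    best = max((m for m in modules if legal(var, m)),
               key=lambda m: score(var, m))
    assignment[var] = best
    return best

assignment = {"DELL": "I"}
scores = {"I": 0.7, "II": 0.9, "III": 1.5}
best = reassign("DELL", ["I", "II", "III"], assignment,
                score=lambda v, m: scores[m],
                legal=lambda v, m: m != "III")  # M_III would create a cycle
assert best == "II"               # highest-scoring *legal* module wins
assert assignment["DELL"] == "II"
```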

35 Learning the Model
Initialize the module assignment A.
Optimize the structure S.
Optimize the module assignment A: for each variable, find its optimal assignment given the current assignment of all other variables.
[Diagram: MOT being reassigned between modules during optimization]

36 Related Work
[Table comparing Bayesian networks, parameter sharing, PRMs, OOBNs, and module networks on four axes: shared structure, shared parameters, learning the parameter sharing (addressed for OOBNs by Langseth et al.), and learning the structure; module networks support all four.]

37 Outline Module Network Probabilistic model Learning the model Experimental results Statistical validation Case study: Gene regulation

38 Learning Algorithm Performance
[Plot: Bayesian score (avg. per gene) vs. algorithm iterations (0 to 20), with structure-change iterations marked; the score improves from about -131 to -128]

39 Generalization to Test Data
Synthetic data: 10 modules, 500 variables. Best performance is achieved for models with 10 modules.
[Plot: test data likelihood (per instance) vs. number of modules (0 to 200), for 25, 50, 100, 200, and 500 training instances]

40 Generalization to Test Data
Synthetic data: 10 modules, 500 variables. The gain beyond 100 instances is small.
[Plot: test data likelihood (per instance) vs. number of modules, for 25 to 500 training instances]

41 Structure Recovery
Synthetic data: 10 modules, 500 variables. 74% of 2250 parent-child relationships recovered.
[Plot: recovered structure (% correct) vs. number of modules, for 25 to 500 training instances]

42 Stock Market
4411 variables (stocks), 273 instances (trading days). Comparison to Bayesian networks (cross-validation).
[Plot: test-data log-likelihood gain per instance over the Bayesian network baseline, vs. number of modules (0 to 300)]

43 Regulatory Networks Learn structure of regulatory networks: Which genes are regulated by each regulator

44 Gene Expression Data
Measures the mRNA level of all genes in one condition. Learn the dependency of gene expression as a function of the expression of regulators.
[Heat map: genes by experiments, induced vs. repressed]

45 Gene Expression
2355 variables (genes), 173 instances (arrays). Comparison to Bayesian networks.
[Plot: test-data log-likelihood gain per instance relative to the Bayesian network baseline, vs. number of modules (0 to 500)]

46 Biological Evaluation
Find sets of co-regulated genes (regulatory modules); find the regulators of each module.
[Figure: validation statistics, 46/50 and 30/50]
Segal et al., Nature Genetics, 2003

47 Experimental Design
Hypothesis: regulator X activates process Y. Experiment: knock out X and repeat the experiment.
[Diagram: regression-tree fragment testing HAP4 and Ypl230W, with true/false branches]
Segal et al., Nature Genetics, 2003

48 Differentially Expressed Genes
Ypl230w knockout vs. wild type (0 to 24 hrs.): 341 differentially expressed genes (>16x).
Ppt1 knockout vs. wild type (0 to 60 min.): 602 genes (>4x).
Kin82 knockout vs. wild type (0 to 60 min.): 281 genes (>4x).
Segal et al., Nature Genetics, 2003

49 Biological Experiments Validation
Were the differentially expressed genes predicted as targets? Rank modules by enrichment for differentially expressed genes. All regulators regulate their predicted modules. Segal et al., Nature Genetics, 2003

Ppt1 knockout:
#   Module                                 Significance
14  Ribosomal and phosphate metabolism     8/32, 9e-3
11  Amino acid and purine metabolism       11/53, 1e-2
15  mRNA, rRNA and tRNA processing         9/43, 2e-2
39  Protein folding                        6/23, 2e-2
30  Cell cycle                             7/30, 2e-2

Ypl230w knockout:
#   Module                                 Significance
39  Protein folding                        7/23, 1e-4
29  Cell differentiation                   6/41, 2e-2
5   Glycolysis and folding                 5/37, 4e-2
34  Mitochondrial and protein fate         5/37, 4e-2

Kin82 knockout:
#   Module                                 Significance
3   Energy and osmotic stress I            8/31, 1e-4
2   Energy, osmolarity & cAMP signaling    9/64, 6e-3
15  mRNA, rRNA and tRNA processing         6/43, 2e-2

50 Summary
Probabilistic model for learning modules of variables and their structural dependencies.
Improved performance over Bayesian networks: statistical robustness; interpretability.
Application to gene regulation: reconstruction of many known regulatory modules; prediction of targets for unknown regulators.
