A human B lymphocyte interactome for the dissection of dysregulated pathways in lymphoid malignancies
Andrea Califano
MAGNet: Center for the Multiscale Analysis of Genetic and Cellular Networks
C2B2: Center for Computational Biology and Bioinformatics
HICCC: Herbert Irving Comprehensive Cancer Center
Columbia University
[Figure: A Subway Map of Cancer (W. Hahn and R.A. Weinberg, Nature Reviews)]
The germinal center
B-Cell Subpopulations: Naïve B, Centroblast, Centrocyte, Plasma Cell, Memory B
Stages: Pre-GC, Germinal Center (GC), Post-GC
Ig V region: unmutated, mutated
B-Cell Derived Malignancies: B-CLL, Mantle Cell Lymphoma (CD5+), Follicular Lymphoma, Burkitt Lymphoma, Diffuse Large Cell Lymphoma, Hodgkin Disease, Multiple Myeloma
Differential Expression Analysis
Cancer Research Approaches
–Long List of Genes
–Unstable Gene Selection
–Sample Size
–Experiment Selection
–Causal Genes?
–Dysregulation Absent
Interactome
Definition:
–Represents > 100 genes or proteins
–Visually pleasing
–Void of any information content (if possible)
–Scale Free
–Published in Nature(s) or Science
Using the Interactome as a Map
[Figure: network of transcription factors (TF) and modulators (M), annotated with a drug-based perturbation and a causal lesion]
Information Theory
[Figure labels: Jane, Joe, Mary; MYC, TERT, Notch1]
ARACNE
Graphical representation:
–all 56 1st neighbors
–the top-ranking 2nd neighbors
Previously reported c-MYC targets
Neighbor | Total # | Total | Validated by ChIP
1st | 56 | 22 (39.3%) | 11
2nd | (19.4%) | 251
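The slide shows only ARACNE's output, so the following is a minimal sketch of the idea behind the published algorithm: score every gene pair by mutual information, then use the data processing inequality (DPI) to drop the weakest edge in each fully connected triplet. The plug-in estimator on discretized data, the threshold value, and the toy profiles `x`, `y`, `z` are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def aracne_sketch(profiles, mi_threshold=0.05):
    """profiles: dict mapping gene name -> list of discretized expression levels."""
    genes = sorted(profiles)
    mi = {
        frozenset((a, b)): mutual_information(profiles[a], profiles[b])
        for a, b in combinations(genes, 2)
    }
    # Step 1: keep pairs whose mutual information exceeds the threshold.
    edges = {pair for pair, value in mi.items() if value > mi_threshold}
    # Step 2 (DPI): in each fully connected triplet, mark the weakest edge.
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        triplet = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if all(edge in edges for edge in triplet):
            to_remove.add(min(triplet, key=lambda edge: mi[edge]))
    return edges - to_remove


# Hypothetical toy chain X -> Y -> Z: Y is a noisy copy of X, Z a noisy copy of Y.
x = [0] * 8 + [1] * 8
y = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
z = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
network = aracne_sketch({"X": x, "Y": y, "Z": z})
print(sorted(sorted(edge) for edge in network))
```

In this toy case all three pairs pass the threshold, and the DPI step prunes the indirect X-Z edge because it carries the least information in the triplet. A realistic version would need an estimator suited to continuous expression values and a significance-based threshold.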
ChIP Validation
[Figure: ChIP assays for c-MYC binding. BYSL locus (5' to 3') with amplicons A (+158/+348) and B; lanes: no Ab, IgG, α-MYC, no DNA, input DNA. Genes shown: MRPL12, EIF3S9, EBNA1BP2, BOP1, ATIC, BYSL, ZRF1, c4orf9, 39418_at, NOL5A, PRMT3, TCP1]
c-MYC targets
Neighbor | Total # | Total | Validated by ChIP
1st | 56 | 29 (51.8%) | 22 (39.3%)
2nd | (19.4%) | 251 (12.5%)
Expected background ~10% (1,300/12,600)
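As a rough check of how far the first-neighbor rate sits above the stated background, the sketch below computes a one-sided binomial tail for 29 hits among 56 first neighbors against a background rate of 1,300/12,600. Treating the neighbors as independent draws is an assumption made here for illustration; the slide does not say which test, if any, the authors used.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


background = 1300 / 12600   # expected background rate quoted on the slide (~10%)
first_neighbors = 56        # number of 1st neighbors
hits = 29                   # 29 of 56 (51.8%) in the 1st-neighbor row

observed_rate = hits / first_neighbors
p_value = binomial_tail(hits, first_neighbors, background)
print(f"observed {observed_rate:.1%} vs background {background:.1%}, p = {p_value:.2e}")
```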
Activated NOTCH1 and c-MYC network in T-ALL
Palomero et al. (2006). Copyright ©2006 by the National Academy of Sciences
Activated NOTCH1 and c-MYC regulatory network in T-ALL.
(a) A metagene based on the gene expression signature associated with levels of activated NOTCH1 protein in T-ALL cell lines was integrated in an ARACNe global regulatory network constructed with microarray expression data from T-ALL samples. Neighbors of the NOTCH1 metagene identified as NOTCH1 direct target genes by ChIP-on-chip are shaded in pink, neighbors regulated in T-ALL cells treated with a GSI are shaded in blue, and neighbor genes showing both significant promoter occupancy by ChIP-on-chip and regulation upon GSI treatment are shaded in purple.
(b) Detailed representation of overlap between NOTCH1 neighbor genes and ChIP-on-chip data. The intensity of each neighboring node represents the significance level of promoter occupancy by NOTCH1.
(c) Detailed representation of NOTCH1 neighbor genes regulated in T-ALL cells treated with a GSI. The intensity of each neighboring node, corresponding to the scale panel on the right, represents the significance level of gene regulation by GSI treatment.
(d) Representation of direct neighbor genes associated with the NOTCH1 metagene and with c-MYC.
Mesenchymal Signature Regulation in Glioblastoma
Iavarone-Califano Collaboration
Six transcription factors combinatorially control a mesenchymal gene expression signature in glioblastoma; this signature is associated with poor prognosis.
Validation Results
–Method works well even on very heterogeneous data
–Validation rate above ~60%
Alcohol and Cocaine Addiction in Rats
–Rat Brain Microarray Expression Profiles
–ARACNe Predictions: TF control of addiction signature
–ChIP Validation of PBX1 Predicted Targets
Regulatory Control in Eukaryotic Cells
The ability of a transcription factor (TF) to control its target genes is regulated at multiple levels:
–Post-translationally (on the TF protein)
  Phosphorylation
  Acetylation
  Stability
  Others
–Epigenetically (on target genes)
  DNA methylation
  Histone methylation and acetylation
  Formation of transcription complex with co-factors
  TF competition or synergy
Thus, the function of a TF depends on the presence of other molecules (modulators).
Modulator Analysis
[Figure labels: Joe, Mary, Tony; MYC, TERT, GSK3; degradation signal]
MINDY (Modulator Inference by Network Dynamics)
Single Modulator, Multiple Targets
[Figure: repressed and activated targets, compared between samples with high and low modulator expression]
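The slide contrasts TF-target behavior at high versus low modulator expression. A minimal sketch of that comparison, assuming the quantity of interest is the difference in TF-target mutual information between the two sample tails: the median binarization, the `tail_size` parameter, and the toy data are hypothetical choices for illustration, and MINDY's actual estimator and significance test are not described here.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def binarize(values):
    """Discretize to 0/1 around the upper median (a deliberate simplification)."""
    median = sorted(values)[len(values) // 2]
    return [1 if v >= median else 0 for v in values]


def delta_mi(tf, target, modulator, tail_size):
    """MI(TF; target) in the high-modulator tail minus MI in the low-modulator tail."""
    order = sorted(range(len(modulator)), key=lambda i: modulator[i])
    low, high = order[:tail_size], order[-tail_size:]

    def mi_on(indices):
        return mutual_information(
            binarize([tf[i] for i in indices]),
            binarize([target[i] for i in indices]),
        )

    return mi_on(high) - mi_on(low)


# Hypothetical toy data: the target tracks the TF only when the modulator is high.
modulator = list(range(20))
tf = [float(i % 2) for i in range(20)]
target = [float((i // 2) % 2) if i < 10 else tf[i] for i in range(20)]

change = delta_mi(tf, target, modulator, tail_size=8)
print(f"delta MI = {change:.3f} nats")
```

A positive difference suggests the modulator enhances the TF-target relationship, a negative one that it antagonizes it. With real data the difference would have to be compared against a null distribution before calling a modulator.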
MYC Modulation in Human B Lymphocytes
Phenotypic variability
–254 microarray gene expression profiles
–27 distinct naturally occurring and tumor-derived B cell phenotypes
MYC proto-oncogene
–Important genetic hub of B cell physiology
–Transcription factor known to be widely differentially regulated
Kinases, Phosphatases & Acetyltransferases
Columns: Modulator, M#, M+, M-, Mode, Description, Evidence
CSNK2A: Casein kinase 2, alpha 1 (Evidence: HPRD)
PPAP2B: Phosphatidic acid phosphatase 2B (Evidence: Activates GSK3)
HCK: Hemopoietic cell kinase (Evidence: BCR pathway)
SAT: Spermidine N1-acetyltransferase
DUSP: Dual specificity phosphatase 2 (Evidence: Dephosphorylates ERK2)
MAP4K: MAP kinase kinase kinase kinase 4 (Evidence: BCR pathway)
PPM1A: Protein phosphatase 1A
CSNK1D: Casein kinase 1, delta
GCAT: Glycine C-acetyltransferase
TRIO: Triple functional domain
PRKCI: Protein kinase C, iota (Evidence: BCR pathway)
PRKACB: Protein kinase, catalytic, beta (Evidence: BCR pathway)
STK: Serine/threonine kinase 38
MTMR: Myotubularin related protein 6
NEK: NIMA-related kinase 9
MYST: MYST histone acetyltransferase 1
MAPK: MAP kinase 13 (Evidence: BCR pathway)
OXSR: Oxidative-stress responsive 1
DUSP: Dual specificity phosphatase 4
MAP2K: MAP kinase kinase 3 (Evidence: BCR pathway)
PPP4R: Protein phosphatase 4, R1
ERK: MAP kinase 1 (Evidence: BCR pathway)
MAP4K: MAP kinase kinase kinase kinase 1 (Evidence: BCR pathway)
CSNK1E: Casein kinase 1, epsilon
FYN: FYN oncogene
NEK: NIMA-related kinase 7
CSNK2A: Casein kinase 2, alpha (Evidence: Related to CSNK2A1)
DUSP: Dual specificity phosphatase 5
…
PP2A: Serine/threonine-protein phosphatase 2A
HDAC1808-: Histone deacetylase
(12/28, 43%)
STK38 co-precipitates with c-Myc in HeLa
Transcription Factor Analysis
Columns: Modulator, M#, M+, M-, Mode, Description, Evidence, P BS
AHR: Aryl hydrocarbon receptor; P BS: -
SMAD: Mothers against DPP homolog 3; Evidence: HPRD; P BS: -
CREM: cAMP responsive element modulator; P BS: 1×
DDIT: DNA-damage-inducible transcript
DRAP: DR1-associated protein 1; P BS: -
ZKSCAN: Zinc finger with KRAB and SCAN 1; P BS: -
BHLHB: Basic HLH domain containing, B2; P BS: 9×
NR4A: Nuclear receptor subfamily 4, A1; P BS: -
ATF: Activating transcription factor 3; P BS: 1×
UBTF: Upstream binding transcription factor; P BS: -
NR4A: Nuclear receptor subfamily 4, A2; P BS: -
SOX: SRY-box
HOXB: Homeo box B7; P BS: -
BACH: bZIP transcription factor 1; P BS: 5×10^-9
ARNT: AHR nuclear translocator; P BS: -
ETV: ETS variant gene 5; P BS: -
IRF: Interferon regulatory factor 1; P BS: 6×10^-8
TCF: Transcription factor 12; P BS: 2×10^-3
GTF2I: General transcription factor II, i; Evidence: HPRD; P BS: -
NFKB: NFkB 2 (p49/p100); Evidence: BCR pathway; P BS: 0.70
FOS: v-Fos oncogene homolog; P BS: 2×10^-3
JUN: v-Jun oncogene homolog; P BS: 1×10^-7
ZNF354A: Zinc finger protein 354A; P BS: -
CUTL: Cut-like 1 protein; P BS: -
SMARCB: Regulator of chromatin, B1; Evidence: HPRD; P BS: -
CBFA2T: Core-binding factor, alpha 2T3; P BS: -
DBP: Albumin D-box binding protein; P BS: 6×10^-3
MAF: v-Maf oncogene homolog; P BS: -
RELB: v-Rel oncogene homolog B; P BS: -
ZNF: Zinc finger protein
ESR: Estrogen receptor
NFATC: Nuclear factor of activated T-cells 4; Evidence: BCR pathway; P BS: 0.92
TFEC: Transcription factor EC; P BS: -
CITED: Cbp/p300-interacting transactivator 2; P BS: -
NFYB: NFkB binding protein; Evidence: HPRD; P BS: 2×10^-3
…
MEF2B14113+: MADS box transc. enhancer factor 2, polypeptide B; P BS: -
(18/35, 51%)
Genome-wide Kinome-Transfactome
Kinome (846): DUSP10, CDC25B, STK17B, PPAP2B, PRKACB, TRIB2, CCL2, DUSP7, HCK, MAP3K5, ..., ABL1, FES, AURKC, EPHB2, ROR2, JARID1B, AKT1, AAK1, PTPN3, RAF1
Transfactome (889): ETS, MAFF, SMAD, ATF, ZNF, SOX, CREM, CHD, E2F, GTF3A, SS18L, HOXC, FOXD, KLF, WT, HOXC, MYT, CRX, DGCR6L, DUX
Genome Wide Pathway Validation
Contingency table: PPI (+/-) vs. MINDY (+/-), with row and column totals; Total = 889 x 846
χ² = 470.6, p < 1x
Consider a set of 'gold standard' modulators of a TF to include:
–Direct modulators: signaling proteins with known PPIs with the TF
–Pathway modulators: signaling proteins on any pathway that includes the TF (KEGG pathways)
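The validation above rests on a 2x2 table crossing MINDY predictions with gold-standard membership. Since the cell counts did not survive on the slide, the sketch below only shows how a Pearson chi-square statistic and its p-value (1 degree of freedom) are obtained from such a table. The counts in `toy_table` are invented for illustration and are unrelated to the reported χ² = 470.6.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = sum(
        (observed[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2)
        for j in range(2)
    )
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = MINDY +/-, columns = gold standard +/-.
toy_table = (10, 20, 20, 10)
chi2, p_value = chi_square_2x2(*toy_table)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")

# Number of TF-modulator pairs tested, as stated on the slide.
total_pairs = 889 * 846
```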
The B Cell Interactome
An Integrated Cellular Network for the Dissection of Lymphoid Malignancies
Celine Lefebvre, Kai Wang, Wei Keat Lim, Katia Basso, Riccardo Dalla Favera
Protein-DNA interactions
–Orthologous interactions in mouse
–ARACNE (254 B cell expression profiles)
–TFBS (MATCH)
–Target gene co-expression
Protein-Protein interactions
–Y2H, MassSpec
–Orthologous interactions
–Biological Process annotations
–GeneWays (e.g., RGS4 block RASD1; CKS1A interact SKP2; CD4 bind TFAP2A; GPNMB contain PPFIBP1; TACR1 require PARP1)
–Gene co-expression (254 B cell expression profiles)
Lefebvre et al. A context-specific network of protein-DNA and protein-protein interactions reveals new regulatory motifs in human B cells. Submitted for Recomb Satellite on Systems Biology, Dec. 1-2, 2006
B Cell Interactome
Mixed interaction network (Bayesian Evidence Integration)
–40,000 Protein-Protein Interactions predicted from:
  Orthologous interactions (fly, mouse, worm, yeast)
  Yeast 2-hybrid human datasets
  Gene co-expression (B cells, mutual information)
  Gene Ontology biological process annotations
  GeneWays (literature mining algorithm)
  Structural Clues (to be added)
–10,000 Protein-DNA Interactions from:
  Orthologous interactions in mouse (Transfac, BIND)
  Transcription factor binding sites (MATCH)
  GeneWays
  ARACNE
–120,000 Post-translational Modulator Interactions from:
  MINDY
  Structural Clues (to be added)
–8,500 genes (900 TFs)
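The slide names Bayesian evidence integration without giving details. One common way to combine heterogeneous evidence sources is a naive Bayes scheme, in which each source contributes a likelihood ratio and the sources are treated as conditionally independent. The sketch below assumes that scheme; the prior and the likelihood ratios are made-up numbers, not values from the B cell interactome.

```python
from math import prod


def posterior_probability(prior, likelihood_ratios):
    """Combine a prior with independent likelihood ratios (naive Bayes on the odds scale)."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1 + posterior_odds)


# Hypothetical likelihood ratios for one candidate protein-protein interaction.
evidence = {
    "orthologous_interaction": 20.0,
    "yeast_two_hybrid": 8.0,
    "b_cell_coexpression": 3.0,
}
prior = 0.001  # assumed prior probability that a random protein pair interacts

probability = posterior_probability(prior, evidence.values())
print(f"posterior probability of interaction: {probability:.3f}")
```

An interaction would then be kept when its posterior passes a chosen cutoff. The independence assumption is the weak point of this scheme, since sources such as co-expression and shared annotation tend to be correlated.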
Using the Network for Cancer Research
Network Dysregulation in Cancer
Kartik Mani, Celine Lefebvre, Kai Wang, Wei Keat Lim, Adam Margolin, Katia Basso, Riccardo Dalla Favera
Algorithm Overview
1. Network Model Creation
2. Dysregulated Edge Analysis
3. Root Cause Integrative Analysis
[Figure: a TF with targets T1 to T4; three edges marked X]
Entrez | Name | Score | p-value
5090 | PBX3 | Inf
PHTF E
NFX E
ZNF E
DRAP E-134
Edge Phenotype Map
LOF (loss of function), GOF (gain of function)
Network Analysis
Analyze the B Cell Knowledge Base given a phenotype selection (e.g., Follicular Lymphoma)
Interaction types: normal P-P, dysregulated P-P, normal P-D, dysregulated P-D, modulation
Score(g_i) = -log p(g_i in S) - log p(g_i modulates S)
–Probability computed by Fisher Exact Test (FET)
–S = set of dysregulated interactions and associated genes
[Figure panels: Phenotype-Based Dysregulation Interactions; Analyze any Dysregulated Module (networks of TFs and modulators M)]
Gene Scoring
[Figure: TF1 with targets T1 to T5 and TF2 with targets T1 to T3; legend: normal interaction, dysregulated interaction]
–There are a total of 5 dysregulated interactions
–3 out of 5 are connected to TF1 (out of 6 total)
–2 out of 5 are modulated by TF1
–We calculate the probability of observing this pattern by chance (Fisher Exact Test)
Score(TF1) = sig.(Direct Effect) + sig.(Modulator Effect)
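The TF1 example can be worked through with a one-sided Fisher exact test, which for a 2x2 table reduces to a hypergeometric tail. The slide gives the 5 dysregulated interactions, the 3 of TF1's 6 interactions that are dysregulated, and the 2 dysregulated interactions modulated by TF1. The network size (`total_interactions = 40`), the number of interactions TF1 modulates overall (`tf1_modulated = 4`), and the base-10 logarithm are assumptions added here so the arithmetic can run.

```python
from math import comb, log10


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) when `draws` items are taken from `population` containing `successes`."""
    total = comb(population, draws)
    upper = min(successes, draws)
    return sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    ) / total


total_interactions = 40   # ASSUMED size of the network (not given on the slide)
dysregulated = 5          # from the slide
tf1_direct = 6            # TF1's interactions (from the slide)
tf1_direct_hits = 3       # dysregulated interactions connected to TF1 (from the slide)
tf1_modulated = 4         # ASSUMED number of interactions TF1 modulates
tf1_modulated_hits = 2    # dysregulated interactions modulated by TF1 (from the slide)

p_direct = hypergeom_tail(tf1_direct_hits, total_interactions, dysregulated, tf1_direct)
p_modulator = hypergeom_tail(tf1_modulated_hits, total_interactions, dysregulated, tf1_modulated)

# Score(TF1) = sig.(Direct Effect) + sig.(Modulator Effect)
score_tf1 = -log10(p_direct) - log10(p_modulator)
print(f"p_direct = {p_direct:.4f}, p_modulator = {p_modulator:.4f}, score = {score_tf1:.2f}")
```

Genes are then ranked by this score, so a candidate whose own interactions and modulated interactions are both enriched for dysregulation rises to the top.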
Benchmarking (cont'd)
Phenotype | Causal Gene | Description | Rank, T-Test
Germinal Center (GC) | BCL6 | BCL6 necessary for GC formation | 12238
Follicular Lymphoma (FL) | BCL2 | t(14;18) | 11696
Burkitt Lymphoma (BL) | MYC (MTA1 2nd) | t(8;14), t(8;2), t(8;22); MTA1 KO mice do not develop BL with MYC transl. | 539
Mantle Cell Lymphoma (MCL) | Cyclin D1/BCL1 | t(11;14) | 168
Conclusions
–Computational algorithms and integrative methods are starting to achieve high precision in human cells, equivalent to experimental assays (>80%)
–Cellular context specificity is critical
–Higher-order interactions cannot be discarded a priori
–Networks are useful to dissect cellular phenotypes
Tools availability:
–ARACNe and the B Cell Interactome are available:
–MINDY will be available by Summer '07
Acknowledgments
Califano Lab Computational
–Adam Margolin
–Kai Wang
–Ilya Nemenman (LANL)
–Nilanjana Banerjee (Philips)
–Celine Lefebvre
–Wei Keat Lim
–Kartik Mani
–In Sock Jang
–Alberto Ambesi-Imparato
–Manjunath Kustagi
–Sean Zhou
–Kaushal Kumar
–Quian Feng
–Achint Sethi
Califano Lab Experimental
–Rachel Cox
–Mariano Alvarez
–Brygida Bisikirska
–Presha Rajbhandari
Institute of Cancer Genetics
–Riccardo Dalla Favera
–Katia Basso
–Mas Saito
–Ulf Klein
Rzhetsky Lab (GeneWays)
Honig Lab
–Alona Sosinski
IBM CBC
–Gustavo Stolovitzky
–Yuhai Tu
geWorkbench Team
–John Watkinson
–Beerooz Badii
–Ken Smith
–Matt Hall
–Xiaoqing Zhang
–Eileen Daly
–Kiran Keshav
–Pavel Morozov
–Mary Van Ginholven
Funding provided by NCI, NIAID, and the NIH Roadmap