MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Armstrong et al., Nature Genetics 30, 41-47 (2002)



EXPRESSION MATRIX E: take log2 E, then center and normalize.
DATA (E. Canaani): ALL LEUKEMIA, 27 PATIENTS: 14 with MLL translocation ("MLL"), 13 without ("ALL"). 3250 genes passed the filter.
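The preprocessing named on this slide (log2, center, normalize) can be sketched as follows. The slide does not say how centering and normalization were done; this sketch assumes each gene (row) is shifted to mean zero and scaled to unit standard deviation across the samples. The function name and the toy matrix are illustrative only.

```python
import math
import statistics

def log2_center_normalize(E):
    """Rows are genes, columns are samples; all values must be positive.

    Assumption (not stated on the slide): each gene is centered to mean 0
    and scaled to unit standard deviation across the samples.
    """
    out = []
    for row in E:
        logged = [math.log2(x) for x in row]
        mean = statistics.fmean(logged)
        sd = statistics.pstdev(logged)
        if sd == 0:
            # A gene with constant expression carries no information: map it to zeros.
            out.append([0.0 for _ in logged])
        else:
            out.append([(x - mean) / sd for x in logged])
    return out

# Toy example: 2 genes x 4 samples (made-up numbers).
E = [[100.0, 200.0, 400.0, 800.0],
     [50.0, 50.0, 50.0, 50.0]]
N = log2_center_normalize(E)
```

After this step every non-constant gene has the same scale, so the group means and standard deviations of different genes can be compared directly.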

[Figure: expression matrix of the 27 patients (E. Canaani), with the samples grouped as MLL and ALL]

AIM: FIND THE GENES THAT ARE EITHER ACTIVATED OR SUPPRESSED BY THE t(4;11) CHIMERIC MLL PROTEIN.
THESE GENES MAY BE THE CAUSE OF CANCER AND TARGETS FOR THERAPY.
Rozovskaia, Ravid, Getz,....., Canaani: PNAS (2003)

SUPERVISED ANALYSIS: HYPOTHESIS TESTING
USING CLINICAL INFORMATION (MLL / ALL = NO TRANSLOCATION), IDENTIFY DIFFERENTIATING GENES.
NULL HYPOTHESIS: THE EXPRESSION LEVELS OF GENE g IN SAMPLES WITH AND WITHOUT THE MLL TRANSLOCATION ARE DRAWN FROM THE SAME DISTRIBUTION.
USE STANDARD STATISTICAL TESTS, ONE GENE AT A TIME, TO CALCULATE P_g = probability, if the 27 expression levels of gene g (14 from MLL, 13 from ALL) are taken from the same distribution, of a difference between the two groups at least as large as the one observed.
[Figure: t(4;11) MLL vs. ALL without t(4;11)]

Gene: Cluster Incl. U70321: Human herpes virus entry mediator mRNA
MLL: mean = …, std = 0.3
ALL: mean = 0.81, std = 0.82

[Figure: histograms of the expression levels of this gene in the MLL and ALL samples]

[Figure: the same histograms, normalized to frequencies]

t-test: T = -6.6, P = 6e-7

SUPERVISED ANALYSIS: t-TEST
n SAMPLES (27); n_A KNOWN TO BELONG TO GROUP A (14, MLL), n_B TO GROUP B (13, ALL).
CONSIDER ONE PARTICULAR GENE, g; WE HAVE ITS n_A EXPRESSION LEVELS E_gs FOR SAMPLES OF GROUP A AND ITS n_B EXPRESSION LEVELS E_gs FOR SAMPLES OF GROUP B.
t-TEST NULL HYPOTHESIS: ALL n NUMBERS WERE DRAWN FROM THE SAME DISTRIBUTION (GENE g HAS SIMILAR EXPRESSION IN MLL AND ALL).
THE t STATISTIC IS COMPUTED FROM THE MEANS AND THE STANDARD DEVIATIONS OF THE TWO GROUPS OF n_A, n_B NUMBERS; STANDARD TABLES THEN GIVE THE P-VALUE.
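The formula for t did not survive in this transcript. As a minimal sketch, the block below computes the classical pooled-variance (Student) two-sample t statistic, which is an assumption about the variant used; the function name and the two toy groups are made up for illustration.

```python
import math
import statistics

def t_statistic(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # sample variances
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    t = (mean_a - mean_b) / math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    return t, n_a + n_b - 2

# Made-up expression levels of one gene in two small groups.
group_a = [-0.9, -0.6, -1.1, -0.7, -0.8]
group_b = [0.5, 1.2, 0.1, 1.6, 0.9]
t, dof = t_statistic(group_a, group_b)
```

The p-value is then read from the t distribution with `dof` degrees of freedom, from tables or from a statistics library such as `scipy.stats.ttest_ind(a, b, equal_var=True)`.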

SUPERVISED ANALYSIS: t-TEST
ASSUMING THAT THE VARIABLES A, B ARE NORMALLY DISTRIBUTED, THE DISTRIBUTION OF t CAN BE READ OFF STANDARD TABLES.
IF WE FOUND THE VALUE t_g FOR GENE g, WE ASK: WHAT IS THE PROBABILITY P OF OBTAINING A VALUE OF t AT LEAST AS EXTREME AS t_g UNDER THE NULL HYPOTHESIS?
E.G. IF P = 0.01, SUCH A VALUE OF t COULD HAVE BEEN OBTAINED BY CHANCE WITH PROBABILITY 0.01.
WHAT IF WE TESTED 2000 GENES AND FOUND 20? ABOUT 2000 x 0.01 = 20 SUCH GENES ARE EXPECTED BY CHANCE ALONE: THE PROBLEM OF MULTIPLE COMPARISONS.
[Figure: distribution Pr(t) of t for random data, centered at t = 0, with the tail area p marked]


Gene: Cluster Incl. AB005298: Homo sapiens BAI 2 mRNA
MLL: mean = …, std = 0.92
ALL: mean = 0.29, std = 1.00

[Figure: histograms of the expression levels of this gene in the MLL and ALL samples]

[Figure: the same histograms, normalized to frequencies]

t-test: T = …, P = …%

Gene 2000: Homo sapiens mRNA for beta-actin
MLL: mean = 0.21, std = 1.2
ALL: mean = 0.18, std = 1.46

[Figure: histograms of the expression levels of this gene in the MLL and ALL samples]

[Figure: the same histograms, normalized to frequencies]

t-test: T = …, P = 0.96

[Figure: genes ordered by p-value]
929 genes with p < 0.05, ordered by difference of means (ALL - MLL)

REAL DATA (MLL vs. ALL): 929 genes with p < 0.05 (OUT OF 3250 GENES TESTED)
RANDOM DATA: 143 genes with p < 0.05 (OUT OF 3250 GENES TESTED)
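A quick check of why random data still yields many genes with p < 0.05: under the null hypothesis p-values are uniform, so about N x 0.05 genes pass by chance. The spread computed below assumes independent genes, which real expression data do not satisfy, so it is a rough guide only.

```python
import math

n_genes = 3250
alpha = 0.05

expected = n_genes * alpha                          # about 162.5 genes by chance
spread = math.sqrt(n_genes * alpha * (1 - alpha))   # binomial sd, assumes independent genes

observed_random = 143   # random data, from the slide
observed_real = 929     # MLL vs. ALL, from the slide

# Distance of each observed count from the chance expectation, in units of the spread.
z_random = (observed_random - expected) / spread
z_real = (observed_real - expected) / spread
```

The random-data count sits within about two spreads of the chance expectation, while the real-data count is far above it. That gap is the signal; the chance expectation is the number of false positives hiding inside the 929.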


SUPERVISED ANALYSIS: HYPOTHESIS TESTING
USING CLINICAL INFORMATION (MLL / NO TRANSLOCATION), IDENTIFY DIFFERENTIATING GENES.
NULL HYPOTHESIS: THE EXPRESSION LEVELS OF GENE g IN SAMPLES WITH AND WITHOUT THE MLL TRANSLOCATION ARE DRAWN FROM THE SAME DISTRIBUTION.
USE STANDARD STATISTICAL TESTS, ONE GENE AT A TIME, TO CALCULATE P_g = probability, if the 27 expression levels of gene g (14 from MLL, 13 from ALL) are taken from the same distribution, of a difference between the two groups at least as large as the one observed.
BEWARE OF MULTIPLE COMPARISONS!!!

[Figure: genes passing the t-test at 0.05, ordered by difference of means]
929 genes with p < 0.05. BEWARE OF MULTIPLE COMPARISONS!!!
Bonferroni: reject H_0 only for genes with p < 0.05 / N, i.e. p < 0.05 / 3250 = 1.5e-5.
Bonferroni: 30 genes pass at 0.05. This list of 30 genes is "error free" with probability at least 0.95.
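The Bonferroni rule on this slide amounts to one division and one comparison per gene. A minimal sketch, with a hypothetical function name and made-up p-values:

```python
def bonferroni_pass(p_values, alpha=0.05):
    """Indices of the tests whose p-value is below alpha / N (Bonferroni)."""
    threshold = alpha / len(p_values)
    return [i for i, p in enumerate(p_values) if p < threshold]

# Threshold for the 3250 genes of the slide: about 1.5e-5.
threshold_slide = 0.05 / 3250

# Toy example with 5 made-up p-values: the threshold is 0.05 / 5 = 0.01.
toy_p = [1e-7, 0.004, 0.03, 0.2, 0.6]
passed = bonferroni_pass(toy_p)
```

The correction is conservative: it bounds the chance of even one false positive in the whole list, which is why only 30 of the 929 genes survive.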


[Figure: sorted p-values; i = 820, q = 0.15]

How many of the 929 genes are false? FDR = 0.17: 929 x 0.17 = 158 false separating genes.
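The slide does not show where the FDR of 0.17 comes from. One simple estimate that reproduces it, offered here as an assumption, divides the number of genes expected to pass by chance by the number that actually passed:

```python
def estimated_fdr(n_tested, p_cutoff, n_selected):
    """Expected false positives under the null, divided by the number selected."""
    return n_tested * p_cutoff / n_selected

# 3250 genes tested, cutoff p < 0.05, 929 genes selected (numbers from the slides).
fdr = estimated_fdr(3250, 0.05, 929)
expected_false = fdr * 929
```

This gives an FDR of roughly 0.17 and about 160 false genes, in line with the slide's 158, which uses the rounded value 0.17.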

How many genes pass at FDR = 0.05? 297 genes, of which 297 x 0.05 = 15 are expected to be false separating genes.
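Selecting genes at a given FDR is commonly done with the Benjamini-Hochberg step-up procedure: sort the p-values and keep the largest rank i with p_(i) <= i * q / N. The slides do not name the procedure, so treating it as the method used here is an assumption; the p-values below are made up.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Indices of the tests rejected by the BH step-up procedure at level q."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / n:
            cutoff_rank = rank  # remember the largest rank that meets the bound
    return sorted(order[:cutoff_rank])

# Toy example with 10 made-up p-values.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(toy_p, q=0.05)
```

Unlike Bonferroni, the threshold grows with the rank, so more genes pass, at the price of accepting a controlled fraction q of false ones in the list.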

15 out of 297 are false.

[Figure: random data]

[Figure: 100 separating genes (p < 0.001), 1900 random genes]

PROBLEM: HIDDEN, CONFOUNDING VARIABLES (FACTORS)
MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Armstrong et al., Nature Genetics 30, 41-47 (2002)

HEMATOPOIESIS: differentiation from STEM CELLS to mature cells
STEM CELLS: immortal (unlimited number of divisions), multipotent (differentiation into many targets)
Mature cells: finite lifetime, finite number of divisions, fixed class (B stays B, etc.)
[Figure: differentiation from stem cells to B, T, NK and red cells. Caption: Hematopoiesis: HSCs can be categorized into long-term self-renewing HSCs, short-term self-renewing HSCs and multipotent progenitors (red arrows indicate self-renewal). HSCs give rise to common lymphoid progenitors (CLPs) and to common myeloid progenitors (CMPs). CMPs mature into red blood cells, megakaryocytes (cells producing platelets), granulocytes, dendritic cells, and macrophages. CLPs differentiate into B and T lymphocytes, natural killer cells and dendritic cells. (Adapted from Reya et al., 2001)]
LEUKEMIA: MALIGNANCY INDUCED BY MUTATIONS, TRANSLOCATION,..
ACUTE MYELOID LEUKEMIA (AML), ACUTE LYMPHOID LEUKEMIA (ALL)

LEUKEMIA: MALIGNANCY INDUCED BY MUTATION OR TRANSLOCATION
ARREST OF DIFFERENTIATION, UNCONTROLLED PROLIFERATION, OVERCROWDING AND DEATH OF NORMAL CELLS
[Figure: normal differentiation from EARLY to LATE stages, with PRO B-cell ALL, PRE B-cell ALL and PRE T-cell ALL marked along it]

TRANSLOCATION: DURING DNA REPLICATION TWO STRANDS, FROM TWO DIFFERENT CHROMOSOMES, CROSS
MLL: GENE ON BAND 23 OF CHROMOSOME 11
AF4 (partner gene): GENE ON BAND 21 OF CHROMOSOME 4
t(4;11): MLL fusion protein
MLL TRANSLOCATIONS ARE IMPLICATED IN 10% OF ALL AND IN UP TO 80% OF INFANT LEUKEMIA

AIM: FIND THE GENES THAT ARE EITHER ACTIVATED OR SUPPRESSED BY THE t(4;11) CHIMERIC MLL PROTEIN.
THESE GENES MAY BE THE CAUSE OF CANCER AND TARGETS FOR THERAPY.
Rozovskaia, Ravid, Getz,....., Canaani: PNAS (2003)

EXPRESSION DATA (CANAANI): 27 ALL samples, 14 with MLL translocation t(4;11), 13 without translocation. ONLY GENES THAT PASSED THE FILTER ARE USED.
USE A STANDARD STATISTICAL TEST TO LOOK FOR GENES THAT SEPARATE t(4;11) MLL FROM ALL. THE PROBLEM OF MULTIPLE COMPARISONS IS SOLVED BY CONTROLLING THE FALSE DISCOVERY RATE (FDR).
230 genes are differentially expressed between ALL with t(4;11) and ALL without, under FDR control.
There is another factor (differentiation) that separates these two groups of samples! Which of the 230 genes respond to the MLL translocation?
[Figure: expression of the separating genes in t(4;11) MLL and in ALL without t(4;11)]


Two t(4;11) samples seem different: they are closer to ALL without translocations. Mistaken diagnosis? NO!!
The t(4;11) samples are known to be EARLY DIFFERENTIATING. Perhaps some of the "separating genes" are not directly sensitive to the presence of the MLL abnormality, but to other characteristics of these cells (such as EARLY DIFFERENTIATION)?
Identify the MLL-sensitive genes. HOW? USE the CD10- samples: early differentiating ALL without translocations.
[Figure: ALL t(4;11) (MLL), ALL without t(4;11), and CD10- samples]

Group           Translocation   Late differentiation   CD10-
MLL             Yes             No
CD10-           No              No                     Yes
T, Pre-B ALL    No              Yes

MLL vs ALL: FDR = 5%, 448 genes. Sensitive to translocation and/or differentiation.
MLL vs CD10-: FDR = 12%, 144 genes. Sensitive to translocation and/or CD10-.
ALL vs CD10-: FDR = 12%, 167 genes. Sensitive to differentiation and/or CD10-.
[Figure labels: translocation sensitive; differentiation sensitive]
WHERE ARE THE MLLs THAT LOOK LIKE ALL?
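One plausible reading of how the three pairwise comparisons isolate the translocation-sensitive genes is plain set logic. This is an interpretation, not the authors' stated procedure: a gene counts as translocation sensitive if it separates MLL from both non-translocated groups but does not separate those two groups from each other, and analogously for differentiation. The function name and gene identifiers are hypothetical.

```python
def split_by_sensitivity(mll_vs_all, mll_vs_cd10neg, all_vs_cd10neg):
    """Each argument is the set of genes found in one pairwise comparison."""
    # Differs wherever the translocation differs, and nowhere else.
    translocation = (mll_vs_all & mll_vs_cd10neg) - all_vs_cd10neg
    # Differs wherever the differentiation stage differs, and nowhere else.
    differentiation = (mll_vs_all & all_vs_cd10neg) - mll_vs_cd10neg
    return translocation, differentiation

# Made-up gene lists for illustration.
mll_vs_all = {"g1", "g2", "g3", "g4"}
mll_vs_cd10neg = {"g1", "g2", "g5"}
all_vs_cd10neg = {"g3", "g5", "g6"}

translocation, differentiation = split_by_sensitivity(
    mll_vs_all, mll_vs_cd10neg, all_vs_cd10neg
)
```

Genes that appear in only one comparison, or in the two comparisons involving CD10-, cannot be attributed cleanly to either factor under this reading.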

AIM: FIND THE GENES THAT ARE EITHER ACTIVATED OR SUPPRESSED BY THE t(4;11) CHIMERIC MLL PROTEIN.
FINDING: WE IDENTIFIED 46 GENES THAT ARE ACTIVATED OR SUPPRESSED BY THE MLL ONCOGENE. THESE ARE TARGETS OF NEXT-STAGE EXPERIMENTS ON MICE.
SPINOFF: SOME MLLs ARE LATE DIFFERENTIATORS.
Rozovskaia et al., PNAS (2003)

[Figure: separation. ALL and MLL samples in the (E1, E2) plane, divided by the line E1 - 2 E2 = 0; one group lies in the region E1 - 2 E2 < 0, the other in E1 - 2 E2 > 0]

[Figure: projection 1. ALL and MLL samples in the (E1, E2) plane, with the direction w]
PROJECTIONS ON w DO SEPARATE ALL FROM MLL

[Figure: projection 2. ALL and MLL samples in the (E1, E2) plane, with the direction w']
PROJECTIONS ON w' DO NOT SEPARATE ALL FROM MLL
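The two projection slides can be reproduced with a few lines. The point clouds are made up so that they lie on either side of the line E1 - 2 E2 = 0; which group sits on which side in the original figure is not recoverable, so the assignment here is arbitrary.

```python
def project(points, w):
    """Scalar projection of each 2-D point (E1, E2) onto the direction w."""
    norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
    return [(p[0] * w[0] + p[1] * w[1]) / norm for p in points]

def separates(proj_a, proj_b):
    """True if the two sets of projections do not overlap on the line."""
    return max(proj_a) < min(proj_b) or max(proj_b) < min(proj_a)

# Toy clouds on either side of the line E1 - 2*E2 = 0.
all_pts = [(3.0, 0.5), (4.0, 1.0), (5.0, 1.5), (2.0, 0.2)]
mll_pts = [(1.0, 1.5), (2.0, 2.0), (3.0, 2.5), (0.5, 1.0)]

w = (1.0, -2.0)        # normal to the separating line
w_prime = (2.0, 1.0)   # runs along the separating line

good = separates(project(all_pts, w), project(mll_pts, w))
bad = separates(project(all_pts, w_prime), project(mll_pts, w_prime))
```

Projecting on w collapses each cloud onto its own side of zero, while projecting on w' mixes the two groups along the line.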

[Figure: projection 3. Two clouds in the (E1, E2) plane]
WELL SEPARATED CENTERS OF MASS, NO SEPARATION OF THE TWO CLOUDS

[Figure: projection 4. Two clouds in the (E1, E2) plane]
WEAK SEPARATION OF CENTERS OF MASS, GOOD SEPARATION OF THE TWO CLOUDS
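The contrast between these two slides is that the distance between the centers of mass matters only relative to the spread of the clouds. A standard score that captures this is Fisher's criterion, the squared difference of the projected means divided by the sum of the projected variances. The numbers below are made-up one-dimensional projections.

```python
import statistics

def separation_score(proj_a, proj_b):
    """(difference of means)^2 / (sum of variances) of 1-D projections."""
    gap = statistics.fmean(proj_a) - statistics.fmean(proj_b)
    spread = statistics.pvariance(proj_a) + statistics.pvariance(proj_b)
    return gap * gap / spread

# Centers 10 apart, but wide clouds that overlap heavily.
wide_a = [-10.0, 0.0, 10.0, 20.0, 30.0]
wide_b = [0.0, 10.0, 20.0, 30.0, 40.0]

# Centers only 1 apart, but tight clouds that do not overlap.
tight_a = [0.9, 1.0, 1.1]
tight_b = [1.9, 2.0, 2.1]

score_wide = separation_score(wide_a, wide_b)
score_tight = separation_score(tight_a, tight_b)
```

The tight clouds score far higher than the wide ones even though their centers are ten times closer.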

[Figure: from Fisher to perceptron. ALL and MLL samples in the (E1, E2) plane]
OPTIMAL LINE TO PROJECT ON: FISHER, PERCEPTRON
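Both ways of choosing the line can be sketched in two dimensions without any library. The Fisher direction below is the textbook closed form w = S_W^{-1} (m_A - m_B), and the perceptron is the classic mistake-driven update; the helper names and the toy points are illustrative and not taken from the slides.

```python
def mean2(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter2(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around the mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

def fisher_direction(a, b):
    """w = S_W^{-1} (m_a - m_b) for two 2-D point clouds."""
    ma, mb = mean2(a), mean2(b)
    axx, axy, ayy = scatter2(a, ma)
    bxx, bxy, byy = scatter2(b, mb)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy  # must be non-zero for the inverse to exist
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)

def perceptron(a, b, max_epochs=1000):
    """Find (w, bias) with w.x + bias > 0 on a and < 0 on b, if separable."""
    data = [(p, 1) for p in a] + [(p, -1) for p in b]
    w, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        errors = 0
        for (x, y), label in data:
            if label * (w[0] * x + w[1] * y + bias) <= 0:
                # Misclassified point: move the line toward classifying it correctly.
                w[0] += label * x
                w[1] += label * y
                bias += label
                errors += 1
        if errors == 0:
            break
    return (w[0], w[1]), bias

# Toy clouds on either side of the line E1 - 2*E2 = 0.
all_pts = [(3.0, 0.5), (4.0, 1.0), (5.0, 1.5), (2.0, 0.2)]
mll_pts = [(1.0, 1.5), (2.0, 2.0), (3.0, 2.5), (0.5, 1.0)]

w_fisher = fisher_direction(all_pts, mll_pts)
w_perc, b_perc = perceptron(all_pts, mll_pts)
```

Fisher's direction uses only the means and scatter of the two clouds, whereas the perceptron stops at any line that classifies every training point correctly, so the two generally return different lines for the same data.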