Timothy H. W. Chan, Calum MacAulay, Wan Lam, Stephen Lam, Kim Lonergan, Steven Jones, Marco Marra, Raymond T. Ng Department of Computer Science, University.


Timothy H. W. Chan, Calum MacAulay, Wan Lam, Stephen Lam, Kim Lonergan, Steven Jones, Marco Marra, Raymond T. Ng
Department of Computer Science, University of British Columbia
The British Columbia Cancer Research Centre

- We previously analyzed publicly available breast and brain SAGE libraries using the permutation test (Ng et al., Frontiers of Cardiovascular Science 2003) with some success: 60% of the top-ranked genes for the breast SAGE data were verified to be related to the neoplastic process.
- The BC Cancer Research Centre has produced various lung cancer SAGE libraries, including 5 CIS (carcinoma in situ), 6 invasive and 17 normal libraries.
- It would be interesting to use the permutation test to compare and contrast the stages of lung cancer and to search for small transcriptional changes (pathway regulators, checkpoints, switches).

- To use the permutation test on SAGE libraries from normal tissue and from different stages of lung cancer (CIS and invasive) to discover candidate cancer-related genes.
- To compare and contrast these two stages of lung cancer.
- To demonstrate the advantages and power that the permutation test holds over the t-test.

- To reduce comparison errors, the tag frequencies are normalized by scaling each library up to 300,000 tags.

- Continue to use the permutation test to analyze other SAGE libraries.
- The permutation test also has the power to detect small transcriptional changes, as long as the gene has a consistent tag count across all the libraries. These significant genes with low tag counts (and high permutation scores) require further analysis, as they could be vital pathway regulators, checkpoints or switches that may have led to the onset of lung cancer.
- Validate genes further by experimentation.
- Use validated genes for early cancer detection, or derive new treatments from the data.

- The null hypothesis states that there is no difference between the mean of the normal sample and the mean of the cancer sample.
If this were the case, it would make no difference if we “mix up the labels” of the libraries.
- The alternative hypothesis states that it does make a difference, and that the means of the normal and cancer samples are different.

- The top-ranked genes are investigated for a relation to cancer using the literature currently available on PubMed.
- Some tags map to more than one gene. To deal with this, the expression level of the tag is assigned to each gene the tag maps to. For instance, if tag A maps to genes 1, 2 and 3, all three genes are assigned the tag count of tag A.

[Figure: analysis pipeline. Box labels: Data Pre-Processing; Scoring and Ranking Genes; Literature Verification; 99% confidence; Output.]

[Figure: Permutation Test. Labels: Null Hypothesis; Alternative Hypothesis; Pool together cancer and normal libraries; Simulated Normal Pool (same size as normal samples); Simulated Cancer Pool (same size as cancer samples); Simulated N; Plot; Observed; Score those >= 99% confidence; Mean; Sum of Squares; Standard Deviation; Permutation (Z) Score.]

Verification Criteria:

Criteria #   Related to:
A            Up/down regulated in lung cancer
B            Up/down regulated in a different type of cancer
C            Oncogene / tumor suppressor / mutator
D            Major component of the cell cycle (neoplastic process), or angiogenesis
E            Not previously associated with cancer

- Higher permutation scores correspond either to greater differences between the two samples or to greater differential consistencies between the two samples.
- For each tissue, rank the significant genes by sorting their permutation scores in descending order.

- 1,981 out of 32,871 tags considered failed the permutation test at 99% confidence for normal vs. invasive lung cancer.
- 1,887 out of 40,476 tags considered failed the permutation test at 99% confidence for normal vs. CIS lung cancer.
- 119 out of 20,077 tags considered failed the permutation test for CIS vs. invasive lung cancer.

Power of The Permutation Test

- With the permutation test, the number of samples required for the test to be acceptable is relatively low compared to other statistical tests (e.g., t-test, chi-square).

[Figure: Top N Ranked TAGS Intersections]

- The permutation test is effective at picking out genes that are related to the neoplastic process.
- It is also much better at picking out these genes than the t-test.
- The permutation test between invasive and CIS shows that 119 tags are differentially expressed, which suggests that the two stages of cancer have different genes turned on or off. In addition, the intersections between the top-ranked genes for normal vs. invasive and normal vs. CIS are quite low (among the top 200, only 25% of the tags intersect), which also suggests differences between the two stages.

Top 20 TAGS That Map to Genes - T-test Results

Criteria                          INV vs Normal   CIS vs Normal
A                                 0               0
B                                 0               1
C                                 0               1
D                                 0               0
E                                 5               6
Total Unique Significant Genes    5               8
Total Hypotheticals               11              8

Top 20 TAGS That Map to Genes - Permutation Test Results

Criteria                          INV vs Normal   CIS vs Normal
A                                 1               3*
B                                 4               5*
C                                 0               1
D                                 1               2
E                                 8               3
Total Unique Significant Genes    15              12
Total Hypotheticals               5               1

* Indicates that a duplicate exists (more than one tag maps to the same gene).

- The quality of these genes depends mostly on criteria A and B, followed closely by criteria C and D, as these are important genes in the neoplastic process.
- Hypotheticals, or genes with no known function, did not meet any of the criteria.
- The low intersections suggest that the CIS and invasive stages of cancer are different.
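
The pre-processing, permutation and ranking steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The poster names the 300,000 scaling target, the pooling and relabelling of libraries, and a permutation (Z) score built from a mean and standard deviation. Everything else is an assumption made for the example: the difference of group means as the test statistic, the number of permutations, the ranking by absolute score, and all function names.

```python
import random
import statistics

TARGET_LIBRARY_SIZE = 300_000  # scaling target quoted in the poster


def normalize_library(tag_counts, target=TARGET_LIBRARY_SIZE):
    """Scale raw tag counts so that the library total equals `target`."""
    total = sum(tag_counts.values())
    return {tag: count * target / total for tag, count in tag_counts.items()}


def mean_difference(cancer, normal):
    """Assumed test statistic: difference between the two group means."""
    return statistics.fmean(cancer) - statistics.fmean(normal)


def permutation_z_score(cancer, normal, n_permutations=2000, seed=0):
    """Z score of the observed statistic against label-shuffled pools.

    `n_permutations` and `seed` are hypothetical parameters for this sketch.
    """
    rng = random.Random(seed)
    observed = mean_difference(cancer, normal)
    pooled = list(cancer) + list(normal)  # pool cancer and normal libraries
    simulated = []
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # "mix up the labels"
        sim_cancer = pooled[:len(cancer)]  # same size as the cancer samples
        sim_normal = pooled[len(cancer):]  # same size as the normal samples
        simulated.append(mean_difference(sim_cancer, sim_normal))
    spread = statistics.pstdev(simulated)
    if spread == 0:
        return 0.0  # every relabelling gives the same statistic
    return (observed - statistics.fmean(simulated)) / spread


def rank_tags(scores):
    """Sort tags by absolute permutation score, highest first (an assumption)."""
    return sorted(scores, key=lambda tag: abs(scores[tag]), reverse=True)


if __name__ == "__main__":
    # Invented normalized counts for one tag, for illustration only.
    cancer_counts = [50, 55, 52, 58, 54]
    normal_counts = [10, 12, 9, 11, 13, 10]
    print(permutation_z_score(cancer_counts, normal_counts))
```

Shuffling the pooled values and splitting them at the original group sizes mirrors the simulated normal and cancer pools named in the permutation test figure. Tags whose score clears the chosen confidence cut-off would then be passed to `rank_tags` before literature verification.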