Lawrence Hunter, Ph.D. Director, Computational Bioscience Program University of Colorado School of Medicine

Presentation transcript:

Lawrence Hunter, Ph.D. Director, Computational Bioscience Program University of Colorado School of Medicine Microarrays Tzu Lip Phang, Ph.D. Associate Professor of Bioinformatics Division of Pulmonary Sciences and Critical Care Medicine University of Colorado School of Medicine

Data Science AKA BIG DATA

The Devil is in the Details

Workshop

The Central Dogma Transcriptome Genome

Microarrays in the Literature

Microarray: Primer

Basic Statistical Analysis

Power Analysis: How many biological replicates? In my experience: at least 3, preferably 5, even 7. Bioconductor package: SSPA
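To make the replicate question concrete, here is a minimal sketch of a sample-size estimate for comparing two groups. It uses the textbook normal approximation, not the method implemented in SSPA, and the function name `replicates_per_group` and its parameters are illustrative assumptions. It also ignores multiple testing, so real microarray studies will usually need more replicates than it suggests.

```python
from math import ceil
from statistics import NormalDist


def replicates_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate replicates per group for a two-sided, two-group comparison.

    effect_size is the difference in means divided by the standard deviation.
    The normal approximation slightly underestimates what a t-test needs
    when the number of replicates is small.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Hypothetical effect sizes: a large shift (2 SD) and a moderate one (1 SD).
large_shift = replicates_per_group(2.0)
moderate_shift = replicates_per_group(1.0)
```

Under these assumptions a 2 SD difference needs only a handful of replicates per group, while a 1 SD difference needs many more, which is consistent with "at least 3" being a floor rather than a target.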

Basic Statistical Analysis

QC: Includes image analysis, normalization, and data transformation
Data normalization:
– Removes systematic errors introduced in the labeling, hybridization, and scanning procedures
– Corrects these errors while preserving biological variability / information
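As one simple illustration of normalization, the sketch below log2-transforms each array and subtracts that array's median, so that arrays differing only by an overall intensity scale become comparable. The slides do not say which normalization method was used; median centering is an assumption chosen for brevity, and the intensities are made up.

```python
from math import log2
from statistics import median


def median_center_log2(arrays):
    """log2-transform each array's intensities and subtract that array's median."""
    normalized = []
    for intensities in arrays:
        logged = [log2(v) for v in intensities]
        center = median(logged)
        normalized.append([v - center for v in logged])
    return normalized


# Hypothetical raw intensities: the second array is uniformly twice as bright,
# which stands in for a systematic labeling or scanning effect.
raw = [
    [100.0, 200.0, 400.0],
    [200.0, 400.0, 800.0],
]
normalized = median_center_log2(raw)
```

After centering, both arrays carry the same values, so the twofold brightness difference (a technical artifact in this toy example) is removed while the relative differences between genes within each array are kept.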

Why normalization?

To normalize or not to …

Basic Statistical Analysis

Statistical Testing
Hypothesis testing: Are the means of two groups different from each other?
– Fold change
– Student's t-test

Student's t-Test
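The two per-gene measures named above can be sketched as follows. The code assumes the inputs are already log2 intensities and that both groups share a common variance (the classic pooled-variance Student test). The function names and the example values are illustrative. Turning the statistic into a p-value requires the t distribution, which the standard library lacks; a routine such as `scipy.stats.ttest_ind` would cover that step.

```python
from math import sqrt
from statistics import mean, variance


def log2_fold_change(control, treated):
    """Difference of group means; inputs are assumed to be log2 intensities."""
    return mean(treated) - mean(control)


def student_t(control, treated):
    """Two-sample Student t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(control), len(treated)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / df
    t = (mean(treated) - mean(control)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical log2 intensities for one gene, three replicates per group.
control = [7.0, 8.0, 9.0]
treated = [9.0, 10.0, 11.0]
lfc = log2_fold_change(control, treated)
t, df = student_t(control, treated)
```

Fold change looks only at the size of the difference, whereas the t statistic scales that difference by the replicate-to-replicate variability, which is why the two rankings of genes can disagree.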

What is Multiple Comparison Testing?

Genes      P-value   Critical level   H0
Gene 1     …         <=0.05           1
Gene 2     …         <=0.05           1
Gene 3     …         <=0.05           1
Gene 4     …         <=0.05           1
Gene 5     …         <=0.05           1
Gene 6     0.09      <=0.05           0
Gene 7     0.05      <=0.05           0
Gene 8     0.09      <=0.05           0
Gene 9     0.2       <=0.05           0
Gene 10    0.3       <=0.05           0

Alpha level = 0.05

When there is a large number of tests …

Genes      P-value   Critical level   H0
Gene 1     …         <=0.05           1
Gene 2     …         <=0.05           1
Gene 3     …         <=0.05           1
Gene 4     …         <=0.05           1
Gene 5     …         <=0.05           1
Gene 6     0.09      <=0.05           0
…          …         …                …
Gene …     …         <=0.05           0
Gene …     …         <=0.05           0

Alpha level = … wrong genes …
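The problem with many tests can be put in numbers. The sketch below assumes 1000 tests (the count used on the Bonferroni slide), an uncorrected alpha of 0.05, and the worst case in which no gene is truly differentially expressed. The second function additionally assumes the tests are independent. Both function names are illustrative.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of genes called significant when every null is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Family-wise error rate, assuming independent tests and all nulls true."""
    return 1 - (1 - alpha) ** n_tests


wrong_calls = expected_false_positives(1000)
fwer_ten_tests = prob_at_least_one_false_positive(10)
fwer_thousand_tests = prob_at_least_one_false_positive(1000)
```

With 1000 tests, about 50 genes would pass the 0.05 cutoff by chance alone, and at least one false call is all but certain. Even with only 10 tests the chance of at least one false call is roughly 40%.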

Correction … Bonferroni

Genes      P-value   Critical level   H0
Gene 1     …         <= …             …
Gene 2     …         <= …             …
Gene 3     …         <= …             …
Gene 4     …         <= …             …
Gene 5     …         <= …             …
Gene 6     0.09      <= …             …
…          …         …                …
Gene …     …         <= …             …
Gene …     …         <= …             …

Alpha level = 0.05 / 1000 = 0.00005
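A minimal sketch of the Bonferroni correction: divide the alpha level by the number of tests and compare every p-value against that stricter threshold. The function name and the p-values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return the corrected threshold and a reject/keep flag for each p-value."""
    threshold = alpha / len(p_values)
    return threshold, [p <= threshold for p in p_values]


# Hypothetical p-values for four genes; the threshold becomes 0.05 / 4.
p_values = [0.00001, 0.004, 0.03, 0.09]
threshold, rejected = bonferroni_reject(p_values)

# With 1000 tests, as on the slide, the per-gene threshold is 0.05 / 1000.
threshold_1000, _ = bonferroni_reject([0.5] * 1000)
```

In the four-gene example a p-value of 0.03 would pass an uncorrected 0.05 cutoff but fails the corrected one, which shows how conservative the method becomes as the number of tests grows.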

Strike the balance …
Spectrum of corrections: Bonferroni (most conservative), False Discovery Rate (in between), no correction (most lenient)
The False Discovery Rate (FDR) of a set of predictions is the expected proportion of false predictions in that set.
Example: If the algorithm returns 100 genes with a false discovery rate of 0.3, then we should expect 70 of them to be correct.
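The slides do not name a specific FDR procedure. A common choice is the Benjamini-Hochberg step-up method, sketched here as an assumption: sort the p-values, find the largest rank k whose p-value is at most (k / m) times the target FDR, and reject the k smallest. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, from smallest p-value to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank whose p-value falls under its rank-scaled threshold.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for five genes.
p_values = [0.005, 0.015, 0.025, 0.035, 0.5]
fdr_calls = benjamini_hochberg(p_values)
bonferroni_calls = [p <= 0.05 / len(p_values) for p in p_values]
```

On these values Bonferroni keeps only the first gene, while the FDR procedure keeps the first four, which illustrates the middle ground between the most conservative and most lenient options.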

Put them together

Basic Statistical Analysis

Biological Interpretation