Paper Review on Cross-species Microarray Comparison. Hong Lu, 2008-10-14.

Title: Conservation of Regional Gene Expression in Mouse and Human Brain
Authors: Strand AD, Olson JM, et al.
Year: 2007
Journal: PLoS Genetics

Purpose
Within-species comparison: to find expression differences that distinguish resistant and sensitive tissues and cell types.
Cross-species comparison: to provide a framework for exploring how well the mouse can model diseases of the human brain.

Materials

Human            Group I                                Group II                 Total
Tissue           3: caudate, cerebellum, motor cortex   2: caudate, cerebellum
Persons, man     8                                      7                        15
Persons, woman   4                                      2                        6
Persons, total   12                                     9                        21
Total slides     12 x 3 = 36                            9 x 2 = 18               54
Age range        36 ~ 77                                22 ~ 72                  22 ~ 77
Mean
Affymetrix       HG-U133A
Probesets #      22,283

Species          Human                Mouse (C57BL)
Tissue           3: caudate, cerebellum, motor cortex
Sample, male     8                    1
Sample, female   4                    5
Sample, total    12                   6
Total slides     12 x 3 = 36          6 x 3 = 18
Age range        36 ~ 77 (years)      35 (days)
Mean             58 (years)           35 (days)
Affymetrix       HG-U133A             MOE_430A_2
Probesets #      22,283               22,690

Microarray analysis
1) Normalize the CEL files with Robust Multi-array Average (RMA).
2) Fit a linear model for each of the three tissue pairs with limma (Bioconductor package): gene expression ≈ donor + tissue type
   Caudate/Cerebellum
   BA4 Cortex/Cerebellum
   BA4 Cortex/Caudate
3) Get the log ratio, paired t-statistic and p-value for each probeset.
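With only two tissues in a comparison, the model "donor + tissue type" reduces to a paired comparison across donors. The sketch below illustrates that idea in plain Python for a single probeset. It assumes log2 RMA signals listed in the same donor order, and it computes an ordinary paired t-statistic, not limma's moderated statistic; the function name and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_log_ratio(tissue_a, tissue_b):
    """Return (mean log2 ratio, paired t-statistic) for one probeset.

    tissue_a and tissue_b hold log2 signals for the same donors,
    in the same order (an assumption of this sketch).
    """
    if len(tissue_a) != len(tissue_b):
        raise ValueError("both tissues need one value per donor")
    # On the log2 scale a ratio is a difference, taken donor by donor.
    diffs = [a - b for a, b in zip(tissue_a, tissue_b)]
    log_ratio = mean(diffs)
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return log_ratio, log_ratio / standard_error


# Hypothetical log2 signals for four donors.
caudate = [9.1, 8.7, 9.4, 9.0]
cerebellum = [6.0, 5.9, 6.1, 6.2]
log_ratio, t_stat = paired_log_ratio(caudate, cerebellum)
print(f"log2(caudate/cerebellum) = {log_ratio:.2f}, t = {t_stat:.1f}")
```

A p-value would follow from the t distribution with n - 1 degrees of freedom; that step is left out here to keep the sketch within the standard library.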

Sample result (human)
Columns: Probeset ID | Score: Caudate, Cerebellum, Motor cortex | Caudate/Cerebellum: Log Ratio, t, P.value | …
… _at   … E-21 …
… _at   … E-22 …
… _at   … E-22 …
Caudate score = t-score(Caudate/Cerebellum) + t-score(Caudate/BA4 Cortex)

Different Regions of the Brain Show Many Statistically Significant Differentially Expressed Genes

To select sets of genes whose expression was highly enriched in one of the three regions:
1) Require p < [threshold missing in transcript] and log ratio ≥ 1 in both relevant pair-wise comparisons.
2) Sum the log ratios of the two relevant comparisons. For example, log2(BA4/caudate) + log2(BA4/cerebellum) gives the regional score for candidate BA4 genes.
Figure panels: Caudate, Cerebellum, BA4 Cortex

3) Order the probesets by their summed log ratios.
4) Cull any probeset whose summed regional score is > 2 in more than one region.
Table 3: Selected Regionally Enriched Genes in Human and Mouse Brain Tissues
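The four selection steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: each probeset is a dictionary mapping a tissue pair to (log2 ratio, p-value), the names `REGIONS`, `ratio`, `regional_scores` and `select_enriched` are hypothetical, and because the p-value threshold is missing from the slide it is passed in as `p_cutoff`.

```python
REGIONS = ("caudate", "cerebellum", "BA4")


def ratio(pairs, num, den):
    """Return (log2(num/den), p) from whichever orientation is stored."""
    if (num, den) in pairs:
        return pairs[(num, den)]
    log_ratio, p = pairs[(den, num)]
    return -log_ratio, p


def regional_scores(pairs):
    """Summed log ratio of each region against the other two (step 2)."""
    return {
        region: sum(ratio(pairs, region, other)[0]
                    for other in REGIONS if other != region)
        for region in REGIONS
    }


def select_enriched(probesets, region, p_cutoff):
    """Apply steps 1-4 for one region; p_cutoff is a placeholder value."""
    selected = []
    for probe_id, pairs in probesets.items():
        stats = [ratio(pairs, region, other)
                 for other in REGIONS if other != region]
        # Step 1: both relevant comparisons must pass both filters.
        if not all(p < p_cutoff and lr >= 1 for lr, p in stats):
            continue
        scores = regional_scores(pairs)
        # Step 4: cull probesets scoring > 2 in more than one region.
        if sum(1 for s in scores.values() if s > 2) > 1:
            continue
        selected.append((probe_id, scores[region]))
    # Step 3: order by summed log ratio, highest first.
    return sorted(selected, key=lambda item: item[1], reverse=True)


# Hypothetical probesets: (numerator, denominator) -> (log2 ratio, p-value).
probesets = {
    "probe_a": {("caudate", "cerebellum"): (3.0, 1e-6),
                ("BA4", "cerebellum"): (0.5, 0.2),
                ("BA4", "caudate"): (-2.5, 1e-5)},
    "probe_b": {("caudate", "cerebellum"): (1.2, 1e-6),
                ("BA4", "cerebellum"): (0.1, 0.5),
                ("BA4", "caudate"): (-1.1, 1e-5)},
    "probe_c": {("caudate", "cerebellum"): (4.5, 1e-6),
                ("BA4", "cerebellum"): (3.5, 1e-6),
                ("BA4", "caudate"): (-1.0, 1e-5)},
}
for probe_id, score in select_enriched(probesets, "caudate", p_cutoff=0.001):
    print(probe_id, round(score, 2))
```

In this example probe_c passes the caudate filters but is culled, because its BA4 score (3.5 - 1.0 = 2.5) also exceeds 2.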

Gene Expression Variation between Tissues and Individuals
gene expression ≈ donor + tissue type
Within-tissue variance vs. between-tissue variance
The variance for a probeset, across n samples, was calculated by [formula not preserved in this transcript], where x_i is the RMA signal for probeset i on array n.
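Since the variance formula is missing from the transcript, the following Python sketch relies on two assumptions: the usual sample variance (divisor n - 1), and a simple definition in which within-tissue variance is the mean of the per-tissue variances across donors and between-tissue variance is the variance of the tissue means. The function names and signals are hypothetical; the point is only to show how the two quantities can be compared per probeset.

```python
from statistics import mean, variance


def within_and_between(signals_by_tissue):
    """Return (within-tissue, between-tissue) variance for one probeset.

    signals_by_tissue maps a tissue name to the RMA signals of its samples.
    """
    # Assumed definition: average the sample variance found inside each tissue.
    within = mean(variance(values) for values in signals_by_tissue.values())
    # Assumed definition: sample variance of the per-tissue mean signals.
    between = variance([mean(values) for values in signals_by_tissue.values()])
    return within, between


def fraction_between_greater(probesets):
    """Fraction of probesets whose between-tissue variance is the larger one."""
    flags = []
    for signals in probesets.values():
        within, between = within_and_between(signals)
        flags.append(between > within)
    return sum(flags) / len(flags)


# Hypothetical RMA signals: one tissue-driven and one donor-driven probeset.
example = {
    "probe_x": {"caudate": [8.0, 8.2, 7.9],
                "cerebellum": [5.0, 5.1, 4.8],
                "BA4": [6.0, 6.3, 6.1]},
    "probe_y": {"caudate": [7.0, 9.0, 5.0],
                "cerebellum": [7.1, 5.2, 8.9],
                "BA4": [6.9, 8.8, 5.3]},
}
print(fraction_between_greater(example))
```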

The between-tissue variability was greater than the within-tissue variability for 89% of the human probesets and 85% of the mouse probesets.
Conclusion: Compared with expression dictated by regional identity, age and gender appear to have either small effects, or large effects on only a small fraction of genes, even in humans.

Cross-Species Comparison of Regional Gene Expression
What is the relationship between mouse probesets and human probesets? They are linked through Ensembl:
Mouse probesets → mouse Ensembl identities (example: _at)
Human probesets → human Ensembl identities (example: 209141_at)

dN/dS
dN = number of nonsynonymous substitutions / number of nonsynonymous sites
dS = number of synonymous substitutions / number of synonymous sites
dN/dS was generated using codeml (PAML package, pair-wise maximum likelihood method) with the F3×4 codon evolution model.
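The two definitions above translate directly into a ratio. The Python sketch below only illustrates that arithmetic from raw counts; it is not what codeml does, since codeml estimates dN and dS by maximum likelihood under a codon model. The function name and the counts are hypothetical.

```python
def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Return (dN, dS, dN/dS) from raw substitution and site counts."""
    dn = nonsyn_subs / nonsyn_sites
    ds = syn_subs / syn_sites
    if ds == 0:
        raise ValueError("dS is zero, so dN/dS is undefined")
    return dn, ds, dn / ds


# Hypothetical counts for one human-mouse ortholog pair.
dn, ds, omega = dn_ds(nonsyn_subs=6, nonsyn_sites=300,
                      syn_subs=10, syn_sites=100)
print(f"dN = {dn:.3f}, dS = {ds:.3f}, dN/dS = {omega:.2f}")
```

A ratio well below 1, as in this example, is usually read as purifying selection, meaning a conserved protein sequence.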

Select 2,998 one-to-one orthologous pairs.
Compute the normalized Euclidean distance between all possible non-self pairs of tissues: [formula not preserved in this transcript], where there are g probesets and x and y are any two mouse or human samples.
Euclidean distances between regions were calculated using the mean RMA probeset signals for each tissue.
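The distance formula is missing from the transcript, so the Python sketch below assumes one common normalization: the squared differences are averaged over the g probesets before taking the square root. The function name, that normalization, and the toy profiles are all assumptions, not details confirmed by the slides.

```python
import math
from itertools import combinations


def normalized_euclidean(x, y):
    """Euclidean distance between two profiles, normalized by probeset count g.

    Assumed form: sqrt(sum((x_i - y_i)^2) / g).
    """
    if len(x) != len(y):
        raise ValueError("profiles must cover the same probesets")
    g = len(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / g)


# Hypothetical mean signals per tissue over four orthologous probesets.
profiles = {
    "human caudate": [9.0, 5.0, 7.0, 6.0],
    "mouse caudate": [8.8, 5.3, 7.1, 6.2],
    "human cerebellum": [5.0, 9.0, 6.0, 8.0],
    "mouse cerebellum": [5.2, 8.7, 6.3, 7.9],
}

# All non-self pairs of tissues, within and across species.
for (name_a, a), (name_b, b) in combinations(profiles.items(), 2):
    print(f"{name_a} vs {name_b}: {normalized_euclidean(a, b):.2f}")
```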

Conclusion: Orthologous Brain Regions between Species Are More Similar to Each Other than to Different Regions within a Species

Analysis of GO categories
Human: 70.6% of the probesets had an assigned GO category.
Mouse: 66.2% of the probesets had an assigned GO category.
For each GO category, compare:
(a) the total number of probes in that category, vs.
(b) the number of those probes appearing on a list of differentially expressed probes (p < 0.05).
Test used to detect which categories are over-represented:
If a or b < 10: Fisher's exact test
Otherwise: Pearson chi-square
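The test-selection rule can be sketched in Python using only the standard library. This is an illustration under assumptions: the Fisher p-value is taken as the one-sided upper tail of the hypergeometric distribution, the chi-square test is Pearson's statistic on the 2x2 table with 1 degree of freedom and no continuity correction, and all function names and counts are hypothetical.

```python
import math


def fisher_upper_tail(b, a, list_size, total):
    """One-sided Fisher's exact p-value: P(X >= b) under the hypergeometric null."""
    upper = min(a, list_size)
    hits = sum(math.comb(a, i) * math.comb(total - a, list_size - i)
               for i in range(b, upper + 1))
    return hits / math.comb(total, list_size)


def pearson_chi2_p(b, a, list_size, total):
    """Pearson chi-square p-value (1 df) for category membership vs. list membership."""
    table = [[b, a - b],
             [list_size - b, total - a - (list_size - b)]]
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = sum((table[i][j] - row[i] * col[j] / total) ** 2
               / (row[i] * col[j] / total)
               for i in range(2) for j in range(2))
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def category_p_value(b, a, list_size, total):
    """Pick the test by the slide's rule: Fisher if a or b < 10, else chi-square."""
    if a < 10 or b < 10:
        return "fisher", fisher_upper_tail(b, a, list_size, total)
    return "chi-square", pearson_chi2_p(b, a, list_size, total)


# Hypothetical counts: a probes in the category, b of them on the list.
print(category_p_value(b=3, a=3, list_size=3, total=10))
print(category_p_value(b=20, a=50, list_size=50, total=100))
```

Note that the chi-square p-value is two-sided, so a significant result still has to be checked for direction (more hits than expected) before calling a category over-represented.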

Conclusion: Mouse and Human Brain Regions Share a Higher Number of Overrepresented Functional Groups than Would Be Expected by Chance

Relationships between Tissue-Specific Expression, Conservation of Sequence, and Conservation of Expression
(A) X-axis: dN/dS ratios, least conserved (left) to most conserved (right). Y-axis: correlation coefficient between human and mouse log ratios.
(B) X-axis: percent nucleotide identity, low (left) to high (right). Y-axis: correlation coefficient between human and mouse log ratios.
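One way to produce a plot like panel (A) is to sort the ortholog pairs by dN/dS, split them into bins, and compute the correlation between human and mouse log ratios within each bin. The Python sketch below assumes Pearson correlation and equal-sized bins; neither choice is stated on the slide, and the function names and data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_by_bin(genes, n_bins):
    """genes: list of (dn_ds, human_log_ratio, mouse_log_ratio).

    Returns one correlation per bin, ordered from least conserved
    (highest dN/dS) to most conserved (lowest dN/dS).
    """
    ordered = sorted(genes, key=lambda g: g[0], reverse=True)
    size = len(ordered) // n_bins
    results = []
    for i in range(n_bins):
        # The last bin absorbs any remainder.
        end = (i + 1) * size if i < n_bins - 1 else len(ordered)
        chunk = ordered[i * size:end]
        results.append(pearson([g[1] for g in chunk], [g[2] for g in chunk]))
    return results


# Hypothetical ortholog pairs: (dN/dS, human log ratio, mouse log ratio).
genes = [
    (0.90, 1.0, -0.5), (0.80, -1.0, 0.4), (0.70, 0.5, 0.6),
    (0.10, 2.0, 1.8), (0.05, -1.5, -1.2), (0.02, 0.3, 0.5),
]
print([round(r, 2) for r in correlation_by_bin(genes, n_bins=2)])
```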

Conclusion: Genes with High Variance across Tissues Have Greater Conservation of Nucleotide Sequence

Conclusion
1) Within-species comparison: the different brain regions have distinctly different expression profiles.
2) Cross-species comparison: region-specific genes are conserved at both the sequence and gene expression levels (the two are positively correlated).

Advantages and Shortcomings?

Thanks