Microarray Data Analysis Using R: Studies in Tissue Databases. Mark Reimers, NCI.


1 Microarray Data Analysis Using R: Studies in Tissue Databases. Mark Reimers, NCI

2 Outline
- The GNF tissue database
- Exploratory analysis - clustering
- Positional co-regulation
- Insight via co-regulation
- Apoptotic configuration of tissues
- Probe level analysis

3 The GNF Expression Atlas
- Su et al. (PNAS 2004) hybridized 150 samples from 61 tissues to Affymetrix U133A and custom arrays
- Variation in gene expression (as proportion of transcriptome):
  - 95% show at least one 2-fold change among 61 tissues
  - 37% show more than 2-fold differences between lowest 10% and highest 10%
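The two variation summaries above can be sketched on a toy table. This is only an illustration: the slide does not say exactly how "lowest 10%" and "highest 10%" were computed, so the sketch assumes the mean of the lowest tenth of tissues is compared with the mean of the highest tenth. The gene names and values are invented, and the function names are hypothetical.

```python
def max_min_fold(values):
    """Largest fold change between any two tissues (values must be positive)."""
    return max(values) / min(values)


def decile_fold(values):
    """Mean of the highest tenth of tissues over mean of the lowest tenth.

    Assumption: this is one plausible reading of the slide's
    'lowest 10% and highest 10%'; the original definition is not given.
    """
    ordered = sorted(values)
    k = max(1, len(ordered) // 10)
    low = sum(ordered[:k]) / k
    high = sum(ordered[-k:]) / k
    return high / low


def fraction_any_fold(expr, fold=2.0):
    """Fraction of genes with at least a `fold` change between two tissues."""
    return sum(max_min_fold(v) >= fold for v in expr.values()) / len(expr)


def fraction_decile_fold(expr, fold=2.0):
    """Fraction of genes whose decile fold change exceeds `fold`."""
    return sum(decile_fold(v) > fold for v in expr.values()) / len(expr)


# Toy expression table: gene -> expression in 20 tissues (invented values).
expr = {
    "geneA": [100] * 20,                         # flat
    "geneB": [100] * 19 + [250],                 # a single outlier tissue
    "geneC": [50] * 5 + [100] * 10 + [400] * 5,  # broad variation
    "geneD": [100, 110] * 10,                    # small fluctuations
}

print(fraction_any_fold(expr), fraction_decile_fold(expr))
```

The decile version is less sensitive to a single outlier tissue, which may be why the second percentage on the slide is so much lower than the first.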

4 Clustering samples
- All biological replicates are nearest neighbors
- Dendrogram reflects discrepancy between healthy and cancerous tissues
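The "replicates are nearest neighbors" check can be sketched as follows. The slide does not name the distance measure, so Euclidean distance on log-scale profiles is an assumption here; the sample names and values are invented.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(name, profiles):
    """Name of the sample closest to `name`, excluding itself."""
    others = (other for other in profiles if other != name)
    return min(others, key=lambda other: euclidean(profiles[name], profiles[other]))


def replicates_are_nearest(profiles, tissue_of):
    """True if every sample's nearest neighbor comes from the same tissue."""
    return all(
        tissue_of[nearest_neighbor(sample, profiles)] == tissue_of[sample]
        for sample in profiles
    )


# Toy log2 expression profiles for two tissues, two replicates each.
profiles = {
    "liver_1": [10.0, 2.0, 3.0],
    "liver_2": [9.8, 2.1, 3.2],
    "brain_1": [2.0, 9.0, 4.0],
    "brain_2": [2.2, 9.1, 3.9],
}
tissue_of = {
    "liver_1": "liver", "liver_2": "liver",
    "brain_1": "brain", "brain_2": "brain",
}

print(replicates_are_nearest(profiles, tissue_of))
```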

5 Co-regulation of Nearby Genes
- Some groups of genes next to one another on a chromosome show high correlation across tissues

6 Significance of Co-regulation
- How often would such correlations happen ‘by chance’, e.g. by selecting genes at random?
- Three random measures would have correlation greater than 0.6 with p < 10^-20!
- However, 3 genes selected at random from the atlas have probability ~10^-3 of having all correlations > 0.6
- In 30,000 positions, we should therefore see about 30 such regions by chance (30,000 × 10^-3)
- 156 regions of high correlation determined
  - Many are paralogs
  - Perhaps 50% false discovery rate among the rest
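The empirical null described above (drawing 3 genes at random and asking how often all pairwise correlations exceed 0.6) can be sketched as below. This is a minimal illustration, not the analysis behind the slide: the function names are hypothetical and the demo data are independent Gaussian noise rather than atlas profiles, so the rate it prints says nothing about the real ~10^-3 figure.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def all_pairs_correlated(profiles, threshold=0.6):
    """True if every pair of profiles has correlation above `threshold`."""
    return all(
        pearson(a, b) > threshold for a, b in itertools.combinations(profiles, 2)
    )


def random_triplet_rate(expr, n_draws, threshold=0.6, seed=0):
    """Fraction of random gene triplets whose pairwise correlations all pass."""
    rng = random.Random(seed)
    genes = list(expr)
    hits = 0
    for _ in range(n_draws):
        trio = rng.sample(genes, 3)
        if all_pairs_correlated([expr[g] for g in trio], threshold):
            hits += 1
    return hits / n_draws


# Expected number of chance regions, using the figures quoted on the slide.
positions = 30_000
chance_probability = 1e-3
expected_by_chance = positions * chance_probability

# Demo on synthetic noise: 100 genes x 150 samples (NOT real atlas data).
rng = random.Random(1)
noise = {f"gene{i}": [rng.gauss(0, 1) for _ in range(150)] for i in range(100)}
print(expected_by_chance, random_triplet_rate(noise, n_draws=500))
```

Run on the real expression matrix, the same resampling would give the atlas-specific chance rate, which is far higher than the independent-noise rate because real genes share tissue-wide expression trends.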

7 Prediction of Function
- Zhang et al. (J. Biol. 2004, 3:21) hybridized 55 mouse tissues to spotted oligo arrays
- Hypothesis: genes with similar tissue expression patterns share similar function
- Predictions recovered the GO biological process of known genes with better than 50% accuracy for many categories
- Extended prediction to 1,092 uncharacterized transcripts
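The hypothesis above can be illustrated with a simple nearest-neighbor vote: assign an unannotated gene the category most common among the annotated genes whose tissue profiles correlate best with it. This is a stand-in chosen for brevity, not the classifier used by Zhang et al., which the slide does not describe. Gene names, profiles, and category labels are invented.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def predict_category(query_profile, annotated, k=3):
    """Majority category among the k annotated genes most correlated with the query.

    `annotated` maps gene name -> (tissue profile, category label).
    """
    ranked = sorted(
        annotated.values(),
        key=lambda item: pearson(query_profile, item[0]),
        reverse=True,
    )
    votes = Counter(category for _, category in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy annotated genes: expression across four tissues plus a category label.
annotated = {
    "g1": ([9, 1, 1, 2], "muscle contraction"),
    "g2": ([8, 2, 1, 1], "muscle contraction"),
    "g3": ([1, 1, 9, 8], "synaptic transmission"),
    "g4": ([2, 1, 8, 9], "synaptic transmission"),
    "g5": ([1, 9, 1, 1], "lipid metabolism"),
}

print(predict_category([10, 1, 2, 1], annotated))
```

Accuracy on known genes could then be estimated by hiding each annotated gene's label in turn and checking whether the vote recovers it.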

8 Investigation of Poorly Characterized Gene - Top1MT
- 10-fold variation in expression (odd for a ‘housekeeping gene’)
- >50 genes with expression highly correlated (0.75) with Top1MT across tissue database
- Large proportion are splicing factors
- Top1MT has an odd splice junction in intron 1, and may depend critically on abundant splicing factors
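Finding the genes whose expression tracks a query gene across tissues can be sketched as a correlation filter. The sketch assumes the 0.75 on the slide is a lower cutoff on Pearson correlation, which the slide does not state outright. The partner gene names and all values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_partners(expr, query, threshold=0.75):
    """Genes whose correlation with `query` exceeds `threshold`, best first."""
    target = expr[query]
    scored = [
        (gene, pearson(target, profile))
        for gene, profile in expr.items()
        if gene != query
    ]
    passing = [(gene, r) for gene, r in scored if r > threshold]
    return sorted(passing, key=lambda pair: pair[1], reverse=True)


# Toy expression table across five tissues (invented values).
expr = {
    "TOP1MT": [1, 2, 4, 8, 10],
    "geneA": [2, 4, 8, 16, 20],   # rises with the query
    "geneB": [10, 8, 4, 2, 1],    # falls as the query rises
    "geneC": [5, 5, 6, 5, 5],     # nearly flat
}

for gene, r in correlated_partners(expr, "TOP1MT"):
    print(gene, round(r, 3))
```

The resulting gene list is what would then be inspected for shared annotation, such as the splicing factors noted above.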

9 Apoptosis Patterns
- Majority of epithelial tissues show common pattern (indisposed to apoptosis)
- Blood cells show variety of patterns

10 Exploration of Probe Sets
- Examine correlation of probe sets across 150 samples
- All but one probe verified to match latest UniGene build for gene
- Probes organized by position in 3’ end
- Figure color scale: red = 1; white = < 0

11 Quality of Arrays
- Regional bias images

