Published by Mervyn Lyons. Modified over 5 years ago.
1
Inferring Cellular Processes from Coexpressing Genes
Daniel Korenblum
November 26, 2001
2
Motivation for Clustering
High-throughput experiments
Reduce complexity by coarse graining:
Extract essential features
Visualize data matrix entries with efficient display
Obtain similarities that reflect biological properties
3
1998: Eisen, Spellman, Brown, & Botstein
Average Linkage Clustering of Time Courses
Correlation measures similarity (scale invariant)
Fixed offset: Genes assumed symmetric with respect to changes from reference state
Reorder genes: Permute rows of expression data matrix
Proximity corresponds to similarity
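The clustering and row reordering described on this slide can be sketched in a few lines. This is a minimal illustration, not the original authors' software: it assumes SciPy's `linkage` with `metric="cosine"`, which is one minus a correlation whose offset is fixed at zero (taking that as the measure meant by "fixed offset" is an assumption). The toy matrix and the helper name `cluster_and_reorder` are made up for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

def cluster_and_reorder(expr):
    """Reorder the rows (genes) of an expression matrix by average-linkage clustering.

    expr: 2-D array, rows = genes, columns = time points.
    """
    # Cosine distance = 1 - correlation with the offset fixed at zero.
    # It is unchanged when a profile is multiplied by a positive constant.
    tree = linkage(expr, method="average", metric="cosine")
    order = leaves_list(tree)  # left-to-right leaf order of the dendrogram
    return expr[order], order

# Toy data (made up): two groups of genes with opposite time courses plus noise,
# stacked and then scrambled so the groups are not adjacent.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 12)
up = np.sin(t) + 0.1 * rng.standard_normal((5, t.size))
down = -np.sin(t) + 0.1 * rng.standard_normal((5, t.size))
expr = np.vstack([up, down])[rng.permutation(10)]

reordered, order = cluster_and_reorder(expr)
```

After reordering, rows that sit next to each other in `reordered` belong to the same subtree of the dendrogram, which is what makes proximity in the display correspond to similarity.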
4
What Determines the Patterns?
Assess the significance of the clusters
Could results be statistical artifacts?
Swap matrix elements
Apply clustering algorithm: See different patterns
No prolonged correlations
Signal from different conditions counteracts noise from single observations and cDNA variations
Biologically interpretable implies significant
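The randomization check on this slide can be sketched as follows. The example makes several assumptions that are not in the slides: it shuffles all matrix entries at once (shuffling within rows or within columns are alternatives), it uses each gene's highest correlation with any other gene as a simple stand-in for the input to clustering, and the data and function names are invented for illustration.

```python
import numpy as np

def nearest_neighbor_corr(expr):
    """For each gene (row), the highest Pearson correlation with any other gene."""
    c = np.corrcoef(expr)
    np.fill_diagonal(c, -np.inf)  # ignore each gene's correlation with itself
    return c.max(axis=1)

def shuffle_entries(expr, rng):
    """Randomly swap matrix entries, destroying gene and condition structure."""
    flat = rng.permutation(expr.ravel())
    return flat.reshape(expr.shape)

rng = np.random.default_rng(1)
t = np.linspace(0, 2 * np.pi, 20)
# Toy data (made up): 30 genes sharing one time course, plus measurement noise.
expr = np.sin(t) + 0.3 * rng.standard_normal((30, t.size))

observed = nearest_neighbor_corr(expr).mean()
null = [nearest_neighbor_corr(shuffle_entries(expr, rng)).mean() for _ in range(200)]

# Fraction of shuffled matrices that look at least as correlated as the real one.
p_value = (1 + sum(n >= observed for n in null)) / (1 + len(null))
```

If the real matrix shows much stronger neighbor correlations than any shuffled copy, the clusters are unlikely to be an artifact of the value distribution alone.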
5
Gene Shaving
Avoids a single reordering for all genes
Different genes may require different measures of similarity
Use the leading principal component of a set of genes (eigengene) as a reference state
Select genes with high covariance with the eigengene
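One shaving step, as described above, can be sketched with a singular value decomposition. This is a simplified illustration under stated assumptions: `keep_fraction` is a hypothetical parameter name, the covariance is taken in absolute value so that anti-correlated genes are retained, and the toy data are invented.

```python
import numpy as np

def eigengene(expr):
    """Row-center the genes and return the leading principal component over samples."""
    centered = expr - expr.mean(axis=1, keepdims=True)
    # Rows of vt are orthonormal patterns over samples; vt[0] explains the most variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered, vt[0]

def shave_once(expr, keep_fraction=0.9):
    """Return the (sorted) row indices of the genes that covary most with the eigengene."""
    centered, pc = eigengene(expr)
    score = np.abs(centered @ pc)  # absolute inner product with the eigengene
    n_keep = max(1, int(round(keep_fraction * expr.shape[0])))
    return np.sort(np.argsort(score)[::-1][:n_keep])

# Toy data (made up): 8 genes follow a shared pattern (half with flipped sign),
# 12 genes are pure noise.
rng = np.random.default_rng(2)
t = np.linspace(0, 2 * np.pi, 15)
signs = np.array([1, -1] * 4)[:, None]
signal = 2 * signs * np.sin(t) + 0.3 * rng.standard_normal((8, t.size))
noise = 0.3 * rng.standard_normal((12, t.size))
expr = np.vstack([signal, noise])

keep = shave_once(expr, keep_fraction=0.4)  # keeps 8 of the 20 genes
```

Repeating `shave_once` on the surviving rows yields a nested sequence of progressively smaller gene sets, each with its own eigengene as reference.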
6
Gene Shaving, Cont'd
High variation across samples
Strong correlation across genes (coherence)
Hierarchical methods address variations over samples
Supervision affects average gene effects, selecting strong contributions to predictive ability
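The two properties named on this slide, high variation across samples and coherence across genes, can be combined into a single score by comparing the variance of the cluster's average gene with the scatter of individual genes around it. The sketch below is one such score under assumptions: the function names and toy data are hypothetical, and genes are assumed to share the same sign (anti-correlated genes would need to be sign-flipped first).

```python
import numpy as np

def cluster_variances(expr):
    """Between-sample and within-sample variance of a gene cluster (rows = genes)."""
    avg_gene = expr.mean(axis=0)  # cluster average, one value per sample
    between = np.mean((avg_gene - avg_gene.mean()) ** 2)  # variation across samples
    within = np.mean((expr - avg_gene) ** 2)               # scatter around the average
    return between, within

def percent_variance_explained(expr):
    """Share of total variance carried by the average gene, in percent."""
    between, within = cluster_variances(expr)
    return 100.0 * between / (between + within)

# Toy data (made up): a coherent cluster and a set of unrelated noisy genes.
rng = np.random.default_rng(3)
t = np.linspace(0, 2 * np.pi, 15)
coherent = np.sin(t) + 0.1 * rng.standard_normal((10, t.size))
unrelated = rng.standard_normal((10, t.size))

score_coherent = percent_variance_explained(coherent)
score_unrelated = percent_variance_explained(unrelated)
```

A cluster scores high only when its genes move together and the shared pattern varies strongly from sample to sample.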
7
Conclusions
Change in methodology over the past few years
Array data hold a comprehensive picture of cellular processes