Visualization and analysis of large data collections: a case study applied to confocal microscopy data Wim de Leeuw, Swammerdam Institute for Life Sciences, Amsterdam Pernette Verschure, Swammerdam Institute for Life Sciences, Amsterdam Robert van Liere, Center for Mathematics and Computer Science, Amsterdam
2 Motivation (1): Context: cell biology experiments Phenomenon captured using digital microscopy Experiment characteristics: Biological diversity Not all biological parameters can be controlled Many measurements needed
3 Motivation (2): Visualization and analysis of collections of data sets High variability Non-trivial information extraction (e.g. segmentation) Noise Visualization modes: Interactive vs Batch Interactive control and feedback vs static parameter settings Time consuming vs multiple data sets processed simultaneously Aim: combine the advantages of Interactive and Batch Visualization
4 Agenda Biological Problem Chromatin structure and gene control Visualization Problem Data collection description Analysis with visual summaries
5 Chromatin Structure and Gene Control Chromatin Structure Low level: DNA, nucleosomes, 30 nm fiber High level: fiber folding Gene control Regulation of gene activity Biological research question: Relation between chromatin structure and gene control Is there a relation, what is it, when does it occur, etc.
6 Experiment Question: what is the influence of Heterochromatin protein 1 (HP1) on chromatin structure? Approach: Prepare collection of cells with a specific region Control group: target GFP to the region HP1 group: target GFP/HP1 to the region Observe regions with confocal microscope Data analysis question: Identify and quantify the differences between control and HP1 group
7 Collection of data sets 60 data sets (30 control group, 30 HP1 group) Each data set: 512 x 512 x 32 Sample images: Control group (left) HP1 group (right) Data analysis questions: Accurately detect region of interest Quantify region attributes (size, roughness, roundness, etc.) What are the attribute differences between the control and HP1 groups?
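The slide does not say how the region of interest is detected. As a minimal sketch, assuming a plain intensity threshold (an illustrative stand-in, not necessarily the method used in the study), region detection and the size attribute could look like this. The function names and the toy volume are hypothetical.

```python
from typing import List, Tuple

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]
Voxel = Tuple[int, int, int]      # (z, y, x)


def segment_by_threshold(volume: Volume, threshold: float) -> List[Voxel]:
    """Return the coordinates of all voxels at or above the threshold."""
    return [
        (z, y, x)
        for z, plane in enumerate(volume)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value >= threshold
    ]


def region_size(voxels: List[Voxel]) -> int:
    """Size attribute: number of voxels in the segmented region."""
    return len(voxels)


def bounding_box(voxels: List[Voxel]) -> Tuple[Voxel, Voxel]:
    """Smallest axis-aligned box (min corner, max corner) around the region."""
    zs, ys, xs = zip(*voxels)
    return (min(zs), min(ys), min(xs)), (max(zs), max(ys), max(xs))


# Tiny synthetic 2 x 3 x 3 volume; a real data set would be 512 x 512 x 32.
toy = [
    [[0, 0, 0], [0, 9, 8], [0, 0, 0]],
    [[0, 0, 0], [0, 7, 0], [0, 0, 0]],
]
voxels = segment_by_threshold(toy, threshold=5)
print(region_size(voxels))
print(bounding_box(voxels))
```

Roughness and roundness are left out because the slides do not define them.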
9 Interactive Visualization of Collection Advantages Control over visualization tools and parameters Segmentation Attribute computations Direct feedback Disadvantages Laborious Error prone
10 Batch processing of collection Advantage All sets are processed automatically A-priori parameter settings Disadvantage No feedback on the process
11 Visual Summaries Definition: a user-defined compact visual representation of the data during (batch) processing Governing idea: the visual summary is used to visualize the steps in the batch process Examples: General strategy: Interactive setup (determine parameters, attributes, etc.) Batch processing using setup Information visualization with visual summaries
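The general strategy can be sketched in code: parameters are fixed once during interactive setup, every data set is then processed with those parameters, and each run leaves behind a compact summary that can be inspected later. The sketch below is an assumption-laden illustration, not the Argos implementation. The `Setup` and `VisualSummary` types, the threshold parameter, and the choice of a maximum-intensity projection as the summary image are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]


@dataclass
class Setup:
    """Parameters chosen interactively on a few representative data sets."""
    threshold: float


@dataclass
class VisualSummary:
    """Compact per-data-set record kept alongside the batch output."""
    name: str
    projection: List[List[float]]  # maximum-intensity projection along z
    mask: List[List[int]]          # thresholded projection (1 = region)
    attributes: Dict[str, float]


def max_projection(volume: Volume) -> List[List[float]]:
    height, width = len(volume[0]), len(volume[0][0])
    return [
        [max(plane[y][x] for plane in volume) for x in range(width)]
        for y in range(height)
    ]


def summarize(name: str, volume: Volume, setup: Setup) -> VisualSummary:
    projection = max_projection(volume)
    mask = [[int(v >= setup.threshold) for v in row] for row in projection]
    size = sum(
        v >= setup.threshold for plane in volume for row in plane for v in row
    )
    return VisualSummary(name, projection, mask, {"size": float(size)})


def run_batch(collection: Dict[str, Volume], setup: Setup) -> List[VisualSummary]:
    """Apply the same setup to every data set in the collection."""
    return [summarize(name, vol, setup) for name, vol in collection.items()]


# Two tiny synthetic 2 x 2 x 2 volumes standing in for real data sets.
collection = {
    "control_01": [[[0, 6], [0, 0]], [[0, 7], [0, 0]]],
    "hp1_01": [[[9, 0], [0, 8]], [[0, 0], [0, 0]]],
}
summaries = run_batch(collection, Setup(threshold=5))
for s in summaries:
    print(s.name, s.attributes, s.mask)
```

Keeping the projection and mask next to the computed attributes is what lets a point in a later plot be traced back to the processing step that produced it.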
13 Discriminating groups Red: HP1 sets, Green: control Region granularity vs number of spots in region Granularity attribute Average intensity gradient of region Plot tells us: Large variation, some outliers HP1 and control seem different
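The granularity attribute is described as the average intensity gradient of the region. One possible reading, sketched below, is the mean gradient magnitude over the region's voxels, with the gradient estimated by finite differences. The unit, isotropic voxel spacing is an assumption (confocal stacks often have a coarser z spacing), and the function names are hypothetical.

```python
import math
from typing import List, Tuple

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]
Voxel = Tuple[int, int, int]      # (z, y, x)


def axis_derivative(volume: Volume, voxel: Voxel, axis: int) -> float:
    """Central difference along one axis, one-sided at the volume border."""
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    lo, hi = list(voxel), list(voxel)
    lo[axis] = max(voxel[axis] - 1, 0)
    hi[axis] = min(voxel[axis] + 1, shape[axis] - 1)
    span = hi[axis] - lo[axis]
    if span == 0:  # axis has a single sample, no gradient can be estimated
        return 0.0
    return (volume[hi[0]][hi[1]][hi[2]] - volume[lo[0]][lo[1]][lo[2]]) / span


def average_gradient(volume: Volume, voxels: List[Voxel]) -> float:
    """Mean gradient magnitude over the given region voxels."""
    if not voxels:
        raise ValueError("region is empty")
    total = 0.0
    for voxel in voxels:
        components = [axis_derivative(volume, voxel, axis) for axis in range(3)]
        total += math.sqrt(sum(c * c for c in components))
    return total / len(voxels)


# Synthetic check: intensity rises by 2 per voxel along x and is constant
# along y and z, so the gradient magnitude is 2 everywhere.
ramp = [[[2 * x for x in range(3)] for _ in range(2)] for _ in range(2)]
all_voxels = [(z, y, x) for z in range(2) for y in range(2) for x in range(3)]
print(average_gradient(ramp, all_voxels))
```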
14 Large variation, some outliers Brush / link outliers Investigate visual summary Problems with data set Corrupt data
15 HP1 and control seem different Further analysis: Histograms, Box plots, Statistical tests (Wilcoxon) Wilcoxon tells us that there is indeed a significant difference
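The slide names the Wilcoxon test without saying which variant was used. Since the control and HP1 groups are independent samples, the sketch below assumes the Wilcoxon rank-sum test (equivalent to Mann-Whitney U), using the normal approximation and no tie correction of the variance. The input values are made-up numbers for illustration, not data from the experiment.

```python
import math
from typing import Sequence, Tuple


def rank_sum_test(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Wilcoxon rank-sum test via the normal approximation.

    Returns (W, z, p): the rank sum of sample `a`, its z-score, and the
    two-sided p-value. Ties receive mid-ranks; the variance is not
    tie-corrected, so treat the p-value as approximate for small or
    heavily tied samples.
    """
    pooled = sorted((value, group) for group, sample in enumerate((a, b)) for value in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        i = j + 1

    w = sum(rank for rank, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal
    return w, z, p


# Hypothetical granularity values for two groups (illustration only).
control = [1.0, 2.0, 3.0]
hp1 = [4.0, 5.0, 6.0]
w, z, p = rank_sum_test(control, hp1)
print(w, z, p)
```

For samples this small an exact test would be preferable. With 30 data sets per group, as in the collection described earlier, the normal approximation is generally considered reasonable.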
16 Lessons learned Showing a significant difference in granularity vs number of spots tells us that HP1 affects the structure of chromatin. The effect is that chromatin is condensed in a number of compact regions. Biologically significant result. Two papers published. Strategy for analysis of collections of confocal data sets: Interactive visualization and batch processing are both needed Information visualization is used for the analysis of batch output Visual summaries are used to link back to the original data set or previous steps in the batch process Strategy has been implemented as the Argos system
17 Generality Argos has been used for the analysis of an experiment consisting of 2500+ confocal data sets Argos has been used for the analysis of microarray data