We are sorry, but the presentation on this USB device was prepared on an IBM computer…it cannot be run on this Dell Computer…
ISAC's perspective on data: standards, structure and analysis. J. Paul Robinson, SVM Professor of Cytomics, Purdue University; President, International Society for Analytical Cytology (ISAC)
The HTS rationale: The infinite monkey theorem defines the HTS rationale. It states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type every book in the Bibliothèque nationale de France (the French National Library). In the restatement of the theorem most popular among English speakers, the monkeys eventually type out the collected works of William Shakespeare. The original image was presented in Émile Borel's 1913 article "Mécanique Statistique et Irréversibilité". http://forum.swarthmore.edu/dr.math/problems/bridge8.5.98.html http://www.nutters.org/monkeys.html
One perspective… BioIT Magazine 2002 How many standards can you afford?
Sssssssss Ssss Sss ss. Monkey Business: Let’s look at the infinite monkey theorem again… In 2003, scientists at Paignton Zoo and the University of Plymouth, in Devon, England, reported that they had left a computer keyboard in the enclosure of six Sulawesi crested macaques for a month. Not only did the monkeys produce nothing but five pages consisting largely of the letter S, they also started by attacking the keyboard with a stone and continued by urinating and defecating on it.
Historical Picture. Began with flow cytometry: – Invented in the 1960s – 1970s: started with a single fluorescence signal and two laser scatter signals (3 variables in total) – 1977: Herzenberg et al., 2-color flow compensation – 1980s: two fluorescence signals and two scatter signals – 1990s: three to 11 fluorescence signals and two scatter signals – 2000s: 32 fluorescence signals and 10-15 scatter signals
Main challenges: HCS activities must leverage information sciences, integrating information from genomics, emerging protein interaction networks, and ongoing chemical-genetic studies into a public knowledge base of biological systems. Standards have to be defined and followed!
What is the issue? There are thousands of analytical and diagnostic instruments in clinics and labs. These instruments are manufactured by 10-20 different companies. They use reagents from 1-200 companies for the same tests. Many tests use fluorescence as the reporter system. Imaging systems are not uniform: there are no accepted imaging standards. There are no algorithm evaluation standards. There is no way to take data sets into any single management engine. There are no organized databases that can evaluate the vast amount of research or clinical data collected.
How big a problem is it? 20,000 flow cytometers (10 manufacturers); 5,000-8,000 confocal microscopes (10 manufacturers); 20,000-50,000 other fluorescence systems (20 manufacturers); 5,000-10,000 DNA microarray readers (10 manufacturers); 500 laser scanning microscopes (2 manufacturers); 1,000-plus HCS instruments (20 manufacturers).
How much data is enough? Flow assay: a standard 7-tube assay. Each tube has 7 colors plus 2 scatter parameters (9 parameters) and 50,000 cells, giving 450,000 data points per tube; x 7 tubes = 3,150,000 per assay. Run 25 patients/tests per day: 25 x 7 = 175 assays = 551,250,000 points per day. If you ran this daily workload 100 times in a year you would have 55,125,000,000 points.
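The flow-assay arithmetic on this slide can be checked with a short sketch. All inputs are the slide's own numbers; the constant and function names are illustrative assumptions, not part of the presentation.

```python
# Inputs taken from the slide; names are hypothetical, chosen for illustration.
COLORS = 7                 # fluorescence colors per tube
SCATTER = 2                # scatter parameters per tube
CELLS_PER_TUBE = 50_000
TUBES_PER_ASSAY = 7
ASSAYS_PER_DAY = 25 * 7    # the slide's "25 x 7 = 175 assays"
RUNS_PER_YEAR = 100        # "100 times in a year"


def flow_points():
    """Return data points per tube, per assay, per day and per year."""
    per_tube = (COLORS + SCATTER) * CELLS_PER_TUBE
    per_assay = per_tube * TUBES_PER_ASSAY
    per_day = per_assay * ASSAYS_PER_DAY
    per_year = per_day * RUNS_PER_YEAR
    return per_tube, per_assay, per_day, per_year


if __name__ == "__main__":
    for label, value in zip(("tube", "assay", "day", "year"), flow_points()):
        print(f"points per {label}: {value:,}")
```

Note that the daily figure follows the slide in multiplying the 7-tube total by 175. If 175 is instead read as the number of tubes per day (25 patients x 7 tubes), the daily total would be 175 x 450,000 = 78,750,000 points.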
How much data is enough? HCS assay: a 384-well plate assay with 6 images per well = 2,304 images per plate. Each image contains 100 cells (450k per image at 600x800 pixels; 1,036,800k of image space per plate, about 1 Gbyte). We collect 20 parameters per cell, so one plate gives 230,400 cells x 20 parameters = 4,608,000 parameters. For a 10-plate assay we have 46,080,000 parameters. If we run 2 assays a day, 5 days a week, for 40 weeks: 46,080,000 x 2 x 5 x 40 = 18,432,000,000 parameters. Total storage space = 1 Gbyte x 10 x 2 x 5 x 40 = 4 Tbytes.
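The HCS estimate can be reproduced the same way. This is a minimal sketch using only the slide's inputs; the names are hypothetical, and "450k per image" is treated as 450 kilobytes, which is an assumption.

```python
# Inputs from the slide; identifiers are illustrative assumptions.
WELLS_PER_PLATE = 384
IMAGES_PER_WELL = 6
CELLS_PER_IMAGE = 100
PARAMS_PER_CELL = 20
KB_PER_IMAGE = 450          # assumed to mean kilobytes per image
PLATES_PER_ASSAY = 10
ASSAYS_PER_DAY = 2
DAYS_PER_WEEK = 5
WEEKS_PER_YEAR = 40


def hcs_totals():
    """Return the intermediate and yearly totals quoted on the slide."""
    images_per_plate = WELLS_PER_PLATE * IMAGES_PER_WELL
    cells_per_plate = images_per_plate * CELLS_PER_IMAGE
    params_per_plate = cells_per_plate * PARAMS_PER_CELL
    params_per_assay = params_per_plate * PLATES_PER_ASSAY
    assays_per_year = ASSAYS_PER_DAY * DAYS_PER_WEEK * WEEKS_PER_YEAR
    kb_per_plate = images_per_plate * KB_PER_IMAGE
    return {
        "images_per_plate": images_per_plate,
        "cells_per_plate": cells_per_plate,
        "params_per_plate": params_per_plate,
        "params_per_assay": params_per_assay,
        "params_per_year": params_per_assay * assays_per_year,
        "kb_per_plate": kb_per_plate,
        "kb_per_year": kb_per_plate * PLATES_PER_ASSAY * assays_per_year,
    }


if __name__ == "__main__":
    for name, value in hcs_totals().items():
        print(f"{name}: {value:,}")
```

Without rounding each plate to 1 Gbyte, the yearly image volume comes to 4,147,200,000 kilobytes, which is consistent with the slide's rounded figure of about 4 Tbytes.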
How many images are there… Industry estimates indicate that 80 billion new images are created every year: 219,178,082 per day; 9,132,420 per hour; 152,207 per minute; 2,536 per second. http://a06.cgpublisher.com/proposals/244/index_html
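The per-day to per-second rates follow from dividing the yearly estimate by the length of a 365-day year. A small sketch, with hypothetical names, that truncates to whole images as the slide appears to do:

```python
IMAGES_PER_YEAR = 80_000_000_000  # the slide's industry estimate


def image_rates(per_year=IMAGES_PER_YEAR):
    """Divide a yearly count into whole images per day, hour, minute, second."""
    days = 365  # assumes a non-leap year
    return {
        "day": per_year // days,
        "hour": per_year // (days * 24),
        "minute": per_year // (days * 24 * 60),
        "second": per_year // (days * 24 * 60 * 60),
    }


if __name__ == "__main__":
    for unit, count in image_rates().items():
        print(f"{count:,} per {unit}")
```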
Image Data Management: Original Image; Daughter Images; Thumbnail; Analyzed Image; Analysis Result; Metadata; Archived Image; Archived Data; Report; Audit Trail
Calibration and Standards? Very few real standards. Local calibration, if at all. Standards processes must be created and implemented across several fields. Necessary to identify: – Instrument standards – Reagent standards – Analysis standards – Data structure standards – Metadata standards – Algorithm identification (at least)
QSC Beads (Quantum Simply Cellular), Cat. No. 815, Bangs Laboratories, Inc., www.bangslabs.com. Identical microbeads with various calibrated binding capacities of goat anti-mouse IgG on their surface. [Figure: histogram of events versus mean fluorescence intensity (MFI) with peaks labeled Blank, 1, 2, 3 and 4; schematic of a bead with Ab sites.] Antibody binding capacity (ABC) provided by the manufacturer: Blank, 0 MESF; bead 1, 6,851 MESF; bead 2, 23,379 MESF; bead 3, 58,333 MESF; bead 4, 213,369 MESF. MESF = molecules of equivalent soluble fluorochrome.
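Bead sets like this are typically used to build a calibration curve that converts a measured MFI into binding capacity. The sketch below is one common approach (a least-squares line in log-log space), not necessarily the method used for this slide. The four capacity values are read from the slide as bead 1 = 6,851 through bead 4 = 213,369; the MFI readings are hypothetical, invented only so the example runs, and the function names are assumptions.

```python
import math

# Capacity values for beads 1-4 as read from the slide.
BEAD_ABC = [6851, 23379, 58333, 213369]
# HYPOTHETICAL MFI readings for the same beads (not from the slide).
BEAD_MFI_HYPOTHETICAL = [12.0, 40.0, 100.0, 360.0]


def fit_loglog(mfi, abc):
    """Least-squares fit of log10(abc) = slope * log10(mfi) + intercept."""
    xs = [math.log10(m) for m in mfi]
    ys = [math.log10(a) for a in abc]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mfi_to_abc(mfi, slope, intercept):
    """Convert a sample's MFI to binding capacity using the fitted line."""
    return 10 ** (slope * math.log10(mfi) + intercept)


if __name__ == "__main__":
    slope, intercept = fit_loglog(BEAD_MFI_HYPOTHETICAL, BEAD_ABC)
    print(f"slope={slope:.3f}, intercept={intercept:.3f}")
    print(f"ABC at MFI 75: {mfi_to_abc(75.0, slope, intercept):,.0f}")
```

The blank bead is left out of the fit because its capacity of 0 has no logarithm; in practice it would serve as a background reference.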
Noise measurement with a standard. R. M. Zucker and O. Price, Cytometry 43 (2001) 273-294
QC: Optical Filters. Depending on location, filters can be placed under extreme stress. Environmental conditions (humidity).
Excitation Efficiency Profiles (note: there really isn’t a 545 nm line available!). Management and compensation of fluorescence overlap become crucial.
Noise measurement in the images II. [Flowchart; recoverable labels: raw image; wavelet transform 1; wavelet transform 2; scale corr.; s/n child; s/n parent; s/n filter; decision "wavelet shrinkage effect?" with YES/NO branches; signal; noise; denoised image (signal); diff.]
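The flowchart on this slide is not recoverable in detail, so the following is only a generic illustration of wavelet shrinkage, not the two-transform, scale-correlation procedure the slide names. It is a minimal sketch assuming a one-level Haar transform on a 1-D signal of even length, with soft thresholding of the detail coefficients; all function names and the threshold value are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(x):
    """One-level Haar transform of an even-length sequence."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def soft_threshold(coeff, threshold):
    """Shrink a coefficient toward zero by `threshold`, zeroing small ones."""
    return math.copysign(max(abs(coeff) - threshold, 0.0), coeff)


def denoise(x, threshold):
    """Return (denoised signal, difference between raw and denoised)."""
    approx, detail = haar_forward(x)
    shrunk = [soft_threshold(d, threshold) for d in detail]
    denoised = haar_inverse(approx, shrunk)
    diff = [raw - den for raw, den in zip(x, denoised)]
    return denoised, diff


if __name__ == "__main__":
    row = [10.0, 10.4, 9.8, 10.1, 30.0, 29.7, 30.2, 30.1]  # made-up pixel row
    denoised, diff = denoise(row, threshold=0.5)
    print([round(v, 3) for v in denoised])
    print([round(v, 3) for v in diff])
```

The `diff` output corresponds to the removed component, which is the quantity one would inspect as a noise estimate; a real image pipeline would use a 2-D, multi-level transform.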
Light detector stability analysis I. Trend component: random walks (RW). Periodic component: dynamic harmonic regression.
Will there be solid standards in the future? Yes: we must have instruments that are properly designed so that the hardware itself is well calibrated. Using modeling, we will be able to determine the accuracy of actual measurements and predict new possible systems. Light sources can have built-in calibrators or independent calibrators. Software should be more sophisticated and perform quantitative calculations. Results should be truly quantitative and not “relative units”.
What standards are available? Beads for size, intensity, color. No calibration tools are available for the high-resolution optical microscope (the Richardson slide is no longer manufactured). In 1990 we created the Handbook of Flow Cytometry Methods to define methods exactly. In 1997 we created Current Protocols in Cytometry.
About original data… “It is crucially important to keep your original digital or analog data exactly as they were acquired and to record your instrument settings. This primary rule of good scientific practice will allow you or others to return to your original data to see whether any information was lost by the adjustments made to the images. In fact, some journal reviewers or editors request access to such primary data to ensure accuracy.” J Cell Biol. 166:11-15, 2004
Workshop on Standards and Calibration in Cytometry and Biological Imaging Modalities. Jointly sponsored by the International Society for Analytical Cytology (ISAC) and the Society for Biomolecular Sciences (SBS). Site and date not yet set. Aims: to highlight the areas of cell analysis that need to be standardized; to develop a series of recommendations on: – Data file standards – Imaging standards – Archival/storage standards – Compression modalities – Algorithms and processing – Analytical technologies
ISAC 21st Century: Flow and imaging are equally emphasized in ISAC. Standards and calibration. Biosafety issues. Core managers support. Education. Public policy.
www.isac-net.org; www.cyto.purdue.edu – Cytometry web/email discussion – Educational materials, tutorials, lectures. Next ISAC Congress: May 17-21, 2008, Budapest, Hungary. Some references: R.A. Hoffman, Current Protocols in Cytometry, 1997: 1.3.1-1.3.19; J.C.S. Wood, Current Protocols in Cytometry, 1997: 1.4.1-1.4.12; Cytometry, Volume 33, Number 2, 1998; R. M. Zucker and O. Price, Cytometry 43 (2001) 273-294.