Thanks to: DARPA BioComp DNA&RNA Polonies: Mitra, Shendure, Zhu Protein MS: Jaffe, Leptos Metabolism/Proliferation models : Segre, Vitkup, Badarinarayana.

Presentation transcript:

New Methods for Genomic Systems Biology (20-Mar-2003)
Thanks to: DARPA BioComp
DNA & RNA Polonies: Mitra, Shendure, Zhu
Protein MS: Jaffe, Leptos
Metabolism/Proliferation models: Segre, Vitkup, Badarinarayana

Modeling successes: 3D & Sequence alignment
gggatttagctcagtt gggagagcgccagact gaa gat ttg gag gtcctgtgttcgatcc acagaattcgcacca

Agenda for March 20
15' George Church: proteomics & polonies
20' Daniel Segre: metabolic modeling
10' Matt Wright: 3D & 4D modeling
25' Jingdong Tian: minigenome
10' Wayne Rindone: BioSpice
Discussion throughout is welcome.
10' Financial, etc.

Biosystems: Integrating Measures & Models
DNA, RNA, proteins, metabolites, replication rate, environment
Microbes, cancer & stem cells, Darwinian optima, in vitro replication, small multicellular organisms
RNAi, insertions, SNPs, interactions

Improving Models & Measures
Why model? "Killer applications": Share, Search, Merge, Check, Design

Why improve measurements?
Human genomes: (6 billion)^2 bp
Immune & cancer genome changes: >10^10 bp per time point
RNA ends & splicing: in situ bits/mm^3
Biodiversity: environmental & lab evolution
Compact storage: 10^5 bits/mm^3 now, more eventually

How? ($1K per genome, bits/$)
The issue is not speed, but integration.
Cost per 99.99% bp: including reagents, personnel, equipment/5 yr, overhead/sq. m
Sub-mm scale: 1 µm^3 = 1 femtoliter
Instruments: $2-50K per CPU

Projected costs determine when biosystems data overdetermination is feasible.
In 1984, pre-HGP (φX, pBR322, etc.): 0.1 bp/$, which would have been $30B per human genome.
In 2002 (de novo full vs. resequencing), ABI/Perlegen/Lynx: $300M vs. $3M, i.e. 10^3 bp/$ (4 log improvement).
Other data I/O (e.g. video): bits/$
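
The arithmetic behind these figures can be checked with a short sketch. It assumes a genome size of 3 x 10^9 bp, which is not stated on the slide but is the value implied by $30B at 0.1 bp/$; the function names are illustrative only.

```python
import math

# Assumed genome size (bp); implied by the slide's $30B at 0.1 bp/$.
HUMAN_GENOME_BP = 3e9


def cost_per_genome(bp_per_dollar, genome_bp=HUMAN_GENOME_BP):
    """Dollars needed to sequence one genome at a given yield (bp per dollar)."""
    return genome_bp / bp_per_dollar


def log_improvement(old_bp_per_dollar, new_bp_per_dollar):
    """Orders of magnitude gained between two bp/$ figures."""
    return math.log10(new_bp_per_dollar / old_bp_per_dollar)


def required_bp_per_dollar(target_cost, genome_bp=HUMAN_GENOME_BP):
    """Yield (bp per dollar) needed to reach a target cost per genome."""
    return genome_bp / target_cost


if __name__ == "__main__":
    print(f"1984: ${cost_per_genome(0.1):.3g} per genome")
    print(f"2002 resequencing: ${cost_per_genome(1e3):.3g} per genome")
    print(f"Improvement: {log_improvement(0.1, 1e3):.1f} logs")
    print(f"$1K genome needs {required_bp_per_dollar(1000):.3g} bp/$")
```

Under the same assumption, the $1K-per-genome goal corresponds to roughly 3 x 10^6 bp/$, a further 3.5 logs beyond the 2002 resequencing figure.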

Steeper than exponential growth: Kurzweil / Moore's law of ICs (1965)

New sequencing approaches in commercial R&D
Columns: Method; liter/bp; Length; Error; Test-set; $/device; bp/hr
Capillary µfluidics: e-6; 600; <0.1%; 1e11; 350k; 80k (ABI, Amersham, GenoMEMS, Caliper*, RTS*)
SeqByHyb: e-12; 1; <5%; 1e9; 200k; 1M (Perlegen-Affymetrix*, Xeotron*)
Mass spectrometry: (Sequenom, Bruker*)
Single molecule: >e-24; >>40?; >80; 30k-1M; 180k (Pore: Agilent*; Fluor: USGenomics, Solexa; FRET: VisiGen, Mobious)
In vitro DNA amplification (e.g. polonies), multiplex cycles:
Lynx*: e-15; 20; <3%; 1e7; ?; 1M
Pyroseq.*: e-6; >40; <1%; 1e6; 100k; 5k
HMS*: e-13; 1M?
ParAllele, 454, RTS*
* GMC has a potential financial interest (or Harvard license)

Why single molecules?
Integration from cells/genomes/RNAs to data.
Geometric constraints: who's "in cis" on a molecule, complex, or cell, e.g. DNA haplotypes & RNA splice-forms.

Polymerase colonies (Polonies) along a DNA or RNA molecule

Polymerase colony (polony) PCR in a gel
Figure labels: single molecule from library; 1st round of PCR; primer is extended by polymerase (primers A, A', B).
Primer A has a 5' immobilizing Acrydite.
Mitra & Church, Nucleic Acids Res. 27: e34

Sequence polonies by sequential, fluorescent single-base extensions
Hybridize universal primer. Add red (Cy3) dTTP. Wash. Add green (FITC) dCTP. Wash; scan.
Figure labels: primer sites B/B', strand ends 3'/5', template bases A G T . T C and G C G . . C

Inexpensive, off-the-shelf equipment
MJR in situ cycler: $10K
Automated slide fluidics: $4K
Microarray scanner: $26K+

Human Haplotype: CFTR gene, 45 kbp (Rob Mitra, Vincent Butty, Jay Shendure, Ben Williams)

Quantitative removal of fluorophores (Rob Mitra)

Sequencing multiple polonies (Rob Mitra)
Template ST30: 3' TCACGAGT
Base added: (C) A G T (C) (A) G (T) C (A) (G) T C A
Result: 3' TCACGAGT paired with 5' AGTGCTCA
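
The "Base added" row above can be reproduced with a small simulation. This is a sketch, not the authors' software: it assumes nucleotides are offered cyclically in the order C, A, G, T (inferred from the row itself), that at most one base is incorporated per addition, and that parentheses mark additions with no incorporation. The function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_extensions(template_3to5, cycle_order="CAGT"):
    """Simulate sequential single-base extension along a template.

    template_3to5: template bases written 3' to 5'.
    cycle_order:   assumed order in which nucleotides are offered.
    Returns (extended strand 5' to 3', list of additions), where an
    addition that is not incorporated is wrapped in parentheses.
    """
    # The growing strand is the complement of the template, read 5' to 3'.
    target = "".join(COMPLEMENT[base] for base in template_3to5)
    if not set(target) <= set(cycle_order):
        raise ValueError("cycle_order must offer every base the template needs")

    additions = []
    position = 0  # index of the next base to incorporate
    step = 0
    while position < len(target):
        offered = cycle_order[step % len(cycle_order)]
        if offered == target[position]:
            additions.append(offered)  # incorporated: polony lights up
            position += 1
        else:
            additions.append(f"({offered})")  # washed away, no signal
        step += 1
    return target, additions


if __name__ == "__main__":
    strand, additions = simulate_extensions("TCACGAGT")
    print("Extended strand 5'->3':", strand)
    print("Base added:", " ".join(additions))
```

Reading only the unparenthesized additions in order gives the sequence of the extended strand, which is how the per-cycle fluorescence images are turned into a read.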

Multiple Image Alignment
Metric based on optimal coincidence of high-intensity noise pixels over a matrix of local offsets (0.4 pixel precision)
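
A minimal sketch of the coincidence idea follows. It is an illustration under stated assumptions, not the authors' implementation: it searches whole-pixel offsets only (the 0.4 pixel precision quoted above would need sub-pixel interpolation or fitting, which is omitted), it treats any pixel at or above a caller-chosen threshold as "high intensity", and the function names are hypothetical.

```python
def bright_pixels(image, threshold):
    """Return the set of (row, col) positions at or above the threshold."""
    return {
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value >= threshold
    }


def best_offset(reference, moving, threshold, max_shift=3):
    """Find the integer (d_row, d_col) shift of `moving` that maximizes the
    number of bright pixels coinciding with bright pixels in `reference`."""
    ref_set = bright_pixels(reference, threshold)
    mov_set = bright_pixels(moving, threshold)

    best, best_score = (0, 0), -1
    for d_row in range(-max_shift, max_shift + 1):
        for d_col in range(-max_shift, max_shift + 1):
            score = sum(
                1 for (r, c) in mov_set if (r + d_row, c + d_col) in ref_set
            )
            if score > best_score:
                best, best_score = (d_row, d_col), score
    return best, best_score


if __name__ == "__main__":
    # Toy 8x8 images: three bright noise pixels, shifted by (+1, -2).
    reference = [[0] * 8 for _ in range(8)]
    moving = [[0] * 8 for _ in range(8)]
    for r, c in [(2, 3), (5, 4), (6, 6)]:
        reference[r][c] = 255
        moving[r + 1][c - 2] = 255
    print(best_offset(reference, moving, threshold=200))
```

Applying the search to local tiles rather than the whole image would give the "matrix of local offsets" the slide refers to.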

Polony exclusion principle & Single pixel sequences Mitra & Shendure


CD44 Exon Combinatorics (Zhu & Shendure)
Alternatively spliced cell adhesion molecule.
Specific variable exons are up- or down-regulated in various cancers.
Controversial prospective diagnostic / prognostic marker (>1000 papers).
Can full isoforms resolve the controversy and/or act as superior markers?
Eph4 = murine mammary epithelial cell line
Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)

Trial & Error Derived Algorithm for Polony Finding
1. Search signature image for qualified 'objects':
   a. > 50 connected pixels with same signature value
   b. 'solidity' of > 0.50
   c. long axis / short axis ratio < 3
   OR
   a. > 25 connected pixels with same signature value
   b. 'solidity' of > 0.80
   c. long axis / short axis ratio < …
2. Search for internal regional maxima within each object (lest two adjacent polonies with the same signature get counted as one)
3. Assign centroid locations as qualified individual 'polonies'
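
Steps 1 and 3 above can be sketched as follows. Several things here are assumptions rather than details from the slide: pixels are grouped with 4-connectivity, a signature value of 0 means background, solidity and axis ratio are supplied by the caller (computing them needs a convex hull and an ellipse fit, omitted here), and `small_ratio_max` is a hypothetical parameter standing in for the axis-ratio cutoff of the second rule, whose value is not preserved. Step 2 (regional maxima) is not implemented.

```python
from collections import deque


def connected_objects(signature):
    """Group non-background pixels sharing a signature value into
    4-connected objects. Returns a list of (value, [(row, col), ...])."""
    rows, cols = len(signature), len(signature[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or signature[r][c] == 0:
                continue
            value = signature[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and signature[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append((value, pixels))
    return objects


def qualifies(n_pixels, solidity, axis_ratio, small_ratio_max):
    """Apply the two alternative rules of step 1.
    small_ratio_max is a placeholder for the cutoff lost from the slide."""
    large_rule = n_pixels > 50 and solidity > 0.50 and axis_ratio < 3
    small_rule = (n_pixels > 25 and solidity > 0.80
                  and axis_ratio < small_ratio_max)
    return large_rule or small_rule


def centroid(pixels):
    """Mean (row, col) of an object's pixels (step 3)."""
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


if __name__ == "__main__":
    image = [
        [1, 1, 0, 2],
        [1, 1, 0, 2],
        [0, 0, 0, 0],
    ]
    for value, pixels in connected_objects(image):
        print(value, len(pixels), centroid(pixels))
```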

Variable exons: V1 V2 V3 V4 V5 V6 V7 V8 V9 V10

Examples of counts (isoforms), of 8000 analyzed (Jun Zhu)

Summary of counts (isoforms) (Jun Zhu)
Eph4 = murine mammary epithelial cell line
Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)
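
One way to tally isoforms from per-polony exon calls is to encode each polony's set of detected variable exons as a bit signature and count identical signatures. The sketch below is illustrative only: the bit encoding, the function names, and the example polonies are assumptions, not the encoding or data used by Zhu & Shendure.

```python
from collections import Counter

# The ten variable exons named on the slide.
EXONS = [f"V{i}" for i in range(1, 11)]


def signature(present_exons):
    """Encode a set of detected exons as an integer, one bit per exon
    (V1 is bit 0, V10 is bit 9). This encoding is an assumption."""
    return sum(1 << EXONS.index(exon) for exon in set(present_exons))


def decode(sig):
    """Recover the tuple of exon names from a signature value."""
    return tuple(exon for i, exon in enumerate(EXONS) if sig & (1 << i))


def count_isoforms(polonies):
    """Count how many polonies carry each exon combination."""
    counts = Counter(signature(p) for p in polonies)
    return {decode(sig): n for sig, n in counts.items()}


if __name__ == "__main__":
    # Made-up polony calls, for illustration only.
    example = [{"V3"}, {"V3"}, {"V8", "V9", "V10"}, set(), {"V3"}]
    for isoform, n in sorted(count_isoforms(example).items()):
        print(isoform or ("no variable exons",), n)
    print("Possible combinations:", 2 ** len(EXONS))
```

With ten variable exons there are 2^10 = 1024 possible combinations, which is why counting full isoforms per molecule gives more information than measuring each exon's level separately.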

Polony Flavors
1. Replica Plating of DNA images [Mitra et al. NAR 1999]
2. Long Range Haplotyping [Mitra et al. PNAS 2003]
3. Allelic mRNA Quantitation (HEP) [Mitra et al. 2003]
4. Alternative Splicing Combinatorics [Zhu et al. 2003]
5. Precise SNP-mutant & mRNA ratios [Merrill et al. 2003]
6. Fluor in situ Sequencing (FISSEQ 1) [Mitra et al. 2003]
7. Multiplex Genotyping (ApoE, Hyman, Shendure & Williams)
8. In situ / single-cell extensions of the above (Zhu & Williams)


Comparison of predicted with observed protein properties (abundance, localization, postsynthetic modifications) in E. coli
Link et al., Electrophoresis 18:

Circadian Cycle Proteogenomic Map 1/4

Circadian Cycle Proteogenomic Map 2/4

Circadian Cycle Proteogenomic Map 3/4

Circadian Cycle Proteogenomic Map 4/4

Circadian & Cell Cycle Proteogenomic Map (zoom)
Numbers on top are in base pairs. ORFs are predicted. The proteomic model is based on mass spectrometry of peptides at 24h time points. The difference map indicates new peptide regions. The 6 colors represent ORFs in the 6 reading frames.
(Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)

Circadian time-series (Prochlorococcus): RNA & protein quantitation
Plot labels: R^2 = 0.992; R^2 = 0.635; linear regression R^2 = 0.1; axis: RNA (3 AM)
(Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)
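
For readers unfamiliar with the R^2 values quoted on this slide, the sketch below shows how the coefficient of determination of a simple linear regression is computed. The numbers in the example are made up for illustration and are not the Prochlorococcus measurements; the function name is hypothetical.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line y = a + b*x.
    For simple linear regression this equals the squared Pearson correlation."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either series is constant")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Illustrative values only, e.g. two replicate abundance measurements.
    replicate_a = [1.0, 2.0, 3.0, 4.0]
    replicate_b = [1.0, 3.0, 2.0, 4.0]
    print(f"R^2 = {r_squared(replicate_a, replicate_b):.3f}")
```

A value near 1 (such as 0.992) indicates that one series is almost perfectly predicted by a linear function of the other, while a value near 0.1 indicates that a line explains little of the variance.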
