Virginia Commonwealth University


Virginia Commonwealth University Genome Research
Ping Xu, for BBSI
The Philips Institute, Virginia Commonwealth University, Richmond, Virginia

Microbial genome projects at Virginia Commonwealth University
1. Cryptosporidium hominis
2. Streptococcus sanguis
3. Trypanosoma cruzi
4. Human BAC or cosmid clones
5. Bacteriophages

General Procedures in Sequencing
- Subclone (genomic shotgun library)
- Production sequencing: template isolation, sequencing reactions, fragment separation, data acquisition, base calling
- Finishing: assembly, gap filling, conflict resolution, verification
- Analysis: gene predictions, homology searches, annotation

PE/ABI 3700 Capillary Sequencer: truly automated high-throughput sequencing
- automated
- faster runs (8 per day)
- capillary (easy to use)
- 96 capillaries per run
- $300,000 per machine
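The run figures on this slide translate into a rough daily read count for one machine. The following is a minimal sketch of that arithmetic. The 500-base usable read length and the 9 Mb genome used in the usage note are illustrative assumptions, not values from the slides, and the function names are hypothetical.

```python
import math

CAPILLARIES_PER_RUN = 96  # from the slide
RUNS_PER_DAY = 8          # from the slide


def reads_per_day(capillaries=CAPILLARIES_PER_RUN, runs=RUNS_PER_DAY):
    """Reads one machine produces per day if every capillary yields a read."""
    return capillaries * runs


def machine_days_for_coverage(genome_size_bp, fold_coverage, read_length_bp):
    """Machine-days needed to reach a given fold coverage.

    read_length_bp is an assumed average usable read length; failed
    reactions and trimming losses are ignored in this sketch.
    """
    total_bases = genome_size_bp * fold_coverage
    bases_per_day = reads_per_day() * read_length_bp
    return math.ceil(total_bases / bases_per_day)
```

Under these assumptions, `reads_per_day()` gives 768 reads, and `machine_days_for_coverage(9_000_000, 10, 500)` gives 235 machine-days for 10X coverage of a hypothetical 9 Mb genome.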

Raw data from ABI 3700 Prism Sequencer

[Figure: regions labeled both strands, single clone, and single strand, separated by gaps]

Strategies to sequence a genome
- Whole genome shotgun sequencing
- Whole genome mapping-based sequencing
- A hybrid of the two strategies above

From Rob Martienssen, Cold Spring Harbor Laboratory

Hybrid strategy
- Assemble shotgun sequences
- Align with BAC clones
- Overlay with other genomes
- BLAST search

Filtering
- A way to remove a large percentage of the repetitive regions of the genome in whole genome sequencing
- Methyl filtration is one approach
- Physical methods (hybridization methods) may also be useful

Skimming
1. Carry out 1-3 fold coverage of the region
2. Can be whole genome or clone based
3. Clone based can therefore be targeted
4. Covers ~66–97% of the target sequence
5. 99% or greater accuracy on average
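The link between fold coverage and the fraction of the target covered can be estimated with the idealized Lander-Waterman (Poisson) model, in which the expected covered fraction at coverage c is 1 - e^(-c). The slides do not say how their ~66–97% range was derived, so this is only a sketch under that model's assumptions (uniformly random reads, no cloning bias). The function name is hypothetical.

```python
import math


def expected_fraction_covered(fold_coverage):
    """Expected fraction of the target hit by at least one read.

    Idealized Poisson model: reads land uniformly at random, so a
    given base is missed with probability exp(-coverage).
    """
    return 1.0 - math.exp(-fold_coverage)


for c in (1, 3, 5, 10):
    print(f"{c}X coverage: {expected_fraction_covered(c):.4%} of target")
```

The model gives roughly 63% at 1X and 95% at 3X, close to the range quoted for skimming. It also suggests why higher coverage is needed before finishing: even at 5X a small fraction of bases is still expected to be unsampled.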

Rough draft
1. Typically 5X coverage
2. Can be thought of as high coverage skimming or low coverage complete sequencing
3. Advantages and disadvantages are intermediate between skimming and complete sequencing
4. Some propose treating a 10X rough draft as “finished”

Complete Sequence
1. More than 10X coverage, with every base accurate
2. Finishing: assembly, gap filling, conflict resolution, verification
3. Analysis: gene predictions, homology searches, annotation

Goals of a Complete Genome Project
1. Fill all gaps
2. Complete finishing
3. Base accuracy: 1 error per 10,000 bases
4. Complete annotation

Locating and filling the gaps
1. Find shotgun or BAC sequence pairs that bridge the gaps
2. Compare with other closely related finished genomes
3. Run BLAST searches to find hits that span two contigs
4. Re-sequence shotgun clones with short sequences to extend the contigs
5. Design primers for genome walking
6. Use multiplex PCR to orient contigs that have no hits, and PCR amplification to bridge the gaps
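The first step, finding clone end pairs that bridge a gap, amounts to looking for clones whose two end reads were assembled into different contigs. A minimal sketch follows. The input format (a dictionary from clone ID to the contigs holding its forward and reverse reads) and all names are hypothetical, chosen only for illustration.

```python
from collections import Counter


def bridging_pairs(placements):
    """Count clones whose two end reads sit in different contigs.

    placements maps a clone ID to (forward_contig, reverse_contig);
    None marks a read that was not assembled into any contig.
    Returns a Counter keyed by an alphabetically sorted contig pair.
    """
    links = Counter()
    for fwd_contig, rev_contig in placements.values():
        if fwd_contig is None or rev_contig is None:
            continue  # one end is missing, so the clone links nothing
        if fwd_contig == rev_contig:
            continue  # both ends in one contig: no gap is bridged
        links[tuple(sorted((fwd_contig, rev_contig)))] += 1
    return links


example = {
    "clone01": ("contig1", "contig2"),
    "clone02": ("contig2", "contig1"),
    "clone03": ("contig1", "contig1"),
    "clone04": ("contig3", None),
}
print(bridging_pairs(example))
```

Contig pairs supported by several independent clones are the more trustworthy candidates for directed sequencing or PCR across the gap.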

Finishing
Finishing is the process of assembling and refining raw sequence data into a highly accurate final genomic sequence.
1. Automated sequence editing
2. Manual, interactive sequence inspection
3. Directed sequencing
4. Assembly verification
5. Removal of contaminating sequences

High sequence accuracy matters for:
1. Diagnostics, forensics, etc.
2. Protein coding predictions
3. Repeat sequences, polymorphisms, SNPs, etc.
4. Evolutionary analysis and phylogenetics

How do we analyze sequence for gene function once it is obtained?
- Computational analysis: database searches (DNA or protein), compositional and domain analysis, comparative genomics, etc.
- “Wet lab”: individual gene analysis, chip analysis, knockouts

Function Conservation (homology searches)
1. Biochemical function: almost always conserved between homologous proteins
2. Physiological function: the phenotypes associated with conserved orthologous proteins
3. Biochemical function can be reliably inferred from genetic homology; physiological function cannot.


Sequence processing: basecalling and quality trimming with Phred
Phred quality score:
- 10 means 1 error in 10
- 20 means 1 error in 100
- 30 means 1 error in 1,000
- 40 means 1 error in 10,000
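The score table follows the standard Phred relationship Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The sketch below shows the conversion in both directions, plus a deliberately simplified end-trimming helper. The trimming rule (strip bases below a threshold from both ends) and the threshold of 20 are illustrative assumptions and are not Phred's own trimming algorithm.

```python
import math


def phred_to_error_prob(q):
    """Error probability for a Phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score for an error probability: Q = -10 log10(P)."""
    return -10 * math.log10(p)


def trim_low_quality_ends(seq, quals, min_q=20):
    """Strip bases with quality below min_q from both ends of a read.

    Simplified illustration only; low-quality bases inside the read
    are left untouched.
    """
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end]


print(trim_low_quality_ends("ACGTACGT", [5, 10, 30, 40, 35, 30, 12, 8]))
```

With the example qualities, the two low-scoring bases at each end are removed and the read is trimmed to `GTAC`.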

Phred/Phrap/Consed: a common program package for sequence assembly and finishing

Phred Phred is a base-calling program for DNA sequence traces. The program was developed by Drs. Phil Green and Brent Ewing. It is widely used by the largest academic and commercial sequencing laboratories.

Phrap Phrap is a leading program for DNA sequence assembly. Phrap is routinely used in some of the largest sequencing projects in the Human Genome Sequencing Project and in the biotech industry.
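To illustrate what sequence assembly means at its simplest, the toy sketch below merges two reads when the end of one exactly matches the start of the other. This is not Phrap's algorithm. It is an exact-match illustration that ignores sequencing errors, base qualities, reverse complements, and repeats, and the function names and minimum overlap of 3 are hypothetical choices.

```python
def longest_overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 when no exact overlap of at least min_len exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def merge_reads(a, b, min_len=3):
    """Join two reads across their overlap, or return None if none exists."""
    k = longest_overlap(a, b, min_len)
    return a + b[k:] if k else None


print(merge_reads("ACGTTGCA", "TGCAGGTC"))
```

Here the reads share the four bases `TGCA`, so the merged sequence is `ACGTTGCAGGTC`. Real reads contain errors, which is why production assemblers rely on alignment rather than exact matching.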

Consed Consed is a graphical tool for sequence finishing. It is a program for editing sequence assemblies created with Phrap assembly program. In addition to a full set of standard features (view traces, edit reads by inserting a base, deleting a base, substituting a base, etc.), it supports an efficient editing procedure designed for use by Phrap in subsequent reassemblies of the same data set.
