Professional Development Course 1 – Molecular Medicine Genome Biology June 12, 2012 Ansuman Chattopadhyay, PhD Head, Molecular Biology Information Services Health Sciences Library System University of Pittsburgh
Genomic achievements since the Human Genome Project
Objective Organism Whole Genome Sequence Databases Genome Browsers
Topics: Genome Sequencing Projects; NCBI Genome resources; Integrated Microbial Genome; UCSC Genome Bioinformatics; Genome Browsers: UCSC Genome Browser, UCSC Table Browser, NCBI Map Viewer, Generic Genome Browser (GBrowse)
Genome Biology Human Genome Project Video
Chromosome Structure
Genome Biology: Karyotype (adapted from NHGRI): Trisomy 21, Monosomy X
Genome Biology: Karyotype NHGRI
Genome Biology: Molecular Cloning: p53, CFTR, NFkB
Genome Biology: Time Line. 1976: RNA bacteriophage MS2. 1995: Haemophilus influenzae. 2003: published complete human reference genome. 2007: diploid genome sequence of an individual human. 2008: Jim Watson genome. 2011: published complete genomes for 1863 organisms. Other milestones on the timeline: human genome draft sequence, yeast, C. elegans, Drosophila.
DNA Sequencing Cost
Oxford Nanopore: A 20-node installation, using 8,000-nanopore cartridges, is expected to deliver a complete human genome at 50-fold coverage in 15 minutes, according to the company, or 3 terabases of data per day, based on a sequencing speed of 300 bases per second. For that setup, the cost per gigabase is expected to be under $10.
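As a rough check on the quoted figures, the sketch below multiplies them out. It assumes every pore sequences continuously at the stated speed (an idealization) and uses a 3-gigabase human genome, a round number that is not given on the slide.

```python
# Back-of-envelope arithmetic from the quoted Oxford Nanopore figures.
NODES = 20                        # nodes in the installation (from the slide)
PORES_PER_NODE = 8_000            # nanopores per cartridge (from the slide)
BASES_PER_SECOND_PER_PORE = 300   # sequencing speed (from the slide)
SECONDS_PER_DAY = 86_400


def peak_bases_per_day(nodes=NODES, pores=PORES_PER_NODE,
                       speed=BASES_PER_SECOND_PER_PORE):
    """Upper bound on daily output if every pore runs without pause."""
    return nodes * pores * speed * SECONDS_PER_DAY


def cost_for_coverage(genome_gb, coverage, dollars_per_gb):
    """Cost of sequencing a genome of genome_gb gigabases at a given fold coverage."""
    return genome_gb * coverage * dollars_per_gb


peak = peak_bases_per_day()
print(f"Peak output: {peak / 1e12:.2f} terabases/day")

# Assumption: 3 Gb human genome; $10/Gb is the slide's stated upper limit.
print(f"50-fold human genome: under ${cost_for_coverage(3, 50, 10):,}")
```

Under these assumptions the peak comes to about 4.15 terabases per day, so the quoted 3 terabases per day corresponds to roughly 72% of the theoretical maximum.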
Organism Whole Genome Sequences
Organism Whole Genome Sequences: Human, Mouse, Rat, Dog, Cow, Chimp, Rabbit, ...
Genomes OnLine Database (GOLD) Global comprehensive access to information regarding complete and ongoing genome projects, as well as metagenomes & metadata
Genome Resources
Search for an organism's whole genome sequence
Genome Resources: NCBI: Genome Resources, Genome. JGI: Integrated Microbial Genomes (IMG).
NCBI Genome
NCBI BioProject Query: Check the status of genome sequencing for an organism, such as honey bee. Answer: (1) Enter the search term under BioProject. (2) Select the appropriate organism. (3) The BioProject summary page provides information on available projects and sequencing status. (4) Click on Project Type for more detailed information. (5) Explore Related Resources.
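The same BioProject lookup can also be expressed as a query against NCBI's E-utilities `esearch` endpoint. The sketch below only builds the request URL and does not contact NCBI. The helper name `bioproject_search_url`, the `[Organism]` field tag, and the use of the scientific name Apis mellifera for honey bee are illustrative assumptions, not part of the course material.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def bioproject_search_url(organism, retmax=20):
    """Build an esearch URL that looks up BioProject records for an organism.

    Hypothetical helper: the query is restricted with the [Organism] field tag.
    """
    params = {
        "db": "bioproject",
        "term": f"{organism}[Organism]",
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Assumption: honey bee is searched by its scientific name.
url = bioproject_search_url("Apis mellifera")
print(url)
```

Opening the printed URL in a web browser, or fetching it with any HTTP client, should return the matching BioProject identifiers, which can then be inspected on the BioProject summary pages.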
Link to the video tutorial: Resources: NCBI Genome Project, NCBI Genome. Task: Find the genomic sequence for an organism, such as rabbit.
NCBI Genome Project A collection of complete and in-progress large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. The database is organized into organism-specific overviews that function as portals for browsing and retrieving projects pertaining to each organism. CLICK Rabbit
NCBI Genome Project : Rabbit Genome
Link to the video tutorial: Resources: Integrated Microbial Genome (IMG). Task: Find the genomic sequence for a bacterium, such as Salmonella enterica.
Human genome sequence
Genomic achievements since the Human Genome Project
Genome Biology: Structural Variations
Genome Reference Consortium Link to the PLoS Biology paper on the GRC :
NCBI Genome Resources
What is a Genome Browser? Genome browsers enable researchers to visualize and browse entire genomes with annotated data, including gene prediction and structure, proteins, expression, regulation, variation, and comparative analysis. The annotated data usually comes from multiple diverse sources.
Eukaryotic Genome Browsers Display: Vertical Display: Horizontal
Non-vertebrate Genome Browsers
Genome Browsers: The Big Three: NCBI Map Viewer, UCSC Genome Browser, EBI Ensembl. Also: Generic Genome Browser (GBrowse). Display: Vertical. Display: Horizontal.
UCSC Genome Browser
UCSC Genome Browser Default Tracks
UCSC Genome Browser Page: Groups of data (Tracks): mRNA and EST tracks; expression (such as microarray); comparative genomics (as a group or by individual species); variation and repeats (including SNPs, copy number variation); ENCODE tracks; phenotype and disease tracks; regulation (including TFBS).
Navigating the Human Genome: Browse the region of human chromosome 7 between 54,318,043 and 55,974,438 bp (chr7:54,318,043-55,974,438)
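A position string in the form used above can be split into its parts with a few lines of code. This is a minimal sketch: the function names are hypothetical, and it assumes the coordinates are 1-based and inclusive, as they are displayed in the browser's position box.

```python
import re

# Matches strings such as "chr7:54,318,043-55,974,438" (commas optional).
POSITION_RE = re.compile(r"^(chr\w+):([\d,]+)-([\d,]+)$")


def parse_position(position):
    """Split a browser position string into (chromosome, start, end)."""
    match = POSITION_RE.match(position.strip())
    if match is None:
        raise ValueError(f"not a valid position: {position!r}")
    chrom, start, end = match.groups()
    start = int(start.replace(",", ""))
    end = int(end.replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


def region_length(start, end):
    """Number of bases in an inclusive, 1-based interval."""
    return end - start + 1


chrom, start, end = parse_position("chr7:54,318,043-55,974,438")
print(chrom, start, end)          # chr7 54318043 55974438
print(region_length(start, end))  # 1656396
```

The region in this exercise therefore spans about 1.66 megabases.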
Link to the video tutorial: Resource: UCSC Genome Browser. Task: Browse the region of human chromosome 7 between 54,318,043 and 55,974,438 bp. What genes are present in this region?
UCSC Genome Browser
UCSC Genome Browser: Navigating a Genomic Region
UCSC Genome Browser: Navigating a Genomic Region What genes are present in this region?
Bioinformatics Institutions
UCSC Genome Browser: Navigating a Genomic Region: What is RefSeq?
NCBI Sequence Databases: GenBank: archival database of nucleotide sequences from >160,000 organisms. RefSeq: based on GenBank records; non-redundant, expert-verified database of reference sequences.
International Nucleotide Sequence Database Collaboration
Primary vs. Derivative Databases
RefSeq Scope & Accessions: Genomic DNA: NC_ (complete genome, complete chromosome, complete plasmid), NG_ (genomic region), NT_ (genomic contig). mRNA: NM_. Protein: NP_.
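The prefixes listed above lend themselves to a simple lookup. The sketch below covers only those five prefixes; RefSeq defines others that this slide does not mention. The function name and the sample accessions are illustrative inputs, not taken from the course material.

```python
# Accession prefixes and molecule types as listed on the slide.
REFSEQ_PREFIXES = {
    "NC_": "genomic DNA (complete genome, chromosome, or plasmid)",
    "NG_": "genomic DNA (genomic region)",
    "NT_": "genomic DNA (genomic contig)",
    "NM_": "mRNA",
    "NP_": "protein",
}


def classify_refseq(accession):
    """Return the molecule type implied by a RefSeq accession prefix."""
    prefix = accession.strip()[:3].upper()
    return REFSEQ_PREFIXES.get(prefix, "unknown prefix")


# Illustrative accessions used only to exercise the lookup.
for acc in ("NM_005228", "NP_005219", "NC_000007", "XY_123456"):
    print(acc, "->", classify_refseq(acc))
```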
RefSeq Status Codes: Provisional, Reviewed, Predicted, Genome Annotation
UCSC Genome Browser: Navigating a Genomic Region
Display Options: Hide: removes a track from view. Dense: all items collapsed into a single line. Squish: each item on a separate line, at 50% height and packed. Pack: each item separate, efficiently stacked at full height. Full: each item on a separate line.
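These display modes can also be set in a Genome Browser link, where each track name is passed as a URL parameter whose value is one of the five modes. The sketch below builds such a link. The assembly `hg19`, the track name `refGene`, and the helper `browser_url` are assumptions chosen for illustration; the link is constructed but not opened.

```python
from urllib.parse import urlencode

HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"
VISIBILITY_MODES = ("hide", "dense", "squish", "pack", "full")


def browser_url(db, position, track_modes):
    """Build a Genome Browser link with per-track display modes.

    track_modes maps a track name to one of VISIBILITY_MODES.
    """
    for track, mode in track_modes.items():
        if mode not in VISIBILITY_MODES:
            raise ValueError(f"{track}: unknown display mode {mode!r}")
    params = {"db": db, "position": position, **track_modes}
    return f"{HGTRACKS}?{urlencode(params)}"


# Assumptions: hg19 assembly and the refGene track name.
url = browser_url("hg19", "chr7:54318043-55974438", {"refGene": "pack"})
print(url)
```

Validating the mode before building the link catches typing mistakes early, before a malformed link is shared.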
UCSC Genome Browser: Navigating a Genomic Region
Gene Description
Gene Description: informative description, other resource links, microarray data, mRNA secondary structure, links to sequences, protein domains/structure, orthologs in other species, Gene Ontology™ descriptions, mRNA descriptions, pathways, genetic association studies, comparative toxicology, gene model
UCSC Genome Browser: Navigating a Genomic Region Find SNPs present in this region
Link to the video tutorial: File: UCSC_part2.swf. Resource: UCSC Genome Browser. Task: Browse the region of human chromosome 7 between 55,033,691 and 55,282,150 bp. What genetic variations are present in this region? Retrieve the DNA sequence of this genomic region, showing SNPs in red and all gene exons in blue.
UCSC Genome Browser: Navigating a Genomic Region
BLAT: Map a protein sequence into the genome
UCSC Blat: Place a Peptide Seq into the Genome. Peptide Seq: NKSSHFYSNVGLQIQTYELQESNVQLKLTVVET. Nucleotide seq: AAATCCTCACATTTTTACTCAAATGTTGGACTTCAAATTCAGACATATGAACTTCAGGAAAGCAATGTTCA
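Before submitting either sequence to BLAT, it can be useful to check that the nucleotide sequence encodes the peptide. The sketch below translates the nucleotide sequence in its first reading frame with the standard genetic code and looks for the result inside the peptide. It is a plain-Python illustration, not part of BLAT; trailing bases that do not complete a codon are ignored.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1, dropping whitespace and any incomplete final codon."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


peptide = "NKSSHFYSNVGLQIQTYELQESNVQLKLTVVET"
nucleotide = ("AAATCCTCACATTTTTACTCAA ATGTTGGACTTCAAATTCAGACAT "
              "ATGAACTTCAGGAAAGC AATGTTCA")

translated = translate(nucleotide)
print(translated)                # KSSHFYSNVGLQIQTYELQESNV
print(peptide.find(translated))  # 1
```

The nucleotide sequence as shown translates to residues 2 through 24 of the peptide, so both queries should map to the same genomic location.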
Link to the video tutorial: File: Blat.swf. Resource: UCSC BLAT. Task: Place an mRNA or peptide sequence into the human genome.
UCSC Blat
UCSC Blat Peptide Seq: NKSSHFYSNVGLQIQTYELQESNVQLKLTVVET
Thank you! Any questions? Carrie Iwema, Ansuman Chattopadhyay