

GENBANK FILE FORMAT
LOCUS
– LOCUS NAME: Usually the first letter of the genus and species names, followed by the accession number
– SEQUENCE LENGTH: Number of nucleotide base pairs in the sequence record
– MOLECULE TYPE: The type of molecule that was sequenced. Example: DNA
– GENBANK DIVISION: The three-letter abbreviation for the division the record belongs to
– MODIFICATION DATE: Date of last modification
– Example: [figure not reproduced in transcript]
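The LOCUS fields listed above can be pulled apart with a few lines of Python. This is a minimal sketch, not an official parser: it assumes the fields are whitespace-separated, that a topology token (`linear` or `circular`) may or may not appear before the division, and it uses an illustrative line modeled on the NCBI sample record. The names `Locus` and `parse_locus` are hypothetical.

```python
from typing import NamedTuple, Optional


class Locus(NamedTuple):
    name: str                 # LOCUS NAME
    length: int               # SEQUENCE LENGTH
    unit: str                 # "bp" for nucleotide records
    molecule_type: str        # MOLECULE TYPE, e.g. "DNA"
    topology: Optional[str]   # "linear"/"circular" when present (assumption)
    division: str             # GENBANK DIVISION, three letters
    date: str                 # MODIFICATION DATE


def parse_locus(line: str) -> Locus:
    """Split a LOCUS line into its fields (whitespace-separated sketch)."""
    tokens = line.split()
    if not tokens or tokens[0] != "LOCUS":
        raise ValueError("not a LOCUS line")
    rest = tokens[1:]
    if len(rest) == 6:
        # No topology token on the line.
        name, length, unit, mol, division, date = rest
        topology = None
    elif len(rest) == 7:
        name, length, unit, mol, topology, division, date = rest
    else:
        raise ValueError("unexpected number of LOCUS fields")
    return Locus(name, int(length), unit, mol, topology, division, date)


# Illustrative line, modeled on the NCBI sample record.
SAMPLE_LOCUS = "LOCUS       SCU49845     5028 bp    DNA             PLN       21-JUN-1999"
locus = parse_locus(SAMPLE_LOCUS)
print(locus.name, locus.length, locus.division)
```

Splitting on whitespace keeps the sketch short; a stricter reader would use the fixed column positions defined in the GenBank release notes.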

GENBANK FILE FORMAT
DEFINITION
– Brief description of the sequence
– Includes information such as:
   Source organism
   Gene name / protein name
   Sequence function
ACCESSION
– Unique identifier for a sequence record
VERSION
– Is in the format accession.version
– GI (GenInfo Identifier): sequence identification number
   A new GI is assigned if the sequence is altered
KEYWORDS
– Word or phrase describing the sequence
– Are generally present in older records
– If the entry has no keyword, the field contains a period
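The accession.version format can be illustrated with a short Python sketch. It assumes a VERSION line holding one accession.version token, optionally followed by a `GI:` token; the sample line is illustrative and modeled on the NCBI sample record, and `Version` and `parse_version_line` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Version(NamedTuple):
    accession: str       # stable identifier of the record
    version: int         # number after the dot
    gi: Optional[int]    # GenInfo Identifier, if the line carries one


def parse_version_line(line: str) -> Version:
    """Parse 'VERSION  accession.version [GI:number]' (sketch)."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "VERSION":
        raise ValueError("not a VERSION line")
    # Split on the last dot so the accession itself is kept whole.
    accession, dot, number = tokens[1].rpartition(".")
    if not dot or not accession or not number.isdigit():
        raise ValueError("expected accession.version")
    gi = None
    for token in tokens[2:]:
        if token.startswith("GI:") and token[3:].isdigit():
            gi = int(token[3:])
    return Version(accession, int(number), gi)


# Illustrative line, modeled on the NCBI sample record.
parsed = parse_version_line("VERSION     U49845.1  GI:1293613")
print(parsed)
```

Treating the GI as optional lets the same function read lines that carry only the accession.version token.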

GENBANK FILE FORMAT
SOURCE
– Organism name in an abbreviated form
– ORGANISM
   Formal scientific genus and species name
   Lineage information according to the phylogenetic classification scheme
REFERENCE
– Publications that discuss or report the sequence data
– Contains information such as:
   AUTHORS
   TITLE
   JOURNAL
   PUBMED
– PUBMED refers to a PubMed identifier that links to the corresponding PubMed record

GENBANK FILE FORMAT
FEATURES
– Biologically significant regions in the sequence
– SOURCE
   Length of the sequence
   Scientific name of the source organism
   Taxon ID (Taxonomy reference)
– GENE
   Name assigned to the region of biological interest
– CDS
   Coding sequence
   Includes the amino acid translation
ORIGIN
– May be blank
– Or may give a local pointer to the sequence start
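Each feature carries a location on the sequence, and the SOURCE feature's location spans the whole record, which is how it conveys the sequence length. The Python sketch below handles only the simplest location form, `start..end`. Compound forms such as `complement(...)` or `join(...)` are deliberately rejected, and the function names are hypothetical.

```python
import re
from typing import Tuple

# Matches only plain spans such as "687..3158" (1-based, inclusive).
_SIMPLE_SPAN = re.compile(r"^(\d+)\.\.(\d+)$")


def parse_simple_location(location: str) -> Tuple[int, int]:
    """Return (start, end) for a plain 'start..end' feature location."""
    match = _SIMPLE_SPAN.match(location.strip())
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        raise ValueError("start must not exceed end")
    return start, end


def span_length(location: str) -> int:
    """Number of bases covered by a plain span (both ends inclusive)."""
    start, end = parse_simple_location(location)
    return end - start + 1


# A source feature spanning a 5028 bp record (illustrative values).
print(span_length("1..5028"))
```

Because both coordinates are inclusive, the length is `end - start + 1` rather than `end - start`.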

GENBANK RECORD LAYOUT
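As a rough illustration of the layout, the Python sketch below groups the lines of a flat-file record under their top-level field names. It assumes that top-level keywords start in the first column, that subfields and continuation lines are indented, and that `//` ends the record. The record text is abbreviated and illustrative, modeled on the NCBI sample record rather than copied from it, and `split_sections` is a hypothetical helper.

```python
from typing import Dict, List

# Abbreviated, illustrative record (not a verbatim GenBank entry).
SAMPLE_RECORD = """\
LOCUS       SCU49845     5028 bp    DNA             PLN       21-JUN-1999
DEFINITION  Saccharomyces cerevisiae TCP1-beta gene, partial cds.
ACCESSION   U49845
VERSION     U49845.1  GI:1293613
KEYWORDS    .
SOURCE      Saccharomyces cerevisiae (baker's yeast)
  ORGANISM  Saccharomyces cerevisiae
FEATURES             Location/Qualifiers
     source          1..5028
ORIGIN
        1 gatcctccat atacaacggt
//
"""


def split_sections(record: str) -> Dict[str, List[str]]:
    """Group record lines under the top-level keyword they belong to."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in record.splitlines():
        if line.startswith("//"):       # end-of-record marker
            break
        if not line.strip():
            continue
        if not line[0].isspace():       # keyword in column 1: new section
            keyword, _, rest = line.partition(" ")
            current = keyword
            sections.setdefault(current, [])
            if rest.strip():
                sections[current].append(rest.strip())
        elif current is not None:       # indented: belongs to current section
            sections[current].append(line.strip())
    return sections


sections = split_sections(SAMPLE_RECORD)
print(list(sections))
```

Repeated keywords such as REFERENCE are merged into one list here; a fuller reader would keep each reference separate.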

GENBANK RECORD SUMMARY
– GenBank is an annotated collection of all publicly available nucleotide sequences
– The GenBank record format must be flexible enough to integrate biological data from numerous sources without difficulty
– GenBank records contain comprehensive information on an entry, including its source, its distinguishing characteristics, and the journal articles pertaining to it