Single Nucleotide Polymorphisms Jennifer Lyon Eskind Biomedical Library May 1, 2009 CRC Workshop Series.

Slides:



Advertisements
Similar presentations
1 of 25 Sequence Variation in Ensembl. 2 of 25 Outline SNPs SNPs in Ensembl Linkage disequilibrium SNPs in BioMart DAS sources.
Advertisements

Fatchiyah, PhD Dept Biology UB Fatchiyah.lecture.ub.ac.id
Outline to SNP bioinformatics lecture
Sequence Analysis MUPGRET June workshops. Today What can you do with the sequence? What can you do with the ESTs? The case of SNP and Indel.
ECE 501 Introduction to BME
Biological Databases Chi-Cheng Lin, Ph.D. Associate Professor Department of Computer Science Winona State University – Rochester Center
Sequence Analysis. Today How to retrieve a DNA sequence? How to search for other related DNA sequences? How to search for its protein sequence? How to.
PolyPhen and SIFT: Tools for predicting functional effects of SNPs Epi 244 Spring 2009 Sam S. Oh.
Chapter 17 From Gene to Protein. Gene Expression The process by which DNA directs the synthesis of proteins 2 stages: transcription and translation Detailed.
Genome Variations & GWAS
DbSNP: the NCBI database of genetic variation S. T. Sherry, M.H. Ward, M. Kholodov, J. Baker, L. Phan, E. M. Smigielski and K. Sirotkin, Nucleic Acids.
Computational Molecular Biology Biochem 218 – BioMedical Informatics Simple Nucleotide.
Introduction Basic Genetic Mechanisms Eukaryotic Gene Regulation The Human Genome Project Test 1 Genome I - Genes Genome II – Repetitive DNA Genome III.
On line (DNA and amino acid) Sequence Information
Chapter 17 Notes From Gene to Protein.
Presented by: Andrew McMurry Boston University Bioinformatics Children’s Hospital Informatics Program Harvard Medical School Center for BioMedical Informatics.
Problem Set I review BIOL221T: Advanced Bioinformatics for Biotechnology Irene Gabashvili, PhD.
- any detectable change in DNA sequence eg. errors in DNA replication/repair - inherited ones of interest in evolutionary studies Deleterious - will be.
5. Point mutations can affect protein structure and function
Mutations.
RNA and Protein Synthesis
Gene Mutations Higher Human Biology Unit 1 – Human Cells.
Genome Organization & Evolution. Chromosomes Genes are always in genomic structures (chromosomes) – never ‘free floating’ Bacterial genomes are circular.
Chapter 13. The Central Dogma of Biology: RNA Structure: 1. It is a nucleic acid. 2. It is made of monomers called nucleotides 3. There are two differences.
Online Mendelian Inheritance in Man (OMIM): What it is & What it can do for you Knowledge Management & Eskind Biomedical Library January 27, 2012 helen.
1 of 32 Sequence Variation in Ensembl. 2 of 32 Outline SNPs SNPs in Ensembl Haplotypes & Linkage Disequilibrium SNPs in BioMart HapMap project Strain-specific.
Gene Regulations and Mutations
Main Idea #4 Gene Expression is regulated by the cell, and mutations can affect this expression.
A Biology Primer Part III: Transcription, Translation, and Regulation Vasileios Hatzivassiloglou University of Texas at Dallas.
Mutations are changes in the genetic material of a cell or virus
 The central concept in biology is:  DNA determines what protein is made  RNA takes instructions from DNA  RNA programs the production of protein.
ABC for the AEA Basic biological concepts for genetic epidemiology Martin Kennedy Department of Pathology Christchurch School of Medicine.
Diving into the gene pool: Chromosomes, genes and DNA
Mutations.
Biodiversity. Genetic Mutations Change in base pairs Affect sequence May affect protein production Can alter genetic makeup within species.
12/16/14 StarterConnection/Exit: What is the true meaning of the word mutation? Are mutations bad / harmful? 12/16/14 Protein Synthesis Writing
CHAPTER 13 RNA and Protein Synthesis. Differences between DNA and RNA  Sugar = Deoxyribose  Double stranded  Bases  Cytosine  Guanine  Adenine 
In The Name of GOD Genetic Polymorphism M.Dianatpour MLD,PHD.
Single nucleotide polymorphisms and Large scale variation
Introduction A mutation is a change in the normal DNA sequence. They are usually neutral, having no effect on the fitness of the organism. Sometimes,
Lesson Four Structure of a Gene. Gene Structure What is a gene? Gene: a unit of DNA on a chromosome that codes for a protein(s) –Exons –Introns –Promoter.
Mutations in DNA changes in the DNA sequence that can be inherited can have negative effects (a faulty gene for a trans- membrane protein leads to cystic.
Chapter 2 Genetic Variations. Introduction The human genome contains variations in base sequence from one individual to another. Some sequence variants.
Protein Synthesis Transcription and Translation RNA Structure Like DNA, RNA consists of a long chain of nucleotides 3 Differences between RNA and DNA:
Visualization of genomic data Genome browsers. UCSC browser Ensembl browser Others ? Survey.
Genetics 3.1 Genes. Essential Idea: Every living organism inherits a blueprint for life from its parents.
Using public resources to understand associations Dr Luke Jostins Wellcome Trust Advanced Courses; Genomic Epidemiology in Africa, 21 st – 26 th June 2015.
Mutation. What you need to know How alteration of chromosome number or structurally altered chromosomes can cause genetic disorders How point mutations.
A change in the nucleotide sequence of DNA Ultimate source of genetic diversity Gene vs. Chromosome.
Javad Jamshidi Fasa University of Medical Sciences, September 2016 Session 2 Medical Genetics The Cellular and Molecular Basis of Inheritance.
Lesson Four Structure of a Gene.
Molecular mechanism of mutation
Lesson Four Structure of a Gene.
Wild-type hemoglobin DNA Mutant hemoglobin DNA LE Wild-type hemoglobin DNA Mutant hemoglobin DNA 3¢ 5¢ 3¢ 5¢ mRNA mRNA 5¢ 3¢ 5¢ 3¢ Normal hemoglobin.
Gene Mutations.
Ch 10: Protein Synthesis DNA to RNA to Proteins
Sequence Alignments—part 2
GENETIC MUTATIONS Section 5.6 Pg. 259.
Forensic DNA Analysis Protein Synthesis.
Types of Mutations.
School of Pharmacy, University of Nizwa
From Gene to Protein Central Dogma of Biology: DNA  RNA  Protein
Mutations changes in the DNA sequence that can be inherited
Do Now What is the central dogma of biology?
Mutations.
School of Pharmacy, University of Nizwa
BLAT Blast Like Alignment Tool
Mutation Notes.
Mutations.
Presentation transcript:

Single Nucleotide Polymorphisms Jennifer Lyon Eskind Biomedical Library May 1, 2009 CRC Workshop Series

Types of Genetic Variations Single Nucleotide Polymorphisms (SNP) –Single base pair changes GTCATTCGATT GTCAGTCGATT Indels –Small insertion/deletions CTT------GATC CTTACGGATC Small variable repeats – microsatellites –ACGACGACGACGACGACG (6 copies) –ACGACGACGACGACGACGACG (7 copies) Variable Long tandem repeats (can be dozens to hundreds to thousands) Chromosomal Aberrations: Translocations, Inversions, etc.

Focusing on SNPs Types of SNPs SNP nomenclature Resources for SNPs Examples and Challenges in Finding SNPs

SNPs Types SNPs can be categorized in a number of ways, the most common are by location and function (relative to a gene) Intragenic SNPs are often categorized by function – are they in a coding region, an intron, part of the mRNA, outside the mRNA but still in the gene locus (i.e., in the promoter) Extragenic SNPs may be considered simply ‘genomic’ or might be labeled relative to the nearest gene, ie. 5’ or 3’ to a gene An ‘extragenic’ SNP may affect regulatory regions important in gene expression or other DNA functions such as DNA replication.

SNP Functional Categories coding nonsynonymous –Missense, nonsense, frame shift coding synonymous Intronic –splice site mRNA utr –5' utr or 3' utr (gene) locus region (5’ or 3’ to the gene) – ‘near gene’ usually means within ~2000bp of gene genomic/extragenic (distant from any gene)

Coding Nonsynonymous SNPs Missense – change an aa

Coding Non-Synonymous SNPs Nonsense –Change an aa to a stop codon –Results in a shortened protein Frame Shift –Are really single-base indels –Drop or add one base and the triplet reading frame is thrown out of shift, altering all downstream aa’s and usually resulting in an earlier stop codon

SNP Nomenclature The Human Genome Variation Society ( has proposed some guidelines for SNP nomenclature, but at the moment, there is minimal consistency. Different sources will refer to the same SNP in different ways While dbSNP identifiers (rs# ) are becoming common, they are not required of publishing authors and not used in all cases.

SNPs at Base-Pair Level The base-pair change is given in various forms: A/C T→GC>T432G>CT73C The HGVS nomenclature recommendations: "c." for a coding DNA sequence (like c.76A>T) "g." for a genomic sequence (like g.476A>T) "m." for a mitochondrial sequence (like m.8993T>C "r." for an RNA sequence (like r.76a>u)

Position, position, position! The big issue with SNPs is identifying their location (numerically). Position can be specified: –Number location within a specific sequence –Relative to another genetic landmark Start site for a coding region of a gene Start or end of an exon or intron Relative to a marker Published articles are not always clear on this!!! Different resources may use different landmarks/numbering Numbering is always relative to the chosen sequence

Coding SNPs These are easier because they can be identified by the amino acid position rather than the base- pair position Most common nomenclature uses either 3-letter or single amino acid codes: Asn332AspOR A95V The HGVS recommendation is similar: "p." for a protein sequence (like p.Lys76Asn) Amino Acid (protein) coding sequence positions becoming more consistent, but are not always consistent

Database of SNPs (dbSNP) dbSNP is the international central repository for both single base nucleotide substitutions and short deletion and insertion polymorphisms accepts data submissions from scientists is integrated with the NCBI’s Entrez system

dbSNP Content The SNP database has two major classes of content: Submitted data, i.e., original observations of sequence variation: Submitted SNPs (SS) with ss# (ss ) Computed/curated data: Reference SNP Clusters (Ref SNP) with rs# (rs )

Reference SNP Clusters Ref SNP clusters are computer-generated and curated by NCBI staff Ref SNP Clusters define a non-redundant set of SNPs All individual SNPs submitted by a researcher are given a submitter SNP number (ss#) and then redundant (repetitive) submitter SNPs are combined into a RefSNP cluster record, with a unique rs# Ref SNP clusters may contain multiple submitted SNPs

Searching dbSNP dbSNP is searched like any other Entrez db Specialized fields include: FieldTagNotes Allele[Allele]Uses IUPAC codes for bases Chromosomal Location[CHRPOS]Uses chromosomal base-pair locations Contig Position[ctpos]Uses contig base-pair locations Function Class[Func]Includes coding synonymous, missense, nonsense, intron, utr, etc. SNP Class[SNP_Class]Includes snp, indel, mixed

SNP Limits Page

Creating a Complex Search Retrieve all synonymous coding reference SNPs for the human norepinephrine transporter gene (Slc6a2) from dbSNP Search Strategy: human[orgn] AND Slc6a2[gene] AND “coding synonymous” [FUNC] Note: To use the [gene] (gene name) field, it is necessary to have the official gene name or gene symbol as per the Human Gene Nomenclature Committee. Entrez Gene can be used to find these.

dbSNP Output – Graphical Display

dbSNP - Live Let’s look at a dbSNP reference SNP page: = http:// =

Finding SNPs - Challenges If rs# is available – start with it Not all rs#s have information in all databases Another database of interest is the Online Mendelian Inheritance in Man (OMIM)Online Mendelian Inheritance in Man (OMIM) OMIM doesn’t always provide rs#s even when there is one dbSNP records may link to OMIM or may not, even if the SNP is in an OMIM record

Example 1 rs (C>T) → Ile164Thr in ADRB2 gene HGVS nomenclature –NP_ :p.T164I To Find in OMIM Search with rs – yield nothing Search with ADRB2[gene] – find record Look at allelic variants:.0003 BETA-2- ADRENORECEPTOR AGONIST, REDUCED RESPONSE TO [ADRB2, THR164ILE ] It is a match

Example 2 rs A/G SNP located 5’ to CYP3A4 HGVS nomenclature: –NT_ :g C>T To find in OMIM Search with rs yields nothing Search with gene name CYP3A4 – find record Find list of allelic variants CYP3A4 PROMOTER POLYMORPHISM [CYP3A4, a-g PROMOTER] Compare info in dbSNP to info in OMIM (look at sequence)

Other Databases OMIM – NCBIOMIM HapMap - International HapMap ProjectHapMap ALFRED – Allele Frequence DatabasesALFRED HGVbaseG2P - Human Genome Variation database of Genotype-to-Phenotype informationHGVbaseG2P PharmGKB – Pharmacogenomics KnowledgebasePharmGKB F-SNP – Functional SNPsF-SNP