Introduction to genomes & genome browsers. Content: Introduction to genomes; The human genome; Human genetic variation (SNPs, CNVs, alternative splicing).


1 Introduction to genomes & genome browsers. Content: Introduction to genomes; The human genome; Human genetic variation (SNPs, CNVs, alternative splicing); Browsing the human genome. Celia van Gelder, CMBI, UMC Radboud, December 2014

2 Exponential Growth in Genomic Sequence Data. Axis: # of genomes. Milestones: first 2 bacterial genomes complete; first eukaryote complete (yeast); first metazoan complete (the roundworm C. elegans)

3

4

5 Ebola

6 The human genome. Genome: the entire sequence of DNA in a cell. 3 billion base pairs (3 Gb). 22 chromosome pairs + X and Y chromosomes. Chromosome length varies from ~50 Mb to ~250 Mb. About protein-coding genes (average gene length 3000 bases, but the largest known gene, dystrophin, is 2.4 Mb). The human genome is 99.9% identical among individuals. This means that any 2 persons differ in about 3 million nucleotides (0.1% of 3 billion).
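The "3 million" figure follows directly from the two numbers on the slide. A minimal sketch of that arithmetic, using integer maths to avoid rounding noise (the function and constant names are illustrative, not from the presentation):

```python
# Back-of-the-envelope check of the slide's numbers.
GENOME_SIZE_BP = 3_000_000_000  # ~3 Gb, as stated on the slide
DIFFERING_PER_THOUSAND = 1      # 99.9% identical -> 1 base in 1000 differs


def expected_differences(genome_size_bp: int, per_thousand: int) -> int:
    """Expected number of differing nucleotides between two genomes."""
    return genome_size_bp * per_thousand // 1000


print(expected_differences(GENOME_SIZE_BP, DIFFERING_PER_THOUSAND))  # 3000000
```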

7 Eukaryotic Genomes: more than collections of genes. Genes & regulatory sequences make up 5% of the genome. – Protein-coding genes – RNA genes (rRNA, snRNA, snoRNA, miRNA, tRNA) – Structural DNA (centromeres, telomeres) – Regulation-related sequences (promoters, enhancers, silencers, insulators) – Parasite sequences (transposons) – Pseudogenes (non-functional gene-like sequences) – Simple sequence repeats

8 The human genome (continued). From: Molecular Biology of the Cell (4th edition) (Alberts et al., 2002). Only 1.2% codes for proteins. Long introns, short exons. Large spaces between genes. More than half consists of repetitive DNA. Alu repeat: ~300 bp, > 1 million copies

9 Non coding DNA

10 Human Genetic Variation. Genetic variation explains some of the differences among people, such as: – Blood group – Eye color, skin color, hair color – Height – Higher or lower risk of getting particular diseases: cystic fibrosis, sickle cell disease, diabetes, cancer, arthritis, asthma, etc.

11 Variations in the Genome Common Sequence Variations Polymorphism Deletions Translocations Insertions Chromosome

12 Today’s focus: 1. Single Nucleotide Polymorphisms (SNPs) 2. Copy number variations (CNVs) 3. Alternative transcripts

13 Single Nucleotide Polymorphisms (SNPs). SNPs are DNA sequence variations that occur when a single nucleotide (A, T, C, or G) in the genome sequence is altered. For a variation to be considered a SNP, it must occur in at least 1% of the population. SNPs make up about 90% of all human genetic variation and occur every 100 to 300 bases. SNPs can occur in coding (gene) and non-coding regions of the genome; <1% alter the protein sequence.
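The SNP definition above has two parts: a single-base difference, and a frequency of at least 1% in the population. A minimal sketch of both checks on toy data follows; the function names, the aligned-sequence assumption, and the use of the second most common allele as "the variation" are illustrative choices, not something the presentation specifies.

```python
from collections import Counter


def find_mismatches(ref: str, other: str) -> list[tuple[int, str, str]]:
    """Return (0-based position, ref base, other base) for each single-base difference.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, r, o) for i, (r, o) in enumerate(zip(ref, other)) if r != o]


def is_common_variant(observed_alleles: list[str], min_percent: int = 1) -> bool:
    """True if the second most common allele reaches min_percent of observations."""
    counts = Counter(observed_alleles).most_common()
    if len(counts) < 2:
        return False  # only one allele observed: no variation at this site
    minor_count = counts[1][1]
    # Integer comparison avoids floating-point edge cases at exactly 1%.
    return minor_count * 100 >= min_percent * len(observed_alleles)


print(find_mismatches("ACGT", "ACCT"))
print(is_common_variant(["C"] * 198 + ["T"] * 2))
```

Real variant calling works on aligned reads and handles insertions, deletions and sequencing error; this sketch only illustrates the definition.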

14 SNPs: – determine properties like eye color, hair (curly or straight), or whether you can taste bitter or not – are used for identification and forensics – are used for estimating predisposition to disease – can cause drug side effects and/or non-responsiveness to the drug – have an impact on how humans respond to environmental factors like bacteria, viruses, toxins and chemicals – are used to predict specific genetic traits – are used for classifying patients in clinical trials – are used for mapping and genome-wide association studies of complex diseases

15 SNP - Bitter tasting, TAS2R38

16 SNP & disease, Alzheimer. Alzheimer's disease (AD) & apolipoprotein E (APOE). Apolipoprotein E is a cholesterol carrier that is found in the brain and other organs. APOE is suspected to be involved in amyloid beta aggregation and clearance, influencing the onset of amyloid beta deposition. APOE contains 2 SNPs that result in 3 possible alleles: E2, E3, E4.
Variant   rs    rs7412
E2        T     T
E3        T     C
E4        C     C
A person who inherits at least one E4 allele will have a greater chance of developing AD.
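The allele table on this slide is effectively a lookup from two bases to an APOE allele. A minimal sketch of that lookup is below. The identifier of the first SNP is cut off in the transcript, so the code refers to it only by position; all function and variable names are hypothetical.

```python
from typing import Optional

# Key: (base at the first SNP, base at rs7412), as in the slide's table.
# The first SNP's rs number is truncated in the source, so it is not named here.
APOE_ALLELES = {
    ("T", "T"): "E2",
    ("T", "C"): "E3",
    ("C", "C"): "E4",
}


def apoe_allele(first_snp_base: str, rs7412_base: str) -> Optional[str]:
    """Map one chromosome's two bases to E2/E3/E4; None if not in the slide's table."""
    return APOE_ALLELES.get((first_snp_base.upper(), rs7412_base.upper()))


def carries_e4(chromosome_a: tuple[str, str], chromosome_b: tuple[str, str]) -> bool:
    """True if at least one of a person's two chromosomes carries the E4 allele."""
    return "E4" in (apoe_allele(*chromosome_a), apoe_allele(*chromosome_b))


print(apoe_allele("T", "C"))
print(carries_e4(("T", "C"), ("C", "C")))
```

The sketch assumes phased data, meaning it is known which two bases sit on the same chromosome. It is an illustration of the table, not a risk predictor.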

17 Today’s focus: 1. Single Nucleotide Polymorphisms (SNPs) 2. Copy number variations (CNVs) 3. Alternative transcripts

18 Copy Number Variation. Copy Number Variations (CNVs): segments of DNA (> 1 kb) that are present at variable copy number in two or more genomes. When there are genes in the CNV regions, this can lead to variations in the number of gene copies between individuals. CNVs contribute to our uniqueness. CNVs can also influence the susceptibility to disease. CNVs may either be inherited or caused by de novo mutation.

19 Copy Number Variation. Figure: deletion (CN=0, CN=1), normal cell (CN=2), amplification (CN=3, CN=4)
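The copy number (CN) labels on this slide can be read as a simple classification relative to the normal diploid state of CN=2. A minimal sketch, with a hypothetical function name and the default of 2 taken from the slide's "normal cell" label:

```python
def classify_copy_number(cn: int, normal: int = 2) -> str:
    """Label a copy number relative to the normal state (CN=2 on the slide)."""
    if cn < 0:
        raise ValueError("copy number cannot be negative")
    if cn < normal:
        return "deletion"
    if cn > normal:
        return "amplification"
    return "normal"


for cn in range(5):
    print(f"CN={cn}: {classify_copy_number(cn)}")
```

The `normal` parameter is left adjustable because some regions, such as the X and Y chromosomes in males, are not expected at two copies.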

20 CNVs and their possible effects on gene expression. Cabianca D S, Gabellini D J Cell Biol 2010;191: © 2010 Cabianca and Gabellini

21 CNVs & disease. Many inherited genetic diseases result from CNVs: – Gene copy number can be elevated in cancer cells – Autism – Schizophrenia (dept. human genetics) – Mental retardation (dept. human genetics) – Parkinson's disease. There are CNVs that protect against HIV infection and malaria. The contribution of CNVs to common, complex diseases, such as diabetes and heart disease, is currently less well understood.

22 Today’s focus: 1. Copy number variations (CNVs) 2. Single Nucleotide Polymorphisms (SNPs) 3. Alternative transcripts

23 Alternative splicing

24 Defects in alternative splicing have been implicated in many diseases, including: – neuropathological conditions such as Alzheimer disease – cystic fibrosis – those involving growth and developmental defects – many human cancers, e.g. BRCA1 in breast cancer – beta-globin in beta-thalassemia – Parkinson's disease

25 Annotating & Browsing the Human Genome

26 Annotating the genome. Annotation: attaching biological information to sequences. Two main steps: 1. identifying elements on the genome 2. attaching biological information to these elements.

27 Basic & Advanced Genome Annotation. Basic: – Genomic location – Gene features: exons, introns, UTRs – Transcript(s) – Pseudogenes, non-coding RNA – Protein(s) – Links to other sources of information. Advanced: – Cytogenetic bands – Polymorphic markers – Genetic variation, including SNPs & CNVs – Repetitive sequences – cDNAs or mRNAs from related species – Genomic sequence variation – Regulation sequences (enhancers, silencers, insulators)

28 [Human] Genome Browsers EBI Ensembl NCBI Map Viewer UCSC Genome Browser Not limited to only human data

29 Ensembl ©EMBL-EBI

30 Other Ensembl Installations ©EMBL-EBI (2013)

31 Organized data based on chromosome location. Tracks: genes & predictions; variations & repeats; cross-species comparative data; & many more types of data, from expression & regulation to mRNA and ESTs. Gene X: description, transcript data, structure, Gene Ontology, pathway data, homologous genes, expression data, etc.

32

33 HGNC – a unique name and symbol for every gene in human. ENSG###: Ensembl Gene ID; ENST###: Ensembl Transcript ID; ENSP###: Ensembl Peptide ID; ENSE###: Ensembl Exon ID
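The four prefixes on this slide make it possible to tell what kind of feature an Ensembl identifier refers to from the identifier alone. A minimal sketch follows. It assumes human stable IDs of the form prefix plus 11 digits with an optional ".version" suffix; the example IDs are made up for illustration and the function name is hypothetical.

```python
import re
from typing import Optional

# Letter after "ENS" -> feature type, following the slide.
FEATURE_TYPES = {"G": "gene", "T": "transcript", "P": "peptide", "E": "exon"}

# Assumption: human stable IDs are "ENS" + type letter + 11 digits,
# optionally followed by a version such as ".2". IDs for other species
# carry an extra species code and are not matched by this pattern.
STABLE_ID = re.compile(r"ENS([GTPE])\d{11}(?:\.\d+)?")


def ensembl_feature_type(stable_id: str) -> Optional[str]:
    """Return 'gene', 'transcript', 'peptide' or 'exon'; None if not recognised."""
    match = STABLE_ID.fullmatch(stable_id.strip())
    return FEATURE_TYPES[match.group(1)] if match else None


# Made-up identifiers, used only to exercise the pattern.
for example in ("ENSG00000000001", "ENST00000000001.2", "ENSP00000000001", "MYO6"):
    print(example, "->", ensembl_feature_type(example))
```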

34 Ensembl: An Example Click for more details tracks

35 Direction of transcription Above blue line: forward strand Below blue line: reverse strand

36 Ensembl Transcripts ©EMBL-EBI. A red transcript comes from Ensembl or VEGA/Havana. A transcript from the Ensembl annotation pipeline has a number starting with 2 (MYO6-201). A transcript with VEGA/Havana manual curation has a number starting with 0 (MYO6-001). A gold, or merged, transcript is identical between Ensembl automated annotation and VEGA/Havana manual curation; it can be thought of as stable (unlikely to change) and is assigned a number beginning with 0. Only human, mouse, and zebrafish will have gold transcripts. A blue, pink or grey transcript is non-coding.
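The numbering rule on this slide can be written as a small check on the part of the transcript name after the last hyphen. This is a minimal sketch of the convention exactly as the slide describes it; the function name and return strings are hypothetical, and the convention may not hold for other Ensembl releases.

```python
def transcript_origin(name: str) -> str:
    """Guess a transcript's annotation source from its name, e.g. 'MYO6-201'."""
    _gene, separator, number = name.rpartition("-")
    if not separator or not number.isdigit():
        raise ValueError(f"expected a name like 'MYO6-201', got {name!r}")
    if number.startswith("2"):
        return "Ensembl annotation pipeline"
    if number.startswith("0"):
        # Per the slide, both manually curated and merged (gold) transcripts start with 0.
        return "VEGA/Havana manual curation or merged"
    return "unknown"


print(transcript_origin("MYO6-201"))
print(transcript_origin("MYO6-001"))
```

Splitting on the last hyphen keeps the check usable for gene symbols that themselves contain a hyphen.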

37

38

39 Synopsis: What can I do with Ensembl? View, examine & explore annotated information for any chromosomal region: – Genes – ESTs, mRNAs, alternative transcripts – Proteins – SNPs, and SNPs across strains (rat, mouse), populations (human), or even breeds (dog) – Homologues and phylogenetic trees across more than 40 species – Whole genome alignments – Conserved regions across species – Gene expression profiles. Upload your own data and use BLAST/BLAT against any Ensembl genome. Export sequence, or create a table of gene information.

40 Help Glossary FAQ Help & Documentation -> Tutorials Save configuration Share this link functionality Share this image functionality

