Human Genome Sequence and Variability Gabor T. Marth, D.Sc. Department of Biology, Boston College Medical Genomics Course – Debrecen, Hungary,

1 Human Genome Sequence and Variability Gabor T. Marth, D.Sc. Department of Biology, Boston College marth@bc.edu Medical Genomics Course – Debrecen, Hungary, May 2006

2 Lecture overview 1. Genome sequencing strategies, sequencing informatics 2. Genome annotation, functional and structural features in the human genome 3. Genome variability, DNA nucleotide, structural, and epigenetic variations

3 1. The Human genome sequence

4 The nuclear genome (chromosomes)

5 The genome sequence: the primary template on which to outline functional features of our genetic code (genes, regulatory elements, secondary structure, tertiary structure, etc.)

6 Completed genomes (genome sizes shown: ~1 Mb, ~100 Mb, >100 Mb, ~3,000 Mb)

7 Main genome sequencing strategies: clone-based shotgun sequencing (Human Genome Project); whole-genome shotgun sequencing (Celera Genomics, Inc.)

8 Hierarchical genome sequencing: BAC library construction; clone mapping; shotgun subclone library construction; sequencing; sequence reconstruction (sequence assembly). Lander et al. Nature 2001

9 Clone mapping – “sequence ready” map

10 Hierarchical genome sequencing: BAC library construction; clone mapping; shotgun subclone library construction; sequencing/read processing; sequence reconstruction (sequence assembly). Lander et al. Nature 2001

11 Shotgun subclone library construction (figure labels: BAC primary clone, cloning vector, sequencing vector, subclone insert)

12 Hierarchical genome sequencing: BAC library construction; clone mapping; shotgun subclone library construction; sequencing/read processing; sequence reconstruction (sequence assembly). Lander et al. Nature 2001

13 Sequencing

14 Robotic automation Lander et al. Nature 2001

15 Base calling: PHRED (example call: base = A, Q = 40)
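The quality value on this slide can be read through the standard Phred definition, Q = -10 log10(P), where P is the estimated probability that the base call is wrong. The short sketch below converts in both directions; the function names are illustrative and are not part of the PHRED program.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score Q into an error probability P."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability P into a Phred quality score Q."""
    return -10 * math.log10(p)


# The slide's example call (base = A, Q = 40) corresponds to roughly
# one expected error in 10,000 calls of that quality.
print(phred_to_error_prob(40))
```

Under this definition, each 10-point increase in Q corresponds to a tenfold drop in the estimated error rate.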

16 Vector clipping

17 Hierarchical genome sequencing: BAC library construction; clone mapping; shotgun subclone library construction; sequencing/read processing; sequence reconstruction (sequence assembly). Lander et al. Nature 2001

18 Sequence assembly: PHRAP
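To illustrate the basic idea behind sequence assembly, the toy sketch below merges two reads that share an exact suffix-prefix overlap. This is not PHRAP's algorithm, which works with alignments and base qualities; the function names and the `min_overlap` parameter are assumptions made for this example. The two reads are taken from the start of the sequence shown on slide 23.

```python
def overlap_length(a, b, min_overlap=3):
    """Length of the longest suffix of `a` that exactly equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def merge_reads(a, b, min_overlap=3):
    """Merge two reads on their exact overlap; return None if there is none."""
    k = overlap_length(a, b, min_overlap)
    return a + b[k:] if k else None


read1 = "AGCGTGGTAGCGCGAG"
read2 = "GCGCGAGTTTGCGAGC"
print(overlap_length(read1, read2))  # the reads share the 7-base overlap GCGCGAG
print(merge_reads(read1, read2))
```

Exact matching of this kind breaks down when the same stretch occurs in more than one place in the genome, which is why repetitive DNA can mislead an assembler.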

19 Repetitive DNA may confuse assembly

20 Sequence completion (finishing): CONSED, AUTOFINISH (figure labels: gap; region of low sequence coverage and/or quality)

21 2. Human genome annotation

22 Genome annotation – Goals: protein coding genes, RNA genes, repetitive elements, GC content

23 The starting material AGCGTGGTAGCGCGAGTTTGCGAGCTAGCTAGGCTCCGGATGCGA CCAGCTTTGATAGATGAATATAGTGTGCGCGACTAGCTGTGTGTT GAATATATAGTGTGTCTCTCGATATGTAGTCTGGATCTAGTGTTG GTGTAGATGGAGATCGCGTAGCGTGGTAGCGCGAGTTTGCGAGCT AGCTAGGCTCCGGATGCGACCAGCTTTGATAGATGAATATAGTGT GCGCGACTAGCTGTGTGTTGAATATATAGTGTGTCTCTCGATATGT AGTCTGGATCTAGTGTTGGTGTAGATGGAGATCGCGTGCTTGAG TCGTTCGTTTTTTTATGCTGATGATATAAATATATAGTGTTGGTG GGGGGTACTCTACTCTCTCTAGAGAGAGCCTCTCAAAAAAAAAGCT CGGGGATCGGGTTCGAAGAAGTGAGATGTACGCGCTAGXTAGTAT ATCTCTTTCTCTGTCGTGCTGCTTGAGATCGTTCGTTTTTTTATGCT GATGATATAAATATATAGTGTTGGTGGGGGGTACTCTACTCTCTCT AGAGAGAGCCTCTCAAAAAAAAAGCTCGGGGATCGGGTTCGAAGA AGTGAGATGTACGCGCTAGXTAGTATATCTCTTTCTCTGTCGTGCT

24 Coding genes – ab initio predictions: ATGGCACCACCGATGTCTACGTGGTAGGGGACTATAAAAAAAAAAA (labels: Open Reading Frame = ORF, start codon, stop codon, polyA signal)
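The open reading frame labeled on this slide can be recovered with a simple scan: in each of the three forward reading frames, look for a start codon (ATG) and extend to the first in-frame stop codon (TAA, TAG, or TGA). The sketch below is a minimal illustration of that idea, not the method used by any of the gene finders named in the lecture; `find_orfs` and `min_codons` are hypothetical names, and the reverse strand is ignored for brevity.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, orf) tuples for forward-strand ORFs.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, seq[start:end]))
                start = None  # look for the next ORF in this frame
    return orfs


slide_seq = "ATGGCACCACCGATGTCTACGTGGTAGGGGACTATAAAAAAAAAAA"
for start, end, orf in find_orfs(slide_seq):
    print(start, end, orf)
```

On the slide's sequence the scan reports a single ORF running from the leading ATG to the in-frame TAG. Real ab initio gene finders combine such signals with statistical models of coding sequence and gene structure.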

25 Ab initio predictions Gene structure

26 Ab initio predictions: …AGAATAGGGCGCGTACCTTCCAACGAAGACTGGG… (labels: splice donor site, splice acceptor site)
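Most introns begin with the dinucleotide GT (the splice donor site) and end with AG (the splice acceptor site). As a rough sketch of how such signals could be enumerated, the code below lists every GT...AG span in the slide's sequence fragment. The function names and the `min_length` cutoff are assumptions for illustration; real introns are far longer, and real predictors score the surrounding sequence context rather than the dinucleotides alone.

```python
def find_motif(seq, motif):
    """Return all 0-based start positions of `motif` in `seq`."""
    n = len(motif)
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == motif]


def candidate_introns(seq, min_length=4):
    """List (start, end, span) for every GT...AG span of at least `min_length` bases."""
    seq = seq.upper()
    donors = find_motif(seq, "GT")
    acceptors = find_motif(seq, "AG")
    introns = []
    for d in donors:
        for a in acceptors:
            end = a + 2  # end-exclusive, includes the AG
            if end - d >= min_length:
                introns.append((d, end, seq[d:end]))
    return introns


fragment = "AGAATAGGGCGCGTACCTTCCAACGAAGACTGGG"
print(candidate_introns(fragment))
```

For this fragment the only candidate is the span that starts at the single GT and ends at the downstream AG.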

27 Ab initio predictions: Genscan, Grail, Genie, GeneFinder, Glimmer, etc.; EST_genome, Sim4, Spidey, EXALIN

28 Homology-based predictions: ATGGCACCACCGATGTCTACGTGGTAGGGGACTATAAAAAAAAAAA; ACGGAAGTCT (known coding sequence from another organism); GGACTATAAA (expressed sequence). Genes predicted by homology: Genomescan, Twinscan, etc.

29 Consolidation – gene prediction systems: Otto, Ensembl, FgenesH, Genscan, Grail, Genewise, Sim4, dbEst

30 ncRNA genes: prediction based on structure (e.g. tRNAs); for other novel ncRNAs, only homology-based predictions have been successful

31 Repeat annotations: repeat annotations are based on sequence similarity to known repetitive elements in a repeat sequence library

32 The landscape of the human genome

33 Gene annotations – # of coding genes Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001

34 Gene annotations – gene length Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001

35 Gene annotations – gene function Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001

36 GC content and coding potential Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001
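GC content is the fraction of bases in a sequence that are G or C, and it is usually reported in windows along the genome. The sketch below shows one simple way to compute it; the function names, window size, and step are illustrative assumptions and do not reproduce the analysis in Lander et al.

```python
def gc_content(seq):
    """Fraction of G and C bases in `seq` (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """GC content of each full window of length `window`, advancing by `step`."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


print(gc_content("ATGC"))
print(windowed_gc("GGGGAAAA", window=4, step=4))
```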

37 ncRNAs Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001

38 Segmental duplications Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001

39 Repeat elements Lander et al. Initial sequencing and analysis of the human genome, Nature, 2001

40 Genes and repeats

41 Physical vs. genetic map (Mb/cM): genetic distances 0.4 cM, 1.3 cM, 0.7 cM; physical distances 0.4 Mb, 0.7 Mb, 0.3 Mb

42 3. Human genome variability

43 DNA sequence variations: the reference human genome sequence is 99.9% common to each human being; sequence variations make our genetic makeup unique. The most abundant human variations are single-nucleotide polymorphisms (SNPs); 10 million SNPs are currently known
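A single-nucleotide difference is easiest to see by comparing two sequences that are already aligned, position by position. The sketch below does exactly that; it assumes equal-length, gap-free alignments, and the function name and example sequences are hypothetical. Real SNP discovery must also handle alignment, sequencing errors, and base quality.

```python
def find_mismatches(reference, sample):
    """Return (position, reference_base, sample_base) for each differing site.

    Both sequences must be aligned and of equal length (no gaps).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


# Hypothetical aligned fragments differing at one position.
print(find_mismatches("ACGTACGT", "ACGTTCGT"))
```

As a rough scale check, 0.1% divergence over a genome of about 3,000 Mb corresponds to on the order of 3 million differing positions between two copies of the genome.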

44 DNA sequence variations: insertion-deletion (INDEL) polymorphisms

45 Structural variations Speicher & Carter, NRG 2005

46 Structural variations Feuk et al. Nature Reviews Genetics 7, 85–97 (February 2006) | doi:10.1038/nrg1767

47 Detection of structural variants Feuk et al. Nature Reviews Genetics 7, 85–97 (February 2006) | doi:10.1038/nrg1767

48 Epigenetic changes: chromatin structure Sproul, NRG 2005

49 Epigenetic changes: DNA methylation Laird, NRC 2003
