The Human Genome, impact in the biomedical domain Sonia ABDELHAK, PhD Molecular Investigation of Genetic Orphan Disorders Institut Pasteur de Tunis.

Presentation transcript:

Human Genome Project Historical context. Goals of the HGP. Strategy. Results. Impact on Biomedical domain. Discussion.

From April 1953 to April 2003: draft sequence in February 2001, « finished » sequence in April 2003

Brief history of HGP
– 1984 to 1986: first proposed at US DOE meetings
– 1988: endorsed by US National Research Council (funded by NIH and US DOE, $3 billion set aside)
– 1990: Human Genome Project started (NHGRI)
– Later: UK, France, Japan, Germany and China joined
– Celera announces a 3-year plan to complete the project years early
– February 2001: first draft published in Science and Nature
– April 2003: finished human genome sequence announced (published in Nature in 2004)

Challenges
Genome attributes
– Size
– Polymorphism
– Repeats (smaller repeats are technically difficult to sequence; some sequences are repeated all over the genome: how can these be placed?)
Available technology
– 600 bp per read (sequencing works by extension from a primer followed by gel electrophoresis; read length is limited by the resolution of the gel)
– Error (~1 error per 600 bases; sequencing multiple times decreases error, because the same error is unlikely in multiple reads; 10x coverage = error rate ~1/10,000)
– Relies on cloning (some regions, such as heterochromatin, are difficult to clone; some sequences rearrange or are deleted when cloned)
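The figures on this slide (600 bp reads, 10x coverage, a genome of about 3 billion bp) can be turned into a rough read count. The sketch below is only an illustration: the function names are hypothetical, and the uncovered-fraction estimate assumes reads land uniformly at random (the Poisson approximation of the Lander-Waterman model), which real cloning-based libraries only approximate.

```python
import math

def reads_needed(genome_bp: int, read_bp: int, coverage: float) -> int:
    """Number of reads so that total bases read = coverage x genome size."""
    return math.ceil(coverage * genome_bp / read_bp)

def uncovered_fraction(coverage: float) -> float:
    """Expected fraction of bases hit by no read, assuming reads start
    uniformly at random (Poisson approximation; an assumption here)."""
    return math.exp(-coverage)

if __name__ == "__main__":
    n = reads_needed(3_000_000_000, 600, 10)
    print(f"reads for 10x coverage: {n:,}")
    print(f"expected uncovered fraction at 10x: {uncovered_fraction(10):.2e}")
```

Under these assumptions, 10x coverage of a 3 Gb genome with 600 bp reads needs 50 million reads, which gives a sense of why sequencing throughput was a limiting factor.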

Goals of HGP
– Create a genetic and physical map of the 24 human chromosomes (22 autosomes, X & Y)
– Identify the entire set of genes & map them all to their chromosomes
– Determine the nucleotide sequence of the estimated 3 billion base pairs
– Analyze genetic variation among humans
– Map and sequence the genomes of model organisms

Model organisms
– Bacteria (Escherichia coli, Haemophilus influenzae, several others)
– Yeast (Saccharomyces cerevisiae)
– Plant (Arabidopsis thaliana)
– Roundworm (Caenorhabditis elegans)
– Fruit fly (Drosophila melanogaster)
– Mouse (Mus musculus)

Goals of HGP (II) Develop new laboratory and computing technologies to make all this possible Disseminate genome information Consider ethical, legal, and social issues associated with this research

Time-line large scale genomic analysis

Identification of microsatellite-type polymorphisms by sequence analysis:
tggtggcagaaatcattgtctgaaaagtaattgttttacttttattcttttcgtgtgtgtgtgtgt gtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgcatgtgccagatttcttgtttgaaaggcaat gagcttcatccaagtatcaa
Primers: IL-12p35AC F, IL-12p35AC R
atttcaggtgtgagccactgtgcctggccagaactttttcaatgaatattcaagataattgtata cacattttatatatatatatatatatacacacacacacacacacacatatgtatacacaca ttatatatataatccatgttatatacatctctacattatatatatccactatatatattttacttataca tatagattttatttttatgaactaggatcaaattgta
Primers: IL-12p40AC F, IL-12p40AC R
78.57% 69.23%
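Finding candidate microsatellites in a sequence such as the ones on this slide can be done with a simple pattern search. The following is a minimal sketch, not the method used by the author: the function name, the restriction to dinucleotide repeats, and the threshold of six repeat units are all assumptions chosen for illustration.

```python
import re

def find_dinucleotide_repeats(seq: str, min_units: int = 6):
    """Return (start, unit, n_units) for each run of a 2-base unit
    repeated at least min_units times. Homopolymer runs are skipped."""
    seq = seq.lower().replace(" ", "")
    # ([acgt]{2}) captures the unit; \1{k,} requires k further copies of it
    pattern = re.compile(r"([acgt]{2})\1{%d,}" % (min_units - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        if unit[0] == unit[1]:
            continue  # e.g. "aaaaaa..." is not a dinucleotide repeat
        hits.append((m.start(), unit, len(m.group(0)) // 2))
    return hits

if __name__ == "__main__":
    # Fragment adapted from the first sequence on the slide (whitespace removed)
    fragment = "tattcttttcgtgtgtgtgtgtgtgtgtgtgtgcatgtgccag"
    for start, unit, n in find_dinucleotide_repeats(fragment):
        print(f"({unit}){n} at position {start}")
```

Once a repeat is located, primers such as the F/R pairs named on the slide are designed in the unique flanking sequence, so that the PCR product length reflects the number of repeat units carried by each allele.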

EST Division: Expressed Sequence Tags
Figure labels: nucleus, ,000 genes; ,000 RNA gene products; make cDNA library; ,000 unique cDNA clones in library; isolate unique clones; sequence once from each end (TAGTCA, CGTACT: sequence1, sequence2 of clone xyz); ESTs; dbEST
>IMAGE: ', mRNA sequence
NNTCAAGTTTTATGATTTATTTAACTTGTGGAACAAAAATAAACCAGATTAACCACAACCATGCCTTACT
TTATCAAATGTATAAGANGTAAATATGAATCTTATATGACAAAATGTTTCATTCATTATAACAAATTTCC
AATAATCCTGTCAATNATATTTCTAAATTTTCCCCCAAATTCTAAGCAGAGTATGTAAATTGGAAGTTAA
CTTATGCACGCTTAACTATCTTAACAAGCTTTGAGTGCAAGAGATTGANGAGTTCAAATCTGACCAAGAT
GTTGATGTTGGATAAGAGAATTCTCTGCTCCCCACCTCTANGTTGCCAGCCCTC
>IMAGE: ' mRNA sequence
GACAGCATTCGGGCCGAGATGTCTCGCTCCGTGGCCTTAGCTGTGCTCGCGCTACTCTCTCTTTCTGGCC
TGGAGGTATCCAGCGTACTCCAAAGATTCAGGTTTACTCACGTCATCCAGCAGAGAATGGAAAGTCAAAT
TTCCTGAATTGCTATGTGTCTGGGTTTCATCCATCCGACATTGAAGTTGACTTACTGAAGAATGGAGAGA
GAATTGAAAAAGTGGAGCATTCAGACTTGTCTTTCAGCAAGGACTGGTCTTTCTATCTCTTGTACTACAC
TGAATTCACCCCCACTGAAAAAGATGAGTATGCCTGCCGTGTTGAACCATGTNGACTTTGTCACAGNCCC
AAGTTNAGTTTAAGTGGGNATCGAGACATGTAAGGCAGGCATCATGGGAGGTTTTGAAGNATGCCGCNTT
TTGGATTGGGATGAATTCCAAATTTCTGGTTTGCTTGNTTTTTTAATATTGGATATGCTTTTG

Figure: Dye Terminator sequencing chemistry. Sequencing reaction (primer, template DNA, Taq polymerase), electrophoresis on slab gel or capillary (loading, detection), automated analysis of the resulting base calls.

Two Competing Strategies for the Human Genome
– Hierarchical shotgun [Public human genome project]
– Whole-genome shotgun [Celera project]

Sequencing BAC: Bacterial Artificial Chromosome clone Contig: joined overlapping collection of sequences or clones.

Whole-genome shotgun sequencing
– Used by the private company Celera to sequence the whole human genome
– Whole genome randomly sheared three times: plasmid library with ~2 kb inserts, plasmid library with ~10 kb inserts, BAC library with ~200 kb inserts
– Computer program assembles sequences into chromosomes
– No physical map construction
– Only one BAC library
– Reduces problems of repeat sequences
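The step "computer program assembles sequences" can be illustrated with a toy greedy overlap assembler. This is a deliberately simplified sketch and not the assembler Celera used: it assumes error-free reads from a single strand, ignores the paired-end and insert-size information that the three libraries provide, and the function names and the minimum overlap of 3 bases are hypothetical choices.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b
    (at least min_len long), or 0 if there is none."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the longest overlap.
    Returns the remaining contigs once no overlaps are left."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more: several contigs remain
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]
    return contigs

if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

A greedy strategy like this one merges reads wrongly whenever a repeat is longer than the overlap threshold, which is why repeats are listed among the main challenges and why real assemblers rely on read pairs of known spacing.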

The different steps of sequence analysis
– Sequence quality control
– Removal of contaminating sequences: Blastn against databases of vector, bacterial and yeast sequences, …
– Assembly: Phred, Phrap, Consed
– Identification of potentially coding sequences: comparison with sequence databases, exon-prediction software
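The quality-control step is commonly expressed through Phred quality scores, where a score Q corresponds to an estimated base-calling error probability P = 10^(-Q/10). The sketch below shows that conversion and a very simple 3' end trimming rule. The trimming rule, the threshold of Q20 and the function names are assumptions for illustration, and this is not the algorithm implemented in the Phred program itself.

```python
import math

def phred_to_error_prob(q: float) -> float:
    """Error probability for a Phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p: float) -> float:
    """Phred quality score for an error probability: Q = -10 log10(P)."""
    return -10 * math.log10(p)

def trim_low_quality_end(seq: str, quals, min_q: int = 20) -> str:
    """Drop bases from the 3' end while their quality is below min_q
    (a simplified rule assumed here, not the Phred trimming algorithm)."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end]

if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        print(f"Q{q}: 1 error in {round(1 / phred_to_error_prob(q)):,} bases")
    print(trim_low_quality_end("ACGTAC", [30, 30, 25, 22, 10, 5]))
```

On this scale, the slide's figure of about 1 error in 10,000 bases after 10x coverage corresponds to Q40.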

Figure: the three collaborating nucleotide sequence databases exchange submissions and updates: GenBank (NCBI, NIH; queried through Entrez), EMBL (EBI; queried through SRS) and DDBJ (NIG, CIB; queried through getentry).

HTG Division: High Throughput Genome Records 40,000 to > 350,000 bp phase 1 phase 2 phase 3 HTG PRI Acc = AC gi = Acc = AC gi = Acc = AC gi =

2.88 Gbp 2,851,330,913

Gene prediction
– Easy for prokaryotes: one gene, one protein
– More difficult for eukaryotes: one gene, many proteins
– Very difficult for human: short exons separated by long non-coding introns
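Why prediction is "easy" for prokaryotes can be seen from a naive open reading frame (ORF) scan: without introns, a gene is essentially an ATG followed by an uninterrupted run of codons up to a stop codon. The sketch below is an assumption-laden illustration (forward strand only, first ATG taken as the start, hypothetical function name and length threshold), and it would fail on human genes precisely because their short exons are split by long introns.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq: str, min_codons: int = 3):
    """Return (start, end) of ATG...stop stretches in the three forward
    frames; end is exclusive and includes the stop codon."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a reading frame at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the frame and look for the next ATG
    return orfs

if __name__ == "__main__":
    dna = "CCATGAAATTTGGGTAACC"
    for start, end in find_orfs(dna):
        print(start, end, dna[start:end])
```

Eukaryotic gene finders replace this single rule with statistical models of exons, introns and splice sites, for example hidden Markov models trained on known genes.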

Gene recognition
– Coding regions and non-coding regions have different sequence profiles: the coding region is protected from mutation and is less random
– Gene recognition by sequence alignment
– Gene prediction by Hidden Markov Model trained on a set of known genes
– Many genes are homologs: similar in vastly different organisms

Two predictions disagree John B. Hogenesch, et al Cell, Vol. 106, 413–415 August 24, 2001 …predicted transcripts collectively contain partial matches to nearly all known genes, but the novel genes predicted by both groups are largely non-overlapping.

Human genome content
– Total length: 3000 Mb
– ~40,000 genes (coding sequences)
– Gene sequences: < 5%, of which exons (coding) ~1.5% and introns (noncoding) ~3.5%
– Intergenic regions ("junk"): > 95%
– Repeats: > 50%

Global properties
– Pericentromeric and subtelomeric regions of chromosomes are filled with large recent segmental duplications
– Marked decline in the overall activity of transposable elements (transposons)
– Male mutation rate is about twice the female rate: most mutation occurs in males
– Recombination rates are much higher in distal regions of chromosomes and on shorter chromosome arms, a pattern that promotes at least one crossover per chromosome arm in each meiosis

Fig 17 Classes of transposable elements. LINE, long interspersed element; SINE, short interspersed element. Interspersed repeats (fixed transposable elements copied to non-homologous regions) make up 45% of the genome in total.

Fig 21 Two regions of about 1 Mb on chromosomes 2 and 22. Red bars, interspersed repeats; blue bars, exons of known genes. Note the deficit of repeats in the HoxD cluster, which contains a collection of genes with complex, interrelated regulation. Genes are sometimes protected from repeats

Important features of the human proteome
– 30,000–40,000 protein-coding genes
– Proteome (full set of proteins) more complex than those of invertebrates: pre-existing components arranged into a richer collection of domain architectures
– Hundreds of genes seem to come from horizontal transfer from bacteria (questionable)
– Dozens of genes seem to come from transposable elements

Noncoding RNA genes
– Transfer RNAs (tRNAs): adaptors that translate the triplet code of RNA into the amino acid sequence of proteins
– Ribosomal RNAs (rRNAs): components of the ribosome
– Small nucleolar RNAs (snoRNAs): RNA processing and base modification in the nucleolus
– Small nuclear RNAs (snRNAs): components of spliceosomes

Human races have similar genes Genome sequence centers have sequenced significant portions of at least three races Range of polymorphisms within a race can be much greater than the range of differences between any two individuals of different race Very few genes are race specific

Genome Sizes (MegaBases)

Fig 35a Size distributions of exons in human, worm and fly. Humans have shorter exons.

Fig 35c Size distributions of introns in human, worm and fly. Humans have longer introns.

Complexity of the proteome increases from yeast to humans
– More genes
– Shuffling, increase, or decrease of functional modules
– Alternative RNA splicing: humans exhibit significantly more
– Chemical modification of proteins is higher in humans

Combinatorial strategies
– At DNA level: T-cell receptor genes are encoded by a multiplicity of gene segments
– At RNA level: splicing of exons in different combinations

Yeast
– 70 human genes are known to repair mutations in yeast
– Nearly all we know about cell cycle and cancer comes from studies of yeast
– Advantages: fewer genes (6000), few introns
– 31% of yeast genes give the same products as their human homologues

Drosophila
– Nearly all we know of how mutations affect gene function comes from Drosophila studies
– We share 50% of their genes
– 61% of genes mutated in 289 human diseases are found in fruit flies
– 68% of genes associated with cancers are found in fruit flies
– Knockout mutants
– Homeobox genes

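C. elegans
– 959 somatic cells in the adult hermaphrodite, 302 of them neurons
– 131 further cells are eliminated by programmed cell death (apoptosis) during development
– Apoptosis is involved in several human genetic neurological disorders: Alzheimer's, Huntington's, Parkinson's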

Mouse known as mini humans Very similar physiological systems Share 90% of their genes

Questions Remain about the Human Genome –Difficult to precisely estimate number of genes at this time Small genes are hard to identify Some genes are rarely expressed and do not have normal codon usage patterns – thus hard to detect

Impact of HG on Biomedical domain

Applications to medicine and biology Disease genes –human genomic sequence in public databases allows rapid identification of disease genes in silico Drug targets –pharmaceutical industry has depended upon a limited set of drug targets to develop new therapies –now can find new target in silico Basic biology –basic physiology, cell biology…

X-linked inheritance

Autosomal dominant inheritance

Autosomal recessive inheritance. Figure labels: marker genotypes A1A1, A1A2, A1A1, A2A2, A1A2; disease-locus genotypes Mm, MM, mm.
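The expected genotype proportions behind such a pedigree can be enumerated as a Punnett square. In the sketch below, the convention that m is the recessive disease allele and M the normal allele is an assumption (the slide does not define the symbols), and the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1: str, parent2: str):
    """Genotype probabilities among offspring of two parents, each given
    as a two-letter genotype such as "Mm" (one allele from each parent)."""
    counts = Counter("".join(sorted(a + b))
                     for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

if __name__ == "__main__":
    # Two unaffected carriers; "mm" offspring are affected (assumed notation)
    for genotype, p in sorted(cross("Mm", "Mm").items()):
        print(genotype, p)
```

For two carrier parents this gives the classical 1/4 MM, 1/2 Mm, 1/4 mm, so each child has a one in four risk of being affected.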

Point mutations: creation of a stop codon. CAG (Gln) → TAG (stop)
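The effect of a single-base change on a codon can be checked against the standard genetic code. The following sketch builds the standard codon table and classifies a substitution as silent, missense or nonsense; the function name and the three-way classification are illustrative choices rather than anything taken from the slides.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_point_mutation(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change as 'silent', 'missense' or 'nonsense'."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"  # a stop codon is created
    return "missense"      # one amino acid replaced by another

if __name__ == "__main__":
    print(classify_point_mutation("CAG", "TAG"))  # Gln -> stop
    print(classify_point_mutation("GAG", "GTG"))  # Glu -> Val
```

The first call reproduces the CAG to TAG example on this slide; the second shows a missense change for comparison.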

Positional cloning of genes. Figure labels (shown for two cloning routes): Disease, Function/Protein, Gene, Chromosomal localisation.

1 to 10 years!

Figure: EYA1 gene structure, with exons numbered in Roman numerals (panels a, b, c). Branchio-Oto-Renal Syndrome

Steps of positional cloning
– Recruitment of families: phenotype determination, DNA collection
– Cytogenetic anomaly
– Genetic mapping: chromosomal localisation, fine localisation
– Physical mapping and isolation of specific clones
– Isolation of gene(s)
– Mutation search: normal ... CCT GAG GAG ... (Pro Glu Glu), mutated ... CCT GTG GAG ... (Pro Val Glu)
– Functional study

From in vivo to in vitro to in silico

The problem of penetrance

Figure: pedigree of family EBDD-I (generations I to V) with marker haplotypes for each individual, analysed under the dominant model.

Disease with incomplete penetrance and variable expressivity. Individual 1: G1, affected. Individual 2: G1, healthy. Why? Environment?

Possible explanations. G1/1 versus G1/2: alternative splicing, nonsense-mediated mRNA decay, post-transcriptional regulation mechanisms. G2, G3: modifier genes.

Environmental factors and genetic factors. Complex/common disorders are multifactorial.

Complex Diseases: Genes & Environment
Figure: conditions placed along a spectrum between genetic component and environmental effect: Hemophilia, Cystic Fibrosis, Familial Colon or Breast Cancer, Alzheimer's, Schizophrenia, Bipolar Disorder, Asthma, Type 2 Diabetes, Cardiovascular Disease, Stroke, Skin Cancer, Lung Cancer, Motor Vehicle Accident.

The potential benefits of identifying genes/variations involved in disease
– Improve the understanding of disease etiology and mechanism
– Early disease risk assessment
– Discover new drug targets
– Disease prevention
– Population or ethnic group variability
Predictive medicine: predisposition, targeted screening, prevention, diagnosis, therapy

Pharmacogenomics: The Promise of Personalized Medicine

CREDIT: JOE SUTLIFF. SCIENCE, 2001 O GOD!

Acknowledgement: this presentation has been prepared on the basis of Internet resources.
International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).
Venter, J. C. et al. The sequence of the human genome. Science 291, 1304–1351 (2001).
International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature 431 (2004).

Thank you