Human Genome Project Seminal achievement. Scientific milestone. Scientific implications. Social implications.
HGP: Background International Human Genome Sequencing Consortium: Proposed 1985, endorsed in 1988. 20 governmental groups. “Public project.” Craig Venter & Celera Genomics: Founded 1998. Aimed to finish the sequence in 3 years. Technology: automation, computers. Had access to public project’s data. Race ends in a tie, Feb. 2001: the consortium publishes in Nature, Celera in Science.
International Human Genome Sequencing Consortium Approach was conservative and methodical. Had to wait for technology. First produced a clone-based physical map of the genome to serve as a scaffold for the later sequence data: –Break genome into chunks of DNA whose position on the chromosome is known from maps; clone the chunks into bacteria using BACs (bacterial artificial chromosomes). –Digest each BAC insert into small fragments. –Sequence the small fragments. –Stitch the fragment sequences together to assemble each BAC clone's sequence. –Assemble the genome sequence from the BAC clone sequences, using the clone-based physical map.
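The final step, placing already-assembled clone sequences onto the physical map, can be sketched as follows. This is a minimal illustration, not the consortium's actual pipeline: it assumes each clone comes with an exact, base-level start position (a real physical map gives clone order and approximate overlaps, not exact offsets), and the function name `assemble_from_map` and the toy sequences are hypothetical.

```python
def assemble_from_map(clones):
    """Merge clone sequences into one scaffold using their mapped start positions.

    clones: iterable of (map_start, sequence) pairs; map_start is a 0-based
    offset on the chromosome (a hypothetical, base-exact map for illustration).
    """
    scaffold = ""
    for start, seq in sorted(clones):
        if start > len(scaffold):
            # No clone covers this stretch: record the gap as N's.
            scaffold += "N" * (start - len(scaffold))
        # Part of this clone may already be covered by earlier clones.
        overlap_len = min(len(scaffold) - start, len(seq))
        if scaffold[start:start + overlap_len] != seq[:overlap_len]:
            raise ValueError(f"clone at {start} disagrees with the scaffold")
        # Append only the bases that extend the scaffold.
        scaffold += seq[overlap_len:]
    return scaffold


# Toy clones, deliberately listed out of order; the map positions sort them.
toy_clones = [(10, "TAGCCA"), (0, "ATGCGTAC"), (5, "TACGTTAG")]
print(assemble_from_map(toy_clones))
```

Because every clone's position is known in advance, the merge is a single ordered pass; the map removes the need to search for which pieces belong next to each other.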
Celera Approach using "shotgun sequencing" (no organized map). Shreds genome randomly into small fragments with no idea of where they are physically located. Clones and sequences fragments. Uses computer to stitch together genome by matching overlapping ends of sequenced fragments.
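The idea of stitching fragments together by their overlapping ends can be illustrated with a toy greedy assembler. This is a sketch only, not Celera's assembler: it assumes error-free reads from a single strand and no repeated sequence, and the names `overlap` and `greedy_assemble` and the example fragments are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if none)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of reads with the longest end overlap."""
    reads = list(dict.fromkeys(fragments))  # drop exact duplicates
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Toy fragments with no positional information attached.
toy_fragments = ["ACGTTAGC", "ATGCGTA", "CGTACGT"]
print(greedy_assemble(toy_fragments))
```

The pairwise search is what makes the approach compute-heavy, and repeated sequence can produce false overlaps, which is one reason a map-free assembly of a repeat-rich genome is difficult.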
Timeline Genome sequencing driven by technology. –1985: 500 base pairs per day by hand. –1985-86: PCR and automated DNA sequencing. –1992: BACs. –2000: 1000 bases per second.
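To put the two throughput figures side by side, the arithmetic below converts both to bases per day and asks how long a single pass over a ~3.2 Gb genome would take at each rate. It is a back-of-the-envelope sketch: it assumes continuous operation and ignores the redundant coverage that real sequencing needs.

```python
SECONDS_PER_DAY = 24 * 60 * 60
GENOME_BP = 3.2e9  # approximate human genome size

manual_bp_per_day = 500                       # 1985, by hand
automated_bp_per_day = 1000 * SECONDS_PER_DAY  # 2000, 1000 bases per second

speedup = automated_bp_per_day / manual_bp_per_day
manual_years = GENOME_BP / manual_bp_per_day / 365
automated_days = GENOME_BP / automated_bp_per_day

print(f"speed-up: {speedup:,.0f}x")
print(f"single pass by hand: about {manual_years:,.0f} years")
print(f"single pass automated: about {automated_days:.0f} days")
```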
Waiting for Technology Eyes on the human genome. While waiting for technology, other genomes were sequenced.
Current Status Human genome ~3.2 Gb. “Rough draft” sequence of the human genome. Have sequenced 90% of the 2.5 Gb of gene-rich (euchromatic) DNA. What is considered finished? –Fewer than 1 base in 10,000 is incorrectly assigned. –More than 95% of the euchromatic regions are assigned. –Each gap is smaller than 150 kb.
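The three "finished" criteria can be read as a simple pass/fail check. The sketch below encodes them directly; the `AssemblyStats` class, its field names, and the example numbers other than the 90% draft coverage are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AssemblyStats:
    error_rate: float            # fraction of bases incorrectly assigned
    euchromatic_coverage: float  # fraction of euchromatic regions assigned
    largest_gap_bp: int          # size of the largest remaining gap, in bp


def is_finished(stats):
    """Apply the three criteria listed on the slide."""
    return (
        stats.error_rate < 1 / 10_000          # fewer than 1 error in 10,000 bases
        and stats.euchromatic_coverage > 0.95  # more than 95% of euchromatin
        and stats.largest_gap_bp < 150_000     # every gap smaller than 150 kb
    )


# The rough draft at 90% coverage fails on coverage alone,
# whatever its (assumed) error rate and gap sizes.
draft = AssemblyStats(error_rate=1 / 20_000, euchromatic_coverage=0.90,
                      largest_gap_bp=100_000)
print(is_finished(draft))
```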
Access to Information All public project data on the Internet. NCBI Website: www.ncbi.nlm.nih.gov –Human genome database. –Sequence and mapping tools.
Database Search Example The genome database has many tools to locate a gene of interest or search for potential traits of the gene. Example–chromosomal map search result for the "breast cancer–causing gene" BRCA2:
Early Statistics Only 28% of the genome is transcribed into RNA. Only 1.1%-1.4% of the genome actually encodes protein (about 4%-5% of the transcribed portion). Surprises: –More junk DNA. –Fewer genes.
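The percentages can be cross-checked with a few lines of arithmetic. The sketch below assumes a ~3.2 Gb genome and treats the percentages as exact; it shows that the 5% figure corresponds to the upper end of the 1.1%-1.4% range.

```python
GENOME_BP = 3.2e9     # approximate human genome size
TRANSCRIBED = 0.28    # fraction of the genome transcribed into RNA
CODING_LOW, CODING_HIGH = 0.011, 0.014  # fraction encoding protein

transcribed_bp = GENOME_BP * TRANSCRIBED
coding_bp_low = GENOME_BP * CODING_LOW
coding_bp_high = GENOME_BP * CODING_HIGH

# Protein-coding share of the transcribed portion.
share_low = CODING_LOW / TRANSCRIBED
share_high = CODING_HIGH / TRANSCRIBED

print(f"transcribed: {transcribed_bp / 1e6:.0f} Mb")
print(f"protein-coding: {coding_bp_low / 1e6:.1f}-{coding_bp_high / 1e6:.1f} Mb")
print(f"coding share of transcribed: {share_low:.1%}-{share_high:.1%}")
```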
Junk DNA No apparent direct biological function. Long stretches of repeated sequence. Hot area of investigation. Human genome has far more repeat DNA than any other sequenced organism (over half of the genome). Parasitic elements–about 45% of the genome derives from selfish, parasitic DNA: –Transposable elements. –May play role in evolution.
Gene Count Many fewer genes than expected (less than half): –Only 35,000-45,000 genes vs. previously predicted 100,000. –Only about twice as many as a nematode or a fruit fly. –Gene count does not scale with complexity: twice the genes does not mean twice as complex. –Alternative splicing: Vertebrate genes are more innovative in how they are assembled into transcripts. –Protein domains are mixed more creatively and in larger numbers by vertebrates. Genes remain elusive.
Genetic Variation The International Single Nucleotide Polymorphism (SNP) Map. –Compiled 1.4 million SNPs (single-base-pair differences between individuals). Investigate: –Disease resistance. –Response to therapeutics. –Evolution. –Natural selection. –Individual traits.
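What a SNP looks like in sequence data can be shown with a minimal comparison of two aligned sequences. This is an illustrative sketch, not how the SNP map was compiled: it assumes the two sequences are already aligned and of equal length, and the function name `find_snps` and the short sequences are made up.

```python
def find_snps(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned
    and have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Made-up sequences differing at a single base.
reference = "ATGCCTATTGGA"
sample = "ATGCCTGTTGGA"
print(find_snps(reference, sample))
```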
Gene Variation Example Mutations in "breast cancer gene" BRCA2. Chromosomal location and beginning sequence with one of the mapped variations.
Future Directions Fill gaps (refinement). Bioinformatics. Sequence additional genomes. –For comparison. –Upcoming: mouse, fish, dogs, kangaroo, chimpanzee (most valuable). Proteomics. Gene and Protein Chips (Microarrays).