2 Learning Objectives
By the end of this class you should understand:
The information that may result from sequencing a genome
The techniques and purpose of genetic maps
The history of sequencing the human genome and its future potential
The primary study and tools of the field of bioinformatics
The relationship between the genome, the exome, and the proteome
Concerns about ownership and genomes
3 Advancing Genetic Knowledge
Mutations have always been a major foundation for understanding genetics
Without mutations we would all be identical!
Before genetic sequencing, before even DNA, mutations were used to track relationships between genes
4 Chromosomal Linkage
A chromosome with two defects will create both-or-neither inheritance
Recall this is chromosomal linkage
All genes on the X chromosome are linked
Hemophilia and colorblindness showed linkage
Autosomal linkage took much longer to find
5 Chromosomal Linkage
Nail-Patella Syndrome and Blood Type were finally linked
We now know they are both on chromosome 9
However, linkage was not total
Remember crossing over? AKA chromosomal recombination
6 Crossing Over
Recall that during meiosis (Prophase I) chromosomes will exchange DNA
This enables us to track what genes are on what chromosome
And where!
8 Chromosomal Distance
The closer two genes are on a chromosome, the more frequently they stay together during crossing over
The farther apart they are, the more often one will recombine without the other
The percentage of recombination is measured in units of centimorgans
2/16 = 12.5% = 12.5 centimorgans distance
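The 2/16 arithmetic on this slide can be sketched as a small function. This is a minimal sketch that reads 2/16 as 2 recombinant offspring out of 16 total, which is an assumption about the missing figure; the function name `map_distance_cm` is hypothetical. Treating 1% recombination as 1 centimorgan is an approximation that holds best for genes that are fairly close together.

```python
def map_distance_cm(recombinants: int, total_offspring: int) -> float:
    """Return the recombination frequency as a percentage, read as centimorgans.

    Assumes `recombinants` counts offspring showing a recombined combination
    of the two traits, and `total_offspring` counts all offspring scored.
    """
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinants <= total_offspring:
        raise ValueError("recombinants must be between 0 and total_offspring")
    return 100.0 * recombinants / total_offspring


# The slide's example: 2 recombinants out of 16 offspring.
print(map_distance_cm(2, 16))  # 12.5
```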
9 Genetic Cloning
Thanks to PCR and improved DNA sequencing, many genes were mapped to chromosomes starting in the 1980s
Huntington's, Cystic Fibrosis, Retinoblastoma, etc.
The Human Genome Project was initiated in 1990
10 Human Genome Project
Goal: Sequence the entirety of the human genome
Additional goal: figure out which genes do what
Additional goal: sequence other animals/plants
Work got faster and faster as time went on
Computers were created that could accelerate the project
11 DNA Sequencing
The human genome is 3.2 billion DNA bases
Not the largest, not even close!
Marbled lungfish genome: billion DNA bases
That is NOT a typo!
Sequencing the entire genome took a long time!
13 Sequencing Methods
Map-based sequencing locates sequences of DNA relative to markers found on each chromosome
A comprehensive map of which genes are on which chromosome can be constructed and measured in centimorgans
Whole genome sequencing slices the entire genome into cloned libraries, then sequences them all using computer technology
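One way to picture the computer's role in whole genome sequencing is stitching short reads back together where their ends overlap. The sketch below is a toy illustration only: the three reads are made up, they are assumed to arrive already in the correct order, and the helper names `overlap_length` and `merge_reads` are hypothetical. Real assemblers must also work out the order and cope with sequencing errors and repeats.

```python
from functools import reduce


def overlap_length(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def merge_reads(left: str, right: str) -> str:
    """Join two reads, writing their shared overlap only once."""
    return left + right[overlap_length(left, right):]


# Hypothetical reads, assumed to be listed in left-to-right order.
reads = ["ATGGAATCTC", "TCTCGAGCGT", "GCGTTATGTG"]
contig = reduce(merge_reads, reads)
print(contig)  # ATGGAATCTCGAGCGTTATGTG
```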
15 Genome Sequencing
The entire genome can now be sequenced in days instead of years
Still expensive but getting cheaper!
The human genome is 98% noncoding
The coding portion of the human genome is called the exome
16 Genome vs. Exome
The exome is the coding portion of DNA only
Codes for proteins
A scan of all exons in the genome
50 times less data to analyze
Used to screen for and diagnose genetic disorders
May not catch copy number variant (CNV) disorders
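The "50 times less data" figure follows from numbers given on earlier slides: a 3.2 billion base genome that is 98% noncoding, leaving 2% coding. A minimal sketch of that arithmetic, with hypothetical constant and function names:

```python
GENOME_BASES = 3_200_000_000  # human genome size from the DNA Sequencing slide
CODING_PERCENT = 2            # 100% minus the 98% noncoding figure


def exome_size(genome_bases: int, coding_percent: int) -> int:
    """Approximate number of coding bases, using integer arithmetic."""
    return genome_bases * coding_percent // 100


exome_bases = exome_size(GENOME_BASES, CODING_PERCENT)
reduction_factor = GENOME_BASES // exome_bases

print(exome_bases)       # 64000000
print(reduction_factor)  # 50
```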
17 Now What?
Once you have a genome, what do you do with it?
The field of bioinformatics addresses what to do with this massive amount of data
Bioinformaticians must be skilled in computer science and math (linear algebra and probability)
18 Bioinformatics Specialties
Comparative Genomics: studying relationships between different animals/plants/bacteria
Structural Genomics: studying the structure of proteins produced by genes
Pharmacogenomics: how to make drugs to treat diseases
19 Bioinformatics Basics
The most important skill in bioinformatics: locating a gene!
Find the gene in the following sequence:
ACAGGAGAAATATACCAATACCGCTTGCGAGAGATCATGGAATCTCGAGCGTTATGTGAATGCTGAAAAAAAAAAA
A bioinformatician can find it!
Technically the entire sequence is a gene, but the bioinformatician can find the start codon and the upstream promoter region
Technically the bioinformatician can write a computer program to do that for them
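A program of the kind this slide mentions might look like the sketch below. It scans the slide's sequence for the first ATG and then reads three bases at a time until it reaches a stop codon (TAA, TAG, or TGA). The function name `find_first_orf` is hypothetical, and the exact number of trailing A bases is an assumption because the slide text runs straight into the next sentence. Real gene finders also check both strands and all reading frames.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Sequence from the slide; the length of the trailing run of A's is assumed.
SEQUENCE = (
    "ACAGGAGAAATATACCAATACCGCTTGCGAGAGATCATGGAATCTCGAGCGTTATGTGAATGCTG"
    "AAAAAAAAAAA"
)


def find_first_orf(dna: str):
    """Return (start, end) of the ORF beginning at the first ATG, or None.

    `end` is exclusive and includes the stop codon, so dna[start:end]
    is the full open reading frame. Positions are 0-based.
    """
    start = dna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None


orf = find_first_orf(SEQUENCE)
print(orf)                           # (36, 66)
print(SEQUENCE[orf[0]:orf[1]])       # ATGGAATCTCGAGCGTTATGTGAATGCTGA
```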
20 Gene Location
By homing in on clues like the TATA box and the CCAAT sequence in the promoter, bioinformaticians can locate genes
This is known as annotation
Predicted genes must be compared to protein sequences to identify their introns
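The clue-hunting described here can be sketched as a plain text search for the motifs. The example below uses the sequence from the Bioinformatics Basics slide (with an assumed number of trailing A bases) and a hypothetical helper, `find_motif`. Exact string matching is a simplification: real promoter elements vary in sequence, and annotation tools score approximate matches instead.

```python
# Sequence from the Bioinformatics Basics slide; trailing A count is assumed.
SEQUENCE = (
    "ACAGGAGAAATATACCAATACCGCTTGCGAGAGATCATGGAATCTCGAGCGTTATGTGAATGCTG"
    "AAAAAAAAAAA"
)


def find_motif(dna: str, motif: str) -> list:
    """Return every 0-based position where `motif` occurs in `dna`."""
    positions = []
    i = dna.find(motif)
    while i != -1:
        positions.append(i)
        i = dna.find(motif, i + 1)
    return positions


tata_sites = find_motif(SEQUENCE, "TATA")
ccaat_sites = find_motif(SEQUENCE, "CCAAT")
first_atg = SEQUENCE.find("ATG")

print(tata_sites)   # [10]
print(ccaat_sites)  # [14]
print(first_atg)    # 36

# Both promoter clues sit upstream of (before) the start codon.
print(all(pos < first_atg for pos in tata_sites + ccaat_sites))  # True
```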
22 Introns and Exons
The codons (beginning with ATG) form an open reading frame (ORF)
This is the origin of the term frameshift
Insertion/deletion mutations may shift the reading frame (a frameshift) unless they occur in an intron
Intron variability is much higher than exon variability!
Computers can detect all these predictable patterns
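To see why an insertion shifts the frame, it helps to split a reading frame into codons before and after the mutation. This sketch uses the stretch from the first ATG to the first in-frame stop codon of the sequence on the Bioinformatics Basics slide; the inserted bases and the helper name `codons` are hypothetical. Inserting one base changes every codon downstream, while inserting three bases adds a codon but leaves the rest of the frame intact.

```python
def codons(dna: str) -> list:
    """Split `dna` into consecutive 3-base codons, dropping any leftover bases."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


orf = "ATGGAATCTCGAGCGTTATGTGAATGCTGA"

# Hypothetical mutations for illustration.
one_base_insertion = orf[:4] + "C" + orf[4:]      # shifts the frame
three_base_insertion = orf[:3] + "GCC" + orf[3:]  # keeps the frame

print(codons(orf))
# ['ATG', 'GAA', 'TCT', 'CGA', 'GCG', 'TTA', 'TGT', 'GAA', 'TGC', 'TGA']
print(codons(one_base_insertion))
# ['ATG', 'GCA', 'ATC', 'TCG', 'AGC', 'GTT', 'ATG', 'TGA', 'ATG', 'CTG']
print(codons(three_base_insertion)[:3])
# ['ATG', 'GCC', 'GAA']
```

In the one-base example the shifted frame also runs into a TGA stop codon earlier than the original, which is one reason frameshifts are often so damaging.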
24 Full Genome
Remember that standard genes comprise only 2% of the genome, at most
Long stretches of DNA are repeated sequences that still serve important functions!
Single-base alterations at one spot in these stretches are called single nucleotide polymorphisms (SNPs)
25 “Junk” DNA Alterations
Single base changes (SNPs) can be compiled into a haplotype
Similar to a bar code of the chromosome
Reveals where the chromosome was inherited from and who else has a closely related chromosome
Changes to the number of repeated bases are called copy number variants (CNVs)
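The bar code comparison can be sketched by writing each haplotype as a string with one letter per SNP position and counting the positions where two strings differ. All haplotypes and names below are made up for illustration, and the helpers `mismatches` and `closest_match` are hypothetical; real relatedness analysis uses far more markers and statistical models.

```python
def mismatches(hap_a: str, hap_b: str) -> int:
    """Count SNP positions at which two equal-length haplotypes differ."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same SNP positions")
    return sum(a != b for a, b in zip(hap_a, hap_b))


def closest_match(query: str, panel: dict) -> str:
    """Return the name of the panel haplotype with the fewest mismatches."""
    return min(panel, key=lambda name: mismatches(query, panel[name]))


# Hypothetical haplotypes: one base per SNP position on the same chromosome.
panel = {
    "relative_A": "AGCTTAGC",
    "relative_B": "AGCCCAGT",
    "relative_C": "TGATCGGT",
}
query = "AGCTTAGT"

for name, hap in panel.items():
    print(name, mismatches(query, hap))
print(closest_match(query, panel))  # relative_A
```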
26 Copy Number Variants
CNVs alter the length of a chromosome
Since DNA can affect expression of genes thousands of bases away, CNVs are linked to various disorders
This is probably due to variable or reduced expression of the associated gene/protein
27 Genome vs. Proteome
Even though there are only 25,000 genes, there are over 200,000 proteins
Possibly up to a million!!
This is due to differential (alternative) intron/exon splicing
All the proteins in the body together make up the proteome
Studying proteins is called proteomics
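A toy model shows how splicing lets one gene produce many proteins. Assume a gene whose first and last exons are always kept and whose middle exons can each be either kept or skipped; with n optional exons there are 2^n possible combinations. The exon labels and the function name `possible_isoforms` are hypothetical, and this is an upper bound for the toy model only, since real splicing is regulated and not every combination occurs.

```python
from itertools import combinations


def possible_isoforms(first: str, optional: list, last: str) -> list:
    """List every transcript made by keeping or skipping each optional exon.

    Exon order is preserved; `first` and `last` are always included.
    """
    isoforms = []
    for keep_count in range(len(optional) + 1):
        for kept in combinations(optional, keep_count):
            isoforms.append([first, *kept, last])
    return isoforms


# Hypothetical gene with five exons, three of them optional.
isoforms = possible_isoforms("E1", ["E2", "E3", "E4"], "E5")
print(len(isoforms))  # 8
for isoform in isoforms:
    print("-".join(isoform))
```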
28 Concerns of Genomics
Major concern: who owns your genome?
Remember, genes now CANNOT be patented
Research still belongs to the company even if it's done with your tissue
Major concern: should we be testing genomes of healthy people?
Especially while information is incomplete!