Lecture 7: Gen(om)e duplications 9/23/09. Homework 1. Clustal and trees 2. Ensembl links 3. OMIM.


Lecture 7: Gen(om)e duplications 9/23/09

Homework 1. Clustal and trees 2. Ensembl links 3. OMIM

HW #1: GNAT1

Fasta file
>Human_GNAT1
MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLEECLEFIAIIY
GNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMSDIIQRLWKDSGIQACFERAS
EYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGIIETQFSFKDLNFRMFDVGGQRSERKKWIHC
FEGVTCIIFIAALSAYDMVLVEDDEVNRMHESLHLFNSICNHRYFATTSIVLFLNKKDVFFEKIKKAHLS
ICFPDYDGPNTYEDAGNYIKVQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF
>Chimp_GNAT1
MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLEECLEFIAIIY
GNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMSDIIQRLWKDSGIQACFERAS
EYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGIIETQFSFKDLNFRMFDVGGQRSERKKWIHC
FEGVTCIIFIAALSAYDMVLVEDDEVNRMHESLHLFNSICNHRYFATTSIVLFLNKKDVFFEKIKKAHLS
ICFPDYDGPNTYEDAGNYIKVQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF
>Dog_GNAT1
MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLEECLEFIAIIY
GNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMSDIIQRLWKDSGIQACFERAS
EYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGIIETQFSFKDLNFRMFDVGGQRSERKKWIHC
FEGVTCIIFIAALSAYDMVLVEDDEVNRMHESLHLFNSICNHRYFATTSIVLFLNKKDVFSEKIKKAHLS
ICFPDYDGPNTYEDAGNYIKVQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF
Note: Programs will use whatever is in the identifier up to the 1st space as labels. If you don’t like GenBank #s, you can change this to species names.
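
The labelling rule in the note (everything in the identifier up to the first space) can be sketched in a few lines of Python. This is only an illustration of that rule, not the parser that Clustal or Phylip actually use; the function name read_fasta, the shortened sequences and the description text after the first space are all made up for the example.

```python
def read_fasta(text):
    """Parse FASTA text into {label: sequence}.

    The label is the header line (without '>') up to the first whitespace,
    mirroring the rule stated on the slide.
    """
    records = {}
    label = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split(None, 1)  # split once on whitespace
            label = parts[0] if parts else ""
            records[label] = []
        elif label is not None:
            records[label].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {name: "".join(chunks) for name, chunks in records.items()}


# Hypothetical, shortened input: only the start of each sequence is shown.
example = """>Human_GNAT1 made-up description after the first space
MGAGASAEEK
HSRELEKKLK
>Dog_GNAT1
MGAGASAEEK
"""

records = read_fasta(example)
print(list(records))
```

Renaming a sequence therefore only requires editing the text between ">" and the first space.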

CLUSTAL multiple sequence alignment
Human_GNAT1     MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Chimp_GNAT1     MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Dog_GNAT1       MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Cow_GNAT1       MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Rat_GNAT1       MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Mouse_GNAT1     MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Zfish_GNAT1     MGAGASAEEKHSRELEKKLKEDADKDARTVKLLLLGAGESGKSTIVKQMKIIHKDGYSLE 60
                ***********************:*****************************:******
Human_GNAT1     ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Chimp_GNAT1     ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Dog_GNAT1       ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Cow_GNAT1       ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Rat_GNAT1       ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Mouse_GNAT1     ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Zfish_GNAT1     ECLEFIVIIYSNTMQSILAVVRAMTTLNIGYGDAAAQDDARKLMHLADTIEEGTMPKELS 120
                ******.***.**:*****:********* ***:* *********:************:*
Human_GNAT1     DIIQRLWKDSGIQACFERASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Chimp_GNAT1     DIIQRLWKDSGIQACFERASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Dog_GNAT1       DIIQRLWKDSGIQACFERASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Cow_GNAT1       DIIQRLWKDSGIQACFDRASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Rat_GNAT1       DIIQRLWKDSGIQACFDRASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Mouse_GNAT1     DIIQRLWKDSGIQACFDRASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Zfish_GNAT1     DIILRLWKDSGIQACFDRASEYQLNDSAGYYLNDLERLIQPGYVPTEQDVLRSRVKTTGI 180
                *** ************:***************.*****: ********************
Human_GNAT1     IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Chimp_GNAT1     IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Dog_GNAT1       IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Cow_GNAT1       IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Rat_GNAT1       IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Mouse_GNAT1     IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Zfish_GNAT1     IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
                ************************************************************
Human_GNAT1     ESLHLFNSICNHRYFATTSIVLFLNKKDVFFEKIKKAHLSICFPDYDGPNTYEDAGNYIK 300
Chimp_GNAT1     ESLHLFNSICNHRYFATTSIVLFLNKKDVFFEKIKKAHLSICFPDYDGPNTYEDAGNYIK 300
Dog_GNAT1       ESLHLFNSICNHRYFATTSIVLFLNKKDVFSEKIKKAHLSICFPDYDGPNTYEDAGNYIK 300
Cow_GNAT1       ESLHLFNSICNHRYFATTSIVLFLNKKDVFSEKIKKAHLSICFPDYNGPNTYEDAGNYIK 300
Rat_GNAT1       ESLHLFNSICNHRYFATTSIVLFLNKKDVFSEKIKKAHLSICFPDYDGPNTYDDAGNYIK 300
Mouse_GNAT1     ESLHLFNSICNHRYFATTSIVLFLNKKDVFSEKIKKAHLSICFPDYDGPNTYEDAGNYIK 300
Zfish_GNAT1     ESLHLFNSICNHRYFATTSIVLFLNKKDVFVEKIKKAHLSMCFPEYDGPNTFEDAGNYIK 300
                ****************************** *********:***:*:****::*******
Human_GNAT1     VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Chimp_GNAT1     VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Dog_GNAT1       VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Cow_GNAT1       VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Rat_GNAT1       VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Mouse_GNAT1     VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Zfish_GNAT1     VQFLDLNLRRDIKEIYSHMTCATDTENVKFVFDAVTDIIIKENLKDCGLF 350
                ****:**:***:*************:************************
350 sites. Fixed (*): 324/350 = 92.6%

HW #1: GNGT1
Fixed = 49/74 = 66%
Human_GNGT1     MPVINIEDLTEKDKLKMEVDQLKKEVTLERMLVSKCCEEVRDYVEERSGEDPLVKGIPED 60
Chimp_GNGT1     MPVINIEDLTEKDKLKMEVDQLKKEVTLERMLVSKCCEEVRDYVEERSGEDPLVKGIPED 60
Dog_GNGT1       MPVINIEDLTEKDKLKMEVDQLKKEVTLERMLVSKCCEEVRDYVEERSGEDPLVKGIPED 60
Cow_GNGT1       MPVINIEDLTEKDKLKMEVDQLKKEVTLERMLVSKCCEEFRDYVEERSGEDPLVKGIPED 60
Mouse_GNGT1     MPVINIEDLTEKDKLKMEVDQLKKEVTLERMMVSKCCEEVRDYIEERSGEDPLVKGIPED 60
Rat_GNGT1       MPVINIEDLTEKDKLKMEVDQLKKEVTLERVMVSKCCEEVRDYIEERSREDPLVKGIPED 60
Zfish_GNGT1     MPIIDVENMTDLDKAKMEVTQLKTEVKLERAKVSKCCEEITEYIQGGADEDPLVKGIPEE 60
                **:*::*::*: ** **** ***.**.***  *******. :*::  : **********:
Human_GNGT1     KNPFKELKGGCVIS 74
Chimp_GNGT1     KNPFKELKGGCVIS 74
Dog_GNGT1       KNPFKELKGGCVIS 74
Cow_GNGT1       KNPFKELKGGCVIS 74
Mouse_GNGT1     KNPFKELKGGCVIS 74
Rat_GNGT1       KNPFKELKGGCVIS 74
Zfish_GNGT1     KNPFKE-KGGCVIC 73
                ****** ******.

Protein interactions Rhodopsin GNAT1 GNB1 GNGT1

Relative constraint, % of fixed sites
Protein   Fixed / total sites   % fixed
GNAT1     324 / 350             92.6%
GNB1      306 / 340             88%
GNGT1     49 / 74               66%
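
The "% of fixed sites" figures can be reproduced by counting the "*" characters on the Clustal consensus lines and dividing by the alignment length. A minimal sketch for GNGT1, assuming the consensus lines and the human sequence rows are copied from the alignment slide; the function name percent_fixed is made up. The length is taken from a sequence row rather than from the consensus line, because spaces in the consensus line are easily lost when copying.

```python
def percent_fixed(sequence_blocks, consensus_blocks):
    """Return (fixed sites, total sites, percent fixed) for a Clustal alignment.

    sequence_blocks:  the alignment rows of ONE sequence, block by block
                      (used only to get the alignment length).
    consensus_blocks: the consensus lines under each block; '*' marks a
                      column that is identical in every sequence.
    """
    n_sites = sum(len(block) for block in sequence_blocks)
    n_fixed = sum(block.count("*") for block in consensus_blocks)
    return n_fixed, n_sites, 100.0 * n_fixed / n_sites


# Human GNGT1 rows and consensus lines as they appear on the slide.
human_gngt1 = [
    "MPVINIEDLTEKDKLKMEVDQLKKEVTLERMLVSKCCEEVRDYVEERSGEDPLVKGIPED",
    "KNPFKELKGGCVIS",
]
consensus = [
    "**:*::*::*: ** **** ***.**.*** *******. :*:: : **********:",
    "****** ******.",
]

n_fixed, n_sites, pct = percent_fixed(human_gngt1, consensus)
print(n_fixed, n_sites, round(pct, 1))  # 49 74 66.2
```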

Trees

Ensembl search finds lots of groups
- InterPro domain: identifies and groups proteins by protein signatures
- Ensembl families: proteins grouped by phylogenetic relationship
- Vega / Havana: the human, hand-curated part of the Ensembl database. They confirm each predicted gene in different genomes
Find proteins, pseudogenes, processed pseudogenes

We want the Ensembl protein_coding gene. Check that it is rhodopsin and not some rhodopsin-related gene.

Transcript and protein info are useful

Protein - use links at left to look at the sequence

Protein sequence

Exon view shows the sequences of exons as well as those of UTRs and introns (labelled on the slide: Start, 5’UTR, Intron)

cDNA sequence includes known SNPs: variation in the human population

Can export sequence

Ensembl
- There is a dizzying array of data and info on this web site.
- We will try to use it as a “helpful” tool to gather more sequences
- Often we just want to get all the homologs from all the species where Ensembl has made that link

At the bottom of the sequence list is a link to the sequence display

Go back to the gene page and scroll down to find orthologs. This shows pairwise comparisons in ClustalW format.
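
The same ortholog list can in principle be requested programmatically rather than by clicking through the gene page. The sketch below assumes the Ensembl REST service at rest.ensembl.org and its homology/symbol endpoint with a type parameter; neither is part of the lecture, so check both against the current Ensembl REST documentation before relying on them. The function names are made up, and the fetch function is defined but not called here because it needs network access.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public Ensembl REST server (not mentioned in the lecture).
REST_SERVER = "https://rest.ensembl.org"


def homology_url(species, symbol, homology_type="orthologues"):
    """Build the URL for a homology lookup by gene symbol (assumed endpoint)."""
    path = "/homology/symbol/{}/{}".format(
        urllib.parse.quote(species), urllib.parse.quote(symbol)
    )
    query = urllib.parse.urlencode(
        {"type": homology_type, "content-type": "application/json"}
    )
    return REST_SERVER + path + "?" + query


def fetch_homologs(species, symbol):
    """Download and decode the JSON reply; requires network access."""
    with urllib.request.urlopen(homology_url(species, symbol)) as response:
        return json.load(response)


# RHO is the gene symbol for rhodopsin.
url = homology_url("human", "RHO")
print(url)
```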

OMIM

Q4. Making trees
- ClustalW is a bit limited: sequences are compared using distances, and trees are drawn by neighbor joining
- Nice to have more options: maximum likelihood, distance, parsimony
- Phylip: set of modules that you can mix and match to make trees
Web servers: Phylemon, Pasteur Institute

Methods
- Parsimony: alignment → input characters to parsimony tree program
- Distance: alignment → calculate distances → input distances to tree program
- Maximum likelihood: alignment → input characters to ML program

Steps to make a distance tree
Step                            Program
Align sequences                 Clustalw-multialign
Calculate distances             DNAdist, Protdist
Use distances to make a tree    Neighbor
Display tree                    External program

Steps to make a distance tree
- Align sequences: can do in ClustalW at the EBI web site or at the Pasteur web site

Pasteur Institute - Phylogenetics

Clustalw2 at Pasteur: under Alignment, then under Multiple. Either paste in sequences or select a FASTA file and upload.

Leave defaults and hit Run

Save files to keep results. Clustal does make a dendrogram, which you can save.

Save files to keep results. You can pass the results of this to the next program here.

Calculate distances
- If DNA, use DNAdist
- If protein (AA), use Protdist
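
To get a feel for what a distance matrix contains, here is a deliberately simplified sketch that uses the p-distance (the fraction of compared sites that differ). This is not what Protdist computes: Protdist applies amino acid substitution models, so its numbers will differ. The function names are made up, and the three sequences are just the last alignment block of GNGT1.

```python
def p_distance(seq_a, seq_b):
    """Fraction of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else 0.0


def distance_matrix(alignment):
    """Return {(name_x, name_y): p-distance} for every pair of sequences."""
    names = list(alignment)
    return {(x, y): p_distance(alignment[x], alignment[y])
            for x in names for y in names}


# Last block of the GNGT1 alignment (14 columns, one gap in zebrafish).
alignment = {
    "Human_GNGT1": "KNPFKELKGGCVIS",
    "Cow_GNGT1": "KNPFKELKGGCVIS",
    "Zfish_GNGT1": "KNPFKE-KGGCVIC",
}

matrix = distance_matrix(alignment)
print(matrix[("Human_GNGT1", "Zfish_GNGT1")])  # 1 difference in 13 compared sites
```

A matrix like this (with model-corrected distances) is what gets passed on to the neighbor-joining step.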

Pass alignment to protdist

Use Protdist under Distance. Upload or paste data and hit Run.

Save the distance matrix, then send it to the neighbor-joining program to make a tree

Tell it which taxon number is the outgroup (here, 7): this will root your tree!

CLUSTAL multiple sequence alignment of GNAT1 (first blocks of the alignment shown earlier). Sequences are listed in this order: Human_GNAT1, Chimp_GNAT1, Dog_GNAT1, Cow_GNAT1, Rat_GNAT1, Mouse_GNAT1, Zfish_GNAT1.
Note: Zebrafish is taxon #7

Save tree

What does this tree mean?
- Tree shows relationships and branch lengths: (((Cow_GNAT1: ,Rat_GNAT1: ): ,Mouse_GNAT: ): ,((Human_GNAT: ,Chimp_GNAT: ): ,Dog_GNAT1: ): ,Zfish_GNAT: );
- Just relationships: (((Cow,Rat),Mouse),((Human,Chimp),Dog),Zfish)

You can download FigTree for drawing trees (Mac, PC)
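
Going from the first form of the tree to the second just means deleting every ":length" from the parenthesised (Newick) string. A minimal sketch, assuming plain numeric branch lengths; the function name is made up, and the branch lengths in the example are invented because the values on the slide did not survive.

```python
import re

# A colon, optional spaces, then an optional number such as 0.05 or 1e-3.
BRANCH_LENGTH = re.compile(r":\s*[-+]?[0-9]*\.?[0-9]*(?:[eE][-+]?[0-9]+)?")


def strip_branch_lengths(newick):
    """Remove branch lengths from a Newick string, keeping only the topology."""
    return BRANCH_LENGTH.sub("", newick)


# Hypothetical branch lengths, for illustration only.
with_lengths = "((Human:0.01,Chimp:0.02):0.05,Zfish:0.30);"
topology = strip_branch_lengths(with_lengths)
print(topology)  # ((Human,Chimp),Zfish);
```

Each pair of parentheses groups the taxa that share a node, so the topology can be read directly from the nesting.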

Tree - does this make sense?

What is the difference between homologs, orthologs and paralogs?

Orthologs: have a common ancestor, derived by descent (the copies were split by speciation)
Paralogs: gene duplicates within the same organism (the copies were split by gene duplication)
Homologs = orthologs + paralogs
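
One common way to make the distinction operational is to look at the event at the node where two genes last shared an ancestor: speciation gives orthologs, duplication gives paralogs. The sketch below assumes that convention and a toy gene tree whose node labels are invented for illustration (the gene names borrow the opsin class names RH1 and LWS). Note that under this convention genes in different species can also be paralogs, which is broader than the within-organism wording above.

```python
def leaves(node):
    """Return the set of gene names below a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return {node}
    _, left, right = node
    return leaves(left) | leaves(right)


def relationship(node, gene_a, gene_b):
    """Classify two genes by the event at their last common ancestor."""
    if isinstance(node, str):
        raise ValueError("need two different genes that are both in the tree")
    event, left, right = node
    # Descend while both genes are still on the same side of the node.
    for child in (left, right):
        below = leaves(child)
        if gene_a in below and gene_b in below:
            return relationship(child, gene_a, gene_b)
    everything = leaves(left) | leaves(right)
    if gene_a not in everything or gene_b not in everything:
        raise ValueError("gene not found in tree")
    return "orthologs" if event == "speciation" else "paralogs"


# Toy tree: internal nodes are (event, left, right); labels are hypothetical.
tree = (
    "duplication",
    ("speciation", "human_RH1", "zfish_RH1"),
    ("speciation", "human_LWS", "zfish_LWS"),
)

print(relationship(tree, "human_RH1", "zfish_RH1"))  # orthologs
print(relationship(tree, "human_RH1", "human_LWS"))  # paralogs
```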

LWS RH2 SWS2 SWS1 RH1 Lamprey LWS Lamprey RHB Lamprey RHA Lamprey S2 Lamprey S1

How do gen(om)es evolve?
- What can change?
  - DNA mutation
  - DNA deletions / insertions (indels)
  - Recombination
  - Selection: change in gene frequency
  - Gene transfer
  - Duplications

Human Chicken Frog Zebrafish Dog Human Chicken Frog Zebrafish Dog Lamprey Gene duplication

Ohno, Evolution by Gene Duplication, 1970
- Gene duplication is the primary way that you get new genes to work with
- Genome duplications
  - Double # of chromosomes
  - Keep balance in biochemical machinery
  - Duplicate regulatory structure
- New genes can evolve to do new jobs!

Gene vs genome duplications
- How do you know what has duplicated?

Mechanisms for duplication
1. Tandem duplication
2. Insertion of retrotransposed gene
3. Genome / chromosome duplication

1. Mismatched recombination
- Leads to extra genes inserted right next to the original gene
- Unequal crossover

Normal DNA recombination
Switches genes from one chromosome to the other
Leads to new gene combinations

Mismatched recombination
If chromosomes misalign, recombination leads to gain of a gene on one chromosome and loss of a gene on the other.
Result: tandem arrays of genes
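
The gain and loss can be seen by treating each chromosome as an ordered list of genes and exchanging the ends at a breakpoint. A minimal sketch under that simplification; the function name is made up, and the two-gene red/green array is just a convenient example. When the breakpoints line up, both products keep the parental gene count; when they are offset by one gene, one product gains a copy and the other loses one.

```python
def crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange chromosome ends at the given breakpoints (gene indices).

    Returns the two recombinant chromosomes. Equal breakpoints model normal
    recombination; unequal breakpoints model a misaligned (unequal) crossover.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


parent = ["red", "green"]

# Aligned chromosomes: both products still carry one red and one green gene.
normal = crossover(parent, parent, 1, 1)

# Misaligned by one gene: one product gains a green gene, the other loses it.
unequal = crossover(parent, parent, 2, 1)

print(normal)   # (['red', 'green'], ['red', 'green'])
print(unequal)  # (['red', 'green', 'green'], ['red'])
```

Repeating the unequal step on the longer product gives longer and longer tandem arrays.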

Opsin gene tandem arrays on the X chromosome
Only the first 2 genes are expressed, so it doesn’t matter if there are more green genes. They are just along for the ride.

Misaligned recombination
If recombination happens within a gene, you get a chimera
Intermediate phenotype: changes pigment light sensitivity
Opsin genes on X chromosome

Human red and green opsins
Figure labels: 530 nm, 560 nm, 554 nm
Tuning sites: A164S = +2 nm; F261Y = +10 nm; A269T = +14 nm

Normal human visual pigments
Normal λmax = 420, 535, 565 nm

Deuteranomaly: green pigment shifted towards red
λmax = 420, 550, 565 nm
5% male, 0.04% female
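
If the shifts at the three tuning sites are assumed to add independently, the peak of a hybrid pigment can be estimated by simple addition. This additivity is an assumption made for the sketch, as is starting from 530 nm as the baseline; the function name is made up. Under that assumption, a 530 nm pigment carrying F261Y and A269T comes out at 554 nm, which is one of the values on the slide, and all three changes together give 556 nm, close to the 560 nm value.

```python
# Shifts in nm per substitution, as listed on the slide.
SHIFTS_NM = {"A164S": 2, "F261Y": 10, "A269T": 14}

# Assumption: start from the 530 nm value shown on the slide.
BASE_LAMBDA_MAX_NM = 530


def predicted_lambda_max(substitutions, base_nm=BASE_LAMBDA_MAX_NM):
    """Estimate peak absorbance assuming the site effects simply add up."""
    return base_nm + sum(SHIFTS_NM[name] for name in substitutions)


two_sites = predicted_lambda_max(["F261Y", "A269T"])
all_sites = predicted_lambda_max(["A164S", "F261Y", "A269T"])
print(two_sites, all_sites)  # 554 556
```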

2. Insertion of retrotransposed gene
- Gene can be transcribed to mRNA
- mRNA then gets reverse transcribed and inserted into DNA
Clue that a gene is retrotransposed?
- No introns: all coding sequence

Comparison of rhodopsin genes: vertebrate rhodopsin gene vs. fish rhodopsin gene

Possibilities
- Lost introns and stayed in place
- mRNA sequence reinserted somewhere else in the genome

Fugu - human comparison Rh1 Human chr 3 Fugu scaffold 830 Human chr Z Fugu Rh gene has been inserted into chromosome