Comparative Genome Analysis
Comparative yeast genomics Kellis et al. (2003) Nature 423,
Genic tree Inter-Genic tree
Reading Frame Conservation
Species 1  ATG GAC GAT AGT CTT CGA GCA AAA TAG
Species 2  TTG GAC GGA TAG TCT TGA GCA AAA AAG
Species 3  CTG GA- GAT AGT CTT CGA GCA AAA TAG
Species    GAT AGT -TT CGA GCA AAA TTA
Figure labels: ORF, no ORF
6000 ORFs -> 5500 ORFs
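The idea behind the reading frame conservation slide can be illustrated with a small sketch. It is a simplified stand-in, not the procedure used by Kellis et al.: it takes two already-aligned sequences and reports the fraction of gap-free columns that share the most common frame offset. The function name `reading_frame_conservation` is hypothetical; the two input sequences are Species 1 and Species 3 from the slide.

```python
from collections import Counter


def reading_frame_conservation(ref_aln: str, other_aln: str) -> float:
    """Fraction of gap-free aligned columns that share the dominant frame offset.

    Simplified illustration only; both inputs must be rows of the same alignment.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")

    ref_pos = other_pos = 0          # ungapped coordinates in each sequence
    offsets = Counter()              # frame offset (0, 1 or 2) -> column count
    for a, b in zip(ref_aln, other_aln):
        if a != "-" and b != "-":
            offsets[(ref_pos - other_pos) % 3] += 1
        if a != "-":
            ref_pos += 1
        if b != "-":
            other_pos += 1

    total = sum(offsets.values())
    if total == 0:
        return 0.0
    return max(offsets.values()) / total


# Species 1 and Species 3 from the slide, with codon spacing removed.
species_1 = "ATGGACGATAGTCTTCGAGCAAAATAG"
species_3 = "CTGGA-GATAGTCTTCGAGCAAAATAG"
print(reading_frame_conservation(species_1, species_3))
```

In this pair, the single-base gap in Species 3 shifts every downstream column by one position, so the columns before and after the gap fall into different frame offsets. A gap whose length is a multiple of three would leave the offset unchanged.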
Phylogenetic footprints
72 motifs found: 40 known, 32 new
Ancient genome duplication Kellis, Birren & Lander (2004) Nature 428,
Duplicated protein evolution
– Average evolution (n = 457)
– Accelerated evolution (1 out of 72)
– Gene conversion
– Dating the WGD
Transcriptional regulatory code Harbison et al. (2004) Nature 431,
– On-off
– Graded response
– Combinatorial regulation
– Physical interaction between regulators
– Requires other molecule (leucine precursor)
– Requires regulation by movement
– Concentration-dependent affinity changes
– Interaction with other regulator changes affinity
Mapping: principles
Variation
Phenotypic traits
Molecular markers
– Microsatellites
– Indels
– SNPs
Markers are polymorphic and each identifies a single genomic region
Linkage
Marker development: A Comparative Approach
Legume genomics
Aim: development of molecular markers for legume genetics
Approach: CATS (comparative anchor tagged sequences)
CATS – comparative anchor tagged sequences
– Alignment of ESTs from multiple legume species
– Alignment of EST to genomic region (figure label: Intron)
– Identification of evolutionarily conserved regions
– Design of primers for PCR amplification of the intron in mapping parents for polymorphism ascertainment
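The "identification of conserved regions" step can be sketched as a sliding-window scan over a multiple alignment. This is a minimal illustration under stated assumptions, not the pipeline used for the CATS markers: the function name `conserved_windows`, the window length and the toy alignment are all hypothetical, and real primer design would also weigh melting temperature, primer length and similar criteria.

```python
def conserved_windows(aligned, window=20, min_identity=1.0):
    """Start columns of windows where enough columns are identical and gap-free.

    `aligned` is a list of equal-length rows from one multiple alignment.
    A column counts as conserved when every row carries the same base.
    """
    if not aligned:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned[0])
    if any(len(row) != length for row in aligned):
        raise ValueError("aligned sequences must have equal length")

    conserved = [
        len(set(column)) == 1 and column[0] != "-"
        for column in zip(*aligned)
    ]

    starts = []
    for start in range(length - window + 1):
        fraction = sum(conserved[start:start + window]) / window
        if fraction >= min_identity:
            starts.append(start)
    return starts


# Toy alignment (hypothetical data, not taken from the legume ESTs).
toy_alignment = [
    "ACGTACGTAA",
    "ACGTACGTTA",
    "ACGTACGTAA",
]
print(conserved_windows(toy_alignment, window=4))
```

Windows found on either side of an intron would be the candidate primer sites; the intron between them is the region expected to vary between mapping parents.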
Figure: Legume Taxonomy, Genome (labels: Doyle & Luckow; species)
Copyright ©2004 by the National Academy of Sciences. Choi, Hong-Kyu et al. (2004) Proc. Natl. Acad. Sci. USA 101,
Looking for conserved regions
Figure: exon / intron / exon / intron structure, with tracks for Lotus genome, Glycine EST, Medicago EST, Phaseolus?
Sequence fragments shown:
AGC..ATCGATCAGGACAGT..TGTAC..CCCAC..AT
GGAGGAGGACAATAAGAGAC
CTAAACTCTCTCTAG
TAC..CCCAC..AT
AGC..AT
GGG..AATAC..CCCAC..AT
TAC..CC
CAC..AT
Alignment: Lotus, Glycine, Medicago introns Good marker region
Reasons for conservation
1. Purifying selection on proteins conserves positions 1 and 2 of a codon
2. Selection on RNA G/C content introduces codon bias that conserves the third position
   – High G/C: constitutively expressed genes
   – Low G/C: regulated genes (defense etc.)
3. Purifying selection on RNA secondary structure
   – Regulatory RNA molecules
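Two of the quantities behind these points are easy to compute from aligned coding sequences: identity split by codon position, and G/C content at the third position. The sketch below assumes gap-free, in-frame sequences of equal length; the function names and the two toy sequences are hypothetical and serve only as illustration.

```python
def identity_by_codon_position(cds_a: str, cds_b: str):
    """Return the fraction of identical bases at codon positions 1, 2 and 3."""
    cds_a, cds_b = cds_a.upper(), cds_b.upper()
    if len(cds_a) != len(cds_b):
        raise ValueError("sequences must have equal length")
    if len(cds_a) == 0 or len(cds_a) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")

    matches = [0, 0, 0]
    for index, (a, b) in enumerate(zip(cds_a, cds_b)):
        if a == b:
            matches[index % 3] += 1
    codons = len(cds_a) // 3
    return tuple(count / codons for count in matches)


def gc3(cds: str) -> float:
    """Fraction of G or C among third codon positions."""
    third_positions = cds.upper()[2::3]
    if not third_positions:
        raise ValueError("sequence must contain at least one full codon")
    return sum(base in "GC" for base in third_positions) / len(third_positions)


# Toy in-frame sequences (hypothetical data): they differ only at third positions.
seq_a = "ATGGCTAAA"
seq_b = "ATGGCCAAG"
print(identity_by_codon_position(seq_a, seq_b))
print(gc3(seq_a), gc3(seq_b))
```

Under purifying selection on the protein, identity is expected to be higher at positions 1 and 2 than at position 3, as in this toy pair; a high third-position G/C value would be consistent with the codon bias described in point 2.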
CATS marker development
ETs: Expressed transcripts (full-length cDNAs)
TC: Tentative clusters
Three-way conservation: Lotus, Medicago, Glycine
Phylogenetic tree of whole genome duplications
Paterson, A. H. et al. (2004) Proc. Natl. Acad. Sci. USA 101,
Three-way conserved sequences vs. Arabidopsis homolog count
Plot axes: number of homologs in Arabidopsis; number of three-way conserved sequences
Macrosynteny
Choi, Hong-Kyu et al. (2004) Proc. Natl. Acad. Sci. USA 101,
Macrosynteny between M. truncatula and L. japonicus
Choi, Hong-Kyu et al. (2004) Proc. Natl. Acad. Sci. USA 101,
Figure labels: Phaseolus ?; Leg148, Leg101, Leg178, Leg148
Microsynteny between Arabidopsis, M. truncatula and L. japonicus
Choi, Hong-Kyu et al. (2004) Proc. Natl. Acad. Sci. USA 101,
Consensus comparative map data for six legume species
Choi, Hong-Kyu et al. (2004) Proc. Natl. Acad. Sci. USA 101,