Evolution of Plant Stress Responsiveness: Genome-wide and Gene Family Level Analysis. Shin-Han Shiu, Department of Plant Biology
Outline. Major interests and why. Gene families and stress responsiveness: the interplay between gene family expansion, duplication mechanism, and the elusive selection pressure. The Receptor Kinase family as an example: one of the biggest plant gene families and their involvement in plant biotic interactions. If there is enough time, the short story on plant pseudogenes: when can you call a gene a pseudogene?
Major interests Source of selection pressure: abiotic and biotic stress conditions Target of selection: duplicate genes Molecular evolutionary patterns Genetic basis of adaptation
Where do all these duplicates come from? Whole genome duplication; tandem duplication; segmental duplication; replicative transposition.
Measuring Lineage-specific Gain Orthologous group and lineage-specific gain Reconcile species and gene trees
Expansion at the orthologous group level. Most successful RLK: LRR type. But Arabidopsis may not be representative of land plants. [Figure: enrichment; axes: log(OG size), log(freq).]
Two major patterns in OG expansion: convergent expansion and single-lineage expansion.
Expansion patterns and duplication mechanisms. Comparison of ratios between tandem and non-tandem genes, e.g. for A-M orthology:
             Convergent   Single-lineage
Tandem       756          848
Non-tandem   4500         2918
Ratio        0.17         0.30
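The ratio row can be recomputed from the counts on this slide. The following is a minimal sketch; the dictionary keys and the function name `tandem_ratio` are labels chosen for illustration, not names from the original analysis. Note that the counts as listed give roughly 0.29 for the single-lineage column, close to the 0.30 shown on the slide.

```python
# Counts copied from the slide; the key names are illustrative only.
counts = {
    "convergent": {"tandem": 756, "non_tandem": 4500},
    "single_lineage": {"tandem": 848, "non_tandem": 2918},
}


def tandem_ratio(group):
    """Return the tandem / non-tandem ratio for one expansion pattern."""
    return group["tandem"] / group["non_tandem"]


ratios = {name: tandem_ratio(group) for name, group in counts.items()}

for name, ratio in ratios.items():
    print(f"{name}: tandem/non-tandem = {ratio:.3f}")
```

The comparison of interest is between the two columns: the single-lineage ratio is higher than the convergent one, which is the pattern the slide points to.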
Summary I. Duplicate gene turnover: although some duplicates are retained for millions of years, the majority are lost on a time scale of hundreds of millions of years. The degree of lineage-specific expansion is similar at the family level, but with substantial variation. Expansion patterns fall into two major categories: convergent expansion and single-lineage expansion. Orthologous groups with single-lineage expansion tend to be enriched in tandemly repeated genes.
Expansion of responsive genes and conditions. Genes in expanded OGs tend to be enriched in stress-responsive genes. (+: significant at the 5% level.)
Stress responsiveness and duplication mechanisms. Enrichment of tandemly over non-tandemly expanded genes under biotic conditions. Significant at the 5% level: T, tandem >> non-tandem; N, non-tandem >> tandem.
Summary II. Over the course of plant evolution, retention rate: stress response genes >> genome average. This is true for genes up-regulated in both biotic and abiotic stress conditions. Influence of duplication mechanism, particularly for biotic stress conditions, retention rate: tandem >> non-tandem. However, genes responsive to biotic stimuli are not necessarily tandem; this depends on their location in the signaling network, e.g. plant receptor kinases: biotic -> tandem; transcription factors -> non-tandem, presumably WGD.
Functional evolution of duplicate genes. Question: what is the fate of duplicate genes? Address this question in the context of stress responsiveness.
Reconstruction of ancestral functions. Step 1: construct the phylogeny of genes. Step 2: map current functions. Step 3: reconstruct the function of ancestral genes.
Branch-based analysis. Ancestral state determined by BayesTraits, an MCMC/ML combination to estimate the trait values at ancestral nodes. Definition of states: 1, -1, 0.
Relative abundance of functional evolution classes. Retention > loss >> gain >> switch. [Figure: abundance per class, including 0->0, loss, and gain.]
Evolution of stress responsiveness over Ks (Ks: synonymous substitutions per synonymous site, a proxy for time since duplication).
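Relative abundance of functional evolution classes. N = N_M + N_L + N_S, where N_M, N_L, and N_S are the numbers of maintenance, loss, and switch events and N is the total number. [Figure panels: abiotic stress, biotic stress; x-axis: Ks; y-axis: N_M and total N.]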
Relative abundance of functional evolution classes. Classes: N_Maintenance, N_Gain, N_Switch; N = total. [Figure panels: abiotic stress, biotic stress; x-axis: Ks.]
What is the nature of functional switch? Switch: e.g. 1 to -1, i.e. up-regulated in the ancestral state but down-regulated in the current state. Does a switch involve a one-step or a two-step process? It seems to be the latter, since loss rate >> gain rate >> switch rate.
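The branch classes can be made concrete with a small sketch. It assumes the three states are 1 (up-regulated), -1 (down-regulated), and 0 (not responsive); the meaning of 0 is an assumption, as are the function name `classify_branch`, the label "non-responsive" for 0->0, and the toy branch list. It is not the BayesTraits procedure itself, only the bookkeeping that would follow once ancestral states are estimated.

```python
from collections import Counter

VALID_STATES = (-1, 0, 1)  # assumed: -1 down-regulated, 0 not responsive, 1 up-regulated


def classify_branch(ancestral, descendant):
    """Classify one branch by comparing ancestral and descendant states."""
    for state in (ancestral, descendant):
        if state not in VALID_STATES:
            raise ValueError(f"unexpected state: {state!r}")
    if ancestral == 0 and descendant == 0:
        return "non-responsive"  # 0 -> 0, nothing to retain or lose
    if ancestral == descendant:
        return "retention"       # 1 -> 1 or -1 -> -1
    if descendant == 0:
        return "loss"            # responsive -> not responsive
    if ancestral == 0:
        return "gain"            # not responsive -> responsive
    return "switch"              # 1 -> -1 or -1 -> 1


# Hypothetical branches as (ancestral, descendant) pairs, for illustration only.
branches = [(1, 1), (1, 0), (0, 1), (1, -1), (0, 0), (-1, -1), (-1, 0)]
tally = Counter(classify_branch(a, d) for a, d in branches)
print(tally)

# A two-step switch from 1 to -1 passes through 0: a loss followed by a gain.
two_step = [classify_branch(1, 0), classify_branch(0, -1)]
print(two_step)
```

Under this reading, a one-step switch is a single branch classified as "switch", while the two-step route shows up as a loss on one branch and a gain on a later one.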
Summary III. Branch-based analysis: maintenance > loss > gain > switch. Retention of stress responsiveness: Ks < 0.8, continued loss; Ks > 0.8, nearly constant. Responsiveness gain has a similar trajectory. Loss > gain > switch suggests a two-step evolutionary process of functional switch.
Functional evolution in the context of duplicate pairs Functional partition
Relative abundance of functional evolution classes. Looking at each gene pair under each condition, partition is the most abundant class. Classes: retention (RET), partition (PAR), neofunctionalization (NEO), parallel loss (PL).
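One way to read the four pair-level classes is sketched below. The slides do not spell out the definitions, so the rules here are assumptions: responsiveness is treated as a yes/no trait per condition, and the class is decided from the inferred ancestral state and the two duplicates. The function name `classify_pair` and the label "non-responsive" are hypothetical.

```python
def classify_pair(ancestor, dup1, dup2):
    """Classify a duplicate pair under one condition.

    All arguments are booleans: True if the gene (or inferred ancestor)
    is responsive under the condition. The rules are assumed definitions.
    """
    n_responsive = int(dup1) + int(dup2)
    if ancestor:
        if n_responsive == 2:
            return "RET"  # retention: both duplicates keep the response
        if n_responsive == 1:
            return "PAR"  # partition: only one duplicate keeps it
        return "PL"       # parallel loss: both duplicates lose it
    if n_responsive > 0:
        return "NEO"      # neofunctionalization: response absent in ancestor
    return "non-responsive"


# Hypothetical example: one pair scored under three conditions.
calls = [
    classify_pair(True, True, False),
    classify_pair(True, True, True),
    classify_pair(False, False, True),
]
print(calls)
```

Counting, per pair, how many PAR conditions fall to duplicate 1 versus duplicate 2 would then give the kind of asymmetry the following slides discuss.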
Functional partition of gene pairs. Partition tends to be extremely asymmetric.
Statistical significance of asymmetry. Partition tends to be extremely asymmetric. Why? [Figure axes: Gene 1 conditions, Gene 2 conditions.]
Stress responsiveness of genes over multiple conditions. Over-representation of broadly responsive genes. [Figure panels: abiotic conditions, biotic conditions.]
Enrichment of cis-elements under each condition. Some cis-elements are in genes that are broadly responsive. [Figure: cis-elements by stress conditions; legend: up-regulated, down-regulated, both, neither.]
Breadth in responsiveness vs. cis-element complexity. Positive correlation between these two factors (Spearman rank: ρ = 0.51, p < 2.2e-16). [Figure axes: number of cis-element types, number of responsive conditions.]
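For readers unfamiliar with the statistic, Spearman's ρ is the Pearson correlation computed on ranks, with tied values given their average rank. The sketch below uses only the standard library; the function names and the toy vectors are illustrative and are not the data behind the reported ρ = 0.51.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: cis-element types per gene and responsive conditions per gene.
n_cis_types = [1, 2, 2, 3, 5, 4]
n_conditions = [0, 1, 3, 2, 6, 4]
rho = spearman_rho(n_cis_types, n_conditions)
print(f"rho = {rho:.2f}")
```

A rank-based measure suits count data like these because it only assumes a monotonic relationship, not a linear one.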
Expression and cis-element asymmetries. Some cis-elements are in genes that are broadly responsive. Spearman rank: ρ = 0.62, p < 2.2e-16. [Figure axes: asymmetry (cis-element), asymmetry (up-regulation).]
Summary IV. Sub- vs. neofunctionalization: sub > retention > neo > parallel loss. Subfunctionalization asymmetry: significant asymmetry. Explanation: some genes are controlled by multi-responsive elements; differential loss/gain of cis-elements.
Acknowledgement. Lab members: Melissa Lehti-Shiu, Gaurav Moghe, Cheng Zou. Past member: Kosuke Hanada (RIKEN). Collaborators: Jeff Conner; Gregg Howe (PRL); Rong Jin (CSE); Doug Schemske; Mike Thomashow (PRL). Funding: