Mutations and Epimutations A story of two cultivars and their children. Matteo Pellegrini
Nipponbare and 93-11: – Nipponbare: Oryza sativa japonica. Primarily Japan, China, Indonesia. Agronomic differences: days to heading. – 93-11: Oryza sativa indica. India, Bangladesh, Nepal, China. Submerged growth. Agronomic differences: seed fertility, long grain, taller (83 cm).
Why Study Crosses? Crosses of indica and japonica are often sterile. They show hybrid vigor in agronomic traits.
Overview: Identify SNPs between ecotypes. – SNP generation. Identify epimutations between ecotypes. – Identify methyl-inheritance. Identify allele-specific expression. Identify RNA editing. [Diagram labels: P, F1, NPB.] Rice ecotypes: Nipponbare (NPB) and 93-11. Generated BS-seq data for NPB, 93-11, and 2 reciprocal crosses.
Detecting Cytosine Methylation: Each base is A, unmethylated C, methylated C, G, or T; which Cs are methylated? Original sequence: … ACCCGTACCCGATTAG …, with the 3rd, 9th, and 10th bases methylated (m). Apply sodium bisulfite and amplify: unmethylated C → T; methylated C (and A/G/T) unchanged. Result: … ATCTGTATCCGATTAG … Align the converted sequence to a known reference and compare.
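The conversion on this slide can be reproduced in a few lines. This is a minimal sketch, not code from the talk: the function names `bisulfite_convert` and `infer_methylation` are hypothetical, and the methylated positions (0-based 2, 8, 9) are inferred by comparing the two sequences shown on the slide.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment: unmethylated C -> T, all else unchanged."""
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            out.append("T")  # unmethylated cytosine is read as T
        else:
            out.append(base)  # methylated C and A/G/T are unchanged
    return "".join(out)


def infer_methylation(reference, converted):
    """Compare the converted sequence with the reference at each reference C."""
    calls = {}
    for i, (ref_base, conv_base) in enumerate(zip(reference, converted)):
        if ref_base == "C" and conv_base in "CT":
            calls[i] = "m" if conv_base == "C" else "u"
    return calls


original = "ACCCGTACCCGATTAG"   # sequence from the slide
methylated = {2, 8, 9}          # assumption: inferred from the slide, 0-based
converted = bisulfite_convert(original, methylated)
calls = infer_methylation(original, converted)
print(converted)
print(calls)
```

Running the sketch yields the converted sequence shown on the slide, and comparing it back to the original recovers the same three methylated positions.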
Mapping Approach: BS Seeker (Chen et al. (2010) BMC Bioinformatics). BS reads are C/T converted, so normal aligners are not applicable. Three-letter alignment: (1) Start with the BS read AATCGTA and the reference genome CTAATCGCAGG. (2) Convert C to T in both: read AATTGTA, reference TTAATTGTAGG. (3) Map with Bowtie. (4) Restore to 4 letters: read AATCGTA against reference CTAATCGCAGG. (5) Compare alignments to call each C methylated (m) or unmethylated (u).
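The three-letter idea can be sketched on the slide's own example. This is an illustration under stated assumptions, not BS Seeker itself: exact substring search (`str.find`) stands in for Bowtie, only the forward strand is handled, and the names `three_letter` and `map_bs_read` are hypothetical.

```python
def three_letter(seq):
    """Collapse the alphabet to three letters by converting every C to T."""
    return seq.replace("C", "T")


def map_bs_read(read, reference):
    """Map a bisulfite read in three-letter space, then call methylation.

    Assumption: exact matching replaces the real aligner (Bowtie on the slide).
    Returns (position, {reference_index: 'm' or 'u'}) or None if unmapped.
    """
    pos = three_letter(reference).find(three_letter(read))
    if pos < 0:
        return None
    calls = {}
    # Restore the four-letter sequences and compare at each reference C.
    for offset, (ref_base, read_base) in enumerate(zip(reference[pos:], read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[pos + offset] = "m"  # C survived conversion: methylated
        elif read_base == "T":
            calls[pos + offset] = "u"  # C converted to T: unmethylated
    return pos, calls


result = map_bs_read("AATCGTA", "CTAATCGCAGG")  # read and reference from the slide
print(result)
```

For the slide's read, the three-letter match lands at offset 2 of the reference; restoring the four-letter sequences then shows one retained C and one C read as T.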
Methylation levels at single-base resolution: Calculate the methylation level at each covered cytosine. Methylation level = #C/(#C+#T). Ref. genome: 5' --attgagacatcctagcgcgtggtgacaataata-- 3'. Aligned reads: ttttagcgcgtggtg, cattttagtgcgtgg, tagtgcgtggtg. At the two CG cytosines: 1/(1+2) = 33.3% and 3/(3+0) = 100%.
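The ratio #C/(#C+#T) can be computed per covered cytosine as in the sketch below. The read start offsets (9, 7, 12) are an assumption, chosen as the positions where each slide read lines up with the reference; the function name `methylation_levels` is hypothetical and only forward-strand reads are considered.

```python
from collections import defaultdict


def methylation_levels(reference, aligned_reads):
    """Return {position: #C / (#C + #T)} for every covered reference cytosine.

    aligned_reads is a list of (start, read) pairs in reference coordinates.
    """
    counts = defaultdict(lambda: [0, 0])  # position -> [#C, #T]
    for start, read in aligned_reads:
        for offset, base in enumerate(read.upper()):
            pos = start + offset
            if reference[pos].upper() != "C":
                continue
            if base == "C":
                counts[pos][0] += 1
            elif base == "T":
                counts[pos][1] += 1
    return {pos: c / (c + t) for pos, (c, t) in sorted(counts.items())}


reference = "attgagacatcctagcgcgtggtgacaataata"  # from the slide
reads = [
    (9, "ttttagcgcgtggtg"),   # start offsets are assumptions, see lead-in
    (7, "cattttagtgcgtgg"),
    (12, "tagtgcgtggtg"),
]
levels = methylation_levels(reference, reads)
for pos, level in levels.items():
    print(pos, f"{level:.1%}")
```

With these offsets, the two CG cytosines (0-based positions 15 and 17) come out at the 33.3% and 100% shown on the slide; the sketch also reports the other covered cytosines, which the slide does not annotate.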
Workflow. Alignments: – BS-Seeker mapping of NPB and 9311 samples to the NPB reference genome. – Maps the 9311 genome to NPB coordinates. Parent genomes: – Each read generates a small implied sequence fragment. – Use these fragments to generate a parent genome. F1 read matching: – Map reads to the NPB reference genome to get location. – Compare each read to the NPB and 9311 parent genomes and determine the better match.
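The F1 read-matching step might look roughly like the following. This is a sketch under assumptions: the toy parent sequences are invented for illustration, a reference C read as T is tolerated as bisulfite conversion rather than counted as a mismatch, and the names `mismatches` and `assign_parent` are hypothetical.

```python
def mismatches(read, genome, pos):
    """Count mismatches between a BS read and a genome at a mapped position."""
    n = 0
    for read_base, genome_base in zip(read, genome[pos:pos + len(read)]):
        if read_base == genome_base:
            continue
        if genome_base == "C" and read_base == "T":
            continue  # assumption: treat as bisulfite conversion, not a SNP
        n += 1
    return n


def assign_parent(read, pos, npb_genome, p9311_genome):
    """Assign an F1 read to the parent genome it matches better."""
    to_npb = mismatches(read, npb_genome, pos)
    to_9311 = mismatches(read, p9311_genome, pos)
    if to_npb < to_9311:
        return "NPB"
    if to_9311 < to_npb:
        return "9311"
    return "ambiguous"


# Hypothetical toy parent genomes in NPB coordinates, differing at index 4.
npb = "GATTACAGGA"
p9311 = "GATTGCAGGA"

print(assign_parent("TTGTAG", 2, npb, p9311))  # carries the 9311 allele
print(assign_parent("TTACAG", 2, npb, p9311))  # carries the NPB allele
print(assign_parent("GATT", 0, npb, p9311))    # does not cover the SNP
```

One consequence of tolerating C-to-T differences is that a read covering only a C/T SNP cannot be assigned on the forward strand, so such reads fall into the ambiguous class here.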
Detecting Allele-Specific Methylation [Figure: BS-seq reads from the parent1/parent2 cross are separated by a SNP into parent1 and parent2 alleles; the methylation level at CG sites is shown for each.]
Library statistics. Columns: Reads, Mapped, % Mapped, Coverage. Methyl-Seq: NPB 298M 134M 45% M 74M 47% NPB x M 279M 47% NPB x NPB 543M 236M 43% NPB. RNA-Seq: NPB 42M 17M 42% M 13M 31% NPB x M 12M 26% - NPB x NPB 43M 11M 25% - NPB
Identifying SNPs: A site is called a SNP if: – there are > 3 reads per strand; – there is > 90% agreement within each ecotype; – the two strands agree with each other (compensating for bisulfite-converted Cs); – the ecotypes (obviously) disagree with each other. Will miss indels, duplications, inversions, and other chromosomal rearrangements. Will miss long runs of SNPs (> 3 within ~55 bp) (BS-Seeker limit).
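The criteria above can be sketched as a small per-site filter. The way strands are reconciled here is an assumption about what "compensate for Cs" means: a forward-strand T is treated as compatible with C or T, and a reverse-strand A (in forward coordinates) as compatible with G or A, and a base is called only when the two strands leave exactly one candidate. All names are hypothetical.

```python
from collections import Counter

# Genomic bases compatible with an observed consensus on each strand
# (assumption: bisulfite turns C into T on forward reads, G into A on reverse).
FWD = {"A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"C", "T"}}
REV = {"A": {"A", "G"}, "C": {"C"}, "G": {"G"}, "T": {"T"}}


def strand_consensus(bases, min_reads=3, min_agree=0.9):
    """Majority base if there are > min_reads reads with > min_agree agreement."""
    if len(bases) <= min_reads:
        return None
    base, n = Counter(bases).most_common(1)[0]
    return base if n / len(bases) > min_agree else None


def call_base(fwd_bases, rev_bases):
    """Call the genomic base when both strands agree on a single candidate."""
    f = strand_consensus(fwd_bases)
    r = strand_consensus(rev_bases)
    if f is None or r is None:
        return None
    candidates = FWD.get(f, set()) & REV.get(r, set())
    return next(iter(candidates)) if len(candidates) == 1 else None


def is_snp(ecotype1, ecotype2):
    """Each ecotype is (forward_bases, reverse_bases) observed at one site."""
    a = call_base(*ecotype1)
    b = call_base(*ecotype2)
    return a is not None and b is not None and a != b


# Forward T with reverse C resolves to an unmethylated C; T on both strands is a true T.
print(is_snp(("TTTT", "CCCC"), ("TTTT", "TTTT")))
# Same genomic C in both ecotypes, differing only in methylation: not a SNP.
print(is_snp(("TTTT", "CCCC"), ("CCCC", "CCCC")))
```

The second call illustrates why the strand comparison matters: without it, a methylation difference at a cytosine would look like a C/T SNP.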
SNPs - NPB vs 9311: 1,209,456 mutations / 306,106,830 sites with mutual base calls ~ 1/253 bases. Mostly (73%) C->T (or G->A if C->T on opposite strand), or T->C and A->G in the other direction. Base-by-base count matrix (A, C, G, T): A: 86,677,300 42,… …,135 42,513; C: 43,336 65,771,387 34,… …,045; G: … 34,146 65,771,387 43,336; T: 42,… …,135 42,553 86,677,300.
SNPs - NPB vs F1 (9N-NPB): 12 mutations. Are these real or false? Similar numbers amongst all F1 comparisons. Base-by-base count matrix (A, C, G, T): A: 3,188,…; C: - 2,695,…; G: 2 - 2,548,205 -; T: …,253,196.
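Identifying epimutations: Use the binomial distribution to build the min, max, and mean percent methylation at each C. The min and max are the bounds of the confidence interval at the 5% level. As the number of reads increases, the interval size decreases. [Plot: min/max vs. number of reads.]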
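A binomial interval of this kind can be sketched with the standard library alone. The slide does not say which interval was used; the exact (Clopper-Pearson) interval below, solved by bisection, is an assumption, as is reading "5%" as a two-sided alpha of 0.05. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, steps=60):
    """Bisection for f(p) = target on [0, 1], where f is decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def methylation_interval(meth_reads, total_reads, alpha=0.05):
    """Return (min, mean, max) methylation for one cytosine.

    Assumption: exact two-sided binomial interval at level alpha.
    """
    mean = meth_reads / total_reads
    if meth_reads == 0:
        lower = 0.0
    else:
        lower = _solve_decreasing(
            lambda p: binom_cdf(meth_reads - 1, total_reads, p), 1 - alpha / 2
        )
    if meth_reads == total_reads:
        upper = 1.0
    else:
        upper = _solve_decreasing(
            lambda p: binom_cdf(meth_reads, total_reads, p), alpha / 2
        )
    return lower, mean, upper


few = methylation_interval(5, 10)      # 50% methylation, 10 reads
many = methylation_interval(50, 100)   # 50% methylation, 100 reads
print(few)
print(many)
```

Both calls have the same mean, but the interval from 100 reads is much narrower than the one from 10 reads, which is the trend the slide's plot describes.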
Identifying epimutations (cont.): Called different if: – mean(sample1) … max(sample1)
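The comparison rule on this slide is incomplete in the transcript, so the sketch below shows only one plausible reading, labeled as an assumption throughout: two samples are called different at a cytosine when one sample's mean lies below the other's minimum while the other's mean lies above the first sample's maximum. The function name `called_different` and the example numbers are hypothetical.

```python
def called_different(sample1, sample2):
    """Each sample is a (min, mean, max) methylation triple for one cytosine.

    Assumption: this is one plausible reading of the slide's rule, not the
    rule itself, which is not fully recoverable from the transcript.
    """
    min1, mean1, max1 = sample1
    min2, mean2, max2 = sample2
    sample1_lower = mean1 < min2 and mean2 > max1
    sample1_higher = mean2 < min1 and mean1 > max2
    return sample1_lower or sample1_higher


# Hypothetical intervals: clearly separated versus overlapping.
print(called_different((0.0, 0.1, 0.3), (0.6, 0.8, 0.95)))
print(called_different((0.2, 0.5, 0.8), (0.3, 0.6, 0.9)))
```

Under this reading, a site with few reads in either sample is rarely called different, because wide intervals make it hard for either mean to fall outside the other sample's range.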
Epimutation rate: 1 in 300 CG sites spontaneously epimutates across one generation.
Epimutations are enriched in regions where parents differ: half of the epimutations between parents and crosses occur at sites where the parents differ.
Epimutations (continued): Epimutations within genes – 498 genes were significantly enriched for epimutations. – GO term analysis across ecotypes indicates ATP-synthesis-related activity (ATP synthesis coupled proton transport, hydrogen transport, ion transmembrane transport, etc.).
Expression: Many genes (~7800/25640) are differentially expressed between ecotypes. GO terms: chloroplast-related terms, response to cadmium ion.
Expression (cont.): Across generations, only 78 genes are differentially expressed. Of these, only 2 were differentially expressed in the parents.
Allele-Specific Expression: 681 examples of allele-specific expression. Do they partially explain hybrid vigor? [Figure panels: NPB parent, NPB cross, 9311 parent, 9311 cross.]
Allele-Specific Genes Accumulate Mutations [Figure: SNP density, all genes vs. allele-specific genes.]
Allele-Specific Expression (cont.): Allele-specific genes are also enriched for differentially methylated sites.
RNA Editing: Cytidine deamination: C to U. Adenosine deamination: A to I (read as G).
How Widespread? Recent studies indicate that RNA editing may be more widespread than originally thought (Science Jul 1;333(6038):53-8). Others have disputed this claim (Schrider et al., PLoS ONE). In plants, RNA editing is thought to take place in the mitochondria and plastids. Is there editing in nuclear genes?
RNA Editing in Rice [Table: NPB DNA base (A, C, G, T) vs. NPB RNA base (A, C, G, T).] Initially we found lots of examples…
On Closer Inspection… Alignments are often off by one or more bases at splice sites
But a Few Real Ones Remain?
But More Filtering Should Be Done… [Figure: position of edit site along read.]
Conclusions: Epimutation rates are one in 300 cytosines across one generation. – Clusters of epimutations are present. – They are enriched at sites where parental epigenomes differ. Allele-specific expression is widespread and associated with – increased SNP densities – higher differential methylation. We find some evidence for RNA editing, but…