Gene tree analyses of Aboriginal Australians Rosalind Harding University of Oxford
Aim
To investigate gene genealogies of two data sets
–Human mitochondrial coding genomes from Aboriginal Australians
–Hepatitis B virus from the Pacific region (collaboration with Rory Bowden)
Why?
–To evaluate time depth of polymorphism
–To use coalescent models rather than molecular clocks in phylogenies
–To examine the implications of demographic assumptions
Aim for the ongoing study of HBV
–For HBV, mutations at fast sites have to be removed to resolve networks.
–But mutations at fast sites contribute to high estimates of mutation rates.
–If we remove the fast sites, how do we recalibrate the mutation rates?
–Can we match patterns of HBV diversity in the Pacific region to human dispersals that have been dated by archaeology and genetics, to suggest appropriate time scales?
Coalescence times in a gene genealogy
[Figure: gene genealogy with coalescence intervals T(2) and T(3) marked; Rosenberg and Nordborg, 2002]
Notice that T(2) is longer than T(3). Here N is assumed constant. This time scaling shows what we expect for a standard coalescence model.
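The expectation on this slide can be checked with a short simulation. This is a minimal sketch, assuming the standard constant-size coalescent with time measured in coalescent units (2N generations under diploid scaling); the function names are illustrative and do not come from the talk. While k lineages remain, the waiting time T(k) is exponential with rate k(k-1)/2, so the expected T(2) is 1 and the expected T(3) is 1/3.

```python
import random

def simulate_coalescence_times(n, rng):
    """Return {k: T(k)} for k = n..2 under the standard constant-N coalescent.

    Time is in coalescent units (2N generations for a diploid population).
    """
    times = {}
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2          # number of lineage pairs that can coalesce
        times[k] = rng.expovariate(rate)
    return times

def mean_times(n, replicates=20000, seed=1):
    """Average each T(k) over many simulated genealogies."""
    rng = random.Random(seed)
    totals = {k: 0.0 for k in range(2, n + 1)}
    for _ in range(replicates):
        for k, t in simulate_coalescence_times(n, rng).items():
            totals[k] += t
    return {k: total / replicates for k, total in totals.items()}

if __name__ == "__main__":
    for k, mean in sorted(mean_times(3).items()):
        expected = 2 / (k * (k - 1))
        print(f"T({k}): simulated mean {mean:.3f}, expected {expected:.3f}")
```

The deepest interval, T(2), dominates the tree height, which is why the oldest part of a genealogy is also the least precisely dated.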
Introducing Mutation
[Figure: genealogy of samples 1, 2 and 3 with a G → T mutation and the MRCA of 1, 2 & 3 marked; Rosenberg and Nordborg, 2002]
The MRCA of 1, 2 & 3 is usually a more recent (younger) common ancestor than the common ancestor in whom a shared mutation, G → T, first arose for the whole tree.
Constant N vs Expansion
[Figure panel labels, in order: Gene genealogy simulated assuming constant Ne; alleles A, B, C; Frequencies of 3 alleles; Frequencies of 11 alleles; Gene genealogy simulated under population expansion]
Computational analyses
–Software based on Genetree, written by Prof Bob Griffiths
–Input data: infinite-sites compatible gene tree
–Unpublished upgrades that use importance sampling, following algorithms developed by Paul Fearnhead
Polymorphism data for gene genealogies
Resolving the gene trees
MtDNA coding genomes:
–Minor problem: recurrent or back mutation events
–Solution: re-instate inferred mutation events following standard mtDNA phylogeny reconstructions
HBV data:
–Major problem: a subset of fast sites
–Solution: determine fast sites using Parat software from Meyer & Von Haeseler, 2003, Mol. Biol. Evol. 20(2): , and proceed as above.
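Recurrent or back mutations show up as pairs of sites that cannot sit on the same tree. One common way to find them is the four-gamete test: if two biallelic sites show all four allele combinations across the sample, no single tree with one mutation per site can explain both. The sketch below is an illustration only, not the Genetree or Parat implementation; the function names and the toy haplotypes are made up. It is the unrooted version of the check, so a rooted gene tree with known ancestral states would need a stricter condition.

```python
from itertools import combinations

def incompatible_site_pairs(haplotypes):
    """Return pairs of site indices that fail the four-gamete test.

    haplotypes: equal-length strings over '0' and '1' (two alleles per site).
    """
    n_sites = len(haplotypes[0])
    bad = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:           # 00, 01, 10 and 11 all observed
            bad.append((i, j))
    return bad

def is_infinite_sites_compatible(haplotypes):
    """True when no pair of sites shows all four gametes."""
    return not incompatible_site_pairs(haplotypes)

if __name__ == "__main__":
    tree_like = ["000", "100", "110", "001"]
    with_recurrence = tree_like + ["101"]   # toy haplotype creating a conflict
    print(is_infinite_sites_compatible(tree_like))
    print(incompatible_site_pairs(with_recurrence))
```

Sites flagged this way are candidates for the treatment described on the slide: either re-instating an inferred mutation event on the tree or, for fast sites, removing the column before building the gene tree.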
Background to mtDNA study
Van Holst Pellekaan et al. (2006) Mitochondrial genomics identifies major haplogroups in aboriginal Australians. Am J Phys Anthrop 131:
–Estimated a time-scaled genealogy for 8 mtDNA coding regions from individual samples sequenced by Van Holst Pellekaan.
–The number of genomes in the public database and available to study has since increased; now n=34.
mtDNA genealogy
[Figure: time-scaled mtDNA gene tree with mutations (e.g. 10873T, 10400T, 12705C) mapped onto branches. Tip labels: r1, d3, d38, d32, r17, r6, r7, r25. Time axis in KYA running to the Present, with node ages 74,200; 53,200; 43,000; 36,350; 36,300; 26,700; 17,400. Lineages labelled M (AuB) and N (AuA, AuC, AuD, AuE).]
Both of the major non-African haplogroups are represented. The time scale estimated from the gene tree suggests that these lineages evolved from original founders 40,000 – 50,000 years ago.
Network of 34 mtDNA genomes
M: AuB; N: haplogroups O (AuD), S (AuA), P: P3, P4 (AuC), P5, P6, P7, P8 (AuE)
Time scale for Australian mtDNA
Estimated mutation rate: – : per coding region per generation
Data suggest population expansion
Find model parameters with relatively high likelihood
–ML(θ) = 350, where θ = 2Nu
–Population expansion since TMRCA: a factor of e^5
–TMRCA: time to most recent common ancestor
Population size
–N present: 33,000
–from N ancestral: 220 at TMRCA
TMRCA: 66,000 yrs
Note: P3 is the only haplogroup with branches represented in both Australia and PNG.
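The parameter values on this slide can be checked against each other with a little arithmetic. The ratio of present to ancestral population size is 33,000 / 220 = 150, which is close to e^5 (about 148), so the stated expansion and the two population sizes agree. The sketch below also derives a per-generation mutation rate from θ = 2Nu, under the assumption that θ is scaled by the present-day N; the slide's own mutation-rate value did not survive in this transcript, so the derived figure is only what these numbers imply under that assumption and is not the value reported in the talk.

```python
import math

theta = 350          # ML estimate of theta = 2Nu (from the slide)
n_present = 33_000   # present population size (from the slide)
n_ancestral = 220    # population size at the TMRCA (from the slide)

growth_factor = n_present / n_ancestral      # fold increase since the TMRCA
log_growth = math.log(growth_factor)         # exponent if the increase is written as e^x

# Assumption (not stated on the slide): theta uses the present-day N,
# so u = theta / (2 * N_present).
implied_u = theta / (2 * n_present)

print(f"growth factor: {growth_factor:.0f} (e^{log_growth:.2f})")
print(f"implied u under the stated assumption: {implied_u:.4f} per generation")
```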
mtDNA genealogy
New analysis confirms: Aboriginal Australian diversity has been evolving in isolation for ~40,000 years.
Background to HBV study
–Analysis by Rory Bowden
–Focus on HBV variability in Australia and the Pacific, and judge the time scale of the genealogy by comparison with hypotheses for HBV dispersal
–Within Genotype C are two very distinct sequences from aboriginal Australians.
HBV Genotypes Worldwide, 7 HBV genotypes each with distinct geographic distribution. Sampling in East Asia and Pacific region finds mainly genotypes C and D
Australia and Pacific region First occupation of Australia: 50,000 yrs BP; PNG and Solomons: 30,000 yrs BP; Austronesian expansion: Vanuatu: 5,000 yrs BP; Fiji and Tonga: 3,000 yrs BP.
HBV C Genotype Network
[Figure: network with groups labelled Various, mainly Melanesian; Vanuatu; Tonga, Fiji; China/Japan; AUSTRALIA]
HBV C: Starting again … Network after removal of 10 fastest sites. S antigen sequences, relative rate cut-off of 15.
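Removing fast sites amounts to dropping alignment columns whose estimated relative rate reaches a cut-off. The sketch below is a hypothetical illustration of that step, not the procedure used in the talk or by Parat: the function name, the toy alignment, the rate values, and the choice to treat a rate equal to the cut-off as fast are all assumptions.

```python
def remove_fast_sites(alignment, relative_rates, cutoff):
    """Drop alignment columns whose relative rate is at or above the cutoff.

    alignment: equal-length sequences (strings), one per sample.
    relative_rates: one rate estimate per column (hypothetical values here).
    Returns (filtered sequences, indices of removed columns).
    """
    if any(len(seq) != len(relative_rates) for seq in alignment):
        raise ValueError("each sequence needs exactly one rate per site")
    removed = [i for i, rate in enumerate(relative_rates) if rate >= cutoff]
    drop = set(removed)
    filtered = [
        "".join(base for i, base in enumerate(seq) if i not in drop)
        for seq in alignment
    ]
    return filtered, removed

if __name__ == "__main__":
    toy_alignment = ["ACGTA", "ACTTA", "GCGTC"]     # made-up sequences
    toy_rates = [1.0, 0.2, 18.5, 0.5, 15.0]         # made-up relative rates
    filtered, removed = remove_fast_sites(toy_alignment, toy_rates, cutoff=15)
    print(removed)
    print(filtered)
```

Returning the removed column indices alongside the filtered sequences keeps a record of which sites were excluded, which matters when the mutation rate later has to be recalibrated for the remaining sites.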
HBV C: more resolution Network of S antigen sequences
HBV Genotype C in the Pacific
Time scales
–5,000 or 50,000 years?
–3,000 or 30,000 years?
–2,000 or 20,000 years?
Conclusions
Gene trees can be constructed for mtDNA and HBV data to represent polymorphism data.
Coalescent analyses are feasible.
Contemporary mtDNA diversity in aboriginal Australians represents founding lineages.
Contemporary HBV diversity in Australia and the Pacific could be explained by two alternative time scales (more work to do!)
–Over 50,000 years
–Over 5,000 years
Abstract: Gene tree analyses of Aboriginal Australians. Gene trees from mitochondrial DNA sequences have been widely used for phylogeographic analyses of modern human dispersal but are less often used in combination with coalescent models for demographic inference. Given the lack of recombination in mtDNA, such data should be ideal for gene tree based coalescent analyses. However, the same mutability that makes them so informative for studies of geographic variation also generates difficulties for analyses assuming an infinite-sites mutation process. The main aim of this talk is to present some gene tree based coalescent analyses applied to hypervariable sequence data from mtDNA and from other genomes, and to discuss solutions to the problems that arise. The primary data set comprises 34 mtDNA coding genomes from Aboriginal Australians and extends work presented by van Holst Pellekaan et al. (Am J Phys Anthrop 131: , 2006). Mitochondrial DNA is not the only haploid genome that has value for anthropological genetics. Vertically transmitted bacteria can also be informative, as has been previously shown using data from Helicobacter pylori. In collaboration with Rory Bowden, we hope to show that Hepatitis B virus strains may also provide insights into anthropological questions about Aboriginal Australian prehistory. Hopefully, I will have results on some gene tree analyses as well as methodological issues to discuss.