
1 Potentials and limits of haplotype trees in exploring population structure and pathogenicity of mutations. Hans-Jürgen Bandelt (Hamburg). 17th Annual Meeting of the German Society of Human Genetics (17. Jahrestagung der Deutschen Gesellschaft für Humangenetik), Heidelberg, 8–11 March 2006

2 Human mtDNA from MITOMAP HVS-I alias HVR1

3 The perception of evolution as seen through the lenses of laboratories constitutes an overlay of two different processes: Perceived evolution = Natural evolution (of the genome) + Artificial evolution (in the lab)

4 Migrational processes (prehistory). mtDNA and evolution α: Natural evolution

5 ML tree of basal African mtDNA haplogroups, coding-region variation displayed. Torroni et al. (TIG, June 2006). [Figure: tree of haplogroups L0 to L7 with M, N and R, rooted relative to the rCRS; axis: Time (years); legend: Ethiopian samples]

6 One of the first views of the East Asian mtDNA phylogeny (Ozawa, Herz 1994). [Figure annotations: incorrect rooting; all mutations that distinguish haplogroups M and R (part of N); labels: CRS, R, M]

7 Star-burst of autochthonous mtDNA lineages in Eurasia (haplogroup N and its subhaplogroup R). Palanichamy et al (Amer J Hum Genet, 2004). [Figure: branches of N and R grouped by region: West Eurasia, South Asia, East Asia, Oceania]

8 ... and a massive burst in haplogroup M, as e.g. seen in India: Sun et al (Mol Biol Evol, March 2006)

9 An Out-of-Africa model based on mtDNA analysis Kivisild et al (Springer-Verlag, April 2006)

10 Sketch of the phylogeny of basal European mtDNA haplogroups. Torroni et al (TIG, June 2006). [Figure: tree of N, R and their subhaplogroups; nomenclature shown: HV0 = pre-V, R0 = pre-HV, R0a = (pre-HV)1]

11 Spatial frequency distributions of haplogroups H1, H3, V, and U5b reveal signature of post-LGM expansions Torroni et al (TIG, June 2006)

12 Laboratory-specific processes (error and fraud). mtDNA and evolution β: Artificial evolution

13 Major sources of error in mtDNA sequence data:
   Artificial recombination: through contamination or sample mix-up (or targeting nuclear inserts of mtDNA)
   Phantom mutations: sequencing errors at electrophoresis
   Documentation errors: incurred by casual reading or writing

14 Impurifying selection is the driving force in artificial evolution inasmuch as incorrect data are more flexible to interpret and can support sexy stories, seemingly told by DNA, which are then disseminated by high-impact-factor journals (e.g. Science and Nature). Worst case: mtDNA in cancer research (Salas et al, PLoS Medicine 2005)

15 Case of mtDNA sample mix-up, mis-interpreted as somatic mutations; data generated with MitoChip by Maitra et al (Genome Res, 2004) Data re-analysis by Bandelt et al (J Med Genet, 2005)

16 A case of cross-over in the 672 human complete mtDNA sequences from Tanaka et al (2004). [Figure: tree from the rCRS through L3, M, N and R to haplogroups M7a and F1a1b, with samples NDsq0167, NDsq0168 and NDsq0178]
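A cross-over of this kind shows up as a sequence that carries part of the diagnostic motif of one haplogroup in one stretch of the genome and part of another haplogroup's motif in a different stretch. The following is a minimal sketch of such a screen, not the procedure used in the re-analysis: the haplogroup names `hgA`/`hgB`, the motif positions, and the `min_hits` threshold are all hypothetical placeholders.

```python
# Toy screen for artificial recombination (sample mix-up) in complete mtDNA sequences.
# All motif positions are made-up placeholders, NOT real haplogroup definitions.

HYPOTHETICAL_MOTIFS = {
    "hgA": {1000, 4000, 9000, 15000},
    "hgB": {2000, 6000, 12000, 16000},
}


def motif_hits(sample_variants, motifs):
    """Return, per haplogroup, the sorted motif positions found in the sample."""
    return {hg: sorted(sample_variants & motif) for hg, motif in motifs.items()}


def looks_like_crossover(sample_variants, motifs, min_hits=2):
    """Return a pair of haplogroup names if the sample carries incomplete motifs
    of both in non-overlapping stretches of the genome, else None."""
    hits = {
        hg: positions
        for hg, positions in motif_hits(sample_variants, motifs).items()
        if len(positions) >= min_hits
    }
    names = sorted(hits)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            incomplete = (len(hits[a]) < len(motifs[a])
                          and len(hits[b]) < len(motifs[b]))
            # The two partial motifs must occupy disjoint position ranges.
            disjoint = hits[a][-1] < hits[b][0] or hits[b][-1] < hits[a][0]
            if incomplete and disjoint:
                return (a, b)
    return None


clean = {1000, 4000, 9000, 15000}      # full hgA motif, nothing from hgB
mosaic = {1000, 4000, 12000, 16000}    # first half hgA, second half hgB

print(looks_like_crossover(clean, HYPOTHETICAL_MOTIFS))
print(looks_like_crossover(mosaic, HYPOTHETICAL_MOTIFS))
```

A flag from such a screen is only a prompt to go back to the raw data; real motifs would have to come from an established haplogroup tree.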

17 Prime example of a phantom mutation (Brandstätter et al, Electrophoresis 2005)

18 Electropherogram from Nasidze and Stoneking (2001), generated 1997/1998 and presented for the first time in Stoneking and Nasidze (Ann Hum Genet, 2006). [Figure label: rCRS]

19 Phantom mutations can be found in excess in the HVS-I Caucasus data of Nasidze and Stoneking (2001). In view of additional problems, this may be regarded as the worst data set ever published in the realm of molecular anthropology; see Bandelt and Kivisild (Ann Hum Genet 2006) for data re-analysis

20 Sequences with phantom transitions in those Caucasus data. [Table; numeric positions not recoverable. Columns: Code / Mutation (16000+) / Haplogroup]
   AR / G / HV1
   AR / C / J
   AZ / ?
   AZ / pre-V
   AZ / A / ?
   CH / G / U1b
   CH / ?
   DAR / ?
   DAR / ?
   KAB / K
   This mutation pair has never been observed in >40,000 HVS-I sequences!
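The claim that a mutation pair is absent from a large reference collection amounts to a simple co-occurrence count. The sketch below illustrates that count under stated assumptions: each sequence is represented as a set of variant positions relative to the rCRS, and both the toy reference set and the queried positions are illustrative only, not the positions or data behind the slide.

```python
def count_pair(sequences, pos_a, pos_b):
    """Count the sequences that carry variants at both pos_a and pos_b."""
    return sum(1 for variants in sequences if pos_a in variants and pos_b in variants)


# Toy reference collection: one set of variant positions per sequence (illustrative only).
toy_reference = [
    {16223, 16311},
    {16126, 16294},
    {16223},
    set(),
]

print(count_pair(toy_reference, 16223, 16311))
print(count_pair(toy_reference, 16126, 16311))
```

A pair that recurs in one laboratory's data set but has a count of zero in tens of thousands of independent sequences is a candidate for a sequencing artefact rather than a real co-occurrence.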

21 Electropherogram presented by Stoneking and Nasidze (Ann Hum Genet, 2006). [Figure label: rCRS]

22 Phantom mutations in the HVS-I data of Plaza et al (Ann Hum Genet, 2003) (267 samples). [Table; most numeric positions not recoverable. Columns: Sample / Mutation (16000+) / Haplogroup]
   Algeria / 279N 285N / ?
   Andalusia / C 183C / M1
   Andalusia / ?
   Andalusia / 281 / ?
   Catalonia / A / U5b
   Catalonia / K
   Morocco / K
   Morocco / C 285T T / L2d
   Morocco / L1b
   Morocco / C / T2
   Morocco / 183C G / X
   Morocco / T / U5b
   Saharawi / G / L3e1
   Saharawi / U6?
   Saharawi / G / ?

23 Comparison with 1624 complete sequences stored in the mtDB database.
   Variation in : only 20 transitional variants at
   Variation in : only transitional variants at 16371, 16380, and 16381
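A comparison of this kind rests on tallying which variants a reference database shows within a given stretch, split into transitions (A↔G or C↔T) and transversions. The following is a minimal sketch of that tally, assuming variants are given as (position, reference base, observed base) tuples; the toy positions and bases are hypothetical and do not reflect the rCRS or the mtDB content.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    return ref != alt and ({ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES)


def tally_variants(variants, start, end):
    """Count distinct transitions and transversions with start <= position <= end."""
    counts = Counter()
    for pos, ref, alt in set(variants):
        if ref == alt or not (start <= pos <= end):
            continue  # skip non-substitutions and positions outside the stretch
        kind = "transition" if is_transition(ref, alt) else "transversion"
        counts[kind] += 1
    return counts


# Hypothetical variant list; positions and bases are placeholders.
toy_variants = [
    (101, "A", "G"),
    (103, "C", "T"),
    (104, "T", "C"),
    (105, "C", "G"),
    (250, "A", "G"),  # outside the queried stretch
]

print(tally_variants(toy_variants, 100, 110))
```

A stretch where the reference database shows only a handful of transitions, while a single data set shows many additional changes, points to phantom mutations in that data set.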

24 Re-evaluation of the mtDNA data from the lab of Min-Xin Guan. Yao et al (Hum Genet, 2006). [Figure legend: missing mutations; misscored mutations in red. Labels: rCRS, N, M, R]

25 Strategies of authors to deal with errors:
   1st: Publishing a corrigendum [rare event]
   2nd: No correction, but avoiding similar errors in future work [common practice]
   3rd: No action, and committing the same errors as before [e.g. as Min-Xin Guan and colleagues do]
   4th: Fraudulent action, performing fake analyses and giving false statements [as done by Mark Stoneking and Ivane Nasidze in Ann Hum Genet]

26 ... only L strand, no H strand information shown! Stoneking and Nasidze (2006)

27 Human Mitochondrial DNA and the Evolution of Homo sapiens. Series: Nucleic Acids and Molecular Biology, Vol. 18. Bandelt, Hans-Jürgen; Richards, Martin; Macaulay, Vincent (Eds.). Springer-Verlag, 2006. Approx. 250 p., 31 illus., 2 in colour. Hardcover. Due: April 2006

