Workshop Biostatistics in relationship testing ESWG-meeting 2014 Daniel Kling Norwegian Institute of Public Health and Norwegian University of Life Sciences,

Slides:



Advertisements
Similar presentations
Forensic DNA Inference ICFIS 2008 Lausanne, Switzerland Mark W Perlin, PhD, MD, PhD Joseph B Kadane, PhD Robin W Cotton, PhD Cybergenetics ©
Advertisements

Attaching statistical weight to DNA test results 1.Single source samples 2.Relatives 3.Substructure 4.Error rates 5.Mixtures/allelic drop out 6.Database.
How TrueAllele ® Works (Part 3) Kinship, Paternity and Missing Persons Cybergenetics Webinar December, 2014 Mark W Perlin, PhD, MD, PhD Cybergenetics,
Single Nucleotide Polymorphism And Association Studies Stat 115 Dec 12, 2006.
SNP Applications statwww.epfl.ch/davison/teaching/Microarrays/snp.ppt.
Forensics and DNA Statistics Harry R Erwin, PhD CIS308 Faculty of Applied Sciences University of Sunderland.
1 Statistical genetics and genetical statistics Thore Egeland, Rikshospitalet and Section of Medical Statistics Joint work with P. Mostad, NR, B. Olaisen,
Bayesian inference “Very much lies in the posterior distribution” Bayesian definition of sufficiency: A statistic T (x 1, …, x n ) is sufficient for 
Gene Frequency and LINKAGE Gregory Kovriga & Alex Ratt.
METHODS FOR HAPLOTYPE RECONSTRUCTION
Tutorial #5 by Ma’ayan Fishelson. Input Format of Superlink There are 2 input files: –The locus file describes the loci being analyzed and parameters.
Beyond Null Hypothesis Testing Supplementary Statistical Techniques.
Hardy-Weinberg Equilibrium
Basics of Linkage Analysis
Joint Linkage and Linkage Disequilibrium Mapping
Kinship Analysis in Immigration Cases Charles H. Brenner, Ph.D. consulting in forensic mathematics Oakland, California
Fundamentals of Forensic DNA Typing Slides prepared by John M. Butler June 2009 Appendix 3 Probability and Statistics.
DATA ANALYSIS Module Code: CA660 Lecture Block 2.
IGES 2003 How many markers are necessary to infer correct familial relationships in follow-up studies? Silvano Presciuttini 1,3, Chiara Toni 2, Fabio Marroni.
Inbreeding. inbreeding coefficient F – probability that given alleles are identical by descent - note: homozygotes may arise in population from unrelated.
Thoughts about the TDT. Contribution of TDT: Finding Genes for 3 Complex Diseases PPAR-gamma in Type 2 diabetes Altshuler et al. Nat Genet 26:76-80, 2000.
Tutorial #5 by Ma’ayan Fishelson Changes made by Anna Tzemach.
Genetic Statistic Application in Forensic Science Arthur J. Eisenberg, PhD Professor and Chairman Department of Forensic and Investigative Genetics Co-Director.
Title: Population Genetics 12th February 2014
New Technologies Y-STR DNA Pedigree (7 generations)
Forensic Statistics From the ground up…. Basics Interpretation Hardy-Weinberg equations Random Match Probability Likelihood Ratio Substructure.
11.4 Hardy-Wineberg Equilibrium. Equation - used to predict genotype frequencies in a population Predicted genotype frequencies are compared with Actual.
Lecture 5: Segregation Analysis I Date: 9/10/02  Counting number of genotypes, mating types  Segregation analysis: dominant, codominant, estimating segregation.
Population Genetics is the study of the genetic
Chapter 7 Population Genetics. Introduction Genes act on individuals and flow through families. The forces that determine gene frequencies act at the.
DNA evidence The DNA Double Helix Consists of so-called nucleobases always in pairs A-T, C-G. One part of the pair is inherited from the mother, the other.
Confidence intervals and hypothesis testing Petter Mostad
Lecture 19: Association Studies II Date: 10/29/02  Finish case-control  TDT  Relative Risk.
Joint Linkage and Linkage Disequilibrium Mapping Key Reference Li, Q., and R. L. Wu, 2009 A multilocus model for constructing a linkage disequilibrium.
1 Haplotyping Algorithms Qunyuan Zhang Division of Statistical Genomics GEMS Course M Computational Statistical Genetics Mar. 29,
Lecture 13: Linkage Analysis VI Date: 10/08/02  Complex models  Pedigrees  Elston-Stewart Algorithm  Lander-Green Algorithm.
Lecture 12: Linkage Analysis V Date: 10/03/02  Least squares  An EM algorithm  Simulated distribution  Marker coverage and density.
Tutorial #10 by Ma’ayan Fishelson. Classical Method of Linkage Analysis The classical method was parametric linkage analysis  the Lod-score method. This.
Lecture 15: Linkage Analysis VII
1 B-b B-B B-b b-b Lecture 2 - Segregation Analysis 1/15/04 Biomath 207B / Biostat 237 / HG 207B.
Allele Frequencies: Staying Constant Chapter 14. What is Allele Frequency? How frequent any allele is in a given population: –Within one race –Within.
Using Familias and FamLink Athens 29 May, Familias 10 w w.umb.no NORWEGIAN UNIVERSITY OF LIFE SCIENCES.
Errors in Genetic Data Gonçalo Abecasis. Errors in Genetic Data Pedigree Errors Genotyping Errors Phenotyping Errors.
Statistical weights of single source DNA profiles Forensic Bioinformatics ( Dan E. Krane, Wright State University, Dayton, OH Forensic.
Practical With Merlin Gonçalo Abecasis. MERLIN Website Reference FAQ Source.
Chapter 23: Evaluation of the Strength of Forensic DNA Profiling Results.
Lab 4: Inbreeding and Kinship. Inbreeding Reduces heterozygosity Does not change allele frequencies.
Individual Identity and Population Assignment Lab. 8 Date: 10/17/2012.
Chapter 9: Introduction to the t statistic. The t Statistic The t statistic allows researchers to use sample data to test hypotheses about an unknown.
Seventh Annual Prescriptions for Criminal Justice Forensics Program Fordham University School of Law June 3, 2016 DNA Panel.
Lecture 17: Model-Free Linkage Analysis Date: 10/17/02  IBD and IBS  IBD and linkage  Fully Informative Sib Pair Analysis  Sib Pair Analysis with Missing.
Lecture 15: Individual Identity and Forensics October 17, 2011.
Gonçalo Abecasis and Janis Wigginton University of Michigan, Ann Arbor
ESWG paper challenges – A walkthrough Athen, May 29, 2014
Validating TrueAllele® genotyping on ten contributor DNA mixtures
Population Genetics: Selection and mutation as mechanisms of evolution
Measuring Evolution of Populations
Recombination (Crossing Over)
PLANT BIOTECHNOLOGY & GENETIC ENGINEERING (3 CREDIT HOURS)
Population stratification
Basic concepts on population genetics
Population Genetics & Hardy - Weinberg
Familias and Forensic genetics Thore.
DNA Identification: Inclusion Genotype and LR
Hardy Weinberg: Population Genetics
Hardy-Weinberg Equillibrium
Association Analysis Spotted history
Linkage Analysis Problems
X-chromosomal markers and FamLinkX
Presentation transcript:

Workshop Biostatistics in relationship testing ESWG-meeting 2014 Daniel Kling Norwegian Institute of Public Health and Norwegian University of Life Sciences, Norway Andreas Tillmar National Board of Forensic Medicine, Sweden

Outline of this first session Likelihood ratio principle Theoretical: paternity issues –Calculation of LR “by hand”. –Accounting for: Mutations Null/silent alleles Linkage and linkage disequilibrium (allelic association) Co-ancestry

Is Mike the father of Brian? DNA-testing to solve relationship issues DNA-data vWAD12S391D21S11…. Mike:12,1322,2231,32.2…. Mother of Brian:12,1423,23.230,30…. Brian:13,1422,2330,31…. Can we use the information from the DNA-data to answer the question? YES! We estimate the “weight of evidence” by biostatistical calculations. How much more (or less) probable is it that M is the father of B, compared to not being the father of B? We will follow the recommendations given by Gjertson et al., 2007 “ISFG: Recommendations on biostatistics in paternity testing.”

H 1 : The alleged father is the father of the child H 2 : The alleged father is unrelated to the child We wish to find the probability for H 1 to be true, given the evidence = Pr(H 1 |E) Via Baye’s theorem: Posterior oddsLR (or PI)Prior odds Today we focus on the likelihood ratio (LR) Calculating the “Weight of evidence” in a case with DNA-data

LR= C MAF Founders Child Transmission probabilities Mendelian factor (1, 0.5, µ, etc) Genotype probabilities 2*p i *p j *(1-Fst) Fst*p i + (1-Fst)*p i *p i 2*p i *p j p i *p i Assuming HWE GCGC GMGM G AF

C MAF G M =a,bG AF =c,d G C =a,c C MAF G M =a,b G AF =c,d G C =a,c Paternity trio – An example

LR= = = = Paternity trio – An example Probability to observe a certain allele in the population (e.g. population frequency)

C MAF G M =10,11G AF =13,14.2 G C =10,14.2 C MAF G M =10,11 G AF =13,14.2 G C =10,14.2 LR= Paternity trio – real data

C MAF G M =10,11G AF =13,15.2 G C =10,14.2 Paternity trio - Mutation Mutation?

Mutations Brinkmann et al (1998) found 23 mutations in 10,844 parent/child offsprings. Out of these 22 were single step and 1 were two-step mutations. AABB summary report covering ~4000 mutations showed approx 90-95% single step, 5-10% two-step and 1% or less more than two steps. Mutation rate depend on marker, sex (female/male), age of individual, allele size Note that these are the observed mutation rate, not the actual mutation rate (hidden mutations, assumption of mutation with the smallest step. “The possibility of mutation shall be taken into account whenever a genetic inconsistency is observed”

Mutations cont Approaches to calculate LR accounting for mutations: LR~(µ tot *adj_steps)/p(paternal allele) e.g “stepwise mutation model” LR~µ/A (A=average PE) LR~µ LR~µ/p(paternal allele) µ tot µ +1 µ -1 µ +2 µ +3 µ -2 µ Allele: µ tot =µ +1 +µ -1 +µ +2 +µ -2+ …..

C MAF G M =10,11G AF =13,15.2 G C =10,14.2 “Stepwise decreasing with range” 50% for 15.2 to be transmitted Total mutation rate 90% of mutations are 1 step 50% are loss of fragment size Mutation model decreasing with range Paternity trio - Mutation

C MAF G M =10,11G AF =13,15.2 G C =10,14.2 C MAF G M =10,11 G AF =13,15.2 G C =10,14.2 LR= Paternity trio - Mutation

C MAF G M =10,11G AF =17.2 G C =10 G AF =17.2,17.2 G AF =17.2, s G C =10,10 G C =10,s Paternity trio – Silent allele,s

Null/silent allele “STR null alleles mainly occur when a primer in the PCR reaction fails to hybridize to the template DNA...” “Laboratories should try to resolve this situation with different pairs of STR primers” “…no data from which to estimate the null allele frequency exists, consider a generous bracket of plausible values for the frequency and compute a corresponding range of values for the PI. If the resulting range of case-specific PI values are all large (as large as a laboratory’s threshold value for issuing non-exclusion paternity reports), then report that the PI is ‘‘at least greater than the smallest value’’ in the range” Impact of null/silent allele may vary on the pedigree and the genotype constellations

C MAF G M =10,11G AF =17.2 G C =10 C MAF G M =10,11 G AF =17.2 G C =10 LR= G AF =17.2,17.2 G AF =17.2, s G C =10,10 G C =10,s Paternity trio – Silent allele

Linkage and Linkage disequilibrium Linkage –Can be described as the co-segregation of closely located loci within a family or pedigree. –Effects the transmission probabilities! Linkage disequilibrium (LD) –Allelic association. –Two alleles (at two different markers) which is observed more often/less often than can be expected. –Effects the founder genotype probabilities, not the transmission probabilities!

Mother:16,1821,24 Chr3:1Chr3: Chr3:1Chr3: Distance R, Theta Marker 1Marker 2 If LD, different probabilities for each phase If linkage, different probability of transmission

Mother:16,1821,24 Child1:16,2521,29 Child2:16,2224,30 Chr3:1Chr3:2 25 paternal 29 paternal Chr3:1Chr3: Chr3:1Chr3:2 22 paternal 30 paternal Chr3:1Chr3:

Linkage and LD when it comes to forensic DNA-markers? 14 pair of autosomal STRs < 50cM –Phillips et al 2012 –vWA-D12S391 (around 10cM, i.e. 10% recomb rate) Linkage will have an effect on LR for some pedigrees –Gill et al 2012, O’Connor & Tillmar 2012, Kling et al, –Incest. –Sibling, uncle-nephew. –Impact of linkage will depend on Pedigree Markers typed Individuals’ DNA data LE for vWA-D12S391! –Gill et al 2012, and O’Connor et al 2011.

Linkage cont. Assuming LE (no allelic association) –Gill et al 2012: "...linkage has no effect at all in a pedigree unless: at least one individual in the pedigree (the central individual) is involved in at least two transmissions of genetic material…, and that individual is a double heterozygote at the loci in question." Thus… –For standard paternity cases (father vs unrelated), linkage has no impact on LR –BUT for paternity vs uncle (or other close related alternative), linkage has impact on LR

Presenting the evidence Likelihood ratio (LR) or Paternity index (PI) Posterior probability (must set priors) Simplifies to LR/LR+1, assuming two hypotheses with equal priors L i =Likelihood under hypothesis i π i =Prior probability for hypothesis i

Subpopulation correction 24  What is subpopulation effects?  First developed by Wright in (1965)  Balding (1994)  Hardy Weinberg equilibrium  How to estimate it?  Reasonable values ( )  Sampling formula (Fst = θ)

Example 1. Genotype probabilities 25  HWE  Homozygouts: p(A)p(A)  Heterozygots: 2p(A)p(B)  HWD  Homozygouts: Fst*p(A) + (1-Fst)*p(A)p(A)  Heterozygouts: (1-Fst)*p(A)p(B)

Example 2. Random Match Prob. 26  H 1 : The profiles G1 and G2 are from the same individual  H 2 : The profiles G1 and G2 are from different individuals Let G1=A,A and G2=A,A  HWE:  HWD:

Example 3. Paternity 27 H 1 : The alleged father (AF) is the real father H 2 : AF and the child are unrelated. p(A) = 0.05

28

29

Example 4. Siblings 30 H0: Two persons are full siblings H1: The same two persons are unrelated  The two persons are homozygous A,A (p(A)=0.05)  We use the “IBD method”

Example 4. Siblings 31

Example 5. General relationships 32  We have N number of founder genotypes  If Fst!=0 these are not independent  We use the sampling formula to calculate updated allele frequencies  It is more complex to do by hand  In simple pairwise relationships use the IBD method (See Example 4)  Otherwise, use validated software!