Random Match Probability Statistics

Presentation transcript:

Random Match Probability Statistics From single source to three person mixtures with allelic drop out

Statistics: “There are three kinds of lies: lies, damned lies, and statistics.” –Benjamin Disraeli, British Prime Minister, as popularized by Mark Twain. 18.7% of all statistics are made up. My introduction to forensic statistics… It had been a loooooong time since sophomore genetics.

Heterozygote: Alleles P and Q. Could be PQ or it could be QP, so… $2pq$, where $p$ is the frequency of P and $q$ is the frequency of Q. If $p = 0.2$ and $q = 0.15$, then $2(0.2)(0.15) = 0.06$. Most of us understood this pretty quickly.

Homozygote: Allele P, above the stochastic threshold. So… $p \times p$, or $p^2$. But there’s that $\Theta$ business. Most of us understood this pretty quickly too.

Homozygote: You don’t use $p^2$ (I understood that). Use $p^2 + p(1-p)\Theta$ (I didn’t understand this). Where did $\Theta$ come from? “It’s the inbreeding coefficient.”

Homozygote: OK, but where did $p^2 + p(1-p)\Theta$ come from? “It’s the correction factor for inbreeding.” Not so helpful. Why isn’t it just $p^2 - \Theta$?

Homozygote
We start with what we thought: $p^2$
But some percentage is from inbreeding: $\Theta p$
Correct for that amount of inbreeding: $(1-\Theta)p^2$
Combine them: $\Theta p + (1-\Theta)p^2$

Homozygote
Now it’s algebra:
$\Theta p + (1-\Theta)p^2$ (inbred $p$ + non-inbred $p^2$)
$\Theta p + p^2 - \Theta p^2$ (expand the terms)
$p^2 + \Theta p - \Theta p^2$ (we like to see the $p^2$ term first)
$p^2 + p(\Theta - \Theta p)$ (pull out $p$)
$p^2 + p(1-p)\Theta$ (pull out $\Theta$ to get the final form)
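
The algebra can also be checked numerically. The Python sketch below is only an illustration: the function names and the sample values p = 0.2 and Θ = 0.01 are assumptions, not values from the slides.

```python
def homozygote_starting_form(p: float, theta: float) -> float:
    """Inbred share (theta * p) plus non-inbred share ((1 - theta) * p^2)."""
    return theta * p + (1 - theta) * p ** 2


def homozygote_final_form(p: float, theta: float) -> float:
    """Final form after the algebra: p^2 + p(1 - p) * theta."""
    return p ** 2 + p * (1 - p) * theta


# Hypothetical values chosen only for illustration.
p, theta = 0.2, 0.01
print(homozygote_starting_form(p, theta))
print(homozygote_final_form(p, theta))
```

Both calls should print the same value up to floating-point rounding, because the two expressions are the same polynomial written two ways.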

Single source stat: Do the $2pq$ calculation at each heterozygous locus. Do $p^2 + p(1-p)\Theta$ at each homozygous locus. Then multiply the results for all loci.
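
As a rough sketch of that recipe in Python: the locus names, allele labels, frequencies, and the default Θ = 0.01 below are all hypothetical, and the function names are made up for illustration rather than taken from any lab's software.

```python
from math import prod


def locus_frequency(genotype, freqs, theta):
    """2pq for a heterozygote, p^2 + p(1 - p) * theta for a homozygote."""
    a, b = genotype
    if a == b:
        p = freqs[a]
        return p ** 2 + p * (1 - p) * theta
    return 2 * freqs[a] * freqs[b]


def single_source_rmp(profile, freq_tables, theta=0.01):
    """Multiply the per-locus genotype frequencies across all loci."""
    return prod(
        locus_frequency(genotype, freq_tables[locus], theta)
        for locus, genotype in profile.items()
    )


# Hypothetical two-locus profile and allele frequencies (assumed values).
freq_tables = {"LocusA": {"P": 0.20, "Q": 0.15}, "LocusB": {"P": 0.20}}
profile = {"LocusA": ("P", "Q"), "LocusB": ("P", "P")}

rmp = single_source_rmp(profile, freq_tables)
print(rmp)
```

Here LocusA contributes 2(0.2)(0.15) = 0.06 and LocusB contributes the Θ-corrected homozygote term, and the two are multiplied together.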

Partial single source stat What if you don’t detect everything from a single contributor? Consistent with one contributor, but obvious there is a lot of drop out

Partial single source stat [Figure: partial profile with loci labeled “No result,” “Drop out,” and “??”]

With a sample like this, would you: Inconclusive data; Exclude only; Exclude or “inc a person”; Exclude/include, no stat; Exclude/include, stat for 2 allele loci; Exclude/include for all loci with something detected; Other

Partial single source stat Heterozygous loci still 2pq

Partial single source stat What about loci that you don’t know about?

Partial single source stat Any person that is a 9.3 could be the source How to calculate 9.3, Any?

Partial single source stat: The 9.3 could be a homozygote, so $p^2 + p(1-p)\Theta$ covers that. But the 9.3 could be a heterozygote with any other allele, so $2pq$, but what is $q$?

Partial single source stat
You could go to the ladder and do $2pq$, where $p$ is the 9.3:
$q$ = 4, so $2(f_{9.3})(f_{4})$
$q$ = 5, so $2(f_{9.3})(f_{5})$
$q$ = 6, so $2(f_{9.3})(f_{6})$
…
$q$ = 13.3, so $2(f_{9.3})(f_{13.3})$
Then add them up. But what about off-ladder alleles, microvariants, etc.? How do you do $2pq$ for those?

Partial single source stat
Instead, if $p$ is what you see (or detect), then $q$ must be what you don’t see (or detect). Since this is a binary system:
(what you see/detect) + (what you don’t) = 1.0
(what you don’t see) = 1 – (what you see/detect)
So $q = (1-p)$. Therefore $2pq$ becomes $2p(1-p)$.

Partial single source stat: Now just combine the homozygote and heterozygote options ($p = f_{9.3}$): $[p^2 + p(1-p)\Theta] + [2p(1-p)]$ for anyone with a 9.3.
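
Summing $2pq$ over every other allele gives the same value as $2p(1-p)$ whenever the allele frequencies at the locus sum to 1, which is why the shortcut also covers off-ladder alleles. A minimal Python sketch follows; the frequency table is hypothetical (only the fact that it sums to 1 matters), and Θ = 0.01 and the function name are assumptions for illustration.

```python
def allele_any(p: float, theta: float = 0.01) -> float:
    """'Allele, Any': homozygote term plus heterozygote-with-anything term."""
    homozygote = p ** 2 + p * (1 - p) * theta
    heterozygote_with_any = 2 * p * (1 - p)
    return homozygote + heterozygote_with_any


# Hypothetical allele frequencies at one locus; they sum to 1.
freqs = {"6": 0.20, "7": 0.15, "9": 0.10, "9.3": 0.30, "other": 0.25}
p = freqs["9.3"]

# Long way: add up 2pq for every allele q that is not the 9.3.
ladder_sum = sum(2 * p * q for allele, q in freqs.items() if allele != "9.3")

# Shortcut: q is simply everything that is not p.
shortcut = 2 * p * (1 - p)

print(ladder_sum, shortcut)  # the two agree up to floating-point rounding
print(allele_any(p))
```

The "other" bucket stands in for anything not on the ladder, so nothing has to be looked up allele by allele.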

Partial single source stat What about loci that look like homozygotes? Use your PHR and stochastic threshold studies If you treat a locus as a homozygote, you better be above your stochastic threshold When in doubt, use Allele, Any – you’re covered At USACIL, Allele, Any = “modified” RMP

Partial single source stat: The “2p” rule, Section 5.2.1.3 – SWGDAM. 5.2.1.3. For single-allele profiles where the zygosity is in question (e.g., it falls below the stochastic threshold): 5.2.1.3.1. The formula $2p$, as described in recommendation 4.1 of NRCII, may be applied to this result. 5.2.1.3.2. Instead of using $2p$, the algebraically identical formulae $2p - p^2$ and $p^2 + 2p(1-p)$ may be used to address this situation without double-counting the proportion of homozygotes in the population.

Partial single source stat: $2p$ is an extremely conservative approximation. There is a better way: $2p - p^2$, or $p^2 + 2p(1-p)$. But this is even better: $p^2 + p(1-p)\Theta + 2p(1-p)$ (computers can calculate anything).

Partial single source stat
“Algebraically identical formulae,” with $f_{9.3} = 0.3054$:
$2p - p^2 = 2(0.3054) - (0.3054)^2 = 0.6108 - 0.09327 = 0.5175$
$p^2 + 2p(1-p) = (0.3054)^2 + 2(0.3054)(1 - 0.3054) = 0.09327 + 0.6108(0.6946) = 0.09327 + 0.42426 = 0.5175$

Partial single source stat
So for 9.3, Any:
$2p = 0.6108$
$2p - p^2 = 0.5175$
$p^2 + 2p(1-p) = 0.5175$
$p^2 + p(1-p)\Theta + 2p(1-p) = 0.5197$
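
These four numbers can be reproduced with a few lines of Python. The slides do not state the Θ used; Θ = 0.01 is an assumption here, chosen because it reproduces the 0.5197 figure. The variable names are illustrative only.

```python
p = 0.3054     # frequency of the 9.3 allele, from the slide
theta = 0.01   # assumed; the slides do not state the value used

two_p = 2 * p
two_p_minus_p2 = 2 * p - p ** 2
p2_plus_het = p ** 2 + 2 * p * (1 - p)
with_theta = p ** 2 + p * (1 - p) * theta + 2 * p * (1 - p)

for label, value in [
    ("2p", two_p),
    ("2p - p^2", two_p_minus_p2),
    ("p^2 + 2p(1-p)", p2_plus_het),
    ("p^2 + p(1-p)theta + 2p(1-p)", with_theta),
]:
    print(f"{label:30s} {value:.4f}")
```

The middle two lines match because they are the same expression, and the Θ term adds only about 0.002 on top of that.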

Minor contributor stat

When the minor is probative, would you: Inconclusive data; Exclude only; Exclude or “inc a person”; Exclude/include, no stat; Exclude/include, stat for some allele loci; Exclude/include for all loci; Other

Minor contributor stat For our purposes, it is an intimate sample from known female contributor Female is major Major would have a single source stat But isn’t probative Focus on the minor (or foreign) contributor

Minor contributor stat: Situations you need to be able to calculate: (1) when you know the minor type; (2) when you are concerned about drop out; (3) when you are not concerned about drop out, but you don’t know the minor type (masking/sharing); (4) when you do not see any minor alleles, but still think the minor contributor is represented. We haven’t discussed the last two yet.

Minor contributor stat: When you know the minor type. 10, 11: $2pq = 2(f_{10})(f_{11})$. 6, 9.3: $2(f_{6})(f_{9.3})$.

Minor contributor stat: When you are concerned about drop out. 24, Any: $p^2 + p(1-p)\Theta + 2p(1-p) = (f_{24})^2 + (f_{24})(1-f_{24})\Theta + 2(f_{24})(1-f_{24})$

Minor contributor stat: When you are not concerned about drop out, but don’t know the minor type. What types are possible? 9, 9; 8, 9; 9, 11. “Combo stat”

Minor contributor stat: “Combo stat.” The 9 is above the stochastic threshold: 9, 9; 8, 9; 9, 11. Add them up: $[p^2 + p(1-p)\Theta] + 2pq + 2pr = (f_{9})^2 + (f_{9})(1-f_{9})\Theta + 2(f_{8})(f_{9}) + 2(f_{9})(f_{11})$
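
The combo stat is just a sum over the genotypes that survive deconvolution, which is easy to sketch in Python. The allele frequencies, Θ = 0.01, and the function names below are hypothetical and only serve the illustration.

```python
def genotype_frequency(a, b, freqs, theta=0.01):
    """Theta-corrected homozygote frequency, or 2pq for a heterozygote."""
    if a == b:
        p = freqs[a]
        return p ** 2 + p * (1 - p) * theta
    return 2 * freqs[a] * freqs[b]


def combo_stat(genotypes, freqs, theta=0.01):
    """Sum the frequencies of every genotype still included at the locus."""
    return sum(genotype_frequency(a, b, freqs, theta) for a, b in genotypes)


# Hypothetical frequencies for alleles 8, 9 and 11 (assumed values).
freqs = {"8": 0.10, "9": 0.20, "11": 0.05}
included = [("9", "9"), ("8", "9"), ("9", "11")]

locus_stat = combo_stat(included, freqs)
print(locus_stat)
```

Restricting or widening the stat is then only a matter of which genotypes are placed in the `included` list.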

Minor contributor stat: Section 5.2.2 - SWGDAM. 5.2.2. When the interpretation is conditioned upon the assumption of a particular number of contributors greater than one, the RMP is the sum of the individual frequencies for the genotypes included following a mixture deconvolution. Examples are provided below. 5.2.2.1. In a sperm fraction mixture (at a locus having alleles P, Q, and R) assumed to be from two contributors, one of whom is the victim (having genotype QR), the sperm contributor genotypes included post-deconvolution might be PP, PQ, and PR. In this case, the RMP for the sperm DNA contributor could be calculated as $[p^2 + p(1-p)\Theta] + 2pq + 2pr$.

Minor contributor stat

Minor contributor stat No minor alleles present, but you know the minor is contributing Every other locus has minor alleles Did the enzyme just get lazy? “Just inc the locus for stats” That doesn’t make any more sense than throwing out any other locus You just need the right calculator

Minor contributor stat Two scenarios to consider No stochastic concerns Stochastic concerns Two slightly different stats, but can deal with both

Minor contributor stat: No stochastic concerns. In some cases, PHR and P may help: 17, 17 or possibly 16, 17; maybe not 16, 16. But you know the minor must be 16, 16 or 16, 17 or 17, 17. This is the “combo” stat: $[p^2 + p(1-p)\Theta] + 2pq + [q^2 + q(1-q)\Theta]$

Minor contributor stat Couple more definitions: “Unrestricted” RMP The “combo” stat where we used all possibilities 16,16 and 16,17 and 17,17 from previous slide “Restricted” RMP The “combo” stat where we chose not to use one (or more) possible types based on what fits peak heights, peak height ratios, or proportions of contributors 17,17 or 16,17 but not 16,16 from previous slide

Minor contributor stat: What if stochastic concerns? You would take anyone with 16, Any or 17, Any. But that has the 16, 17 counted twice. Subtract 16, 17, but only once! $[(p^2 + p(1-p)\Theta) + 2p(1-p)] + [(q^2 + q(1-q)\Theta) + 2q(1-q)] - 2pq$
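
One way to convince yourself that subtracting $2pq$ once is right is to brute-force the sum over every genotype that carries a 16 or a 17. The Python sketch below does that with a hypothetical four-allele frequency table that sums to 1; the frequencies, Θ = 0.01, and the function names are assumptions for illustration.

```python
from itertools import combinations_with_replacement


def homozygote(p, theta):
    """Theta-corrected homozygote frequency: p^2 + p(1 - p) * theta."""
    return p ** 2 + p * (1 - p) * theta


def double_any(p, q, theta):
    """'16, Any' plus '17, Any', minus the 16, 17 heterozygote counted twice."""
    any_p = homozygote(p, theta) + 2 * p * (1 - p)
    any_q = homozygote(q, theta) + 2 * q * (1 - q)
    return any_p + any_q - 2 * p * q


# Hypothetical allele frequencies at the locus; they sum to 1.
freqs = {"15": 0.30, "16": 0.20, "17": 0.25, "18": 0.25}
theta = 0.01

# Brute force: add up every genotype that contains a 16 or a 17.
brute_force = 0.0
for a, b in combinations_with_replacement(freqs, 2):
    if a in ("16", "17") or b in ("16", "17"):
        if a == b:
            brute_force += homozygote(freqs[a], theta)
        else:
            brute_force += 2 * freqs[a] * freqs[b]

formula = double_any(freqs["16"], freqs["17"], theta)
print(brute_force, formula)  # equal up to floating-point rounding
```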

Modified random match probability
Let’s look at this “double any” calculation:
$(p^2 + p(1-p)\Theta) + 2p(1-p) + (q^2 + q(1-q)\Theta) + 2q(1-q) - 2pq$
Simplify by removing $\Theta$:
$p^2 + 2p(1-p) + q^2 + 2q(1-q) - 2pq$
This is the basis for dealing with any number of “Allele, Any” contributors. USACIL calls this a modified RMP because “Anys” are involved.

Modified random match probability Let’s say we’ve got a two contributor mixture with signs that both contributors are having stochastic issues. But what you see is consistent with two contributors Remember “Take a stand on the stand….” Validation studies, interpretation guidelines, your experience, Tech Review agrees…

Modified random match probability: We’ll start with this same pattern, but with stochastic concerns. Homozygote threshold, mixture interpretation threshold, stochastic threshold, drop out threshold: let’s just call it the “Danger Zone.” Why do I always think of “Top Gun” when I have low peak heights? (We’re not suggesting that you MUST do this - only that you can calculate it.) [Figure: allele 16 at peak height 230, allele 17 at peak height 260]

Modified random match probability: Remember the “Allele, Any”: $2pq = 2p(1-p)$, that is, 2 × (what you do see) × (what you don’t see). (We used it for a single allele below the stochastic threshold for a partial or minor contributor.) Because we have two contributors, it could be 16, Any or 17, Any, or both.

Modified random match probability: Also, remember the “combo stat” for the combinations you can see: $p^2 + 2pq + q^2$. We’ll rearrange this in a minute.

Modified random match probability: Allele, Any for $p$ (16): 2(what you see)(what you don’t), so $2p(1-?)$. You “see” two alleles now, both $p$ and $q$ (16 and 17). Stick with “1 – what you see” for what you don’t see: $2p(1-(p+q))$ for $p$ (16). Same thing for $q$ (17): $2q(1-(p+q))$.

Modified random match probability
So, the obvious combinations, “Combo” for visible: $p^2 + 2pq + q^2$
The “Allele, Any” combinations:
Allele, Any for the 16: $2p(1-(p+q))$
Allele, Any for the 17: $2q(1-(p+q))$
Add them up: $p^2 + 2pq + q^2 + 2p(1-(p+q)) + 2q(1-(p+q))$

Modified random match probability
Here is the formula for multiple Allele, Any:
$p^2 + 2pq + q^2 + 2p(1-(p+q)) + 2q(1-(p+q))$
Now we rearrange that first part:
$p^2 + 2pq + q^2$
$(p+q) \times (p+q)$
$(p+q)^2$
That last line should look familiar.

Modified random match probability
Remember back in the good old days? CPI stat:
For two alleles: $(p+q)^2$
For three alleles: $(p+q+r)^2$
…
For nine alleles: $(p+q+r+s+t+u+v+w+x)^2$

Modified random match probability
Two ways to think about Allele, Any:
The way we derived it for that minor contributor (remember we dropped $\Theta$ for this one): $[p^2 + 2p(1-p)] + [q^2 + 2q(1-q)] - 2pq$
The way that works for as many contributors as we may need (CPI math is the foundation for this one, and it doesn’t use $\Theta$): $(p+q)^2 + 2p(1-(p+q)) + 2q(1-(p+q))$
They are equivalent.

Modified random match probability
Expand this one (“Double” Allele, Any – duplicate):
$p^2 + 2p(1-p) + q^2 + 2q(1-q) - 2pq$
To get:
$p^2 + 2p - 2p^2 + q^2 + 2q - 2q^2 - 2pq$
Rearrange the terms:
$p^2 + q^2 + 2p + 2q - 2pq - 2p^2 - 2q^2$

Modified random match probability
Now expand the other one (Multiple Allele, Any):
$(p+q)^2 + 2p(1-(p+q)) + 2q(1-(p+q))$
To get:
$p^2 + 2pq + q^2 + 2p - 2p^2 - 2pq + 2q - 2q^2 - 2pq$
Rearrange the terms:
$p^2 + q^2 + 2p + 2q + 2pq - 2pq - 2pq - 2p^2 - 2q^2$
Condense the $2pq$ terms:
$p^2 + q^2 + 2p + 2q - 2pq - 2p^2 - 2q^2$

Modified random match probability
Now compare them:
This was the “single source” one (2 slides ago): $p^2 + q^2 + 2p + 2q - 2pq - 2p^2 - 2q^2$
This is the “generic” form for multiple contributors (previous slide): $p^2 + q^2 + 2p + 2q - 2pq - 2p^2 - 2q^2$
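
The same comparison can be run numerically instead of by hand. This Python sketch evaluates both forms (with Θ dropped, as in the slides) over a small grid of hypothetical allele frequencies; the function names and grid values are assumptions for illustration.

```python
def double_allele_any(p, q):
    """Minor-contributor form: two 'Allele, Any' terms minus the shared 2pq."""
    return (p ** 2 + 2 * p * (1 - p)) + (q ** 2 + 2 * q * (1 - q)) - 2 * p * q


def multiple_allele_any(p, q):
    """CPI-based form: (p + q)^2 plus an 'Any' term for each detected allele."""
    seen = p + q
    return seen ** 2 + 2 * p * (1 - seen) + 2 * q * (1 - seen)


# Hypothetical frequency pairs with p + q <= 1.
grid = [(0.05, 0.10), (0.20, 0.15), (0.30, 0.30), (0.45, 0.50)]
largest_gap = max(
    abs(double_allele_any(p, q) - multiple_allele_any(p, q)) for p, q in grid
)
print(largest_gap)  # effectively zero: the two forms agree
```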

Modified random match probability: Section 5.2.2.3 - SWGDAM. 5.2.2.3. In a mixture having at a locus alleles P, Q, and R, assumed to be from two contributors, where all three alleles are below the stochastic threshold, the interpretation may be that the two contributors could be a heterozygote-homozygote pairing where all alleles were detected, a heterozygote-heterozygote pairing where all alleles were detected, or a heterozygote-heterozygote pairing where a fourth allele might have dropped out. In this case, the RMP must account for all heterozygotes and homozygotes represented by these three alleles, but also all heterozygotes that include one of the detected alleles. The RMP for this interpretation could be calculated as $(2p - p^2) + (2q - q^2) + (2r - r^2) - 2pq - 2pr - 2qr$. 5.2.2.3.1. Since $2p$ includes $2pq$ and $2pr$, $2q$ includes $2pq$ and $2qr$, and $2r$ includes $2pr$ and $2rq$, the formula in 5.2.2.3 subtracts $2pq$, $2pr$, and $2qr$ to avoid double-counting these genotype frequencies.
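
As a short check, the three-allele SWGDAM formula and the "multiple Allele, Any" form built from CPI math reduce to the same expression. The shorthand $S = p + q + r$ is introduced here only for the derivation and is not notation from the slides or from SWGDAM.

```latex
\begin{align*}
(2p - p^2) + (2q - q^2) + (2r - r^2) - 2pq - 2pr - 2qr
  &= 2(p + q + r) - (p^2 + q^2 + r^2 + 2pq + 2pr + 2qr) \\
  &= 2S - S^2 \\[4pt]
(p + q + r)^2 + 2p(1 - S) + 2q(1 - S) + 2r(1 - S)
  &= S^2 + 2S(1 - S) \\
  &= 2S - S^2
\end{align*}
```

Both routes give $2S - S^2$, so the subtraction of the pairwise terms in 5.2.2.3 and the CPI-plus-Any construction are two ways of writing one number.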

Modified random match probability To use RMP you must state the number of contributors Validation studies Experience Yadda, yadda Now that we know how to deal with drop out via Allele, Any, we can use RMP more often Modified RMP (modified denotes “Anys”) This is the language we use at our lab

CPI compared to RMP But CPI is NOT the same as RMP CPI is used when you are unsure about the number of contributors Consequently, you have problems when you have alleles in the stochastic range – “Danger Zone” If you don’t know how many contributors you have, you don’t know how many alleles are missing

CPI compared to RMP But we can use the CPI math in our RMP stat We must make two changes to the “base” CPI formula that we use in the RMP 1. We must correct for situations that change the number of contributors 2. We must account for allelic drop out We’ve been through that second, so let’s deal with the first

CPI compared to RMP: Consider a four allele pattern. We interpret the overall profile as having two contributors. CPI considers all possible “visible” combinations of contributors: $(p+q+r+s)^2$. This includes the P, P; Q, Q; R, R; and S, S types.

CPI compared to RMP But if you think you could have a P, P contributor, that leaves three alleles left We stated that there were only 2 contributors If Contributor #1 is P, P Contributor #2 cannot account for Q, R and S alleles Having a homozygote changes the assumption of the number of contributors

CPI compared to RMP So all we need to do is subtract the homozygotes – but only when the presence of a homozygote changes the number of contributors 2 contributors and 4 alleles detected 3 contributors and 6 alleles detected

CPI compared to RMP
Easy to do with a friendly computer. USACIL defines this as an “Unrestricted” RMP. We kind of think of it as a CPI stat corrected for a defined number of contributors:
$(p+q+r+s)^2 - p^2 - q^2 - r^2 - s^2$
$(p+q+r+s+t+u)^2 - p^2 - q^2 - r^2 - s^2 - t^2 - u^2$
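
Subtracting the homozygote terms from the CPI square leaves exactly the heterozygote combinations of the detected alleles, which a short Python sketch can confirm. The four allele frequencies and the function name are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def unrestricted_rmp_no_homozygotes(freqs):
    """CPI square minus every homozygote term: (sum f)^2 - sum(f^2)."""
    return sum(freqs) ** 2 - sum(f ** 2 for f in freqs)


# Hypothetical frequencies for four detected alleles P, Q, R, S.
freqs = [0.10, 0.20, 0.15, 0.05]

by_formula = unrestricted_rmp_no_homozygotes(freqs)

# Same quantity built up directly: 2ab for every pair of different alleles.
by_pairs = sum(2 * a * b for a, b in combinations(freqs, 2))

print(by_formula, by_pairs)  # equal up to floating-point rounding
```

The same function handles the six-allele, three-contributor case by passing six frequencies.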

Unrestricted RMP: Section 5.2.2.6 - SWGDAM. 5.2.2.6. The unrestricted RMP might be calculated for mixtures that display no indications of allelic dropout. The formulae include an assumption of the number of contributors, but relative peak height information is not utilized. For two-person mixtures, the formulae for loci displaying one, two, or three alleles are identical to the CPI calculation discussed in section 5.3. For loci displaying four alleles (P, Q, R, and S), homozygous genotypes would not typically be included. The unrestricted RMP in this case would require the subtraction for homozygote genotype frequencies, e.g., $(p+q+r+s)^2 - p^2 - q^2 - r^2 - s^2$.

Modified random match probability Same thing for our “Allele, Any” situation No need to consider an “Allele, Any” if it changes the number of contributors It doesn’t matter how many alleles are below your stochastic threshold If you say there are 2 contributors and you detect 4 alleles, by definition there are no alleles missing Similar for 3 contributors and 6 alleles detected

Modified random match probability
About as bad as it can get: 3 contributors, all alleles are in the Danger Zone, and each allele could be missing its sister allele.
$(p+q+r+s+t)^2 + 2p(1-(p+q+r+s+t)) + 2q(1-(p+q+r+s+t)) + 2r(1-(p+q+r+s+t)) + 2s(1-(p+q+r+s+t)) + 2t(1-(p+q+r+s+t))$
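
Written for any number of detected alleles, the calculation is short. The Python sketch below uses five hypothetical allele frequencies and an illustrative function name, and cross-checks the result against the equivalent "everyone except people carrying no detected allele" form, 1 − (1 − S)², where S is the sum of the detected frequencies.

```python
def modified_rmp(freqs):
    """(sum f)^2 plus an 'Allele, Any' term 2f(1 - sum f) for each allele."""
    seen = sum(freqs)
    return seen ** 2 + sum(2 * f * (1 - seen) for f in freqs)


# Hypothetical frequencies for five detected alleles P, Q, R, S, T.
freqs = [0.10, 0.05, 0.20, 0.15, 0.10]

stat = modified_rmp(freqs)

# Cross-check: exclude only genotypes made entirely of undetected alleles.
seen = sum(freqs)
cross_check = 1 - (1 - seen) ** 2

print(stat, cross_check)  # equal up to floating-point rounding
```

With these assumed frequencies the stat covers most of the population at this locus, which is a numerical reminder of how little weight such a locus carries.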

Modified random match probability GIANT DISCLAIMER!! We are not saying that you can charge ahead and now use any profile of any number of people with any number of alleles dropping out if you just use a modified RMP calculation Bad data is bad data It’s science, not Voodoo