Statistical Weights of DNA Profiles

Statistical Weights of DNA Profiles
Dan E. Krane, Wright State University, Dayton, OH Forensic Bioinformatics (

DNA statistics Coincidental 10 locus DNA profile matches are very rare
Several factors can make statistics less impressive Mixtures Incomplete information Relatives Database searches

DNA profile 3

Comparing electropherograms
EXCLUDE Evidence sample Suspect #1’s reference

Comparing electropherograms
CANNOT EXCLUDE Evidence sample Suspect #2’s reference 5

What weight should be given to DNA evidence?
Statistics do not lie. But, you have to pay close attention to the questions they are addressing.

Statistics do not lie. But, you have to pay close attention to the questions they are addressing. What is the chance that a randomly chosen, unrelated individual from a given population would have the same DNA profile observed in a sample?

Single source statistics: Random Match Probability (RMP) or “Random Man Not Excluded” (RMNE)

Statistical estimates: the product rule
2pq x Single source samples Formulae for RMNE: At a locus: Heterozygotes: Homozygotes: Multiply across all loci p2 2pq p2 9 9

Statistical estimate: Single source sample
0.1454 x 0.1097 x 2 10

1 in 608 quintillion (“less than one in one billion”)
Statistical estimate: Single source sample 6.0% 4.6% 1.2% 9.8% 9.5% 6.3% 2.2% 1.0% 2.9% 5.1% 29.9% 4.0% 1.1% 6.6% 3.2% X 0.1454 0.1097 2 x = 0.032 1 in 608 quintillion (“less than one in one billion”) 1 in 608,961,665,956,361,000,000 11

Mixture statistics: Combined Probability of Inclusion (CPI) or Likelihood Ratios (LR)

Mixed DNA samples

Put two people’s names into a mixture.

How many names can you take out?

How many contributors to a mixture?
How many contributors to a mixture if analysts can discard a locus? Maximum # of alleles observed in a 3-person mixture # of occurrences Percent of cases 2 0.00 3 78 4 4,967,034 3.39 5 93,037,010 63.49 6 48,532,037 33.12 3,398 7,274,823 112,469,398 26,788,540 0.00 4.96 76.75 18.28 There are 45,139,896 possible different 3-way mixtures of the 648 individuals in the MN BCI database. There are 146,536,159 possible different 3-person mixtures of the 959 individuals in the FB I database (Paoletti et al., November 2005 JFS).

How many contributors to a mixture?
Maximum # of alleles observed in a 4-person mixture # of occurrences Percent of cases 4 13,480 0.02 5 8,596,320 15.03 6 35,068,040 61.30 7 12,637,101 22.09 8 896,435 1.57 There are 45,139,896 possible different 3-way mixtures of the 648 individuals in the MN BCI database. There are 57,211,376 possible different 4-way mixtures of the 194 individuals in the FB I Caucasian database (Paoletti et al., November 2005 JFS). (35,022,142,001 4-person mixtures with 959 individuals.)

CPI Stats

Combined Probability of Inclusion
CPI Stats Combined Probability of Inclusion Probability that a random, unrelated person could be included as a possible contributor to a mixed profile For a mixed profile with the alleles 14, 16, 17, 18; contributors could have any of 10 genotypes: 14, , 16 14, 17 14, 18 16, 16 16, 17 16, 18 17, 17 17, 18 18, 18 Probability works out as: CPI = (p[14] + p[16] + p[17] + p[18])2 ( )2 = 0.621

1 in 1.3 million 62.1% CPI Stats 91.5% 23.5% 19.2% 40.7% 47.6% 99.0%
54.4% 61.2% 8.4% 91.6% 63.7% 8.8% 82.9% 31.1% X 62.1% 62.1% 1 in 1.3 million

Mixtures with drop out

The testing lab’s conclusions

Ignoring loci with “missing” alleles
Labs often claim that this is a “conservative” statistic Ignores potentially exculpatory information “It fails to acknowledge that choosing the omitted loci is suspect-centric and therefore prejudicial against the suspect.” Gill, et al. “DNA commission of the International Society of Forensic Genetics: Recommendations on the interpretation of mixtures.” FSI

Likelihood approaches for mixtures where allelic drop out may have occurred
Determining the rate of allelic drop-out is problematic Determining the rate of allelic drop-in is problematic Considering more than two possible contributors is computationally intensive Considering mixtures of different racial groups can be computationally intensive Contributions from different kinds of close relatives require special considerations If you presume that the drop-out or drop-in rates are very high then virtually no-one can be excluded. Determining the actual number of contributors to a mixture is difficult – the minimum number isn’t necessarily the actual number especially when drop-out rates may be high. Peak height information is not taken into consideration. Subjectivity can influence the alleles that are deemed to be present/absent. Make sure that all realistic defense hypotheses are explored.

How many names can you take out if you can use blanks?

How many names can you take out if you can use blanks?
The more blanks the harder it is to eliminate anyone’s name as possibly being in the mix.

The alternative suspect pool

Which allele frequency database should be used?
• Random match probabilities are typically generated for each of three major racial groups • Literally hundreds of alternative allele frequency databases are available • The racial background of a suspect is not relevant.

What is the relevant population?

A process of elimination
• Consider that a suspect matches an evidence sample • If he is not the source of the DNA then it must be someone else’s. Whose might it be? • Could the actual source be: Caucasian, Afro-Caribbean, or Indo-Pakistan? • If it cannot be and there is no one else in the alternative suspect pool then the suspect must be the source. Presuming that the source of an evidentiary sample must be from the same ethnic/racial group as a given suspect is in direct odds with a presumption of innocence. It is literally prejudicial. The weight of the evidence should be the same regardless of who is identified as a suspect – to the extent to which it is not it is subjective (and not objective) and not scientific.

A suspect pool D matches. It means something if we find that A, B and C are all unlikely to also match. B C A D Presuming that the source of an evidentiary sample must be from the same ethnic/racial group as a given suspect is in direct odds with a presumption of innocence. It is literally prejudicial. The weight of the evidence should be the same regardless of who is identified as a suspect – to the extent to which it is not it is subjective (and not objective) and not scientific.

Database searches

Consider cold hits UK’s National DNA Database (NDNAD)
Maintained by the Home Office Contains 6,929,946 arrested individuals as of 31 March, 2012 Assisted in 409,715 investigations (2,595 murders)

In which case is the DNA evidence most damning?
Probable Cause Case Suspect is first identified by non-DNA evidence DNA evidence is used to corroborate traditional police investigation Cold Hit Case Suspect is first identified by search of DNA database Traditional police work is no longer focus

Probable Cause Case Suspect is first identified by non-DNA evidence DNA evidence is used to corroborate traditional police investigation RMNE = 1 in 10 million Cold Hit Case Suspect is first identified by search of DNA database Traditional police work is no longer focus RMNE = 1 in 10 million

Probable Cause Case Suspect is first identified by non-DNA evidence DNA evidence is used to corroborate traditional police investigation RMNE = 1 in 10 million Cold Hit Case Suspect is first identified by search of DNA database Traditional police work is no longer focus RMNE = 1 in 10 million DMP = in 1

Familial searches Database search yields a close but imperfect DNA match Can suggest a relative is the true perpetrator UK performs them relatively rarely – a total of 29 were carried out in Reluctance to perform them in US since 1992 NRC report

Is the true DNA match a relative or a random individual?
Given a closely matching profile, who is more likely to match, a relative or a randomly chosen, unrelated individual? Use a likelihood ratio

Is the true DNA match a relative or a random individual?
This question is ultimately governed by two considerations: What is the size of the alternative suspect pool? What is an acceptable rate of false positives?

Additional (free) resources
Forensic Bioinformatics ( GenoStat® ( Eight 50-minute YouTube videos (

Statistical Weights of DNA Profiles

Similar presentations

Presentation on theme: "Statistical Weights of DNA Profiles"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Statistical Weights of DNA Profiles

Similar presentations

Presentation on theme: "Statistical Weights of DNA Profiles"— Presentation transcript:

Similar presentations

About project

Feedback