Representing and Solving Complex DNA Identification Cases Using Bayesian Networks Philip Dawid University College London Julia Mortera & Paola Vicard Università.


1 Representing and Solving Complex DNA Identification Cases Using Bayesian Networks Philip Dawid University College London Julia Mortera & Paola Vicard Università Roma Tre

2 FORENSIC USES FOR DNA PROFILES Murder/Rape/…: Is A the culprit? Paternity: Is A the father of B? Immigration: Is A the mother of B? How are A and B related? Disasters: 9/11, tsunami, Romanovs,…

3 Disputed Paternity: We have DNA data D from a disputed child c, its mother m and the putative father pf. If pf is not the true father tf, the true father is a “random” alternative father af. Building blocks: founder, child, query. [Diagram: pedigree with nodes c, m, pf, tf, af and a hypothesis node]

4 Disputed Paternity: We have DNA data D from a disputed child c, its mother m and the putative father pf. If pf is not the true father tf, the true father is a “random” alternative father af. LIKELIHOOD RATIO (Essen-Möller 1938). [Diagram: pedigree with nodes c, m, pf, tf, af]
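The likelihood ratio named on this slide can be written out as below. This is a sketch in the slide's own notation (D for the DNA data; tf, pf and af for the true, putative and alternative father). The formula did not survive in the transcript, so the exact form shown on the slide is an assumption; the second identity is the usual odds form of Bayes' theorem.

```latex
\mathrm{LR} \;=\; \frac{P(D \mid tf = pf)}{P(D \mid tf = af)},
\qquad
\frac{P(tf = pf \mid D)}{P(tf = af \mid D)}
\;=\; \mathrm{LR} \times \frac{P(tf = pf)}{P(tf = af)}
```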

5 MISSING DNA DATA: What if we cannot obtain DNA from the suspect (or other relevant individual)? Sometimes we can obtain indirect information by DNA profiling of relatives. But analysis is complex and subtle…

6 Disputed Paternity Case. Building blocks: founder, child, query. [Diagram: pedigree network built from founder, child and query instances, with a hypothesis node]

7 Complex Paternity Case: We have DNA from a disputed child c1 and its mother m1 but not from the putative father pf. We do have DNA from c2, an undisputed child of pf, and from her mother m2, as well as from two undisputed full brothers b1 and b2 of pf. Building blocks: founder, child, query. [Diagram: pedigree with nodes c1, m1, pf, c2, m2, b1, b2 and a hypothesis node]

8 Criminal Identification Case: A body has been found, burnt beyond recognition, but there is reason to believe it might be that of a missing criminal CR. DNA is available from the body, from the wife of CR, and from two children c1 and c2 of CR and wife. Building blocks: founder, child, query. [Diagram: pedigree with nodes body, CR, wife, c1, c2 and a hypothesis node]

9 Object-Oriented Bayesian Network (HUGIN 6): Each building block (founder / child / query) in a pedigree can be an INSTANCE of a generic CLASS network, which can itself have further structure. The pedigree is built up using simple mouse clicks to insert new nodes/instances and connect them up. Genotype data are entered and propagated using simple mouse clicks.

10 Under the microscope… Each CLASS is itself a Bayesian Network, with internal structure. Recursive: can contain instances of further class networks. Communication via input and output nodes.

11 Marker VWA (Austro-German population allele frequencies)
Allele   Frequency
12       0.0003
13       0.0018
14       0.1009
15       0.1004
16       0.1949
17       0.2834
18       0.2162
19       0.0866
20       0.0137
21       0.0015
22       0.0003
Single-marker analysis (multiply LR’s across markers)
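The "multiply LR's across markers" step can be sketched as a plain product. This assumes the markers are treated as independent given each hypothesis, which the slide implies but does not spell out; the function name and the per-marker values below are hypothetical and not taken from the case.

```python
import math


def combined_lr(marker_lrs):
    """Multiply single-marker likelihood ratios into one overall LR."""
    return math.prod(marker_lrs)


# Hypothetical per-marker LRs, for illustration only (not from the slides).
example_lrs = [4.0, 2.5, 10.0]
print(combined_lr(example_lrs))  # 100.0
```

A single marker with LR = 0 drives the whole product to zero, which is why the later slides on mutation and silent alleles matter for apparent incompatibilities.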

12 Lowest Level Building Blocks. gene: STR MARKER having associated repertory of alleles together with their frequencies. mendel: MENDELIAN SEGREGATION, child’s gene copies paternal or maternal gene, according to outcome of fair coin flip. genotype: GENOTYPE consisting of maximum and minimum of paternal and maternal genes.
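As a rough illustration of the mendel and genotype blocks, the sketch below enumerates a child's genotype distribution from two parents' gene pairs. The function names mirror the slide's class names, but the code is an assumed stand-in written for this transcript, not the HUGIN networks themselves.

```python
from collections import defaultdict


def genotype(pg, mg):
    """Genotype as the (minimum, maximum) of the paternal and maternal genes."""
    return (min(pg, mg), max(pg, mg))


def mendel(gene_pair):
    """Mendelian segregation: either of the parent's genes is passed on with probability 1/2."""
    return [(g, 0.5) for g in gene_pair]


def child_genotype_dist(father, mother):
    """Distribution of the child's genotype given both parents' gene pairs."""
    dist = defaultdict(float)
    for pg, p_pat in mendel(father):
        for mg, p_mat in mendel(mother):
            dist[genotype(pg, mg)] += p_pat * p_mat
    return dict(dist)


print(child_genotype_dist((13, 13), (12, 15)))
# {(12, 13): 0.5, (13, 15): 0.5}
```

Taking the minimum and maximum discards which gene came from which parent, which matches what an STR profile actually reports.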

13 founder: FOUNDER INDIVIDUAL represented by a pair of genes pgin and mgin (instances of gene) sampled independently from population distribution, and combined in instance gt of genotype.

14 child: CHILD INDIVIDUAL, paternal [maternal] gene selected by instances fmeiosis [mmeiosis] of mendel from father’s [mother’s] two genes, and combined in instance cgt of genotype.

15 query: QUERY INDIVIDUAL. Choice of true father’s paternal gene tfpg [maternal gene mfpg] as either that of f1 or that of f2, according as tf=f1? is true or false.

16 Complex Paternity Case: Measurements for 12 DNA markers on all 6 individuals. Enter data, “propagate” through system. Overall Likelihood Ratio in favour of paternity: 1300. [Diagram: pedigree network built from founder, child and query instances, with a hypothesis node]

17 MORE COMPLEX DNA CASES: Mutation; Silent/missed alleles,…; Mixed crime stains (rape, scuffle); Multiple perpetrators and stains; Database search; Contamination, laboratory errors; …

18 MUTATION: mendel + appropriate network mut to describe mutation process

19 e.g. proportional mutation: Prob(otherg) ~ mutation rate, or build other, more realistic, models. [Diagram labels: founder, mut]
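One reading of the proportional mutation model is sketched below: with probability equal to the mutation rate, the transmitted gene is replaced by a fresh draw from the population (founder) distribution. That reading of the otherg switch is an assumption, as is allowing the fresh draw to coincide with the original allele; the function name and the three-allele marker are hypothetical.

```python
def proportional_mutation(freqs, rate):
    """Return T[a][b] = P(transmitted allele b | parental allele a)."""
    return {
        a: {b: (1.0 - rate) * (a == b) + rate * fb for b, fb in freqs.items()}
        for a in freqs
    }


# Hypothetical three-allele marker, for illustration only.
freqs = {1: 0.2, 2: 0.3, 3: 0.5}
T = proportional_mutation(freqs, rate=0.005)

# Each row is a probability distribution over transmitted alleles.
for a, row in T.items():
    print(a, round(sum(row.values()), 12))
```

Under this assumed form the population frequencies are left unchanged from one generation to the next, which is one reason such a model is convenient even if it is not the most realistic.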

20 SILENT ALLELES (unobserved + inherited): Code by additional allele (99), e.g. 5 = 5/5 or 5/s. [Diagram labels: gene, genotype]

21 MISSED ALLELES (unobserved + non-inherited). [Diagram labels: genotype, geneobs]

22 COMBINATION: Can combine any or all of above features (and others), by using all appropriate subnetworks. Can use any desired pedigree network, with no visible difference at top level. Simply enter data (and desired parameter values) and propagate…

23 Effect of accounting for silent allele: simple paternity testing; paternity testing with additional measured individuals

24 Marker VWA (Austro-German population allele frequencies)
Allele   Frequency
12       0.0003
13       0.0018
14       0.1009
15       0.1004
16       0.1949
17       0.2834
18       0.2162
19       0.0866
20       0.0137
21       0.0015
22       0.0003

25 Simple paternity testing – allowing for silent alleles

26 Paternal incompatibility: mgt = 12/20, pfgt = 13, cgt = 12; p12 = 0.0003 (rare allele)
pr(silent)   LR     LR with mutation ~ 0.005
0            0      3.8
0.000015     26     30
0.0001       125    127
0.001        203

27 Maternal incompatibility: mgt = 16, pfgt = 18, cgt = 18. The mother must have passed a silent allele to the child, who must have inherited allele 18 from his father.
pr(silent)   LR
0            Impossible
0.000015     4.6
0.0001       4.6
0.001        4.6
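The silent-allele likelihood ratios on the two incompatibility slides can be checked by brute-force enumeration of the founder, child and query blocks for a single marker. This is a sketch under stated assumptions, not the HUGIN network: the VWA values are read from the Marker VWA slide as allele/frequency pairs, the non-silent frequencies are rescaled by (1 - pr(silent)), no mutation is modelled, and all function names are hypothetical.

```python
from itertools import product

# VWA allele frequencies, read from the Marker VWA slide as allele: frequency.
VWA = {12: 0.0003, 13: 0.0018, 14: 0.1009, 15: 0.1004, 16: 0.1949, 17: 0.2834,
       18: 0.2162, 19: 0.0866, 20: 0.0137, 21: 0.0015, 22: 0.0003}
SILENT = "s"  # stands in for the extra "silent" allele code


def gene_dist(freqs, p_silent):
    """Population gene distribution including a silent allele (assumed rescaling)."""
    dist = {a: f * (1.0 - p_silent) for a, f in freqs.items()}
    if p_silent > 0:
        dist[SILENT] = p_silent
    return dist


def observed(g1, g2):
    """Observed genotype: the set of non-silent alleles, so 13/13 and 13/s look alike."""
    return frozenset(g for g in (g1, g2) if g != SILENT)


def founders(dist, obs=None):
    """Ordered gene pairs with prior probability, optionally matching an observation."""
    return [((a, b), pa * pb)
            for (a, pa), (b, pb) in product(dist.items(), repeat=2)
            if obs is None or observed(a, b) == obs]


def paternity_lr(freqs, p_silent, mgt, pfgt, cgt):
    """Single-marker LR for tf = pf versus tf = a random alternative father."""
    dist = gene_dist(freqs, p_silent)
    c_obs = frozenset(cgt)
    mothers = founders(dist, frozenset(mgt))
    putative = founders(dist, frozenset(pfgt))
    everyone = founders(dist)

    def p_child(father, mother):
        # Each parent passes either of its genes with probability 1/2.
        return sum(0.25 for pg in father for mg in mother
                   if observed(pg, mg) == c_obs)

    num = den = 0.0
    for (m, pm), (pf, ppf) in product(mothers, putative):
        num += pm * ppf * p_child(pf, m)
        den += pm * ppf * sum(paf * p_child(af, m) for af, paf in everyone)
    return num / den


# Paternal incompatibility: mgt = 12/20, pfgt = 13, cgt = 12.
print(round(paternity_lr(VWA, 0.0001, (12, 20), (13,), (12,))))  # 125
# Maternal incompatibility: mgt = 16, pfgt = 18, cgt = 18.
print(round(paternity_lr(VWA, 0.001, (16,), (18,), (18,)), 1))   # 4.6
```

With pr(silent) = 0 the maternal case has zero probability under both hypotheses, so this sketch raises ZeroDivisionError where the slide says "Impossible".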

28 Paternity testing

29 Paternity testing with brother too

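30 Overall likelihood ratio is LR = LR_D × LR_B, where LR_D = P(D | tf = pf) / P(D | tf = af) and D denotes data on triplet (pf, c, m). The additional information carried by the brother’s data B is LR_B = P(B | D, tf = pf) / P(B | D, tf = af).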

31 Incompatible triplet: mgt = 12/15, pfgt = 14, cgt = 12; p22 = 0.0003
                     LR_B for brother genotype B =
p(silent)   LR_D     16/20   12/14   14      22*
0           0        1       0.55    1       3334
0.000015    0.5      1       0.55    1.00    1595
0.0001      2.5      1       0.55    1.00    404
0.001       7.5      1       0.55    1.00    46
* Maximum LR overall is 1027, at p(silent) = 0.0000642

32 Compatible triplet: mgt = 12/15, pfgt = 13, cgt = 12/13
                     LR_B for brother genotype B =
p(silent)   LR_D     13      13/16   21/22   22
0           556      1       1       1       1
0.000015    555      1       1.00    1       0.51
0.0001      528      1       1.02    1       0.52
0.001       410      1       1.11    1       0.61

33 Extensions: Estimation of mutation rates from paternity data; Peak area data (mixtures, contamination, low copy number)

34 Network to estimate mutation rate

35 Mixed crime trace
Marker   Alleles          Peak area (RFUs)
D8       10, 11, 14       6416, 383, 5659
D18      13, 16, 17       38985, 1914, 1991
D21      59, 65, 67, 70   1226, 1434, 8816, 8894
Suspect alleles in yellow. Excerpt of data on 6 markers from Evett et al. (1998).

36 Mixed crime trace – alleles only

37 Mixed crime trace – peak areas

38 Mixed crime trace
Marker   Alleles          Peak area
D8       10, 11, 14       6416, 383, 5659
D18      13, 16, 17       38985, 1914, 1991
D21      59, 65, 67, 70   1226, 1434, 8816, 8894
+ 3 more…
LR (alleles only): 25,000
LR (peak areas too): 170,000,000

39 Thanks to: Steffen Lauritzen, Robert Cowell and The Leverhulme Trust

