Writer identification through information retrieval Ralph Niels, Franc Grootjen & Louis Vuurpijl.


1 Writer identification through information retrieval Ralph Niels, Franc Grootjen & Louis Vuurpijl

2 A search engine for forensic experts

3 Overview
- Forensic writer identification
- Prototypical shapes in handwriting
- Information retrieval (IR)
  - Traditional
  - Writer identification using prototypes
- Experiments
  - Method
  - Results
- Conclusions & future work

4 Forensic writer identification

5 Forensic information retrieval
- Web search: query of words to search in documents containing words
- Forensic search: query of characters to search in documents containing characters
- Previous work*: sub-character level, binary features
- Based on characters: improves justification possibilities

* A. Bensefia, T. Paquet, and L. Heutte. A writer identification and verification system. Pattern Recogn. Letters, 26(13):2080–2092, 2005.

6 Forensic information retrieval
- Dictionary of character shapes: prototypes
  - Experts use prototypes
  - Describe query & documents by prototype usage
[Figure: instances of a prototype]

7 Character to prototype matcher
- Find the most similar prototype for each character
- Example labels: W48 h16 a9 t1 y2 o1 u23 d16 i25 d12 i6 s12 (…)
- Example prototypes: a5 a9 a16 a52 (…)

8 Prototypes
- Averaged shapes of real handwritten characters
- Dynamic Time Warping distance to find the most similar prototype

R. Niels, L. Vuurpijl & L. Schomaker. Automatic allograph matching in forensic writer identification. International Journal of Pattern Recognition and Artificial Intelligence, 21(1):61-81, February 2007.
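The matching step on slides 7 and 8 can be sketched as a nearest-neighbour search under a Dynamic Time Warping distance. This is a minimal textbook DTW, not the variant from the cited paper (which may use additional constraints); the names `dtw_distance` and `nearest_prototype`, and the representation of a character as a list of (x, y) pen coordinates, are assumptions made for illustration.

```python
import math

def dtw_distance(a, b):
    """Textbook DTW distance between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            cost[i][j] = d + min(cost[i - 1][j],      # stay on b[j-1]
                                 cost[i][j - 1],      # stay on a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def nearest_prototype(char_points, prototypes):
    """Return the label of the prototype with the smallest DTW distance.

    prototypes: dict mapping a label such as "a9" to a point sequence
    (hypothetical layout, not taken from the slides).
    """
    return min(prototypes,
               key=lambda label: dtw_distance(char_points, prototypes[label]))
```

Applied to every segmented character of a writer, this yields a label sequence like the one on slide 7, from which prototype usage can be counted.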

9 The IR model for writer identification
[Diagram: writer input and query input each pass through the character to prototype matcher, which uses the prototype list. Indexing turns af(w) into aw(w). Matching compares af(q) with aw(w) and outputs a ranked list plus justification.]

10 Indexing: create weighted vectors
- Vector of prototype usage for each writer: af(w)
- Adjust the weight of prototypes in that vector
  - Prototypes used by many writers are not distinctive -> lower weight
  - wf(p) = number of writers using prototype p
- Result: weighted vector of prototype use for each writer, aw(w)
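The indexing step can be sketched as follows. The slides define af(w) and wf(p) and say that widely used prototypes get a lower weight, but they do not give the weighting formula. The sketch assumes an inverse writer frequency iwf(p) = log(N / wf(p)), by analogy with inverse document frequency in text retrieval; the function name `build_index` and the input layout are also hypothetical.

```python
import math
from collections import Counter

def build_index(writer_labels):
    """Build weighted prototype vectors for a set of database writers.

    writer_labels: dict mapping writer id -> list of prototype labels
    (one label per character, e.g. ["a9", "t1", "a9"]).
    """
    # af(w): how often each writer uses each prototype
    af = {w: Counter(labels) for w, labels in writer_labels.items()}
    n_writers = len(af)

    # wf(p): number of writers using prototype p
    wf = Counter()
    for counts in af.values():
        wf.update(counts.keys())

    # ASSUMPTION: idf-style weight; a prototype used by every writer gets 0
    iwf = {p: math.log(n_writers / wf[p]) for p in wf}

    # aw(w): usage counts scaled by the prototype weight
    aw = {w: {p: c * iwf[p] for p, c in counts.items()}
          for w, counts in af.items()}
    return af, wf, iwf, aw
```

With this choice a prototype shared by all writers contributes nothing to matching, which is one way to realise "not distinctive -> lower weight".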

11 The IR model for writer identification
[Diagram repeated; af(q) is labelled: prototype frequency in query]

12 The IR model for writer identification
[Diagram repeated]

13 Matching
- Input
  - ‘Database writers’: indexed writer vectors aw(w)
  - ‘Query writer’: vector af(q)
- Match: calculate the cosine of the angle between af(q) and each aw(w)
- Output: ranked list of writers (by similarity to query)
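The matching step can be sketched with sparse vectors stored as dictionaries from prototype label to value. The names `cosine` and `rank_writers` are hypothetical; only the use of the cosine between af(q) and each aw(w) comes from the slide.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two sparse vectors (dict label -> value)."""
    dot = sum(x * v.get(p, 0.0) for p, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as matching nothing
    return dot / (norm_u * norm_v)

def rank_writers(query_vec, index):
    """Rank database writers by similarity to the query, best first.

    query_vec: af(q); index: dict writer id -> aw(w).
    """
    scores = [(w, cosine(query_vec, vec)) for w, vec in index.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

Because the cosine normalises by vector length, a query of q characters can be compared with a database entry built from a different number of characters d.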

14 The IR model for writer identification
[Diagram repeated]

15 Justification
- Similarity value (cosine of angle)
- Prototype contribution to retrieval result

16 Justification
- Forensic expert can further inspect justification
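One plausible way to compute the prototype contributions is to split the cosine into its per-prototype terms, since the cosine is a sum over shared prototypes divided by the two vector lengths. The slides do not state how contributions are defined, so this decomposition and the name `prototype_contributions` are assumptions.

```python
import math

def prototype_contributions(query_vec, writer_vec):
    """Split the cosine similarity into one term per shared prototype.

    Both arguments are sparse vectors (dict label -> value). The returned
    terms sum to the cosine of the angle between the two vectors.
    """
    norm_q = math.sqrt(sum(x * x for x in query_vec.values()))
    norm_w = math.sqrt(sum(x * x for x in writer_vec.values()))
    if norm_q == 0.0 or norm_w == 0.0:
        return []
    norm = norm_q * norm_w
    terms = {p: query_vec[p] * writer_vec[p] / norm
             for p in query_vec if p in writer_vec}
    # Largest contribution first, so the most telling prototypes lead
    return sorted(terms.items(), key=lambda t: t[1], reverse=True)
```

A list like this lets an expert see which character shapes drove a match and inspect those shapes directly.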

17 Experiment
- 43 writers from plucoll database
- Online data
- Segmented into characters
- How well does our technique perform given a certain number of characters?
  - Number of characters in database (d)
  - Number of characters in query (q)

18 Experiment
- Pick d random letters from each database writer
- Pick q random other letters from one writer, and use those as the query
- Find the most similar writer
- Vary d and q
- Repeat 10 times for each writer
- Repeat 10 times for each combination of d and q
[Diagram labels: Prototypes; iwf(p), aw(w); Matching]
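The sampling for the query writer can be sketched as drawing d + q characters without replacement and splitting them, which guarantees that the q query letters are "other" letters than the d database letters. The function name `split_sample` and the use of a seeded `random.Random` are assumptions for illustration.

```python
import random

def split_sample(chars, d, q, rng):
    """Draw d database characters and q different query characters.

    chars: all segmented characters of one writer.
    rng: a random.Random instance, so trials can be reproduced.
    Raises ValueError if the writer has fewer than d + q characters.
    """
    picked = rng.sample(chars, d + q)  # sampling without replacement
    return picked[:d], picked[d:]
```

For the other database writers only the d database characters are needed, so the same call with q = 0 would serve.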

19 Results
Correct identification (%) by number of database characters d (columns) and query characters q (rows)

q \ d    100    300    500   1000
10        59     79     83     88
30        86     97     99    100
50        94     99    100
70        96    100
?         98    100

20 Conclusions & future work
- Needed for 100%: 70 chars (q), 300 chars (d)
  - Average English sentence: 75-100 characters
- No black box: results are justified
- Online data: forensic practice?
  - Extract semi-automatically with the help of an expert
  - Use offline matching technique
- Just 43 writers
  - Bigger (n writers & n techniques) experiments planned
- Promising results

21 A search engine for forensic experts

