1 Evaluating an ‘off-the-shelf’ POS-tagger on Early Modern German text
Silke Scheible, Richard Jason Whitt, Martin Durrell, and Paul Bennett
The GerManC project
School of Languages, Linguistics, and Cultures
University of Manchester (UK)

2 Overview
Motivation
The GerManC corpus
POS-tagger and tagset
Challenges
Results

5 Motivation
Goal:
– POS-tagged version of GerManC corpus
Problems:
– No specialised tagger available for Early Modern German (EMG)
– Limited funds: manual annotation not feasible for whole corpus
Question:
– How well does an ‘off-the-shelf’ tagger for modern German perform on Early Modern German data?

8 Motivation
Tagger evaluation requires gold standard data
Idea:
– Develop gold-standard subcorpus of GerManC
– Use subcorpus to test and adapt modern NLP tools
– Create historical text processing pipeline
Results useful for other small humanities-based projects wishing to add POS annotations to EMG data

10 The GerManC corpus
Purpose: studies of development and standardisation of German language
Texts published between 1650 and 1800
Sample corpus (2,000 words per text)
Total corpus size: ca. 1 million words
Aims to be “representative”

11 The GerManC corpus
Eight genres
Orally-oriented: Dramas, Newspapers, Letters, Sermons
Print-oriented: Narrative prose, Humanities texts, Science & medicine texts, Legal texts

12 The GerManC corpus
Three periods

13 The GerManC corpus
Five regions
– North German
– West Central German
– East Central German
– West Upper German
– East Upper German

14 The GerManC corpus
Three 2,000-word files per genre/period/region
Total size: ca. 1 million words

15 Gold-standard subcorpus: GerManC-GS
One 2,000-word file per genre and period from North German region
→ 24 files, > 50,000 tokens
Annotated by two historical linguists
Gold standard POS tags, lemmas, and normalised word forms
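The sampling design can be sketched as a grid of genre/period/region cells. The snippet below only counts files. The genre and region names come from the slides, but the period labels are placeholders, since the transcript does not name the three periods.

```python
from itertools import product

GENRES = [
    "Dramas", "Newspapers", "Letters", "Sermons",
    "Narrative prose", "Humanities texts",
    "Science & medicine texts", "Legal texts",
]
REGIONS = [
    "North German", "West Central German", "East Central German",
    "West Upper German", "East Upper German",
]
# Placeholder labels: the slides do not name the three periods.
PERIODS = ["period 1", "period 2", "period 3"]

FILES_PER_CELL = 3  # "three 2,000-word files per genre/period/region"


def corpus_cells():
    """Every genre/period/region combination in the full corpus design."""
    return list(product(GENRES, PERIODS, REGIONS))


def gold_standard_cells():
    """GerManC-GS takes one file per genre and period, North German only."""
    return [cell for cell in corpus_cells() if cell[2] == "North German"]


full_files = len(corpus_cells()) * FILES_PER_CELL
gs_files = len(gold_standard_cells())  # one file per cell

print(full_files, gs_files)
```

The gold-standard count matches the 24 files stated on the slide.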

16 POS-tagger
TreeTagger (Schmid, 1994)
Statistical, decision tree-based POS tagger
Parameter file for modern German supplied with the tagger
Trained on German newspaper corpus
STTS tagset

17 STTS-EMG
1. PIAT (merged with PIDAT): Indefinite determiner, as in ‘viele solche Bemerkungen’ (‘many such remarks’)

18 STTS-EMG
2. NA: Adjectives used as nouns, as in ‘der Gesandte’ (‘the ambassador’)

19 STTS-EMG
3. PAVREL: Pronominal adverb used as relative, as in ‘die Puppe, damit sie spielt’ (‘the doll with which she plays’)
4. PTKREL: Indeclinable relative particle, as in ‘die Fälle, so aus Schwachheit entstehen’ (‘the cases which arise from weakness’)

20 STTS-EMG
5. PWAVREL: Interrogative adverb used as relative, as in ‘der Zaun, worüber sie springt’ (‘the fence over which she jumps’)
6. PWREL: Interrogative pronoun used as relative, as in ‘etwas, was er sieht’ (‘something which he sees’)

21 POS-tagging in GerManC-GS
New categories account for 2% of all tokens
Inter-annotator agreement (IAA) on POS-tagging task: 91.6%
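Both figures on this slide are simple ratios over token-level tags. The sketch below assumes IAA means plain percentage agreement (the slides do not say which measure was used) and that "new categories" means the five tags NA, PAVREL, PTKREL, PWAVREL and PWREL. The two tag sequences are invented toy data, not corpus material.

```python
# Assumption: the "new categories" are the five tags added in STTS-EMG.
NEW_TAGS = {"NA", "PAVREL", "PTKREL", "PWAVREL", "PWREL"}


def percent_agreement(tags_a, tags_b):
    """Share of tokens (in %) on which two annotators chose the same tag."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("need two non-empty tag sequences of equal length")
    same = sum(a == b for a, b in zip(tags_a, tags_b))
    return 100.0 * same / len(tags_a)


def new_tag_share(tags):
    """Share of tokens (in %) carrying one of the new STTS-EMG tags."""
    if not tags:
        raise ValueError("need a non-empty tag sequence")
    return 100.0 * sum(tag in NEW_TAGS for tag in tags) / len(tags)


# Invented toy annotations for five tokens.
annotator_a = ["ART", "NA", "VVFIN", "PTKREL", "ADV"]
annotator_b = ["ART", "NN", "VVFIN", "PTKREL", "ADV"]

print(percent_agreement(annotator_a, annotator_b))
print(new_tag_share(annotator_a))
```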

22 Challenges: Tokenisation issues
Clitics:
– hastu: hast du (‘have you’)
– wirstu: wirst du (‘will you’)

25 Challenges: Tokenisation issues
Clitics:
– has|tu: hast du (‘have you’)
– wirs|tu: wirst du (‘will you’)
Multi-word tokens:
– obgleich/KOUS vs. ob/KOUS gleich/ADV (‘even though’)
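The clitic split shown here (has|tu, wirs|tu) can be approximated with a single pattern. This is only a sketch under a strong assumption: that any token ending in "s" + "tu" is a verb with an attached second-person pronoun. The slides do not describe how the split was actually made, and a rule like this would over-split other words with that ending.

```python
import re

# Hypothetical rule: a stem ending in "s" followed by the clitic "tu".
CLITIC_TU = re.compile(r"^(\w+s)(tu)$", re.IGNORECASE)


def split_clitic(token):
    """Return [stem, clitic] for forms like 'hastu', else [token]."""
    match = CLITIC_TU.match(token)
    if match:
        return [match.group(1), match.group(2)]
    return [token]


def mark_split(token):
    """Render the split with '|', mirroring the notation on the slide."""
    return "|".join(split_clitic(token))


print(mark_split("hastu"))
print(mark_split("wirstu"))
print(mark_split("Komet"))
```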

26 Challenges: Spelling variation
Spelling not standardised:
– Comet → Komet
– auff → auf
– nachdeme → nachdem
– kompt → kommt
– Bothenbrodt → Botenbrot
– differiret → differiert
– beßer → besser
– kehme → käme
– trucken → trockenen
– gepressett → gepreßt
– büxen → Büchsen
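A normalisation layer of this kind can be pictured as a mapping from attested spellings to modern forms. The sketch below uses a plain lookup table filled with the pairs from this slide. In GerManC-GS the normalised forms were assigned manually by annotators, so the table and the `normalise` helper are illustrative assumptions only.

```python
# Variant -> modern form, copied from the examples on the slide.
NORMALISED = {
    "Comet": "Komet",
    "auff": "auf",
    "nachdeme": "nachdem",
    "kompt": "kommt",
    "Bothenbrodt": "Botenbrot",
    "differiret": "differiert",
    "beßer": "besser",
    "kehme": "käme",
    "trucken": "trockenen",
    "gepressett": "gepreßt",
    "büxen": "Büchsen",
}


def normalise(tokens):
    """Replace known variants; leave every other token unchanged."""
    return [NORMALISED.get(token, token) for token in tokens]


print(normalise(["Comet", "auff", "der", "büxen"]))
```

An exact-match table cannot handle unseen variants, which is one reason the slides list automated normalisation as future work.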

27 Challenges: Spelling variation
All spelling variants in GerManC-GS normalised to a modern standard
→ Assess what effect spelling variation has on the performance of automatic tools
→ Help improve automated processing?
Important for:
– Automatic tools (POS tagger!)
– Accurate corpus search

28 Challenges: Spelling variation
[Figure: proportion of normalised word tokens plotted against time]
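The quantity plotted here can be computed per text as the fraction of tokens whose normalised form differs from the original spelling. The sketch below assumes each token is stored as an (original, normalised) pair, and the example text is invented for illustration.

```python
def normalised_proportion(pairs):
    """Fraction of tokens whose normalised form differs from the original."""
    if not pairs:
        return 0.0
    changed = sum(1 for original, normalised in pairs if original != normalised)
    return changed / len(pairs)


# Invented four-token text: two tokens needed normalisation.
text = [
    ("Der", "Der"),
    ("Comet", "Komet"),
    ("kompt", "kommt"),
    ("bald", "bald"),
]

print(normalised_proportion(text))
```

Computing this value for each file and ordering the files by publication date would give the kind of curve the slide describes.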

29 Questions
What is the “off-the-shelf” performance of the TreeTagger on historical data from the EMG period?
Can the results be improved by running the tool on normalised data?

30 Results
TreeTagger accuracy on original vs. normalised input:
            Original data    Normalised data
Accuracy    69.6%            79.7%
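Accuracy here is the share of tokens whose tag matches the gold standard, measured once with the original spellings as tagger input and once with the normalised forms. A minimal sketch follows. The tag sequences are invented toy data, and only the two percentages at the end are taken from the slide.

```python
def accuracy(gold, predicted):
    """Percentage of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two non-empty tag sequences of equal length")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * correct / len(gold)


# Invented toy example: four tokens, tagged from both input versions.
gold = ["ART", "NN", "VVFIN", "ADV"]
tags_from_original = ["ART", "NE", "ADJD", "ADV"]
tags_from_normalised = ["ART", "NN", "VVFIN", "ADV"]

print(accuracy(gold, tags_from_original))
print(accuracy(gold, tags_from_normalised))

# Gain reported on the slide, in percentage points.
reported_gain = round(79.7 - 69.6, 1)
print(reported_gain)
```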

31 Improvement through normalisation over time
[Figure: tagger performance plotted against publication date]

32 Effects of spelling normalisation on POS tagger performance
[Table: for normalised tokens, effect of using original (O) / normalised (N) input on tagger accuracy; +: correctly tagged; -: incorrectly tagged]
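A breakdown of this kind can be tallied by crossing, for each normalised token, whether the tag obtained from the original input (O) and the tag obtained from the normalised input (N) match the gold tag. The sketch below assumes rows of (gold, tag on O, tag on N). The rows are invented, and the key format is just one way to mirror the slide's +/- notation.

```python
from collections import Counter


def normalisation_effect(rows):
    """Count +O/+N, -O/+N, +O/-N and -O/-N outcomes for normalised tokens.

    Each row is (gold_tag, tag_from_original, tag_from_normalised).
    """
    counts = Counter()
    for gold, tag_o, tag_n in rows:
        o_mark = "+" if tag_o == gold else "-"
        n_mark = "+" if tag_n == gold else "-"
        counts[f"{o_mark}O/{n_mark}N"] += 1
    return counts


# Invented rows for five normalised tokens.
rows = [
    ("NN", "NE", "NN"),           # normalisation fixes the tag
    ("VVFIN", "ADJD", "VVFIN"),   # normalisation fixes the tag
    ("APPR", "APPR", "APPR"),     # correct either way
    ("ADJA", "NN", "NN"),         # wrong either way
    ("ADV", "ADV", "ADJD"),       # normalisation hurts
]

effect = normalisation_effect(rows)
print(dict(effect))
```

Only the -O/+N cell represents tokens where normalisation actually helps the tagger.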

35 Comparison with “modern” results
Performance of TreeTagger on modern data: ca. 97% (Schmid, 1995)
Current results seem low
But:
– Modern accuracy figure: evaluation of tagger on the text type it was developed on (newspaper text)
– IAA higher for modern German (98.6%)

36 Conclusion
Substantial amount of manual post-editing required
Normalisation layer can improve results by about 10 percentage points (69.6% to 79.7%), but so far only half of all normalisation annotations have a positive effect

37 Future work
Adapt normalisation scheme to account for more cases
Automate normalisation (Jurish, 2010)
Retrain state-of-the-art POS taggers → Evaluation?
Provide detailed information about annotation quality to research community

38 Thank you!

