Linking Dictionary and Corpus Adam Kilgarriff Lexicography MasterClass Ltd Lexical Computing Ltd University of Sussex UK.


1 Linking Dictionary and Corpus
Adam Kilgarriff
Lexicography MasterClass Ltd
Lexical Computing Ltd
University of Sussex, UK

2 Observation
- Corpus: arbitrary sample
- Dictionary: systematic account
- Children encounter arbitrary samples, develop systematic account

3 Corpus
- Provisional, dispensable
- Used to develop lexicon
Paper
- Reviews WSD (word sense disambiguation) work
- Presents new model for linkage

4 WSD
- SEMCOR
  - Based on WordNet
  - All corpus words tagged with WordNet senses
  - Widely used in WSD
  - Standard model
- “Putting the dictionary into the corpus”

5 Disadvantages
- Dictionary enriches corpus
- Corpus is not dispensable
- Wrong way round
- We want to put the corpus into the dictionary
- Make richer dictionaries (like children), not richer corpora

6 Levels of abstraction
- Direct linkage: fragile
  - Updates (to C or D) break links
- Dictionary: abstract
- Corpus: raw
- Intermediate level needed
[Diagram: Corpus linked directly to Dictionary]

7 How most WSD works
- Analyse dictionary to give set of collocates
- Match to collocates in a corpus
- Dispensable corpus
[Diagram: Corpus and Dictionary linked through an intermediate Collocates level]
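
The two steps on this slide can be illustrated with a minimal sketch. It assumes the dictionary analysis has already produced one collocate set per sense; the sense labels, the collocate sets, and the function name `disambiguate` are all hypothetical and chosen only for illustration. Real WSD systems weight collocates and use far richer context than plain set overlap.

```python
# Hypothetical toy sense inventory: sense label -> collocates derived from a dictionary entry.
SENSE_COLLOCATES = {
    "bank/finance": {"money", "account", "loan", "deposit"},
    "bank/river": {"river", "water", "fishing", "shore"},
}


def disambiguate(context_words, sense_collocates):
    """Return the sense whose collocates overlap most with the context.

    Returns None when no sense shares any collocate with the context.
    Ties go to the sense listed first in the inventory.
    """
    context = {word.lower() for word in context_words}
    scores = {
        sense: len(collocates & context)
        for sense, collocates in sense_collocates.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


sentence = "She opened an account at the bank to deposit money".split()
chosen = disambiguate(sentence, SENSE_COLLOCATES)
```

Note that the corpus sentence is used once and then discarded: only the sense label survives, which is the "dispensable corpus" point made on the slide.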

8 Word Sketch http://sketchengine.co.uk/bnc

9 Not just collocates
- Triples
  - Parse the corpus
- Some “unary relations”
  - I hear him singing
- Domain-based clues
- Collocates, Constructions, Domains = CoCoDo
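
One way to picture the three kinds of clue is as a single record type. The sketch below is an assumption about how such data might be held, not the Word Sketch Engine's actual representation: the `Clue` class, the relation names (`object`, `np_ving`), and the input format (already-parsed tuples) are all hypothetical. It treats triples as collocate clues and unary relations as construction clues, which is one reading of the slide.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clue:
    kind: str             # "collocate", "construction", or "domain"
    relation: str         # grammatical relation, construction name, or "domain"
    value: Optional[str]  # collocate word or domain label; None for unary relations


def clues_for(headword, triples, unary_relations, domain_labels):
    """Count the clues observed for one headword in pre-parsed corpus data."""
    counts = Counter()
    # Triples: (head, relation, dependent), e.g. from a dependency parse.
    for head, relation, dependent in triples:
        if head == headword:
            counts[Clue("collocate", relation, dependent)] += 1
    # Unary relations: (head, construction), e.g. "I hear him singing".
    for head, construction in unary_relations:
        if head == headword:
            counts[Clue("construction", construction, None)] += 1
    # Domain clues: (head, domain label of the text the occurrence came from).
    for head, domain in domain_labels:
        if head == headword:
            counts[Clue("domain", "domain", domain)] += 1
    return counts


triples = [
    ("hear", "object", "him"),
    ("hear", "object", "music"),
    ("hear", "object", "music"),
    ("sing", "subject", "bird"),
]
unary = [("hear", "np_ving")]
domains = [("hear", "music")]
hear_clues = clues_for("hear", triples, unary, domains)
```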

10 Linking CoCoDos to senses
- Automatically extract CoCoDos from corpus
- How linked to senses?
  - Automatic (WSD techniques)
  - Manual
    - WASPS project
    - “dictionary-free”: ideal for new dictionaries
    - Labour costs
  - Mixed
    - WSD with manual confirmation/correction
[Diagram: Corpus and Dictionary linked through an intermediate CoCoDo level]
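
The mixed option could be sketched as a routing step: an automatic WSD component proposes a sense for each clue, and a lexicographer confirms or corrects the proposals. The slide does not say how the manual work is scheduled, so the confidence threshold here is purely an assumption, as are the function name `route_links`, the clue strings, and the scores.

```python
def route_links(proposals, threshold=0.8):
    """Split (clue, sense, confidence) proposals into accepted links and a review queue.

    The threshold is a hypothetical parameter: proposals at or above it are
    linked automatically, the rest are queued for manual confirmation/correction.
    """
    accepted = {}
    review_queue = []
    for clue, sense, confidence in proposals:
        if confidence >= threshold:
            accepted.setdefault(sense, []).append(clue)
        else:
            review_queue.append((clue, sense, confidence))
    return accepted, review_queue


# Hypothetical output of an automatic WSD step.
proposals = [
    ("object:money", "bank/finance", 0.95),
    ("pp_of:river", "bank/river", 0.90),
    ("modifier:central", "bank/river", 0.40),
]
accepted, review_queue = route_links(proposals)
```

Setting the threshold above 1.0 sends every proposal to the lexicographer, which corresponds to confirming all automatic links by hand; lowering it trades labour cost for accuracy.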

11 Output and benefits
- Dictionary directly carries corpus evidence
- Integrated dictionary and corpus
- Very rich resource for
  - Lexicographer
  - Dictionary user (learner, translator, …)
  - Machine translation
  - Natural language processing
- Sound model of D-C relations

12 Current status
- CoCoDo functions being added to Word Sketch Engine
- Proposal for WordNet
- Discussions with UK publishers

