Word senses: a computational response. Adam Kilgarriff

1 Word senses: a computational response. Adam Kilgarriff

2 A word sense is a cluster of corpus lines (Madrid 2010, Kilgarriff: Word senses: a computational response)
- But I'm an NLP person
- Automatic clustering?
- Inspiration: Hindle 1991, Schütze 1993, Grefenstette 1993, Lin 1999
  - You can get semantic sense from corpora + stats

3 First attempt
- Longman 1994
- Abject failure
  - No grammar
  - Corpus too small and noisy
  - Naïve clustering
  - Useless programmer

4 Collocations
- Easy
  - Most words don't go with most other words
- Then build on what we can do well
- Metaphor, analogy, homonymy, rules: all much harder
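
The point that most words do not go with most other words can be made concrete with an association score over co-occurrence counts. The sketch below is an illustration only: the slide names no particular measure, so the use of logDice here is an assumption, and the word pairs are invented toy data.

```python
import math
from collections import Counter

def log_dice(pairs):
    """Score each (word, collocate) pair: 14 + log2(2 * f_xy / (f_x + f_y))."""
    pair_freq = Counter(pairs)
    word_freq = Counter(word for word, _ in pairs)
    colloc_freq = Counter(colloc for _, colloc in pairs)
    return {
        (word, colloc): 14 + math.log2(2 * f / (word_freq[word] + colloc_freq[colloc]))
        for (word, colloc), f in pair_freq.items()
    }

# Hypothetical co-occurrence observations, not taken from any real corpus.
pairs = [
    ("strong", "tea"), ("strong", "tea"), ("strong", "tea"),
    ("strong", "wind"),
    ("powerful", "computer"), ("powerful", "computer"),
    ("powerful", "tea"),
]

scores = log_dice(pairs)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

Pairs that were never observed get no score at all, which is the "easy" part: the table of attested, high-scoring pairs is small compared with all possible word pairs.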

5 Clustering
- Word sketch
  - Collocates organised by grammar
- Dictionary
  - Collocates (and other things) organised by meaning
- How to re-organise?
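
A rough picture of "collocates organised by grammar": the sketch below groups a headword's collocates by grammatical relation and ranks them by frequency. The relation names, the input tuples and the function `word_sketch` are hypothetical and chosen only for illustration; a real word sketch is built from a parsed corpus and a statistical salience score.

```python
from collections import Counter, defaultdict

def word_sketch(headword, observations, top_n=3):
    """Group the headword's collocates by grammatical relation, most frequent first."""
    by_relation = defaultdict(Counter)
    for relation, head, collocate in observations:
        if head == headword:
            by_relation[relation][collocate] += 1
    return {
        relation: [collocate for collocate, _ in counts.most_common(top_n)]
        for relation, counts in by_relation.items()
    }

# Hypothetical (relation, headword, collocate) observations.
observations = [
    ("object_of", "bank", "rob"), ("object_of", "bank", "rob"),
    ("object_of", "bank", "burst"),
    ("modifier", "bank", "river"), ("modifier", "bank", "river"),
    ("modifier", "bank", "central"),
    ("modifier", "tea", "strong"),
]

sketch = word_sketch("bank", observations)
for relation, collocates in sketch.items():
    print(relation, collocates)
```

In this toy output "rob" and "burst" sit together under one relation although they belong to different senses of "bank", which is the re-organisation problem the slide raises.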

6 Observation:
- corpus: arbitrary sample
- dictionary (= lexicon): systematic account
Children
- encounter arbitrary samples
- develop systematic account

7 Corpus
- provisional, dispensable
- used to develop lexicon

8 Levels of abstraction
- Direct linkage:
  - Fragile
  - Updates (to C or D) break links
- Dictionary: abstract
- Corpus: raw
- Intermediate level needed
[Diagram: Corpus linked directly to Dictionary]

9
- How most automatic word sense disambiguation (WSD) works
  - Analyse dictionary to give set of collocates
  - Match to collocates in a corpus
- Dispensable corpus
[Diagram: Corpus and Dictionary linked via Collocates]
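
The two steps on this slide can be sketched as a simple overlap match: each sense carries a set of collocates derived from the dictionary, and a corpus line is assigned to the sense whose collocates it shares most. This is a deliberately simplified illustration, not a description of any specific WSD system; the sense labels and collocate sets are invented.

```python
def disambiguate(context_words, sense_collocates):
    """Return the sense whose collocate set overlaps the context most, or None."""
    context = set(context_words)
    best_sense, best_overlap = None, 0
    for sense, collocates in sense_collocates.items():
        overlap = len(context & collocates)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Hypothetical collocate sets, as if extracted from dictionary entries.
sense_collocates = {
    "bank/river": {"river", "fishing", "muddy"},
    "bank/finance": {"money", "account", "loan"},
}

line = "she sat on the river bank fishing".split()
print(disambiguate(line, sense_collocates))
```

Because the match runs through collocates, the corpus lines themselves need not be stored with the dictionary, which is the sense in which the corpus is dispensable.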

10 Not just collocates
- triples
- parse the corpus
- some "unary relations"
  - I hear him singing
- domain-based clues
Collocates, Constructions, Domains = CoCoDo
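
One way to picture a CoCoDo is as a small record that covers all three kinds of clue. The sketch below is an assumption about how such records might be represented, not the author's data model: the class `CoCoDo`, the function `clues_for`, the relation names and the domain label are all hypothetical. A unary relation is modelled as a relation with no second word.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CoCoDo:
    kind: str                    # "collocate", "construction" or "domain"
    headword: str
    relation: str
    value: Optional[str] = None  # None for unary relations

def clues_for(headword, parse_edges, domain=None):
    """Turn parser output for one sentence into CoCoDo clues for the headword."""
    clues = []
    for relation, head, dependent in parse_edges:
        if head != headword:
            continue
        kind = "construction" if dependent is None else "collocate"
        clues.append(CoCoDo(kind, headword, relation, dependent))
    if domain is not None:
        clues.append(CoCoDo("domain", headword, "domain", domain))
    return clues

# Hypothetical parse of "I hear him singing": two triples and one unary relation.
edges = [
    ("subject", "hear", "I"),
    ("object", "hear", "him"),
    ("np_ving", "hear", None),
]

clues = clues_for("hear", edges, domain="music")
for clue in clues:
    print(clue)
```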

11 Linking CoCoDos to senses
- Automatically extract CoCoDos from corpus
- How linked to senses?
  - Automatic (WSD techniques)
  - Manual
    - "dictionary-free": ideal for new dictionaries
    - Labour costs
  - Mixed
    - WSD with manual confirmation/correction
[Diagram: Corpus and Dictionary linked via CoCoDos]

12 Semi-automatic dictionary drafting (SADD)
- CoCoDo database
- Automatic clustering
- Lexicographer input
- More clustering
- Dictionary with corpus inside

13
- Automatic clustering of collocates
  - Propose senses
- Iterate:
  - Lexicographer input
    - Confirm/reject/edit sense inventory
    - Assigns collocates / corpus lines to senses
  - WSD
    - Uses seeds to build full WSD for word
    - Find more collocates for each sense
- XML dictionary entry
  - Load into dictionary-editing tool
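
The "Iterate" step can be sketched as a small bootstrapping loop: seed collocates stand in for the lexicographer's input, corpus lines are assigned to the sense they overlap with most, and words found only in one sense's lines become new collocates for that sense. This is a minimal sketch under those assumptions, not the SADD implementation; the function `bootstrap`, its parameters and the data are hypothetical.

```python
from collections import Counter

def bootstrap(lines, seeds, rounds=3, min_count=1):
    """lines: list of token lists; seeds: {sense: set of seed collocates}."""
    collocates = {sense: set(words) for sense, words in seeds.items()}
    assignment = {}
    for _ in range(rounds):
        # Assign each line to the sense it overlaps with most; skip ties and zero overlap.
        assignment = {}
        for i, tokens in enumerate(lines):
            overlaps = {s: len(set(tokens) & c) for s, c in collocates.items()}
            ranked = sorted(overlaps.values(), reverse=True)
            best = max(overlaps, key=overlaps.get)
            unambiguous = len(ranked) == 1 or ranked[0] > ranked[1]
            if overlaps[best] > 0 and unambiguous:
                assignment[i] = best
        # Promote words seen in at least min_count lines of one sense and in no other sense.
        counts = {s: Counter() for s in collocates}
        for i, sense in assignment.items():
            counts[sense].update(set(lines[i]))
        for sense in collocates:
            for word, n in counts[sense].items():
                elsewhere = any(counts[o][word] for o in collocates if o != sense)
                if n >= min_count and not elsewhere:
                    collocates[sense].add(word)
    return assignment, collocates

# Hypothetical context words for six corpus lines of "bank".
lines = [
    ["river", "muddy", "fishing"],
    ["muddy", "fishing", "boat"],
    ["money", "loan", "account"],
    ["loan", "account", "manager"],
    ["boat", "fishing"],
    ["manager", "account"],
]
seeds = {"bank/river": {"river"}, "bank/finance": {"money"}}

assignment, collocates = bootstrap(lines, seeds)
print(assignment)
print(collocates)
```

In practice the lexicographer would review each round's proposals before they are accepted, which this sketch omits.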

14 Atkins method for bilingual lexicography
- Analyse source language
  - From corpus
  - List all expressions that might possibly have a non-predictable translation
    - Very fine grained
    - Lots of collocations
  - target-language-neutral; re-usable
- Translate
- Edit to finalise dictionary

15 Current projects/initiatives
- Semi-automatic Dictionary Drafting (SADD)
- Tickbox Lexicography (TBL)
  - Slovene project
  - New English-Irish Dictionary
- Putting Collocations in the Dictionary (PCID)

