WordNet Sense Tag Annotation Tool: Using Python to aid in human annotation of a corpus


1 WordNet Sense Tag Annotation Tool: Using Python to aid in human annotation of a corpus

2 Overview
Input
– Ambiguous word list, sense definitions
– {
    'fish' : {'fish_verb' : [fish.v.01, fish.v.02], 'fish_animal' : [fish.n.01], 'fish_food' : [fish.n.02]},
    'bass' : {'bass_music' : [bass.n.01, bass.n.02, bass.n.03, bass.n.06, bass.s.01], 'bass_fish' : [freshwater_bass.n.01, bass.n.08]}
  }
Output
– Corpus tagged by sense
– “…the ability to produce fine deep-sounding bass > tones while contrasting them simultaneously with…”
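The sense definitions above map each ambiguous word to a set of tags, and each tag to a group of WordNet synsets. A minimal sketch of that structure as a Python dictionary follows, assuming the synset names are stored as plain strings. The names `SENSE_MAP`, `invert_sense_map`, and `SYNSET_TO_TAG` are illustrative and do not come from the slides.

```python
# Sense groupings copied from the slide; synset names are kept as strings
# (an assumption, the slide writes them unquoted).
SENSE_MAP = {
    "fish": {
        "fish_verb": ["fish.v.01", "fish.v.02"],
        "fish_animal": ["fish.n.01"],
        "fish_food": ["fish.n.02"],
    },
    "bass": {
        "bass_music": ["bass.n.01", "bass.n.02", "bass.n.03",
                       "bass.n.06", "bass.s.01"],
        "bass_fish": ["freshwater_bass.n.01", "bass.n.08"],
    },
}


def invert_sense_map(sense_map):
    """Map each synset name to its (word, tag) pair.

    Raises ValueError if a synset name is listed under more than one tag,
    since the grouping would then be ambiguous.
    """
    index = {}
    for word, tags in sense_map.items():
        for tag, synset_names in tags.items():
            for name in synset_names:
                if name in index:
                    raise ValueError(f"{name} is assigned to more than one tag")
                index[name] = (word, tag)
    return index


# Hypothetical helper table: look up the coarse tag for a given synset name.
SYNSET_TO_TAG = invert_sense_map(SENSE_MAP)
```

The inverted table is one way to check that no synset has been placed in two groups by mistake before annotation begins.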

3 Details
I: ['fish', 'bass']
O: fish Synset('fish.n.01') any of various mostly cold-blooded aquatic vertebrates usually having scales and breathing through gills
O: bass Synset('bass.n.01') the lowest part of the musical range
I: {
     'fish' : {'fish_verb' : [fish.v.01, fish.v.02], 'fish_animal' : [fish.n.01], 'fish_food' : [fish.n.02]},
     'bass' : {'bass_music' : [bass.n.01, bass.n.02, bass.n.03, bass.n.06, bass.s.01], 'bass_fish' : [freshwater_bass.n.01, bass.n.08]}
   }
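The first step above, printing each word's synsets with their definitions so a person can group them, could be sketched as follows. This assumes the tool uses NLTK's WordNet reader, which the `Synset('fish.n.01')` notation suggests but the slides do not state. The function name `list_senses` and its `lookup` parameter are hypothetical. Unlike the slide, which shows one synset per word, this sketch lists every synset returned.

```python
def list_senses(words, lookup=None):
    """Return one display line per synset of each word.

    `lookup` is any callable that maps a word to a list of objects with
    name() and definition() methods. By default it is NLTK's
    wordnet.synsets, which requires nltk and its WordNet data
    (nltk.download('wordnet')) to be installed.
    """
    if lookup is None:
        # Imported lazily so the function can be used with a stand-in lookup.
        from nltk.corpus import wordnet as wn
        lookup = wn.synsets

    lines = []
    for word in words:
        for synset in lookup(word):
            lines.append(
                f"{word} Synset('{synset.name()}') {synset.definition()}"
            )
    return lines


if __name__ == "__main__":
    for line in list_senses(["fish", "bass"]):
        print(line)
```

Passing the lookup as a parameter is a design choice made here so the listing logic can be exercised without WordNet installed.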

4 Details
I: Corpus
– List of sentences (strings)
O: for each sentence with one or more of the words in question…
– text
  There true yachtsmen often find November winds steadier, the waters cooler, the fish hungrier, and rivers more pleasant -- less turbulence and mud, and fewer floating logs
– Which sense is this usage of the word fish?
  » 0: fish_verb  1: fish_animal  2: fish_food
I: 1 (in this case)
O: sentence with tag
– There true yachtsmen often find November winds steadier, the waters cooler, the fish/fish_animal/ hungrier, and rivers more pleasant -- less turbulence and mud, and fewer floating logs
Ultimate Output: Tagged Corpus
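The prompt-and-tag loop described on this slide can be sketched roughly as below. It is a guess at the flow, not the tool's actual code: the function names, the `choose` callback, and the case-sensitive whole-word matching are all assumptions. The `word/tag/` output format follows the tagged sentence shown on the slide.

```python
import re


def annotate_sentence(sentence, sense_map, choose):
    """Insert a /tag/ marker after every ambiguous word in the sentence.

    `choose(sentence, word, labels)` must return the index of the label
    the annotator picked. Matching is case-sensitive and whole-word only
    (an assumption; the slides do not say how words are matched).
    """
    if not sense_map:
        return sentence
    words = "|".join(re.escape(word) for word in sense_map)
    pattern = re.compile(rf"\b({words})\b")

    def add_tag(match):
        word = match.group(1)
        labels = list(sense_map[word])  # tag names, in insertion order
        picked = choose(sentence, word, labels)
        return f"{word}/{labels[picked]}/"

    return pattern.sub(add_tag, sentence)


def ask_on_console(sentence, word, labels):
    """Interactive chooser that mirrors the prompt shown on the slide."""
    print(sentence)
    print(f"Which sense is this usage of the word {word}?")
    print("  ".join(f"{i}: {label}" for i, label in enumerate(labels)))
    return int(input())


def annotate_corpus(sentences, sense_map, choose=ask_on_console):
    """Return the tagged corpus; sentences without a target word pass through."""
    return [annotate_sentence(s, sense_map, choose) for s in sentences]
```

For interactive use, `annotate_corpus(sentences, sense_map)` prompts on the console for each occurrence. Passing a different `choose` function allows the loop to be driven by a script or a test instead of a person.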

