LREC, Malta MayApril 20 th, 2010 Annotation Scheme and Gold Standard for Dutch sentiment-bearing Adjectives Isa Maks and Piek Vossen Faculty of Arts, VU.

1 LREC, Malta MayApril 20 th, 2010 Annotation Scheme and Gold Standard for Dutch sentiment-bearing Adjectives Isa Maks and Piek Vossen Faculty of Arts, VU University Amsterdam

2 Overview of Presentation
- Annotation scheme for subjectivity and sentiment annotation of (Dutch) adjectives
- Composition of the Gold Standard
- Results of the human annotation task
- Conclusions and future work

3 Sentiment Lexicon
- tools for automatic lexicon building
- systems for rich automatic sentiment analysis and opinion mining
- Gold Standard to evaluate automatically built sentiment lexicons
- Guidelines for sentiment and subjectivity annotation
Diagram labels: evaluation; morphology, morpho-syntax, semantics, usage, etc.; sentiment and subjectivity information; Sentiment Lexicon; Dutch Wordnet (synsets); Dutch Reference Lexicon (lexical units)

4 Existing Annotation Schemata
- 'prior' polarity: positive (good), negative (ugly), neutral (direct), posneg (curious)
- 'prior' subjectivity: subjective or objective, e.g. alarm [emotion] vs. alarm [device]
- annotation at word sense (or synset) level
- Wiebe et al. (2004, 2006), Su et al. (2008)
- new: Attitude Holder

5 Attitude Holder (examples)
- ...there are reports from inside Gaza (AE) that criticize (NEG) Hamas (TOPIC)
- ...the dominant media (AE) vilify (NEG) Hamas (TOPIC) and ...
- (SW) Bush (AE) is angry (NEG) about Obama’s behaviour (TOPIC)...
- Bush is bad (NEG) for the economy... (SW)
SW = Speaker or Writer; AE = Agent or Experiencer

6 Values for Attitude Holder Annotation
- SW: speaker’s or writer’s attitude (bad, ugly, beautiful)
- AE: agent’s or experiencer’s attitude (angry, bent on)
- no specific attitude holder (waterproof, rainy, biological)

7 Annotation examples
| Lexical Unit | Subjectivity | Attitude Holder | Polarity | Semantic Category | Illustration |
|---|---|---|---|---|---|
| burgerlijk (civil) | obj | - | ntr | n.r. | Burgerlijk huwelijk (civil marriage) |
| burgerlijk (narrow-minded) | subj | SW | neg | judgment (moral) | Zijn buren zijn vreselijk burgerlijk (his neighbours are terribly narrow-minded) |
| wreed (cruel) | subj | SW | neg | judgment (moral) | Een wrede despoot (a cruel tyrant) |
| wreed (fantastic, cool) | subj | SW | pos | appreciation | Ze rijden daar in vet wrede auto’s rond (they drive around in really cool cars) |
| gelukkig (happy, satisfied) | subj | AE | pos | emotion | Bos is gelukkig met Zalms keuze (Bos is happy with Zalm’s choice) |
| bijziend (myopic) | obj | - | neg | descriptive | Zo’n 30% van de bevolking is bijziend (30% of the population is myopic) |
| afkerig (averse) | subj | AE | neg | appreciation | Hij is afkerig van geld (he is averse to money) |

8 Polarity, Subjectivity and Attitude Holder
Example phrases from the diagram:
- a cautious estimate
- W.B. is happy with the choice for ...
- Bush is angry over Obama's leaking of private conversation ...
- They drive around in beautiful cars
- Bush is bad for the economy ...
- water resistant watches
- deaf man
- civil marriage
Legend: Subj = subjectivity; Obj = objectivity; AE = Agent/Experiencer; SW = Speaker/Writer; No-AH = no attitude holder

9 Gold Standard Annotation Schema
Summarizing:
- annotation at word sense level (instead of word level), because words may be ambiguous in subjectivity or polarity
- annotation of subjectivity (objective vs. subjective), polarity (positive, negative, posneg, neutral) and attitude holder (whose opinion: speaker/writer or agent/experiencer)
Question: How reliable is human annotation with a complex schema for subjectivity annotation?
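The schema summarized on this slide can be sketched as a small data structure. This is only an illustration, not the authors' data format: the class names, field names and the `combined_label` helper are hypothetical, while the value abbreviations (subj/obj, pos/neg/posneg/ntr, SW/AE) and the three example senses are taken from the slides.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Subjectivity(Enum):
    SUBJECTIVE = "subj"
    OBJECTIVE = "obj"


class Polarity(Enum):
    POSITIVE = "pos"
    NEGATIVE = "neg"
    POSNEG = "posneg"
    NEUTRAL = "ntr"


class AttitudeHolder(Enum):
    SPEAKER_WRITER = "SW"
    AGENT_EXPERIENCER = "AE"


@dataclass(frozen=True)
class SenseAnnotation:
    """One annotated word sense (lexical unit); field names are hypothetical."""
    lexical_unit: str
    gloss: str
    subjectivity: Subjectivity
    polarity: Polarity
    # None stands for "no specific attitude holder".
    attitude_holder: Optional[AttitudeHolder] = None

    def combined_label(self) -> str:
        # Builds labels in the style "SW-neg" / "OBJ-neg" used on the
        # disagreement slide; the exact construction is an assumption.
        if self.attitude_holder is not None:
            prefix = self.attitude_holder.value
        else:
            prefix = self.subjectivity.value.upper()
        return f"{prefix}-{self.polarity.value}"


# Two senses of the same word get different annotations,
# which is why annotation happens at word sense level.
civil = SenseAnnotation("burgerlijk", "civil",
                        Subjectivity.OBJECTIVE, Polarity.NEUTRAL)
narrow = SenseAnnotation("burgerlijk", "narrow-minded",
                         Subjectivity.SUBJECTIVE, Polarity.NEGATIVE,
                         AttitudeHolder.SPEAKER_WRITER)
happy = SenseAnnotation("gelukkig", "happy, satisfied",
                        Subjectivity.SUBJECTIVE, Polarity.POSITIVE,
                        AttitudeHolder.AGENT_EXPERIENCER)

labels = [s.combined_label() for s in (civil, narrow, happy)]
```

Keeping the attitude holder optional mirrors the "no specific attitude holder" value; an objective sense then falls back to an OBJ prefix.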

10 Data Set Gold Standard
Requirements:
- representative of the whole lexicon
- relevant to automatic annotation of subjectivity
  – inclusion of subjective and objective lexical items
  – equal distribution of items across the lexicon with regard to frequency, polysemy and synset size
English General Inquirer (Stone, 1966), Hatzivassiloglou, V. et al. (1997), Riloff and Wiebe (2005), Jijkoun et al. (2008)
Micro-WNOp (Cerini et al., 2007), Su et al. (2008)

11 Composition Gold Standard
ADJECTIVES
Table columns: frequency; polysemy; synset size. Table rows: high; mid; low (cell values are not present in the transcript)
variants: 609 lexical units, 512 synsets, 390 words

12 Inter-annotator results
Overall agreement for 2 annotators:
| Polarity Annotation | Attitude Holder Annotation | Both |
|---|---|---|
| 86.3% (κ=0.80) | 87% (κ=0.73) | 79% (κ=0.73) |
Single-category kappa computation (charts): polarity; attitude holder; both polarity and attitude holder
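For readers unfamiliar with the two figures reported per task, the following is a minimal sketch of percentage agreement and Cohen's kappa for two annotators. It assumes that "single-category kappa" means collapsing the labels to "this category vs. everything else" before computing kappa; the slides do not spell out the procedure. The function names and the four toy labels are invented for illustration and are not the gold standard data.

```python
from collections import Counter


def observed_agreement(a, b):
    """Fraction of items on which the two annotators chose the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(a)
    p_o = observed_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's own label distribution.
    p_e = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def single_category_kappa(a, b, category):
    """Kappa after collapsing labels to 'category' vs. 'anything else' (assumed reading)."""
    return cohen_kappa([x == category for x in a], [y == category for y in b])


# Invented toy labels for two annotators.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]

agreement = observed_agreement(annotator_1, annotator_2)
kappa = cohen_kappa(annotator_1, annotator_2)
kappa_pos = single_category_kappa(annotator_1, annotator_2, "pos")
```

In the toy data the annotators agree on 3 of 4 items (0.75), chance agreement is 0.5, so kappa comes out at 0.5.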

13 Comparison with Other Studies
Annotation levels compared: word sense level; word level; synset level

14 Analysis of Disagreements
- OBJ-neg (0.34) vs. SW-neg and OBJ-pos (0.23) vs. SW-pos: kaalhoofdig (bald-headed), oud (old: having lived for a long time), doofstom (deaf-mute), droog (dry), langzaam (slow), zuiver (pure), etc.
- AE-pos vs. SW-neg: belust (bent on), as in hij is belust op geld (he is bent on money)

15 Human annotations across various lexicon dimensions
- agreement decreases when word frequency increases
- agreement decreases when polysemy increases
- agreement increases when item is a member of a large synset (65%)
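One way to obtain trends like these is to compute agreement separately for each bin of a lexicon dimension. The sketch below assumes each item carries a bin name (for example "high" or "low" frequency) plus the two annotators' labels; the function name and the four toy items are invented and do not reproduce the study's figures.

```python
from collections import defaultdict


def agreement_by_bin(items):
    """Percentage agreement per bin.

    items: iterable of (bin_name, label_annotator_1, label_annotator_2).
    Returns a dict mapping each bin name to its agreement fraction.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for bin_name, label_a, label_b in items:
        totals[bin_name] += 1
        if label_a == label_b:
            matches[bin_name] += 1
    return {name: matches[name] / totals[name] for name in totals}


# Invented toy items, binned by word frequency.
toy_items = [
    ("high", "SW-neg", "OBJ-neg"),
    ("high", "SW-pos", "SW-pos"),
    ("low", "AE-pos", "AE-pos"),
    ("low", "OBJ-ntr", "OBJ-ntr"),
]

per_bin = agreement_by_bin(toy_items)
```

The same function can be reused for polysemy or synset size by changing which bin name is attached to each item.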

16 Conclusions
- We designed a new annotation scheme for polarity, subjectivity and attitude holder annotation and showed that all substantial categories can be reliably annotated by human annotators. We assume that this holds for automatic annotation as well.
- We aimed at an equal distribution of test items across 3 lexicon dimensions (word frequency, large synset membership and polysemy) relevant to subjectivity and polarity identification; we measured correlations between polarity annotation and each of these lexicon dimensions.
Future Work
- Development of similar annotation schemes and gold standards for nouns and verbs
- Use of the gold standard to test methods and techniques to build a sentiment lexicon for Dutch

17 Acknowledgements
The research is part of the project From Text To Political Positions (http://www2.let.vu.nl/oz/cltl/t2pp), funded by the Interfaculty Research Institute CAMeRA, VU University Amsterdam.
Gold standard data available at http://www2.let.vu.nl/oz/cltl/t2pp

18 Thank you for your attention

19 Annotation examples (backup slide)
| Subjectivity | Attitude Holder | Polarity | Semantic Category | Lexical Unit | Illustration |
|---|---|---|---|---|---|
| obj | SW | ntr | | burgerlijk (civil) | Burgerlijk huwelijk (civil marriage) |
| obj | SW | neg | descriptive | bijziend (myopic) | Zo’n 30% van de bevolking is bijziend (30% of the population is myopic) |
| subj | SW | neg | judgment (moral) | wreed (cruel) | Een wrede despoot (a cruel tyrant) |
| subj | SW | pos | appreciation | wreed (fantastic) | Ze rijden daar in vet wrede auto’s rond (they drive around in cool cars) |
| subj | SW | neg | judgment (moral) | burgerlijk (narrow-minded) | Zijn buren zijn vreselijk burgerlijk (his neighbours are terribly narrow-minded) |
| subj | AE | pos | emotion | gelukkig (happy) | Bos is gelukkig met Zalms keuze (Bos is happy with Zalm’s choice) |
| subj | AE | neg | appreciation | afkerig (averse) | Hij is afkerig van geld (he is averse to money) |

20 Attitude holder: CDA-lijsttrekker (lead candidate of the CDA party); polarity: negative; topic: linkse coalitie (left-wing coalition)

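21 Comparison table with columns: polarity; Subj vs. Obj; Attitude holder (several values and their column positions are missing from the transcript)
- This Study: 86% (κ = [missing]); [missing]% (κ = 0.73)
- Jijkoun et al. (2008): 79% (κ = 0.66)
- Andreevskaia et al. (2006): 79%
- Su et al. (2009): 89% (κ = [missing]); [missing]% (κ = 0.79)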

22 What is an Opinion or Attitude (Kim, Hovy 2006)
(1) Bush is bad for the economy: judgment
(2) Bush is angry about Obama’s behaviour: emotion -> judgment
| | sentence 1 | sentence 2 |
|---|---|---|
| attitude holder | Speaker/writer | Bush |
| polarity | negative | negative |
| topic | Bush (for the economy) | Obama’s behaviour |

23 Gold standard distribution
Charts: polarity; attitude holder; polarity and attitude holder
