
1 Subjectivity and Sentiment Analysis Jan Wiebe Department of Computer Science CERATOPS: Center for the Extraction and Summarization of Events and Opinions in Text University of Pittsburgh

2 What is Subjectivity? The linguistic expression of somebody’s opinions, sentiments, emotions, evaluations, beliefs, speculations (private states). Wow, this is my 4th Olympus camera. Staley declared it to be “one wicked song”. Most voters believe he won’t raise their taxes.

3 One Motivation Automatic question answering…

4 Fact-Based Question Answering Q: When is the first day of spring in 2007? Q: Does the US have a tax treaty with Cuba?

5 Fact-Based Question Answering Q: When is the first day of spring in 2007? A: March 21 Q: Does the US have a tax treaty with Cuba? A: Thus, the U.S. has no tax treaties with nations like Iraq and Cuba.

6 Opinion Question Answering Q: What is the international reaction to the reelection of Robert Mugabe as President of Zimbabwe?

7 Opinion Question Answering Q: What is the international reaction to the reelection of Robert Mugabe as President of Zimbabwe? A: African observers generally approved of his victory while Western Governments strongly denounced it.

8 More Motivations Product review mining: What features of the ThinkPad T43 do customers like and which do they dislike? Review classification: Is a review positive or negative toward the movie? Information Extraction: Is “killing two birds with one stone” a terrorist event? Tracking sentiments toward topics over time: Is anger ratcheting up or cooling down? Etcetera!

9–13 [Diagram, built up over five slides: applications (QA, review mining, opinion tracking) draw on lexicon entries such as condemn, great, and wicked to interpret text like “wicked visuals”, “loudly condemned”, and “The building has been condemned”; the numbers 1–4 mark the four parts of the talk.]

14 Outline Subjectivity annotation scheme (MPQA) Learning subjective expressions from unannotated texts Contextual polarity of sentiment expressions Word sense and subjectivity Conclusions and pointers to related work

15 [Diagram repeated, with Part 1 (the subjectivity annotation scheme) highlighted.]

16 Corpus Annotation (Wiebe, Wilson & Cardie 2005; Wilson & Wiebe 2005; Somasundaran, Wiebe, Hoffmann & Litman 2006; Wilson 2007)

17 Outline for Section 1 Overview Frame types Nested Sources Extensions

18 Overview Fine-grained: expression level rather than sentence or document level. Annotate: – expressions of opinions, sentiments, emotions, evaluations, speculations, … – material attributed to a source, but presented objectively

19 Overview Opinions, evaluations, emotions, speculations are private states They are expressed in language by subjective expressions Private state: state that is not open to objective observation or verification Quirk, Greenbaum, Leech, Svartvik (1985). A Comprehensive Grammar of the English Language.

20 Overview Focus on three ways private states are expressed in language –Direct subjective expressions –Expressive subjective elements –Objective speech events

21 Direct Subjective Expressions Direct mentions of private states The United States fears a spill-over from the anti-terrorist campaign. Private states expressed in speech events “We foresaw electoral fraud but not daylight robbery,” Tsvangirai said.

22 Expressive Subjective Elements (Banfield 1982) “We foresaw electoral fraud but not daylight robbery,” Tsvangirai said. The part of the US human rights report about China is full of absurdities and fabrications.

23 Objective Speech Events Material attributed to a source, but presented as objective fact (without evaluation…) The government, it added, has amended the Pakistan Citizenship Act 10 of 1951 to enable women of Pakistani descent to claim Pakistani nationality for their children born to foreign husbands.


25 Nested Sources “The report is full of absurdities,’’ Xirao-Nima said the next day.

26 Nested Sources “The report is full of absurdities,’’ Xirao-Nima said the next day. (Writer)

27 Nested Sources “The report is full of absurdities,’’ Xirao-Nima said the next day. (Writer, Xirao-Nima)

28 Nested Sources “The report is full of absurdities,’’ Xirao-Nima said the next day. (Writer Xirao-Nima)

29 Nested Sources “The report is full of absurdities,’’ Xirao-Nima said the next day. (Writer Xirao-Nima) (Writer)

30 “The report is full of absurdities,” Xirao-Nima said the next day.
Objective speech event – anchor: the entire sentence; source: (writer); implicit: true
Direct subjective – anchor: said; source: (writer, Xirao-Nima); intensity: high; expression intensity: neutral; attitude type: negative; target: report
Expressive subjective element – anchor: full of absurdities; source: (writer, Xirao-Nima); intensity: high; attitude type: negative


36 “The US fears a spill-over’’, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.

37 (Writer)

38 “The US fears a spill-over’’, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities. (writer, Xirao-Nima)

39 “The US fears a spill-over’’, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities. (writer, Xirao-Nima, US)

40 “The US fears a spill-over’’, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities. (writer, Xirao-Nima, US) (writer, Xirao-Nima) (Writer)

41 “The US fears a spill-over’’, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.
Objective speech event – anchor: the entire sentence; source: (writer); implicit: true
Objective speech event – anchor: said; source: (writer, Xirao-Nima)
Direct subjective – anchor: fears; source: (writer, Xirao-Nima, US); intensity: medium; expression intensity: medium; attitude type: negative; target: spill-over

42 Corpus @ www.cs.pitt.edu/mpqa English-language versions of articles from the world press (187 news sources). Also includes contextual polarity annotations. Themes of the instructions: – No rules about how particular words should be annotated. – Don’t take expressions out of context and think about what they could mean; judge them as they are used in that sentence.

43 Extensions Wilson 2007

44 I think people are happy because Chavez has fallen.
Direct subjective – span: think; source: (writer, I)
Attitude – span: think; type: positive arguing; intensity: medium; target span: people are happy because Chavez has fallen
Direct subjective – span: are happy; source: (writer, I, people)
Attitude – span: are happy; type: pos sentiment; intensity: medium; target span: Chavez
Inferred attitude – span: are happy because Chavez has fallen; type: neg sentiment; intensity: medium; target span: Chavez has fallen

45 Outline Subjectivity annotation scheme (MPQA) Learning subjective expressions from unannotated texts Contextual polarity of sentiment expressions Word sense and subjectivity Conclusions and pointers to related work

46 [Diagram repeated, with Part 2 (learning subjective expressions) highlighted.]

47 Outline for Section 2 Learning subjective nouns with extraction pattern bootstrapping Automatically generating training data with high-precision classifiers Learning subjective and objective expressions (not simply words or n-grams) Riloff, Wiebe, Wilson 2003; Riloff & Wiebe 2003; Wiebe & Riloff 2005; Riloff, Patwardhan, Wiebe 2006.

48 Outline for Section 2 Learning subjective nouns with extraction pattern bootstrapping Automatically generating training data with high-precision classifiers Learning subjective and objective expressions

49 Information Extraction Information extraction (IE) systems identify facts related to a domain of interest. Extraction patterns are lexico-syntactic expressions that identify the role of an object. For example: <victim> was killed; assassinated <victim>; murder of <victim>.

50 Learning Subjective Nouns Goal: to learn subjective nouns from unannotated text. Method: applying IE-based bootstrapping algorithms that were designed to learn semantic categories. Hypothesis: extraction patterns can identify subjective contexts that co-occur with subjective nouns. Example: “expressed <dobj>” – concern, hope, support.

51 Extraction Examples
expressed: condolences, hope, grief, views, worries
indicative of: compromise, desire, thinking
inject: vitality, hatred
reaffirmed: resolve, position, commitment
voiced: outrage, support, skepticism, opposition, gratitude, indignation
show of: support, strength, goodwill, solidarity
was shared: anxiety, view, niceties, feeling

52 Meta-Bootstrapping (Riloff & Jones 1999) [Diagram: a loop over unannotated texts – the best extraction pattern (e.g., expressed) yields extractions (nouns, e.g., hope, grief, joy, concern, worries); these are added to the lexicon (e.g., happiness, relief, condolences) and used to select the next pattern.]
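
To make the loop concrete, here is a minimal Python sketch of one bootstrapping cycle. It is a simplification, not the exact Riloff & Jones procedure: the corpus is abstracted as a precomputed mapping from each pattern to the set of noun phrases it extracts, and candidate patterns are scored with an RlogF-style metric; all names are illustrative.

from math import log

def meta_bootstrap(pattern_extractions, seed_nouns, iterations=10):
    # pattern_extractions: dict mapping each extraction pattern
    # (e.g., "expressed") to the set of noun phrases it extracts
    # from the unannotated corpus (assumed precomputed).
    # seed_nouns: the initial subjective seed set.
    lexicon = set(seed_nouns)
    used = set()
    for _ in range(iterations):
        best, best_score = None, float("-inf")
        for pattern, nouns in pattern_extractions.items():
            if pattern in used:
                continue
            hits = len(nouns & lexicon)  # extractions already in the lexicon
            if hits == 0:
                continue
            # RlogF-style score: reward patterns whose extractions are
            # reliable (hits / total) and plentiful (log hits)
            score = (hits / len(nouns)) * log(hits, 2)
            if score > best_score:
                best, best_score = pattern, score
        if best is None:
            break  # no remaining pattern matches the lexicon
        used.add(best)
        lexicon |= pattern_extractions[best]  # add the best pattern's nouns
    return lexicon - set(seed_nouns)  # the newly learned nouns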

53 Subjective Seed Words cowardice, embarrassment, hatred, outrage, crap, fool, hell, slander, delight, gloom, hypocrisy, sigh, disdain, grievance, love, twit, dismay, happiness, nonsense, virtue

54 Subjective Noun Results Bootstrapping corpus: 950 unannotated news articles We ran both bootstrapping algorithms for several iterations We manually reviewed the words and labeled them as strong, weak, or not subjective 1052 subjective nouns were learned (454 strong, 598 weak) –included in our subjectivity lexicon @ www.cs.pitt.edu/mpqa

55 Examples of Strong Subjective Nouns anguish, exploitation, pariah, antagonism, evil, repudiation, apologist, fallacies, revenge, atrocities, genius, rogue, barbarian, goodwill, sanctimonious, belligerence, humiliation, scum, bully, ill-treatment, smokescreen, condemnation, injustice, sympathy, denunciation, innuendo, tyranny, devil, insinuation, venom, diatribe, liar, exaggeration, mockery

56 Examples of Weak Subjective Nouns aberration, eyebrows, resistant, allusion, failures, risk, apprehensions, inclination, sincerity, assault, intrigue, slump, beneficiary, liability, spirit, benefit, likelihood, success, blood, peaceful, tolerance, controversy, persistent, trick, credence, plague, trust, distortion, pressure, unity, drama, promise, eternity, rejection

57 Outline for Section 2 Learning subjective nouns with extraction pattern bootstrapping Automatically generating training data with high-precision classifiers Learning subjective and objective expressions

58 Training Data Creation [Diagram: subjective clues + unlabeled texts feed a rule-based subjective sentence classifier and a rule-based objective sentence classifier, which output labeled subjective and objective sentences.]

59 Subjective Clues Subjectivity lexicon – @ www.cs.pitt.edu/mpqa – entries from manually developed resources (e.g., General Inquirer, FrameNet) + entries automatically identified (e.g., the nouns just described)

60 Creating High-Precision Rule-Based Classifiers GOAL: use subjectivity clues from previous research to build a high-precision (low-recall) rule-based classifier. A sentence is subjective if it contains ≥ 2 strong subjective clues. A sentence is objective if: – it contains no strong subjective clues – the previous and next sentence combined contain ≤ 1 strong subjective clue – the current, previous, and next sentence together contain ≤ 2 weak subjective clues
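
Read as code, the two rules reduce to clue counting over a three-sentence window. A minimal sketch, assuming clues are matched by simple set membership over tokens (the real system matches, possibly multiword, lexicon entries) and reading the objective rule’s neighboring-sentence condition as a combined count:

def rule_based_label(prev_sent, cur_sent, next_sent, strong_clues, weak_clues):
    # Each sentence is a list of tokens; strong_clues and weak_clues
    # are sets of clue strings from the subjectivity lexicon.
    # Returns "subjective", "objective", or None (left unlabeled).
    def count(tokens, clues):
        return sum(1 for t in tokens if t in clues)

    # Subjective rule: at least 2 strong subjective clues.
    if count(cur_sent, strong_clues) >= 2:
        return "subjective"

    # Objective rule: no strong clues here, at most 1 strong clue in
    # the neighboring sentences, and at most 2 weak clues across the
    # previous, current, and next sentence together.
    if (count(cur_sent, strong_clues) == 0
            and count(prev_sent, strong_clues) + count(next_sent, strong_clues) <= 1
            and count(prev_sent + cur_sent + next_sent, weak_clues) <= 2):
        return "objective"

    return None  # only confidently labeled sentences become training data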

61 Accuracy of Rule-Based Classifiers (measured against MPQA annotations)
Subjective RBC – Recall: 34.2, Precision: 90.4, F: 49.6
Objective RBC – Recall: 30.7, Precision: 82.4, F: 44.7

62 Generated Data We applied the rule-based classifiers to ~300,000 sentences from (unannotated) news articles. ~53,000 were labeled subjective and ~48,000 were labeled objective – a training set of over 100,000 labeled sentences!

63 Generated Data The generated data may serve as training data for supervised learning, and as initial data for self-training (Wiebe & Riloff 2005). The rule-based classifiers are part of OpinionFinder @ www.cs.pitt.edu/mpqa Here, we use the data to learn new subjective expressions…

64 Outline for Section 2 Learning subjective nouns with extraction pattern bootstrapping Automatically generating training data with high-precision classifiers Learning subjective and objective expressions

65 Representing Subjective Expressions with Extraction Patterns EPs can represent expressions that are not fixed word sequences drove [NP] up the wall - drove him up the wall - drove George Bush up the wall - drove George Herbert Walker Bush up the wall step on [modifiers] toes - step on her toes - step on the mayor’s toes - step on the newly elected mayor’s toes gave [NP] a [modifiers] look - gave his annoying sister a really really mean look

66 The Extraction Pattern Learner Used AutoSlog-TS [Riloff 96] to learn EPs AutoSlog-TS needs relevant and irrelevant texts as input Statistics are generated measuring each pattern’s association with the relevant texts The subjective sentences are the relevant texts, and the objective sentences are the irrelevant texts

67 Learned patterns, by syntactic form:
passive-vp: was satisfied
active-vp: complained
active-vp dobj: dealt blow
active-vp infinitive: appears to be
passive-vp infinitive: was thought to be
auxiliary dobj: has position
active-vp: endorsed
infinitive: to condemn
active-vp infinitive: get to know
passive-vp infinitive: was meant to show
subject auxiliary: fact is
passive-vp prep: opinion on
active-vp prep: agrees with
infinitive prep: was worried about
noun prep: to resort to

68 AutoSlog-TS (Step 1) [Diagram: relevant and irrelevant texts are parsed, and syntactic templates are applied.] Example: [The World Trade Center], [an icon] of [New York City], was intentionally attacked very early on [September 11, 2001]. Extraction patterns: was attacked; icon of; was attacked on.

69 AutoSlog-TS (Step 2) Each pattern is scored by its frequency and its probability of occurring in the relevant texts:
was attacked – Freq: 10, Prob: 0.90
icon of – Freq: 5, Prob: 0.20
was attacked on – Freq: 8, Prob: 0.79

70 Identifying Subjective Patterns Subjective patterns: Freq > X and Probability > Y; ~6,400 learned on the 1st iteration. Evaluation against the MPQA corpus: – direct evaluation of performance as simple classifiers – evaluation of classifiers using patterns as additional features – Does the system learn new knowledge? Adding the patterns to the strong subjective set and re-running the rule-based classifier gains 20 points of recall at a cost of 7 points of precision on the 1st iteration.
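
The pattern-selection step is then a filter and ranking over the AutoSlog-TS statistics. A sketch, with illustrative threshold values standing in for X and Y:

def select_subjective_patterns(pattern_stats, min_freq=5, min_prob=0.6):
    # pattern_stats: dict mapping each pattern to (freq, subj_freq),
    # where freq is its total frequency and subj_freq its frequency
    # in the (relevant) subjective sentences.
    selected = []
    for pattern, (freq, subj_freq) in pattern_stats.items():
        prob = subj_freq / freq  # estimate of P(subjective | pattern)
        if freq > min_freq and prob > min_prob:
            selected.append((pattern, freq, prob))
    # Most reliable (then most frequent) patterns first
    return sorted(selected, key=lambda t: (t[2], t[1]), reverse=True)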

71 Patterns with Interesting Behavior (Freq, P(Subj | Pattern)):
asked – 128, .63
was asked – 11, 1.0
was expected – 45, .42
was expected from – 5, 1.0
talk – 28, .71
talk of – 10, .90
is talk – 5, 1.0
put – 187, .67
put end – 10, .90
is fact – 38, 1.0
fact is – 12, 1.0

72 Conclusions for Section 2 Extraction pattern bootstrapping can learn subjective nouns (unannotated data) Extraction patterns can represent richer subjective expressions Learning methods can discover subtle distinctions that are important for subjectivity (unannotated data) Ongoing work: –lexicon representation integrating different types of entries, enabling comparisons (e.g., subsumption relationships) –learning and bootstrapping processes applied to much larger unannotated corpora –processes applied to learning positive and negative subjective expressions

73 Outline Subjectivity annotation scheme (MPQA) Learning subjective expressions from unannotated texts Contextual polarity of sentiment expressions (briefly) –Wilson, Wiebe, Hoffmann 2005; Wilson 2007; Wilson & Wiebe forthcoming Word sense and subjectivity Conclusions and pointers to related work

74 [Diagram repeated, with Part 3 (contextual polarity) highlighted.]

75 Prior Polarity versus Contextual Polarity Most approaches use a lexicon of positive and negative words. Prior polarity: out of context, positive or negative (beautiful – positive; horrid – negative). A word may appear in a phrase that expresses a different polarity in context – its contextual polarity: “Cheers to Timothy Whitfield for the wonderfully horrid visuals.”

76 Goal of This Work Automatically distinguish prior and contextual polarity

77 Approach Use machine learning and a variety of features. Achieve significant results for a large subset of sentiment expressions. [Diagram: corpus + lexicon yield all clue instances; Step 1 classifies each instance as neutral or polar; Step 2 assigns contextual polarity to the polar instances.]
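
In code, the approach is two classifiers applied in sequence: the Step 1 filter runs over all clue instances, and only its polar survivors reach Step 2. A sketch assuming pretrained models with a simple predict(features) interface (a hypothetical interface, not a specific library’s):

def classify_contextual_polarity(instances, neutral_polar_model, polarity_model):
    # instances: iterable of (instance_id, step1_features, step2_features).
    labels = {}
    for inst_id, step1_feats, step2_feats in instances:
        # Step 1: neutral vs. polar, over all instances
        if neutral_polar_model.predict(step1_feats) == "neutral":
            labels[inst_id] = "neutral"
        else:
            # Step 2: contextual polarity of the polar instances
            # (positive, negative, both, or neutral)
            labels[inst_id] = polarity_model.predict(step2_feats)
    return labels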

78 Manual Annotations Subjective expressions of the MPQA corpus annotated with contextual polarity

79 Annotation Scheme Mark polarity of subjective expressions as positive, negative, both, or neutral. African observers generally approved of his victory while Western governments denounced it. (positive; negative) Besides, politicians refer to good and evil … (both) Jerome says the hospital feels no different than a hospital in the states. (neutral)

80 Annotation Scheme Judge the contextual polarity of sentiment ultimately being conveyed They have not succeeded, and will never succeed, in breaking the will of this valiant people.


84 Features Many inspired by Polanyi & Zaenen (2004): contextual valence shifters. Example: little threat; little truth. Others capture dependency relationships between words. Example: wonderfully (pos) modifying horrid.

85 Feature sets for Step 1: 1. word features, 2. modification features, 3. structure features, 4. sentence features, 5. document feature. Word features: word token (terrifies); word part-of-speech (VB); context (that terrifies me); prior polarity (negative); reliability (strong subjective).

86 Modification features (binary): preceded by adjective; preceded by adverb (other than not); preceded by intensifier; self intensifier; modifies a strong or weak clue; modified by a strong or weak clue. Computed over the dependency parse tree, e.g., “The human rights report poses a substantial challenge …”

87 Structure features (binary): in subject ([The human rights report] poses); in copular (I am confident); in passive voice (must be regarded).

88 Sentence features: count of strong clues in the previous, current, and next sentence; count of weak clues in the previous, current, and next sentence; counts of various parts of speech.

89 Document feature: document topic (15 topics, e.g., economics, health, Kyoto protocol, presidential election in Zimbabwe). Example: “The disease can be contracted if a person is bitten by a certain tick or if a person comes into contact with the blood of a congo fever sufferer.” (objective in a health-topic document)

90 Features for Step 2: word token; word prior polarity; negated; negated subject; modifies polarity; modified by polarity; conjunction polarity; general, negative, and positive polarity shifters. Example: word token (terrifies); word prior polarity (negative).

91 Negated (binary), for example: not good; does not look very good; but not “not only good but amazing”. Negated subject: No politically prudent Israeli could support either of them.

92 Modifies polarity (5 values: positive, negative, neutral, both, not mod): substantial – negative (it modifies challenge). Modified by polarity (same values): challenge – positive (it is modified by substantial). Example: substantial (pos) challenge (neg).

93 Conjunction polarity (5 values: positive, negative, neutral, both, not mod): good – negative (it is conjoined with evil). Example: good (pos) and evil (neg).

94 General polarity shifter: have few risks/rewards. Negative polarity shifter: lack of understanding. Positive polarity shifter: abate the damage.
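
As an illustration of how a few of these Step 2 features might be computed, here is a sketch that approximates the dependency-tree relations with a flat token window; the clue lists and window size are illustrative, not the system’s actual resources:

NEGATION_WORDS = {"not", "never", "no", "n't", "cannot"}
GENERAL_SHIFTERS = {"few", "little", "lack", "rarely"}  # illustrative entries

def step2_features(tokens, i, prior_polarity, window=4):
    # tokens: lowercased sentence tokens; i: position of the clue word;
    # prior_polarity: dict mapping words to out-of-context polarity.
    left = tokens[max(0, i - window):i]
    # Conjunction polarity: prior polarity of a word conjoined by
    # "and", as in "good and evil"
    if i + 2 < len(tokens) and tokens[i + 1] == "and":
        conj = prior_polarity.get(tokens[i + 2], "neutral")
    else:
        conj = "not mod"
    return {
        "word_token": tokens[i],
        "word_prior_polarity": prior_polarity.get(tokens[i], "neutral"),
        "negated": any(t in NEGATION_WORDS for t in left),  # e.g., not good
        "general_shifter": any(t in GENERAL_SHIFTERS for t in left),  # few risks
        "conjunction_polarity": conj,
    }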

95 Findings Statistically significant improvements can be gained, but they require combining all feature types. Ongoing work: – richer lexicon entries – compositional contextual processing

96 S/O and Pos/Neg: both Important Several studies have found two steps beneficial – Yu & Hatzivassiloglou 2003; Pang & Lee 2004; Wilson et al. 2005; Kim & Hovy 2006. [Diagram: subjective vs. objective, then positive/negative/neutral/both.]

97 S/O and Pos/Neg: both Important S/O can be more difficult: – manual annotation of phrases (Takamura et al. 2006) – contextual polarity (Wilson et al. 2005) – sentiment tagging of words (Andreevskaia & Bergler 2006) – sentiment tagging of word senses (Esuli & Sebastiani 2006) – later: evidence that S/O is more significant for senses

98 S/O and Pos/Neg: both Important Desirable for NLP systems to find a wide range of private states, including motivations, thoughts, and speculations, not just positive and negative sentiments

99 S/O and Pos/Neg: both Important [Diagram: subjectivity comprises sentiment (pos, neg, both) and other private states (pos, neg, neu, both), alongside objective content.]

100 Outline Subjectivity annotation scheme (MPQA) Learning subjective expressions from unannotated texts Contextual polarity of sentiment expressions Word sense and subjectivity –Wiebe & Mihalcea 2006 Conclusions and pointers to related work

101 [Diagram repeated, with Part 4 (word sense and subjectivity) highlighted; the lexicon entry condemn now expands into senses condemn#1, condemn#2, condemn#3.]

102 Introduction Continuing interest in word sense –Sense annotated resources being developed for many languages www.globalwordnet.org –Active participation in evaluations such as SENSEVAL

103 Word Sense and Subjectivity Though both are concerned with text meaning, until recently they have been investigated independently

104 Subjectivity Labels on Senses S: Alarm, dismay, consternation – (fear resulting from the awareness of danger) O: Alarm, warning device, alarm system – (a device that signals the occurrence of some undesirable event)

105 Subjectivity Labels on Senses S: Interest, involvement – (a sense of concern with and curiosity about someone or something; “an interest in music”) O: Interest – (a fixed charge for borrowing money; usually a percentage of the amount borrowed; “how much interest do you pay on your mortgage?”)

106 WSD using Subjectivity Tagging The notes do not pay interest. / He spins a riveting plot which grabs and holds the reader’s interest. Sense 4 (“a sense of concern with and curiosity about someone or something”, S) vs. Sense 1 (“a fixed charge for borrowing money”, O). [Diagram: without subjectivity information, the WSD system may return either sense for each sentence.]

107 [Diagram: a subjectivity classifier first tags the sentences O and S respectively; the tags steer the WSD system toward Sense 1 for the first sentence and Sense 4 for the second.]


109 Subjectivity Tagging using WSD The notes do not pay interest. / He spins a riveting plot which grabs and holds the reader’s interest. [Diagram: the subjectivity classifier alone is uncertain – O or S? – for each sentence.]

110 [Diagram: a WSD system first assigns Sense 1 (“a fixed charge for borrowing money”, O) to the first sentence and Sense 4 (“a sense of concern with and curiosity about someone or something”, S) to the second; the sense labels resolve the subjectivity tags to O and S.]


112 Refining WordNet: Semantic Richness Find inconsistencies and gaps. – Verb assault: attack, round, assail, lash out, snipe, assault (attack in speech or writing) – “The editors of the left-leaning paper attacked the new House Speaker” – But there is no such sense for the noun, as in “His verbal assault was vicious”

113 Goals Explore interactions between word sense and subjectivity –Can subjectivity labels be assigned to word senses? Manually Automatically –Can subjectivity analysis improve word sense disambiguation? –Can word sense disambiguation improve subjectivity analysis? Current work

114 Outline for Section 4 Motivation and Goals Assigning Subjectivity Labels to Word Senses –Manually –Automatically Word Sense Disambiguation using Automatic Subjectivity Analysis Conclusions

115 Annotation Scheme Assigning subjectivity labels to WordNet senses –S: subjective –O: objective –B: both

116 Examples S: Alarm, dismay, consternation – (fear resulting from the awareness of danger) – Fear, fearfulness, fright – (an emotion experienced in anticipation of some specific pain or danger (usually accompanied by a desire to flee or fight)) O: Alarm, warning device, alarm system – (a device that signals the occurrence of some undesirable event) – Device – (an instrumentality invented for a particular purpose; “the device is small enough to wear on your wrist”; “a device intended to conserve water”)

117 Subjective Sense Definition When the sense is used in a text or conversation, we expect it to express subjectivity, and we expect the phrase/sentence containing it to be subjective.

118 Subjective Sense Examples His alarm grew. Alarm, dismay, consternation – (fear resulting from the awareness of danger) – Fear, fearfulness, fright – (an emotion experienced in anticipation of some specific pain or danger (usually accompanied by a desire to flee or fight)) He was boiling with anger. Seethe, boil – (be in an agitated emotional state; “The customer was seething with anger”) – Be – (have the quality of being; (copula, used with an adjective or a predicate noun); “John is rich”; “This is not a good answer”)

119 Subjective Sense Examples What’s the catch? Catch – (a hidden drawback; “it sounds good but what’s the catch?”) Drawback – (the quality of being a hindrance; “he pointed out all the drawbacks to my plan”) That doctor is a quack. Quack – (an untrained person who pretends to be a physician and who dispenses medical advice) –Doctor, doc, physician, MD, Dr., medico

120 Objective Sense Examples The alarm went off. Alarm, warning device, alarm system – (a device that signals the occurrence of some undesirable event) – Device – (an instrumentality invented for a particular purpose; “the device is small enough to wear on your wrist”; “a device intended to conserve water”) The water boiled. Boil – (come to the boiling point and change from a liquid to vapor; “Water boils at 100 degrees Celsius”) – Change state, turn – (undergo a transformation or a change of position or action)

121 Objective Sense Examples He sold his catch at the market Catch, haul – (the quantity that was caught; “the catch was only 10 fish”) –Indefinite quantity – (an estimated quantity) The duck’s quack was loud and brief Quack – (the harsh sound of a duck) –Sound – (the sudden occurrence of an audible event)

122 Objective Senses: Observation We don’t necessarily expect phrases/sentences containing objective senses to be objective –Will someone shut that darn alarm off? –Can’t you even boil water? Subjective, but not due to alarm and boil

123 Objective Sense Definition When the sense is used in a text or conversation, we don’t expect it to express subjectivity and, if the phrase/sentence containing it is subjective, the subjectivity is due to something else.

124 Inter-Annotator Agreement Results Overall: –Kappa=0.74 –Percent Agreement=85.5% Without the 12.3% cases when a judge is uncertain: –Kappa=0.90 –Percent Agreement=95.0%

125 S/O and Pos/Neg Hypothesis: moving from word to word sense is more important for S versus O than it is for Positive versus Negative Pilot study with the nouns of the SENSEVAL-3 English lexical sample task –½ have both subjective and objective senses –Only 1 has both positive and negative subjective senses

126 Outline for Section 4 Motivation and Goals Assigning Subjectivity Labels to Word Senses –Manually –Automatically Word Sense Disambiguation using Automatic Subjectivity Analysis Conclusions

127 Overview Main idea: assess the subjectivity of a word sense based on information about the subjectivity of –a set of distributionally similar words –in a corpus annotated with subjective expressions

128 Preliminaries Suppose the goal were to assess the subjectivity of a word w, given an annotated corpus We could consider how often w appears in subjective expressions Or, we could take into account more evidence – the subjectivity of a set of distributionally similar words

129 Lin’s Distributional Similarity (Lin 1998) [Diagram: the sentence “I have a brown dog” with dependency relations R1–R4; each word is characterized by its (relation, word) pairs, e.g., I – R1 – have; have – R2 – dog; brown – R3 – dog; …]

130 Lin’s Distributional Similarity [Diagram: two words are compared by the overlap of their (R, W) feature sets.]

131 Subjectivity of word w From the unannotated corpus (BNC), collect the distributionally similar words DSW = {dsw_1, …, dsw_j}. Then, in the annotated corpus (MPQA): subj(w) = (#insts(DSW) in SE − #insts(DSW) not in SE) / #insts(DSW)

132 Subjectivity of word w subj(w) = (#insts(DSW) in SE − #insts(DSW) not in SE) / #insts(DSW), ranging over [−1, +1], from highly objective to highly subjective.

133 Subjectivity of word w Example with DSW = {dsw_1, dsw_2}: in the MPQA corpus, dsw_1 instance 1 falls in a subjective expression (+1), dsw_1 instance 2 does not (−1), and dsw_2 instance 1 does (+1), so subj(w) = (+1 − 1 + 1) / 3 = 1/3

134 Subjectivity of word sense w_i Rather than 1, add or subtract sim(w_i, dsw_j): subj(w_i) = (+sim(w_i, dsw_1) − sim(w_i, dsw_1) + sim(w_i, dsw_2)) / (2·sim(w_i, dsw_1) + sim(w_i, dsw_2))

135 Method –Step 1 Given word w Find distributionally similar words –DSW = {dsw j | j = 1.. n}

136 Method – Step 2 word w = alarm; distributionally similar words: panic (DSW_1), detector (DSW_2). Sense w_1 (“fear resulting from the awareness of danger”): sim(w_1, panic), sim(w_1, detector). Sense w_2 (“a device that signals the occurrence of some undesirable event”): sim(w_2, panic), sim(w_2, detector).

137 Method – Step 2 Find the similarity sim(w_i, dsw_j) between each word sense and each distributionally similar word, where wnss can be any concept-based similarity measure between word senses; we use Jiang & Conrath 1997.
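
The Jiang & Conrath measure is available in NLTK, for example; a small illustration, assuming the WordNet and wordnet_ic corpora are installed (the synset choices mirror the alarm example):

from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

# Information content counts from the Brown corpus
brown_ic = wordnet_ic.ic('ic-brown.dat')

alarm_fear = wn.synset('alarm.n.01')  # the dismay/consternation sense
panic = wn.synset('panic.n.01')
print(alarm_fear.jcn_similarity(panic, brown_ic))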

138 Method –Step 3 Input: word sense w i of word w DSW = {dsw j | j = 1..n} sim(w i,dsw j ) MPQA Opinion Corpus Output: subjectivity score subj(w i )

139 Method – Step 3
total_sim = Σ_j #insts(dsw_j) · sim(w_i, dsw_j)
evi_subj = 0
for each dsw_j in DSW:
    for each instance k in insts(dsw_j):
        if k is in a subjective expression:
            evi_subj += sim(w_i, dsw_j)
        else:
            evi_subj -= sim(w_i, dsw_j)
subj(w_i) = evi_subj / total_sim
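
A runnable Python version of the pseudocode, under the assumption that the corpus lookups are precomputed: each distributionally similar word maps to a list of booleans marking which of its MPQA instances fall inside subjective expressions.

def subjectivity_score(sense, dsw_instances, similarity):
    # sense: the word sense w_i being scored.
    # dsw_instances: dict mapping each distributionally similar word
    #   dsw_j to a list of booleans, one per corpus instance, True if
    #   that instance is inside a subjective expression.
    # similarity: function giving sim(w_i, dsw_j), e.g., Jiang-Conrath.
    total_sim = 0.0
    evi_subj = 0.0
    for dsw, instances in dsw_instances.items():
        sim = similarity(sense, dsw)
        total_sim += len(instances) * sim
        for in_subj_expression in instances:
            evi_subj += sim if in_subj_expression else -sim
    return evi_subj / total_sim if total_sim else 0.0  # subj(w_i) in [-1, +1]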

140 Evaluation Calculate subj scores for each word sense, and sort them. While 0 is a natural candidate for the division between S and O, we perform the evaluation for different thresholds in [−1, +1] and determine the precision of the algorithm at different points of recall.

141 Evaluation: precision/recall curves [Plot omitted; number of distributionally similar words = 160.]

142 Outline for Section 4 Motivation and Goals Assigning Subjectivity Labels to Word Senses –Manually –Automatically Word Sense Disambiguation using Automatic Subjectivity Analysis Conclusions

143 Overview Augment an existing data-driven WSD system with a feature reflecting the subjectivity of the context of the ambiguous word. Compare the performance of the original and subjectivity-aware WSD systems on the ambiguous nouns of the SENSEVAL-3 English lexical sample data.

144 Original WSD System Integrates local and topical features: Local: context of three words to the left and right, and their parts of speech. Topical: top five words occurring at least three times in the context of a word sense [Ng & Lee, 1996], [Mihalcea, 2002]. Naïve Bayes classifier [Lee & Ng, 2003].
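
A sketch of that feature vector in Python; the dict encoding and names are illustrative, and the optional last argument anticipates the subjectivity-aware variant described below:

def wsd_features(tokens, pos_tags, i, topical_words, subjectivity_tag=None):
    # Local features: the three words to each side of position i
    # and their part-of-speech tags.
    feats = {}
    for offset in range(-3, 4):
        j = i + offset
        if offset != 0 and 0 <= j < len(tokens):
            feats["word_%d" % offset] = tokens[j]
            feats["pos_%d" % offset] = pos_tags[j]
    # Topical features: presence of the top context words for the
    # target (topical_words assumed precomputed from training data).
    for w in topical_words:
        feats["topic_" + w] = w in tokens
    # Subjectivity-aware variant: the sentence's S/O/B tag as a feature.
    if subjectivity_tag is not None:
        feats["sent_subjectivity"] = subjectivity_tag
    return feats

The resulting feature dicts feed a Naïve Bayes learner (e.g., NLTK’s NaiveBayesClassifier).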

145 Subjectivity Classifier Rule-based automatic sentence classifier from Wiebe & Riloff 2005 Harvests subjective and objective sentences it is certain about from unannotated data Part of OpinionFinder @ www.cs.pitt.edu/mpqa/

146 Subjectivity Tagging for WSD Sentences of the SENSEVAL-3 data that contain a target noun (e.g., “interest”, “atmosphere”) are tagged S, O, or B by the subjectivity classifier; the tags are fed as input to the subjectivity-aware WSD system.

147 WSD using Subjectivity Tagging [Diagram: a sentence containing “interest” is tagged S by the subjectivity classifier and the tag is passed to the subjectivity-aware WSD system. Where the original system must choose between Sense 1 (“a fixed charge for borrowing money”) and Sense 4 (“a sense of concern with and curiosity about someone or something”), the S tag favors the subjective Sense 4.]

148 WSD using Subjectivity Tagging Hypothesis: instances of subjective senses are more likely to be in subjective sentences, so sentence subjectivity is an informative feature for WSD of words with both subjective and objective senses

149 Words with S and O Senses 4.3% error reduction; significant (p < 0.05, paired t-test). [Per-word results table omitted; for some words the S sense did not occur in the data.]

150 Words with Only O Senses Overall 2.2% error reduction; significant (p < 0.1). [Per-word results table omitted: mostly unchanged (=), a few improved (>) or degraded (<); one word noted as often being a target.]

151 Conclusions for Section 4 Can subjectivity labels be assigned to word senses? – Manually: good agreement (Kappa = 0.74); very good when uncertain cases are removed (Kappa = 0.90). – Automatically: the method substantially outperforms the baseline, showing the feasibility of assigning subjectivity labels at the fine-grained level of word senses.

152 Conclusions for Section 4 Can subjectivity analysis improve word sense disambiguation? –Quality of a WSD system can be improved with subjectivity information Improves performance, but mainly for words with both S and O senses –4.3% error reduction; significant (p < 0.05) Performance largely remains the same or degrades for words that don’t Once senses have been assigned subjectivity labels, a WSD system could consult them to decide whether to consider the subjectivity feature

153 Pointers To Related Work Tutorial held at EUROLAN 2007, “Semantics, Opinion, and Sentiment in Text”, Iaşi, Romania, August. Slides, bibliographies, …: www.cs.pitt.edu/~wiebe/EUROLAN07

154 Conclusions Subjectivity is common in language. Recognizing it is useful in many NLP tasks. It comes in many forms and is often context-dependent. A wide variety of features seem to be necessary for opinion and polarity recognition. Subjectivity may be assigned to word senses, promising improved performance for both subjectivity analysis and WSD – promising as well for multi-lingual subjectivity analysis (Mihalcea, Banea, Wiebe 2007).

155 Acknowledgements CERATOPS Center for the Extraction and Summarization of Events and Opinions in Text –Pittsburgh: Paul Hoffmann, Josef Ruppenhofer, Swapna Somasundaran, Theresa Wilson –Cornell: Claire Cardie, Eric Breck, Yejin Choi, Ves Stoyanov –Utah: Ellen Riloff, Sidd Patwardhan, Bill Phillips UNT: Rada Mihalcea, Carmen Banea NLP@Pitt: Wendy Chapman, Rebecca Hwa, Diane Litman, …

