DETERMINING THE SENTIMENT OF OPINIONS
Presentation by Md Mustafizur Rahman (mr4xb)

2 OUTLINE
- What is an opinion?
- Problem definition
- Word sentiment classifier
- Sentence sentiment classifier
- Experimental analysis
- Shortcomings
- Future work

3 WHAT IS AN OPINION?
- An opinion is a quadruple [Topic, Holder, Claim, Sentiment]: the Holder believes a Claim about the Topic and in many cases associates a Sentiment with that belief.
- An opinion may or may not contain sentiment, e.g. I believe the world is flat. (sentiment absent)
- Sentiment can be explicit or implicit:
  - e.g. I like apple. (explicit)
  - e.g. We should decrease our dependence on oil. (implicit)

4 PROBLEM DEFINITION
Opinion = [Topic, Holder, Claim, Sentiment]
- Given: a Topic and a set of texts about the topic
- Find: the sentiments (only positive or negative) about the topic in each sentence
- Identify: the people who hold each sentiment

5 AUTHORS' APPROACH
4 basic stages:
- Calculate the polarity of sentiment-bearing words (Word Sentiment Classifier)
- Select the sentences containing both the topic and a holder
- Identify holder-based regions
- Combine the word polarities to give the sentence sentiment (Sentence Sentiment Classifier)

6 WORD SENTIMENT CLASSIFIER
- To build a classifier we need training data.
- How do we generate training data for the word sentiment classifier?
  - Assemble a small set of seed words by hand
  - The seed word list contains only words of positive or negative polarity
  - Then grow this list by adding synonyms and antonyms from WordNet [1]
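The list-growing step can be sketched as follows. This is only an illustration under stated assumptions: the tiny `SYNONYMS` and `ANTONYMS` tables and the `expand_seeds` function are hypothetical stand-ins for WordNet lookups, and the rule that synonyms keep a word's polarity while antonyms flip it is assumed, since the slide does not spell it out.

```python
# Hypothetical miniature thesaurus; a real system would query WordNet instead.
SYNONYMS = {
    "good": {"fine", "nice"},
    "fine": {"good"},
    "bad": {"awful", "poor"},
}
ANTONYMS = {
    "good": {"bad"},
    "bad": {"good"},
}


def expand_seeds(positive, negative, rounds=2):
    """Grow two polarity lists: synonyms keep polarity, antonyms flip it."""
    pos, neg = set(positive), set(negative)
    for _ in range(rounds):
        new_pos, new_neg = set(pos), set(neg)
        for word in pos:
            new_pos |= SYNONYMS.get(word, set())
            new_neg |= ANTONYMS.get(word, set())
        for word in neg:
            new_neg |= SYNONYMS.get(word, set())
            new_pos |= ANTONYMS.get(word, set())
        pos, neg = new_pos, new_neg
    return pos, neg


# Start from a single hand-picked positive seed word.
pos, neg = expand_seeds({"good"}, set())
print(sorted(pos), sorted(neg))
```

Nothing in this sketch stops a word from landing in both lists, which matches the deck's later remark that some words appear in both categories.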

7 WORD SENTIMENT CLASSIFIER: WORDNET

8 WORD SENTIMENT CLASSIFIER: WORDNET (CONTD.)
Figure: An example of the relationship between hyponyms and hypernym [source: Wikipedia]

9 WORD SENTIMENT CLASSIFIER (CONTD.)
- Initial seed word list
  - Adjectives (15 positive and 19 negative)
  - Verbs (23 positive and 21 negative)
- Final seed word list
  - Adjectives (5880 positive and 6233 negative)
  - Verbs (2840 positive and 3239 negative)
- Some words, e.g. “great” and “strong”, appear in both the positive and negative categories.

10 WORD SENTIMENT CLASSIFIER (CONTD.)
- Now we have a set of words, each with a class label (polarity) of either positive or negative.
- How do we calculate the strength of the sentiment polarity?
  - For a new word w, first compute its synonym set (syn_1, syn_2, …, syn_n) from WordNet.
  - Then compute arg max_c P(c|w), which is equivalent to arg max_c P(c | syn_1, syn_2, …, syn_n).
  - Here c is the sentiment category (positive or negative).

11 WORD SENTIMENT CLASSIFIER (CONTD.)
There are two possible ways to calculate arg max_c P(c|w).
Approach 1, where:
- f_k is the kth feature of category c
- count(f_k, synset(w)) is the total number of occurrences of f_k in the synonym set of w
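The formula for Approach 1 is not reproduced in this transcript. One reading consistent with the definitions of f_k and count(f_k, synset(w)) is a naive-Bayes-style score, P(c) times the product over k of P(f_k|c) raised to count(f_k, synset(w)). The sketch below assumes that form. The function name, the probability tables, and the `floor` smoothing value are all hypothetical.

```python
import math
from collections import Counter


def classify_approach1(synset, prior, feature_prob, floor=1e-6):
    """Score each category c as log P(c) + sum_k count(f_k) * log P(f_k | c).

    `floor` is an assumed smoothing value for features unseen in a category.
    """
    counts = Counter(synset)  # count(f_k, synset(w)) for every feature f_k
    scores = {}
    for c, p_c in prior.items():
        log_score = math.log(p_c)
        for f_k, n in counts.items():
            log_score += n * math.log(feature_prob[c].get(f_k, floor))
        scores[c] = log_score
    best = max(scores, key=scores.get)
    return best, scores


# Made-up numbers, for illustration only.
PRIOR = {"positive": 0.5, "negative": 0.5}
FEATURE_PROB = {
    "positive": {"funny": 0.4, "entertaining": 0.6},
    "negative": {"funny": 0.1, "boring": 0.9},
}

label, scores = classify_approach1(
    ["funny", "entertaining", "funny"], PRIOR, FEATURE_PROB
)
print(label)
```

Working in log space avoids underflow when the synonym set is large. That is a design choice of this sketch, not something stated on the slide.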

12 WORD SENTIMENT CLASSIFIER (CONTD.)
There are two possible ways to calculate arg max_c P(c|w).
Approach 2, where:
- count(syn_i, c) is the number of occurrences of w’s synonyms in the list of c
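The formula for Approach 2 is also missing from the transcript. A minimal sketch, assuming the score is P(c) multiplied by the sum of count(syn_i, c) over w's synonyms and divided by the size of the word list for c (that normalizer is an assumption). The function name and the example word lists are hypothetical.

```python
def classify_approach2(synonyms, word_lists, prior):
    """Score each category c as P(c) * sum_i count(syn_i, c) / len(list of c)."""
    scores = {}
    for c, words in word_lists.items():
        hits = sum(words.count(syn) for syn in synonyms)  # sum_i count(syn_i, c)
        scores[c] = prior[c] * hits / len(words)
    best = max(scores, key=scores.get)
    return best, scores


# Made-up word lists; "great" sits in both, as the deck notes can happen.
WORD_LISTS = {
    "positive": ["good", "nice", "fine", "great"],
    "negative": ["bad", "poor", "awful", "great"],
}
PRIOR = {"positive": 0.5, "negative": 0.5}

label, scores = classify_approach2(["nice", "fine", "great"], WORD_LISTS, PRIOR)
print(label, scores)
```

Unlike the product form of Approach 1, a synonym missing from a category's list simply contributes nothing here, so no smoothing is needed.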

13 WORD SENTIMENT CLASSIFIER (CONTD.)
- The word “amusing”, for example, is classified as carrying primarily positive sentiment, and “blame” as primarily negative.
- “afraid” with strength -0.99 represents strong negativity, while “abysmal” with strength -0.61 represents weaker negativity.

14 SENTENCE SENTIMENT CLASSIFIER
Consists of 4 parts:
- Identification of the Topic in the sentence (i.e. direct matching)
- Identification of the opinion holder
- Identification of the region
- Development of a model to combine sentiments

15 SENTENCE SENTIMENT CLASSIFIER (CONTD.): HOLDER IDENTIFICATION
- Assumptions
  - Persons and organizations are the only opinion holders
  - For a sentence with more than one holder, just pick the one closest to the Topic
- Method
  - BBN's named entity tagger IdentiFinder [2], a software tool [http://www.bbn.com/technology/speech/identifinder]

16 SENTENCE SENTIMENT CLASSIFIER (CONTD.): SENTIMENT REGION IDENTIFICATION
- Where to look for the sentiment?
- Proposed sentiment regions:
  - Window 1: the full sentence
  - Window 2: the words between the holder and the Topic
  - Window 3: Window 2 ± 2 words
  - Window 4: Window 2 to the end of the sentence
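The four windows can be illustrated with a short sketch. It assumes the holder and the Topic each occupy a single token position, that Window 2 excludes both of them, and that "± 2" means extending Window 2 by two tokens on each side. The function name and the example sentence are hypothetical.

```python
def sentiment_region(tokens, holder_idx, topic_idx, window):
    """Return the tokens of the chosen sentiment window (1 to 4)."""
    lo, hi = sorted((holder_idx, topic_idx))
    if window == 1:  # full sentence
        return list(tokens)
    if window == 2:  # words strictly between holder and Topic
        return tokens[lo + 1:hi]
    if window == 3:  # Window 2 widened by two tokens on each side
        return tokens[max(0, lo - 1):hi + 2]
    if window == 4:  # from the start of Window 2 to the end of the sentence
        return tokens[lo + 1:]
    raise ValueError("window must be 1, 2, 3 or 4")


# Made-up sentence: holder "Smith" at index 0, Topic "NAFTA" at index 3.
tokens = "Smith strongly opposes NAFTA because it hurts workers".split()
for w in (1, 2, 3, 4):
    print(w, sentiment_region(tokens, 0, 3, w))
```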

17 SENTENCE SENTIMENT CLASSIFIER (CONTD.): CLASSIFICATION MODEL
3 different models:
- Model 0: uses only the signs of the sentiments, which can be positive or negative
- Model 1: harmonic mean of the sentiment in the region
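The formula for Model 0 is not in the transcript. One simple reading, assumed here, is that the signs of the sentiment words in the region are multiplied together, so that two negatives cancel. The function name and the convention of returning 0 for an empty region are hypothetical.

```python
import math


def model0(signs):
    """Combine word polarities (+1 or -1) in a region by multiplying them."""
    if not signs:
        return 0  # assumed convention: no sentiment words, no decision
    return math.prod(signs)


print(model0([-1, -1]))  # two negative words
print(model0([+1, -1]))  # one positive and one negative word
```

Because only signs are used, the strength values produced by the word classifier play no role in this model.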

18 SENTENCE SENTIMENT CLASSIFIER (CONTD.): CLASSIFICATION MODEL
- Model 1 (contd.)
  - n(c) is the number of words in the region whose sentiment category is c
  - s is the sentiment strength
- Model 2: geometric mean of the sentiment in the region
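The exact formulas for Models 1 and 2 are not in the transcript, so the sketch below takes the slide's names literally: for each category c it averages the strengths of the n(c) words labelled c, using a harmonic mean (Model 1) or a geometric mean (Model 2), and picks the category with the larger value. The original formulas may normalize or scale differently. The function names, the "undecided" tie value, and the example strengths are hypothetical.

```python
from statistics import geometric_mean, harmonic_mean


def region_score(words, category, mean):
    """Average the strengths of the n(c) words whose category is `category`."""
    strengths = [s for c, s in words if c == category]  # len(strengths) == n(c)
    return mean(strengths) if strengths else 0.0


def classify_region(words, mean):
    """Pick the category with the larger mean strength in the region."""
    pos = region_score(words, "positive", mean)
    neg = region_score(words, "negative", mean)
    if pos == neg:
        return "undecided"  # assumed tie handling
    return "positive" if pos > neg else "negative"


# Made-up (category, strength) pairs for the words in one region.
region = [("positive", 0.8), ("positive", 0.4), ("negative", 0.5)]
print(classify_region(region, harmonic_mean))   # Model 1 reading
print(classify_region(region, geometric_mean))  # Model 2 reading
```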

19 SYSTEM ARCHITECTURE

20 EXPERIMENTAL ANALYSIS
Two sets of experiments:
- Word Sentiment Classifier
- Sentence Sentiment Classifier

21 EXPERIMENTAL ANALYSIS (CONTD.): WORD SENTIMENT CLASSIFIER
- Dataset
  - A word list from the TOEFL exam
  - A predefined list containing 19748 English adjectives and 8011 English verbs
  - Take the intersection of the above two lists
  - Finally, randomly select 462 adjectives and 502 verbs
- Classification of the dataset
  - Human 1 and Human 2 label the adjectives
  - Human 2 and Human 3 label the verbs

22 EXPERIMENTAL ANALYSIS (CONTD.): WORD SENTIMENT CLASSIFIER
- Class labels: positive, negative and neutral
- Measurement types
  - Strict: considers all three class labels
  - Lenient: two class labels, negative versus positive merged with neutral
- Table: Inter-human agreement
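The two measurement types can be sketched as below, assuming agreement is the fraction of words on which two annotators give the same label, and that the lenient measure folds neutral into positive before comparing. The function name and the label sequences are made up for illustration and are not the study's data.

```python
def agreement(labels_a, labels_b, lenient=False):
    """Fraction of items on which two annotators agree."""
    def norm(label):
        # Lenient measure: merge neutral into positive (assumed reading).
        return "positive" if lenient and label == "neutral" else label

    matches = sum(norm(a) == norm(b) for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical annotations of four words by two humans.
human1 = ["positive", "neutral", "negative", "neutral"]
human2 = ["neutral", "neutral", "negative", "negative"]

print(agreement(human1, human2))                # strict
print(agreement(human1, human2, lenient=True))  # lenient
```

Under this reading the lenient figure can never be lower than the strict one, since merging labels only removes disagreements.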

23 EXPERIMENTAL ANALYSIS (CONTD.): WORD SENTIMENT CLASSIFIER
Table: Human-machine agreement (small seed set)
Table: Human-machine agreement (larger seed set)

24 EXPERIMENTAL ANALYSIS (CONTD.): SENTENCE SENTIMENT CLASSIFIER
- Dataset
  - 100 sentences from the DUC 2001 Corpus [3]
  - Topics covered: “illegal alien”, “term limit”, “gun control” and “NAFTA”
- Classification of sentences
  - Two humans classify each sentence into one of three class labels: positive, negative and N/A

25 EXPERIMENTAL ANALYSIS (CONTD.): SENTENCE SENTIMENT CLASSIFIER
- Experiment variants
  - Three different models
  - Four different windows
  - Two different word classifier models
  - Manually annotated holder vs. automatic holder
- So in total there are 16 variants each for Model 1 and Model 2, and 8 variants for Model 0.
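The variant counts follow from multiplying the options: 4 windows × 2 word classifiers × 2 holder sources = 16. The slide does not say why Model 0 has only 8. One plausible reading, assumed here, is that Model 0 uses signs rather than strengths, so the choice of word classifier does not create distinct variants. The option labels below are hypothetical.

```python
from itertools import product

windows = [1, 2, 3, 4]
word_classifiers = ["approach 1", "approach 2"]
holders = ["manual", "automatic"]

# Models 1 and 2: every combination of window, word classifier and holder source.
variants = list(product(windows, word_classifiers, holders))

# Model 0 (assumed reading): the word classifier choice drops out.
model0_variants = list(product(windows, holders))

print(len(variants), len(model0_variants))
```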

26 EXPERIMENTAL ANALYSIS (CONTD.): SENTENCE SENTIMENT CLASSIFIER
Table: Results with manually annotated holder
Table: Results with automatic holder

27 EXPERIMENTAL ANALYSIS (CONTD.): SENTENCE SENTIMENT CLASSIFIER
- Performance metric: correctness, i.e. correct identification of both holder and sentiment
- Best model: Model 0
- Best window: Window 4
- Accuracy
  - 81% with manually annotated holders
  - 67% with automatic holders

28 SHORTCOMINGS
- Considers only a unigram model. As a result, the model fails for some words that carry both positive and negative sentiment. E.g.: Term limit really hit at democracy.
- The model cannot infer sentiment from facts: the absence of adjective, verb and noun sentiment words prevents classification. E.g.: She thinks term limit will give women more opportunities in politics.

29 FUTURE WORK
- One assumption of this work is that the topic is given. Can we extract the topic automatically, e.g. from Twitter hashtags?
- Sentiment beyond only positive or negative
- Context-dependent sentiment (bi-gram or tri-gram analysis)

30 REFERENCES
[1] Miller, G.A., R. Beckwith, C. Fellbaum, D. Gross, and K. Miller. 1993. Introduction to WordNet: An On-Line Lexical Database. http://www.cosgi.princeton.edu/~wn.
[2] BBN named entity tagger IdentiFinder. http://www.bbn.com/technology/speech/identifinder
[3] DUC 2001 Corpus. http://www-nlpir.nist.gov/projects/duc/data.html

