Instructor: Smaranda Muresan Columbia University Sentiment Lexicons.

Instructor: Smaranda Muresan Columbia University smara@ccls.columbia.edu Sentiment Lexicons

Announcements Class setup on Courseworks too. Class website linked to Courseworks (“Class website” tab) TA’s (Arpit Gupta) office hours – Monday 4:15-5:15pm in TA room in Mudd TA’s email – ta.cmsm@gmail.com

Class Today Word level sentiment analysis (Sentiment Lexicons) Discussion of the two papers Introduction to Sentiment Analysis beyond words (sentence level, text level) (to facilitate discussion of articles next week)

What is sentiment analysis? Attempts to identify the sentiment/opinion that a person may hold towards an object/person/topic etc. It is a finer-grained analysis compared to subjectivity analysis. Sentiment analysis vs subjectivity analysis: Positive and Negative correspond to Subjective; Neutral corresponds to Objective. This film should be brilliant. It sounds like a great plot, the actors are first grade, and the supporting cast is good as well, and Stallone is attempting to deliver a good performance. However, it can’t hold up.

Why sentiment analysis? Movie: is this review positive or negative? Products: what do people think about the new iPhone? Public sentiment: how is consumer confidence? Is despair increasing? Politics: what do people think about this candidate or issue? Prediction: predict election outcomes or market trends from sentiment

Goal of today’s lecture Gain insights into how sentiment is expressed lexically Begin developing resources that are useful in higher level classification (phrase level, sentence level, document level) Explore different philosophies on how to build such large scale sentiment lexicons

What are we classifying gross (gross,adj) (gross,noun) (gross,verb) gross out GROSS!!! The soup was gross – 1 star The horror movie was gross – 5 stars

Words Adjectives – positive: honest important mature large patient Ron Paul is the only honest man in Washington. Kitchell’s writing is unbelievably mature and is only likely to get better. To humour me my patient father agrees yet again to my choice of film

Words Adjectives – negative: harmful hypocritical inefficient insecure It was a macabre and hypocritical circus. Why are they being so inefficient ? Slide from Janyce Wiebe

Other parts of speech Verbs – positive: praise, love – negative: blame, criticize Nouns – positive: pleasure, enjoyment – negative: pain, criticism

Hand Annotated/Compiled Lexicons WordNet-based approaches Distributional Approaches How to build sentiment lexicons

General Inquirer (GI) Harvard General Inquirer Database (Stone, 1966) – Total of 11,788 terms – http://www.wjh.harvard.edu/~inquirer/spreadsheet_guide.htm – http://www.wjh.harvard.edu/~inquirer/homecat.htm – Positive (1915 words) vs Negative (2291 words) (the remaining 7582 could be considered Neutral) – Strong vs Weak – Active vs Passive – Overstated vs Understated – Pleasure, Pain, Virtue, Vice – Motivation, Cognitive Orientation, etc

WordNet (Miller, 1995; Fellbaum, 1998) Semantic lexical resource http://wordnetweb.princeton.edu/perl/webwn www.globalwordnet.org (multilingual) Synsets (denote different senses of a word)

Micro-WNOp (Cerini et al 2007) http://www-3.unipv.it/wnop/ 1105 WordNet synsets related to the opinion topic (initial words were selected from the GI)

Micro-WNOp (Cerini et al 2007) Micro-WNOp statistics: reduced to the 702 synsets on which all annotators agreed. ISSUES with hand-built lexicons such as GI, Micro-WNOp?

Hand Annotated/Compiled Lexicons WordNet-based approaches Distributional Approaches How to build sentiment lexicons

Simple sense/sentiment propagation Hypothesis: Sentiment is constant throughout regions of lexically related items. Thus, sentiment properties of hand-built seed sets will be preserved as we follow WordNet relations out from them. SentiWordNet (Esuli and Sebastiani, 2006) – Approx 1.7 Million words – Using WordNet and Machine Learning (Classifiers) – Each synset is assigned three scores: Positive, Negative, Objective

Values in the 3 dimensions sum to 1. Ex: P=0.75, N=0, O=0.25

Building SentiWordNet Lp, Ln, Lo are the three seed sets Iteratively expand the seed sets through K steps Train the classifier for the expanded sets

Expansion of seed sets Lp and Ln are expanded by following WordNet relations such as also-see and antonymy. The sets at the end of the kth step are called Tr(k,p) and Tr(k,n). Tr(k,o) is the set of items not present in Tr(k,p) or Tr(k,n)
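
The expansion step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SentiWordNet implementation: the `ALSO_SEE` and `ANTONYM` dictionaries are made-up stand-ins for WordNet relations, `expand_seeds` and `objective_set` are hypothetical helper names, and the sketch assumes that also-see links preserve polarity while antonymy links flip it.

```python
# Toy stand-ins for WordNet relations (hypothetical data, not real WordNet).
ALSO_SEE = {
    "good": ["nice", "fine"],
    "nice": ["pleasant"],
    "bad": ["awful"],
}
ANTONYM = {
    "good": ["bad"],
    "bad": ["good"],
    "pleasant": ["unpleasant"],
}

def expand_seeds(pos_seeds, neg_seeds, k):
    """Expand the positive and negative seed sets for k steps.

    Assumption: also-see links keep polarity, antonym links flip it.
    Returns the pair (Tr_p, Tr_n) after k iterations.
    """
    pos, neg = set(pos_seeds), set(neg_seeds)
    for _ in range(k):
        new_pos, new_neg = set(pos), set(neg)
        for w in pos:
            new_pos.update(ALSO_SEE.get(w, []))
            new_neg.update(ANTONYM.get(w, []))
        for w in neg:
            new_neg.update(ALSO_SEE.get(w, []))
            new_pos.update(ANTONYM.get(w, []))
        pos, neg = new_pos, new_neg
    return pos, neg

def objective_set(vocabulary, pos, neg):
    """Tr(k,o): everything in the vocabulary that is in neither expanded set."""
    return set(vocabulary) - pos - neg
```

The sketch does not resolve words that end up in both sets; a real system would need a policy for such conflicts.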

Committee of classifiers Train a committee of classifiers of different types and different K-values for the given data Observations: – Low values of K give high precision and low recall – Accuracy in determining positivity or negativity, however, remains almost constant

Useful Sentiment Tutorial http://sentiment.christopherpotts.net/ Has code related to WordNet propagation methods (used in SentiWordNet) Many other pointers! Issues with the WordNet based propagation lexicons?

Other Sentiment Lexicons

MPQA Subjectivity Cues Lexicon Home page: http://www.cs.pitt.edu/mpqa/subj_lexicon.html 6885 words from 8221 lemmas – 2718 positive – 4912 negative Each word annotated for intensity (strong, weak) GNU GPL Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (2005). Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Proc. of HLT-EMNLP-2005. Riloff and Wiebe (2003). Learning extraction patterns for subjective expressions. EMNLP-2003.

Bing Liu Opinion Lexicon Bing Liu's Page on Opinion Mining http://www.cs.uic.edu/~liub/FBS/opinion-lexicon-English.rar 6786 words – 2006 positive – 4783 negative Minqing Hu and Bing Liu. Mining and Summarizing Customer Reviews. ACM SIGKDD-2004.

Disagreements between polarity lexicons
                    Opinion Lexicon    General Inquirer    SentiWordNet
MPQA                33/5402 (0.6%)     49/2867 (2%)        1127/4214 (27%)
Opinion Lexicon                        32/2411 (1%)        1004/3994 (25%)
General Inquirer                                           520/2306 (23%)
Christopher Potts, Sentiment Tutorial, 2011
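
A comparison of this kind can be sketched as follows. The sketch assumes that each cell counts the words two lexicons assign opposite polarity to, out of the words they share; that reading is an assumption, as are the miniature lexicons and the helper name `disagreement`.

```python
def disagreement(lex_a, lex_b):
    """Return (conflicts, shared) for two word -> polarity dictionaries."""
    shared = lex_a.keys() & lex_b.keys()
    conflicts = sum(1 for w in shared if lex_a[w] != lex_b[w])
    return conflicts, len(shared)

# Hypothetical miniature lexicons, for illustration only.
lexicon_a = {"good": "positive", "gross": "negative", "cheap": "negative"}
lexicon_b = {"good": "positive", "cheap": "positive", "awful": "negative"}

conflicts, shared = disagreement(lexicon_a, lexicon_b)
rate = conflicts / shared if shared else 0.0
```

Here the two toy lexicons share "good" and "cheap" and disagree only on "cheap".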

Hand Annotated/Compiled Lexicons WordNet-based approaches Distributional Approaches – 2 papers for discussion today How to build sentiment lexicons

Presenter: Smaranda Muresan Predicting the semantic orientation of adjectives Hatzivassiloglou & McKeown 1997

Goal Predicting polarity of adjectives from a large corpus Test the hypothesis: the morphosyntactic properties of coordination provide reliable information about adjectival oppositions and lexical polarities

Adjectives conjoined by “and” have same polarity – Fair and legitimate, corrupt and brutal – *fair and brutal, *corrupt and legitimate Adjectives conjoined by “but” do not – fair but brutal

Approach Extract conjunctions of adjectives from a large corpus, along with relevant morphological relations Use a log-linear regression model to predict orientation of two different adjectives Use a clustering algorithm to separate the adjectives into two subsets of different orientation Use average frequencies in each group to assign the label (group with highest frequency is labeled positive)

Seed data Label seed set of 1336 adjectives (all occurring more than 20 times in the 21 million word Wall Street Journal corpus) – 657 positive: adequate central clever famous intelligent remarkable reputed sensitive slender thriving… – 679 negative: contagious drunken ignorant lanky listless primitive strident troublesome unresolved unsuspecting… Further validation: ask 4 human judges to label a subset of 500 adjectives: 96.97% average inter-judge agreement

Validating the Hypothesis Run a parser on the 21 million word dataset to get 15,048 conjunction tokens involving 9,296 distinct adjective pairs. Each conjunction was classified by: – 1) conjunction used (and, or, but,…) – 2) type of modification (attributive, predicative) – 3) number of the modified noun (singular or plural) Considered conjunctions where both members were in the seed set (e.g. clever and sensitive) Count the percentage of conjunctions in each category with adjectives of the same or different orientation
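
The counting step can be sketched as below. This is a simplified illustration, not the paper's code: the seed labels and the extracted (adjective, conjunction, adjective) triples are made up, the modification-type and noun-number categories are left out, and `same_orientation_rates` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical seed labels and extracted (adj1, conjunction, adj2) triples.
SEED = {"fair": "+", "legitimate": "+", "clever": "+", "sensitive": "+",
        "corrupt": "-", "brutal": "-"}
CONJUNCTIONS = [
    ("fair", "and", "legitimate"),
    ("corrupt", "and", "brutal"),
    ("clever", "and", "sensitive"),
    ("fair", "but", "brutal"),
    ("fair", "and", "unknownword"),   # skipped: not both in the seed set
]

def same_orientation_rates(triples, seed):
    """Per conjunction, the fraction of seed-labelled pairs with the same polarity."""
    same, total = Counter(), Counter()
    for a, conj, b in triples:
        if a in seed and b in seed:
            total[conj] += 1
            if seed[a] == seed[b]:
                same[conj] += 1
    return {conj: same[conj] / total[conj] for conj in total}

rates = same_orientation_rates(CONJUNCTIONS, SEED)
```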

Validating Hypothesis For almost all the cases p-values are low. Hence the statistics are significant. ‘and’ usually joins adjectives of the same orientation; ‘but’ behaves in the opposite way and usually joins adjectives of different orientation

Link Prediction [Figure: graph over the adjectives classy, nice, helpful, fair, brutal, irrational, corrupt] Baselines: always use same orientation – 77.84%; the “but” rule; morphological rules (adequate-inadequate) Better idea: supervised learning using log-linear regression

Result of Prediction The log-linear regression model performs slightly better than the baseline

Clustering for partitioning the graph into two groups The log-linear model generates a dissimilarity score between 0 and 1 for each pair of adjectives [Figure: graph over the adjectives classy, nice, helpful, fair, brutal, irrational, corrupt]

Labeling the clusters Two key insights about pairs of words of opposite orientations: - the semantically unmarked member has positive orientation (e.g. honest (unmarked) vs dishonest (marked)) - the semantically unmarked member is the most frequent [Figure: the adjective graph split into a positive (+) and a negative (-) group]
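
The clustering and labeling steps can be illustrated on a toy graph. This sketch swaps in a brute-force search over all two-way splits, which is only feasible for a handful of words and is not the optimization procedure used in the paper. The dissimilarity scores, the corpus frequencies, the neutral default of 0.5 for pairs never seen together, and the helper names are all assumptions made for illustration.

```python
from itertools import combinations

DEFAULT = 0.5  # assumed neutral score for pairs with no predicted link

def cost(group, dissim):
    """Sum of pairwise dissimilarities inside one group."""
    return sum(dissim.get(frozenset(p), DEFAULT)
               for p in combinations(sorted(group), 2))

def best_bipartition(words, dissim):
    """Try every split into two non-empty groups and keep the cheapest one."""
    words = sorted(words)
    first, rest = words[0], words[1:]
    best = None
    for r in range(len(rest)):            # how many words join `first`
        for extra in combinations(rest, r):
            g1 = {first, *extra}
            g2 = set(words) - g1
            total = cost(g1, dissim) + cost(g2, dissim)
            if best is None or total < best[0]:
                best = (total, g1, g2)
    return best[1], best[2]

def label_groups(g1, g2, freq):
    """Return (positive, negative): the group with higher mean frequency is positive."""
    def mean_freq(group):
        return sum(freq[w] for w in group) / len(group)
    return (g1, g2) if mean_freq(g1) >= mean_freq(g2) else (g2, g1)

# Hypothetical link scores: low = likely same orientation, high = likely different.
DISSIM = {
    frozenset(("nice", "helpful")): 0.1,
    frozenset(("helpful", "fair")): 0.1,
    frozenset(("brutal", "corrupt")): 0.1,
    frozenset(("fair", "brutal")): 0.9,
    frozenset(("nice", "corrupt")): 0.9,
}
FREQ = {"nice": 120, "helpful": 80, "fair": 100, "brutal": 30, "corrupt": 40}

positive, negative = label_groups(*best_bipartition(FREQ, DISSIM), FREQ)
```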

Output polarity lexicon Positive – bold decisive disturbing generous good honest important large mature patient peaceful positive proud sound stimulating straightforward strange talented vigorous witty… Negative – ambiguous cautious cynical evasive harmful hypocritical inefficient insecure irrational irresponsible minor outspoken pleasant reckless risky selfish tedious unsupported vulnerable wasteful…

Evaluating Clustering of Adjectives Tried to account for graph connectivity Used the adjectives from the seed set (A) and links given by conjunctions and morphological rules Separated into training/testing sets using a parameter α – a higher α creates a subset of A in which more adjectives are connected to each other.

Clustering Results Highest accuracy obtained when highest number of links were present. Ratio of group frequency correctly identified the positive subgroup

Graph Connectivity and Performance Parameter P (precision) – how well each link is predicted independently Parameter k – average number of links for each adjective Goal: even if P is low, given enough data (high k) a high performance for group prediction is achieved

Results

Discussion points What do you see as the major contribution of this paper? - Helps to highlight in a quantitative way the relationship between sentiment and particular words and constructions (coordination) - useful linguistic insight - corpus-based method (thus avoiding the limitations of human-built resources such as WordNet) - Can be extended to nouns and verbs. Classic paper, cited 1127 times

Discussion points Does it have all the information for anyone to be able to replicate the results? – How is the dissimilarity value computed? (multiple values are delivered for an adjective pair in different environments) What are the limitations of the approach? – Method is limited by human cleverness in coming up with useful constructions

Velikovich et al

Class Today Word level sentiment analysis (Sentiment Lexicons) Discussion of the two papers Introduction to Sentiment Analysis beyond words (phrase level, text level) (to facilitate discussion of articles next week)

What is sentiment analysis? Attempts to identify the sentiment/opinion/attitude that a person may hold towards an object/person/topic etc

Components 1. Holder (source) of attitude 2. Target (aspect) of attitude 3. Type of attitude – From a set of types: like, love, hate, value, desire, etc. – Or (more commonly) simple weighted polarity: positive, negative, neutral, together with strength 4. Text containing the attitude – Sentence or entire document This film should be brilliant. It sounds like a great plot, the actors are first grade, and the supporting cast is good as well, and Stallone is attempting to deliver a good performance. However, it can’t hold up.

Sentiment Analysis Simplest task: – Is the attitude of this text positive or negative? More complex: – Rank the attitude of this text from 1 to 5 Advanced: – Detect the target, source, or complex attitude types

Sentiment Analysis A Baseline Algorithm

Sentiment Classification in Movie Reviews Polarity detection: – Is an IMDB movie review positive or negative? Data: Polarity Data 2.0: – http://www.cs.cornell.edu/people/pabo/movie-review-data Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP-2002, 79-86. Bo Pang and Lillian Lee. 2004. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. ACL, 271-278

Text Classification: definition The classifier (test phase): – Input: a document d (e.g., a movie review) – Output: a predicted class c from some fixed set of labels c_1, ..., c_K (e.g., pos, neg) The learner (training phase): – Input: a set of m hand-labeled documents (d_1, c_1), ..., (d_m, c_m) – Output: a learned classifier f: d → c

IMDB data in the Pang and Lee database Positive review (✓): when _star wars_ came out some twenty years ago, the image of traveling throughout the stars has become a commonplace image. […] when han solo goes light speed, the stars change to bright lines, going towards the viewer in lines that converge at an invisible point. cool. _october sky_ offers a much simpler image–that of a single white dot, traveling horizontally across the night sky. [...] Negative review (✗): “ snake eyes ” is the most aggravating kind of movie : the kind that shows so much potential then becomes unbelievably disappointing. it’s not just because this is a brian depalma film, and since he’s a great director and one who’s films are always greeted with at least some fanfare. and it’s not even because this was a film starring nicolas cage and since he gives a brauvara performance, this film is hardly worth his talents.

Baseline Algorithm (adapted from Pang and Lee) Tokenization Feature Extraction Classification using different classifiers – Naïve Bayes – MaxEnt – Support Vector Machines (SVM)

Sentiment Tokenization Issues Deal with HTML and XML markup Twitter mark-up (names, hash tags) Capitalization (preserve for words in all caps) Phone numbers, dates Emoticons Useful code: – Christopher Potts sentiment tokenizer – Brendan O’Connor twitter tokenizer
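
A few of these issues can be illustrated with a small regex tokenizer. This is a rough sketch and not either of the tokenizers named above: the pattern covers only a handful of emoticons, tag stripping is naive, and phone numbers and dates are not handled. The names `TOKEN_RE` and `tokenize` are hypothetical.

```python
import re

TOKEN_RE = re.compile(r"""
    [<>]?[:;=]-?[)(\]\[DPp]      # a few emoticons, e.g. a colon followed by a bracket
  | @\w+                         # Twitter user names
  | \#\w+                        # hashtags
  | \w+(?:['’]\w+)*              # words, keeping internal apostrophes
  | [^\w\s]                      # any other single non-space symbol
""", re.VERBOSE)

TAG_RE = re.compile(r"<[^>]+>")  # naive HTML/XML tag matcher

def tokenize(text):
    """Strip markup, split into tokens, and lowercase all but ALL-CAPS words."""
    text = TAG_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text)
    return [
        t if (t.isupper() and len(t) > 1) or not t[0].isalnum() else t.lower()
        for t in tokens
    ]
```

For instance, `tokenize("GROSS!!! I didn't <b>like</b> it :( #fail @Bob")` keeps GROSS in capitals, keeps the emoticon and the Twitter mark-up as single tokens, and lowercases the remaining words.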

Extracting Features for Sentiment Classification How to handle negation – I didn’t like this movie vs – I really like this movie Which words to use? – Only adjectives – All words All words turns out to work better, at least on this data

Negation Add NOT_ to every word between negation and following punctuation: didn’t like this movie , but I → didn’t NOT_like NOT_this NOT_movie , but I Das, Sanjiv and Mike Chen. 2001. Yahoo! for Amazon: Extracting market sentiment from stock message boards. In Proceedings of the Asia Pacific Finance Association Annual Conference (APFA). Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP-2002, 79-86.
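
The NOT_ marking rule can be sketched over a list of tokens. This is a minimal version under stated assumptions: the set of negation words and the set of scope-ending punctuation marks are illustrative choices, and `mark_negation` is a hypothetical helper name.

```python
import re

# Assumed negation cues: "not", "no", "never", and any word ending in n't.
NEGATION_RE = re.compile(r"^(?:not|no|never|\w*n['’]t)$", re.IGNORECASE)
# Assumed scope-ending punctuation.
PUNCT_RE = re.compile(r"^[.,:;!?]+$")

def mark_negation(tokens):
    """Prefix NOT_ to every token between a negation word and the next punctuation."""
    out, negated = [], False
    for tok in tokens:
        if PUNCT_RE.match(tok):
            negated = False          # punctuation closes the negation scope
            out.append(tok)
        elif NEGATION_RE.match(tok):
            negated = True           # the negation word itself is left unmarked
            out.append(tok)
        else:
            out.append("NOT_" + tok if negated else tok)
    return out
```

The input is assumed to be tokenized already, with punctuation as separate tokens.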

Classification methods Naïve Bayes MaxEnt SVM
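
As an illustration of the first method, here is a compact multinomial Naïve Bayes with add-one smoothing, written from scratch so that it is self-contained. It is a teaching sketch rather than the setup used by Pang and Lee; the training sentences are invented and `train_naive_bayes` is a hypothetical name. The learner returns a classifier function, mirroring the learner/classifier split in the text classification definition.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(labeled_docs):
    """Learner: takes [(tokens, label), ...], returns a classifier f(tokens) -> label."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    n_docs = sum(class_counts.values())

    def classify(tokens):
        scores = {}
        for label, n in class_counts.items():
            total = sum(word_counts[label].values())
            score = math.log(n / n_docs)                 # log prior
            for w in tokens:
                if w in vocab:                           # ignore unseen words
                    # add-one (Laplace) smoothed log likelihood
                    score += math.log(
                        (word_counts[label][w] + 1) / (total + len(vocab))
                    )
            scores[label] = score
        return max(scores, key=scores.get)

    return classify

# Invented toy training data.
TRAIN = [
    ("great plot great cast".split(), "pos"),
    ("brilliant and good".split(), "pos"),
    ("boring and disappointing".split(), "neg"),
    ("bad plot".split(), "neg"),
]
classifier = train_naive_bayes(TRAIN)
```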

Evaluating Classification Evaluation must be done on test data that are independent of the training data – usually a disjoint set of instances Classification accuracy: c / n where n is the total number of test instances and c is the number of test instances correctly classified by the system. – Adequate if one class per document Results can vary based on sampling error due to different training and test sets. – Average results over multiple training and test sets (splits of the overall data) for the best results. Slide from Chris Manning

Cross-Validation Break up data into 10 folds – (Equal positive and negative inside each fold?) For each fold – Choose the fold as a temporary test set – Train on 9 folds, compute performance on the test fold Report average performance of the 10 runs
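
The procedure can be sketched as below, including one possible answer to the question about class balance: documents of each class are dealt round-robin into the folds, so every fold has roughly the same class proportions. The helper names and the majority-class learner are hypothetical stand-ins, and the sketch assumes there are at least as many documents as folds.

```python
from collections import Counter

def stratified_folds(labeled_docs, n_folds=10):
    """Deal the documents of each class round-robin into n_folds folds."""
    folds = [[] for _ in range(n_folds)]
    by_label = {}
    for doc in labeled_docs:
        by_label.setdefault(doc[1], []).append(doc)
    for docs in by_label.values():
        for i, doc in enumerate(docs):
            folds[i % n_folds].append(doc)
    return folds

def accuracy(classifier, test_docs):
    """c / n: fraction of test documents labelled correctly."""
    correct = sum(1 for x, y in test_docs if classifier(x) == y)
    return correct / len(test_docs)

def cross_validate(train_fn, labeled_docs, n_folds=10):
    """Train on n_folds - 1 folds, test on the held-out fold, average the scores."""
    folds = stratified_folds(labeled_docs, n_folds)
    scores = []
    for i, test in enumerate(folds):
        train = [d for j, fold in enumerate(folds) if j != i for d in fold]
        scores.append(accuracy(train_fn(train), test))
    return sum(scores) / len(scores)

def train_majority(train_docs):
    """Stand-in learner: always predicts the most common training label."""
    majority = Counter(y for _, y in train_docs).most_common(1)[0][0]
    return lambda x: majority
```

Any learner with the same shape (training documents in, classifier function out) can be passed as `train_fn`.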

Other issues in Classification MaxEnt and SVM tend to do better than Naïve Bayes

Problems: What makes reviews hard to classify? Subtlety: – Perfume review in Perfumes: the Guide: “If you are reading this because it is your darling fragrance, please wear it at home exclusively, and tape the windows shut.”

Thwarted Expectations and Ordering Effects “This film should be brilliant. It sounds like a great plot, the actors are first grade, and the supporting cast is good as well, and Stallone is attempting to deliver a good performance. However, it can’t hold up.” Well as usual Keanu Reeves is nothing special, but surprisingly, the very talented Laurence Fishbourne is not so good either, I was surprised.

Due Next Class Readings – Chapter 4 from Pang and Lee “Opinion Mining and Sentiment Analysis” book – 2 papers for discussions A short data analysis assignment – Description on Courseworks under Assignments – Goal is to get a better understanding of data and the problems discussed in class – Grade: Excellent/Good/Insufficient – Due before class. No late submissions

Next class Discussion of 2 papers (50 minutes) – 25 minutes per paper – Prepare a 15 min presentation and lead discussion for 10 minutes 5 min break More in-depth lecture on sentiment analysis & open questions (can lead to ideas for projects) – 30 minutes Introduction to Emotion/Mood (25 minutes)

Announcements The assignments of paper for discussions will be done by Saturday, Feb 1, 5pm. TA office hours – 4:15-5:15pm Mondays in the TA room in Mudd TA email: ta.cmsm@gmail.com Email TA if you’d like a tutorial on Text Classification and existing toolkits

Announcements Grading policy slightly updated to include data analysis assignments – 10% data analysis assignments (3 assignments, grading Excellent/Good/Insufficient). No late submissions! See class website for details – 30% discussion of papers – 60% project: 10% literature review part, 5% class presentation, 45% final paper and project
