Sarcasm Detection on Twitter: A Behavioral Modeling Approach


Sarcasm Detection on Twitter: A Behavioral Modeling Approach
Ashwin Rajadesingan, Reza Zafarani, and Huan Liu
The best way to pull your advisor's leg is to let your advisor present the work.

Sarcasm: a nuanced form of language in which the user usually states explicitly the opposite of what she implies.

Why Detect Sarcasm? One Reason Is to Avoid PR Blunders
Most large companies have dedicated social media teams providing real-time assistance to consumers. These teams use social media tools such as Salesforce's Social Hub to manage the high volume and high velocity of tweets.

Related Work (viewing sarcasm from a linguistic perspective)
Authors | Venue | Overview of methodology
Riloff et al. | EMNLP 2013 | Lexicon-based approach contrasting positive sentiment and negative situations
Liebrecht et al. | WASSA (ACL 2013 workshop) | Unigram, bigram, and trigram features used to train a Balanced Winnow classifier
Reyes et al. | DKE 2012 | Ambiguity, emotional cues, etc., used to train decision trees
Gonzalez-Ibanez et al. | ACL 2011 | Lexical and pragmatic features used to train an SMO (sequential minimal optimization) classifier
Davidov et al. / Tsur et al. | CoNLL 2010 / ICWSM 2010 | Pattern- and punctuation-based features used in a weighted k-nearest neighbor classifier

Some Characteristics of Twitter
Fewer word cues (140-character limit)
Evolving slang words, abbreviations, etc.
However, Twitter also provides: past tweets, profile information, and the social graph.

Problem Definition: Given an unlabeled tweet t from user u, along with a set T of u's past tweets, the goal of sarcasm detection is to automatically determine whether t is sarcastic.

SCUBA (Sarcasm Classification Using a Behavioral modeling Approach)
SCUBA draws on behavioral and psychological findings about sarcasm to determine whether a tweet is sarcastic.
SCUBA captures these behavioral patterns in users' past tweets and profiles to complement the (relatively little) information available in a single tweet.
SCUBA constructs computational features from these patterns and uses them to train a supervised model that detects sarcastic tweets.
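To make the overall pipeline concrete, here is a minimal, hypothetical sketch (not the authors' code): behavioral features are extracted from the tweet and the user's history, and an off-the-shelf supervised classifier is trained on labeled examples. The toy feature choices and the scikit-learn classifier are illustrative assumptions only.

```python
# Hypothetical sketch of a SCUBA-style pipeline: tweet + history -> feature
# vector -> supervised classifier. Feature definitions are toy stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression

SWEAR_WORDS = {"damn", "hell"}  # illustrative list only

def extract_features(tweet: str, past_tweets: list[str]) -> np.ndarray:
    """Toy stand-ins for SCUBA's behavioral feature families."""
    words = tweet.lower().split()
    return np.array([
        len(words),                                            # tweet length
        sum(w in SWEAR_WORDS for w in words),                  # frustration cue
        sum(c.isupper() for c in tweet) / max(len(tweet), 1),  # capitalization
        len(past_tweets),                                      # familiarity with Twitter
    ])

# Training on a labeled corpus (1 = sarcastic, 0 = not), e.g.:
# X = np.vstack([extract_features(t, history[u]) for u, t in labeled_tweets])
# clf = LogisticRegression(max_iter=1000).fit(X, y)
# clf.predict([extract_features(new_tweet, history[user])])
```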

1. Sarcasm as a Contrast of Sentiment (Grice, 1975)
a. Contrasting connotations: using words with contrasting connotations within the same tweet. The difference between the maximum positive and maximum negative sentiment/affect scores of words in the tweet is used as a feature.
b. Contrasting the present with the past: the current tweet's sentiment is contrasted with the user's past sentiment.
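As an illustration of the contrasting-connotations feature, a minimal sketch assuming a toy word-level sentiment lexicon (the lexicon, scores, and function name are hypothetical, not the paper's resources):

```python
# Hypothetical sentiment lexicon mapping words to scores in [-5, 5].
SENTIMENT = {"love": 3, "great": 4, "awful": -4, "traffic": -2, "monday": -1}

def contrast_feature(tweet: str) -> int:
    """Difference between the most positive and most negative word scores
    in the tweet; larger values suggest a within-tweet sentiment contrast."""
    scores = [SENTIMENT.get(w, 0) for w in tweet.lower().split()]
    return max(scores, default=0) - min(scores, default=0)

print(contrast_feature("I just love getting stuck in awful Monday traffic"))  # 3 - (-4) = 7
```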

2. Sarcasm as a Complex Form of Expression (Rockwell, 2007)
Readability: features inspired by tests measuring the readability of text.
Measure | Readability test
Number of words and syllables | Flesch-Kincaid Grade Level formula
Number of polysyllables | SMOG test
Average word length | Automated Readability Index
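For example, the Flesch-Kincaid grade level combines words per sentence and syllables per word; a rough sketch using an approximate vowel-group syllable heuristic (an assumption, not the authors' implementation):

```python
import re

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level with a rough vowel-group syllable heuristic.
    Dedicated readability libraries count syllables more carefully."""
    words = text.split()
    if not words:
        return 0.0
    sentences = max(len(re.split(r"[.!?]+", text.strip())) - 1, 1)
    syllables = sum(max(len(re.findall(r"[aeiouy]+", w.lower())), 1) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

print(round(flesch_kincaid_grade("Oh great, another meeting. I simply cannot wait."), 2))
```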

3. Sarcasm as a Means of Conveying Emotion (Basavanna, 2000), (Toplak, 2000), (Ducharme, 1994), (Grice, 1978)
Mood: a user in a foul mood is more likely to use sarcasm.
Emotional expressiveness: how expressive a Twitter user is, based on past sentiment usage.
Frustration: people use sarcasm to vent frustration (Ducharme, 1994); captured by the number of swear words.
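A sketch of how the frustration and expressiveness cues might be computed, with an illustrative swear-word list and sentiment lexicon standing in for the paper's actual resources:

```python
# Hypothetical emotion-related features; word lists are illustrative only.
SWEAR_WORDS = {"damn", "hell", "crap"}
SENTIMENT = {"love": 3, "great": 4, "awful": -4, "hate": -3}

def frustration_feature(tweet: str) -> int:
    """Number of swear words in the tweet (a proxy for venting frustration)."""
    return sum(w in SWEAR_WORDS for w in tweet.lower().split())

def expressiveness_feature(past_tweets: list[str]) -> float:
    """Fraction of past words carrying any sentiment at all,
    a crude proxy for how emotionally expressive the user is."""
    words = [w for t in past_tweets for w in t.lower().split()]
    return sum(w in SENTIMENT for w in words) / max(len(words), 1)
```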

4. Sarcasm as a Function of Familiarity (Cheang, 2011; Rockwell, 2003, 2011)
a. Familiarity with the environment: people express sarcasm better when they are well acquainted with the environment. We model this with features such as the number of tweets posted on Twitter, the number of friends and followers, and the frequency of Twitter usage.
b. Familiarity with the language (Dress, 2008): measured through vocabulary and grammar skills. We measure vocabulary and part-of-speech (POS) usage in tweets.
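A possible encoding of these familiarity cues, assuming a dict-like user profile whose field names follow the classic Twitter user-object convention (treat the exact schema as an assumption):

```python
# Hypothetical familiarity features from profile metadata and tweet history.
def familiarity_features(profile: dict, past_tweets: list[str]) -> list[float]:
    vocab = {w for t in past_tweets for w in t.lower().split()}
    return [
        profile.get("statuses_count", 0),   # familiarity with the environment
        profile.get("friends_count", 0),
        profile.get("followers_count", 0),
        len(vocab),                         # familiarity with language: vocabulary size
    ]
```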

5. Sarcasm as a Form of Written Expression
Sarcasm in speech involves low pitch, high intensity, and a slow tempo (Rockwell, 2000); written sarcasm is devoid of such options.
Prosodic variations: written cues that stand in for prosody.
Structural variations: inadvertent variations in the part-of-speech (POS) composition of tweets used to express sarcasm.
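A sketch of simple written stand-ins for prosody (capitalization, elongated words, repeated punctuation); the specific cues are illustrative assumptions, and POS-based structural features would additionally require a tagger:

```python
import re

def prosodic_features(tweet: str) -> list[float]:
    """Illustrative prosody proxies for written text."""
    n = max(len(tweet), 1)
    return [
        sum(c.isupper() for c in tweet) / n,   # share of capital letters
        len(re.findall(r"(\w)\1{2,}", tweet)), # elongated words like "sooo"
        len(re.findall(r"[!?]{2,}", tweet)),   # repeated ! or ?
    ]

print(prosodic_features("Oh SUUURE, that went GREAT!!!"))
```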

Research Questions
Does our behavioral modeling approach work? How well? (Here, "work" means it performs better than current approaches.)
Does using historical information actually benefit sarcasm detection? If so, how much historical information is required?
Which features from which theories contribute most to sarcasm detection on Twitter?

Dataset
Sarcastic tweets: 9,104 tweets containing #sarcasm and #not
Other tweets: a random sample of 81,936 tweets (after removing tweets containing #sarcasm and #not)
Dataset: http://bit.ly/SarcasmDetectionWSDM2015
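A hypothetical sketch of how the two sets could be separated from a raw tweet sample using the self-declared hashtags (the paper's actual collection and cleaning pipeline may differ):

```python
# Split a raw tweet sample into self-labeled sarcastic tweets and the rest.
def split_by_hashtag(tweets: list[str]):
    sarcasm_tags = {"#sarcasm", "#not"}
    sarcastic, other = [], []
    for t in tweets:
        tokens = set(t.lower().split())
        (sarcastic if tokens & sarcasm_tags else other).append(t)
    return sarcastic, other
```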

Baselines
Contrast Approach: a tweet is sarcastic if it contains a positive verb phrase or positive predicative expression together with a negative situation phrase (Riloff et al., EMNLP 2013)
Hybrid Approach: Contrast Approach + n-gram model (Riloff et al., EMNLP 2013)
We also embedded results from the n-gram model into SCUBA; we call this n-gram-augmented framework SCUBA++.

Baseline Algorithms
SCUBA − {past sarcasm hashtags feature}
Majority classifier
N-gram model (as used in the Hybrid Approach and SCUBA++)
SCUBA++: our n-gram-augmented framework

Performance Comparison (10-fold cross validation; dataset distribution is the sarcastic:non-sarcastic ratio)
Technique | 1:1 Acc | 1:1 AUC | 20:80 Acc | 20:80 AUC | 10:90 Acc | 10:90 AUC
SCUBA++ | 86.08 | 0.86 | 89.81 | 0.80 | 92.94 | 0.70
SCUBA | 83.46 | 0.83 | 88.10 | 0.76 | 92.24 | 0.60
SCUBA − #sarcasm | 83.41 | — | 87.53 | 0.74 | 91.87 | 0.63
Baseline: Contrast Approach | 56.50 | 0.56 | 78.98 | 0.57 | 86.59 | —
Baseline: Hybrid Approach | 77.26 | 0.77 | 78.40 | 0.75 | 83.87 | 0.67
Baseline: N-gram Classifier | 78.56 | 0.78 | 81.63 | — | 87.89 | 0.65
Baseline: Majority Classifier | 50.00 | 0.50 | 80.00 | — | 90.00 | —
(Acc = accuracy in %; AUC = area under the ROC curve; — = not available.)

Can Historical Information Improve Sarcasm Detection?
SCUBA without historical data achieves 79.38% accuracy, which still outperforms all other approaches.
Historical data helps: using it yields a 4.14% increase in performance, and around 30 past tweets seem sufficient.

Which Forms Contribute Most to Sarcasm Detection? ("−" means the feature set is removed)
Feature set | Accuracy
All feature sets | 83.46%
− Contrast-based features | 57.34%
− Complexity-based features | 73.00%
− Emotion expression-based features | 71.52%
− Familiarity-based features | 73.67%
− Written expression-based features | 76.72%

What Features Contribute Most to Sarcasm Detection?
Percentage of emoticons and adjectives in a tweet
Percentage of past words with sentiment scores of 2, 3, or −3
Number of polysyllables per word in a tweet
Lexical density of a tweet
Number of past sarcastic tweets posted
Percentage of positive-to-negative sentiment transitions made by a user
Percentage of capitalized hashtags in a tweet

Summary
A behavioral modeling framework that identifies different forms of online sarcasm:
a contrast of sentiments
a complex form of expression
a means of conveying emotion
a function of familiarity, and
a form of written expression
These forms are modeled on Twitter to build a supervised learning algorithm that detects sarcastic tweets.
Experiments demonstrate that SCUBA is effective in detecting sarcastic tweets.
Even a limited amount of historical data helps in sarcasm detection.

Future Work
How does a user's social network influence her propensity to use sarcasm?
Does the strength of social ties matter in generating sarcasm?
Can SCUBA be extended to other social networking sites?