Presentation on theme: "Identifying Sarcasm in Twitter: A Closer Look"— Presentation transcript:
1 Identifying Sarcasm in Twitter: A Closer Look. Roberto Gonzalez, Smaranda Muresan, Nina Wacholder
2 Aim of the study: To construct a corpus of sarcastic utterances that have been explicitly labeled as such by their authors (using #sarcasm or #sarcastic). To illustrate how difficult it is to distinguish sarcastic sentences from negative/positive sentences.
3 Data: Data for the study is divided into three sets of 900 tweets each: sarcastic, positive and negative. Each data set is culled from Twitter using appropriate hashtags. Sarcasm: #sarcasm, #sarcastic. Positive: #happy, #joy, #lucky. Negative: #sadness, #frustrated, #angry.
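The hashtag-based labeling on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' collection code: the hashtag sets come from the slide, while the function name `label_from_hashtags` and the rule of skipping tweets that match zero or several classes are assumptions.

```python
import re

# Hashtag sets as listed on the slide.
LABEL_HASHTAGS = {
    "sarcastic": {"#sarcasm", "#sarcastic"},
    "positive": {"#happy", "#joy", "#lucky"},
    "negative": {"#sadness", "#frustrated", "#angry"},
}

HASHTAG_RE = re.compile(r"#\w+")


def label_from_hashtags(tweet):
    """Return the one class whose hashtags occur in the tweet, else None."""
    tags = {tag.lower() for tag in HASHTAG_RE.findall(tweet)}
    matches = [label for label, wanted in LABEL_HASHTAGS.items() if tags & wanted]
    # Assumption: tweets matching no class or more than one class are skipped.
    return matches[0] if len(matches) == 1 else None
```

For example, `label_from_hashtags("Great, another Monday #sarcasm")` yields `"sarcastic"`, while a tweet carrying both #happy and #angry is left unlabeled.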
4 Data Preprocessing: Tweets with #sarcasm or #sarcastic in the middle of the tweet were removed. The remaining tweets were manually checked to see whether the tags were part of the content of the tweet, e.g. “I really love #sarcasm”.
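A minimal sketch of the automatic part of this filter follows. It assumes that "in the middle" means anywhere other than the run of hashtags closing the tweet; that reading, and the helper names, are not from the slides. Note that the slide's own example passes this automatic check, which is why a manual pass is still needed.

```python
import re

SARCASM_TAGS = ("#sarcasm", "#sarcastic")
HASHTAG_RE = re.compile(r"#\w+")
# One or more hashtags (with optional whitespace) at the very end of the tweet.
TRAILING_TAGS_RE = re.compile(r"(?:\s*#\w+)+\s*$")


def split_trailing_hashtags(tweet):
    """Split a tweet into (body text, list of lowercase trailing hashtags)."""
    match = TRAILING_TAGS_RE.search(tweet)
    if match is None:
        return tweet.strip(), []
    body = tweet[:match.start()].strip()
    return body, HASHTAG_RE.findall(match.group().lower())


def keep_sarcastic_tweet(tweet):
    """Keep a tweet only if a sarcasm tag closes it and none occurs in the body."""
    body, trailing = split_trailing_hashtags(tweet)
    body_tags = HASHTAG_RE.findall(body.lower())
    if any(tag in SARCASM_TAGS for tag in body_tags):
        return False  # tag sits in the middle of the tweet
    return any(tag in SARCASM_TAGS for tag in trailing)
```

`keep_sarcastic_tweet("I really love #sarcasm")` returns `True` even though the tag is part of the sentence, so such cases can only be caught by the manual check.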
5 Lexical features: Unigrams. Dictionary-based: Pennebaker et al. (LIWC), covering Linguistic Processes (adverbs, pronouns), Psychological Processes (positive, negative emotion), Personal Concerns (work, achievement) and Spoken Categories (assent, non-fluencies); WordNet-Affect; a list of interjections and punctuation.
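The two feature families above, unigrams and dictionary categories, could be extracted roughly as follows. The lexicon here is a hypothetical toy stand-in: LIWC and WordNet-Affect are far larger resources and are not reproduced, and the tokenizer and function names are likewise assumptions.

```python
import re
from collections import Counter

# Hypothetical toy lexicon standing in for LIWC / WordNet-Affect categories.
TOY_LEXICON = {
    "positive_emotion": {"love", "great", "happy"},
    "negative_emotion": {"hate", "awful", "sad"},
    "pronoun": {"i", "you", "we", "it"},
    "interjection": {"wow", "oh", "yay"},
}

# Words (letters and apostrophes) or runs of sentence punctuation.
TOKEN_RE = re.compile(r"[a-z']+|[!?.]+")


def tokenize(tweet):
    """Lowercase the tweet and split it into word and punctuation tokens."""
    return TOKEN_RE.findall(tweet.lower())


def lexical_features(tweet):
    """Return (unigram counts, dictionary-category counts) for one tweet."""
    tokens = tokenize(tweet)
    unigrams = Counter(tokens)
    categories = Counter()
    for token in tokens:
        for category, words in TOY_LEXICON.items():
            if token in words:
                categories[category] += 1
    # Punctuation runs such as "!!!" are counted as one punctuation token each.
    categories["punctuation"] = sum(1 for t in tokens if t[0] in "!?.")
    return unigrams, categories
```

With this toy lexicon, "Oh wow, I love Mondays!!!" produces two interjections, one pronoun, one positive-emotion word and one punctuation token.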
8 Classification: Logistic Regression and Support Vector Machine with SMO (sequential minimal optimization). Features used: unigrams; dictionary features, presence (LIWC+_P); dictionary features, frequency (LIWC+_F).
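To make the presence/frequency distinction concrete, the sketch below encodes dictionary-category counts both ways and fits a small logistic regression by stochastic gradient descent. It is a toy under stated assumptions: the slides do not say which toolkit was used, normalizing frequency by token count is a guess at what "frequency" means, the SVM/SMO classifier is not implemented, and all names and data are illustrative.

```python
import math


def presence_vector(counts, feature_names):
    """Binary encoding: 1.0 if the category occurs at all, else 0.0."""
    return [1.0 if counts.get(name, 0) > 0 else 0.0 for name in feature_names]


def frequency_vector(counts, feature_names, n_tokens):
    """Count of each category divided by tweet length (an assumed normalization)."""
    if n_tokens == 0:
        return [0.0 for _ in feature_names]
    return [counts.get(name, 0) / n_tokens for name in feature_names]


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights and bias with plain stochastic gradient descent on log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi  # gradient of the log loss with respect to z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 (e.g. sarcastic) if the linear score is non-negative, else 0."""
    return 1 if b + sum(wj * xj for wj, xj in zip(w, x)) >= 0 else 0


# Illustrative, made-up category counts: 1 = sarcastic, 0 = not sarcastic.
FEATURES = ["interjection", "positive_emotion", "negative_emotion"]
toy_counts = [
    {"interjection": 2, "positive_emotion": 1},
    {"interjection": 1, "positive_emotion": 2},
    {"positive_emotion": 1},
    {"negative_emotion": 2},
]
toy_labels = [1, 1, 0, 0]
X_presence = [presence_vector(c, FEATURES) for c in toy_counts]
weights, bias = train_logistic_regression(X_presence, toy_labels)
```

Swapping `presence_vector` for `frequency_vector` changes only the feature encoding, which corresponds to the LIWC+_P versus LIWC+_F comparison named on the slide.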