A cognitive study of subjectivity extraction in sentiment annotation. Abhijit Mishra (1), Aditya Joshi (1,2,3), Pushpak Bhattacharyya (1). (1) IIT Bombay, India; (2) Monash University, Australia; (3) IITB-Monash Research Academy. Presented at the 5th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, ACL 2014, Baltimore
Subjectivity Extraction Goal: To identify subjective portions of text
Motivation Strong AI suggests that a machine must be able to perform sentiment analysis in a manner and with an accuracy similar to those of human beings. Do humans perform subjectivity extraction as well? A “cognitive study” of subjectivity extraction in sentiment annotation
Sentiment Oscillations & subjectivity extraction Subjective documents may be linear or oscillating. Linear: The story was captivating. The actors did a great job. I absolutely loved the movie! Oscillating: The story was captivating. If only they had better actors. But then I enjoyed the movie, on the whole. Humans perform subjectivity extraction either as a result of “anticipation” or as “homing”. Which of the two is adopted depends on whether the subjective document is linear or oscillating.
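The linear/oscillating distinction can be illustrated with a minimal sketch. It assumes each sentence already carries a hand-assigned polarity label in {-1, 0, +1}; the function names, the labels given to the two example reviews, and the `min_flips` threshold are hypothetical and not taken from the presentation.

```python
def count_polarity_flips(polarities):
    """Count sign changes between consecutive non-neutral sentence labels.

    Assumes labels are -1 (negative), 0 (neutral) or +1 (positive).
    """
    signs = [p for p in polarities if p != 0]  # ignore neutral sentences
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def is_oscillating(polarities, min_flips=1):
    """Treat a document as oscillating if its polarity flips at least min_flips times."""
    return count_polarity_flips(polarities) >= min_flips


# Hand-assigned (assumed) labels for the two example reviews on this slide.
linear_review = [+1, +1, +1]        # captivating / great job / loved the movie
oscillating_review = [+1, -1, +1]   # captivating / better actors needed / enjoyed it

print(is_oscillating(linear_review))       # False
print(is_oscillating(oscillating_review))  # True
```

A document with no flips corresponds to the linear case, where anticipation is expected; one or more flips corresponds to the oscillating case, where homing is expected.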
Experiment Setup (1/2) A human annotator reads a document and predicts its sentiment. A Tobii T120 eye-tracker records eye movements while he/she reads the document. * No time restriction, no user input required: to minimize errors.
Experiment Setup (2/2) Dataset – 3 movie reviews in English from IMDb – One linear, one oscillating, one between the two extremes (D0, D1, D2 respectively) Three documents? Really?! – To eliminate predictability – To reduce errors due to fatigue 12 human annotators (P0, ..., P11)
Observations: Anticipation (1/2) In case of linear subjective documents, an annotator reads some sentences and begins to skip sentences.
Observations: Anticipation (2/2)
Document | Length | Average number of non-unique sentences read by participants
D0 | 10 | 21
D1 | 9 | 33.83
D2 | 13 | 50.42
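One way the "average number of non-unique sentences read" could be computed from gaze data is sketched below. The counting rule is an assumption, since the slides do not define it: consecutive fixations on the same sentence are collapsed into one read, and a later revisit counts as a new read. The trace values and participant keys are made up for illustration and are not the study's data.

```python
from itertools import groupby


def sentences_read(fixated_sentence_ids):
    """Collapse consecutive fixations on one sentence into a single read.

    Revisiting a sentence later counts as another read (hence "non-unique").
    """
    return [sid for sid, _ in groupby(fixated_sentence_ids)]


def mean_non_unique_reads(traces):
    """Average the number of (non-unique) sentence reads over participants."""
    counts = [len(sentences_read(trace)) for trace in traces.values()]
    return sum(counts) / len(counts)


# Hypothetical traces: the sentence index under each successive fixation.
traces = {
    "A": [0, 0, 1, 1, 2, 1, 1, 3],  # goes back to sentence 1 once
    "B": [0, 1, 1, 2],              # reads straight through, then stops
}

print(sentences_read(traces["A"]))    # [0, 1, 2, 1, 3]
print(mean_non_unique_reads(traces))  # 4.0
```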
Observations: Homing (1/3) In case of oscillating subjective documents, an annotator (a) first reads all sentences, and then (b) revisits some of them
Observations: Homing (2/3) There is considerable overlap between the sentences that are read in the second pass. All of them are subjective.
Reading statistics for D1:
Participant | TFD-SE | PTFD | TFC-SE
P5 | 7.3 | 8 | 21
P7 | 3.1 | 5 | 11
P9 | 51.94 | 10 | 26
P11 | 116.6 | 16 | 56
TFD-SE: Total fixation duration for subjective extract; PTFD: Proportion of total fixation duration = (TFD-SE)/(Total duration); TFC-SE: Total fixation count for subjective extract
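The three reading statistics defined on this slide can be sketched as follows. The `Fixation` record, the function name, and the sample values are hypothetical. "Total duration" is assumed here to be the sum of all fixation durations in the document, which the slide does not state, and PTFD is returned as a fraction rather than a percentage.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    sentence_id: int  # sentence the gaze landed on
    duration: float   # fixation duration in seconds (assumed unit)


def reading_stats(fixations, subjective_ids):
    """Compute TFD-SE, PTFD and TFC-SE for a set of subjective sentences."""
    in_extract = [f for f in fixations if f.sentence_id in subjective_ids]
    tfd_se = sum(f.duration for f in in_extract)
    total = sum(f.duration for f in fixations)  # assumed meaning of "Total duration"
    ptfd = tfd_se / total if total else 0.0
    return {"TFD-SE": tfd_se, "PTFD": ptfd, "TFC-SE": len(in_extract)}


# Made-up fixations; sentences 1 and 3 are taken to form the subjective extract.
fixations = [
    Fixation(0, 0.5),
    Fixation(1, 0.25),
    Fixation(3, 0.25),
    Fixation(2, 1.0),
]

print(reading_stats(fixations, subjective_ids={1, 3}))
# {'TFD-SE': 0.5, 'PTFD': 0.25, 'TFC-SE': 2}
```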
Observations: Homing (3/3) Homing at a sub-sentence level – Sarcasm. Multiple regressions around the sarcasm portion for participant P1, document D1. Participant P1 does not correctly detect the sentiment of the document. – Thwarting
Conclusion & Future Work Based on how sentiment changes through a document, humans may perform subjectivity extraction as a result of anticipation or homing. Applications: – Pricing models for crowd-sourced annotation – Sentiment classifiers that incorporate “sentiment runlengths”
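The slides do not define “sentiment runlengths”. One plausible reading, sketched here as an assumption, is the length of each maximal run of consecutive sentences sharing the same polarity label. The function names and the derived feature are hypothetical.

```python
from itertools import groupby


def sentiment_run_lengths(polarities):
    """Run-length encode a sequence of sentence polarity labels."""
    return [(label, sum(1 for _ in run)) for label, run in groupby(polarities)]


def longest_run(polarities):
    """A possible classifier feature: length of the longest same-polarity run."""
    runs = sentiment_run_lengths(polarities)
    return max((length for _, length in runs), default=0)


print(sentiment_run_lengths([+1, +1, +1]))  # [(1, 3)]
print(sentiment_run_lengths([+1, -1, +1]))  # [(1, 1), (-1, 1), (1, 1)]
print(longest_run([+1, +1, -1, -1, -1]))    # 3
```

Under this reading, a linear document yields a single long run and an oscillating document yields several short ones.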