

Recognizing Stances in Online Debates
– An unsupervised opinion-analysis method for debate-side classification.
– Mine the web to learn associations that are indicative of opinion stances.
– Formulate debate-side classification as an Integer Linear Programming problem.

Recognizing Stances in Online Debates
Example debate: iPhone vs. Blackberry.
– topic1 = iPhone, topic2 = Blackberry
– side1 = pro-iPhone, side2 = pro-Blackberry
Given a post, determine which side it takes.
– A post may express opinions about particular aspects of a topic. Some aspects are specific to one topic, while others are shared by both.
– A post may contain concessions.

Recognizing Stances in Online Debates
Method: an unsupervised approach.
– Find opinions and pair them with their targets to obtain opinion-target pairs.
Finding opinions:
– Look up words in a subjectivity lexicon. All instances of those words are treated as opinions.
– Each opinion is assigned its prior polarity: positive (+), negative (-), or neutral (*).
Pairing with targets:
– A rule-based system uses syntactic analysis to pair each opinion with its target.
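The lexicon lookup step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the lexicon `TOY_LEXICON` and the function `find_opinions` are hypothetical, and whitespace tokenization is an assumption.

```python
# Hypothetical toy lexicon: word -> prior polarity ("+", "-", or "*" for neutral).
# A real subjectivity lexicon would be far larger.
TOY_LEXICON = {"pleasing": "+", "great": "+", "clunky": "-", "think": "*"}


def find_opinions(tokens, lexicon=TOY_LEXICON):
    """Return (index, word, polarity) for every token found in the lexicon."""
    return [
        (i, tok, lexicon[tok.lower()])
        for i, tok in enumerate(tokens)
        if tok.lower() in lexicon
    ]


sentence = "The iPhone has a pleasing interface but a clunky keyboard"
opinions = find_opinions(sentence.split())
```

Every lexicon hit is treated as an opinion, regardless of context, which matches the "all instances" wording on the slide.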

Recognizing Stances in Online Debates Syntactic rules:

Recognizing Stances in Online Debates
Once the opinion-target pairs are created, the identity of the opinion word is masked by replacing the word with its polarity.
– For instance, "pleasing interface" is converted to "interface+" (a polarity-target pair).
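A short sketch of the masking step, assuming the opinion-target pairs are already available as `(opinion_word, target)` tuples. The function name and the lexicon contents are hypothetical.

```python
def to_polarity_target_pairs(opinion_target_pairs, lexicon):
    """Replace each opinion word by its prior polarity, e.g. ("pleasing", "interface") -> "interface+"."""
    pairs = []
    for opinion, target in opinion_target_pairs:
        polarity = lexicon.get(opinion.lower())
        if polarity is not None:  # skip words missing from the lexicon
            pairs.append(target.lower() + polarity)
    return pairs


# Hypothetical lexicon entries, for illustration only.
lexicon = {"pleasing": "+", "clunky": "-"}
masked = to_polarity_target_pairs(
    [("pleasing", "interface"), ("clunky", "keyboard")], lexicon
)
```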

Recognizing Stances in Online Debates
Learning aspects and preferences from the web
– Search the web for pages that contain "iPhone" and "Blackberry".
– Extract polarity-target pairs from those pages.
– If the target of a polarity-target pair is one of the topics, select the polarity-target pairs in its vicinity for further processing.
– The vicinity is defined as the same sentence plus the following 5 sentences. The rationale: when someone expresses an opinion about a topic, he or she is likely to follow it up with reasons for that opinion.
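The vicinity rule can be sketched like this. The data layout (one list of `(target, polarity)` tuples per sentence) and the choice to skip topic pairs when collecting neighbours are assumptions made for illustration.

```python
VICINITY = 5  # number of following sentences included, as stated on the slide


def vicinity_pairs(sentences, topics, window=VICINITY):
    """Pair each topic-directed opinion with the other pairs in its vicinity.

    sentences: list of sentences, each a list of (target, polarity) tuples.
    topics: set of topic names.
    Returns a list of (topic_pair, nearby_pair) co-occurrences.
    """
    cooccurrences = []
    for s, pairs in enumerate(sentences):
        for topic_pair in pairs:
            if topic_pair[0] not in topics:
                continue
            # Same sentence plus the following `window` sentences.
            for nearby in sentences[s : s + window + 1]:
                for other in nearby:
                    if other[0] not in topics:
                        cooccurrences.append((topic_pair, other))
    return cooccurrences


doc = [[("iphone", "+")], [("interface", "+")], [], [], [], [], [("keyboard", "-")]]
found = vicinity_pairs(doc, {"iphone", "blackberry"})
```

In this toy document, "keyboard-" sits six sentences after the topic opinion, so it falls outside the vicinity.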

Recognizing Stances in Online Debates
Learning aspects and preferences from the web
– Conditional probabilities of the form P(topic_j^q | target_i^p) are learned from the web corpus, where p ∈ {+, −, *} and q ∈ {+, −, *} denote the polarities of the target and the topic, respectively; j ∈ {1, 2}; and i ∈ {1...M}, where M is the number of unique targets in the corpus.
– For example, P(Mac+ | interface+) is the probability that "interface" is the target of a positive opinion that is in the vicinity of a positive opinion toward "Mac".
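The slide's formula did not survive in this transcript. A relative-frequency estimate that is consistent with the definitions and the example above would be the following; it is an assumption, not a reproduction of the slide. Here \#(\cdot) denotes a count over the web corpus.

```latex
P\left(\mathrm{topic}_j^{\,q} \mid \mathrm{target}_i^{\,p}\right)
  = \frac{\#\left(\mathrm{topic}_j^{\,q},\ \mathrm{target}_i^{\,p}\right)}
         {\#\,\mathrm{target}_i^{\,p}}
```

The numerator counts how often target i with polarity p occurs in the vicinity of topic j with polarity q, and the denominator counts all occurrences of target i with polarity p.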

Recognizing Stances in Online Debates Learning aspects and preferences from web

Recognizing Stances in Online Debates
Debate-side classification
– First, extract all polarity-target pairs in the post.
– Second, let N be the number of instances of polarity-target pairs in the post. For each instance Ij (j = 1...N), look up the learned probabilities to create two scores, wj and uj, one for each side.
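The equations for wj and uj are not preserved in this transcript. One plausible reading, assumed here, is that wj sums the learned probabilities that support side1 (positive toward topic1 or negative toward topic2) and uj sums those that support side2. The function name and the layout of `probs` are hypothetical.

```python
def side_scores(pair, probs):
    """Return (w, u) for one polarity-target pair such as "interface+".

    probs maps (topic_with_polarity, pair) -> learned probability,
    e.g. ("topic1+", "interface+") -> 0.5. Missing entries count as 0.
    """
    # Assumed: evidence for side1 is pro-topic1 or anti-topic2.
    w = probs.get(("topic1+", pair), 0.0) + probs.get(("topic2-", pair), 0.0)
    # Assumed: evidence for side2 is anti-topic1 or pro-topic2.
    u = probs.get(("topic1-", pair), 0.0) + probs.get(("topic2+", pair), 0.0)
    return w, u


# Made-up probabilities, for illustration only.
probs = {
    ("topic1+", "interface+"): 0.5,
    ("topic2-", "interface+"): 0.25,
    ("topic2+", "interface+"): 0.125,
}
w, u = side_scores("interface+", probs)
```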

Recognizing Stances in Online Debates
Debate-side classification
– Third, formulate the problem as an Integer Linear Programming problem and maximize the objective function.
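The objective function itself is missing from the transcript. The sketch below assumes one common form: maximize the sum over j of (wj·xj + uj·yj) with binary xj and yj, xj + yj = 1, and every instance in a post forced onto the same side. Under those assumed constraints only two assignments are feasible, so they can be compared directly without an ILP solver. `classify_post` is a hypothetical name.

```python
def classify_post(scores):
    """Pick the side that maximizes the assumed objective.

    scores: list of (w_j, u_j) tuples, one per polarity-target instance.
    Returns "side1", "side2", or None when the post gives no basis for a decision.
    """
    if not scores:
        return None  # no polarity-target pairs found in the post
    total_w = sum(w for w, _ in scores)  # objective value if all x_j = 1
    total_u = sum(u for _, u in scores)  # objective value if all y_j = 1
    if total_w == total_u:
        return None  # tie: leave the post unclassified
    return "side1" if total_w > total_u else "side2"


label = classify_post([(0.75, 0.125), (0.25, 0.5)])
```

Returning `None` for empty or tied posts is a design choice made here so that unclassifiable posts stay unclassified.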

Recognizing Stances in Online Debates
Accounting for concession
– Use the list of Concession and Contra-expectation connectives, such as "while", "nonetheless", and "however".
– If the connective is mid-sentence, the part of the sentence before the connective is considered conceded.
– If the connective is sentence-initial, the sentence is split at the first comma, and the first part is considered conceded.
– Opinions in the conceded part are interpreted in reverse: the weights corresponding to the two sides, wj and uj, are interchanged.
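The concession rules above can be sketched as follows. The connective list contains only the three examples named on the slide, and the function names and punctuation handling are assumptions.

```python
import re

# Only the three examples from the slide; the full connective list is not given.
CONNECTIVES = ("while", "nonetheless", "however")
_CONNECTIVE_RE = re.compile(r"\b(" + "|".join(CONNECTIVES) + r")\b", re.IGNORECASE)


def split_conceded(sentence):
    """Return (conceded_part, remaining_part); conceded_part is "" if nothing is conceded."""
    words = sentence.split()
    if not words:
        return "", sentence
    if words[0].strip(",").lower() in CONNECTIVES:
        # Sentence-initial connective: split at the first comma.
        head, sep, tail = sentence.partition(",")
        return (head.strip(), tail.strip()) if sep else ("", sentence)
    match = _CONNECTIVE_RE.search(sentence)
    if match:
        # Mid-sentence connective: everything before it is conceded.
        return sentence[: match.start()].strip(" ,;"), sentence[match.end() :].strip(" ,;")
    return "", sentence


def apply_concession(w, u, conceded):
    """Swap the two side scores for opinions found in a conceded part."""
    return (u, w) if conceded else (w, u)


initial = split_conceded("While the iPhone looks nice, the Blackberry is more practical.")
middle = split_conceded("The iPhone looks nice, however the Blackberry is more practical.")
```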

Recognizing Stances in Online Debates
Experiments: baseline systems
OpTopic
– Finds only the topic1+, topic1-, topic2+ and topic2- instances in the post.
OpPMI
– Finds opinion-target pairs not only for the topics, but also for words in the debate that are significantly related to either topic.
– Calculates the PMI between each noun in the post and each topic over the whole web corpus, and assigns each noun to the topic with the higher PMI.
– Next, the polarity-target pairs are found for the post, as before, and Equations 10 and 11 are used to assign a side to the post as in the OpTopic system, except that here related nouns are also counted as instances of their associated topics.
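The PMI-based assignment in OpPMI can be illustrated with the standard pointwise mutual information formula, log2(P(x, y) / (P(x)·P(y))), estimated from counts. The function names, the count-based estimate, and the numbers below are assumptions for illustration.

```python
import math


def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information from raw counts; -inf if x and y never co-occur."""
    if count_xy == 0:
        return float("-inf")
    return math.log2((count_xy * total) / (count_x * count_y))


def assign_topic(noun_count, topic_counts, cooccur_counts, total):
    """Assign a noun to the topic with the higher PMI.

    topic_counts: topic -> corpus count of the topic.
    cooccur_counts: topic -> count of the noun co-occurring with that topic.
    """
    return max(
        topic_counts,
        key=lambda t: pmi(cooccur_counts.get(t, 0), noun_count, topic_counts[t], total),
    )


# Made-up counts: the noun co-occurs far more often with "iphone".
topic = assign_topic(
    noun_count=40,
    topic_counts={"iphone": 50, "blackberry": 50},
    cooccur_counts={"iphone": 20, "blackberry": 5},
    total=400,
)
```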

Recognizing Stances in Online Debates
Results
– In this task, it is desirable to make a prediction for every post; hence #relevant = #total posts, which makes Recall and Accuracy the same.
– However, none of the systems classifies a post that lacks the information the system needs. Thus #guessed < #total posts, and Precision is not the same as Accuracy.
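A small sketch of how the three measures relate when some posts are left unclassified. Representing an unclassified post as `None` is an assumption of this example.

```python
def evaluate(predictions, gold):
    """Return (precision, recall, accuracy).

    predictions: list of predicted labels, with None for posts left unclassified.
    gold: list of true labels, one per post.
    """
    total = len(gold)
    guessed = sum(p is not None for p in predictions)
    correct = sum(p is not None and p == g for p, g in zip(predictions, gold))
    precision = correct / guessed if guessed else 0.0
    recall = correct / total if total else 0.0  # #relevant = #total posts
    accuracy = correct / total if total else 0.0
    return precision, recall, accuracy


precision, recall, accuracy = evaluate(
    ["side1", None, "side2", "side1"],
    ["side1", "side2", "side2", "side2"],
)
```

With three of four posts guessed and two correct, precision is 2/3 while recall and accuracy are both 0.5.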

Recognizing Stances in Online Debates Results

Recognizing Stances in Online Debates
Open questions:
– Should the opinion word directed at the topic be regarded as a cause?
– Should the polarity-target pairs in the vicinity be regarded as causes?

Classifying Latent User Attributes in Twitter
– Classify Twitter users' latent attributes: gender, age, regional origin and political orientation.
– Supervised classification.
– The same two kinds of features are used for all four attributes:
– lexical features;
– sociolinguistics-inspired features.

Classifying Latent User Attributes in Twitter
Attributes:
– Gender: male or female
– Age: below 30 or above 30
– Regional origin: south India or north India
– Political orientation: Republican/Conservative vs. Liberal/Left/Democratic
The training data is manually annotated.

Classifying Latent User Attributes in Twitter
Network structure (an initial investigation of potential features for classification)
– Follower-following ratio: the ratio of the number of a user's followers to the number of users he or she is following.
– Follower frequency: the number of followers.
– Following frequency: the number of followees.
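The three network-structure features reduce to simple arithmetic on two counts. The function name, the dictionary keys, and the handling of a user who follows nobody are assumptions of this sketch.

```python
def network_features(num_followers, num_following):
    """Compute the three network-structure features for one user."""
    # Assumed convention: a user who follows nobody gets an infinite ratio.
    ratio = num_followers / num_following if num_following else float("inf")
    return {
        "follower_following_ratio": ratio,
        "follower_frequency": num_followers,
        "following_frequency": num_following,
    }


features = network_features(num_followers=200, num_following=50)
```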

Classifying Latent User Attributes in Twitter Network structure and communication behavior. (An initial investigation of potential features of classification)

Classifying Latent User Attributes in Twitter Network structure and communication behavior. (An initial investigation of potential features of classification)

Classifying Latent User Attributes in Twitter
Communication behavior (an initial investigation of potential features for classification)
– Response frequency: the percentage of the user's tweets that are replies.
– Retweet frequency: the percentage of the user's tweets that are retweets.
– Tweet frequency: the percentage of the user's tweets that the user posts uninitiated, i.e., neither replies nor retweets.
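A minimal sketch of the three communication-behavior features, assuming each tweet has already been labelled as `"reply"`, `"retweet"`, or `"original"`. That labelling scheme and the function name are hypothetical.

```python
def communication_features(tweet_kinds):
    """Return the three percentages for one user's tweets.

    tweet_kinds: list of strings, each "reply", "retweet", or "original".
    """
    n = len(tweet_kinds)
    if n == 0:
        return {"response_frequency": 0.0, "retweet_frequency": 0.0, "tweet_frequency": 0.0}
    return {
        "response_frequency": 100.0 * tweet_kinds.count("reply") / n,
        "retweet_frequency": 100.0 * tweet_kinds.count("retweet") / n,
        "tweet_frequency": 100.0 * tweet_kinds.count("original") / n,
    }


behavior = communication_features(["reply", "retweet", "original", "original"])
```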

Classifying Latent User Attributes in Twitter Communication behavior. (An initial investigation of potential features of classification)

Classifying Latent User Attributes in Twitter
Classification models
– Sociolinguistic-feature model. Certain utterances such as "umm" and "uh-huh" are more prevalent among female speakers than among their male counterparts.

Classifying Latent User Attributes in Twitter
Classification models
– Sociolinguistic-feature model. The templates from Table 2 resulted in 3774 unique instantiated feature types. The extracted features were used to learn an SVM-based binary classifier; this model is called socling.

Classifying Latent User Attributes in Twitter
Classification models
– N-gram feature model. Derive the unigrams and bigrams of the tweet text and build another SVM-based classification model on them.
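Deriving unigram and bigram features can be sketched as below. Lowercasing and whitespace tokenization are assumptions, and the SVM training step is left out.

```python
from collections import Counter


def ngram_features(text, max_n=2):
    """Count all n-grams of length 1..max_n in a whitespace-tokenized text."""
    tokens = text.lower().split()
    features = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            features[" ".join(tokens[i : i + n])] += 1
    return features


counts = ngram_features("so so happy")
```

The resulting counts can serve as a sparse feature vector for any linear classifier.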

Classifying Latent User Attributes in Twitter
Classification models
– Stacked model. Finally, a stacked model performs simple classifier stacking. It is another SVM, but its features are the predictions of the N-gram feature model and the sociolinguistic model, along with their prediction weights.

Classifying Latent User Attributes in Twitter Evaluation – Gender

Classifying Latent User Attributes in Twitter Evaluation – Age

Classifying Latent User Attributes in Twitter Evaluation – Regional Origin

Classifying Latent User Attributes in Twitter Evaluation – Political orientation

Classifying Latent User Attributes in Twitter
Open question: can user attributes be combined with sentiment or cause analysis?
– Different kinds of people may have different opinions about the same event.

References
– Recognizing Stances in Online Debates
– Classifying Latent User Attributes in Twitter