Presentation transcript:

Slide 1. Sentiment Analysis in the News
7th International Conference on Language Resources and Evaluation (LREC 2010), Valletta, Malta, 19-21 May 2010
Alexandra Balahur, Ralf Steinberger, Mijail Kabadjov, Vanni Zavarella, Erik van der Goot, Matina Halkia, Bruno Pouliquen, Jenya Belyaeva

Slide 2. Agenda
- Introduction: motivation; use in the multilingual Europe Media Monitor (EMM) family of applications
- Defining sentiment analysis for the news domain
- Data used: gold-standard collection of quotations (reported speech); sentiment dictionaries
- Experiments: method, results, error analysis
- Conclusions and future work

Slide 3. Background: multilingual news analysis in EMM
Current news analysis in the Europe Media Monitor:
- 100,000 articles per day in 50 languages
- Clustering and classification (subject domain classes)
- Topic detection and tracking
- Collecting multilingual information about entities
- Cross-lingual linking and aggregation, …
Publicly accessible online.

Slide 4. Objective: add opinions to news content analysis
E.g. detect opinions on:
- the European Constitution
- EU press releases
- entities (persons, organisations, EU programmes and initiatives); use for social network analysis
Detect and display opinion differences across sources and across countries; follow trends over time.
Highly multilingual (20+ languages), so use simple means (no syntactic analysis, no POS taggers, no large-scale dictionaries): count sentiment words in word windows.
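To make the "simple means" concrete, here is a minimal, illustrative Python sketch of counting sentiment words in a fixed word window around an entity mention, with no syntactic analysis or POS tagging, as described above. The tokeniser, the window size of six tokens and the tiny example word lists are assumptions for illustration only, not the actual EMM resources.

```python
import re

# Illustrative, hand-picked lexicon entries; the real system uses much larger
# sentiment dictionaries (see the "Sentiment dictionaries" slide).
POSITIVE = {"support", "praise", "success"}
NEGATIVE = {"war", "attack", "failure"}

def tokenize(text):
    """Very simple tokenisation: lowercased word characters only."""
    return re.findall(r"[a-z']+", text.lower())

def window_counts(text, entity_surface_forms, window=6):
    """Count positive/negative lexicon words within +/- `window` tokens
    of every mention of the entity (any of its single-token surface forms;
    multi-word names would need phrase matching)."""
    tokens = tokenize(text)
    forms = {f.lower() for f in entity_surface_forms}
    pos = neg = 0
    for i, tok in enumerate(tokens):
        if tok in forms:
            context = tokens[max(0, i - window): i] + tokens[i + 1: i + 1 + window]
            pos += sum(1 for w in context if w in POSITIVE)
            neg += sum(1 for w in context if w in NEGATIVE)
    return pos, neg

# Example with a hypothetical sentence and entity mention:
print(window_counts("Critics praise Brown, but the war overshadows his success.",
                    {"brown"}))   # (2, 1)
```

In the actual system, the entity mentions and their co-reference expressions come from EMM's entity recognition, and the word lists are the dictionaries described later in the presentation.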

Slide 5. Sentiment analysis: definitions
- Many definitions exist, e.g. Wiebe (1994), Esuli & Sebastiani (2006), Dave et al. (2003), Kim & Hovy (2005).
- In general: the sentiment/opinion of a source/opinion holder on a target (e.g. a blogger's or reviewer's opinion on a movie or product and its features).
- But what does negative sentiment in news on a natural disaster or a bombing mean?

Slide 6. Complexity of sentiment in news analysis
Example text, annotated on the slide with subjectivity labels (SUBJ/OBJ) and source labels (Author, Reader, Pol. A, Pol. B):
- 1 million people die every year because of drug consumption.
- Politician A said: “We have declared a war on drugs”.
- Politician B said: “We support politician A’s reform.”
- Politician A’s son was caught selling drugs.
- It is incredible how something like this can happen!
Sentiment? Source? Target? Inter-annotator agreement: ~50%.

Slide 7. Helpful model: distinguish three perspectives
- Author: may convey opinion by stressing some facts and omitting other aspects, by word choice, by story framing, …
- Reader: interprets the text differently depending on background and opinions.
- Text: some opinions are stated explicitly in the text (even if metaphorically); contains (positive or negative) news content and (positive or negative) sentiment values.

Slide 8. News sentiment analysis: what are we looking for?
Before annotating, we need to specify what we want to annotate:
- Sentiment or not? No.
- Do we want to distinguish positive and negative sentiment from good and bad news? Yes; inter-annotator agreement rose from ~50% to ~60%.
- What is the target of the sentiment expression? Entities.

Slide 9. News sentiment analysis: annotation guidelines used
The sentiment annotation guidelines, used to annotate 1592 quotes, included:
- Only annotate the selected entity as a target;
- Distinguish news content from sentiment value: annotate attitude, not news content;
- Ask: if you were that entity, would you like or dislike the statement?
- Try not to use your world knowledge (political affiliations, etc.); focus on explicit sentiment;
- In case of doubt, leave un-annotated (neutral).
As a result, inter-annotator agreement reached 81%.

Slide 10. Quotation test set / inter-annotator agreement
- Test set of 1592 quotes (reported speech) whose source and target are known.
- Test set of 1114 usable quotes agreed upon by 2 annotators.
- Baseline: percentage of quotes in the largest class (objective) = 61%.
[Figure: histogram of quote length in characters]
[Table: number of quotes, agreed quotes, and agreed negative, positive and objective quotes, with agreement values of 81%, 78% and 83%]

Slide 11. Sentiment dictionaries
- Distinguishing four sentiment categories (HP, HN, P, N) and summing their respective intuitive values (weights) of ±4 and ±1; this performed better than binary categories (Pos/Neg).
- Mapping various English-language resources to these four categories:
  - JRC lists
  - MicroWN-Op ([-1 … 1]; cut-off point ±0.5)
  - WNAffect (HN: anger, disgust; N: fear, sadness; P: joy; HP: surprise)
  - SentiWN ([-1 … 1]; cut-off point ±0.5)
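As a rough illustration of how a graded resource such as MicroWN-Op or SentiWN can be mapped onto these four categories, the sketch below applies the ±0.5 cut-off and assigns the ±4 and ±1 weights. The exact pairing of weights with categories (HP = +4, P = +1, N = -1, HN = -4) and the treatment of scores exactly at the cut-off are assumptions, not spelled out on the slide.

```python
def score_to_category(score, cutoff=0.5):
    """Map a lexicon score in [-1, 1] (e.g. from MicroWN-Op or SentiWN)
    to one of the four categories and its weight. The weight assignment
    (HP=+4, P=+1, N=-1, HN=-4) is an assumed reading of the slide."""
    if score >= cutoff:
        return "HP", 4        # highly positive
    if score > 0:
        return "P", 1         # positive
    if score <= -cutoff:
        return "HN", -4       # highly negative
    if score < 0:
        return "N", -1        # negative
    return None, 0            # neutral / not sentiment-bearing

# WNAffect labels are categorical, so they map directly,
# following the grouping given on the slide:
WNAFFECT_MAP = {"anger": "HN", "disgust": "HN",
                "fear": "N", "sadness": "N",
                "joy": "P", "surprise": "HP"}
```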

Slide 12. Experiments, focusing on entities
1. Count sentiment word scores in windows of different sizes around the entity or its co-reference expressions (e.g. Gordon Brown = UK Prime Minister, Minister Brown, etc.);
2. Use different dictionaries and combinations of dictionaries;
3. Subtract the sentiment value of words that belong to EMM category definitions, to reduce the impact of news content; a simplistic and quick approximation.
E.g. the category definition for the EMM category CONFLICT includes: car bomb, military clash, air raid, armed conflict, civil unrest, genocide, war, insurrection, massacre, rebellion, …
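A hedged Python sketch of steps 1 and 3, under the same simplifying assumptions as the earlier window example: sentiment words near an entity mention contribute their category weight, and any word that also appears in the relevant EMM category definition is skipped, which is equivalent to subtracting its contribution, so that "bad news" vocabulary is not counted as sentiment towards the entity. The weighted dictionary, the abbreviated CONFLICT list and the skip-rather-than-subtract formulation are illustrative assumptions.

```python
# Illustrative weighted dictionary (word -> weight), e.g. obtained by mapping
# resources to the HP/P/N/HN weights of the previous slide.
SENTIMENT_WEIGHTS = {"praise": 1, "support": 1, "excellent": 4,
                     "war": -1, "attack": -1, "atrocity": -4}

# A few words from a hypothetical, abbreviated EMM CONFLICT category definition.
EMM_CONFLICT = {"war", "bomb", "clash", "raid", "genocide", "massacre"}

def entity_sentiment(tokens, entity_positions, window=6, category_words=EMM_CONFLICT):
    """Sum sentiment weights in windows around the entity mentions, ignoring
    words that belong to the news-category definition (step 3 above)."""
    total = 0
    for i in entity_positions:
        context = tokens[max(0, i - window): i] + tokens[i + 1: i + 1 + window]
        for w in context:
            if w in category_words:
                continue          # news content, not sentiment towards the entity
            total += SENTIMENT_WEIGHTS.get(w, 0)
    return total                  # > 0 positive, < 0 negative, 0 neutral

tokens = "critics praise brown but the war overshadows his support".split()
print(entity_sentiment(tokens, entity_positions=[2]))   # 2 (praise + support; 'war' is skipped)
```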

Slide 13. Evaluation results
Results are reported in terms of accuracy (number of quotes correctly classified as positive, negative or neutral).
[Table: accuracy per word window size (including the whole text), with and without using category definitions, for the JRC dictionaries, MicroWN-Op, WNAffect and SentiWN]
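For clarity, a small sketch of how such an accuracy figure can be computed over the annotated quotes, assuming the entity window score is turned into a three-way label by its sign; this is an assumed reading of the evaluation setup, not the authors' code, and the example labels are invented toy data.

```python
def label_from_score(score):
    """Turn the entity window score into a three-way label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def accuracy(gold_labels, predicted_labels):
    """Fraction of quotes whose predicted label matches the gold annotation."""
    correct = sum(g == p for g, p in zip(gold_labels, predicted_labels))
    return correct / len(gold_labels)

# Hypothetical toy example (not the paper's results):
print(accuracy(["neutral", "negative", "positive", "neutral"],
               ["neutral", "neutral", "positive", "neutral"]))   # 0.75
```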

Slide 14. Error analysis
The largest portion of failures: quotes erroneously classified as neutral.
- No sentiment words are present, but clear sentiment is expressed:
  “We have video evidence that the activists of X are giving out food products to voters”
  “He was the one behind all these atomic policies”
  “X has been doing favours to friends”
- Idiomatic expressions are used to express sentiment:
  “They’ve stirred the hornet’s nest”
Misclassification of quotes as positive or negative because of the presence of another target:
  “Anyone who wants X to fail is an idiot, because it means we’re all in trouble”

Slide 15. Conclusions
- News sentiment analysis (SA) differs from the ‘classic’ SA text types: it is less clear what the source and target are, and they can change within the text, as shown by the low inter-annotator agreement.
- We need to define exactly what we are looking for; we focused on entities.
- We search in windows around the entities and tested different sentiment dictionaries.
- We tried to separate (in a simplistic manner) positive/negative news content from positive/negative sentiment.

Slide 16. Future work
- Use cross-lingual bootstrapping methods to produce sentiment dictionaries in many languages;
- Compare opinion trends across multilingual sources and countries over time.