1 A Unified Relevance Model for Opinion Retrieval (CIKM '09) Xuanjing Huang, W. Bruce Croft. Date: 2010/02/08. Speaker: Yu-Wen Hsu.



2 Outline
- Introduction
- Formal Framework for Opinion Retrieval
- Sentiment Expansion Techniques
- Test Collections and Sentiment Resources
- Experiments
- Conclusion

3 Introduction
Opinion retrieval differs substantially from traditional topic-based retrieval:
- Relevant documents must not only be relevant to the targets, but also contain subjective opinions about them.
- The text collections are more informal, "word-of-mouth" web data.

4 The greatest challenge is the difficulty of representing the user's information need.
Typical queries contain content words and some cue words.
The majority of previous work in opinion retrieval takes a two-stage approach:
- documents are first ranked by topical relevance only
- candidate relevant documents are then re-ranked by their opinion scores

5 The novelty and effectiveness of the approach:
- topic retrieval and sentiment classification are converted into a unified opinion retrieval procedure through query expansion;
- the method is not only suitable for English data, but also effective on a Chinese benchmark collection;
- it can be extended to other tasks where user queries are inadequate to express the information need, such as geographical information retrieval and music retrieval.

6 Formal Framework for Opinion Retrieval
Suppose we can obtain an opinion relevance model from a query.
- R: the opinion relevance model for a query
- D: the document model
- V: the vocabulary

7 Content words and opinion words contribute differently to opinion retrieval.
Topical relevance is determined by matching content words against the user query, while the sentiment and subjectivity of a document are determined by its opinion words.
We divide the vocabulary V into two disjoint subsets:
- a content word vocabulary CV
- an opinion word vocabulary OV

8 An opinion relevance model is a unified model of both topical relevance and sentiment. In this model, Score(D), the score of a document, is defined over both vocabularies:
- CV: the set of original query terms, or terms obtained by any query expansion technique
- OV: only opinion words, rather than content words, are used to expand the original query
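The Score(D) formula itself was not captured in this transcript. As a minimal sketch, assuming the standard relevance-model ranking score, the sum over words of P(w|R)·log P(w|D), split across the two vocabularies (the paper's exact combination and weighting may differ), document scoring could look like:

```python
import math

def score(doc_lm, rel_model, content_vocab, opinion_vocab):
    """Score a document under an opinion relevance model whose vocabulary
    is split into content words (CV) and opinion words (OV).

    doc_lm:    dict word -> P(w|D), a (smoothed) document language model
    rel_model: dict word -> P(w|R), the opinion relevance model
    Returns sum_w P(w|R) * log P(w|D), accumulated over CV and OV.
    """
    def part(vocab):
        # 1e-9 is a crude smoothing floor for unseen words, for illustration only
        return sum(rel_model.get(w, 0.0) * math.log(doc_lm.get(w, 1e-9))
                   for w in vocab)
    return part(content_vocab) + part(opinion_vocab)
```

A document that matches both the content terms and the opinion terms receives a higher (less negative) score than one that matches neither.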

9 Sentiment expansion techniques
- Query-independent sentiment expansion
- Query-dependent sentiment expansion
- Mixture relevance model

10 *Query-independent sentiment expansion
Sentiment expansion based on seed words:
- positive seeds and negative seeds
- advantage: seed words nearly always express strong sentiment
- disadvantage: some of them are infrequent in the text collections

11 Sentiment expansion based on text corpora:
- only the most frequent opinion words are selected to expand the original queries
- occurrences in a general corpus may be misleading, so reliable opinionated corpora are used: the Cornell movie review datasets and the MPQA corpus

12 Sentiment expansion based on relevance data:
- The goal is to automatically learn a ranking function from training data.
- Given a set of query relevance judgments, we can define the individual contribution to opinion retrieval of an opinion word.
- w: an opinion word, used to expand every original query
- contribution of w: the maximum increase in the MAP of the expanded queries over the set of original queries

13
- Qi: the i-th query in the training set
- Qi ∪ w: the i-th expanded query
- weight(w): the weight of w
- AP(Qi): the average precision of the documents retrieved for Qi
- AP(Qi ∪ w, weight(w)): the average precision of the documents retrieved for the expanded query when the weight of w is set to weight(w)
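Putting slides 12 and 13 together, a hedged sketch of computing an opinion word's contribution could look like the following; the evaluation function `ap` and the candidate weight grid are hypothetical stand-ins for the paper's actual retrieval runs and tuning procedure:

```python
def contribution(w, queries, ap, weights=(0.1, 0.5, 1.0)):
    """Maximum increase in mean average precision (MAP) obtained by
    expanding every training query with opinion word w, maximised over
    a grid of candidate weights for w.

    queries: list of training queries
    ap(q, expansion=None, weight=0.0): hypothetical evaluation function
        returning the average precision of the run for query q, optionally
        expanded with `expansion` at the given weight.
    """
    base_map = sum(ap(q) for q in queries) / len(queries)
    best_map = max(
        sum(ap(q, expansion=w, weight=wt) for q in queries) / len(queries)
        for wt in weights
    )
    return best_map - base_map
```

Words with the largest positive contribution would then be kept for query-independent expansion.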

14 *Query-dependent sentiment expansion
A target is always associated with some particular opinion words.
- e.g., Mozart, the Austrian musician (genius, famous)
Opinion words are extracted from a set of user-provided relevant opinionated documents.
When there are no user-provided relevant documents:
- the relevant document set C can be acquired through pseudo-relevance feedback: rank documents by their query-likelihood scores, then select some top-ranked documents as the pseudo-relevant set C.
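The pseudo-relevance-feedback step above can be sketched as follows; the data structures (precomputed query-likelihood scores, tokenised documents, an opinion lexicon) are illustrative assumptions, not the paper's implementation:

```python
from collections import Counter

def query_dependent_opinion_words(query_likelihood, docs, opinion_lexicon,
                                  top_docs=10, top_words=5):
    """Query-dependent sentiment expansion via pseudo-relevance feedback:
    rank documents by query-likelihood score, treat the top-ranked ones as
    the pseudo-relevant set C, and keep the opinion-lexicon words that
    occur most often in C.

    query_likelihood: dict doc_id -> query-likelihood score
    docs:             dict doc_id -> list of tokens
    """
    pseudo_rel = sorted(query_likelihood, key=query_likelihood.get,
                        reverse=True)[:top_docs]
    counts = Counter(w for d in pseudo_rel for w in docs[d]
                     if w in opinion_lexicon)
    return [w for w, _ in counts.most_common(top_words)]
```

For a query about Mozart, the top-ranked documents would be expected to surface words such as "genius" and "famous" for expansion.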

15 *Mixture relevance model
Combines query-independent and query-dependent expansion:
- query-independent: the most valuable opinion words are general words that can be used to express opinions about any target.
- query-dependent: the words most likely to co-occur with the terms of the original query are used for expansion; these words express opinions about particular targets.

16 The final score is the interpolation of the scores assigned by the original query, the query-independent sentiment expansion, and the query-dependent sentiment expansion.
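The interpolation is a simple weighted combination; a minimal sketch, with illustrative weights (the paper tunes these on training data), is:

```python
def mixture_score(s_orig, s_indep, s_dep, alpha=0.6, beta=0.2, gamma=0.2):
    """Mixture relevance model: interpolate the scores from the original
    query, the query-independent sentiment expansion, and the
    query-dependent sentiment expansion. Weights should sum to 1.
    """
    assert abs(alpha + beta + gamma - 1.0) < 1e-9
    return alpha * s_orig + beta * s_indep + gamma * s_dep
```

Setting gamma to 0 recovers a purely query-independent system; setting beta to 0 recovers a purely query-dependent one.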

17 Test Collections and Sentiment Resources
Both evaluations aim at locating documents that express an opinion about a given target.

18 *Sentiment resources
English resources:
- General Inquirer: 3,600 opinion words
- OpinionFinder's Subjectivity Lexicon: more than 5,600 opinion words

19 Chinese resources:
- HowNet Sentiment Vocabulary: 7,000 opinion words

20 Experiments


23 If retrieval effectiveness is preferred, mixture approaches should be adopted; if retrieval efficiency is preferred, query-independent sentiment expansion should be adopted.

24 Conclusion
We have proposed the opinion relevance model, a formal framework for directly modeling the information need in opinion retrieval.
Query terms are expanded with a small number of opinion words to represent the information need.
A series of sentiment expansion approaches is proposed to find the most appropriate query-independent or query-dependent opinion words.