Can you believe what you read online?: Modeling and Predicting Trustworthiness of Online Textual Information
V.G. Vinod Vydiswaran, Department of Computer Science, University of Illinois at Urbana-Champaign
December 5th, 2011

Web content: structured and free-text. Are all these pieces of information equally trustworthy?

Even reputed sources can make mistakes. Some sources / claims can be misleading on purpose.

Blogs and forums give information too. Sources may not be “reputed”, but the information can still be trusted.

Echoing quack of a duck. RESULT: A duck’s quack echoes! Is trustworthiness always objective?

Treating cancer. Should trustworthiness be subjective?

Is drinking alcohol good for the body?
“A review article of the latest studies looking at red wine and cardiovascular health shows drinking two to three glasses of red wine daily is good for the heart.” (Dr. Bauer Sumpio, M.D., Prof., Yale School of Medicine; Journal of the American College of Surgeons, 2005)
“Women who regularly drink a small amount of alcohol — less than a drink a day — are more likely to develop breast cancer in their lifetime than those who don’t.” (Dr. Wendy Chen, Asst. Prof. (Medicine), Harvard Medical School; Journal of the American Medical Association (JAMA), 2011)
Are these sources “reliable”? What do other sources/documents say about this information?

Every coin has two sides. People tend to be biased and may be exposed to only one side of the story (confirmation bias, effects of the filter bubble). For intelligent choices, it is wiser to also know about the other side. What is considered trustworthy may depend on the person’s viewpoint. Should trustworthiness be user-dependent?

Milk is good for humans… or is it?
For: Milk contains nine essential nutrients… The protein in milk is high quality, which means it contains all of the essential amino acids or 'building blocks' of protein. It is long established that milk supports growth and bone development. rbST [man-made bovine growth hormone] has no biological effects in humans. There is no way that bST [naturally-occurring bovine growth hormone] or rbST in milk induces early puberty.
Against: Dairy products add significant amounts of cholesterol and saturated fat to the diet... Milk proteins, milk sugar, and saturated fat in dairy products pose health risks for children and encourage the development of obesity, diabetes, and heart disease... Drinking of cow milk has been linked to iron-deficiency anemia in infants and children. One outbreak of enlarged breast development in boys and premature development of breast buds in girls in Bahrain was traced to ingestion of milk from a cow given continuous estrogen treatment by its owner to ensure uninterrupted milk production.
Given these evidence documents, users can make a decision.

Actors in the trustworthiness story 10 Claim Source Data Users Evidence Drinking alcohol is good for the body. “A review article of the latest studies looking at red wine and cardiovascular health shows drinking two to three glasses of red wine daily is good for the heart.” Women who regularly drink a small amount of alcohol — less than a drink a day — are more likely to develop breast cancer in their lifetime than those who don’t. Dr. Bauer Sumpio Dr. Wendy Chen News Corpus Medical sites Blogs Forums ClaimVerifier

ClaimVerifier: thesis contribution. A novel system for users to validate textual claims: free-text claims, with textual evidence incorporated into trust models.

Challenges in building ClaimVerifier: What kind of data can be utilized? How to find relevant pieces of evidence? Are sources trustworthy? How to present evidence? How to assign truth values to textual claims? How to address user bias? How to build trust models that make use of evidence?

Outline of the talk: (1) measuring source trustworthiness; (2) using forums as a data source; (3) content-based trust propagation models; (4) user biases and interface design.

ClaimVerifier: trustworthiness factors. Knowing why something is true is important. Source trustworthiness, information from blogs and forums, and user bias may all affect perceived trustworthiness.

Part 1: Identify trustworthy websites (sources). Case study: medical websites. Joint work with Parikshit Sondhi and ChengXiang Zhai (ECIR 2012).

Variations in online medical information.

Problem statement. For a (medical) website: What features indicate trustworthiness? How can you automate extracting these features? Can you learn to distinguish trustworthy websites from others?

“cure back pain”: top 10 results (example: health2us.com), annotated along trustworthiness dimensions: content, presentation, financial interest, transparency, complementarity, authorship, privacy.

Trustworthiness of medical websites. HON code principles: authoritative, complementarity, privacy, attribution, justifiability, transparency, financial disclosure, advertising policy. Our (automated) model: link-based features (transparency, privacy policy, advertising links); page-based features (commercial words, content words, presentation); website-based features (PageRank).

Research questions, revisited. For a (medical) website: What features indicate trustworthiness? (HON code principles.) How can you automate extracting these features? (Link, page, and site features.) Can you learn to distinguish trustworthy websites from others? Can you bias results to prefer trustworthy websites? (Yes: learned an SVM and used it to re-rank results, as sketched below.)
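To make the re-ranking step concrete, here is a minimal sketch of how a learned SVM over link-, page-, and site-level features might be used to re-order search results by predicted trustworthiness. The feature names, toy training data, and helper function are illustrative assumptions, not the ECIR 2012 implementation.

```python
# Minimal sketch (not the ECIR 2012 system): train an SVM on link-,
# page-, and site-level features and use its decision score to re-rank
# search results by predicted trustworthiness.
# Feature names and toy data below are illustrative assumptions.
from sklearn.svm import LinearSVC

# Each row: [has_privacy_policy, num_ad_links, commercial_word_ratio,
#            content_word_ratio, presentation_score, pagerank]
train_X = [
    [1, 0, 0.02, 0.60, 0.9, 6.1],   # trustworthy site
    [0, 9, 0.30, 0.20, 0.4, 2.3],   # untrustworthy site
    [1, 1, 0.05, 0.55, 0.8, 5.0],
    [0, 7, 0.25, 0.25, 0.3, 1.9],
]
train_y = [1, 0, 1, 0]              # 1 = trustworthy, 0 = not

clf = LinearSVC(C=1.0).fit(train_X, train_y)

def rerank(results, features):
    """Re-order search results by the SVM's trustworthiness score."""
    scores = clf.decision_function(features)
    return [url for _, url in sorted(zip(scores, results), reverse=True)]

print(rerank(["siteA.org", "siteB.com"],
             [[1, 0, 0.03, 0.58, 0.85, 5.5],
              [0, 8, 0.28, 0.22, 0.35, 2.0]]))
```

Any linear ranker could stand in for the SVM here; the point is that the classifier's decision score supplies a trustworthiness-based ordering of the results.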

Use the classifier to re-rank results: MAP over 22 queries, comparing Google's original ranking with our re-ranked results (chart in the original slide).

Part 2: Understanding what is written (evidence). Case study: scoring medical claims based on health forums (KDD 2011 Workshop on Data Mining for Medicine and Healthcare).

Many medical support groups are available.

Scoring claims via community knowledge: claims from a claim DB (e.g., “Essiac tea is an effective treatment for cancer.”, “Chemotherapy is an effective treatment for cancer.”) are scored against an evidence & support DB built from community content.

Problem statement. Given a corpus of community-generated content (e.g., forum postings, mailing lists) and a database of relations (“claims”), e.g., [disease, treatments] or [treatments, side-effects], can we rank and score the claims based on their support / verifiability, as demonstrated in the text corpus, and build a scoring function to rate databases as a whole?

Key steps: (A) Collect relevant evidence for claims: relation retrieval, query formulation. (B) Analyze the evidence documents: parse result snippets, find the sentiment expressed in each snippet. (C) Score and aggregate evidence: score snippets and aggregate them to get a claim score, yielding ranked treatment claims.

What is a relation? Entities: nouns, objects of interest, possibly typed (e.g., PER, ORG, Disease, Treatment). Relation words: verbs, binary for now, usually with roles (entities participating in the relation have specific roles). Examples: [PennState]_ORG protect [Sandusky]_PER; [Cancer]_Disease cured by [Essiac tea]_Treatment.

Relation retrieval (step A: collect relevant evidence for claims). Query formulation: a structured relation, possibly typed (Disease, Treatment). Query expansion: the relation with synonyms and words appearing in similar contexts (cured by: cure, treat, help, prevent, reduce); the entities with acronyms and common synonyms (Chemotherapy: chemo; Impotence: ED, erectile dysfunction, infertility). Query weighting.
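The query-expansion step can be illustrated with a small sketch; the synonym/acronym lists, term weights, and function name below are assumptions used only for illustration, not the actual system's resources.

```python
# Minimal sketch (illustrative): expand a structured
# [entity1, relation, entity2] claim into a weighted keyword query
# for retrieving candidate evidence posts.
# The synonym/acronym lists and weights are assumptions.

RELATION_SYNONYMS = {"cured by": ["cure", "treat", "help", "prevent", "reduce"]}
ENTITY_SYNONYMS = {
    "chemotherapy": ["chemo"],
    "impotence": ["ED", "erectile dysfunction", "infertility"],
}

def expand_query(entity1, relation, entity2,
                 w_entity=2.0, w_relation=1.0, w_expansion=0.5):
    """Return {term: weight} for a relation query such as
    'Impotence cured-by Chemotherapy'."""
    query = {entity1.lower(): w_entity, entity2.lower(): w_entity,
             relation: w_relation}
    # Expand the relation with synonyms / contextually similar words.
    for term in RELATION_SYNONYMS.get(relation, []):
        query[term] = max(query.get(term, 0.0), w_expansion)
    # Expand each entity with acronyms and common synonyms.
    for ent in (entity1, entity2):
        for syn in ENTITY_SYNONYMS.get(ent.lower(), []):
            query[syn.lower()] = max(query.get(syn.lower(), 0.0), w_expansion)
    return query

print(expand_query("Impotence", "cured by", "Chemotherapy"))
```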

Key steps revisited: (A) collect relevant evidence for claims; (B) analyze the evidence documents; (C) score and aggregate evidence into ranked treatment claims.

Scoring post snippets (step C: score and aggregate evidence). Signals: number of snippets found relevant (popularity); number of evidence contexts extracted; orientation of the posts (positive / negative); percentage of opinion words; subjectivity of opinion words used (strong, weak); length of posts; relevance of the post to the claim.

Scoring functions. Four variants to score posts: Opin (based on the number of opinionated words); Subj (based on the subjectivity of opinionated words); Scaled (scaling opinion words with subjectivity); Orient (based on the orientation, i.e., polarity, of the post). Three ways to aggregate scores: aggregating counts over relevant posts; averaging scores of individual posts; combining post scores in a weighted average. This yields 12 scoring/aggregation combinations; one combination is sketched below.
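As a rough illustration of one of these combinations (subjectivity-based post scoring with rank-weighted averaging, the variant reported as best later in the talk), here is a minimal sketch; the tiny opinion-word lexicon, weights, and example posts are made-up assumptions rather than the actual scoring functions.

```python
# Minimal sketch (illustrative, not the actual scoring functions):
# score each retrieved post by the subjectivity of its opinion words,
# then aggregate post scores with a rank-weighted average.
# The lexicon and example posts are assumptions.

STRONG = {"amazing", "terrible", "cured", "useless"}
WEAK = {"good", "bad", "helped", "unsure"}

def subjectivity_score(post):
    """Subj variant: weight strong opinion words more than weak ones."""
    words = post.lower().split()
    strong = sum(w in STRONG for w in words)
    weak = sum(w in WEAK for w in words)
    return (2.0 * strong + 1.0 * weak) / max(len(words), 1)

def rank_weighted_average(posts):
    """Aggregate: posts ranked higher by retrieval get larger weights."""
    weights = [1.0 / (rank + 1) for rank in range(len(posts))]
    scores = [subjectivity_score(p) for p in posts]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

posts = ["Chemotherapy cured my uncle, amazing results",
         "Essiac tea helped a little but I am unsure"]
print(rank_weighted_average(posts))
```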

Treatment effectiveness based on forums. QUESTION: Which treatments are more effective than others for a disease? In the ClaimVerifier framework: data = health forums and medical message boards; claims = treatment claims; evidence = forum posts describing the effectiveness of the treatment; sources = ignored in this study.

Evaluation: corpus statistics. A collection of nine health forums and discussion boards, not restricted to any specific disease or treatment class.

Treatment claims considered (approved vs. alternate treatments per disease):
AIDS. Approved: Abacavir, Kivexa, Zidovudine, Tenofovir, Nevirapine. Alternate: acupuncture, herbal medicines, multi-vitamins, Tylenol, selenium.
Arthritis. Approved: physical therapy, exercise, Tylenol, morphine, knee brace. Alternate: acupuncture, chondroitin, glucosamine, ginger rhizome, selenium.
Asthma. Approved: Salbutamol, Advair, Ventolin, bronchodilator, Xolair. Alternate: Atrovent, Serevent, Foradil, Ipratropium.
Cancer. Approved: surgery, chemotherapy, quercetin, selenium, glutathione. Alternate: Essiac tea, Budwig diet, Gerson therapy, homeopathy.
COPD. Approved: Salbutamol, smoking cessation, Spiriva, oxygen, surgery. Alternate: Ipratropium, Atrovent, Apovent.
Impotence. Approved: testosterone, implants, Viagra, Levitra, Cialis. Alternate: ginseng root, naltrexone, Enzyte, diet.

Results: ranking valid treatments. Datasets: Skewed (5 random valid + all invalid treatments) and Balanced (5 random valid + 5 random invalid treatments). Finding: our approach improves the ranking of valid treatments, significantly so on the Skewed dataset.

Experiment 2: variation in scoring schemes. Counting baseline: the number of posts returned for a treatment. Findings: subjectivity scoring seems better, and rank-based weighting of posts during aggregation of scores seems to improve MAP. (Chart in the original slide: MAP scores for cancer treatments vs. the counting baseline.)

Measuring site “trustworthiness”: as the ratio of degradation of the claim database increases, the database score (trustworthiness) should decrease. (Chart in the original slide.)

Over all six disease test sets, as noise is added to the claim database, the overall score decreases. Exception: Arthritis, because it starts off with a negative score. A sketch of this degradation test follows.
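A minimal sketch of the degradation test, under the assumption of a stand-in claim-scoring function (the real system would aggregate forum evidence as described above; the treatment lists and scores here are illustrative).

```python
# Minimal sketch (illustrative): progressively add invalid treatment
# claims ("noise") to the claim database and check that the aggregate
# database score decreases. claim_score() is a stand-in for the
# forum-based scoring above.
import random

def claim_score(claim):
    # Placeholder: in the real system this aggregates forum evidence.
    return 1.0 if claim in VALID else -0.5

VALID = {"chemotherapy", "surgery"}
INVALID = ["essiac tea", "budwig diet", "gerson therapy", "homeopathy"]

def database_score(claims):
    return sum(claim_score(c) for c in claims) / len(claims)

db = list(VALID)
for noise_ratio in (0.0, 0.25, 0.5):
    n_noise = int(noise_ratio * len(INVALID))
    noisy_db = db + random.sample(INVALID, n_noise)
    print(f"noise={noise_ratio:.2f}  score={database_score(noisy_db):.2f}")
```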

Conclusion: scoring claims using forums. It is feasible to score the trustworthiness of claims using signal from millions of patient posts. We scored treatment posts based on the subjectivity of opinion words, and extended the notion to score databases as a whole. It is possible to reliably leverage this general idea of validating knowledge through crowd-sourcing.

Modeling trust over content and sources. Remaining challenges in the ClaimVerifier framework: What kind of data can be utilized? How to find relevant pieces of evidence? Are sources trustworthy? How to assign truth values to textual claims? How to build trust models that make use of evidence?

Part 3: Content-driven trust propagation framework. Case study: news trustworthiness (KDD 2011).

Problem: how to verify claims? Sources (web sources; news media or reporters) provide evidence (passages that give evidence for the claim; news stories) for claims such as “A duck’s quack doesn’t echo.”, “PennState protected Sandusky.”, or “News coverage on the issue of immigration is biased.” Trust propagates across sources, evidence, and claims.

Typical fact-finding is over structured data: sources assert structured claims (e.g., Mt. Everest: 8848 m; K2: 8611 m; Mt. Everest: 8500 m), assuming structured claims and accurate IE modules.

Traditional 2-layer fact-finder model (hub-authority style). Prior research: TruthFinder (Yin, Han, & Yu, 2007); Pasternack & Roth, 2010; and many more. Sources s1…s5 are linked directly to claims c1…c4 (e.g., book title and authors, person and date of birth, city and population) with no context: the trustworthiness of a source and the veracity of its claims reinforce each other.
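For illustration, a minimal sketch of a generic 2-layer iteration in the hubs-authorities ("Sums") style; it is not a faithful reimplementation of TruthFinder, and the toy source-claim assertions are assumptions.

```python
# Minimal sketch of a generic 2-layer fact-finder iteration
# (hubs-authorities / "Sums" style), not a reimplementation of
# TruthFinder. Source trust and claim veracity reinforce each other.

claims_by_source = {
    "s1": ["Everest=8848m"],
    "s2": ["Everest=8848m", "K2=8611m"],
    "s3": ["Everest=8500m"],
}
sources_by_claim = {}
for s, cs in claims_by_source.items():
    for c in cs:
        sources_by_claim.setdefault(c, []).append(s)

trust = {s: 1.0 for s in claims_by_source}        # source trustworthiness
for _ in range(20):
    # Claim veracity: sum of the trust of the sources asserting it.
    veracity = {c: sum(trust[s] for s in srcs)
                for c, srcs in sources_by_claim.items()}
    top = max(veracity.values())
    veracity = {c: v / top for c, v in veracity.items()}
    # Source trust: sum of the veracity of the claims it asserts.
    trust = {s: sum(veracity[c] for c in cs)
             for s, cs in claims_by_source.items()}
    top = max(trust.values())
    trust = {s: t / top for s, t in trust.items()}

print(veracity)   # "Everest=8848m" ends up with the highest veracity
```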

Incorporating text in trust models: free-text claims (with structured data as a special case), plus (1) textual evidence and (2) support for adding IE accuracy, relevance, and similarity between texts.

Evidence-based 3-layer model. Defining the model: the three parameters and their initialization; the framework layers and scores for each layer; trust propagation over the framework; handling influence between layers. The graph now links sources s1…s5 to evidence documents e1…e10 to claims c1…c4.

Understanding model parameters. Scores computed: claim veracity, evidence trust (confidence), and source trust. Influence factors: evidence similarity, relevance of evidence to claims, and source-evidence influence. Initialization: uniform distributions for the trust scores; retrieval scores for relevance.

Computing trust scores. Trust scores are computed iteratively over three quantities: the veracity of claims, the trustworthiness of sources, and the confidence in evidence. The veracity of a claim depends on the evidence documents for the claim and their sources. The trustworthiness of a source is based on the claims it supports. The confidence in an evidence document depends on source trustworthiness and on confidence in other similar documents.

Computing trust scores, adding influence factors. The update for the confidence in a piece of evidence e_i sums, over all other pieces of evidence e_j for the same claim c(e_i), terms that combine the similarity of evidence e_i to e_j, the relevance of evidence e_j to the claim c(e_i), and the trustworthiness of the source of evidence e_j. A sketch of one such update follows below.
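The following sketch is one plausible instantiation of the three-layer dependencies described above, not the exact KDD 2011 update equations; the mixing weights, similarity and relevance values, and toy source-evidence-claim graph are assumptions.

```python
# Minimal sketch of a 3-layer (source -> evidence -> claim) trust
# propagation loop. A plausible instantiation of the dependencies in
# the text, not the published formulas; all data below is illustrative.

sources = {"e1": "s1", "e2": "s2", "e3": "s1"}        # evidence -> source
claim_of = {"e1": "c1", "e2": "c1", "e3": "c2"}       # evidence -> claim
relevance = {"e1": 0.9, "e2": 0.6, "e3": 0.8}         # retrieval-based init
similarity = {("e1", "e2"): 0.7, ("e2", "e1"): 0.7}   # pairwise evidence sim

trust = {"s1": 0.5, "s2": 0.5}                        # source trust
confidence = dict(relevance)                          # evidence confidence
veracity = {"c1": 0.5, "c2": 0.5}                     # claim veracity

for _ in range(10):
    # Confidence in e depends on its source and on similar evidence
    # for the same claim, weighted by similarity and relevance.
    confidence = {
        e: 0.5 * trust[sources[e]] + 0.5 * sum(
            similarity.get((e, f), 0.0) * relevance[f] * confidence[f]
            for f in confidence if f != e and claim_of[f] == claim_of[e]
        )
        for e in confidence
    }
    # Claim veracity aggregates the confidence of its evidence.
    veracity = {
        c: sum(confidence[e] * relevance[e]
               for e in confidence if claim_of[e] == c)
        for c in veracity
    }
    # Source trust averages the veracity of claims it gives evidence for.
    trust = {
        s: sum(veracity[claim_of[e]] for e in confidence if sources[e] == s)
           / sum(1 for e in confidence if sources[e] == s)
        for s in trust
    }

print(veracity, trust)
```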

Possible extensions (developed for structured claims with no context): mutually exclusive claims and constraints [Pasternack & Roth, 2011]; structure on sources, groups [Pasternack & Roth, 2011]; source copying [Dong, Srivastava, et al., 2009].

Application: trustworthiness in news. Sources = news media (or reporters); evidence = news stories; claims about whether news coverage on a particular topic or genre is biased. How true is a claim? Which news stories can you trust? Whom can you trust?

News trustworthiness. Data collected from NewsTrust (Politics category). Articles have been scored by volunteers on journalistic standards, on a [1,5] scale. Some genres are inherently more trustworthy than others.

Using the trust model to boost retrieval. Documents are scored on a 1-5 star scale by NewsTrust users; this is used as the gold judgment to compute NDCG values. NDCG was compared across a retrieval baseline, 2-stage models, and our model on 10 topics: Healthcare, Obama administration, Bush administration, Democratic policy, Republican policy, Immigration, Gay rights, Corruption, Election reform, and WikiLeaks, plus the average (scores in the original table).
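For reference, a minimal sketch of computing NDCG from graded 1-5 star judgments; the gain formula is the common exponential variant and the example star ratings are made up.

```python
# Minimal sketch (illustrative): NDCG for a ranked list of documents,
# using 1-5 star ratings as graded relevance. Example ratings are made up.
import math

def dcg(ratings):
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(ratings))

def ndcg(ranked_ratings):
    ideal = sorted(ranked_ratings, reverse=True)
    return dcg(ranked_ratings) / dcg(ideal) if dcg(ideal) > 0 else 0.0

# Star ratings of the top-5 results returned by some ranker:
print(round(ndcg([4, 2, 5, 1, 3]), 3))
```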

Trust scores for topic-specific news (charts in the original slides).

Which news sources should you trust? Does it depend on news genres? (Charts for news media and news reporters in the original slides.)

Conclusion: content-driven trust models. The truth value of a claim depends on its source as well as on evidence; evidence documents influence each other and have different relevance to claims. We developed a computational framework that associates relevant stories (evidence) with claims and sources. Global analysis of this data, taking into account the relations between stories, their relevance, and their sources, allows us to determine trustworthiness values over sources and claims. Experiments on news trustworthiness show promising results from incorporating evidence into the trustworthiness computation.

Part 4: Ongoing and future work. Remaining challenges in the ClaimVerifier framework: How to present evidence? How to address user bias?

Many (manual) fact-verification sites exist.

Can claim verification be automated? Users search for a claim. Traditional search looks up pieces of evidence based only on relevance; evidence search (ClaimVerifier) looks up pieces of evidence supporting and opposing the claim.

Evidence search. Retrieve text snippets as evidence that supports or opposes a claim; the validity of the claim is derived from the evidence results (sources, relevance, support). This differs from simple “relevance”-based retrieval: a passage must provide evidence that directly supports or contradicts the claim.

Textual entailment in search: a scalable entailed-relation recognizer. The hypothesis (claim) relation is run through expanded lexical retrieval over text-corpus indexes, and the retrieved texts are checked against the hypothesis by entailment recognition. Joint work with Mark Sammons and Dan Roth (ACL 2009).

Evidence search (2). Original goal: given a claim c, find the truth value of c. Modified goal: given a claim c, don't tell me if it is true or not, but show evidence. For simple claims (e.g., “Drinking milk is unhealthy for humans.”): find relevant documents; determine polarity (gold labels? textual entailment?); evaluate trust. Baselines: popularity, expert ranking, via information network (who says what else…).

Challenges in presenting evidence. What makes good evidence? Simply written, addresses claims directly? Avoids redundancy? Helps with polarity classification? Helps in evaluating trust? How to present results that best satisfy users? What do users prefer: information from credible sources, or information that closely aligns with their viewpoint? Does the judgment change if credibility/bias information is visible to the user?

BiasTrust: the user's perspective. How human biases affect credibility judgments of documents (ongoing research, joint work with Peter Pirolli, PARC).

Research goals. Understand human bias: What factors influence credibility judgments? Source expertise? Exposure to information vs. topic ignorance? Study how users interact with results: preference for credible results? Preference driven by confirmation bias? Building ClaimVerifier.

User study: task setup. Subjects are asked to learn more about some topic, possibly a “controversial” one. Subjects are shown quotes (documents) from “experts” on the topic; expertise varies and is subjective, and perceived expertise varies much more. Subjects are asked to judge whether the quotes are biased, informative, and interesting.

Specific questions. What do subjects prefer: information from credible sources, or information that closely aligns with their bias? Are (relevance) judgments on documents affected by user bias? Does the judgment change if credibility/bias information is visible to the user?

Many “controversial” topics, spanning health, science, politics, and education: Is milk good for you? (Is organic milk healthier? Raw? Flavored? Does milk cause early puberty?) Are alternative energy sources viable? (Different sources of alternative energy.) The Israeli-Palestinian conflict (statehood? history? settlements? international involvement, solution theories). Creationism vs. evolution? Global warming.

User study setup, similar to learning a new topic: pre-test to sense bias and the knowledge gap; expose users to general information; expose users to alternate viewpoints; post-test: did the bias shift? Why or why not? Characterize the shift by the strength of bias during the pre-test; survey to understand the key reason for the shift.

User interaction workflow (interface mock-up). After a pre-test, the user is shown factoids (“You may find this factoid relevant!”) along with source and expertise information, and is asked “What do you think about this factoid?” (mostly agree / somewhat agree / somewhat disagree / mostly disagree), whether it is interesting (yes / no), and whether it is biased (yes / no / neutral / can't say). The user can then choose to see similar or contrasting factoids, or quit, and finally takes a post-test.

Alternate layout (mock-up): factoids are shown side by side in two columns, Viewpoint 1 and Viewpoint 2, each with its source and expertise information.

What happens in the background? Pre-test phase: fixed questions based on sub-topics; learn a user ignorance model. During the experiment phase: update the user ignorance model as documents are read; credibility judgments are fed back for retrieval. Post-test survey to judge the bias shift: did the expertise of sources play a role?

Evaluation questions. Which model shows more acceptable quotes: ranking based on source expertise, ranking based on topic analysis (new knowledge for the user), or random? How does it change between topics? Characterize topics based on bias; is it transferable? Can topics be grouped?

Summary: towards building ClaimVerifier. Finding trustworthy information is critical. Textual content has a key role, even content from forums and blogs. Presenting evidence for or against claims will help.

Proposed timeline. User study [Spring 2012] to determine the factors that affect credibility judgments of documents, and an interface to present contradictory evidence. Modeling user bias in the trust framework [Summer 2012]. Build and evaluate ClaimVerifier [Fall 2012].

Thanks!

BiasTrust: section outline. Motivation; problem statement; a topic model to capture bias and topic distribution; learning the user model (captures the user's understanding of the topic, and allows the user model to influence document retrieval); user study setup.

Text analysis: evidence search. Goal: recognize supporting passages. How to model bias in documents? A topic model for bias / contrastive opinion. How to rank results? Prefer passages from credible sources; prefer passages conforming to the individual's viewpoint; or a combination of these (personalized?).

Bias-topic mixture model: each word w in a document is generated from a mixture of a topic model (topics Z1, Z2, …) and a bias model.
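One plausible way to write this word-generation process (the mixing weight lambda_B and the exact factorization are assumptions, not given on the slide):

```latex
% Plausible instantiation of the bias-topic mixture (assumption): a
% mixing weight \lambda_B interpolates the bias model and the topic model.
P(w \mid d) = \lambda_B \, P(w \mid \theta_{\mathrm{bias}})
            + (1 - \lambda_B) \sum_{z} P(w \mid z) \, P(z \mid d)
```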

User knowledge models. Each knowledge model is a distribution over the vocabulary V, learnt using the topic model plus prior knowledge. The “Known” model is learnt from the pre-test; the difference between the “Known” and “Expert” models represents the user's knowledge gap.

Four flavors of gauging trustworthiness. Building trust models over pieces of evidence: a content-driven trust propagation framework (KDD 2011). Scoring claims based on community knowledge: analysis of multiple, individually weak signals can yield strong and reliable evidence (KDD 2011 Workshop on Data Mining for Medicine and Healthcare). Identifying reliable sources: analysis of structural clues of medical websites (ECIR 2012). Squashing rumors with evidence search: find evidence for claims in a large text collection (ACL 2009).

Collecting claims and evidence. Claims: collected from fact-verification sites (e.g., factcheck.org, politifact.org), primarily in free-form text. Evidence: search for the claims on search engines; collect the result documents to build a working set; classify mentions in documents as relevant evidence or just passing references; score evidence based on the retrieval score of the document, the confidence that the passage is evidence, and the strength of the evidence (strong support, weak contradiction, …).

Estimating source trust scores. Each claim has many pieces of evidence from multiple sources. Trusted sites give supporting evidence for valid claims and contradictory evidence for invalid claims; untrusted sites give supporting evidence for invalid claims and contradictory evidence for valid claims. A site's trustworthiness score is computed over how strongly its evidence supports or contradicts claims, using typical fact-finding algorithms such as hub-authority style computation.

Computing trust scores (detail): trust scores are computed iteratively, with each update weighted by the trustworthiness of the source of evidence e_j (formula in the original slide).

Effect of context length. Longer contexts help relation retrieval and sentiment-based scoring; restricting the context to a passage of 3 sentences was found to be more effective than using the entire post.

Nuts and bolts. Scoring function: polarity variants, subjectivity, and scoring based on the rank of the post (a more relevant post gets a higher score). Aggregation variants: first score each post, then average scores across posts; or first aggregate counts across posts, then score. Best found: rank-weighted average subjectivity.

In the context of the proposed framework: claims = effectiveness of treatments for a disease; evidence = experiences shared by patients on health forums; sources = all considered uniformly trustworthy.

Related work. Message boards have been used to track events in the public healthcare domain (Chee, Berlin, and Schatz, 2009). Manually labeling medical websites as trustworthy: Health On the Net Foundation, HONcode. Finding trustworthy information using information networks: Yin et al., 2008; Pasternack & Roth, 2010; Vydiswaran et al., 2011. Other “wisdom-of-the-crowd” examples: mining search query logs (Pasca, 2006); sentiment analysis (Pang and Lee, 2008). Searching for specific medical information: Gaudinat et al., 2006; Hidola (GuidedMed), WebMD.

Summary. Recognizing trustworthy information is crucial, even more so with medical information online. Textual content and context have a key role: how to incorporate textual information to compute trust, and what kinds of text can be used? Humans interact with online information: they share, consume, and judge information. An interactive claim verification system that presents evidence for or against claims will help users.