
1 Can you believe what you read online? Modeling and Predicting Trustworthiness of Online Textual Information. V.G. Vinod Vydiswaran, Department of Computer Science, University of Illinois at Urbana-Champaign. December 5th, 2011.

2 Web content: structured and free-text. Are all these pieces of information equally trustworthy?

3 Even reputed sources can make mistakes. Some sources or claims can be misleading on purpose.

4 Blogs and forums give information too. Sources may not be "reputed", but their information can still be trusted.

5 Echoing quack of a duck. RESULT: A duck's quack does echo! Is trustworthiness always objective?

6 Treating cancer. Should trustworthiness be subjective?

7 Is drinking alcohol good for the body?
- "A review article of the latest studies looking at red wine and cardiovascular health shows drinking two to three glasses of red wine daily is good for the heart." (Dr. Bauer Sumpio, M.D., Prof., Yale School of Medicine; Journal of the American College of Surgeons, 2005)
- "Women who regularly drink a small amount of alcohol — less than a drink a day — are more likely to develop breast cancer in their lifetime than those who don't." (Dr. Wendy Chen, Asst. Prof. (Medicine), Harvard Medical School; Journal of the American Medical Association (JAMA), 2011)
Are these sources "reliable"? What do other sources/documents say about this information?

8 Every coin has two sides.
- People tend to be biased, and may be exposed to only one side of the story (confirmation bias, effects of the filter bubble).
- For intelligent choices, it is wiser to also know about the other side.
- What is considered trustworthy may depend on the person's viewpoint.
Should trustworthiness be user-dependent?

9 Milk is good for humans… or is it?
- Milk contains nine essential nutrients…
- Dairy products add significant amounts of cholesterol and saturated fat to the diet…
- The protein in milk is high quality, which means it contains all of the essential amino acids or 'building blocks' of protein.
- Milk proteins, milk sugar, and saturated fat in dairy products pose health risks for children and encourage the development of obesity, diabetes, and heart disease…
- Drinking cow milk has been linked to iron-deficiency anemia in infants and children.
- It is long established that milk supports growth and bone development.
- One outbreak of enlarged breasts in boys and premature development of breast buds in girls in Bahrain was traced to ingestion of milk from a cow given continuous estrogen treatment by its owner to ensure uninterrupted milk production.
- rbST [man-made bovine growth hormone] has no biological effects in humans. There is no way that bST [naturally-occurring bovine growth hormone] or rbST in milk induces early puberty.
Given these evidence documents, users can make a decision.

10 Actors in the trustworthiness story: Claim, Source, Data, Evidence, and Users, tied together by ClaimVerifier.
- Claim: "Drinking alcohol is good for the body."
- Evidence: "A review article of the latest studies looking at red wine and cardiovascular health shows drinking two to three glasses of red wine daily is good for the heart."; "Women who regularly drink a small amount of alcohol — less than a drink a day — are more likely to develop breast cancer in their lifetime than those who don't."
- Sources: Dr. Bauer Sumpio; Dr. Wendy Chen.
- Data: news corpus, medical sites, blogs, forums.

11 ClaimVerifier: thesis contribution.
- A novel system for users to validate textual claims.
- Handles free-text claims.
- Incorporates textual evidence into trust models.

12 Challenges in building ClaimVerifier:
- What kind of data can be utilized?
- How to find relevant pieces of evidence?
- Are sources trustworthy?
- How to present evidence?
- How to assign truth values to textual claims?
- How to address user bias?
- How to build trust models that make use of evidence?

13 Outline of the talk:
1. Measuring source trustworthiness
2. Using forums as a data source
3. Content-based trust propagation models
4. User biases and interface design

14 ClaimVerifier: trustworthiness factors.
- Knowing why something is true is important.
- Source trustworthiness.
- Information from blogs and forums.
- User bias may affect perceived trustworthiness.

15 Identify trustworthy websites (sources). Case study: medical websites. Joint work with Parikshit Sondhi and ChengXiang Zhai (ECIR 2012).

16 Variations in online medical information.

17 Problem statement. For a (medical) website:
- What features indicate trustworthiness?
- How can you automate extracting these features?
- Can you learn to distinguish trustworthy websites from others?

18 "cure back pain": top 10 results (e.g., health2us.com). Aspects examined: content, presentation, financial interest, transparency, complementarity, authorship, privacy.

19 Trustworthiness of medical websites.
- HONcode principles: authoritative, complementarity, privacy, attribution, justifiability, transparency, financial disclosure, advertising policy.
- Our model (automated): link-based features (transparency, privacy policy, advertising links); page-based features (commercial words, content words, presentation); website-based features (PageRank).
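As an illustration, a minimal sketch of the kind of automated feature extraction described above follows; the word lists, regexes, and helper names are assumptions for illustration, not the actual ECIR 2012 feature set.

```python
# Minimal sketch of automated feature extraction for a medical web page.
# The lexicons and link heuristics below are illustrative assumptions,
# not the features used in the thesis.
import re

COMMERCIAL_WORDS = {"buy", "order", "price", "discount", "shipping"}      # assumed lexicon
CONTENT_WORDS = {"symptom", "diagnosis", "treatment", "dosage", "study"}  # assumed lexicon

def page_features(html: str) -> dict:
    """Extract simple link-based and page-based trustworthiness signals."""
    text = re.sub(r"<[^>]+>", " ", html).lower()            # strip tags crudely
    tokens = re.findall(r"[a-z]+", text)
    links = re.findall(r'href="([^"]+)"', html, flags=re.I)
    n = max(len(tokens), 1)
    return {
        "has_privacy_link": any("privacy" in l.lower() for l in links),
        "has_about_link": any("about" in l.lower() for l in links),        # transparency proxy
        "num_ad_links": sum("ad" in l.lower() or "doubleclick" in l.lower() for l in links),
        "commercial_word_ratio": sum(t in COMMERCIAL_WORDS for t in tokens) / n,
        "content_word_ratio": sum(t in CONTENT_WORDS for t in tokens) / n,
    }
```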

20 Research questions and answers. For a (medical) website:
- What features indicate trustworthiness? The HONcode principles.
- How can you automate extracting these features? Via link, page, and site features.
- Can you learn to distinguish trustworthy websites from others? Yes.
- Can results be biased to prefer trustworthy websites? We learned an SVM and used it to re-rank results.

21 Use the classifier to re-rank results. MAP over 22 queries: Google 0.753, ours 0.817.
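A sketch of how such a learned classifier might be used to re-rank engine results is below; scikit-learn's LinearSVC and the tie-breaking rule are assumptions for illustration, not the exact setup from the thesis.

```python
# Sketch: train an SVM on per-site feature vectors and re-rank results so that
# sites predicted trustworthy come first. Tie-breaking by original rank is assumed.
from sklearn.svm import LinearSVC

def train_trust_classifier(X_train, y_train):
    clf = LinearSVC(C=1.0)
    clf.fit(X_train, y_train)          # y = 1 for trustworthy sites, 0 otherwise
    return clf

def rerank(results, features, clf):
    """results: URLs in original engine order; features: per-URL feature vectors."""
    scores = clf.decision_function([features[url] for url in results])
    # Higher trust score first; original rank breaks ties.
    order = sorted(range(len(results)), key=lambda i: (-scores[i], i))
    return [results[i] for i in order]
```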

22 Understanding what is written (evidence). Case study: scoring medical claims based on health forums (KDD 2011 Workshop on Data Mining for Medicine and Healthcare).

23 Many medical support groups are available.

24 Scoring claims via community knowledge: a Claim DB (e.g., "Essiac tea is an effective treatment for cancer.", "Chemotherapy is an effective treatment for cancer.") is scored against an Evidence & Support DB.

25 Problem statement. Given:
- a corpus of community-generated content (e.g., forum postings, mailing lists), and
- a database of relations ("claims"), e.g., [disease, treatments], [treatments, side-effects],
can we:
- rank and score the claims based on their support / verifiability, as demonstrated in the text corpus, and
- build a scoring function to rate databases as a whole?

26 Key steps:
A. Collect relevant evidence for claims: relation retrieval, query formulation.
B. Analyze the evidence documents: parse the result snippet, find the sentiment expressed in the snippet.
C. Score and aggregate evidence: score snippets and aggregate them to get a claim score.
Output: ranked treatment claims.

27 What is a relation?
- Entities: nouns, objects of interest, possibly typed.
- Relation words: verbs; binary, for now; usually with roles (entities participating in the relation have specific roles).
Examples: protect(PennState [ORG], Sandusky [PER]); cured-by(Cancer [Disease], Essiac tea [Treatment]).

28 Relation retrieval (step A: collect relevant evidence for claims).
- Query formulation: a structured relation, possibly typed (e.g., Disease, Treatment).
- Query expansion: the relation with synonyms and words with similar contexts (e.g., "cured by" → cure, treat, help, prevent, reduce); the entities with acronyms and common synonyms (e.g., Chemotherapy → Chemo; Impotence → ED, Erectile Dysfunction, Infertility).
- Query weighting.
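A minimal sketch of turning a (disease, treatment) relation into an expanded, weighted keyword query follows; the synonym/acronym tables and the weights are illustrative placeholders, not the original expansion lists.

```python
# Sketch of relation-query formulation and expansion. Expansion tables and
# term weights are placeholders chosen for illustration.
RELATION_SYNONYMS = {"cured by": ["cure", "treat", "help", "prevent", "reduce"]}
ENTITY_SYNONYMS = {
    "chemotherapy": ["chemo"],
    "impotence": ["ED", "erectile dysfunction", "infertility"],
}

def build_query(disease: str, treatment: str, relation: str = "cured by") -> list:
    """Return (term, weight) pairs for a weighted keyword query."""
    terms = [(disease, 1.0), (treatment, 1.0), (relation, 0.8)]
    for word in RELATION_SYNONYMS.get(relation, []):
        terms.append((word, 0.5))                      # expanded relation words
    for entity in (disease, treatment):
        for syn in ENTITY_SYNONYMS.get(entity.lower(), []):
            terms.append((syn, 0.7))                   # expanded entity names
    return terms

# Example: build_query("Cancer", "Chemotherapy")
```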

29 Key steps revisited: (A) collect relevant evidence for claims via relation retrieval and query formulation; (B) analyze the evidence documents by parsing result snippets and finding the sentiment expressed in them; (C) score snippets and aggregate them to get a claim score, yielding ranked treatment claims.

30 Scoring post snippets (step C: score and aggregate evidence). Signals considered:
- number of snippets found relevant (popularity);
- number of evidence contexts extracted;
- orientation of the posts (positive / negative);
- percentage of opinion words;
- subjectivity of opinion words used (strong, weak);
- length of posts;
- relevance of the post to the claim.

31 Scoring functions.
- Four variants to score posts: Opin (based on the number of opinionated words); Subj (based on the subjectivity of opinionated words); Scaled (scaling opinion words with subjectivity); Orient (based on the orientation, i.e. polarity, of the post).
- Three ways to aggregate scores: aggregating counts over relevant posts; averaging scores of individual posts; combining post scores in a weighted average.
- In all, 12 combinations.
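As a minimal sketch of one such combination, the code below scores posts by subjectivity (the Subj variant) and aggregates them with rank-based weights; the lexicons and weights are placeholders, not the thesis's actual scoring function.

```python
# Sketch of one scoring variant (Subj: subjectivity of opinion words) combined
# with rank-weighted aggregation across posts. Lexicon entries and weights are
# illustrative assumptions.
STRONG_SUBJ = {"amazing", "terrible", "cured", "worthless"}   # assumed lexicon
WEAK_SUBJ = {"helpful", "okay", "better", "worse"}            # assumed lexicon

def post_score(tokens, polarity):
    """Score one post snippet: subjectivity-weighted, signed by orientation."""
    subj = sum(2.0 for t in tokens if t in STRONG_SUBJ) + \
           sum(1.0 for t in tokens if t in WEAK_SUBJ)
    return polarity * subj / max(len(tokens), 1)      # polarity: +1 positive, -1 negative

def claim_score(ranked_posts):
    """Aggregate over retrieved posts, weighting higher-ranked posts more."""
    total, weight_sum = 0.0, 0.0
    for rank, (tokens, polarity) in enumerate(ranked_posts, start=1):
        w = 1.0 / rank                                 # rank-based weight
        total += w * post_score(tokens, polarity)
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0
```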

32 Treatment effectiveness based on forums. QUESTION: Which treatments are more effective than others for a disease? Data: health forums and medical message boards. Claims: treatment claims. Evidence: forum posts describing the effectiveness of the treatment. The other framework components are ignored here.

33 Evaluation: corpus statistics. A collection of nine health forums and discussion boards, not restricted to any specific disease or treatment class.

34 Treatment claims considered:
- AIDS. Approved: Abacavir, Kivexa, Zidovudine, Tenofovir, Nevirapine. Alternate: Acupuncture, Herbal medicines, Multi-vitamins, Tylenol, Selenium.
- Arthritis. Approved: Physical therapy, Exercise, Tylenol, Morphine, Knee brace. Alternate: Acupuncture, Chondroitin, Glucosamine, Ginger rhizome, Selenium.
- Asthma. Approved: Salbutamol, Advair, Ventolin, Bronchodilator, Xolair. Alternate: Atrovent, Serevent, Foradil, Ipratropium.
- Cancer. Approved: Surgery, Chemotherapy, Quercetin, Selenium, Glutathione. Alternate: Essiac tea, Budwig diet, Gerson therapy, Homeopathy.
- COPD. Approved: Salbutamol, Smoking cessation, Spiriva, Oxygen, Surgery. Alternate: Ipratropium, Atrovent, Apovent.
- Impotence. Approved: Testosterone, Implants, Viagra, Levitra, Cialis. Alternate: Ginseng root, Naltrexone, Enzyte, Diet.

35 Results: ranking valid treatments. Datasets:
- Skewed: 5 random valid + all invalid treatments.
- Balanced: 5 random valid + 5 random invalid treatments.
Finding: Our approach improves the ranking of valid treatments, with significant gains on the Skewed dataset.

36 Experiment 2: variation in scoring schemes. Counting baseline: number of posts returned for a treatment (MAP 0.3 on Cancer treatments). Findings:
- Subjectivity scoring seems better.
- Rank-based weighting of posts during aggregation of scores seems to improve MAP.

37 Measuring site "trustworthiness": as the claim database is degraded (plotting database score against the ratio of degradation), its trustworthiness score should decrease.

38 Over all six disease test sets, as noise is added to the claim database, the overall score decreases. Exception: Arthritis, because it starts off with a negative score.

39 Conclusion: scoring claims using forums.
- It is feasible to score the trustworthiness of claims using signals from millions of patient posts.
- We scored treatment posts based on the subjectivity of opinion words, and extended the notion to score databases.
- It is possible to reliably leverage this general idea of validating knowledge through crowdsourcing.

40 Modeling trust over content and source. Questions addressed: What kind of data can be utilized? How to find relevant pieces of evidence? Are sources trustworthy? How to assign truth values to textual claims? How to build trust models that make use of evidence?

41 Content-driven trust propagation framework. Case study: news trustworthiness (KDD 2011).

42 Problem: How to verify claims? Sources (web sources; news media or reporters) provide evidence (passages that give evidence for the claim; news stories) for claims such as "A duck's quack doesn't echo.", "PennState protected Sandusky.", or "News coverage on the issue of immigration is biased." Trust propagates across these layers.

43 Typical fact-finding is over structured data, assuming structured claims and accurate IE modules, e.g., (Mt. Everest, 8848 m), (K2, 8611 m), (Mt. Everest, 8500 m).

44 Traditional 2-layer fact-finder model (hub-authority style): a bipartite graph of sources (s1…s5) and claims (c1…c4) such as (book title, authors), (person, date of birth), (city, population), with no context. The model alternates between estimating the trustworthiness of sources and the veracity of claims. Prior research: TruthFinder (Yin, Han, & Yu, 2007); Pasternack & Roth, 2010; and many more.
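As a concrete reference point, one standard form of this hub-authority-style iteration (essentially the "Sums" fact-finder) is sketched below; TruthFinder and the other cited models differ in their exact scoring and normalization.

```latex
% Hub-authority-style ("Sums") updates for a 2-layer fact-finder;
% scores are typically rescaled after each round.
T^{(t)}(s) \;=\; \sum_{c \,:\, s \text{ asserts } c} B^{(t-1)}(c),
\qquad
B^{(t)}(c) \;=\; \sum_{s \,:\, s \text{ asserts } c} T^{(t)}(s)
```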

45 Incorporating text in trust models: claims are free-text (structured data is a special case); the model (1) adds a layer of textual evidence, and (2) supports adding IE accuracy, relevance, and similarity between texts.

46 Evidence-based 3-layer model: sources (s1…s5), evidence (e1…e10), and claims (c1…c4). Defining the model: defining the three parameters; initialization of the framework layers and their scores; trust propagation over the framework; handling influence between layers.

47 Understanding model parameters.
- Scores computed: claim veracity, evidence trust, source trust.
- Influence factors: evidence similarity, relevance, source-evidence influence.
- Initialization: a uniform distribution for the trust scores; the retrieval score for relevance.

48 Computing trust scores. Trust scores are computed iteratively over the three layers:
- Veracity of a claim depends on the evidence documents for the claim and their sources.
- Trustworthiness of a source is based on the claims it supports.
- Confidence in an evidence document depends on source trustworthiness and confidence in other similar documents.

49 Computing trust scores with influence factors. The iterative updates weigh: the similarity of evidence e_i to e_j; the relevance of evidence e_j to claim c_i; a sum over all other pieces of evidence for the claim c(e_i); and the trustworthiness of the source of evidence e_j.
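A minimal sketch of this 3-layer iteration, written directly from the verbal description above, follows; the specific update rules, weights, and normalization are assumptions for illustration and are not the exact KDD 2011 equations.

```python
# Sketch of the 3-layer (source -> evidence -> claim) iteration described above.
# Update rules follow the slide's verbal description; exact equations and
# normalizations from the KDD 2011 paper are not reproduced here.
def iterate_trust(sources, evidence, claims, rel, sim, src_of, n_iters=20):
    """
    sources, evidence, claims: lists of ids
    rel[(e, c)] : relevance of evidence e to claim c (from the retrieval score)
    sim[(e, f)] : similarity between two pieces of evidence
    src_of[e]   : source that published evidence e
    """
    T = {s: 1.0 for s in sources}      # source trustworthiness
    E = {e: 1.0 for e in evidence}     # confidence in evidence
    C = {c: 1.0 for c in claims}       # claim veracity
    for _ in range(n_iters):
        # Evidence confidence: its source's trust plus support from similar evidence.
        E = {e: T[src_of[e]] + sum(sim.get((e, f), 0.0) * E[f]
                                   for f in evidence if f != e)
             for e in evidence}
        # Claim veracity: relevance-weighted confidence of its evidence.
        C = {c: sum(rel.get((e, c), 0.0) * E[e] for e in evidence) for c in claims}
        # Source trust: veracity of the claims its evidence supports.
        T = {s: sum(rel.get((e, c), 0.0) * C[c]
                    for e in evidence if src_of[e] == s for c in claims)
             for s in sources}
        # Rescale each layer to keep scores comparable across iterations.
        for layer in (E, C, T):
            m = max(layer.values()) or 1.0
            for k in layer:
                layer[k] /= m
    return T, E, C
```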

50 Possible extensions: mutually exclusive claims and constraints [Pasternack & Roth, 2011]; structure on sources, groups [Pasternack & Roth, 2011]; source copying [Dong, Srivastava, et al., 2009].

51 Application: trustworthiness in news. Sources are news media (or reporters); evidence consists of news stories. Questions: Is news coverage on a particular topic or genre biased? How true is a claim? Which news stories can you trust? Whom can you trust?

52 News trustworthiness. Data collected from NewsTrust (Politics category); articles have been scored by volunteers on journalistic standards, on a [1,5] scale. Some genres are inherently more trustworthy than others.

53 Using the trust model to boost retrieval. Documents are scored on a 1-5 star scale by NewsTrust users; this is used as the gold judgment to compute NDCG values.

Topic | Retrieval | 2-stage models | Our model
1. Healthcare | 0.886 | 0.895 | 0.932
2. Obama administration | 0.852 | 0.876 | 0.927
3. Bush administration | 0.931 | 0.921 | 0.971
4. Democratic policy | 0.894 | 0.769 | 0.922
5. Republican policy | 0.774 | 0.848 | 0.936
6. Immigration | 0.820 | 0.952 | 0.983
7. Gay rights | 0.832 | 0.864 | 0.807
8. Corruption | 0.874 | 0.841 | 0.941
9. Election reform | 0.864 | 0.889 | 0.908
10. WikiLeaks | 0.886 | 0.860 | 0.825
Average | 0.861 | 0.869 | 0.915
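For reference, a standard NDCG computation over such 1-5 star gold ratings is sketched below; the slide does not state which gain/discount convention was used, so the common exponential-gain form is assumed here.

```python
# Standard NDCG over 1-5 star gold ratings, of the kind used in the table above.
# Uses the common (2^rel - 1) / log2(rank + 1) convention (an assumption).
import math

def dcg(ratings):
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(ratings))

def ndcg(ranked_ratings, k=None):
    """ranked_ratings: gold star ratings in the order the system returned them."""
    ranked = ranked_ratings[:k] if k else ranked_ratings
    ideal = sorted(ranked_ratings, reverse=True)[:len(ranked)]
    return dcg(ranked) / dcg(ideal) if dcg(ideal) > 0 else 0.0

# Example: ndcg([5, 3, 4, 1, 2]) compares this ranking against its ideal order.
```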

54 Trust scores for topic-specific news.

55 Trust scores for topic-specific news (continued).

56 Which news sources should you trust? Does it depend on news genres? (News media; news reporters.)

57 Conclusion: content-driven trust models.
- The truth value of a claim depends on its source as well as on evidence; evidence documents influence each other and have different relevance to claims.
- We developed a computational framework that associates relevant stories (evidence) with claims and sources.
- Global analysis of this data, taking into account the relations between stories, their relevance, and their sources, allows us to determine trustworthiness values over sources and claims.
- Experiments on news trustworthiness show promising results from incorporating evidence into trustworthiness computation.

58 Ongoing and future work: How to present evidence? How to address user bias?

59 Many (manual) fact-verification sites exist.

60 Can claim verification be automated? Users search for a claim. Traditional search looks up pieces of evidence based only on relevance; evidence search (ClaimVerifier) looks up pieces of evidence supporting and opposing the claim.

61 Evidence search.
- Retrieve text snippets as evidence that supports or opposes a claim.
- The validity of the claim is derived from the evidence results (sources, relevance, support).
- Different from simple "relevance"-based retrieval: a passage must provide evidence that directly supports or contradicts the claim.

62 Textual entailment in search: a scalable entailed-relation recognizer. The hypothesis (claim) relation is matched against text-corpus indexes using expanded lexical retrieval, and the retrieved text/hypothesis pairs are passed to entailment recognition. Joint work with Mark Sammons and Dan Roth (ACL 2009).

63 Evidence search (2). Original goal: given a claim c, find the truth value of c. Modified goal: given a claim c, don't tell me whether it is true or not, but show the evidence.
- Simple claims, e.g., "Drinking milk is unhealthy for humans."
- Find relevant documents.
- Determine polarity (gold labels? textual entailment?).
- Evaluate trust.
Baselines: popularity; expert ranking; via the information network (who says what else…).

64 Challenges in presenting evidence.
- What is good evidence? Simply written and addresses the claim directly? Avoids redundancy? Helps with polarity classification? Helps in evaluating trust?
- How to present results that best satisfy users? What do users prefer: information from credible sources, or information that closely aligns with their viewpoint? Does the judgment change if credibility/bias information is visible to the user?

65 BiasTrust: the user's perspective. How human biases affect credibility judgments of documents (ongoing research, joint work with Peter Pirolli, PARC).

66 Research goals.
- Understand human bias: What factors influence credibility judgments? Source expertise? Exposure to information vs. topic ignorance?
- Study how users interact with results: Do they prefer credible results? Do they prefer results that confirm their bias?
- Building ClaimVerifier.

67 User study: task setup.
- Subjects are asked to learn more about some topic, possibly a "controversial" one.
- Subjects are shown quotes (documents) from "experts" on the topic; expertise varies and is subjective, and perceived expertise varies much more.
- Subjects are asked to judge whether quotes are biased, informative, and interesting.

68 Specific questions.
- What do subjects prefer: information from credible sources, or information that closely aligns with their bias?
- Are (relevance) judgments on documents affected by user bias?
- Does the judgment change if credibility/bias information is visible to the user?

69 Many "controversial" topics, spanning health, science, politics, and education:
- Is milk good for you? (Is organic milk healthier? Raw? Flavored? Does milk cause early puberty?)
- Are alternative energy sources viable? (Different sources of alternative energy.)
- The Israeli-Palestinian conflict. (Statehood? History? Settlements? International involvement, solution theories.)
- Creationism vs. evolution.
- Global warming.

70 User study setup, similar to learning a new topic:
- Pre-test: sense bias and the knowledge gap.
- Expose users to general information, then to alternate viewpoints.
- Post-test: Did bias shift? Why / why not? Characterize the shift by the strength of bias during the pre-test.
- Survey to understand the key reason for the shift.

71 User interaction workflow: pre-test → factoid display → post-test. Each factoid is shown with its source and expertise ("You may find this factoid relevant!"), and the user is asked "What do you think about this factoid?" (mostly agree / somewhat agree / somewhat disagree / mostly disagree), "Interesting?" (yes / no), and "Biased?" (yes / no / neutral / can't say), with options to show similar, show contrast, or quit.

72 Alternate layout: two columns of factoids, one per viewpoint (Viewpoint 1 and Viewpoint 2), each entry annotated with its source and expertise.

73 What happens in the background?
- Pre-test phase: fixed questions based on sub-topics; learn a user-ignorance model.
- During the experiment phase: update the user-ignorance model as documents are read; credibility judgments are fed back for retrieval.
- Post-test survey to judge the bias shift: Did the expertise of sources play a role?

74 Evaluation questions.
- Which model shows more acceptable quotes: ranking based on source expertise, ranking based on topic analysis (new knowledge for the user), or random?
- How does it change between topics? Characterize topics based on bias. Is it transferable? Can topics be grouped?

75 Summary: towards building ClaimVerifier. Finding trustworthy information is critical. Textual content has a key role… even in forums and blogs. Presenting evidence for or against claims will help.

76 Proposed timeline.
- User study to determine the factors that affect credibility judgments of documents, and an interface to present contradictory evidence [Spring 2012].
- Modeling user bias in the trust framework [Summer 2012].
- Build and evaluate ClaimVerifier [Fall 2012].

77 Thanks! vgvinodv@illinois.edu

78 BiasTrust: section outline. Motivation; problem statement; a topic model to capture bias and topic distribution; learning the user model (captures the user's understanding of the topic; allows the user model to influence document retrieval); user study setup.

79 Text analysis: evidence search. Goal: recognize supporting passages.
- How to model bias in documents? A topic model for bias / contrastive opinion.
- How to rank results? Prefer passages from credible sources; prefer passages conforming to the individual's viewpoint; or a combination of these (personalized?).

80 Bias-topic mixture model: each word w is generated from a mixture of a topic model (topics Z1, Z2, …) and a bias model.
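The plate diagram is not reproduced here; one plausible word-level form of such a mixture is sketched below, where the mixing weight and the distributions are assumed notation rather than the thesis's own.

```latex
% One plausible word-level form of the bias-topic mixture; \lambda_B, the bias
% distribution, and the topics z are assumed notation for illustration.
p(w \mid d) \;=\; \lambda_B \, p(w \mid \text{bias})
\;+\; (1 - \lambda_B) \sum_{z} p(z \mid d)\, p(w \mid z)
```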

81 User knowledge models. Each model is a distribution over the word vocabulary V, learnt using the topic model plus prior knowledge; the Known model is learnt from the pre-test. The gap between the Known model and the Expert model is the knowledge gap.

82 Four flavors of gauging trustworthiness.
- Building trust models over pieces of evidence: the content-driven trust propagation framework (KDD 2011).
- Scoring claims based on community knowledge: analysis of multiple, yet weak, signals can yield strong and reliable evidence (KDD 2011 Workshop on Data Mining for Medicine and Healthcare).
- Identifying reliable sources: analysis of structural clues of medical websites (ECIR 2012).
- Squashing rumors with evidence search: find evidence for claims from a large text collection (ACL 2009).

83 Collecting claims and evidence.
- Claims: collected from fact-verification sites, e.g., factcheck.org, politifact.org; primarily in free-form text.
- Evidence: search for the claims on search engines; collect the result documents to build a working set; classify mentions in documents as relevant evidence or just passing references; score evidence based on the retrieval score of the document, the confidence in the passage being evidence, and the strength of the evidence (strong support, weak contradiction, …).

84 Estimating source trust scores.
- Each claim has many pieces of evidence from multiple sources. Trusted sites give supporting evidence for valid claims and contradictory evidence for invalid claims; untrusted sites give supporting evidence for invalid claims and contradictory evidence for valid claims.
- A site's trustworthiness score is computed over how strongly its evidence supports or contradicts claims, using typical fact-finding algorithms such as hub-authority-style computation.

85 Computing trust scores. Trust scores are computed iteratively; each update incorporates the trustworthiness of the source of evidence e_j.

86 Effect of context length. Longer contexts help relation retrieval and sentiment-based scoring; restricting the context to a passage of 3 sentences was found to be more effective than using the entire post.

87 Nuts and bolts.
- Scoring function: polarity variants; subjectivity; scoring based on the rank of the post (a more relevant post gets a higher score).
- Aggregation variants: first score each post, then average scores across posts; or first aggregate counts across posts, then score.
- Best found: rank-weighted average subjectivity.

88 In the context of the proposed framework: Claims are the effectiveness of treatments for a disease; Evidence is the experience shared by patients on health forums; Sources are all considered uniformly trustworthy.

89 Related work.
- Message boards have been used to track events in the public healthcare domain (Chee, Berlin, and Schatz, 2009).
- Manually labeling medical websites as trustworthy: Health On the Net Foundation, HONcode.
- Finding trustworthy information using an information network: Yin et al., 2008; Pasternack & Roth, 2010; Vydiswaran et al., 2011.
- Other "wisdom-of-the-crowd" examples: mining search query logs (Pasca, 2006); sentiment analysis (Pang and Lee, 2008).
- Searching for specific medical information: Gaudinat et al., 2006; Hidola (GuidedMed), WebMD.

90 Summary.
- Recognizing trustworthy information is crucial, even more so with medical information online.
- Textual content and context have a key role: How to incorporate textual information to compute trust? What kind of text can be used?
- Humans interact with online information: they share it, and they consume and judge it.
- An interactive claim verification system that presents evidence for or against claims will help users.

