Yukiko Kawai*, Yusuke Fujita*, Tadahiko Kumamoto**, Jianwei Zhang*, Katsumi Tanaka*** * Kyoto Sangyo University, Japan ** Chiba Institute of Technology,


1 Using a Sentiment Map for Visualizing Credibility of News Sites on the Web
Yukiko Kawai*, Yusuke Fujita*, Tadahiko Kumamoto**, Jianwei Zhang*, Katsumi Tanaka***
Affiliations: * Kyoto Sangyo University, Japan; ** Chiba Institute of Technology, Japan; *** Kyoto University, Japan

2 Outline: Background; Research goal; System overview; Offline processing; Online processing; Experimental evaluation; Conclusion and future work

3 Background
Rapid spread of web news sites (e.g., MSN, Google News).
Different sites may have different opinions about the same topic.
A question: What is your attitude towards the Iraq war? Do you agree or disagree?
To answer this question, I want to read some news to form an opinion about this topic.

4 Background: sentiment tendencies of sites
A reader asks: is the Iraq war right or wrong?
If the news site is a pro-war site (Site A, positive), the reader may conclude: "I agree with this war."
If the news site is an anti-war site (Site B, negative), the reader may conclude: "I disagree with this war."
A misconception may result if the sites' tendencies are not known in advance.
When the positive or negative tendency of each site is shown, the reader can say: "Well, I now have opinions from different sites."
Information credibility is improved, and this may lead to a more fair-minded judgment.


6 A concept of sentiment map
Demonstration: the query is "Iraq war".
Mapping: a sentiment graph (positive/negative) is placed at the location of each news site.
The top-ranked articles from each news site are listed.


8 System overview
Offline processing (preprocessing): crawl the Web to collect news articles from news sites (e.g., Yomiuri Osaka, Yomiuri Tokyo, Asahi Tokyo); perform morphological analysis; calculate tf-idf values; calculate sentiment values using a sentiment dictionary; store the articles with their tf-idf and sentiment values in the articles database.
Online processing (runtime processing), given a query:
1) retrieve articles from each news site
2) rank the articles based on tf-idf in each site
3) calculate the average of sentiment values for each site
4) generate a sentiment map


10 Offline processing
News article collection: crawl news articles from various news sites and store them in the DB.
News article analysis:
Eliminate HTML tags.
Perform morphological analysis to extract nouns, verbs, and adjectives.
Calculate the tf-idf value of each extracted word j for each news article p_i, where F_j is the frequency of word j in article p_i, F_all is the total number of words in p_i, N is the total number of articles, and N_j is the number of articles that include j.
Attach a sentiment vector to each news article, using a sentiment dictionary.
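The slide defines F_j, F_all, N, and N_j, but the tf-idf formula itself is not preserved in this transcript. The following is a minimal sketch that assumes the common form (F_j / F_all) * log(N / N_j); the function name `tf_idf` and the toy articles are illustrative and not taken from the presentation.

```python
import math
from collections import Counter


def tf_idf(articles):
    """Return one {word: tf-idf} dict per article.

    `articles` is a list of token lists, one list per article p_i.
    Assumed formula (not confirmed by the slides):
        tf-idf(j, p_i) = (F_j / F_all) * log(N / N_j)
    """
    n_articles = len(articles)           # N: number of all articles
    doc_freq = Counter()                 # N_j: number of articles including word j
    for tokens in articles:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in articles:
        counts = Counter(tokens)         # F_j: frequency of word j in this article
        total = len(tokens)              # F_all: number of all words in this article
        scores.append({
            word: (freq / total) * math.log(n_articles / doc_freq[word])
            for word, freq in counts.items()
        })
    return scores


# Toy corpus of already-tokenized articles (hypothetical data).
articles = [
    ["war", "iraq", "war", "peace"],
    ["baseball", "iraq"],
    ["baseball", "pitcher"],
]
scores = tf_idf(articles)
```

In this toy corpus, "war" scores higher than "iraq" in the first article because it is both more frequent there and absent from the other articles.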

11 Sample of sentiment dictionary
Each entry word w has a sentiment value O_e(w) for each sentiment e = a, b, c, d, where a: Dark/Bright, b: Rejection/Acceptance, c: Tension/Relaxation, d: Fear/Anger.
Entry word (w) | a | b | c | d
challenge | 0.618 | 0.687 | 0.752 | 0.500
collide | 0.344 | 0.353 | 0.315 | 0.529
death | 0.28 | 0.358 | 0.260 | 0.364
derailment | 0.31 | 0.546 | 0.403 | 0.291
revival | 0.91 | 0.521 | 0.429 | 0.000
rich | 0.597 | 0.676 | 0.761 | 0.466
Example: O_c(death) = 0.260.
The sentiment value O_e(w) of an entry word w is a value between 0 and 1 (e.g., 0: dark, 1: bright).
It is calculated by analyzing co-occurrence with the original sentiment words, based on 200 million articles of Nikkei newspapers.

12 Calculation of sentiment value O_e(w)
Sentiments (e = a, b, c, d) and their corresponding original sentiment words (e1, e2):
a: Bright/Dark. e1: bright, glad, happy. e2: dark, sad, painful.
b: Acceptance/Rejection. e1: approval, love, like. e2: reject, aversion, dislike.
c: Relaxation/Tension. e1: comfortable, peaceful, slow. e2: tension, emergency.
d: Anger/Fear. e1: angry, roar. e2: fear, scary, dread.
df(e): the number of occurrences of the original sentiment words e.
df(e&w): the number of co-occurrences of the original sentiment words e and an entry word w.
The sentiment value O_e(w) is computed from df(e) and df(e&w).

13 Calculation of sentiment value O_e(w): example
This slide uses the same sentiments and original sentiment words (e1, e2) as slide 12.
Sentiment value of the word "death" on dimension c (Relaxation/Tension): O_c(death) = 0.260.
The value is low because df(comfortable & death), df(peaceful & death), and df(slow & death) are much smaller than df(tension & death) and df(emergency & death).
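The exact formula for O_e(w) is not preserved in this transcript. As a hedged illustration of the idea only, the sketch below scores a word by the share of its normalized co-occurrence that falls on the e1 side. This is an assumed stand-in, not the authors' formula, and all counts are made up.

```python
def sentiment_value(df_e1, df_e2, df_e1_w, df_e2_w):
    """Hypothetical score in [0, 1]; NOT the formula from the presentation.

    df_e1, df_e2     : occurrence counts of the e1 / e2 original sentiment words
    df_e1_w, df_e2_w : co-occurrence counts of those words with the entry word w
    A value near 1 means w co-occurs mostly with e1, near 0 mostly with e2.
    """
    r1 = df_e1_w / df_e1 if df_e1 else 0.0   # co-occurrence rate with e1 words
    r2 = df_e2_w / df_e2 if df_e2 else 0.0   # co-occurrence rate with e2 words
    if r1 + r2 == 0:
        return 0.5                           # no evidence either way: treat as neutral
    return r1 / (r1 + r2)


# Made-up counts: the word co-occurs far more often with e2 (tension) words
# than with e1 (relaxation) words, so the score falls well below 0.5.
value = sentiment_value(df_e1=1000, df_e2=1000, df_e1_w=20, df_e2_w=80)
```

With these invented counts the score is about 0.2, which mirrors the qualitative behavior described for "death" on dimension c without reproducing the reported value.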

14 Sentiment vector O(TEXT) of a news article
The text of a news article is denoted TEXT.
TEXT contains n keywords, {w}.
Each sentiment value O_e(TEXT) (e = a, b, c, d) is calculated from the sentiment values O_e(w) of the keywords w.
The sentiment vector O(TEXT) of the article consists of these sentiment values.
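The aggregation formula for O_e(TEXT) does not survive in this transcript. A minimal sketch, assuming a plain unweighted mean over the keywords found in the dictionary (the backup slides mention a weight M_e(w), which this sketch ignores), might look as follows. The dictionary values are copied from the sample on slide 11; the function name is hypothetical.

```python
# Sentiment values (a, b, c, d) copied from the sample dictionary on slide 11.
SENTIMENT_DICT = {
    "death": (0.28, 0.358, 0.260, 0.364),
    "collide": (0.344, 0.353, 0.315, 0.529),
    "derailment": (0.31, 0.546, 0.403, 0.291),
}


def article_vector(keywords, dictionary=SENTIMENT_DICT):
    """Average the sentiment values of the keywords present in the dictionary.

    Assumption: an unweighted mean per dimension; the original system
    may weight keywords differently. Returns None if no keyword is known.
    """
    hits = [dictionary[w] for w in keywords if w in dictionary]
    if not hits:
        return None
    # zip(*hits) groups the values by dimension (a, b, c, d).
    return tuple(sum(column) / len(hits) for column in zip(*hits))


# "train" is not in the dictionary and is skipped.
vec = article_vector(["death", "collide", "train"])
```

Skipping unknown words keeps the average from being pulled toward an arbitrary default; returning None lets the caller decide how to treat articles with no dictionary hits.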


16 Online processing
When a user enters query keywords, the system performs the following steps:
1. Retrieve news articles that include the keywords.
2. Rank the articles based on tf-idf values for each news site.
3. Calculate the average of the sentiment vectors of the top n articles for each site.
4. Attach sentiment graphs to the corresponding locations of the news sites.
The system also presents a list of articles grouped by site.
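Steps 1 to 3 above can be sketched roughly as follows. The record layout (`site`, `tfidf`, `sentiment`), the single-keyword query, and the function name are assumptions made for illustration; the slides do not describe the actual data model.

```python
from collections import defaultdict


def sentiment_by_site(articles, query, top_n=2):
    """Return {site: average sentiment vector of its top_n articles for query}.

    Each article is a dict with hypothetical keys:
      'site'      : name of the news site
      'tfidf'     : {word: tf-idf value} computed offline
      'sentiment' : (a, b, c, d) sentiment vector computed offline
    """
    # Step 1: retrieve the articles that contain the query keyword.
    per_site = defaultdict(list)
    for article in articles:
        if query in article["tfidf"]:
            per_site[article["site"]].append(article)

    result = {}
    for site, hits in per_site.items():
        # Step 2: rank the site's articles by the keyword's tf-idf value.
        ranked = sorted(hits, key=lambda a: a["tfidf"][query], reverse=True)
        top = ranked[:top_n]
        # Step 3: average the sentiment vectors of the top n articles.
        vectors = [a["sentiment"] for a in top]
        result[site] = tuple(sum(col) / len(top) for col in zip(*vectors))
    return result


# Toy records (invented values).
articles = [
    {"site": "Site A", "tfidf": {"war": 0.5}, "sentiment": (0.2, 0.3, 0.2, 0.4)},
    {"site": "Site A", "tfidf": {"war": 0.3}, "sentiment": (0.4, 0.5, 0.4, 0.6)},
    {"site": "Site A", "tfidf": {"war": 0.1}, "sentiment": (0.9, 0.9, 0.9, 0.9)},
    {"site": "Site B", "tfidf": {"war": 0.2}, "sentiment": (0.8, 0.7, 0.6, 0.5)},
    {"site": "Site B", "tfidf": {"baseball": 0.4}, "sentiment": (0.1, 0.1, 0.1, 0.1)},
]
site_vectors = sentiment_by_site(articles, "war", top_n=2)
```

Step 4, drawing each site's graph at its map location, depends on the front end and is left out of this sketch.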


18 Experimental evaluation
Query: Daisuke Matsuzaka, a famous Japanese Major Leaguer.
A reviewer read all the retrieved articles from the different news sites and judged the sentiment of each news site as positive, negative, or neutral.
For comparison, the numeric sentiment values produced by our system were categorized into the discrete values positive, negative, or neutral.
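The slides do not state the thresholds used to turn numeric values into positive, negative, or neutral, nor exactly how agreement with the reviewer was counted. The sketch below assumes a neutral band of 0.4 to 0.6 and a simple per-dimension match rate; both choices, and all names, are hypothetical.

```python
# (low pole, high pole) for each dimension, following the labels on slide 19.
POLES = {
    "a": ("Dark", "Bright"),
    "b": ("Rejection", "Acceptance"),
    "c": ("Tension", "Relaxation"),
    "d": ("Fear", "Anger"),
}


def discretize(vector, low=0.4, high=0.6):
    """Map a numeric (a, b, c, d) vector to pole labels or 'Neutral'.

    The thresholds `low` and `high` are assumed; the slides do not give them.
    """
    labels = []
    for dim, value in zip("abcd", vector):
        low_pole, high_pole = POLES[dim]
        if value < low:
            labels.append(low_pole)
        elif value > high:
            labels.append(high_pole)
        else:
            labels.append("Neutral")
    return labels


def agreement(system_labels, reviewer_labels):
    """Fraction of dimensions on which the system and reviewer labels match."""
    matches = sum(s == r for s, r in zip(system_labels, reviewer_labels))
    return matches / len(reviewer_labels)


# Invented numeric vector for one site, compared with an invented reviewer judgment.
system = discretize((0.7, 0.65, 0.3, 0.5))
rate = agreement(system, ["Bright", "Acceptance", "Relaxation", "Neutral"])
```

Here the system and the reviewer agree on three of the four dimensions, so `rate` is 0.75.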

19 Experimental evaluation
Sentiments: a: Dark/Bright, b: Rejection/Acceptance, c: Tension/Relaxation, d: Fear/Anger.
Judged by | a | b | c | d
Reviewer (site 1) | Bright | Acceptance | Tension | Neutral
Web site 1 | Bright | Acceptance | Tension | Neutral
Reviewer (site 2) | Bright | Acceptance | Relaxation | Neutral
Web site 2 | Bright | Acceptance | Tension | Neutral
Reviewer (site 3) | Bright | Acceptance | Relaxation | Fear
Web site 3 | Bright | Acceptance | Tension | Fear
Web site 4 | Dark | Acceptance | Tension | Fear
For site 4, the reviewer row lists only "Neutral" and "Anger".
Precision is about 70%.
The news sites show some differences in sentiment.


21 Conclusion and future work
Conclusion:
Developed a system called sentiment map for visualizing the sentiment distinctions among different news sites.
Tested its effectiveness.
A prototype: http://klab.kyoto-su.ac.jp/~fujita/cgi-bin/Fuzilla/News/
Future work:
More experiments.
Sentiment analysis of readers and information recommendation based on it.

22 Thank you for your attention


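24 Sample of sentiment dictionary
Sentiments e = a, b, c, d, where a: Bright/Dark, b: Acceptance/Rejection, c: Relaxation/Tension, d: Anger/Fear.
For each entry word w, the first row gives S_e(w), the impression value, and the second row gives M_e(w), the weight.
Entry word (w) | Value | a | b | c | d
chosen-suru (challenge) | S_e(w) | 0.618 | 0.687 | 0.752 | 0.500
chosen-suru (challenge) | M_e(w) | 1.399 | 1.330 | 1.251 | 1.090
dassen (derailment) | S_e(w) | 0.31 | 0.546 | 0.403 | 0.291
dassen (derailment) | M_e(w) | 0.514 | 0.603 | 0.737 | 0.549
hofu-da (rich) | S_e(w) | 0.597 | 0.676 | 0.761 | 0.466
hofu-da (rich) | M_e(w) | 1.416 | 1.352 | 1.299 | 1.109
shibou (death) | S_e(w) | 0.28 | 0.358 | 0.260 | 0.364
shibou (death) | M_e(w) | 1.132 | 1.272 | 1.306 | 1.112
shototsu-suru (collide) | S_e(w) | 0.344 | 0.353 | 0.315 | 0.529
shototsu-suru (collide) | M_e(w) | 1.004 | 1.016 | 1.099 | 0.948
sosei (revival) | S_e(w) | 0.91 | 0.521 | 0.429 | 0.000
sosei (revival) | M_e(w) | 0.464 | 0.582 | 0.732 | 0.328
Example: S_c(death) = 0.260, M_c(death) = 1.306.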

25 Original impression words and their correspondence with sentiments
Sentiments (e = a, b, c, d) and original impression words (e1, e2):
a: Bright/Dark. e1: akarui (bright), ureshii (glad), tanoshii (happy). e2: kurai (dark), kanashii (sad), kurushii (painful).
b: Acceptance/Rejection. e1: shonin (approval), aikou (love), suki-da (like). e2: kyohi (reject), keno (aversion), kirai-da (dislike).
c: Relaxation/Tension. e1: yuttari (comfortable), nonbiri (peaceful), yukkuri (slow). e2: kincho (tension), kinkyuu (emergency).
d: Anger/Fear. e1: okoru (angry), dogou (roar). e2: osoreru (fear), kowai (scary), kyofu (dread).
The sentiment value O_e(w) of an entry word w is a value between 0 and 1 (1: positive, 0: negative).
It is calculated by analyzing the co-occurrence with the original impression words, based on the Nikkei Newspaper Full Text Database (about 200 million articles).

26 Sentiment value O_e(w) of an entry word w: example
This slide uses the same sentiments and original impression words (e1, e2) as slide 25.
Sentiment value of the word "death" on dimension c: O_c(death) = 0.260.
Co-occurrences of "comfortable" and "death" and of "peaceful" and "death" are much rarer than co-occurrences of "tension" and "death" and of "emergency" and "death".
S_e(w): impression value. M_e(w): weight.

27 A proposition of sentiment map
Demonstration: the query is "scandal".
A sentiment map is shown for each news site, on a scale from negative (-0.5) through 0 to positive (0.5).
The top-ranked articles from each news site are listed.


