
1 An Overview of Opinionated Tasks and Corpus Preparation
Hsin-Hsi Chen, Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
NTCIR-5, 2005
http://research.nii.ac.jp/ntcir/ntcir-ws6/opinion/ntcir5-opinionws-en.html

2 What is an opinion?
– An opinion is subjective information.
– An opinion usually contains an opinion holder, an attitude, and a target, though none of these is obligatory.
– A sentential clause or a meaningful unit (in Chinese) is the smallest unit of an opinion.

3 Why is opinion processing important?
– There is an explosion of information on the Internet, and it is hard for humans to extract opinions by hand.
– Public opinion is an important indicator for companies and the government.
– Opinions change over time, so tracking opinions automatically is an important issue.

4 Fact-based vs. Opinion-based
Examples:
– Circular vs. Happy
– He is an engineer. vs. He thinks that his boss is a kind person.
– Why is the sky blue? vs. Do people support the government?

5 Previous Work (1)
English:
– Sentiment words (Wiebe et al.; Kim and Hovy; Takamura et al.)
– Opinion sentence extraction (Riloff and Wiebe; Kim and Hovy)
– Opinion document extraction (Wiebe et al.; Pang et al.)
– Opinion summarization: reviews and products (Hu and Liu; Dave et al.)

6 Previous Work (2)
Japanese:
– Opinion extraction (Kobayashi et al.: reviews, at the word/sentence level)
– Opinion summarization (Morinaga et al.: product reputations; Seki, Eguchi, and Kando)
Chinese:
– Opinion extraction (Ku, Wu, Li, and Chen)
– Opinion summarization (Ku, Li, Wu, and Chen)
– News and blog corpora (Ku, Liang, and Chen)
Korean?

7 Corpus Preparation (1)
Quantity:
– How much material should we collect? Words/sentences/documents?
Source:
– Which sources should we pick? Should we mine opinions from general documents or from obviously opinionated documents (e.g., discussion groups)?
– News, reviews, blogs, …

8 Corpus Preparation (2)
Different granularities:
– Word level
– Sentence level
– Clause level
– Document level
– Multi-document (summarization)
Different sources
Different languages

9 Previous Work (Corpus Preparation 1/5)
Example: NRRC Summer Workshop on Multiple-Perspective QA
– People involved: 1 researcher, 3 graduate students, 6 professors
– Collected 270,000 documents over an 11-month period; retrieved documents relevant to 8 topics, with more than 200 documents per topic
Workshop: MPQA (Multi-Perspective Question Answering)
Host: Northeast Regional Research Center (NRRC), 2002
Leader: Prof. Janyce Wiebe
Participants: Eric Breck, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson

10 Previous Work (Corpus Preparation 2/5)
– Source: news documents (World News Connection, WNC)
– In other work at the word level: 2,615 words

11 Previous Work (Corpus Preparation 3/5)
Example: Using the NTCIR corpus (Chinese)
– Reusable
– NTCIR-2, news documents
– Retrieved documents relevant to 6 topics
– On average, 34 documents per topic
– At the word level: 838 words
– Experiments using NTCIR-3 are ongoing

12 Previous Work (Corpus Preparation 4/5)

13 Previous Work (Corpus Preparation 5/5)
Example: Using reviews from the Web (Japanese)
– Specific domains: cars and games
– 15,000 reviews (230,000 sentences) for cars; 9,700 reviews (90,000 sentences) for games
– Uses topic words (e.g., names of car and game companies)
– Semi-automatic methods for collecting opinion terms (with patterns)

14 Corpus Annotation
Annotation types (1):
– Support/Non-support
– Sentiment/Non-sentiment
– Positive/Neutral/Negative
– Strong/Medium/Weak
Annotation types (2):
– Opinion holder/Attitude/Target
– Nested opinions

15 Previous Work (Corpus Annotation 1/4)
Example: NRRC Summer Workshop on Multiple-Perspective QA (English)
– A total of 114 documents annotated
– 57 with deep annotations, 57 with shallow annotations
– 7 annotators

16 Previous Work (Corpus Annotation 2/4)
Tags:
– Opinion: on=implicit/formally declared
– Fact: onlyfactive=yes/no
– Subjectivity: strength=high/medium/low
– Attitude: neg-attitude/pos-attitude
– Writer: opinion holder information

17 Previous Work (Corpus Annotation 3/4)
Example: Using the NTCIR corpus (Chinese)
– A total of 204 documents annotated
– 3 annotators
– XML-style tags
– Opinion types are defined, but not strength (considering the agreement issue)

18 Previous Work (Corpus Annotation 4/4)

19 Corpus Evaluation (1)
How to choose materials?
– Filter out candidates whose annotations are too diverse among annotators? (agreement?)
– How many annotators are needed per candidate? (more annotators, lower agreement)
– How to build the gold standard? Voting; use instances with consistent annotations (a sketch follows below)
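A minimal sketch of the voting option above, assuming a simple majority rule over annotator labels; the function name gold_standard, the data layout, and the two-vote threshold are illustrative assumptions, not details from the slides.

```python
from collections import Counter

def gold_standard(annotations, min_votes=2):
    """annotations: list of (instance_id, [label from each annotator])."""
    gold = {}
    for instance_id, labels in annotations:
        label, count = Counter(labels).most_common(1)[0]
        if count >= min_votes:          # annotations consistent enough
            gold[instance_id] = label   # otherwise the instance is dropped
    return gold

# gold_standard([("s1", ["POS", "POS", "NEG"]), ("s2", ["POS", "NEG", "NEU"])])
# -> {"s1": "POS"}   (s2 is dropped: no label reaches two votes)
```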

20 Corpus Evaluation (2)
How to evaluate a corpus for a subjective task?
– Agreement (is it enough?)
– Kappa value (to what agreement level? On the commonly used Landis and Koch scale:)
Almost perfect agreement (0.81-1.00)
Substantial agreement (0.61-0.80)
Moderate agreement (0.41-0.60)
Fair agreement (0.21-0.40)
Slight agreement (0.01-0.20)
Less than chance agreement (below 0)

21 Kappa coefficient (wiki)
Cohen's kappa coefficient is a statistical measure of inter-rater agreement. It is generally considered more robust than a simple percent-agreement calculation, since κ takes into account the agreement occurring by chance. Cohen's kappa measures the agreement between two raters who each classify N items into C mutually exclusive categories. The first evidence of a kappa-like statistic in print can be attributed to Galton (1892).

22 Kappa coefficient (wiki)
The equation for κ is:
κ = (Pr(a) - Pr(e)) / (1 - Pr(e))
where Pr(a) is the relative observed agreement among raters and Pr(e) is the hypothetical probability of chance agreement.
– If the raters are in complete agreement, then κ = 1.
– If there is no agreement among the raters other than what would be expected by chance, then κ ≤ 0.

23 Kappa coefficient
Two raters are asked to classify objects into categories 1 and 2. The table below contains the cell probabilities for a 2-by-2 table.

             Rater #2: 1   Rater #2: 2   Total
Rater #1: 1  P11           P12           P1.
Rater #1: 2  P21           P22           P2.
Total        P.1           P.2           1

P0 = P11 + P22 is the observed level of agreement. This value needs to be compared with the value expected if the two raters were totally independent: Pe = P1. * P.1 + P2. * P.2
http://www.childrensmercy.org/stats/definitions/kappa.htm

24 Example
Hypothetical example: 29 patients are examined by two independent doctors (see the table below). 'Yes' denotes that a doctor diagnosed the patient with disease X; 'No' denotes that a doctor classified the patient as not having disease X.

                Doctor #2: No    Doctor #2: Yes   Total
Doctor #1: No   10/29 (34.5%)    7/29 (24.1%)     17/29 (58.6%)
Doctor #1: Yes  0/29 (0.0%)      12/29 (41.4%)    12/29 (41.4%)
Total           10/29 (34.5%)    19/29 (65.5%)    1

P0 = P11 + P22 = (10 + 12)/29 = 0.76
Pe = P1. * P.1 + P2. * P.2 = 0.586 * 0.345 + 0.414 * 0.655 = 0.474
Kappa = (0.76 - 0.474)/(1 - 0.474) = 0.54
http://www.dmi.columbia.edu/homepages/chuangj/kappa/
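To check the arithmetic above in code, here is a short sketch of Cohen's kappa for a 2-by-2 contingency table; the function name and the raw-count table layout are assumptions for illustration.

```python
def cohen_kappa(table):
    """table[i][j] = number of items rater 1 put in class i and rater 2 in class j."""
    n = sum(sum(row) for row in table)
    k = len(table)
    p_o = sum(table[i][i] for i in range(k)) / n              # observed agreement P0
    row_marg = [sum(table[i]) / n for i in range(k)]          # rater 1 marginals
    col_marg = [sum(table[i][j] for i in range(k)) / n for j in range(k)]  # rater 2 marginals
    p_e = sum(row_marg[i] * col_marg[i] for i in range(k))    # chance agreement Pe
    return (p_o - p_e) / (1 - p_e)

# Doctor #1 rows (No, Yes) vs. Doctor #2 columns (No, Yes), counts from the table above:
print(round(cohen_kappa([[10, 7], [0, 12]]), 2))  # -> 0.54
```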

25 Online Kappa Calculator
http://justus.randolph.name/kappa

26 Previous Work (Corpus Evaluation)
Different languages/annotations may yield different agreement levels.
– Kappa: 0.32-0.65 (factivity only, English)
– Kappa: 0.40-0.68 (word level, Chinese)
Different annotators with different backgrounds may also reach different agreement levels.

27 What is needed for this work?
– What kind of documents? News? Others? All relevant documents?
– Provide only the type of documents, or fully annotated documents for training?
– Provide some sentiment words as clues?
– At what granularity? Word, clause, sentence, document, or multi-document?
– In which language? Mono-lingual, multi-lingual, or cross-lingual?

28 Natural Language Processing, Lecture 15: Opinionated Applications
Hsin-Hsi Chen, Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan

29 Opinionated Applications
Opinion extraction:
– Sentiment word mining
– Opinionated sentence extraction
– Opinionated document extraction
Opinion summarization
Opinion tracking
Opinionated question answering
Multi-lingual/cross-lingual opinionated issues

30 Opinion Mining
– Opinion extraction identifies opinion holders, extracts the relevant opinion sentences, and decides their polarity.
– Opinion summarization recognizes the major events embedded in documents and summarizes the supportive and non-supportive evidence.
– Opinion tracking captures subjective information from various genres and monitors the development of opinions along spatial and temporal dimensions.

31 Opinion Extraction
Extract opinion evidence from words, sentences, and documents, and then determine their polarities.
In documents, the composition of opinions closely parallels the composition of semantics:
– Word -> Sentence -> Document
The algorithm is designed around this composition across granularities.

32 Seeds
– Sentiment words in the General Inquirer (GI) and the Chinese Network Sentiment Dictionary (CNSD) are collected as seeds.
– GI is in English, while CNSD is in Chinese; GI is translated into Chinese.
– A total of 10,542 qualified seeds are collected in NTUSD.

33 Statistics of Seeds

34 Thesaurus Expansion
The seed vocabulary is enlarged by:
– 同義詞詞林 (Tongyici Cilin, a Chinese thesaurus)
– 中央研究院中英雙語知識本體詞網 (the Academia Sinica Bilingual Ontological WordNet)
Words in the same cluster may not always have the same opinion tendency:
– 寬恕 (forgive) vs. 姑息 (appease)
How do we distinguish words with different polarities within the same cluster/synset? What are a word's opinion tendency and its strength?

35 Sentiment Tendency of a Character (raw score)

36 Sentiment Tendency of a Character (normalization)

37 Sentiment Tendency of a Word
– The sentiment degree of a Chinese word w is the average of the sentiment scores of its composing characters c1, c2, ..., cp.
– A positive score denotes a positive word; a negative score denotes a negative word; a score of zero denotes a non-sentiment or neutral word.
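The character-scoring formulas on the two preceding slides are images that did not survive in this transcript, so the sketch below only illustrates the character-to-word composition stated here. The normalized-frequency difference used in char_score, the tiny seed lists, and all names are assumptions for illustration.

```python
from collections import Counter

POS_SEEDS = ["快樂", "寬恕"]   # hypothetical positive seed words
NEG_SEEDS = ["悲傷", "姑息"]   # hypothetical negative seed words

pos_freq = Counter(c for w in POS_SEEDS for c in w)
neg_freq = Counter(c for w in NEG_SEEDS for c in w)

def char_score(c):
    """Assumed character score: normalized positive frequency minus normalized
    negative frequency, rescaled to the range [-1, 1]."""
    p = pos_freq[c] / sum(pos_freq.values())
    n = neg_freq[c] / sum(neg_freq.values())
    return 0.0 if p + n == 0 else (p - n) / (p + n)

def word_score(word):
    """Sentiment degree of a word: the average of its characters' scores,
    as stated on the slide above."""
    return sum(char_score(c) for c in word) / len(word)

# word_score(w) > 0: positive word; < 0: negative word; == 0: neutral / non-sentiment
```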

38 Opinion Extraction at Sentence Level
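The sentence-level formula on this slide is likewise not reproduced in the transcript. Below is a hedged sketch of one plausible reading: a sentence counts as opinionated if it contains sentiment words, and its polarity is the sign of the summed word scores (word_score refers to the word-level sketch above).

```python
def sentence_opinion(words, threshold=0.0):
    """words: the segmented words of one sentence; uses word_score from the
    word-level sketch above."""
    scores = [word_score(w) for w in words]
    opinionated = any(abs(s) > threshold for s in scores)   # contains sentiment words
    total = sum(scores)
    polarity = "positive" if total > 0 else "negative" if total < 0 else "neutral"
    return opinionated, polarity
```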

39 Opinion Extraction at Document Level
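Again a hedged sketch rather than the slide's own formula: one simple way to lift sentence-level decisions to the document level is a majority vote over the opinionated sentences (sentence_opinion refers to the sketch above).

```python
def document_opinion(sentences):
    """sentences: list of word lists, one per sentence in the document."""
    votes = {"positive": 0, "negative": 0, "neutral": 0}
    for words in sentences:
        opinionated, polarity = sentence_opinion(words)
        if opinionated:
            votes[polarity] += 1
    is_opinionated = votes["positive"] + votes["negative"] > 0
    return is_opinionated, max(votes, key=votes.get)
```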

40 Evaluation Corpus Preparation
– Sources: TREC (English; news) / NTCIR (Chinese; news) / blogs (Chinese; casual writing)
– The corpus is prepared for multi-genre and multi-lingual issues.
– The corpus is prepared to evaluate opinion extraction, summarization, and tracking.

41 Opinion Summarization
– Find the important topics of a document set.
– Find the sentences relevant to the important topics.
– Find the opinions embedded in those sentences.
– Summarize the opinions on the important topics.

42 Opinion Tracking
– Opinion tracking is a kind of graph-based opinion summarization: we are concerned with how opinions change over time.
– An opinion tracking system tells how people change their opinions as time goes by.
– To track opinions, opinion extraction and summarization are necessary: opinion extraction captures the changes in opinion polarities, while opinion summarization identifies the correlated events.

