A MIXED MODEL FOR CROSS LINGUAL OPINION ANALYSIS Lin Gui, Ruifeng Xu, Jun Xu, Li Yuan, Yuanlin Yao, Jiyun Zhou, Shuwei Wang, Qiaoyun Qiu, Ricky Chenug.


1 A MIXED MODEL FOR CROSS LINGUAL OPINION ANALYSIS Lin Gui, Ruifeng Xu, Jun Xu, Li Yuan, Yuanlin Yao, Jiyun Zhou, Shuwei Wang, Qiaoyun Qiu, Ricky Chenug

2 Content
- Introduction
- Our approach
- The evaluation
- Future work

3 Introduction
- Opinion analysis has become a focus topic in natural language processing research
- Rule-based and machine learning methods
- The bottleneck of machine learning methods
- The cross-lingual method for opinion analysis
- Related work

4 Our approach
- Cross-lingual self-training
- Cross-lingual co-training
- The mixed model

5 Cross-lingual self-training

6 [Figure: flow diagram. Labels: source language annotated corpus, unlabeled source language corpus, target language annotated corpus, unlabeled target language corpus, MT, SVMs, Top K]
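
The diagram labels on slide 6 suggest a loop of this shape: train a classifier on annotated data plus machine-translated annotated data, classify the unlabeled corpus, and move the Top K most confident samples into the training set. The following is a minimal sketch of one side of such a loop, not the authors' implementation. Everything in it is an assumption made for illustration: a token-count scorer stands in for the SVMs, a word lookup table stands in for MT, and the function names and toy data are hypothetical.

```python
from collections import Counter

def translate(tokens, table):
    """Word-by-word lookup; a crude stand-in for the MT step."""
    return tuple(table.get(tok, tok) for tok in tokens)

def train(labeled):
    """Toy stand-in for the SVM: token weight = positive docs minus negative docs."""
    weights = Counter()
    for tokens, label in labeled:          # label is +1 or -1
        for tok in set(tokens):
            weights[tok] += label
    return weights

def score(weights, tokens):
    """Sign gives the predicted label, magnitude a rough confidence."""
    return sum(weights[tok] for tok in set(tokens))

def self_train(labeled, unlabeled, k=2, rounds=3):
    """Repeatedly move the k most confident unlabeled docs into the training set."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        weights = train(labeled)
        ranked = sorted(pool, key=lambda doc: abs(score(weights, doc)), reverse=True)
        top_k = [doc for doc in ranked[:k] if score(weights, doc) != 0]
        if not top_k:
            break
        for doc in top_k:
            labeled.append((doc, 1 if score(weights, doc) > 0 else -1))
            pool.remove(doc)
    return train(labeled), labeled

# Hypothetical toy data (pinyin placeholders stand in for segmented Chinese).
mt_table = {"great": "jingcai", "movie": "dianying", "boring": "wuliao", "plot": "qingjie"}
english_annotated = [(("great", "movie"), 1), (("boring", "plot"), -1)]
chinese_annotated = [(("haokan",), 1)]
chinese_unlabeled = [("jingcai", "yinyue"), ("wuliao", "yinyue", "nanting"), ("nanting", "zhuanji")]

# Seed set: native annotated docs plus translated annotated docs.
seed = chinese_annotated + [(translate(doc, mt_table), y) for doc, y in english_annotated]
model, final_labeled = self_train(seed, chinese_unlabeled)
print(dict(final_labeled)[("nanting", "zhuanji")])
```

In this toy run the third unlabeled document scores zero in the first round and only becomes classifiable after the first round adds a document containing "nanting", which is the effect self-training relies on.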

7 Cross-lingual co-training
- Similar to cross-lingual transfer self-training
- Difference: in the co-training model, the classification results for a sample in one language and for its translation in the other language are combined for classification in each iteration

8 Cross-lingual co-training
[Figure: flow diagram. Labels: source language annotated corpus, unlabeled source language corpus, target language annotated corpus, unlabeled target language corpus, SVMs, Top K, MT]
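
The difference from self-training described on slide 7 can be sketched as follows: each unlabeled sample is paired with its translation, both classifiers score the pair, and the combined score decides which pairs join both training sets. This is a sketch under stated assumptions, not the authors' code. The token-count scorer is a stand-in for the SVMs, summing the two scores is only one possible way to combine them, and all names and data are hypothetical.

```python
from collections import Counter

def train(labeled):
    """Toy stand-in for an SVM: token weight = positive docs minus negative docs."""
    weights = Counter()
    for tokens, label in labeled:          # label is +1 or -1
        for tok in set(tokens):
            weights[tok] += label
    return weights

def score(weights, tokens):
    """Sign gives the predicted label, magnitude a rough confidence."""
    return sum(weights[tok] for tok in set(tokens))

def co_train(src_labeled, tgt_labeled, pairs, k=2, rounds=3):
    """pairs: list of (source_tokens, target_tokens), a sample and its translation."""
    src_labeled, tgt_labeled, pool = list(src_labeled), list(tgt_labeled), list(pairs)
    for _ in range(rounds):
        src_model, tgt_model = train(src_labeled), train(tgt_labeled)

        def joint(pair):
            # Assumed combination rule: add the scores from both languages.
            return score(src_model, pair[0]) + score(tgt_model, pair[1])

        ranked = sorted(pool, key=lambda p: abs(joint(p)), reverse=True)
        chosen = [p for p in ranked[:k] if joint(p) != 0]
        if not chosen:
            break
        for src_doc, tgt_doc in chosen:
            label = 1 if joint((src_doc, tgt_doc)) > 0 else -1
            src_labeled.append((src_doc, label))   # the same label goes to both sides
            tgt_labeled.append((tgt_doc, label))
            pool.remove((src_doc, tgt_doc))
    return train(src_labeled), train(tgt_labeled)

# Hypothetical toy data (pinyin placeholders stand in for segmented Chinese).
src_labeled = [(("great", "film"), 1), (("dull", "story"), -1)]
tgt_labeled = [(("haokan",), 1), (("nankan",), -1)]
pairs = [
    (("dull", "album"), ("wuliao", "zhuanji")),
    (("fine", "album"), ("haokan", "zhuanji")),
]

src_model, tgt_model = co_train(src_labeled, tgt_labeled, pairs)
print(score(tgt_model, ("wuliao",)), score(src_model, ("fine",)))
```

Here the target-side classifier starts with no evidence about "wuliao", and the source-side classifier starts with none about "fine". Each side picks up the label from the other side's score on the paired translation.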

9 The mixed model
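
The slide content for the mixed model did not survive in this transcript; the conclusion only describes it as weight based. One plausible reading, offered purely as an assumption, is a weighted combination of the scores from the self-training and co-training classifiers, with the weight chosen on a small annotated set. The sketch below illustrates that reading. The function names, the grid search, and the made-up development scores are all hypothetical.

```python
def mix(self_score, co_score, w):
    """Weighted combination of two classifier scores, with w in [0, 1]."""
    return w * self_score + (1.0 - w) * co_score

def accuracy(w, dev):
    """dev: list of (self_training_score, co_training_score, gold_label)."""
    hits = sum(1 for s, c, gold in dev if (mix(s, c, w) > 0) == (gold > 0))
    return hits / len(dev)

def pick_weight(dev, steps=10):
    """Grid search over w; returns the first weight with the highest accuracy."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda w: accuracy(w, dev))

# Made-up development scores, for illustration only.
dev = [
    (0.9, -0.2, 1),
    (-0.3, 0.8, 1),
    (0.4, -0.9, -1),
    (-0.6, -0.1, -1),
]

best_w = pick_weight(dev)
print(best_w, accuracy(best_w, dev), accuracy(0.0, dev), accuracy(1.0, dev))
```

On this toy set neither classifier alone (w = 0 or w = 1) labels every sample correctly, while an intermediate weight does, which is the motivation for mixing.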

10 The evaluation
- The dataset consists of reviews in the DVD, Book and Music categories
- The training data for each category contains 4,000 annotated English documents and 40 annotated Chinese documents
- The Chinese raw corpus contains 17,814 DVD documents, 47,071 Book documents and 29,677 Music documents
- Each category contains 4,000 Chinese documents

11 The baseline performance
- Uses only the 40 annotated Chinese documents and the machine translation of the 4,000 annotated English documents
- Classifier: SVM-LIGHT
- Features: lexicon, unigram, bigram
- MT system: Baidu Fanyi
- Segmentation tool: ICTCLAS
- Lexicon resource: WordNet-Affect
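
The three feature types named for the baseline (lexicon, unigram, bigram) could be extracted from a segmented document roughly as shown below. This is a minimal sketch, not the authors' feature pipeline. The feature naming scheme and the single lexicon entry are hypothetical, and a real run would take tokens from the segmenter and entries from the lexicon resource.

```python
from collections import Counter

def extract_features(tokens, lexicon):
    """Count unigram, bigram and lexicon-category features for one document."""
    feats = Counter()
    for tok in tokens:
        feats["uni=" + tok] += 1
        if tok in lexicon:                       # lexicon maps a word to a category
            feats["lex=" + lexicon[tok]] += 1
    for left, right in zip(tokens, tokens[1:]):  # adjacent token pairs
        feats["bi=" + left + "_" + right] += 1
    return feats

# Hypothetical lexicon entry and already-segmented tokens.
lexicon = {"happy": "joy"}
tokens = ["not", "a", "happy", "ending"]

feats = extract_features(tokens, lexicon)
for name in sorted(feats):
    print(name, feats[name])
```

Bigrams such as "not_a" keep some local word order that unigrams lose, which matters for negated opinions.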

12 The baseline performance
Category  Accuracy
DVD       0.7373
Book      0.7215
Music     0.7423
Overall   0.7337

13 The evaluation of self-training

14 The evaluation of co-training

15 The evaluation of the mixed model

16 The evaluation result
Team          DVD     Music   Book    Accuracy
BISTU         0.6473  0.6605  0.5980  0.6353
HLT-Hitsz     0.7773  0.7513  0.7850  0.7712
THUIR-SENTI   0.7390  0.7325  0.7423  0.7379
SJTUGSLIU     0.7720  0.7453  0.7240  0.7471
LEO_WHU       0.7833  0.7595  0.7700  0.7709
Our Approach  0.7965  0.7830  0.7870  0.7889
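
The Accuracy column appears to be the unweighted mean of the three category accuracies. That is an inference from the numbers, not something the slides state. The short check below recomputes the mean for each row, using only the figures from the results table, and reports how far it is from the listed value. The variable names are illustrative.

```python
# Rows copied from the results table: (DVD, Music, Book, reported overall accuracy).
results = {
    "BISTU":        (0.6473, 0.6605, 0.5980, 0.6353),
    "HLT-Hitsz":    (0.7773, 0.7513, 0.7850, 0.7712),
    "THUIR-SENTI":  (0.7390, 0.7325, 0.7423, 0.7379),
    "SJTUGSLIU":    (0.7720, 0.7453, 0.7240, 0.7471),
    "LEO_WHU":      (0.7833, 0.7595, 0.7700, 0.7709),
    "Our Approach": (0.7965, 0.7830, 0.7870, 0.7889),
}

def category_mean(dvd, music, book):
    """Unweighted mean over the three categories."""
    return (dvd + music + book) / 3

# Absolute gap between the recomputed mean and the reported overall accuracy.
gaps = {team: abs(category_mean(*row[:3]) - row[3]) for team, row in results.items()}

best_team = max(results, key=lambda team: results[team][3])
runner_up = max((t for t in results if t != best_team), key=lambda t: results[t][3])
margin = results[best_team][3] - results[runner_up][3]

for team, gap in gaps.items():
    print(f"{team:13s} gap between mean and reported accuracy: {gap:.5f}")
print(f"best: {best_team}, margin over {runner_up}: {margin:.4f}")
```

Every row agrees with the mean to within about 0.0001, which is consistent with rounding of the per-category figures.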

17 Conclusion & future work
- The weight-based mixed model achieves the best performance on the NLP&CC 2013 CLOA bakeoff dataset
- The transfer learning process does not satisfy the independent and identically distributed (i.i.d.) hypothesis

18 Thanks

