A MIXED MODEL FOR CROSS LINGUAL OPINION ANALYSIS Lin Gui, Ruifeng Xu, Jun Xu, Li Yuan, Yuanlin Yao, Jiyun Zhou, Shuwei Wang, Qiaoyun Qiu, Ricky Chenug.

Content 2
– Introduction
– Our approach
– The evaluation
– Future work

Introduction 3
– Opinion analysis has become a focus topic in natural language processing research
– Rule-based and machine learning methods
– The bottleneck of machine learning methods
– The cross-lingual method for opinion analysis
– Related work

Our approach 4
– Cross-lingual self-training
– Cross-lingual co-training
– The mixed model

Cross-lingual self-training 5

6
[Figure: cross-lingual self-training pipeline. Box labels: source language annotated corpus, unlabeled source language corpus, MT, SVMs, target language annotated corpus, unlabeled target language corpus, MT, SVMs, Top K]
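The self-training loop suggested by the figure labels can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `train_stand_in` is a hypothetical word-count scorer used in place of the SVMs so the example runs with the standard library only, and the names `self_train`, `top_k` and `rounds` are invented for this sketch. Machine translation is assumed to have happened already, so every document is a token list in a single language.

```python
from collections import Counter


def train_stand_in(labeled):
    """Hypothetical stand-in for the SVM: sum of per-word label votes."""
    weights = Counter()
    for tokens, label in labeled:  # label is +1 or -1
        for tok in tokens:
            weights[tok] += label

    def score(tokens):
        # Signed score; the magnitude is used as a confidence value.
        return sum(weights[t] for t in tokens)

    return score


def self_train(labeled, unlabeled, top_k=2, rounds=3, train=train_stand_in):
    """Repeatedly move the top_k most confident unlabeled docs into the training set."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        score = train(labeled)
        ranked = sorted(pool, key=lambda doc: abs(score(doc)), reverse=True)
        # Skip documents the current classifier cannot decide on.
        confident = [doc for doc in ranked[:top_k] if score(doc) != 0]
        if not confident:
            break
        for doc in confident:
            labeled.append((doc, 1 if score(doc) > 0 else -1))
            pool.remove(doc)
    return train(labeled), labeled


# Toy data, invented for illustration only.
seed = [(["good", "great"], 1), (["bad", "awful"], -1)]
raw = [["good", "fun"], ["awful", "dull"], ["fun", "dull"]]
final_score, final_labeled = self_train(seed, raw)
```

In the toy run, "fun" and "dull" never appear in the seed data; the final scorer picks them up only through the pseudo-labeled documents added in the first round.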

Cross-lingual co-training 7
– Similar to cross-lingual transfer self-training
– Difference: in the co-training model, the classification results for a sample in one language and for its translation in the other language are combined for classification in each iteration

Cross-lingual co-training 8
[Figure: cross-lingual co-training pipeline. Box labels: source language annotated corpus, unlabeled source language corpus, SVMs, target language annotated corpus, unlabeled target language corpus, SVMs, Top K, MT]
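A minimal sketch of the co-training idea follows, assuming each unlabeled sample is available as a pair (source-language tokens, target-language tokens) produced by machine translation. The way the two views are combined here, a plain sum of scores, is an assumption; the slides do not give the combination rule. `train_view` is a hypothetical word-count scorer standing in for the SVMs, and all names and toy tokens are invented for illustration.

```python
from collections import Counter


def train_view(labeled):
    """Hypothetical stand-in for one language's SVM."""
    weights = Counter()
    for tokens, label in labeled:  # label is +1 or -1
        for tok in tokens:
            weights[tok] += label
    return lambda tokens: sum(weights[t] for t in tokens)


def co_train(src_labeled, tgt_labeled, pairs, top_k=1, rounds=5):
    """Each round, both views vote on every pair; the top_k pairs join both training sets."""
    src_labeled, tgt_labeled = list(src_labeled), list(tgt_labeled)
    pool = list(pairs)
    for _ in range(rounds):
        src_score = train_view(src_labeled)
        tgt_score = train_view(tgt_labeled)

        def combined(pair):
            # Assumed combination rule: add the two views' signed scores.
            return src_score(pair[0]) + tgt_score(pair[1])

        ranked = sorted(pool, key=lambda p: abs(combined(p)), reverse=True)
        picked = [p for p in ranked[:top_k] if combined(p) != 0]
        if not picked:
            break
        for pair in picked:
            label = 1 if combined(pair) > 0 else -1
            src_labeled.append((pair[0], label))
            tgt_labeled.append((pair[1], label))
            pool.remove(pair)
    return train_view(src_labeled), train_view(tgt_labeled)


# Toy data, invented for illustration; target tokens are romanized placeholders.
src_seed = [(["good"], 1), (["bad"], -1)]
tgt_seed = [(["hao"], 1), (["cha"], -1)]
raw_pairs = [(["good", "plot"], ["qingjie"]), (["slow"], ["cha", "man"])]
src_model, tgt_model = co_train(src_seed, tgt_seed, raw_pairs)
```

The second pair shows the difference from self-training: the source view has never seen "slow", so the label it learns for that word comes entirely from the target view's vote on the translation.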

The mixed model 9
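The transcript keeps no text for this slide; the conclusion only describes the mixed model as weight-based. One plausible reading, offered as an assumption and not as the authors' formula, is a linear interpolation between the self-training score and the co-training score. The function names and the `weight` parameter are hypothetical.

```python
def mixed_score(self_score, co_score, weight=0.5):
    """Assumed combination: weight * self-training + (1 - weight) * co-training."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * self_score + (1.0 - weight) * co_score


def mixed_label(self_score, co_score, weight=0.5):
    """Map the combined score to a polarity label (+1 or -1)."""
    return 1 if mixed_score(self_score, co_score, weight) >= 0 else -1
```

With `weight=0.5` the two models count equally; moving the weight toward 1 lets the self-training classifier dominate when the two disagree.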

The evaluation 10
– The dataset consists of reviews in the DVD, Book and Music categories
– The training data for each category contains 4,000 English annotated documents and 40 Chinese annotated documents
– The Chinese raw corpus contains 17,814 DVD documents, 47,071 Book documents and 29,677 Music documents
– Each category contains 4,000 Chinese documents

The baseline performance 11
– Training data: only the 40 Chinese annotated documents and the machine translation of the 4,000 English annotated documents
– Classifier: SVM-LIGHT
– Features: lexicon, unigram, bigram
– MT system: Baidu fanyi
– Segmentation tool: ICTCLAS
– Lexicon resource: WordNet Affect
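The three feature types named above can be illustrated with a small extractor. This is a sketch under assumptions: the slide does not say how the features are encoded, so the `uni=`, `bi=` and `lex=` prefixes, the count-based values and the function name are invented here, and the input is assumed to be already segmented into tokens.

```python
from collections import Counter


def extract_features(tokens, lexicon=frozenset()):
    """Count unigram, bigram and lexicon-hit features for one segmented document."""
    feats = Counter()
    for tok in tokens:
        feats["uni=" + tok] += 1
        if tok in lexicon:  # word listed in the sentiment lexicon
            feats["lex=" + tok] += 1
    for left, right in zip(tokens, tokens[1:]):  # adjacent token pairs
        feats["bi=" + left + "_" + right] += 1
    return feats


# Toy example, invented for illustration.
example = extract_features(["not", "good", "film"], lexicon={"good"})
```

Bigrams such as `not_good` keep some local context that unigrams lose, which is the usual motivation for combining the two.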

The baseline performance 12
Category    Accuracy
DVD         0.7373
Book        0.7215
Music       0.7423
Overall     0.7337

The evaluation of self-training 13

The evaluation of co-training 14

The evaluation of mixed model 15

The evaluation result 16
Team           DVD      Music    Book     Accuracy
BISTU          0.6473   0.6605   0.5980   0.6353
HLT-Hitsz      0.7773   0.7513   0.7850   0.7712
THUIR-SENTI    0.7390   0.7325   0.7423   0.7379
SJTUGSLIU      0.7720   0.7453   0.7240   0.7471
LEO_WHU        0.7833   0.7595   0.7700   0.7709
Our Approach   0.7965   0.7830   0.7870   0.7889
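The overall accuracy column appears to be the unweighted mean of the three category accuracies. That reading is inferred from the numbers and is not stated on the slide; the check below uses only the figures reported above and shows that every row agrees with the mean to within 0.0001. The helper names are hypothetical.

```python
# (DVD, Music, Book, reported overall accuracy), copied from the result slide.
RESULTS = {
    "BISTU": (0.6473, 0.6605, 0.5980, 0.6353),
    "HLT-Hitsz": (0.7773, 0.7513, 0.7850, 0.7712),
    "THUIR-SENTI": (0.7390, 0.7325, 0.7423, 0.7379),
    "SJTUGSLIU": (0.7720, 0.7453, 0.7240, 0.7471),
    "LEO_WHU": (0.7833, 0.7595, 0.7700, 0.7709),
    "Our Approach": (0.7965, 0.7830, 0.7870, 0.7889),
}


def mean_accuracy(dvd, music, book):
    """Unweighted mean over the three categories."""
    return (dvd + music + book) / 3


def best_team(results):
    """Team with the highest reported overall accuracy."""
    return max(results, key=lambda team: results[team][3])


for team, (dvd, music, book, reported) in RESULTS.items():
    mean = mean_accuracy(dvd, music, book)
    print(f"{team:13s} mean={mean:.4f} reported={reported:.4f}")
```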

Conclusion & future work 17
– The weight-based mixed model achieves the best performance on the NLP&CC 2013 CLOA bakeoff dataset
– The transfer learning process does not satisfy the independent and identically distributed hypothesis

Thanks 18
