1 Intelligent Database Systems Lab
Automatic Keyphrase Extraction by Bridging Vocabulary Gap
Presenter: WU, MIN-CONG
Authors: Zhiyuan Liu, Xinxiong Chen, Yabin Zheng, Maosong Sun
2011, FCCNLL (Fifteenth Conference on Computational Natural Language Learning, CoNLL 2011)

2 Outline
Motivation
Objectives
Methodology
Experiments
Conclusions
Comments

3 Motivation
Most methods extract keyphrases according to their statistical properties in the given document. Such methods suffer from the large vocabulary gap between a document and its keyphrases.
Approach      Property
TFIDF         statistical frequencies
TextRank      tends toward statistical frequencies
ExpandRank    topic drift
LDA           suggests general words

4 Objectives
We use word alignment models in statistical machine translation to learn translation probabilities between the words in documents and the words in keyphrases.

5 Methodology: Bridging Vocabulary Gap Using WAM (word alignment models)

6 Methodology: Preparing Translation Pairs

7 Methodology: Title-based Pairs

8 Methodology: Summary-based Pairs
Approach          Property
Sampling method   loses the word order
Split method      longer training time of WAM
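A rough sketch of the two ways of building summary-based pairs named on this slide. The slide gives only the names and one property each, so the details are assumptions: `sample_pair` draws document words at random in proportion to a weight (for example TFIDF) until the sample is as long as the summary, which discards word order; `split_pairs` pairs every document sentence with every summary sentence, which multiplies the number of training pairs. Both function names and the toy data are hypothetical.

```python
import random

def sample_pair(doc_weights, summary_words, seed=0):
    """Assumed sampling method: draw len(summary_words) document words with
    probability proportional to their weights. Word order is lost."""
    rng = random.Random(seed)
    words = list(doc_weights)
    weights = [doc_weights[w] for w in words]
    sampled = rng.choices(words, weights=weights, k=len(summary_words))
    return sampled, list(summary_words)

def split_pairs(doc_sentences, summary_sentences):
    """Assumed split method: pair each document sentence with each summary
    sentence, so the pair count is the product of the two sentence counts."""
    return [(d, s) for d in doc_sentences for s in summary_sentences]

# Toy data, for illustration only.
doc_weights = {"keyphrase": 0.5, "extraction": 0.3, "news": 0.2}
summary = ["keyphrase", "news"]
sampled, target = sample_pair(doc_weights, summary)
pairs = split_pairs([["a", "b"], ["c"]], [["x"], ["y", "z"]])
```

Under these assumptions the trade-off on the slide is visible directly: sampling yields one short, unordered pair per document, while splitting yields many pairs and therefore more work for the alignment model.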

9 Methodology: Training Translation Models
translation pair connection
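The slide does not show the training procedure. As a minimal sketch, assuming the translation model is IBM Model-1 (the model named in the conclusions) trained with the standard EM updates and without a NULL word, the probabilities Pr(target | source) could be estimated as below. The function name and the toy pairs are hypothetical, not taken from the paper.

```python
from collections import defaultdict

def train_ibm_model1(pairs, iterations=10):
    """pairs: list of (source_words, target_words), both non-empty.
    Returns a dict mapping (target, source) to Pr(target | source)."""
    target_vocab = {t for _, tgt in pairs for t in tgt}
    # Start from a uniform distribution over target words.
    t_prob = defaultdict(lambda: 1.0 / len(target_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of (target, source) links
        total = defaultdict(float)   # expected count of each source word
        for src, tgt in pairs:
            for t in tgt:
                # E-step: spread one unit of mass for t over the source words.
                norm = sum(t_prob[(t, s)] for s in src)
                for s in src:
                    frac = t_prob[(t, s)] / norm
                    count[(t, s)] += frac
                    total[s] += frac
        # M-step: renormalize the expected counts per source word.
        t_prob = defaultdict(
            float, {(t, s): c / total[s] for (t, s), c in count.items()}
        )
    return t_prob

# Toy (document words, title words) pairs, for illustration only.
toy_pairs = [
    (["iphone", "price"], ["apple", "cost"]),
    (["iphone", "screen"], ["apple", "display"]),
    (["tablet", "screen"], ["ipad", "display"]),
]
probs = train_ibm_model1(toy_pairs, iterations=10)
```

On this toy data, "iphone" co-occurs with "apple" in two pairs, so EM shifts probability mass toward Pr(apple | iphone) and away from the words that co-occur with it only once.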

10 Methodology: Keyphrase Extraction
Noun phrase
Normalized TFIDF scores
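The slide names only the ingredients (noun-phrase candidates and normalized TFIDF scores). One plausible way to combine them with the learned translation probabilities, stated here as an assumption rather than as the authors' exact formula, is to score each candidate word t by Pr(t | d) = sum over document words w of Pr(t | w) * Pr(w | d), with Pr(w | d) taken as the normalized TFIDF weight, and to score a phrase by summing its word scores. All names and numbers below are hypothetical.

```python
def normalize(weights):
    """Scale non-negative weights (e.g. TFIDF) so they sum to 1."""
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}

def word_scores(doc_weights, trans_prob):
    """Assumed scoring: Pr(t | d) = sum_w Pr(t | w) * Pr(w | d).
    trans_prob maps (target, source) to Pr(target | source)."""
    scores = {}
    for (t, w), p in trans_prob.items():
        if w in doc_weights:
            scores[t] = scores.get(t, 0.0) + p * doc_weights[w]
    return scores

def rank_phrases(candidates, scores, top_k=5):
    """Rank candidate phrases (tuples of words) by the sum of word scores."""
    ranked = sorted(
        candidates,
        key=lambda phrase: sum(scores.get(t, 0.0) for t in phrase),
        reverse=True,
    )
    return ranked[:top_k]

# Toy numbers, for illustration only.
doc_weights = normalize({"a": 2.0, "b": 2.0})
trans_prob = {("x", "a"): 0.8, ("y", "a"): 0.2,
              ("x", "b"): 0.5, ("y", "b"): 0.5}
scores = word_scores(doc_weights, trans_prob)
top = rank_phrases([("y",), ("x",), ("x", "y")], scores, top_k=2)
```

Because the score of a word is built from translation probabilities, a candidate can score well even when it is rare or absent in the document itself, which is the sense in which such a model bridges the vocabulary gap.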

11 Experiment
Dataset:
Name                    Articles   Keyphrases        Number of words
Chinese news articles   13702      website editors   72900
                  documents   titles   summaries
average lengths   971.7       11.6     45.8
5-fold cross validation
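The evaluation protocol above can be sketched as follows. This is a generic 5-fold split over article indices; how the authors actually assigned articles to folds is not stated on the slide, so the round-robin assignment and the function name are assumptions.

```python
def k_fold_splits(n_items, k=5):
    """Yield (train_indices, test_indices) for each of k folds.
    Items are assigned to folds round-robin by index."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# 13702 is the article count given on the slide.
splits = list(k_fold_splits(13702, k=5))
```

Each article appears in exactly one test fold, and each model is trained on the remaining four folds.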

12 Experiment: Evaluation on Keyphrase Extraction
Performance Comparison and Analysis

13 Experiment: Influences of Parameters to TPR
Influence of Parameters
When Titles/Summaries Are Unavailable

14 Experiment: Beyond Extraction: Keyphrase Generation

15 Conclusions
We use IBM Model-1 to bridge the vocabulary gap between the two languages (documents and keyphrases) for keyphrase generation.

16 Comments
Advantages: The method can capture the semantic relations between words in documents and keyphrases.
Applications: Keyphrase extraction.

