Yun-Nung (Vivian) Chen, Yu Huang, Sheng-Yi Kong, Lin-Shan Lee National Taiwan University, Taiwan.



2 Yun-Nung (Vivian) Chen, Yu Huang, Sheng-Yi Kong, Lin-Shan Lee National Taiwan University, Taiwan

3 Key Term Extraction, National Taiwan University

4 Definition — Key term: a term with higher term frequency that carries the core content of a document. Two types: keywords and key phrases. Advantages: useful for indexing and retrieval, and for capturing the relations between key terms and segments of documents.

5 Introduction

6 Introduction — acoustic model, language model, hmm, n gram, phone, hidden Markov model

7 Introduction — hmm, acoustic model, language model, n gram, phone, hidden Markov model, bigram. Target: extract key terms from course lectures.


9 Automatic Key Term Extraction — Pipeline: archive of original spoken documents (speech signals) → ASR → ASR transcriptions → Branching Entropy → Feature Extraction → Learning Methods: 1) K-means Exemplar, 2) AdaBoost, 3) Neural Network.


12 Phrase Identification — First, branching entropy is applied to the ASR transcriptions to identify phrases.

13 Key Term Extraction — Then, learning methods extract key terms (e.g. entropy, acoustic model, ...) from the phrase candidates using a set of features.



17 Branching Entropy — How to decide the boundary of a phrase? "hidden" is almost always followed by the same word; "hidden Markov" is almost always followed by the same word; but "hidden Markov model" is followed by many different words (is, of, in, ...). Branching entropy is defined to detect such possible boundaries.

18 Branching Entropy — Definition of right branching entropy: for a phrase X with children x_i (the distinct words observed immediately after X), let P(x_i|X) be the probability of child x_i given X. The right branching entropy of X is H_r(X) = -Σ_i P(x_i|X) log P(x_i|X).

19 Branching Entropy — Decision of the right boundary: a right boundary is hypothesized between X and x_i where the right branching entropy rises, i.e. where many different words can follow.


23 Branching Entropy — Decision of the left boundary: reverse the word order (X: "model Markov hidden") and find the left boundary between X and x_i in the same way, using left branching entropy. Implemented with a PAT tree.

24 Branching Entropy — Implementation in the PAT tree: each node stores the frequency of a word sequence, so P(x_i|X) and the right branching entropy of X are computed directly from the counts of X and its children. Example: X: "hidden Markov"; x_1: "hidden Markov model"; x_2: "hidden Markov chain" (other children: state, distribution, variable).
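The right-branching-entropy idea can be sketched in a few lines of Python. This is a brute-force illustration over an invented toy token list (a real system would compute the counts from a PAT tree, as the slides describe); the corpus and contexts are assumptions for the example only:

```python
import math
from collections import Counter

def right_branching_entropy(tokens, context):
    """H_r(X) = -sum_i P(x_i|X) log P(x_i|X), where x_i ranges over the
    distinct words seen immediately after the context X in the corpus."""
    n = len(context)
    followers = Counter(
        tokens[i + n]
        for i in range(len(tokens) - n)
        if tokens[i:i + n] == list(context)
    )
    total = sum(followers.values())
    return -sum((c / total) * math.log(c / total) for c in followers.values())

# Toy corpus: "hidden" and "hidden markov" have deterministic
# continuations, while "hidden markov model" is followed by many words.
corpus = ("hidden markov model is a tool . "
          "hidden markov model can represent sequences . "
          "hidden markov model of speech .").split()

h1 = right_branching_entropy(corpus, ["hidden"])                    # always "markov"
h2 = right_branching_entropy(corpus, ["hidden", "markov"])          # always "model"
h3 = right_branching_entropy(corpus, ["hidden", "markov", "model"]) # is / can / of

# The entropy rises only after "hidden markov model", so a right
# boundary is hypothesized there.
```

The left boundary is handled the same way after reversing the token order.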

25 Feature Extraction — Extract prosodic, lexical, and semantic features for each candidate term produced by phrase identification.

26 Feature Extraction — Prosodic features, computed for each candidate term at its first appearance. Speakers tend to use longer duration to emphasize key terms. Duration (I–IV): normalized duration (max, min, mean, range) over the term, where the duration of each phone (e.g. phone "a") is normalized by the average duration of that phone.


28 Feature Extraction — Higher pitch may signal significant information. Pitch (I–IV): F0 (max, min, mean, range).


30 Feature Extraction — Higher energy emphasizes important information. Energy (I–IV): energy (max, min, mean, range).

31 Feature Extraction — Lexical features, well-known measures computed for each candidate term: TF: term frequency; IDF: inverse document frequency; TFIDF: TF × IDF; PoS: the PoS tag.
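The TF-IDF feature above is standard; here is a minimal sketch using the common tf × log(N/df) weighting (not necessarily the exact variant used in the paper), over invented toy documents:

```python
import math
from collections import Counter

def tfidf(docs):
    """Per-document TF-IDF scores: tf(t, d) * log(N / df(t)),
    where df(t) is the number of documents containing term t."""
    N = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return [
        {t: tf * math.log(N / df[t]) for t, tf in Counter(d).items()}
        for d in docs
    ]

docs = [
    "hidden markov model acoustic model".split(),
    "language model n gram".split(),
    "viterbi algorithm decoding".split(),
]
scores = tfidf(docs)

# "model" appears in two of the three documents, so despite its higher
# term frequency its IDF (log 3/2) keeps its score below "viterbi"'s.
```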

32 Feature Extraction — Semantic features: Probabilistic Latent Semantic Analysis (PLSA) yields the latent topic probability P(T_k|t_j), where the D_i are documents, the T_k latent topics, and the t_j terms. Key terms tend to focus on limited topics.

33 Feature Extraction — The latent topic distribution P(T_k|t_j) describes how a term spreads over topics: a key term is concentrated on a few topics, while a non-key term is spread out. How to use it? LTP (I–III): Latent Topic Probability (mean, variance, standard deviation).


35 Feature Extraction — Latent Topic Significance (LTS): the ratio of a term's within-topic frequency to its out-of-topic frequency for each topic. LTS (I–III): Latent Topic Significance (mean, variance, standard deviation).


37 Feature Extraction — Latent Topic Entropy: LTE(t_j) = -Σ_k P(T_k|t_j) log P(T_k|t_j), the entropy of the term's latent topic distribution. Because key terms focus on limited topics, their distribution is peaked and their LTE is lower; non-key terms have flatter distributions and higher LTE.
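The latent topic entropy feature can be illustrated directly. The topic distributions below are invented for the example; in practice a trained PLSA model supplies the real P(T_k|t):

```python
import math

def latent_topic_entropy(topic_dist):
    """LTE(t) = -sum_k P(T_k|t) log P(T_k|t). A key term concentrates on
    few topics, so its distribution is peaked and its LTE is low."""
    return -sum(p * math.log(p) for p in topic_dist if p > 0)

# Hypothetical P(T_k|t) distributions over 4 latent topics:
key_term = [0.85, 0.05, 0.05, 0.05]  # peaked -> low LTE
non_key  = [0.25, 0.25, 0.25, 0.25]  # flat   -> high LTE (= log 4)

lte_key, lte_non = latent_topic_entropy(key_term), latent_topic_entropy(non_key)
```

A lower LTE is therefore evidence that the candidate is a key term.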

38 Learning Methods — Unsupervised and supervised approaches are used to extract key terms from the candidates and their features.

39 Learning Methods — Unsupervised learning: K-means Exemplar. Transform each term into a vector in LTS (Latent Topic Significance) space, run K-means, and take the term closest to each cluster centroid as a key term. The terms in the same cluster focus on a single topic, so the candidate terms in a group are related to its key term, and the key term can represent that topic.
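A minimal sketch of the K-means Exemplar idea, assuming each term is represented by its LTS vector (the vectors below are invented toy data): plain K-means clusters the terms, and the term nearest each centroid is returned as that cluster's key term.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_exemplars(vectors, k, iters=20, seed=0):
    """Cluster term vectors (one LTS vector per term) with plain K-means,
    then return each cluster's exemplar: the term closest to the centroid."""
    rng = random.Random(seed)
    names = list(vectors)
    pts = [vectors[n] for n in names]
    centroids = rng.sample(pts, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in pts:  # assign each point to its nearest centroid
            j = min(range(k), key=lambda c: dist2(p, centroids[c]))
            groups[j].append(p)
        centroids = [  # recompute centroids (keep old one if group empty)
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return [
        names[min(range(len(pts)), key=lambda i: dist2(pts[i], c))]
        for c in centroids
    ]

# Hypothetical LTS vectors over two latent topics: two terms belong to
# an "HMM"-like topic and two to an "n-gram"-like topic.
lts = {
    "hidden markov model": [0.9, 0.1],
    "viterbi":             [0.8, 0.2],
    "n gram":              [0.1, 0.9],
    "bigram":              [0.2, 0.8],
}
keys = kmeans_exemplars(lts, k=2)  # one exemplar per topic cluster
```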

40 Learning Methods — Supervised learning: Adaptive Boosting (AdaBoost) and a neural network, which automatically adjust the weights of the features to produce a classifier.


42 Experiments — Corpus: NTU lecture corpus. Mandarin Chinese embedded with English words, single speaker, 45.2 hours. Example: "我們的 solution 是 viterbi algorithm" (Our solution is the Viterbi algorithm).

43 Experiments — ASR accuracy (character accuracy, %): Mandarin 78.15, English 53.44, overall 76.26. A bilingual acoustic model with model adaptation: the AM is adapted using some data from the target speaker on top of out-of-domain corpora; the LM uses adaptive trigram interpolation of a background model with the in-domain corpus.

44 Experiments — Reference key terms: annotations from 61 students who had taken the course. If the k-th annotator labeled N_k key terms, each of them received a score of 1/N_k, and all other terms 0. Terms are ranked by the sum of the scores given by all annotators, and the top N terms are chosen from the list, where N is the average N_k. This gives N = 154 key terms: 59 key phrases and 95 keywords.
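The aggregation scheme above (each annotator spreads one unit of score evenly over the N_k terms he or she labeled, then the top N terms by total score are kept) can be sketched as follows; the three annotators and their labels are invented toy data:

```python
from collections import defaultdict

def reference_key_terms(annotations):
    """annotations: one list of labeled terms per annotator.
    Each labeled term gets 1/N_k from annotator k; terms are ranked by
    total score and the top N kept, N being the average N_k (rounded)."""
    scores = defaultdict(float)
    for labeled in annotations:
        for term in labeled:
            scores[term] += 1 / len(labeled)
    n = round(sum(len(a) for a in annotations) / len(annotations))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]

annotations = [
    ["hmm", "viterbi"],
    ["hmm", "n gram", "viterbi"],
    ["hmm"],
]
top = reference_key_terms(annotations)  # N = round(6/3) = 2
```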

45 Experiments — Evaluation: for unsupervised learning, the number of extracted key terms is set to N; for supervised learning, 3-fold cross validation is used.

46 Experiments — Feature effectiveness (neural network, keywords from ASR transcriptions). Each set of features alone gives an F1 from about 20% to 42%; prosodic and lexical features are additive; all three feature sets are useful. (Bar-chart F-measure values: 20.78, 42.86, 35.63, 48.15, 56.55; Pr: Prosodic, Lx: Lexical, Sm: Semantic.)

47 Experiments — Overall performance, compared against conventional TFIDF scores with stop word removal and PoS filtering but without branching entropy. Branching entropy performs well; K-means Exemplar outperforms TFIDF; supervised approaches beat unsupervised ones. (Bar-chart F-measure values: 23.38, 51.95, 55.84, 62.39, 67.31; AB: AdaBoost, NN: Neural Network.)

48 Experiments — Overall performance on manual vs. ASR transcriptions. ASR results are slightly worse than manual but reasonable; supervised learning with the neural network gives the best results. (Paired F-measure values, manual/ASR: 23.38/20.78, 51.95/43.51, 55.84/52.60, 62.39/57.68, 67.31/62.70; AB: AdaBoost, NN: Neural Network.)


50 Conclusion — We propose a new approach to extract key terms. Performance is improved by identifying phrases with branching entropy and by combining prosodic, lexical, and semantic features. The results are encouraging.

51 Thanks to the reviewers for their valuable comments. NTU Virtual Instructor: http://speech.ee.ntu.edu.tw/~RA/lecture

