Yun-Nung (Vivian) Chen, Yu Huang, Sheng-Yi Kong, Lin-Shan Lee
National Taiwan University, Taiwan

Key Term Extraction, National Taiwan University

Definition
- Key term: a term with higher term frequency that carries the core content.
- Two types: keywords and key phrases.
- Advantages: useful for indexing and retrieval, and for capturing the relations between key terms and segments of documents.

Introduction
Target: extract key terms from course lectures.
Example key terms: acoustic model, language model, HMM, n-gram, phone, hidden Markov model, bigram.

Automatic Key Term Extraction
Archive of spoken documents (speech signals) → ASR → ASR transcriptions → Branching Entropy → Feature Extraction → Learning Methods: 1) K-means Exemplar, 2) AdaBoost, 3) Neural Network

Automatic Key Term Extraction: Phrase Identification
First, branching entropy is applied to the ASR transcriptions to identify phrases.

Automatic Key Term Extraction: Key Term Extraction
Then, learning methods extract key terms from the candidates using a set of features, yielding key terms such as "entropy" and "acoustic model".

Branching Entropy
How to decide the boundary of a phrase? Example: "hidden Markov model", followed by many different words (represents, is, can, ...).
- "hidden" is almost always followed by the same word.
- "hidden Markov" is almost always followed by the same word.
- "hidden Markov model" is followed by many different words, so a boundary is likely after "model".
Branching entropy is defined to decide possible boundaries.

Branching Entropy
Definition of right branching entropy: for a context X with children x_i (the words observed immediately after X), let P(x_i | X) be the probability of child x_i given X. The right branching entropy of X is
H_r(X) = - Σ_i P(x_i | X) log P(x_i | X)

Branching Entropy
Decision of the right boundary: the right boundary is hypothesized between X and its child x_i where the right branching entropy rises, i.e., where extending the context makes the following word much less predictable (as at the end of "hidden Markov model").

Branching Entropy
Decision of the left boundary: by symmetry, define the left branching entropy over the words preceding a context and find the left boundary between X and x_i where that entropy rises; for "hidden Markov model" the reversed context X is "model, Markov, hidden". A PAT tree is used for the implementation.

Branching Entropy
Implementation in the PAT tree: each node stores the counts needed for the child probabilities P(x_i | X) and the right branching entropy. For example, for X = "hidden Markov", the children are x_1 = "hidden Markov model" and x_2 = "hidden Markov chain"; other nodes in the tree include "state", "distribution", and "variable".
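
The boundary decision above can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the PAT-tree indexing is replaced by a plain dictionary of continuation counts, and the counts themselves are made up.

```python
# Toy sketch of right-branching-entropy boundary detection.
# The PAT tree used in the paper is replaced by a dict of follower counts.
import math
from collections import Counter

def right_branching_entropy(continuations):
    """Entropy (in bits) of the distribution over words following a context X."""
    total = sum(continuations.values())
    return -sum((c / total) * math.log(c / total, 2)
                for c in continuations.values())

# Hypothetical counts: words observed immediately after each left context.
followers = {
    "hidden": Counter({"markov": 9, "layer": 1}),
    "hidden markov": Counter({"model": 8, "chain": 2}),
    "hidden markov model": Counter({"is": 3, "can": 3, "represents": 2, "of": 2}),
}

entropies = {ctx: right_branching_entropy(cnt) for ctx, cnt in followers.items()}

# Entropy stays low while the phrase continues predictably and rises at
# "hidden markov model", suggesting a right boundary after "model".
boundary = max(entropies, key=entropies.get)
```

With these counts the entropy rises monotonically along the phrase and peaks at the full phrase, which is where the boundary is placed.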

Automatic Key Term Extraction: Feature Extraction
Prosodic, lexical, and semantic features are extracted for each candidate term.

Feature Extraction
Prosodic features, computed for each candidate term at its first occurrence:
- Duration (I - IV): normalized duration (max, min, mean, range); the duration of each phone (e.g., phone "a") is normalized by the average duration of that phone. Speakers tend to use longer duration to emphasize key terms.
- Pitch (I - IV): F0 (max, min, mean, range). Higher pitch may signal significant information.
- Energy (I - IV): energy (max, min, mean, range). Higher energy emphasizes important information.
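
The normalized-duration features (Duration I - IV) can be sketched as follows. The phone labels, durations, and corpus averages here are made-up illustrative values, not data from the lecture corpus.

```python
# Sketch of the Duration I-IV prosodic features: each phone's duration is
# normalized by that phone's corpus-wide average duration, then max, min,
# mean, and range are taken over the phones of the candidate term.

def duration_features(term_phones, avg_phone_duration):
    """term_phones: list of (phone, duration_ms) pairs for one candidate term."""
    normalized = [d / avg_phone_duration[p] for p, d in term_phones]
    return {
        "dur_max": max(normalized),
        "dur_min": min(normalized),
        "dur_mean": sum(normalized) / len(normalized),
        "dur_range": max(normalized) - min(normalized),
    }

# Hypothetical corpus-wide average durations (ms) and one term's phones.
avg_phone_duration = {"a": 80.0, "k": 50.0}
feats = duration_features([("a", 120.0), ("k", 50.0)], avg_phone_duration)
```

Here the "a" is half again longer than average (1.5) while the "k" is typical (1.0), so the term's duration features suggest emphasis.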

Feature Extraction
Lexical features, using well-known measures for each candidate term:
- TF: term frequency
- IDF: inverse document frequency
- TFIDF: TF * IDF
- PoS: the part-of-speech tag
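
A minimal sketch of the TF, IDF, and TFIDF features on toy documents (PoS tagging is omitted since it needs an external tagger):

```python
# Toy TF / IDF / TFIDF computation over tokenized documents.
import math

docs = [
    ["hidden", "markov", "model", "acoustic", "model"],
    ["language", "model", "bigram"],
    ["viterbi", "algorithm"],
]

def tf(term, doc):
    """Raw term frequency within one document."""
    return doc.count(term)

def idf(term, docs):
    """Inverse document frequency over the collection."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df)

def tfidf(term, doc, docs):
    return tf(term, doc) * idf(term, docs)
```

For example, "model" occurs twice in the first document and in two of the three documents, so its TFIDF there is 2 * log(3/2).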

Feature Extraction
Semantic features are based on Probabilistic Latent Semantic Analysis (PLSA), which models the probability of a term t_j in a document D_i through latent topics T_k:
P(t_j | D_i) = Σ_k P(t_j | T_k) P(T_k | D_i)
Key terms tend to focus on limited topics.

Feature Extraction
Semantic features derived from the PLSA latent topic distributions (key terms tend to focus on limited topics; non-key terms spread across many topics):
- LTP (I - III): Latent Topic Probability (mean, variance, standard deviation) of the term's topic distribution.
- LTS (I - III): Latent Topic Significance (mean, variance, standard deviation), the ratio of within-topic frequency to out-of-topic frequency.
- LTE: the term's entropy over latent topics; key terms have lower LTE, non-key terms higher LTE.
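
The LTE feature can be sketched directly from a term's topic distribution. The topic posteriors below are hypothetical stand-ins; in the system they would come from a trained PLSA model.

```python
# Sketch of Latent Topic Entropy (LTE): the entropy of a term's
# distribution over latent topics. A key term concentrates its mass on
# few topics, so its LTE is lower than that of a generic function word.
import math

def latent_topic_entropy(topic_probs):
    """Entropy (in bits) of a term's latent topic distribution."""
    return -sum(p * math.log(p, 2) for p in topic_probs if p > 0)

key_term_probs = [0.85, 0.05, 0.05, 0.05]       # focused on one topic
function_word_probs = [0.25, 0.25, 0.25, 0.25]  # spread over all topics

lte_key = latent_topic_entropy(key_term_probs)
lte_func = latent_topic_entropy(function_word_probs)
```

The uniform distribution gives the maximum LTE (2 bits for 4 topics), while the focused distribution scores well below it, matching the "higher LTE / lower LTE" contrast on the slide.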

Automatic Key Term Extraction: Learning Methods
Both unsupervised and supervised approaches are used to extract key terms.

Learning Methods
Unsupervised learning: K-means Exemplar
- Transform each term into a vector in the LTS (Latent Topic Significance) space.
- Run K-means; the terms in the same cluster focus on a single topic.
- Take the term at the centroid of each cluster as a key term: it can represent the topic, and the other candidate terms in the group are related to it.
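
The exemplar step can be sketched as below. This is a toy: the 2-D "LTS space" coordinates are invented, and a tiny pure-Python k-means stands in for a library implementation.

```python
# Sketch of K-means Exemplar: cluster candidate terms in a (hypothetical,
# 2-D) latent-topic-significance space, then take the term closest to
# each centroid as that cluster's key term.
import math

def kmeans(points, k, iters=20):
    centroids = points[:k]  # naive initialization: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        centroids = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

terms = {"acoustic model": (0.9, 0.1), "hmm": (0.8, 0.2),
         "bigram": (0.1, 0.9), "language model": (0.2, 0.8)}
points = list(terms.values())
centroids, _ = kmeans(points, k=2)

# Exemplar = the candidate term nearest to each cluster centroid.
exemplars = [min(terms, key=lambda t: math.dist(terms[t], c)) for c in centroids]
```

With these coordinates the two acoustic-modeling terms and the two language-modeling terms fall into separate clusters, and one exemplar is picked from each.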

Learning Methods
Supervised learning: Adaptive Boosting (AdaBoost) and a neural network, which automatically adjust the weights of the features to produce a classifier.

Experiments
Corpus: NTU lecture corpus, Mandarin Chinese with embedded English words, single speaker, 45.2 hours. Example utterance: 我們的 solution 是 viterbi algorithm ("Our solution is the Viterbi algorithm").

Experiments
ASR accuracy: character accuracy (%) is reported separately for Mandarin, English, and overall.
- Acoustic model: bilingual AM with model adaptation, trained on out-of-domain corpora as background plus some data from the target speaker.
- Language model: adaptive trigram interpolation of a background model with the in-domain corpus.

Experiments
Reference key terms: annotations from 61 students who had taken the course.
- If the k-th annotator labeled N_k key terms, each of them received a score of 1/N_k, and all other terms 0.
- Terms are ranked by the sum of the scores given by all annotators.
- The top N terms are chosen, where N is the average N_k: N = 154 key terms (59 key phrases and 95 keywords).
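
The aggregation above can be sketched on toy annotations. The 1/N_k weighting follows the slide's description; the annotator lists here are invented examples, not the real student labels.

```python
# Sketch of building the reference key-term list: each annotator spreads
# a total score of 1 uniformly over the N_k terms he labeled; terms are
# ranked by the summed score and the top N (average N_k) are kept.
from collections import defaultdict

annotations = [
    ["hmm", "viterbi algorithm"],               # annotator 1, N_1 = 2
    ["hmm", "language model", "bigram"],        # annotator 2, N_2 = 3
    ["hmm", "viterbi algorithm"],               # annotator 3, N_3 = 2
]

scores = defaultdict(float)
for labeled in annotations:
    for term in labeled:
        scores[term] += 1.0 / len(labeled)      # 1/N_k per labeled term

avg_n = round(sum(len(a) for a in annotations) / len(annotations))
reference = sorted(scores, key=scores.get, reverse=True)[:avg_n]
```

Terms labeled by many annotators with short (more selective) lists accumulate the highest scores and survive the top-N cut.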

Experiments
Evaluation: for unsupervised learning, the number of extracted key terms is set to N; for supervised learning, 3-fold cross validation is used.

Experiments
Feature effectiveness (neural network, keywords, ASR transcriptions; Pr: prosodic, Lx: lexical, Sm: semantic), by F-measure:
- Each set of features alone gives an F1 between 20% and 42%.
- Prosodic and lexical features are additive.
- All three sets of features are useful.
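
The F-measure used above can be sketched as set overlap between extracted and reference terms (toy lists; the real evaluation uses the N reference terms built from the student annotations):

```python
# Sketch of F1 (F-measure) for key term extraction: harmonic mean of
# precision and recall of the extracted set against the reference set.
def f1(extracted, reference):
    tp = len(set(extracted) & set(reference))
    if tp == 0:
        return 0.0
    precision = tp / len(extracted)
    recall = tp / len(reference)
    return 2 * precision * recall / (precision + recall)

score = f1(["hmm", "bigram", "phone"], ["hmm", "bigram", "viterbi algorithm"])
```

Here 2 of 3 extracted terms are correct and 2 of 3 reference terms are found, so precision = recall = F1 = 2/3.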

Experiments
Overall performance (AB: AdaBoost, NN: neural network), by F-measure, against a baseline of conventional TFIDF scores without branching entropy, with stop-word removal and PoS filtering:
- Branching entropy performs well.
- K-means Exemplar outperforms TFIDF.
- Supervised approaches are better than unsupervised approaches.

Experiments
Overall performance on ASR vs. manual transcriptions: performance on ASR output is slightly worse than on manual transcriptions but reasonable, and supervised learning with the neural network gives the best results.

Conclusion
We propose a new approach to extracting key terms. The performance is improved by identifying phrases with branching entropy and by combining prosodic, lexical, and semantic features. The results are encouraging.

We thank the reviewers for their valuable comments.
NTU Virtual Instructor: