A Survey for Interspeech 2013. Xavier Anguera: Information Retrieval-based Dynamic Time Warping.

Presentation transcript:

A Survey for Interspeech 2013

Xavier Anguera: Information Retrieval-based Dynamic Time Warping

Introduction
The goal of a dynamic time warping (DTW) system is to find matching subsequences in two time series.
Recent DTW approaches usually require prior knowledge of the start and end times, and they are memory-hungry.
This paper proposes IR-DTW, a matching approach that uses a vector of counts and is inspired by the information retrieval (IR) community.
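For context, the classical DTW recurrence can be sketched as below. This is the textbook dynamic-programming formulation, not the IR-DTW method of the paper; the function name `dtw_distance` and the use of absolute difference on scalar samples are assumptions made for illustration.

```python
import math


def dtw_distance(query, ref):
    """Textbook DTW cost between two sequences of numbers (illustrative only)."""
    n, m = len(query), len(ref)
    # cost[i][j] = cheapest alignment of query[:i] with ref[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - ref[j - 1])  # local distance (assumed metric)
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in the query only
                cost[i][j - 1],      # advance in the reference only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]
```

The full cost table needs memory proportional to the product of the two sequence lengths, which is one reading of the "memory-hungry" remark on the slide.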

The IR-DTW algorithm

Linear/diagonal subsequence matching
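The slide gives no detail, so the following is only one plausible reading of how a "vector of counts" could support diagonal matching: every query/reference pair whose distance falls under a threshold votes for its offset (reference index minus query index), and a diagonal match shows up as a peak at one offset. The function names, the scalar samples, and the `threshold` parameter are all assumptions, not taken from the paper.

```python
def diagonal_match_counts(query, ref, threshold):
    """Count matching point pairs per offset (ref index - query index)."""
    shift = len(query) - 1  # moves the most negative offset to index 0
    counts = [0] * (len(query) + len(ref) - 1)
    for qi, q in enumerate(query):
        for ri, r in enumerate(ref):
            if abs(q - r) <= threshold:  # assumed matching criterion
                counts[ri - qi + shift] += 1
    return counts


def best_offset(query, ref, threshold):
    """Return (offset, votes) for the offset with the most matching pairs."""
    counts = diagonal_match_counts(query, ref, threshold)
    best = max(range(len(counts)), key=counts.__getitem__)
    return best - (len(query) - 1), counts[best]
```

In this sketch the only state kept is one counter per offset, rather than a full query-by-reference cost table.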

Non-linear subsequence matching algorithm

[Figure: alignment of the query against the reference (ref), with labels m and maxQDist]

Time warping constraints

Experiments and Results
The time sequences are built from 39-dimensional MFCC vectors.
The MediaEval 2012 SWS data is used as the corpus; it contains 7.5 hours of telephone recordings.
Maximum Term Weighted Value (MTWV) is used to measure performance.
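For reference, the term weighted value is commonly defined as below in spoken term detection evaluations. The slide does not give the formula, so the notation here is an assumption: \(\theta\) is the decision threshold, \(P_{\text{miss}}\) and \(P_{\text{FA}}\) are the miss and false-alarm probabilities averaged over query terms, and \(\beta\) is a weight fixed by the evaluation.

```latex
\mathrm{TWV}(\theta) = 1 - \bigl[ P_{\text{miss}}(\theta) + \beta \, P_{\text{FA}}(\theta) \bigr],
\qquad
\mathrm{MTWV} = \max_{\theta} \, \mathrm{TWV}(\theta)
```

A perfect system scores 1, and a system that returns nothing scores 0.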

Conclusion
This paper presents the IR-DTW algorithm, which finds non-linearly matching sequences while reducing memory use.
Future work includes using different parts of the algorithm to close the gap with S-DTW.

Larry Heck, Dilek Hakkani-Tür, Gokhan Tur: Leveraging Knowledge Graphs for Web-Scale Unsupervised Semantic Parsing

Introduction
Spoken language understanding (SLU) systems aim to automatically identify the intent of the user as expressed in natural language and to extract the associated arguments, or slots.
Most SLU systems are based on statistical methods, which usually rely on supervised training instances.
This paper leverages web-scale semantic graphs to bootstrap a web-scale semantic parser with no semantic schema design, no data collection, and no manual annotation.

Introduction (continued)
We align the knowledge populated in the semantic graph with the related documents and transfer the entity annotations.
We then use these annotations to bootstrap models, and further improve the models by combining them with gazetteers mined from the knowledge graphs and adapting them to the target domain with a MAP-style algorithm.

Refining Gazetteers with Web Search
Gazetteers, which can also be seen as entity lists, are an important feature in SLU.
However, gazetteers usually contain ambiguous, confusable, or incorrect phrases.
To improve precision, the method learns from user clicks and uses the results to compare the importance of entities.

Unsupervised Data Mining with Knowledge Graphs
The unsupervised graph crawling algorithm is summarized as the following six steps:
1. Initialize the crawl: select an entity as the "central pivot node" (CPN).
2. Retrieve sources of NL surface forms: use the CPN to retrieve documents; these documents serve as sources of natural language surface forms.
3. Annotate with first-order relations: use this gazetteer to automatically annotate the sentences from the retrieved documents. These annotations are used as "truth" labels for subsequent (statistical) training passes.

Unsupervised Data Mining with Knowledge Graphs (continued)
4. Extract features with large-scale entity lists: for the CPN and each of its K related properties, enumerate all possible entity instances and form large-scale gazetteers. The web search refining method is used here to increase precision.
5. Annotate with higher-order relations: extend to higher-order relations and repeat steps 3 and 4. The documents retrieved above are annotated again with these relations.
6. Crawl the graph to select a new CPN and repeat.
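The annotation in step 3 can be illustrated with a minimal sketch that labels tokens by greedy longest-match lookup in a gazetteer. The slides do not specify the matching rule or the label scheme, so the BIO tags, the `annotate` function, the `max_len` cap, and the example gazetteer format are all assumptions.

```python
def annotate(tokens, gazetteer, max_len=4):
    """Label tokens with BIO tags by greedy longest-match gazetteer lookup.

    gazetteer maps a tuple of lowercased tokens to an entity type
    (hypothetical format chosen for this sketch).
    """
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in gazetteer:
                entity_type = gazetteer[key]
                labels[i] = "B-" + entity_type
                for k in range(i + 1, i + n):
                    labels[k] = "I-" + entity_type
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels
```

For example, with the gazetteer `{("life", "is", "beautiful"): "movie"}`, the tokens of "who directed life is beautiful" receive the labels `O O B-movie I-movie I-movie`. Labels produced this way are noisy, which is consistent with the slides treating gazetteer precision as a concern.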

Modeling Entities for Semantic Parsing
With high-precision sentence labels, we can generate millions of automatically annotated sentences for statistical semantic parsers.
We frame the entity extraction (slot filling) task as a sequence classification problem to obtain the most probable entity sequence, and we use discriminative conditional random fields (CRFs) for modeling.
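For readers unfamiliar with CRFs, the standard linear-chain form is given below. The slides do not show the model equation, so the notation is generic rather than the authors': \(X\) is the word sequence, \(Y = y_1, \dots, y_T\) the label sequence, \(f_k\) are feature functions with weights \(\lambda_k\), and \(Z(X)\) normalizes over all label sequences.

```latex
p(Y \mid X) = \frac{1}{Z(X)} \exp\!\Bigl( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, X, t) \Bigr),
\qquad
\hat{Y} = \arg\max_{Y} \; p(Y \mid X)
```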

Modeling Entities for Semantic Parsing (continued)
After the transition and emission probabilities are optimized, the most probable state sequence, Y, can be determined using the well-known Viterbi algorithm.
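A minimal sketch of Viterbi decoding follows, working with log-scores so that the scores add along a path. The input layout (a list of per-position emission scores, a square transition matrix, and a start vector) and the function name `viterbi` are assumptions for illustration; they are not the interface of any particular CRF toolkit.

```python
def viterbi(emissions, transitions, start):
    """Return the highest-scoring state sequence as a list of state indices.

    emissions[t][s]   : log-score of state s at position t
    transitions[i][j] : log-score of moving from state i to state j
    start[s]          : log-score of starting in state s
    """
    n_states = len(start)
    score = [start[s] + emissions[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_states):
            # Best previous state for landing in state j at position t.
            best_i = max(range(n_states), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(range(n_states), key=score.__getitem__)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

The running time grows linearly with sequence length and quadratically with the number of states, since each position considers every pair of previous and current states.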

Modeling Relations for Semantic Parsing
We leverage these patterns to induce semantic parsing grammars or templates, and then use the templates to spot entities, for example:
- ent(movie name) -> rel(directed by) -> ent(director)
We use the grammars of the entity-relation-based parser as a final pass after the entity extraction parsing.

Experiments and Results

Conclusion
We develop a graph crawling algorithm for data mining and two entity extraction approaches: a CRF-based method and a relation model with induced entity extraction grammars.
We also investigate the impact of higher-order knowledge graph relations on semantic parsing.