Document Expansion for Speech Retrieval (Singhal, Pereira)

Presentation transcript:

Document Expansion for Speech Retrieval (Singhal, Pereira). Teoman Toraman, Çağrı Toraman. Bilkent University, 2010

Problem Statement. Speech File: news_today.wav. Automatic (or Manual) Speech Recognition. Reasonable Transcription File: news_today.rtf. 2 / 10

Problem Statement. Aboutness: Fatal train crash in Italy. Query. Indexing. Results: D1, D2. 3 / 10

Problem Statement. Noisy / Dirty Sound File. Automatic (or Manual) Speech Recognition. Corrupted / Erroneous Transcription File. 4 / 10

Problem Statement. Same Query. Corrupted / Erroneous. Indexing. Results: D2 (Vocabulary Mismatch). 5 / 10
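The vocabulary mismatch on this slide can be illustrated with a toy inverted index. This is only a sketch under stated assumptions: the transcripts, the misrecognized words, the query string, and the helper names `build_index` and `search` are all hypothetical and do not come from the slides. It shows how the same query that returned D1 and D2 on clean transcripts returns only D2 once D1's transcript is corrupted.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


query = "train crash Italy"

# Hypothetical clean transcripts: both documents contain all query terms.
clean = {
    "D1": "fatal train crash in italy kills several passengers",
    "D2": "italy mourns after train crash near the station",
}

# Hypothetical recognizer output: in D1, "train" and "italy" were misrecognized.
noisy = {
    "D1": "fatal rain crash in it early kills several passengers",
    "D2": "italy mourns after train crash near the station",
}

clean_results = search(build_index(clean), query)
noisy_results = search(build_index(noisy), query)
print(clean_results)
print(noisy_results)
```

A ranked retrieval model would degrade more gracefully than this boolean match, but the lost query terms would still lower D1's score.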

Problem Statement. Noisy / Dirty Sound File. Automatic (or Manual) Speech Recognition. Corrupted / Erroneous. Recognition Mistakes: Deletions, Insertions, Wrong term weighting. 6 / 10

Solution. Corrupted / Erroneous. Document Expansion. Expanded. 7 / 10

Solution. What is Document Expansion? Steps 1) to 3): Corrupted / Erroneous. RELATED CORPUS. ... 10 similar files. Reweighting & Adding New Terms. 8 / 10
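The three steps on this slide might be sketched as follows: use the erroneous transcript as a query against a related corpus, keep the most similar files, then reweight the transcript's terms and add new ones. Everything specific here is an assumption for illustration: the TF-IDF weighting, the cosine ranking, the Rocchio-style update with parameters `alpha` and `beta`, the `max_new_terms` cutoff, and the toy texts. The slide itself only says "10 similar files" and "Reweighting & Adding New Terms".

```python
import math
from collections import Counter


def tf_idf_vectors(texts):
    """Return one {term: weight} dict per text (raw tf times smoothed idf)."""
    tokenized = [text.lower().split() for text in texts]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_document(transcript, corpus, k=10, alpha=1.0, beta=1.0, max_new_terms=5):
    """Rocchio-style expansion (assumed variant) of a noisy transcript."""
    vectors = tf_idf_vectors([transcript] + corpus)
    doc_vec, corpus_vecs = vectors[0], vectors[1:]

    # Run the transcript as a query and keep the k most similar files.
    scored = [(cosine(doc_vec, vec), vec) for vec in corpus_vecs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    neighbours = [vec for score, vec in scored[:k] if score > 0.0]
    if not neighbours:
        return dict(doc_vec)

    # Average the neighbours into a centroid vector.
    centroid = Counter()
    for vec in neighbours:
        for term, weight in vec.items():
            centroid[term] += weight / len(neighbours)

    # Reweight the terms already present in the transcript.
    expanded = {
        term: alpha * weight + beta * centroid.get(term, 0.0)
        for term, weight in doc_vec.items()
    }

    # Add the highest-weighted centroid terms that the transcript lacks.
    candidates = sorted(
        (term for term in centroid if term not in doc_vec),
        key=lambda term: (-centroid[term], term),
    )
    for term in candidates[:max_new_terms]:
        expanded[term] = beta * centroid[term]
    return expanded


# Hypothetical noisy transcript and related corpus.
transcript = "fatal rain crash in it early"
corpus = [
    "train crash in italy leaves many injured",
    "italy train derailment investigation begins",
    "stock markets rally after earnings reports",
]
expanded = expand_document(transcript, corpus, k=2)
print(sorted(expanded))
```

In this toy run the expanded vector regains "train" and "italy", so a query on those terms can match the document again. Neighbours with zero similarity are discarded so that unrelated files do not contribute terms.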

Experiments & Results 9 / 10

Experiments & Results. 10-15% loss. 20-25% loss. 10 / 10