Twenty-First Century Automatic Speech Recognition: Meeting Rooms and Beyond ASR 2000 September 20, 2000 John Garofolo


1 Twenty-First Century Automatic Speech Recognition: Meeting Rooms and Beyond
ASR 2000
September 20, 2000
John Garofolo
John.Garofolo@NIST.gov

2 Challenges
Target for the new millennium in ASR technology:
  – Meeting Room Transcription and Annotation Task
Multiple sensors
  – stationary, mobile, and arrays of mics in conjunction with video input devices
Noise and microphone robustness
Speaker-independent recognition
Speaker identification
Automatic production of usable transcriptions with speakers identified and with properly formatted, capitalized, and punctuated text
Perfect research task to move forward the state of the art
Development infrastructure will require
  – new metrics, evaluation tools
  – new I/O specifications
  – research corpora, new methods of collecting, compiling, and annotating data

3 NIST Proposed Initiative
Collaborate with ASR research community to create evaluation infrastructure
Develop corpus design and transcription and ASR system output specifications
Revise and update NIST SCLITE ASR scoring software to extend beyond classical word error rate measurements
Collaborate with NIST Smart Space Lab to collect, transcribe, and annotate a pilot meeting room transcription corpus
Sponsor evaluations and workshops
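
For readers unfamiliar with the "classical word error rate" mentioned on this slide, here is a minimal Python sketch of the usual definition: the minimum number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. This is not SCLITE's implementation. The function name `word_error_rate` is hypothetical, and the whitespace tokenization and case-sensitive matching are simplifying assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Classical WER: word-level edit distance / number of reference words.

    Assumptions (not taken from the slides): words are split on whitespace
    and compared case-sensitively, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A measure like this only counts word mismatches. It says nothing about who spoke, or about capitalization and punctuation, which is consistent with the slide's call to extend scoring beyond it.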

4 Meeting Room Pilot Corpus
Meeting type:
  – Possible focus group discussions requiring information lookup and real consensus building
Participants:
  – At least 4 per meeting plus moderator
  – Native speakers?
Multi-microphones:
  – Head-mounted ‘control’
  – Microphone array
  – Lapel mikes worn by, or desk-top mikes for, each participant
  – Table/wall-mounted stationary mikes
Video:
  – Wide-angle view positioned so that it can be correlated with mike array for source location. Possibly other views to capture faces head-on.
Annotation:
  – Transcription (words with capitalization/punctuation)
  – Speaker ID
  – Background noise conditions
  – Some initial exploration of annotating dialogue, people movement, gestures, lip movement, interaction with devices

5 NIST Smart Space Test Bed Laboratory
59-mic array, assorted conventional mics
Cameras/video capture
Large screen display
Pervasive devices
  – Palm tops
  – Tablets
  – Wireless LAN
Data collection servers
Gigabit Ethernet
High-bandwidth data flow system
Well-suited for creating pilot meeting corpus
[Figure: laboratory layout, labeled Camera Elements, Microphone Array Beams, Equipment Room, Large Screen Display]

6 Approach for 2000-2001
NIST will collaborate closely with a few research sites who will be the early users of the data to create the project specifications.
  – Via e-mail list and Web site
NIST will create a pilot meeting room data collection
  – Data storage will be a significant issue
NIST will create evaluation software for the new domain
  – Update SCLITE + detection-based scoring software
If feasible, NIST will coordinate an experimental evaluation
  – Late summer/early fall 2001
NIST will host a workshop (~October 2001)
  – to discuss research issues
  – to introduce the pilot corpus to the wider research community
  – to discuss evaluation metrics and the dry-run evaluation
  – to plan for future efforts (kickoff for larger DARPA program?)

7 21st Century Automatic Speech Recognition: Meeting Rooms and Beyond
John Garofolo
John.Garofolo@nist.gov
NIST Speech Group: http://www.nist.gov/speech
NIST Smart Space Lab: http://www.nist.gov/smartspace/
ASR 2000
September 20, 2000

