Automatic Transcription Reconstruction System (ATRS)
Serguei Pakhomov, Michael Schonwetter, Joan Bachenko
Lernout & Hauspie Healthcare Systems Group


1 Automatic Transcription Reconstruction System (ATRS)
Serguei Pakhomov, Michael Schonwetter, Joan Bachenko
Lernout & Hauspie Healthcare Systems Group
"I can't believe it's not literal!"

2 Outline of Talk
–Start Demo Processing
–Define Problem
–Describe ATRS Components
–Display ATRS Demo Results

3 Start Demo Processing

4 Medical Transcription Operation
Partial Transcriptions are the commercial product of the operation
–Partial Transcripts are plentiful
–Can be paired with speech files
Diagram labels: Human Transcriptionist; Partial Transcription; Tel. Speech

5 Sample Partial Transcription

6 Literal Transcription Generation

7 Summary
Problems Addressed
–Partial Transcriptions are Available but Inadequate
–Literal Transcriptions are Essential
–Human Generated Literal Transcriptions are Expensive & Error Prone
Suggested Solution
–Recycle Partial Transcriptions with ASR to Generate Semi-Literal Transcriptions

8 ATRS I/O
–Inputs:
  Partial transcript (RTF)
  Digitized telephony speech (8 kHz mu-law)
–Outputs:
  Semi-literal transcript in its several variants
  Speech-text alignment for assisting in generating literal truth
Diagram labels: Partial Transcription; Speech; ATRS; Semi-Literal for AM; Semi-Literal for LM; Aligned Semi-Literal with Digitized Speech

9 Description of DPTRS
Diagram labels: APFSM; Dictionary; Partial Transcript; Recognizer; Rec. Output; Integrator; Semi-Literal Transcript; Speech

10 Supporting Models
–Probabilistic Finite State Model (PFSM)
–Filled Pause Model
–Background Model
–Augmented Probabilistic Finite State Model (APFSM)

11 Filled Pause Model
Diagram labels: Training Corpus with natural Filled Pauses (FP); FP distribution extractor; FP distribution model; Partial transcription corpus with no FPs; FP distributor; Partial transcription corpus with artificial FPs; Language modeling software; Filled Pause Model
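The filled-pause pipeline on this slide can be read as two steps: learn where filled pauses occur in a corpus that has them, then sample that distribution to seed them into a corpus that does not. The sketch below is one possible reading, not the authors' implementation. It assumes the distribution is conditioned on the preceding word only, and the function names and the pause inventory beyond "ah" are hypothetical.

```python
import random
from collections import Counter

# Assumption: the filled-pause inventory. The slides only show "ah".
FILLED_PAUSES = ("ah", "uh", "um")


def extract_fp_distribution(sentences):
    """Estimate P(filled pause follows word) from a corpus with natural FPs."""
    context_counts = Counter()
    fp_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + list(sentence) + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            if prev in FILLED_PAUSES:
                continue  # a filled pause is not used as a context word
            context_counts[prev] += 1
            if nxt in FILLED_PAUSES:
                fp_counts[(prev, nxt)] += 1
    return {key: n / context_counts[key[0]] for key, n in fp_counts.items()}


def distribute_fps(sentence, distribution, rng):
    """Insert artificial FPs into an FP-free sentence by sampling the model."""
    out = []
    for word in ["<s>"] + list(sentence):
        if word != "<s>":
            out.append(word)
        for fp in FILLED_PAUSES:
            if rng.random() < distribution.get((word, fp), 0.0):
                out.append(fp)
                break  # at most one filled pause per position
    return out


# Toy data, invented for illustration only.
literal = [["the", "patient", "ah", "was", "seen"]]
model = extract_fp_distribution(literal)
augmented = distribute_fps(["patient", "was", "stable"], model, random.Random(0))
```

The augmented corpus would then be passed to whatever language modeling software builds the Filled Pause Model; that step is outside this sketch.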

12 Background Model
Diagram labels: Literal Transcriptions Corpus; Partial Transcriptions Corpus; Difference Extractor; Corpus of phrases spoken but not transcribed (Out Of Transcription (OOT) corpus); Language modeling software; Background Model
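A minimal sketch of what the Difference Extractor might do, assuming both transcriptions are available as word lists: align the literal transcript against the partial one and keep the runs of words that appear only on the literal side. Python's `difflib.SequenceMatcher` stands in here for whatever alignment the authors used, and the function name and example sentences are hypothetical.

```python
from difflib import SequenceMatcher


def extract_oot_phrases(literal_tokens, partial_tokens):
    """Return phrases present in the literal transcript but absent from the partial one."""
    matcher = SequenceMatcher(a=literal_tokens, b=partial_tokens, autojunk=False)
    phrases = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "delete" marks a run of literal words with no counterpart in the partial text.
        if tag == "delete":
            phrases.append(" ".join(literal_tokens[i1:i2]))
    return phrases


# Invented example: dictation framing that a transcriptionist would leave out.
literal = "this is doctor smith dictating the patient was seen today end of dictation".split()
partial = "the patient was seen today".split()
oot = extract_oot_phrases(literal, partial)
```

Only pure deletions are collected in this sketch; regions where the two texts disagree word for word are left out, which is a simplifying choice rather than something the slide specifies.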

13 Generate Dictionary
Reduce phonetic confusability
–limit entries to those items in the transcription (and supporting models).
Dynamically generate pronunciations
–for items in the partial transcript which are out of vocabulary.
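The two dictionary steps above can be sketched as a filter plus a fallback. This is an illustration under stated assumptions: the lexicon format, the function name `build_custom_dictionary`, and the `g2p` callable (a placeholder for a real pronunciation generator) are all hypothetical, and the phone strings are invented.

```python
def build_custom_dictionary(base_lexicon, transcript_words, model_words, g2p):
    """Restrict the lexicon to the needed vocabulary and fill in missing entries."""
    vocabulary = set(transcript_words) | set(model_words)
    dictionary = {}
    generated = []
    for word in sorted(vocabulary):
        if word in base_lexicon:
            dictionary[word] = base_lexicon[word]
        else:
            # Out-of-vocabulary item: generate a pronunciation on the fly.
            dictionary[word] = g2p(word)
            generated.append(word)
    return dictionary, generated


# Toy lexicon; "wards" is a confusable entry that the restriction drops.
base_lexicon = {
    "warts": ["w ao r t s"],
    "wards": ["w ao r d z"],
    "ah": ["aa"],
}


def spell_out(word):
    """Placeholder pronunciation generator: one symbol per letter."""
    return [" ".join(word)]


dictionary, generated = build_custom_dictionary(
    base_lexicon, ["plantar", "warts"], ["ah"], spell_out
)
```

Dropping "wards" while keeping "warts" is the confusability reduction the slide describes: the recognizer cannot hypothesize a word that is not in its dictionary.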

14 Recognition Pass
Dictation processed by recognition engine using:
–APFSM
–Custom Dictionary
–SI Acoustic Model

15 Integration
Recognizer output (HYP) is compared to Partial transcript (REF).
–For Acoustic Modeling:
  –Matches
  –Substitutions: Use REF portion
  –Insertions: Filled-Pauses, Punctuation
–For Language Modeling:
  –Matches
  –Substitutions: Use REF portion
  –Insertions: Use ALL Insertions
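Before either rule set can be applied, each word has to be labelled as a match, substitution, deletion, or insertion. The sketch below shows one way to produce those labels, using `difflib.SequenceMatcher` from the Python standard library as a stand-in aligner. The slides do not say which alignment algorithm ATRS uses, and the function name is hypothetical. The REF and HYP word sequences are taken from the Integrator slide.

```python
from difflib import SequenceMatcher


def label_alignment(ref, hyp):
    """Align REF against HYP and return (ref_word, hyp_word, label) rows."""
    rows = []
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            rows += [(r, h, "MATCH") for r, h in zip(ref[i1:i2], hyp[j1:j2])]
        elif tag == "delete":
            rows += [(r, None, "DELETION") for r in ref[i1:i2]]
        elif tag == "insert":
            rows += [(None, h, "INSERTION") for h in hyp[j1:j2]]
        else:
            # "replace": pair words up; any surplus is a deletion or an insertion.
            n = min(i2 - i1, j2 - j1)
            rows += [
                (r, h, "SUBSTITUTION")
                for r, h in zip(ref[i1:i1 + n], hyp[j1:j1 + n])
            ]
            rows += [(r, None, "DELETION") for r in ref[i1 + n:i2]]
            rows += [(None, h, "INSERTION") for h in hyp[j1 + n:j2]]
    return rows


ref = "that she be treated for twelve weeks three plantar warts".split()
hyp = ("that she me treated twelve weeks ah period on "
       "three excuse me plantar warts").split()
rows = label_alignment(ref, hyp)
labels = [label for _ref, _hyp, label in rows]
```

A production system would more likely use a minimum-edit-distance alignment, which can break ties differently on other inputs.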

16 Integrator

1        2        3             4        5
                                semi-lit
REF      HYP      LABEL         AM       LM
that     that     MATCH         that     that
she      she      MATCH         she      she
be       me       SUBSTITUTION  be       be
treated  treated  MATCH         treated  treated
for      --       DELETION      --       for
twelve   twelve   MATCH         twelve   twelve
weeks    weeks    MATCH         weeks    weeks
--       ah       INSERTION     ah       ah
--       period   INSERTION     period   period
--       on       INSERTION     --       on
three    three    MATCH         three    three
--       excuse   INSERTION     --       excuse
--       me       INSERTION     --       me
plantar  plantar  MATCH         plantar  plantar
warts    warts    MATCH         warts    warts
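The AM and LM columns follow mechanically from the labels. A minimal sketch of that step is given below, taking the labelled rows of this slide as input. The handling of deletions (dropped for AM, REF word kept for LM) is read off the table rather than stated on the Integration slide, the function name is hypothetical, and the filled-pause and punctuation sets contain only the words that appear in this example.

```python
# Assumption: word classes kept in the acoustic-model variant. Only "ah" and
# "period" appear in the slide; a real system would list more.
FILLED_PAUSES = {"ah"}
PUNCTUATION_WORDS = {"period"}

# (REF, HYP, LABEL) rows copied from the Integrator table; None stands for "--".
ROWS = [
    ("that", "that", "MATCH"),
    ("she", "she", "MATCH"),
    ("be", "me", "SUBSTITUTION"),
    ("treated", "treated", "MATCH"),
    ("for", None, "DELETION"),
    ("twelve", "twelve", "MATCH"),
    ("weeks", "weeks", "MATCH"),
    (None, "ah", "INSERTION"),
    (None, "period", "INSERTION"),
    (None, "on", "INSERTION"),
    ("three", "three", "MATCH"),
    (None, "excuse", "INSERTION"),
    (None, "me", "INSERTION"),
    ("plantar", "plantar", "MATCH"),
    ("warts", "warts", "MATCH"),
]


def integrate(rows):
    """Build the AM and LM semi-literal word sequences from labelled rows."""
    am, lm = [], []
    for ref, hyp, label in rows:
        if label in ("MATCH", "SUBSTITUTION"):
            am.append(ref)  # substitutions fall back to the REF word
            lm.append(ref)
        elif label == "DELETION":
            lm.append(ref)  # per the table: kept for LM, dropped for AM
        elif label == "INSERTION":
            lm.append(hyp)  # LM keeps every insertion
            if hyp in FILLED_PAUSES or hyp in PUNCTUATION_WORDS:
                am.append(hyp)  # AM keeps only filled pauses and punctuation
    return am, lm


am_words, lm_words = integrate(ROWS)
```

The AM variant stays conservative because it will be force-aligned to audio, so only insertions that are very likely to have been spoken are admitted; the LM variant accepts everything the recognizer heard.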

17 Semi-Literal Transcript

18 Results: Compare to Literal Transcripts (n=774)
–Alignment of Partial vs. Literal
–Alignment of Semi-Lit vs. Literal yields 4.4% (absolute) better alignment

19 View Demo Results

20 Contact Information
–Serguei Pakhomov: Spakhomov@LHSL.com
–Michael Schonwetter: Mschonwetter@LHSL.com
–Joan Bachenko: Joan-B@LHSL.com

