Improved Name Recognition with Meta-data Dependent Name Networks
By Sameer R. Maskey, Michiel Bacchiani, Brian Roark, and Richard Sproat
Presented by Irina Likhtina
Problem: General Name Transcription Improvement
Solutions that use no prior knowledge could improve name transcription only by enlarging the lexicon, which requires a large increase in:
- Size
- Complexity
- Ambiguity
Paper’s Solution
Use meta-data at runtime in the form of:
- Caller ID string – “cname”
- Name of mailbox owner – “mname”
This meta-data is reasonably available due to the prevalence of caller identification provided by phone companies.
Database Used
- Scanmail training corpus: 100 hours of voicemail messages from 140 AT&T employees
- Manually transcribed with “cname” and “mname” tags
- Gender balanced
- ~12% non-native speakers
- 238 random messages for testing, everything else (~10,000 messages) for training
Approach
Three steps of the algorithm:
1. Create a class-based language model
2. Create a name network that will give instances for the classes of the model
3. Replace the class tokens of the class-based language model at runtime with the name networks
Class-Based Language Model
- The manually tagged mailbox name and caller name in each message were replaced with the “mname” and “cname” labels
- “mname” and “cname” are the two class tokens, which can later be substituted with any values
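The tag replacement described on this slide can be sketched in a few lines. This is only an illustration: the slides do not show how names are marked up in the transcripts, so the `<cname>…</cname>` and `<mname>…</mname>` markup and the function name `to_class_tokens` are assumptions.

```python
import re

# Hypothetical tag format: the slides do not say how tagged names are marked up.
TAG = re.compile(r"<(cname|mname)>(.*?)</\1>")

def to_class_tokens(transcript):
    """Replace each tagged name with its class token and collect the names."""
    names = [(m.group(1), m.group(2).strip()) for m in TAG.finditer(transcript)]
    text = TAG.sub(lambda m: m.group(1), transcript)
    return text, names

example = "hi <mname>mary</mname> this is <cname>john smith</cname> calling"
class_text, names = to_class_tokens(example)
print(class_text)  # text with the names replaced by the class tokens
print(names)       # the (class, name) pairs that were removed
```

A language model trained on text like `class_text` sees only the two class tokens, so it can learn where names occur without having to cover every individual name.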
Name Network
- To get the values for “mname” and “cname”, an internal AT&T employee directory listing (~40,000 people) was used
- “cname” created from variations of static titles (Miss, Mr), full first names and nicknames (Alexander, Alex), and last names (Jones)
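As a rough sketch of how such variations might be enumerated for one directory entry: the title list, the nickname table, and the three patterns used here (first name alone, first plus last, title plus last) are assumptions made for illustration, since the slides give only the ingredients and not the exact set of patterns.

```python
# Assumed inputs: a short title list and a hypothetical nickname table.
TITLES = ["mr", "miss"]
NICKNAMES = {"alexander": ["alex"]}

def name_variants(first, last):
    """Enumerate spoken variants for one directory entry (illustrative patterns only)."""
    first_forms = [first] + NICKNAMES.get(first, [])
    variants = set()
    for form in first_forms:
        variants.add(form)              # first name or nickname alone
        variants.add(f"{form} {last}")  # first name or nickname plus last name
    for title in TITLES:
        variants.add(f"{title} {last}")  # title plus last name
    return sorted(variants)

variants = name_variants("alexander", "jones")
for v in variants:
    print(v)
```

Each variant would become one path through the name network for that person.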
Name Network (continued)
- Probability within class: estimated from the training corpus
- Probability within first names: estimated from the AT&T directory listing
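One possible reading of these two probability sources is sketched below: a relative-frequency estimate of how often each name pattern occurs within the class, multiplied by a relative-frequency estimate of which first-name form is used. The factorization, the pattern labels, and all counts are assumptions invented for illustration; they are not figures from the paper.

```python
from collections import Counter

def relative_freq(counts):
    """Turn raw counts into relative-frequency probabilities."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}

# Toy counts, invented for illustration only.
pattern_counts = Counter({"first": 6, "first last": 3, "title last": 1})  # stands in for the training corpus
first_form_counts = Counter({"alexander": 1, "alex": 3})                  # stands in for the directory listing

p_pattern = relative_freq(pattern_counts)
p_first = relative_freq(first_form_counts)

def variant_prob(pattern, first_form=None):
    """Assumed factorization: P(pattern within class) * P(first-name form)."""
    prob = p_pattern[pattern]
    if first_form is not None:
        prob *= p_first[first_form]
    return prob

print(variant_prob("first last", "alex"))
print(variant_prob("title last"))
```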
Replacement in the ASR network
- State of the art before the proposed algorithm: off-line composition, determinization and optimization of one existing grammar G and one lexicon L
- Off-line composition is impractical for the proposed algorithm because so many variations are needed
- Proposed algorithm: L by G optimization done for each class, using the G specific to that class
- Small overhead of the proposed algorithm compared to off-line optimization
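The substitution idea can be illustrated with a toy word graph in which every arc labeled with a class token is replaced by the paths of that class's name network. This is not the weighted finite-state composition and optimization the slide refers to: the arc representation `(src, dst, label)` and the function `replace_class_arcs` are hypothetical, and the sketch carries no weights and performs no determinization.

```python
def replace_class_arcs(arcs, class_networks):
    """Expand arcs labeled with a class token into the word paths of its name network.

    arcs: list of (src_state, dst_state, label) with integer states.
    class_networks: maps a class token to a list of word sequences.
    """
    expanded = []
    next_state = max(max(src, dst) for src, dst, _ in arcs) + 1
    for src, dst, label in arcs:
        if label not in class_networks:
            expanded.append((src, dst, label))  # ordinary word arc, kept as is
            continue
        for words in class_networks[label]:
            prev = src
            for i, word in enumerate(words):
                if i == len(words) - 1:
                    nxt = dst              # last word rejoins the original graph
                else:
                    nxt = next_state       # fresh intermediate state
                    next_state += 1
                expanded.append((prev, nxt, word))
                prev = nxt
    return expanded

arcs = [(0, 1, "this"), (1, 2, "is"), (2, 3, "cname")]
networks = {"cname": [["john"], ["john", "smith"]]}
expanded = replace_class_arcs(arcs, networks)
for arc in expanded:
    print(arc)
```

The expanded graph has two "john" arcs leaving the same state, which shows why a real system still needs some determinization and optimization after substitution.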
Experimental results
- Word Error Rate (WER) improvement small: absolute reduction of 0.6%
- Name Error Rate (NER) improvement significant: absolute reduction of 20%
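For reference, WER is conventionally computed as the word-level edit distance (substitutions, insertions, deletions) between the reference and the recognizer output, divided by the number of reference words. The sketch below follows that standard definition; it is not the scoring tool used in the paper, and the example sentences are made up. An absolute reduction is the difference between two such rates, so a drop from 30.0% to 29.4% would be 0.6% absolute.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)

wer = word_error_rate("hi mary this is john smith", "hi mary this is jon smith")
print(f"{wer:.3f}")
```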
Conclusions
- Large reduction in NER is critical:
  - Name transcription is the goal
  - Scanmail users expressed a strong desire for the system to recognize names correctly
- Other improvements:
  - Errors on out-of-vocabulary (OOV) names
  - In-vocabulary name recognition
- No significant increase in complexity with good name coverage
- No need for manual design when the system is moved to a new environment