MCL-Collection and Voice Recognition 5 January 2007 Richard Fullard MCL Technologies.

Why Voice on Mobile Computers? Picking accuracy and efficiency!

DEMO  Remote Viewer from Terminal-Screen

Demonstration

The Drivers behind Voice
- Increases mobile worker efficiency and productivity
  - 15% minimum increase in productivity (paper to voice picking)
  - Simply faster
- Increased accuracy; 99.9% plus is not uncommon
  - No need to constantly ‘swap’ between paper/terminal and the picking task
  - Never lose sight of the actual task
- Allows hands-free operation
  - Mobile worker can easily pick products off the shelf and move or drive the picking cage, pallet, etc.
- Allows eyes-free operation
  - Mobile worker can focus on other activities such as safe driving
  - Pickers speak and listen while “on the move” (highly developed multi-tasking skills that humans have been practicing for years)

History of Voice
- Voice recognition systems have been commercially available for over 15 years but only recently ‘crossed the chasm’ (started to work well)
- Early systems were voice-independent:
  - A sample of the population was taken as the ‘voice mean’ against which the individual's voice profile was compared
  - Works quite well in white-collar environments such as call centres
  - Does not work in industrial environments
  - Works for dictation systems, Microsoft Office products and PDA products, IBM’s ViaVoice, telephone systems, etc.

History of Voice
- Recently, voice-dependent systems developed into stable and reliable solutions
  - An individual trains the system on only the words that are used in the application
  - The system saves the individual’s voice profile
  - The application compares the voice entry against the voice profile of the relevant individual, ensuring high recognition accuracy
  - Smaller word set, much higher accuracy
  - Used in industrial data capture applications

History of Voice
- Until recently, only “vertical” voice solutions were available
- Dedicated, proprietary devices

But things have changed!
- Industry wants multimodal: it needs efficiency, but accuracy too. It doesn’t want a compromise!
- RFID when it makes sense
- Scan when it makes sense
- Key when it makes sense
- Speak when it makes sense
- Eyes and ears when it makes sense

What is MCL-Voice?
MCL-Voice is a fully standard MCL solution with the addition of voice recognition and voice synthesis capabilities.

[Architecture diagram. Labels: MCL-Designer; MCL-Client Voice; MCL-Client + Vocollect Voice; MCL-Link / MCL-Net communication and dispatching servers (RS232, modem, GSM/GPRS, Ethernet, Internet, WiFi 802.11); MCL ODBC Bridge, MCL DLL Bridge, MCL SAP R/3 Bridge, MCL … Bridge; your host systems]

What is Voice Recognition?
- Voice recognition is a technology that converts the analog signal coming from a microphone (voice) into a sequence of digital bytes (words). It is also known as “ASR” or “Speech To Text”.
- [Diagram: Voice Signal → Data Word “1”]

What is Voice Synthesis?
- Voice synthesis is a technology that converts data words into an analog speech signal in a given language. It is also known as “Text to Speech” or “TTS”.
- [Diagram: Data Word “1” → Voice Signal “One”]

Embedded Voice Engine
- Voice recognition and voice synthesis are done in the terminal (no server required)

System Requirements
- Terminal hardware certified by MCL Technologies
- Windows operating system: Microsoft® Windows® Pocket PC 2003/CE4/CE5/WM5
- > 300 MHz XScale processor
- Min. 64 MB memory
- Audio interface (jack)
- MCL-certified headset (with quick release connector)
- MCL-Voice Client (activated)

Certified Devices / Dec 2006
- Symbol Technologies: MC50, MC70, MC90xx, WT40xx
- Intermec: CN2B, 730-751B

Terminal Accessories
- Quick release connector
  - Operator safety feature
  - Quickly separates the headset cable from the terminal
  - High-quality release mechanism supports multiple connections and disconnections for personalized headsets

MCL-Collection Voice
- Three new MCL-Collection components:
  - MCL-Designer Voice Add-on (for development): requires an additional license on top of the standard MCL-Designer
  - MCL-Client Voice (for deployment): requires a specific Voice Client license
  - MCL-Voice Manager (for deployment): requires a specific product license

Multimodal Approach

What Is Multimodal?
Definition (MCL multimodal access): the combination of multiple data capture technologies in one mobile worker application.
- Barcode scanners
- Imagers
- Displays
- Touch screens
- Signature capture
- Keyboards
- Weigh scales
- Printers
- … and now, voice

What is Multimodal

Multimodal Data Capture

Voice in Multimodal Applications
- Consider a retail inventory application
  - Barcode scan of item UPC or EAN
  - Keyboard or voice entry of quantity
  - Hands full of merchandise: voice is more expedient
- Consider a warehouse receiving application
  - Radio frequency identification of pallet
  - Voice acceptance of pallet
  - Voice receipt of item-level goods
  - Hands-free operation
- The mobile worker has the flexibility to use whichever input method is more convenient and efficient

Voice Recognition Application Vocabularies Voice Templates

Application Vocabularies
- Unique vocabulary for each MCL-Voice application
- Action words used by the mobile worker to:
  - Enter transaction data
  - Navigate within the application
  - Control voice parameters
- MCL-Designer builds the application vocabulary word lists
  - Mobile worker applications typically have small vocabularies
  - 50 words is very typical (+/- 120 KB)
  - 100 words is unusually large

Application Vocabularies
- MCL-Voice vocabulary word categories:
  - Action words (application words)
    - Task-specific words: entry of transaction data, such as “Quantity”, “1”, “2”, “3”, “4”
    - Navigation words: commands addressed to the MCL engine, such as “Picking” to branch to the Picking function
  - Global words
    - Words that you define to be valid on every screen
    - Commonly used for application navigation, such as “Next”, “Back”, “Clear”, “Previous”, “Delete”, “Enter”
  - Control words
    - Commands to control the voice engine, such as “Pause”, “Resume”, “Volume UP”, “Volume Down”

Voice Recognition Training
- Like fingerprints, everybody has a unique voice print
  - Different accents, different pronunciation of words
- Each individual must “train” the voice recognizer to:
  - Understand the application vocabulary for that individual’s voice pattern
  - Allow the worker to speak naturally and be understood
- Training generates a “User Voice Template”

Voice Recognition Training
- User voice template
  - The “User Voice Template” file contains:
    - The selected language
    - The specific user’s settings (volume, speed, pitch, sensitivity)
    - All the application words and their “speaking image”
  - Voice training is performed on the mobile computer
    - +/- 20 minutes per worker is typical to train a 50-word vocabulary
  - The voice training result is saved in a unique template for each worker
    - Suggestion: use the employee ID or badge ID to create unique, personalized template names
  - The template can be uploaded to a server and deployed to any other device

Voice Recognition Training
- How is the voice template used?
  - The mobile worker says a word
  - The voice recognizer compares the spoken word against the template
  - The spoken word is translated into the template word it matches
  - The smaller the vocabulary, the faster the comparison and the more accurate the match
- Multi-user: voice templates for several individuals may be:
  - Saved on a mobile computer, which allows a mobile computer to be shared by several workers
  - Saved on and downloaded from a central server, which allows a pool of mobile computers to be shared by all mobile workers

- Multi-User
  - User Voice Template (UVT) files and voice preferences are loaded on the device (for execution) and on the server (for distribution)
  - [Diagram: Server, UVT download/upload, User 1, User 2 (multi-users)]

Advanced Features

- Noise cancellation on MCL-certified headsets
  - MCL-Voice operates in noisy, industrial environments: warehouses, distribution centers, dockyards, transportation hubs, assembly/manufacturing plants…
  - Very effective noise suppression on certified headsets
    - High recognition accuracy depends directly on the ability to reduce ambient noise
    - An impossible task on inferior headsets
  - To maximize voice recognition:
    - Perform voice recognition training in the actual work environment
    - This creates the most representative voice templates

Advanced Features
- Noise sampling
  - Adjusts microphone pickup levels to compensate for the ambient noise level
  - Noise sampling can be performed:
    - Automatically on boot-up
    - On demand by the mobile worker

Advanced Features
- Dynamic voice training
  - Incremental addition of vocabulary words to an existing template
  - On boot-up
    - Untrained words introduced by a new version of the application
    - Prompts the mobile worker to train any untrained words
    - Consider an application vocabulary of 50 words: a new version introduces one new word, and the mobile worker trains only the new word, not all 51 words
  - On demand by the mobile worker
    - The recognizer continually has trouble understanding a given word
    - Retrain any poorly trained words
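The boot-up check described on this slide can be pictured as a set difference between the application vocabulary and the words already in the user's template. This is a minimal sketch, not MCL's implementation: the function name `untrained_words` and the sample word lists are hypothetical, and words are compared exactly because the deck notes that words are case sensitive.

```python
def untrained_words(app_vocabulary, template_words):
    """Return the vocabulary words missing from the user's template,
    kept in vocabulary order so prompts follow the application's order."""
    trained = set(template_words)  # exact, case-sensitive comparison
    return [word for word in app_vocabulary if word not in trained]


# Hypothetical data: a 50-word vocabulary, then a new version adding one word.
old_vocabulary = [f"word{i}" for i in range(50)]
new_vocabulary = old_vocabulary + ["Recount"]

# The worker's template was trained on the old vocabulary only.
to_train = untrained_words(new_vocabulary, old_vocabulary)
print(to_train)
```

With this data the worker would be prompted for a single word rather than the full 51-word vocabulary.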

Advanced Features
- Fast voice recognition
  - Explicit application vocabulary: very efficient
  - Performed on the mobile computer
  - Voice recognition and audio feedback are virtually instantaneous

Advanced Features
- Vocabulary optimization
  - A vocabulary subset can be defined on each input field
  - Limits the template choices valid for a match
  - Further decreases voice recognition search times
  - Further increases word match accuracy
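One way to picture the per-field vocabulary subset is as a filter applied to the user's template before matching. The sketch below is illustrative only: the function names, the dictionary shape of the template, and the assumption that global and control words stay active on every field are not taken from the deck.

```python
def active_words(field_word_lists, global_words, control_words):
    """Collect the words that are valid for matching on one input field.
    Assumption: global and control words remain valid on every field."""
    active = set(global_words) | set(control_words)
    for word_list in field_word_lists:
        active.update(word_list)
    return active


def restrict_template(template, active):
    """Keep only the template entries the recognizer needs to search."""
    return {word: image for word, image in template.items() if word in active}


# Hypothetical template: word -> placeholder for its trained "speaking image".
template = {w: f"<image of {w}>" for w in
            ["1", "2", "3", "Picking", "Receiving", "Next", "Pause"]}

quantity_field = active_words([["1", "2", "3"]], ["Next"], ["Pause"])
search_space = restrict_template(template, quantity_field)
print(sorted(search_space))
```

Here the quantity field searches five template entries instead of seven, which is the effect the slide attributes to shorter search times and better match accuracy.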

Advanced Features
- Audio feedback (echo)
  - Any input
    - The mobile worker says “five” into the microphone and immediately hears a synthesized “five” in the headset
    - Immediate verification
  - Multimodal approach
    - Any input can be echoed: barcode scanned data, keyboard entry data, etc.
    - The MCL-Voice TTS synthesizer sees all data the same, regardless of the original multimodal input source

Advanced Features
- Talk over
  - Experienced operators may speak the data before the prompt is given
  - Prompts are canceled if the operator enters a valid response

MCL-Voice Certification Guidelines and Best Practices

Why?
- Voice recognition is not a “black and white” science
- How to position MCL-Voice
- Benefits of MCL-Voice
- Customer competitive advantages

Why?
- A successful voice application is designed and implemented by following guidelines and best practices
- Technical training teaches, for example:
  - Avoid vocabulary items like “cherry” and “sherry” on the same data entry field
  - Avoid words and phrases with common initial words, like “orange” and “orange peel”
  - Avoid vocabulary words similar to the background noise of the environment
    - Consider a work area equipped with air compressors
    - Avoid vocabulary words that end with “S” and “Z”, like “Pass”
    - The voice recognition engine might interpret the background hiss as a valid vocabulary word

Requirements
- Training
  - Successful completion of the MCL-Collection Voice Technical Training course by at least two representatives from your company
  - Successful completion of the MCL-Collection Voice Sales Training course by at least one individual from your company
- Equipment
  - Purchase of an MCL-Collection Voice Demo Kit by your company

Company Certification
- Given to companies that satisfy all certification requirements
- Tied to both the company and the trained individuals
- Lost by a company when one or both trained individuals leave the company
- Not automatically given to a company that hires individuals with prior MCL-Collection Voice training
  - Each company must satisfy all certification requirements to receive the MCL-Collection Voice Certification

Company Certification Benefits
- Lists your company in an online directory
- Entitles your company to purchase MCL-Voice products
- Develops the best-practice technical skills needed to implement successful voice-enabled applications
- Provides understanding of the benefits voice brings to mobile workforce deployments
- Elicits customer confidence
- Gives access to MCL Support

MCL-Designer Voice Enabling a Project

MCL-Designer Voice Enabling a Project
Voice enabling a project:
- At Project level …
- At Screen level …
- At Screen Object level …
- At Process Object level …

MCL-Designer Voice Enabling a Project At Project level All settings defined at “Project Level” are used by all programs and processes of the project.

MCL-Designer Voice Enabling a project at Project level
- MCL Project components: Terminal Setup, Programs, Local files, System & User Variables, Image files, Fonts & Styles, Keyboard definition, Voice Settings
- The “Voice settings” define the “ASR” and “TTS” parameters. This includes the language, the speed definition, the global field words, etc.

MCL-Designer Voice Enabling a project at Project level  Voice settings

MCL-Designer Voice Enabling a project at Project level Settings  Parameters: Speech In Speech Out  Global Words  Control Words  Phonetic Substitution Table

MCL-Designer Voice Enabling a project at Project level
Speech Out
Text      Will be spoken as
ABC       “alpha bravo charlie”
Morning   “morning”
A12       “alpha one two”
A 12      “alpha twelve”
25        “twenty five”
123       “one hundred twenty three”
AB12D     “alpha bravo one two delta”
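The rows of the Speech Out table suggest three rules: a token made only of digits is read as a number, a token made of capital letters (optionally mixed with digits) is spelled out letter by letter and digit by digit, and anything else is read as an ordinary word. The sketch below reproduces the table under those inferred rules. It is not the MCL-Voice synthesizer: the function names, the letter spellings beyond alpha to delta, and the handling of numbers above 999 or with leading zeros are assumptions.

```python
# Spelling alphabet; only alpha, bravo, charlie and delta appear in the deck.
NATO = dict(zip(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    ("alpha bravo charlie delta echo foxtrot golf hotel india juliett kilo "
     "lima mike november oscar papa quebec romeo sierra tango uniform victor "
     "whiskey xray yankee zulu").split(),
))
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
DIGITS = "0123456789"


def number_words(n):
    """Cardinal wording for 0..999 in the table's style."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (" " + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    return ONES[hundreds] + " hundred" + (" " + number_words(rest) if rest else "")


def speak_token(token):
    """Return the spoken form of one whitespace-separated token."""
    if all(ch in DIGITS for ch in token):
        # Assumption: long numbers and leading zeros are read digit by digit.
        if len(token) > 3 or (len(token) > 1 and token[0] == "0"):
            return " ".join(ONES[int(ch)] for ch in token)
        return number_words(int(token))
    if all(ch in NATO or ch in DIGITS for ch in token):
        # Capital letters mixed with digits: spell each character.
        return " ".join(NATO[ch] if ch in NATO else ONES[int(ch)] for ch in token)
    return token.lower()  # ordinary word


def speak(text):
    """Return the spoken form of a whole text."""
    return " ".join(speak_token(token) for token in text.split())


for text in ["ABC", "Morning", "A12", "A 12", "25", "123", "AB12D"]:
    print(f"{text!r} -> {speak(text)!r}")
```

Splitting on whitespace is what makes “A12” and “A 12” differ: the first is one mixed token spelled character by character, the second is a letter followed by the number twelve.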

MCL-Designer Voice Enabling a project at Project level
Speech Out: special symbols
Symbol   Spoken as
#        pound
$        dollar
€        euro
&        ampersand
*        asterisk
+        plus
-        dash
.        point
/        slash
\        back slash
=        equals

MCL-Designer Voice Enabling a project at Project level  The Input timers

MCL-Designer Voice Enabling a project at Project level Settings  Parameters: Speech In Speech Out  Global Words  Control Words  Phonetic Substitution Table Note: Words are case sensitive

MCL-Designer Voice Enabling a project At Screen level All settings defined at “Screen Level” are used by all objects and processes of the selected screen.

MCL-Designer Voice Enabling a project at Screen level MCL « Voice » program structure Process-In Process-In lines Voice object definitions Start of screen Options Clear Screen Backlight Screen Label Process-Out Process-Out lines Screen Display data, Input fields, (with «In Screen» processes) Buttons, Menu Etc… Next screen

MCL-Designer Voice Enabling a project at Screen level  Enabling or Disabling Speech In & Speech Out independently

MCL-Designer Voice Enabling a project Screen Object level All settings defined at the level of a specific screen or process object are used by this object only.

MCL-Designer Voice Enabling a project at Screen Object Level Output Screen Objects Display Text Display Variable

MCL-Designer Voice Enabling at Screen Object Level Input Screen Objects Input Barcode and Keyboard Input Spin Input List Pull Down List Check boxes Radio buttons Text buttons and Image buttons Menu Text and Menu Buttons File Browse

MCL-Designer Voice Enabling a project at Screen Object Level
Voice Control: Input Barcode and Keyboard, Input Spin, Input List
- Focus: Prompt
- Words: Word List 1, 2 & 3; Audio Feedback
- Completion Prompt

MCL-Designer Voice Enabling a project at Screen Object Level
Voice Control: Pull Down List, Check box, Radio buttons
- Focus: Focus Word
- Words: Word List 1, 2 & 3; Audio Feedback

MCL-Designer Voice Enabling a project at Screen Object Level
Voice Control: Text buttons, Image buttons
- Focus: Focus Word

MCL-Designer Voice Enabling a project at Screen Object Level
Voice Control: Menu Text, Menu buttons
- Focus: Prompt
- Words: Word List 1, 2 & 3; Audio Feedback

MCL-Designer Voice Enabling a project at Screen Object Level
Voice Control: File Browse
- Focus: Prompt
- Words: Word List 1, 2 & 3; Audio Feedback
- Completion Prompt

MCL-Designer Voice Enabling a project At Process Object level All settings defined at the level of a specific screen or process object are used by this object only.

MCL-Designer Voice Enabling a project at Process Object Level Double Click Process-Out Process-In

MCL-Designer Voice Enabling a project at Process Object Level In-Screen processes

MCL-Designer Voice Enabling a project at Process Object Level
Voice Processes
- Speech Input
- Speech Output
- Play Sound Wave
- Noise Sample
- Voice Training
- Set Voice State
- Set Voice Operator
- Set Recognizer parameters
- Set Synthesizer parameters

MCL Voice: Principles

MCL-Client Voice Recognizer: Principle
Voice Engine Settings
- Maximum delay between words
- Default timers
  - For input fields: NRT = Noise Rejection Timer
  - For combined words: MWT = Multiple Word Timer
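The slide does not define how the Multiple Word Timer behaves beyond linking it to combined words and a maximum delay between words. One plausible reading is that words recognized within the MWT of each other are joined into a single entry. The sketch below illustrates that reading only; the function name, the event format, and the use of a single timestamp per word are assumptions.

```python
def group_words(events, mwt_seconds):
    """Group (time_in_seconds, word) events into combined entries.
    A word joins the previous entry when it arrives within the MWT
    of the previous word; otherwise it starts a new entry."""
    groups = []
    last_time = None
    for time, word in events:
        if last_time is not None and time - last_time <= mwt_seconds:
            groups[-1].append(word)
        else:
            groups.append([word])
        last_time = time
    return [" ".join(group) for group in groups]


# Hypothetical recognition events: three quick digits, then a later command.
events = [(0.0, "one"), (0.4, "two"), (0.7, "three"), (3.0, "Enter")]
entries = group_words(events, mwt_seconds=1.0)
print(entries)
```

With a one-second timer the three digits form one combined entry and “Enter” stands alone; a shorter timer would split the digits into separate entries.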

MCL-Client Voice Recognizer: Principle Voice Recognizer Settings Sensitivity settings

MCL-Client Voice Recognizer: Principle
[Diagram labels: Signal, ASR, Voice Template, Word List, Best Match, Word Score, Threshold (+/- Score), Data]
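The labels in this diagram suggest a best-match search: the signal is scored against the template entry of each word in the word list, and the best-scoring word is accepted only if its score clears a threshold. The sketch below shows that idea with toy feature vectors. The scoring formula, the vector representation, and all names are assumptions for illustration; the deck does not describe how MCL-Voice computes a word score.

```python
import math


def score(signal, image):
    """Toy similarity in (0, 1]: 1.0 means the vectors are identical."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(signal, image)))
    return 1.0 / (1.0 + distance)


def recognize(signal, template, word_list, threshold):
    """Return (word, score) for the best match in word_list,
    or (None, score) when the best score is below the threshold."""
    best_word, best_score = None, 0.0
    for word in word_list:
        current = score(signal, template[word])
        if current > best_score:
            best_word, best_score = word, current
    if best_score >= threshold:
        return best_word, best_score
    return None, best_score


# Hypothetical two-word template with made-up feature vectors.
template = {"one": [1.0, 0.0], "two": [0.0, 1.0]}

clean = recognize([0.9, 0.1], template, ["one", "two"], threshold=0.8)
noisy = recognize([0.5, 0.5], template, ["one", "two"], threshold=0.8)
print(clean, noisy)
```

The first signal is close to the “one” entry and is accepted; the second sits halfway between both entries, scores below the threshold, and is rejected rather than guessed.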

MCL-Designer Voice Enabling a project at Process Object Level
Consider that the template of a word is like an image.
[Figure: the original “template” image next to the image with a certain level of noise. Same image? YES]

MCL-Designer Voice Enabling a project at Process Object Level
[Figure: the original “template” image next to the image with the same level of noise. Same image? NO]

MCL-Designer Voice Enabling a project at Process Object Level
[Figure: the original “template” image next to the image with a higher level of noise. Same image? Maybe]

MCL Voice Manager

Voice-Manager: Management & Tuning Software for Voice Users & Terminals
- Log of user voice data
  - Date & time
  - Terminal ID
  - Spoken word
  - Word score
  - Result

Voice-Manager: Management & Tuning Software for Voice Users & Terminals
- User score over time
  - Average score per hour
  - Word & noise volume per hour

Voice-Manager: Management & Tuning Software for Voice Users & Terminals
- Statistical analyzer
  - Score distribution
  - Scoring split analysis
  - Words versus noise volume
  - Word details analyzer

Voice-Manager: Management & Tuning Software for Voice Users & Terminals
- Analysis & recommendation
  - Suggested word to re-train
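The log fields and the re-train suggestion on these slides can be connected with a simple average: compute the mean word score per spoken word and flag the words that fall below a chosen level. This is a sketch of that idea, not Voice-Manager's analysis; the record layout, the 0 to 100 score scale, the sample data, and the cut-off value are all assumptions.

```python
from collections import defaultdict


def average_score_per_word(log):
    """Return {word: mean score} over all log entries for that word."""
    totals = defaultdict(lambda: [0.0, 0])  # word -> [score sum, count]
    for entry in log:
        total = totals[entry["word"]]
        total[0] += entry["score"]
        total[1] += 1
    return {word: total / count for word, (total, count) in totals.items()}


def suggest_retrain(log, min_average):
    """Return the words whose mean score is below min_average, sorted."""
    averages = average_score_per_word(log)
    return sorted(word for word, avg in averages.items() if avg < min_average)


# Hypothetical log records using the fields listed on the slide.
log = [
    {"time": "2007-01-05 09:01", "terminal": "T01", "word": "five", "score": 90, "result": "accepted"},
    {"time": "2007-01-05 09:02", "terminal": "T01", "word": "nine", "score": 40, "result": "rejected"},
    {"time": "2007-01-05 09:03", "terminal": "T01", "word": "five", "score": 80, "result": "accepted"},
    {"time": "2007-01-05 09:04", "terminal": "T01", "word": "nine", "score": 50, "result": "rejected"},
]

suggestions = suggest_retrain(log, min_average=60)
print(suggestions)
```

The same grouping could be keyed on the hour of the timestamp instead of the word to approximate the average score per hour view.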

MCL Client Installation & Activation

MCL-Client Activation
The “MCL-Voice” client will generally be installed either in the Flash memory or on an SD card.
- Installing in the Flash memory
  - Using ActiveSync, install the “MCL-Voice Client” on the device using the .exe file. ActiveSync will create the necessary folders and install the “MCL-Voice” client.
- Installing on an SD card
  - The .zip file must be unzipped and copied onto the SD card. The card is then placed in the terminal.

MCL-Client Activation
Start MCL on the device:
- Define the Terminal and Subnet IDs
- Go to the Activation screen
- Enter the License number and the Activation code
Notes:
- The MCL-Client with MCL-Voice does not support the “Demo mode”
- The terminal activation uses the “Off-Line” activation procedure
- The MCL.key file is stored in the main folder of the MCL-Voice client

MCL-Client System Menu
- This screen gives access to the different options of the System Menu
- The « Setup » option is used to access the « Voice » settings menu
