The Case for Embedded Speech Recognition Jordan Cohen CTO Voice Signal Technologies VOX 2002.


1 The Case for Embedded Speech Recognition Jordan Cohen CTO Voice Signal Technologies VOX 2002

2 Voice Signal Technologies
VST develops efficient speech-centric multi-modal interfaces for embedded devices.
- Focus on Mobile Communications
- Emphasis on performance while minimizing power, footprint, and computation

3 What VST is about
Our Vision: Every handheld device of the future will have an intuitive, speech-centric, multi-modal interface.
Our Mission: Harness speech and multimodal technology to make mobile communications devices intuitive and usable.

4

5 The potential in Telecommunications?
- Handset sales are stagnant (Yankee Group)
  - 2000: 420M handsets sold
  - 2001: 395.8M handsets sold
  - 2002: 435M handsets to be sold
  - 2010: 800M handsets
- What drives the business?
  - The Network?
  - Information Services?
  - Location Services?
  - Entertainment?
  - Messaging?
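The figures on slide 5 can be turned into growth rates with a little arithmetic. The sketch below is not part of the talk; it only uses the four numbers quoted on the slide (the 2002 and 2010 values are forecasts as presented), and the helper names `percent_change` and `annual_growth` are made up for illustration.

```python
# Handset sales as quoted on slide 5 (millions of units, attributed to Yankee Group).
# The 2002 and 2010 entries are forecasts as presented in the talk.
sales_millions = {2000: 420.0, 2001: 395.8, 2002: 435.0, 2010: 800.0}


def percent_change(old: float, new: float) -> float:
    """Simple percentage change from old to new."""
    return (new - old) / old * 100.0


def annual_growth(old: float, new: float, years: int) -> float:
    """Compound annual growth rate, in percent, over the given number of years."""
    return ((new / old) ** (1.0 / years) - 1.0) * 100.0


change_2001 = percent_change(sales_millions[2000], sales_millions[2001])
change_2002 = percent_change(sales_millions[2001], sales_millions[2002])
growth_to_2010 = annual_growth(sales_millions[2002], sales_millions[2010], 2010 - 2002)

print(f"2000 to 2001: {change_2001:.1f}%")
print(f"2001 to 2002 (forecast): {change_2002:.1f}%")
print(f"2002 to 2010 (forecast, per year): {growth_to_2010:.1f}%")
```

On these numbers, 2001 is a dip of a little under 6%, and the 2010 forecast implies roughly 8% compound growth per year from the 2002 forecast.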

6 Serving Customers
- Figure out what they want and what they need
- Decide what can be delivered (at the right cost)
- Find the most efficient implementation
- User
- Service provider
- Technologist
- Deliver

7 Features Requested in New Phones
Source: "Wireless Data Services Attract Strong Interest Among Subscribers Upgrading to New Phones," Telephia/Harris Interactive Study, 8 July 2002

8 Requests Enhanced by a Speech UI

9 Embedded Voice Enhances Applications
- Voice Dialing
- Current Network Implementation
  - Speaker Independent Dial-by-Name
  - Dial by number
- With Embedded Speech Interface
  - “Name Lookup” rather than “Voice Dialing”
  - Dial by name or number
  - Launch SMS application with address filled in
    - Enter message using buttons or speech-to-text
  - Enter a name in a calendar application
  - Look up by first name, last name, employer
    - Display choices using text or voice
    - Choose using voice or buttons
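The "Name Lookup" flow on slide 9 (match on first name, last name, or employer, then display choices if there is more than one) can be sketched as below. This is a hypothetical illustration, not VST's implementation: the `Contact` record, the `lookup` and `next_step` functions, and the sample phonebook entries are all assumptions, and the recognizer is assumed to have already produced a text string.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Contact:
    """Hypothetical on-device phonebook entry."""
    first_name: str
    last_name: str
    employer: str
    number: str


def lookup(contacts: List[Contact], recognized: str) -> List[Contact]:
    """Return contacts whose first name, last name, or employer equals the recognized text."""
    key = recognized.strip().lower()
    return [
        c for c in contacts
        if key in (c.first_name.lower(), c.last_name.lower(), c.employer.lower())
    ]


def next_step(matches: List[Contact]) -> str:
    """Decide what the interface does next, following the slide's flow."""
    if not matches:
        return "retry"    # nothing matched: ask the user to try again
    if len(matches) == 1:
        return "act"      # unambiguous: dial, or fill in the SMS address
    return "choose"       # several matches: display choices, pick by voice or buttons


# Made-up sample data for illustration only.
phonebook = [
    Contact("Ann", "Lee", "Acme", "555-0100"),
    Contact("Bob", "Lee", "Initech", "555-0101"),
    Contact("Cara", "Diaz", "Acme", "555-0102"),
]

for spoken in ("Ann", "Lee", "Zed"):
    found = lookup(phonebook, spoken)
    print(spoken, "->", next_step(found), [c.number for c in found])
```

Because the same lookup result feeds dialing, SMS addressing, and calendar entry, the matching step is kept separate from the decision about what to do with the matches.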

10 Embedded Speech Services
- The embedded speech recognizer is part of the handset software infrastructure
- Results are available as actions or text
- The handset knows its state
  - Language
  - Software
  - Network Status
  - (User)
- The handset is multi-communicative
  - Screen
  - Buttons
  - Audio
  - Buzz/vibrate

11 What Speech Technologies are Available
We can deliver multimodal interfaces using three speech capabilities on the device:
1. Digit Recognition: PIM, Dialing, Internet, Appointments
2. Choose from a list: Phone management, PIM, Dialing (Lookup), , appointments/meetings, PDA
3. Speech-to-text: PIM, SMS, , Internet, Appointments/meetings, PDA

12 Wins for the Carrier
- Stickiness
- Small Support Costs
- Efficiency
- Real Time Services require high bandwidth support and seamless coverage
- Local Data and Messaging Services require low bandwidth and occasional coverage
- SMS = Secure Money Stream
- Roaming

13 In the Short and Intermediate Term
- Very competent digit recognition and name dialing (Sprint Voice Command)
  - Voice interactions have voice response
  - Network connectivity required
  - Unimodal
- Multimodal Solutions use GPRS or UMTS
- Carriers must install infrastructure
- Multimodal capabilities require simultaneous voice and data
- 2.5G Network build out not before 2007
  - Bear Stearns Equity Research, July 2002, “Overhang Still Outweighing Potential Catalyst”
  - Tower and cell building reduced from 105,000 to 66,000 before 2007
  - Not enough infrastructure for national 2.5G network

14 Today’s Server-Based Speech Application
- Store your phonebook
- Dial your connection (but not your phone)
  - By number
  - By name (Speaker Independent)
(If you are connected to the network)

15 Today’s Embedded Solutions
- Dial your phone
- Navigate and operate your phone by voice
- Look up your PIM information for use in dialing or as an address for SMS
- Access the internet using HTTP or other text-based protocols
- Manage your local data
- Roam across networks without losing functionality
- Save your battery for use in voice calls
- Create SMS or messages using speech-to-text

16 The Global Market
[Chart; axis unit: billions]
Source: “Global Wireless Device Market,” Strategy Analytics, October 1, 2001

17 Speech and Embedded Technology
If you heard it in this talk, you can see it in our laboratory today on a phone.
If you wait only a few months, you will be able to buy it in your local phone store.
If you don’t know that it is coming, you haven’t talked with us!

18 The End
Embed It!
Jordan R. Cohen, CTO, Voice Signal Technologies

