VoiceXML Overview, Opportunities & Challenges. Hitesh Kr. Seth, Chief Technology Evangelist, SeraNova, Inc. O'Reilly Conference on Enterprise Java, 2001.

1 (1) VoiceXML Overview, Opportunities & Challenges. Hitesh Kr. Seth, Chief Technology Evangelist, SeraNova, Inc. O'Reilly Conference on Enterprise Java, 2001.

2 (2) Agenda: Introduction; History; Elements; Developing Voice Portals; Applications; Vendor Landscape; Challenges; Resources.

3 (3) Introduction

4 (4) The Web is Ubiquitous. Key highlights: the HTTP protocol; HTML for content, either static or dynamically generated. Usage model: create content/scripts, publish them on the web server, and access them through a web browser.

5 (5) What about Voice? Call center and IVR-based products have been around, but: IVR applications are usually DTMF oriented, with interaction through the keypad rather than voice; the infrastructure is complex and involves huge investments in proprietary solutions; integration with the Internet is lacking; an ASP model for deployment wasn't established. Meanwhile, sophisticated Text-to-Speech/Voice Recognition solutions have emerged.

6 (6) VoiceXML. What is VoiceXML? An XML-based markup language that describes voice and touch-tone interactions, used to develop interactive voice applications.

7 (7) Application Model

8 (8) Technical Highlights. Based on XML 1.0. Supports DTMF (touch-tone keys) and voice input, e.g. "Press 1 for Email" and "Please say your name". TTS (Text-to-Speech) and pre-recorded audio output. Recording of user input. Telephony integration, e.g. connect to a live operator. Form-level and field-level grammars support direct and (near) natural dialogs. Direct: "Which city would you like to go to?" "San Jose". Natural-like: "What can I do for you today?" "I would like to travel from San Jose, CA to Newark, NJ on 15 Nov".

9 (9) Key Benefits. Brings the ubiquity of the Web to the ubiquitous access device: an ordinary phone. Reaches billions of landline and mobile phones worldwide. Hands-free communication for automobiles. A single platform for developing Web and voice applications. Automated customer service: can enhance customer satisfaction (immediate response) and lower costs (fewer customer service reps and lower customer waiting costs!). Can be used even in flight!

10 (10) Hello VoiceXML Hello World!
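
The markup on this slide did not survive in the transcript. As a minimal sketch of what a "Hello World" document looks like, the Java class below prints one, assuming the VoiceXML 1.0 element names `vxml`, `form` and `block`. The class and method names are invented for the example.

```java
// Minimal sketch: prints a VoiceXML 1.0 "Hello World" document.
// Element names assume VoiceXML 1.0; the class and method names are hypothetical.
final class HelloVoiceXml {

    /** Builds the document as a string so it can be saved or served as a static file. */
    static String build() {
        return String.join("\n",
            "<?xml version=\"1.0\"?>",
            "<vxml version=\"1.0\">",
            "  <form>",
            "    <block>Hello World!</block>",
            "  </form>",
            "</vxml>");
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

A voice gateway that fetches this document would be expected to speak the text inside the block through its TTS engine.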

11 (11) Demo

12 (12) History

13 (13) History. 3/2/1999: AT&T, Lucent & Motorola create the VXML Forum (members: 17). 8/25/1999: VoiceXML 0.9 preliminary spec released (members: 61). 3/7/2000: VoiceXML 1.0 spec released (members: 79). 5/22/2000: VoiceXML 1.0 submitted to W3C (members: 150). As of 10/5/2000, the VoiceXML Forum has 281 members.

14 (14) Earlier Works: SpeechML by IBM; VoxML by Motorola; PhoneWeb/PML by Lucent/AT&T.

15 (15) Elements

16 (16) Elements, by category: Root; Form/Interaction; Grammar; Events; Platform Specific; Telephony Integration. [The element names listed under each category are missing from this transcript.]

17 (17) Elements, by category (contd.): Language; Prompt/Audio; Navigation. [The element names listed under each category are missing from this transcript.]

18 (18) Prompts. TTS (Text-to-Speech): "What can I do for you?"; "Did you say 732-362-2187?"; "Did you say Area Code (732) 362-2187?" Pre-recorded prompts: e.g. "Hitesh". Rule of thumb: use TTS sparingly (only for dynamic information). Pre-recorded audio can also be used for ads or other special announcements.
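
The two prompt styles on this slide can be sketched as small Java helpers that emit the markup. This is an illustration under assumptions: the element names (`prompt`, `audio`) follow VoiceXML 1.0, the file name `welcome.wav` is made up, and whether a given TTS engine reads the grouped phone number more naturally depends on the engine.

```java
// Sketch only: helper names are hypothetical; element names assume VoiceXML 1.0.
final class PromptBuilder {

    /** Escapes the characters that are unsafe in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** TTS prompt: reserve it for dynamic text that cannot be recorded ahead of time. */
    static String tts(String text) {
        return "<prompt>" + escape(text) + "</prompt>";
    }

    /** Recorded prompt: the element text is the TTS fallback if the audio cannot be played. */
    static String recorded(String audioUrl, String fallbackText) {
        return "<prompt><audio src=\"" + escape(audioUrl) + "\">"
                + escape(fallbackText) + "</audio></prompt>";
    }

    /** Groups ten digits as "Area Code (732) 362-2187" for a confirmation prompt. */
    static String phoneForSpeech(String digits) {
        if (!digits.matches("\\d{10}")) {
            throw new IllegalArgumentException("expected 10 digits: " + digits);
        }
        return "Area Code (" + digits.substring(0, 3) + ") "
                + digits.substring(3, 6) + "-" + digits.substring(6);
    }

    public static void main(String[] args) {
        System.out.println(recorded("welcome.wav", "Welcome to your Personal Portal."));
        System.out.println(tts("Did you say " + phoneForSpeech("7323622187") + "?"));
    }
}
```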

19 (19) Navigation. Menu prompt: "Welcome to your Personal Portal." Choices: Email; Calendar; Employee Directory.
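
A menu like this one could be generated as follows. This is a sketch, assuming the VoiceXML 1.0 `menu`, `choice` and `enumerate` elements; the target file names (`email.vxml` and so on) and the class name are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: class name and target URLs are hypothetical.
final class PortalMenu {

    /** Escapes the characters that are unsafe in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Builds a menu document; choices keep insertion order and get DTMF keys 1, 2, 3, ... */
    static String build(String greeting, Map<String, String> choices) {
        if (choices.isEmpty() || choices.size() > 9) {
            throw new IllegalArgumentException("need between 1 and 9 choices");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\"?>\n");
        sb.append("<vxml version=\"1.0\">\n");
        sb.append("  <menu>\n");
        sb.append("    <prompt>").append(escape(greeting))
          .append(" Say one of: <enumerate/></prompt>\n");
        int key = 1;
        for (Map.Entry<String, String> c : choices.entrySet()) {
            sb.append("    <choice dtmf=\"").append(key++)
              .append("\" next=\"").append(escape(c.getValue())).append("\">")
              .append(escape(c.getKey())).append("</choice>\n");
        }
        sb.append("  </menu>\n");
        sb.append("</vxml>\n");
        return sb.toString();
    }

    /** The three choices named on the slide, with made-up target documents. */
    static String sample() {
        Map<String, String> choices = new LinkedHashMap<>();
        choices.put("Email", "email.vxml");
        choices.put("Calendar", "calendar.vxml");
        choices.put("Employee Directory", "directory.vxml");
        return build("Welcome to your Personal Portal.", choices);
    }

    public static void main(String[] args) {
        System.out.print(sample());
    }
}
```

Giving each choice both a spoken label and a DTMF key lets callers fall back to the keypad when recognition is unreliable.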

20 (20) Grammars. Specify the utterances that a user may speak, each providing a corresponding string value or a set of attribute-value pairs. Can be defined as a form grammar or a field grammar. The spec doesn't require an implementation to support a particular format. Common grammar formats: Java Speech API Grammar Spec (JSGF); Nuance GSL; Speech Recognition Grammar Spec for the W3C Speech Interface Framework (Working Draft). A grammar can be specified inline in the VoiceXML document or kept in an external file that the document references.

21 (21) Grammars. Inline: ... prompt "Say the name of the person"; grammar: hitesh seth {1} | ... External: ... prompt "Say the name of the person"; grammar file mycompany.gram: #JSGF V1.0; grammar mycompany; public = (hitesh seth) {1} ...
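
An external grammar file of this shape can be generated from a list of names. The sketch below assumes JSGF syntax (header line, grammar declaration, one public rule with numbered tags in braces). The rule name `employee` and the second name are hypothetical; the rule name on the slide was not preserved.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Sketch only: the rule name and the second employee name are hypothetical.
final class JsgfNames {

    /** Builds a JSGF grammar whose single public rule lists each name with a numeric tag. */
    static String build(String grammarName, String ruleName, List<String> names) {
        if (names.isEmpty()) {
            throw new IllegalArgumentException("at least one name is required");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n");
        sb.append("grammar ").append(grammarName).append(";\n");
        sb.append("public <").append(ruleName).append("> = ");
        for (int i = 0; i < names.size(); i++) {
            if (i > 0) {
                sb.append("\n    | ");
            }
            // The tag in braces is the value handed back when this alternative is recognized.
            sb.append('(').append(names.get(i).toLowerCase(Locale.ROOT))
              .append(") {").append(i + 1).append('}');
        }
        sb.append(";\n");
        return sb.toString();
    }

    static String sample() {
        return build("mycompany", "employee", Arrays.asList("Hitesh Seth", "John Smith"));
    }

    public static void main(String[] args) {
        System.out.print(sample());
    }
}
```

Generating the file from the employee database keeps the grammar in step with the directory, which matters because recognition can only succeed for names the grammar lists.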

22 (22) Interaction. Prompt: "Say the name of the person". Grammar: (hitesh seth) {1} | ...

23 (23) Interaction. Response: "Hitesh Seth. Direct Phone: 732-362-2187." ...

24 (24) Telephony Integration. An element that connects the user to another phone. Applications: (1) Assisted dialing, e.g. an online employee directory: "I would like to call Hitesh on his cellular phone." "Connecting to (732) 433-5603 …" (2) Switching to a human operator: "Welcome to XYZ Voice Portal. At any point of time say Operator to connect to a customer service agent. Please say your name. …"

25 (25) Telephony Integration. ... Prompt: "Hitesh's direct phone is (732) 362-2187, Cellular ..." Grammar: home | direct | cellular ...
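
Once the caller has picked home, direct or cellular, the application needs to hand the chosen number to the telephony layer. The sketch below emits a VoiceXML `transfer` element for a formatted number. The `phone://` destination form and the attribute set are assumptions; individual gateways may expect a different scheme, so check the platform documentation before relying on it.

```java
// Sketch only: class and method names are hypothetical, and the phone://
// destination form is an assumption; a given gateway may expect another scheme.
final class TransferBuilder {

    /** Keeps only the digits of a formatted number such as "(732) 362-2187". */
    static String digitsOnly(String phone) {
        return phone.replaceAll("[^0-9]", "");
    }

    /** Emits a bridged transfer to the given number. */
    static String transferTo(String phone) {
        String digits = digitsOnly(phone);
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("no digits in: " + phone);
        }
        return "<transfer name=\"call\" dest=\"phone://" + digits + "\" bridge=\"true\"/>";
    }

    public static void main(String[] args) {
        System.out.println(transferTo("(732) 362-2187"));
    }
}
```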

26 (26) Telephony Integration

27 (27) Extensions & Tags. Implementation-specific properties, e.g. TTS engine parameters (gender, tone, etc.). Implementation-specific components and value-add services, e.g. integration with components built for the underlying ASR engine (such as Nuance SpeechObjects); a component for getting an address; a Caller-ID information service; a cellular phone location service.

28 (28) Demo

29 (29) Developing Voice Portals

30 (30) Developing: What do you need? A development tool to develop/test the application: IBM WebSphere Voice Server SDK, Motorola Mobile ADK, Nuance V-Builder, Tellme Studio, … A web server to execute the scripts and serve VoiceXML content: Apache, Microsoft, Netscape, …; JSP, Servlets; XML parser, XSLT processor. A VoiceXML interpreter/implementation platform. An ordinary touch-tone phone. A PC with a good sound card and microphone, for creating/testing applications using simulators/SDKs.

31 (31) Serving Up VoiceXML! Static vs. dynamic content. Dynamic: server scripting technologies such as JSP and Servlets generate the VoiceXML. Dynamic presentation using XML/XSLT: XML represents the content; XSLT represents the transformation of the content into presentation. Use Apache Cocoon!

32 (32) XML/XSLT. XML represents data: static XML, or XML generated dynamically by server scripts. XSLT represents formatting: write it yourself or create it through a tool.

33 (33) Processing XML/XSLT. JSP:

<%@ page import="org.apache.xalan.xslt.*" %>
<%
    // Input document and stylesheet (Xalan 1.x API).
    String xmlFile = "AddressBook.xml";
    String xslFile = "AddressBook.xsl";
    XSLTProcessor processor = XSLTProcessorFactory.getProcessor();
    // Apply the stylesheet to the XML and write the result to the JSP output.
    processor.process(new XSLTInputSource(xmlFile),
                      new XSLTInputSource(xslFile),
                      new XSLTResultTarget(out));
%>

Use sophisticated content management systems. Create different style sheets for different interfaces: VoiceXML, HTML, WML, etc.
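
The same XML-to-VoiceXML step can be tried outside a JSP container. The sketch below is not the Xalan 1.x code from the slide: it swaps in the standard `javax.xml.transform` API that ships with the JDK, and the address book document and stylesheet are small made-up inline strings rather than the slide's `AddressBook.xml` and `AddressBook.xsl` files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Sketch only: uses the JDK's javax.xml.transform API, not the Xalan 1.x classes
// on the slide. The XML and XSL contents are hypothetical.
final class AddressBookToVoiceXml {

    static final String XML =
        "<addressbook>"
        + "<person><name>Hitesh Seth</name><phone>732-362-2187</phone></person>"
        + "</addressbook>";

    static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/addressbook\">"
        + "<vxml version=\"1.0\"><form><block>"
        + "<xsl:for-each select=\"person\">"
        + "<xsl:value-of select=\"name\"/>. Direct phone: <xsl:value-of select=\"phone\"/>. "
        + "</xsl:for-each>"
        + "</block></form></vxml>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML and returns the result as a string. */
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

Swapping in a different stylesheet for the same XML is what lets one content source feed VoiceXML, HTML and WML front ends.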

34 (34) Deployment. Infrastructure required: in addition to a web application server serving VoiceXML pages, you need telephony interface boards, an ASR engine, a TTS engine, a VoiceXML interpreter, and bandwidth/incoming lines. Deployment options: (1) a pre-packaged VoiceXML server (all-in-one); (2) pick-and-choose VoiceXML solution components (ASR, TTS, VoiceXML interpreter, hardware ports, bandwidth); (3) hosted voice ASP solutions.

35 (35) Applications

36 (36) Applications. Utilizing web content/information: stock quotes, weather information, news. Customer service: order status, address change, automated call center, etc. Commerce: banking, stock trading, voice-enabled commerce. Corporate portals: employee directory, employee self-service (human resources), email, calendar, unified messaging. Alerts (push model): server-initiated transactions ("Call me when the stock price of any company in my portfolio goes up by $10").

37 (37) Corporate Portal Scenario. The caller dials 1 (800) XXXXXXX.
System: Welcome to Your Corporate Portal. Please say your name.
Caller: Hitesh Seth
System: Please enter your access code.
Caller: ****
System: Good morning, Hitesh. What can I do for you?
Caller: Check my mail.
System: You have 34 new messages.
Caller: Is there any new message from my boss?
System: Yes, there are two messages from …

38 (38) Corporate Portal (contd.)
System: First message. Subject: Help Needed in XYZ Project. Hitesh, could you please call …?
Caller: Reply.
Caller: I am in San Jose till the 15th of November. I could come to Phoenix on the 16th of November. [#] [used ]
System: Mail sent.
Caller: When am I meeting with John today?
System: You have a meeting with John at 2:00 PM.
Caller: Connect me to his office, please.
System: Connecting to John's direct number, (732)... [used ]

39 (39) Vendor Landscape

40 (40) Vendor Landscape. All-in-one VoiceXML gateways/servers (combine ASR, TTS, VoiceXML interpreter, hardware ports): Lucent Speech Server, Motorola Voice Developer Gateway, VoiceGenie VoiceXML Gateway, … ASR (Automatic Speech Recognition) engines: AT&T, IBM, Nuance, Philips, SpeechWorks, … Development tools: IBM WebSphere Voice Server SDK, Motorola Mobile ADK, Nuance V-Builder, Tellme Studio, … Recording and developing prompts: Microsoft Sound Recorder, Sonic Foundry Sound Forge, Syntrillium Software Cool Edit, …

41 (41) Vendor Landscape. Text-to-Speech engines: AT&T, Fonix TTS, L&H RealSpeak, Lucent TTS Engine, Nuance Vocalizer, SpeechWorks Speechify, … Telephony interface boards: Dialogic, Lucent, … Voice ASP solutions: BeVocal, Interactive Telesis, Tellme, VoiceGenie Technologies, …

42 (42) Challenges

43 (43) Challenges. Need for sophisticated infrastructure. Voice recognition quality: near-natural-language speech recognition requires sophisticated grammars, and your application is only as good as its grammar. TTS quality and customization. Server-initiated VoiceXML interactions (push model). VoiceXML application development tools are still maturing.

44 (44) Authentication. Possible approaches: (1) User IDs/passwords: too cryptic for ASR engines to recognize; users usually need to spell them out, which is hard. (2) Names/access codes: names may not be unique; may be good for intranets. (3) Telephone numbers/access codes: telephone numbers are unique, e.g. 0017323622187 for an international portal or 7323622187 for a US portal (or a caller redirected to a US-only area); easy to key in and/or say aloud; if available, use Caller-ID, which works like a persistent cookie. (4) Voice-based authentication: voice print/pattern.
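
If telephone numbers serve as the account key, the spoken, keyed and Caller-ID forms all need to map to the same value. The sketch below reduces the two forms from the slide to one ten-digit US key. The handling of a leading `1` on eleven-digit input is an added assumption, and the class name is hypothetical.

```java
// Sketch only: hypothetical helper; treats "001" + 10 digits (the slide's
// international form) and, as an added assumption, "1" + 10 digits as US numbers.
final class CallerKey {

    /** Reduces a spoken, keyed or Caller-ID number to a ten-digit US lookup key. */
    static String normalizeUs(String raw) {
        String digits = raw.replaceAll("[^0-9]", "");
        if (digits.length() == 13 && digits.startsWith("001")) {
            digits = digits.substring(3);   // strip the international prefix
        } else if (digits.length() == 11 && digits.startsWith("1")) {
            digits = digits.substring(1);   // strip the US country code
        }
        if (digits.length() != 10) {
            throw new IllegalArgumentException("not a US number: " + raw);
        }
        return digits;
    }

    public static void main(String[] args) {
        System.out.println(normalizeUs("0017323622187"));
        System.out.println(normalizeUs("(732) 362-2187"));
    }
}
```

The access code would still be checked separately; the normalized number only identifies the account, much as a cookie identifies a browser.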

45 (45) Performance. Grammars: inline vs. external; caching! VoiceXML documents: caching! Multiple interactions per document. Audio: TTS vs. recorded prompts; quality vs. size.

46 (46) Getting Started. Take small steps. (1) Use DTMF: "Enter your 10-digit account number"; "Press 1 for Email, 2 for Calendar, 3 for Employee Directory". (2) Use directed dialogs: "Say the name of the person". (3) Move towards natural language conversations: "What can I do for you?" Use TTS sparingly, for the quality of the voice interaction. If your application incorporates ads, keep them short and crisp. Start small, grow big: try regional betas/limited trials and then move towards a larger audience.

47 (47) Opportunities. According to the Kelsey Group, by 2005 advertising and transactions from voice portals will produce $5 billion in revenues, plus $6 billion for associated hardware, software and Net service provider companies. (Adapted from "Voice portal companies overshooting demand," May 9, 2000)

48 (48) Resources

49 (49) Resources. Organizations: VoiceXML Forum; W3C Voice Browser Activity. Specs: VoiceXML Specification; Java Speech API Grammar Spec (JSGF), …media/speech/forDevelopers/JSGF.pdf

50 (50) Resources. Vendors: AT&T; BeVocal; Dialogic; IBM (…ware/speech); Lucent; Motorola; Nuance; Tellme; SpeechWorks (http://www.speechworks.com); VoiceGenie Technologies; Voxeo.

51 (51) Questions?

52 (52) Thanks for your time.
