The Voice-Enabled Web: VoiceXML and Related Standards for Telephone Access to Web Applications 14 Feb. 2002 Christophe Strobbe K.U.Leuven - ESAT-SCD-DocArch.


1 The Voice-Enabled Web: VoiceXML and Related Standards for Telephone Access to Web Applications 14 Feb. 2002 Christophe Strobbe K.U.Leuven - ESAT-SCD-DocArch

2 Overview: Voice browsers; History of voice markup languages; W3C Speech Interface Framework; Communication Architecture; VoiceXML 2.0; Grammars; SALT. Not covered: WAP/WML, Voice over IP.

3 Voice Browser Device (hardware and software) that interprets voice markup languages to generate voice output and interpret voice input.

4 Companies

5 History 1990s: companies developed their own markup languages: PhoneML (AT&T), PhoneML (Lucent), VoxML (Motorola), TalkML (HP Labs), SpeechML (IBM) => VoiceXML Forum: VoiceXML 1.0. 1998: W3C Voice Browser Workshop.

6 VoiceXML Specification History April 1999 – Initial spec, Request For Comment; August 1999 – 0.9 Spec released; March 2000 – 1.0 Spec released; October 2001 – 2.0 Working Draft (W3C); March 2002 – next Working Draft; 4th quarter 2002 – 2.0 Recommendation W3C?

7 Why Voice Markup Languages? “Voicifying” web pages by adding a few VoiceXML tags is not feasible: –basic design principles that make a good web page are very different from those that make an efficient voice interface –e.g. Raggett & Ben-Natan: “Voice Browsers” (W3C, 1998) … unless you want to create a multimodal interface (cf. SALT) ?

8 Speech Interface Framework [Diagram. Components: User, Telephone System, ASR, DTMF tone recognizer, Language Understanding, Context Interpretation, Dialog Manager, Media Planning, Language Generation, TTS, Prerecorded audio player, World Wide Web. Markup languages and resources shown: VoiceXML 2.0, Speech Recognition Grammar ML, N-gram Grammar ML, Natural Language Semantics ML, Speech Synthesis ML, Lexicon, Reusable Components.]

9 Communication Architecture

10 What is VoiceXML? A language for creating audio dialogs that include: synthesized speech; digitized audio; recognition of spoken and DTMF key input; recording of spoken input; telephony; mixed-initiative conversations. Major goal: bring the advantages of web-based development and content delivery to interactive voice response applications.

11 Advantages of VoiceXML As perceived by Motorola et al.: people want a better mobile user interface while on the go; device independent; open standards create and drive market demand; easy to program since similar to other XML-based languages; utilizes existing web infrastructure.

12 Developing applications To develop VoiceXML applications you have to learn several languages: –VoiceXML –ECMAScript (JavaScript/JScript) –a grammar format (GSL, JSGF, Speech Recognition Grammar Specification) –a back-end scripting language (Perl, Java, …). Web developers are used to this kind of environment.

13 VoiceXML 2.0 Features Clears up grammar ambiguity. SRGF – Speech Recognition Grammar Format: –XML form: grammars can be represented as an XML document, so XSLT can generate them and XML parsers can parse them –Augmented BNF (ABNF) form: similar to many current proprietary formats. Speech Synthesis Markup Language –based on JSML (Sun)
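
The point that XML grammars can be handled by ordinary XML tooling can be illustrated with a small sketch. It assumes the element names (`grammar`, `rule`, `one-of`, `item`), the namespace URI, and the ABNF header of the W3C grammar format as later published; the working drafts current at the time of the talk may have differed. The `drink` rule and both helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the W3C grammar format (as later published).
SRGS_NS = "http://www.w3.org/2001/06/grammar"

# Hypothetical grammar in the XML form: the caller may say one of three words.
GRAMMAR_XML = """<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" root="drink">
  <rule id="drink">
    <one-of>
      <item>coffee</item>
      <item>tea</item>
      <item>juice</item>
    </one-of>
  </rule>
</grammar>
"""


def rule_alternatives(grammar_xml, rule_id):
    """Return the <item> texts under the <one-of> of the rule with this id."""
    ns = {"g": SRGS_NS}
    root = ET.fromstring(grammar_xml)
    for rule in root.findall("g:rule", ns):
        if rule.get("id") == rule_id:
            return [item.text.strip() for item in rule.findall("g:one-of/g:item", ns)]
    return []


def to_abnf(rule_id, alternatives, language="en-US"):
    """Render the same alternatives in the ABNF form (illustrative only)."""
    header = f"#ABNF 1.0;\nlanguage {language};\nroot ${rule_id};\n"
    return header + f"${rule_id} = " + " | ".join(alternatives) + ";\n"


if __name__ == "__main__":
    words = rule_alternatives(GRAMMAR_XML, "drink")
    print(to_abnf("drink", words))
```

A standard XML parser is enough to read the grammar, and the ABNF text is a compact rendering of the same rule.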

14 VoiceXML Basics XML-based. More structured than HTML (describes structure and semantics of data, not presentation) –Must close all tags. Structure of language described in a Document Type Definition (DTD)

15 VoiceXML Applications An application consists of a single application root document and zero or more other documents. The application root document is loaded whenever any other document of the application is accessed. The grammars and variables of the application root document are visible in the other application documents.
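
A minimal sketch of the root/leaf relationship, assuming the VoiceXML 2.0 convention that a document names its root through an `application` attribute on `<vxml>` and reads root variables through the `application.` scope prefix. The file name `app_root.vxml`, the variable `greeting`, and the helper functions are hypothetical, and the namespace declaration is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical application root document: declares a shared variable.
ROOT_DOC = """<?xml version="1.0"?>
<vxml version="2.0">
  <var name="greeting" expr="'Welcome back'"/>
</vxml>
"""

# Hypothetical leaf document: points at the root and reads its variable.
LEAF_DOC = """<?xml version="1.0"?>
<vxml version="2.0" application="app_root.vxml">
  <form>
    <block>
      <value expr="application.greeting"/>
    </block>
  </form>
</vxml>
"""


def application_root(doc_xml):
    """Return the root document URI a document refers to, or None if it names none."""
    return ET.fromstring(doc_xml).get("application")


def declared_variables(doc_xml):
    """Return the names of variables declared at document level with <var>."""
    return [var.get("name") for var in ET.fromstring(doc_xml).findall("var")]


if __name__ == "__main__":
    print(application_root(LEAF_DOC))    # URI of the root document
    print(declared_variables(ROOT_DOC))  # variables shared with leaf documents
```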

16 VoiceXML Documents Documents can contain two types of dialogs: –forms (<form>) –menus (<menu>) Other elements: –<meta>: metadata, defined as name/value pair –<var>: for declaring variables –<script>: for client-side ECMAScript –<catch>: for catching events –<link>: transitions to other dialogs

17 Forms and menus Forms may contain zero or more <field> elements –the user must provide a value for the field before proceeding to the next element in the form –each field may specify a grammar that defines the allowable inputs Menus may contain one or more <choice> elements –a menu presents the user with a choice of options and then transitions to another dialog
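
The dialog structure described on the last two slides can be sketched as one document with a menu and two forms. This is an illustration only, assuming VoiceXML 2.0 element names (`<menu>`, `<choice>`, `<form>`, `<field>`, `<block>`); the dialog ids, prompts, and the helper functions `outline` and `dangling_targets` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a menu that transitions to one of two forms.
DOC = """<?xml version="1.0"?>
<vxml version="2.0">
  <menu id="main">
    <prompt>Say weather or order.</prompt>
    <choice next="#weather">weather</choice>
    <choice next="#order">order</choice>
  </menu>
  <form id="weather">
    <block>It is sunny today.</block>
  </form>
  <form id="order">
    <field name="drink">
      <prompt>Which drink would you like?</prompt>
      <grammar>[coffee tea juice]</grammar>
    </field>
  </form>
</vxml>
"""


def outline(doc_xml):
    """Map each menu id to its choice targets and each form id to its field names."""
    root = ET.fromstring(doc_xml)
    menus = {menu.get("id"): [choice.get("next") for choice in menu.findall("choice")]
             for menu in root.findall("menu")}
    forms = {form.get("id"): [field.get("name") for field in form.findall("field")]
             for form in root.findall("form")}
    return {"menus": menus, "forms": forms}


def dangling_targets(doc_xml):
    """List '#id' choice targets that do not match any dialog in the document."""
    info = outline(doc_xml)
    dialog_ids = set(info["menus"]) | set(info["forms"])
    return [target
            for targets in info["menus"].values()
            for target in targets
            if target.startswith("#") and target[1:] not in dialog_ids]


if __name__ == "__main__":
    print(outline(DOC))
    print(dangling_targets(DOC))
```

The `weather` form has no fields, which matches the statement that a form may contain zero fields; the `order` form collects one value constrained by its grammar.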

18 VoiceXML Example Hello World!
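
The markup of this example is not preserved in the transcript. A minimal sketch of what a Hello World document of this kind might look like, assuming VoiceXML 2.0 element names (`<vxml>`, `<form>`, `<block>`); it is not necessarily the layout of the original slide, and `spoken_text` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction: one form whose block is spoken to the caller.
HELLO_VXML = """<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>Hello World!</block>
  </form>
</vxml>
"""


def spoken_text(doc_xml):
    """Collect the text of every <block>, i.e. what would be synthesized."""
    root = ET.fromstring(doc_xml)
    return [block.text.strip() for block in root.iter("block") if block.text]


if __name__ == "__main__":
    print(spoken_text(HELLO_VXML))
```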

19 Example with Grammar Prompt: Would you like coffee, tea, or juice? Grammar: [coffee tea juice] Confirmation: Your … will be ready momentarily
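
Only the prompt text and the grammar of this example survive in the transcript. The sketch below is one plausible reconstruction, assuming VoiceXML 2.0 element names (`<field>`, `<grammar>`, `<filled>`, `<value>`) and an inline grammar in the bracket notation shown on the slide, which resembles GSL. The field name `drink` and the function `simulate_field` are hypothetical, and the simulation treats `expr` as a plain variable name rather than a full ECMAScript expression.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the slide's example.
DRINK_VXML = """<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <field name="drink">
      <prompt>Would you like coffee, tea, or juice?</prompt>
      <grammar>[coffee tea juice]</grammar>
      <filled>
        <prompt>Your <value expr="drink"/> will be ready momentarily</prompt>
      </filled>
    </field>
  </form>
</vxml>
"""


def simulate_field(doc_xml, utterance):
    """Return the confirmation prompt for an utterance, or None if it is not in the grammar."""
    field = ET.fromstring(doc_xml).find("form/field")
    # The bracketed grammar lists the allowable words, separated by spaces.
    allowed = field.find("grammar").text.strip().strip("[]").split()
    if utterance not in allowed:
        return None  # a real voice browser would raise a no-match event here
    variables = {field.get("name"): utterance}
    prompt = field.find("filled/prompt")
    parts = [prompt.text or ""]
    for child in prompt:
        if child.tag == "value":
            parts.append(variables[child.get("expr")])
        parts.append(child.tail or "")
    # Collapse whitespace so the result reads as a single sentence.
    return " ".join("".join(parts).split())


if __name__ == "__main__":
    print(simulate_field(DRINK_VXML, "tea"))
```

The field is filled only when the recognized word is in the grammar, after which the `<filled>` prompt echoes the chosen value.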

20 Dynamic VoiceXML
#!perl -w
# CGI script: send the content-type header, a blank line, then the document
print "Content-type: text/x-vxml\n\n";
$HOMEBUFFER = ' Hello World ';
print $HOMEBUFFER;
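
The same idea, a server-side script that emits a content-type header followed by a generated VoiceXML document, can be sketched in Python. The `text/x-vxml` media type is taken from the slide; the document layout (`<form>`/`<block>`) and the function name `vxml_response` are assumptions made for illustration.

```python
import sys
from xml.sax.saxutils import escape


def vxml_response(message):
    """Return a CGI-style response: header line, blank line, then the document."""
    body = (
        '<?xml version="1.0"?>\n'
        '<vxml version="2.0">\n'
        '  <form>\n'
        # escape() keeps characters such as & and < from breaking the markup
        f'    <block>{escape(message)}</block>\n'
        '  </form>\n'
        '</vxml>\n'
    )
    return "Content-type: text/x-vxml\n\n" + body


if __name__ == "__main__":
    sys.stdout.write(vxml_response("Hello World"))
```

Escaping the message matters as soon as the text comes from a database or user input rather than a fixed string.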

21 Other Markup Languages JSML: JSpeech Markup Language (Sun); Dialog ML (Dennis Heuer); SABLE (SABLE Consortium); DMML (Dialogue Moves Markup Language); SALT: Speech Application Language Tags (SALT Forum); (CallXML, Telephony Markup Language, …). Progress since March 2000 (VoiceXML 1.0)?

22 JSML JSpeech Markup Language (Sun): XML specification for controlling text-to-speech engines; includes elements that describe the structure of a document, provide pronunciation of words and phrases, and place markers in the text; includes elements that control phrasing, emphasis, pitch, speaking rate, …; elements borrowed by VoiceXML

23 DialogML Dialog Management Language (Dennis Heuer): open source project hosted by SourceForge; XML language for defining dialog-driven setup processes; not the “Dialog ML” mentioned in the W3C notes on requirements for voice markup languages!

24 SABLE SABLE Consortium XML/SGML-based markup language for controlling text-to-speech engines evolved out of work on combining three existing text-to-speech languages (SSML, STML, JSML) (documentation hosted by Bell Labs no longer available; any progress since publication of draft specification?)

25 SALT Speech Application Language Tags (SALT Forum): SALT Forum founded by Microsoft, Intel, … on 15 October 2001; very simple set of tags for extending existing markup languages (XHTML, XML); specification available Q1 2002; specification submitted to standards body (W3C??) mid 2002

