Voice Browsers Making the Web accessible to more of us, more of the time. SDBI November 2001, Shani Shalgi GeneralMagic Demo GeneralMagic Demo.


1 Voice Browsers Making the Web accessible to more of us, more of the time. SDBI November 2001, Shani Shalgi GeneralMagic Demo GeneralMagic Demo

2 2 What is a Voice Browser? b Expanding access to the Web b Will allow any telephone to be used to access appropriately designed Web-based services b Server-based b Voice portals

3 3 What is a Voice Browser? b Interaction via keypads, spoken commands, and listening to prerecorded speech, synthetic speech and music. b An advantage to people with visual impairments b Web access while keeping hands and eyes free for other things (e.g. driving).

4 4 What is a Voice Browser? b Mobile Web b Naturalistic dialogs with Web-based services.

5 5 Motivation b Far more people today have access to a telephone than have access to a computer with an Internet connection. b Many of us already have, or soon will have, a mobile phone within reach wherever we go.

6 6 Motivation b Easy to use, even for people with no knowledge of computers or with a fear of them. b Voice interaction can escape the physical limitations on keypads and displays as mobile devices become ever smaller.

7 7 Motivation b Many companies already offer services over the phone via menus traversed using the phone's keypad. b Voice Browsers are the next generation of call centers, which will become Voice Web portals to the company's services and related websites, whether accessed via the telephone network or via the Internet.

8 8 Motivation b Disadvantages of existing methods: WAP (cellular phones, Palm Pilots) –Small screens –Access speed –Limited or fragmented availability –Awkward input –Price –Lack of user habit

9 9 Differences Between Graphical & Voice Browsing b Graphical browsing is more passive due to the persistence of the visual information. b Voice browsing is more active since the user has to issue commands. b Graphical Browsers are client-based, whereas Voice Browsers are server-based. The leading role is turned over to the USER.

10 10 Possible Applications b Accessing business information: –The corporate "front desk" which asks callers who or what they want –Automated telephone ordering services –Support desks –Order tracking –Airline arrival and departure information –Cinema and theater booking services –Home banking services

11 11 Possible Applications (2) b Accessing public information: –Community information such as weather, traffic conditions, school closures, directions and events –Local, national and international news –National and international stock market information –Business and e-commerce transactions

12 12 Possible Applications (3) b Accessing personal information: –Voice mail –Calendars, address and telephone lists –Personal horoscope –Personal newsletter –To-do lists, shopping lists, and calorie counters

13 13 Advancing Towards Voice b Until now, speech recognition and synthesis technologies had to be handcrafted into applications. b Voice Browsers intend the voice technologies to be handcrafted directly into web servers. b This demands either transforming Web content into formats better suited to the needs of voice browsing, or authoring content directly for voice browsers.

14 14 b The World Wide Web Consortium (W3C) develops interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential as a forum for information, commerce, communication, and collective understanding.

15 15 W3C Speech Interface Framework b VoiceXML b Speech Synthesis b Speech Recognition –DTMF Grammars –Speech Grammars –Stochastic (N-Gram) Language Models –Semantic Interpretation b Pronunciation Lexicon b Call Control b Voice Browser Interoperation

16 VoiceXML b VoiceXML is a dialog markup language designed for telephony applications, where users are restricted to voice and DTMF (touch tone) input. [Diagram labels: text.html, text.vxml, Web Server, Internet, Browser]

17 17 Speech Synthesis b The specification defines a markup language for prompting users via a combination of prerecorded speech, synthetic speech and music. You can select voice characteristics (name, gender and age) and the speed, volume, pitch, and emphasis. There is also provision for overriding the synthesis engine's default pronunciation.

18 Speech Recognition [Diagram labels: USER, Touch Tone, Speech, DTMF Grammars, Speech Grammars, Stochastic Language Models, Semantic Interpretation]

19 19 DTMF Grammars b Touch tone input is often used as an alternative to speech recognition. b Especially useful in noisy conditions or when the social context makes it awkward to speak. b The W3C DTMF grammar format allows authors to specify the expected sequence of digits, and to bind them to the appropriate results
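
The idea of binding expected digit sequences to results can be sketched in a few lines of Python. This is only an illustration of the concept, not the W3C DTMF grammar format itself; the menu entries and the names `DTMF_GRAMMAR` and `match_dtmf` are hypothetical.

```python
# Hypothetical key-sequence grammar; the menu entries are invented for illustration.
DTMF_GRAMMAR = {
    "1": "company_info",
    "2": "latest_news",
    "3": "place_order",
    "00": "operator",
}


def match_dtmf(digits, grammar=DTMF_GRAMMAR):
    """Return the result bound to an expected digit sequence, or None if unexpected."""
    return grammar.get(digits)


print(match_dtmf("2"))
print(match_dtmf("9"))
```

A real platform would also handle inter-digit timeouts and terminating keys, which this sketch leaves out.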

20 20 Speech Grammars b In most cases, user prompts are very carefully designed to encourage the user to answer in a form that matches context-free grammar rules. b Speech Grammars allow authors to specify rules covering the sequences of words that users are expected to say in particular contexts. These contextual clues allow the recognition engine to focus on likely utterances, improving the chances of a correct match.

21 21 Stochastic (N-Gram) Language Models b In some applications it is appropriate to use open-ended prompts ("How can I help?"). In these cases, context-free grammars are of little use. b The solution is to use a stochastic language model. Such models specify the probability that one word occurs following certain others. The probabilities are estimated from utterances collected from many users.

22 22 Semantic Interpretation b The recognition process matches an utterance to a speech grammar, building a parse tree as a byproduct. b There are two approaches to harvesting semantic results from the parse tree: 1. Annotating grammar rules with semantic interpretation tags ( ECMAScript ). 2. Representing the result in XML.

23 23 Semantic Interpretation - Example For example (1st approach), the user utterance: "I would like a medium coca cola and a large pizza with pepperoni and mushrooms." could be converted to the following semantic result: { drink: { beverage: "coke", drinksize: "medium" }, pizza: { pizzasize: "large", topping: [ "pepperoni", "mushrooms" ] } }

24 24 Pronunciation Lexicon b Application developers sometimes need the ability to tune speech engines, whether for synthesis or recognition. b W3C is developing a markup language for an open portable specification of pronunciation information using a standard phonetic alphabet. b The most commonly needed pronunciations are for proper nouns such as surnames or business names.

25 25 Call Control b Fine-grained control of speech (signal processing) resources and telephony resources in a VoiceXML telephony platform. b Will enable application developers to use markup to perform call screening, whisper call waiting, call transfer, and more. b Can be used to transfer a user from one voice browser to another on a completely different machine.

26 26 Voice Browser Interoperation b Mechanisms to transfer application state, such as a session identifier, along with the user's audio connections. –The user could start with a visual interaction on a cell phone and follow a link to switch to a VoiceXML application. –The ability to transfer a session identifier makes it possible for the Voice Browser application to pick up user preferences and other data entered into the visual application.

27 27 Voice Browser Interoperation (2) –Finally, the user could transfer from a VoiceXML application to a customer service agent. –The agent needs the ability to use their console to view information about the customer, as collected during the preceding VoiceXML application. The ability to transfer a session identifier can be used to retrieve this information from the customer database.

28 28 Voice Style Sheets? b Some extensions are proposed to HTML 4.0 and CSS2 to support voice browsing b Prerecorded content is likely to include music and different speakers. These effects can be reproduced to some extent via the aural style sheets features in CSS2.

29 29 Voice Style Sheets! Authors want control over how the document is rendered. Aural style sheets (part of CSS2) provide a basis for controlling a range of features: –Volume –Rate –Pitch –Direction –Spelling out text letter by letter –Speech fonts (male/female, adult/child etc.) –Inserted text before and after element content –Sound effects and music

30 30 How Does It Work? b How do I connect? b Do I speak to the browser or does the browser speak to me? b What is seen on the screen? b How do I enter input?

31 31 Problems b How does the browser understand what I say? b How can I tell it what I want? b … what if it doesn't understand?

32 32 Overview on Speech Technologies b Speech Synthesis –Text to Speech b Speech Recognition –Speech Grammars –Stochastic n-gram models b Semantic Interpretation

33 33 What is Speech Synthesis? b Generating machine voice by arranging phonemes (k, ch, sh, etc.) into words. b There are several algorithms for performing Speech Synthesis. The choice depends on the task they're used for.

34 34 How is Speech Synthesis Performed? b The easiest way is to just record the voice of a person speaking the desired phrases. –This is useful if only a restricted volume of phrases and sentences is used, e.g. schedule information of incoming flights. The quality depends on the way recording is done.

35 35 How is Speech Synthesis Performed? b Another option is to record a large database of words. –Requires large memory storage –Limited vocabulary –No prosodic information b More sophisticated but worse in quality are Text-To-Speech algorithms.

36 36 How is Speech Synthesis Performed? Text To Speech b Text-To-Speech algorithms split the speech into smaller pieces. The smaller the units, the fewer of them there are, but the quality also decreases. b An often used unit is the phoneme, the smallest linguistic unit. Depending on the language used, there are about 35-50 phonemes in western European languages, i.e. we need only 35-50 single recordings. Example: "february twenty fifth" becomes f eh b r ax r iy / t w eh n t iy / f ih f th

37 37 Text To Speech b The problem is that combining phonemes into fluent speech requires smooth transitions between the elements, which separately recorded phonemes do not provide. The intelligibility is therefore lower, but the memory required is small. b A solution is using diphones. Instead of splitting at the transitions, the cut is done at the center of the phonemes, leaving the transitions themselves intact.

38 38 Text To Speech b This means there are now approximately 1600 recordings needed (40*40). b The longer the units become, the more elements there are, but the quality increases along with the memory required.
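
The 40*40 figure can be checked with a small calculation. The sketch below assumes every ordered pair of phonemes can occur as a diphone, which is an upper-bound simplification; the function name `inventory_sizes` is made up for this example.

```python
def inventory_sizes(num_phonemes):
    """Return (phoneme recordings, diphone recordings) for a given phoneme count.

    Assumes every ordered phoneme pair is a possible diphone (an upper bound).
    """
    return num_phonemes, num_phonemes * num_phonemes


# The 35-50 range quoted for western European languages, plus the 40 used on the slide.
for n in (35, 40, 50):
    phonemes, diphones = inventory_sizes(n)
    print(f"{n} phonemes: {phonemes} phoneme units, up to {diphones} diphone units")
```

The quadratic growth is the trade-off the slide describes: longer units mean more recordings and more memory in exchange for better quality.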

39 39 Text To Speech b Other units which are widely used are half-syllables, syllables, words, or combinations of them, e.g. word stems and inflectional endings. b TTS is dictionary-driven. The larger the dictionary resident in the browser is, the better the quality. b For unknown words, TTS falls back on rules for regular pronunciation.

40 40 Text To Speech b Vocabulary is unlimited! b But what about the prosodic information? –Pronunciation depends on the context in which a word occurs. Limited linguistic analysis is needed. –"How can I help?" –"Help is on the way!"

41 41 Text To Speech –Another example: –"I have read the first chapter." –"I will read some more after lunch." b For these cases, and in the cases of irregular words and name pronunciation, authors need a way to provide supplementary TTS information and to indicate when it applies.

42 42 Text To Speech b But specialized representations for phonemic and prosodic information can be off-putting for non-specialist users. b For this reason it is common to see simplified ways to write down pronunciation. For instance, the word "station" can be defined as: station: stay-shun
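
A dictionary-driven lookup with a rule-based fallback can be sketched as follows. Only the "station" entry comes from the slides; the fallback here simply spells the word out letter by letter as a stand-in for real letter-to-sound rules, and the names `LEXICON`, `spell_out` and `pronounce` are hypothetical.

```python
LEXICON = {"station": "stay-shun"}  # entry taken from the slide


def spell_out(word):
    """Naive stand-in for letter-to-sound rules: say the word letter by letter."""
    return "-".join(word)


def pronounce(word, lexicon=LEXICON):
    """Use the author-supplied pronunciation if present, else fall back on rules."""
    key = word.lower()
    if key in lexicon:
        return lexicon[key]
    return spell_out(key)


print(pronounce("Station"))
print(pronounce("cat"))
```

Context-dependent words such as "read" would need a richer key than the bare spelling, which this sketch does not attempt.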

43 43 Text To Speech b This approach encourages users to add pronunciation information, leading to an increase in the quality of spoken documents, compared to more complex and harder to learn approaches. b This is where W3C comes in: –Providing a specification to enable consistent control (generating, authoring, processing) of voice output by speech synthesizers for varying speech content, for use in voice browsing and in other contexts.

44 44 Overview on Speech Technologies ✓ Speech Synthesis ✓ Text to Speech b Speech Recognition –Speech Grammars –Stochastic n-gram models b Semantic Interpretation

45 45 Speech Recognition

46 46 Speech Recognition

47 47 Speech Recognition

48 48 Speech Recognition

49 49 Speech Recognition b Automatic speech recognition is the process by which a computer maps an acoustic speech signal to text. b Speech is first digitized and then matched against a dictionary of coded waveforms. The matches are converted into text.

50 50 Speech Recognition Types of voice recognition applications: b Command systems recognize a few hundred words and eliminate using the mouse or keyboard for repetitive commands. b Discrete voice recognition systems are used for dictation, but require a pause between each word. b Continuous voice recognition understands natural speech without pauses and is the most process intensive.

51 51 Speech Recognition b A speaker dependent system is developed to operate for a single speaker. b These systems are usually easier to develop, cheaper to buy and more accurate, but not as flexible as speaker adaptive or speaker independent systems.

52 52 Speech Recognition b A speaker independent system is developed to operate for any speaker of a particular type (e.g. American English). b These systems are the most difficult to develop, most expensive and accuracy is lower than speaker dependent systems. However, they are more flexible.

53 53 Speech Recognition b A speaker adaptive system is developed to adapt its operation to the characteristics of new speakers. Its difficulty lies somewhere between speaker independent and speaker dependent systems.

54 54 Speech Recognition b Speech recognition technologies today are highly advanced. b There is a huge gap between the ability to recognize speech and the ability to interpret speech.

55 55 How is Speech Recognition Performed? b Speech recognition technology involves complex statistical models that characterize the properties of sounds, taking into account factors such as male vs. female voices, accents, speaking rate, background noise, etc. b The process of speech recognition includes 5 stages: 1. Capture and digital sampling 2. Spectral representation and analysis 3. Segmentation. 4. Phonetic Modeling 5. Search and Match

56 56 How is Speech Recognition Performed? b Speech Grammars b HMM (Hidden Markov Modelling) b DTW (Dynamic Time Warping) b NNs (Neural Networks) b Expert systems b Combinations of techniques. HMM-based systems are currently the most commonly used and most successful approach.

57 57 Speech Grammars b The grammar allows a speech application to indicate to a recognizer what it should listen for, specifically: –Words that may be spoken, –Patterns in which those words may occur, –Language of the spoken words.

58 58 Speech Grammars b In simple speech recognition/speech understanding systems, the expected input sentences are often modeled by a strict grammar (such as a CFG). b In this case, the user is only allowed to utter those sentences that are explicitly covered by the grammar. –Good for menus, form filling, ordering services, etc.
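
How a strict grammar limits what the user may say can be illustrated with a toy, non-recursive grammar for pizza orders. The rules and vocabulary below are invented for this sketch and are not taken from any W3C grammar format; because the grammar is finite, the example simply enumerates every sentence it covers.

```python
from itertools import product

# Hypothetical toy grammar: upper-case keys are rules, anything else is a word.
GRAMMAR = {
    "ORDER": [["SIZE", "pizza"], ["SIZE", "pizza", "with", "TOPPINGS"]],
    "SIZE": [["small"], ["big"], ["very", "big"]],
    "TOPPINGS": [["TOPPING"], ["TOPPING", "and", "TOPPING"]],
    "TOPPING": [["pepperoni"], ["mushrooms"], ["olives"]],
}


def expand(symbol):
    """Yield every word sequence a symbol can produce (grammar must be non-recursive)."""
    if symbol not in GRAMMAR:
        yield [symbol]
        return
    for alternative in GRAMMAR[symbol]:
        # Combine every expansion of each symbol in the alternative, in order.
        for parts in product(*(list(expand(s)) for s in alternative)):
            yield [word for part in parts for word in part]


SENTENCES = {" ".join(words) for words in expand("ORDER")}


def accepts(utterance):
    """True only if the utterance is explicitly covered by the grammar."""
    return utterance.lower().strip() in SENTENCES


print(accepts("very big pizza with pepperoni and mushrooms"))
print(accepts("I want a pizza"))
```

The second utterance is perfectly reasonable English but falls outside the grammar, which is exactly the limitation discussed on the following slides.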

59 59 Speech Grammars b Experience shows that a context-free grammar of reasonable complexity can never foresee all the different sentence patterns users come up with in spontaneous speech input. b This approach is therefore not sufficient for robust speech recognition/understanding tasks or free text input applications such as dictation.

60 60 For Example b Possible answers to a question may be "Yes" or "No", but it could also be any other word used for negative or positive response. It could be "Ya," "you betch'ya," "sure," "of course" and many other expressions. It is necessary to feed the speech recognition engine with likely utterances representing the desired response.

61 61 Speech Grammars b What is done? –Beta and Pilot versions –Upgrade versions

62 62 Speech Grammars - Example [Figure: grammar diagram; surviving label: very]

63 63 Speech Grammars - Example [Figure: grammar diagram; surviving labels: very, big, pizza, with, and]

64 64 Hidden Markov Model Notations: –$T$ = observation sequence length –$O = \{o_1, o_2, \ldots, o_T\}$ = observation sequence –$N$ = number of states (we either know or guess) –$Q = \{q_1, \ldots, q_N\}$ = finite set of possible states –$M$ = number of possible observations –$V = \{v_1, v_2, \ldots, v_M\}$ = finite set of possible observations –$X_t$ = state at time $t$ (state variable)

65 65 Hidden Markov Model Distributional parameters b $A = \{a_{ij}\}$ where $a_{ij} = P(X_{t+1} = q_j \mid X_t = q_i)$ (transition probabilities) b $B = \{b_i(k)\}$ where $b_i(k) = P(O_t = v_k \mid X_t = q_i)$ (observation probabilities) b $\pi = \{\pi_i\}$ where $\pi_i = P(X_0 = q_i)$ (initial state distribution)

66 66 Hidden Markov Model Definitions b A Hidden Markov Model (HMM) is a five-tuple $(Q, V, A, B, \pi)$. b Let $\lambda = \{A, B, \pi\}$ denote the parameters for a given HMM with fixed $Q$ and $V$.

67 67 Hidden Markov Model Problems 1. Find $P(O \mid \lambda)$, the probability of the observations given the model. 2. Find the most likely state trajectory $X = \{x_1, x_2, \ldots, x_T\}$ given the model and observations (find $X$ so that $P(O, X \mid \lambda)$ is maximized). 3. Adjust the parameters $\lambda$ to maximize $P(O \mid \lambda)$.
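
The first two problems have standard dynamic-programming solutions, the forward algorithm and the Viterbi algorithm, sketched below in plain Python. The two-state weather model and all of its probabilities are invented for illustration, and the code assumes the first state emits the first observation, which is one common indexing convention. Problem 3 (parameter re-estimation) is not covered.

```python
# Hypothetical toy model: hidden weather states, observed umbrella use.
states = ["rain", "sun"]
pi = {"rain": 0.5, "sun": 0.5}                       # initial state distribution
A = {"rain": {"rain": 0.7, "sun": 0.3},               # transition probabilities
     "sun": {"rain": 0.3, "sun": 0.7}}
B = {"rain": {"umbrella": 0.9, "none": 0.1},          # observation probabilities
     "sun": {"umbrella": 0.2, "none": 0.8}}
obs = ["umbrella", "umbrella", "none"]


def forward(obs, states, pi, A, B):
    """Problem 1: probability of the observations given the model."""
    alpha = {s: pi[s] * B[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {j: sum(alpha[i] * A[i][j] for i in states) * B[j][o]
                 for j in states}
    return sum(alpha.values())


def viterbi(obs, states, pi, A, B):
    """Problem 2: most likely state sequence and its joint probability."""
    delta = {s: pi[s] * B[s][obs[0]] for s in states}
    paths = {s: [s] for s in states}
    for o in obs[1:]:
        new_delta, new_paths = {}, {}
        for j in states:
            # Best predecessor for state j at this time step.
            best = max(states, key=lambda i: delta[i] * A[i][j])
            new_delta[j] = delta[best] * A[best][j] * B[j][o]
            new_paths[j] = paths[best] + [j]
        delta, paths = new_delta, new_paths
    last = max(states, key=lambda s: delta[s])
    return paths[last], delta[last]


print(forward(obs, states, pi, A, B))
print(viterbi(obs, states, pi, A, B))
```

Both functions do work proportional to the sequence length times the square of the number of states, instead of enumerating every possible state sequence.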

68 68 Language Models b A language model is a probability distribution over word sequences: –$P(\text{"And nothing but the truth"}) \approx 0.001$ –$P(\text{"And nuts sing on the roof"}) \approx 0$

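69 69 The Equation Notation: $W' = \arg\max_W P(O \mid W)\, P(W)$, where $O$ is the observation sequence and $W$ a candidate word sequence. This follows from Bayes' rule: $P(W \mid O) = P(O \mid W)\, P(W) / P(O)$, and $P(O)$ does not depend on $W$.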

70 70 The N-Gram (Markovian) Language Model b Hard to compute $P(W)$ directly, e.g. $P(\text{"And nothing but the truth"})$ b Step 1: Decompose the probability: $P(\text{"And nothing but the truth"}) = P(\text{"And"}) \times P(\text{"nothing"} \mid \text{"and"}) \times P(\text{"but"} \mid \text{"and nothing"}) \times P(\text{"the"} \mid \text{"and nothing but"}) \times P(\text{"truth"} \mid \text{"and nothing but the"})$

71 71 The Trigram Approximation b Assume each word depends only on the previous two words (three words total; "tri" means three, "gram" means writing): –$P(\text{"the"} \mid \text{"… whole truth and nothing but"}) \approx P(\text{"the"} \mid \text{"nothing but"})$ –$P(\text{"truth"} \mid \text{"… whole truth and nothing but the"}) \approx P(\text{"truth"} \mid \text{"but the"})$

72 72 N-Gram - The Markovian Model b The Markovian state machine is an automaton with statistical weights. b A state represents a phoneme, diphone or word. b We do not include all options, but only those which are related to the context or subject. b We calculate all probable paths from beginning to end of phrase/word and return the one with the maximum probability.

73 73 Back to Trigrams b How do we find the probabilities? b Get real text, and start counting! $P(\text{"the"} \mid \text{"nothing but"}) \approx \dfrac{\text{Count}(\text{"nothing but the"})}{\text{Count}(\text{"nothing but"})}$
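
The counting recipe can be sketched directly. The three-sentence corpus below is made up for illustration, and this minimal version counts a two-word context only where a third word follows it, so the estimates for each context sum to one; no smoothing for unseen trigrams is attempted.

```python
from collections import Counter

# Hypothetical toy corpus; a real model would be trained on large texts.
CORPUS = [
    "the truth the whole truth and nothing but the truth",
    "nothing but trouble",
    "and nothing but the best",
]


def train_trigrams(sentences):
    """Count trigrams and the two-word contexts that precede a third word."""
    trigrams, contexts = Counter(), Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - 2):
            contexts[tuple(words[i:i + 2])] += 1
            trigrams[tuple(words[i:i + 3])] += 1
    return trigrams, contexts


def trigram_prob(trigrams, contexts, w1, w2, w3):
    """Estimate P(w3 | w1 w2) as Count(w1 w2 w3) / Count(w1 w2)."""
    seen = contexts[(w1, w2)]
    return trigrams[(w1, w2, w3)] / seen if seen else 0.0


trigrams, contexts = train_trigrams(CORPUS)
print(trigram_prob(trigrams, contexts, "nothing", "but", "the"))
```

In this corpus "nothing but" is followed by "the" in two of its three occurrences, so the estimate is two thirds.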

74 74 N-grams b Why stop at 3-grams? b If $P(z \mid \ldots rstuvwxy) \approx P(z \mid xy)$ is good, then $P(z \mid \ldots rstuvwxy) \approx P(z \mid vwxy)$ is better! b 4-gram, 5-gram start to become expensive...

75 75 The N-Gram (Markovian) Language Model - Summary b N-Gram language models are used in large vocabulary speech recognition systems to provide the recognizer with an a-priori likelihood P(W) of a given word sequence W. b The N-Gram language model is usually derived from large training texts that share the same language characteristics as expected input.

76 76 Combining Speech Grammars and N-Gram Models –Using an N-Gram model in the recognizer and a CFG in a (separate) understanding component –Integrating special N-Gram rules at various levels in a CFG to allow for flexible input in specific contexts –Using a CFG to model the structure of phrases (e.g. numeric expressions) that are incorporated in a higher-level N-Gram model (class N-Grams)

77 77 Overview on Speech Technologies ✓ Speech Synthesis ✓ Text to Speech ✓ Speech Recognition ✓ Speech Grammars ✓ Stochastic n-gram models b Semantic Interpretation

78 78 Semantic Interpretation b We have recognized the phrases and words, what now? Problems b What does the user mean? b We have the right keywords, but the phrase is meaningless or unclear.

79 79 Semantic Interpretation b As stated before, the technologies of speech recognition exceed those of interpretation. b Most interpreters are based on key words. –Sometimes this is not good enough!
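
A keyword-based interpreter of the kind mentioned here can be sketched on the pizza order from the earlier semantic interpretation example. This is a plain keyword-spotting stand-in, not the ECMAScript tag mechanism; the vocabularies and the function name `interpret` are assumptions made for the example.

```python
# Hypothetical keyword tables for the ordering domain.
SIZES = {"small", "medium", "large"}
DRINKS = {"coca cola": "coke", "coke": "coke", "milk": "milk"}
TOPPINGS = {"pepperoni", "mushrooms", "olives"}


def interpret(utterance):
    """Scan for keywords and attach the most recent size word to the next item."""
    words = utterance.lower().replace(",", " ").replace(".", " ").split()
    result = {}
    last_size = None
    i = 0
    while i < len(words):
        word = words[i]
        pair = " ".join(words[i:i + 2])
        if word in SIZES:
            last_size = word
        elif pair in DRINKS or word in DRINKS:
            key = pair if pair in DRINKS else word
            result["drink"] = {"beverage": DRINKS[key], "drinksize": last_size}
            last_size = None
            if key == pair:
                i += 1  # skip the second word of a two-word drink name
        elif word == "pizza":
            result["pizza"] = {"pizzasize": last_size, "topping": []}
            last_size = None
        elif word in TOPPINGS and "pizza" in result:
            result["pizza"]["topping"].append(word)
        i += 1
    return result


print(interpret("I would like a medium coca cola and a large pizza "
                "with pepperoni and mushrooms."))
```

The weakness is easy to provoke: "a pizza, but no mushrooms" would still add mushrooms, because the keywords match while the meaning of the phrase is ignored.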

80 80 Back To Voice Browsers Making the Web accessible to more of us, more of the time. Personal Browser Demo Personal Browser Demo b Now we’ll talk about voiceXML, navigation and various problems

81 81 VoiceXML - Example 1 Hello World! b The top-level element is <vxml>, which is mainly a container for dialogs. There are two types of dialogs: forms and menus. Forms present information and gather input; menus offer choices of what to do next.

82 82 VoiceXML - Example 1 Hello World! b This example has a single form, which contains a block that synthesizes and presents "Hello World!" to the user. Since the form does not specify a successor dialog, the conversation ends.

83 83 VoiceXML - Example 2 b Our second example asks the user for a choice of drink and then submits it to a server script. The prompt is: "Would you like coffee, tea, milk, or nothing?" b A field is an input field. The user must provide a value for the field before proceeding to the next element in the form.

84 84 VoiceXML - Example 2 b A sample interaction is: C (computer): Would you like coffee, tea, milk, or nothing? H (human): Orange juice. C: I did not understand what you said. (a platform-specific default message.) C: Would you like coffee, tea, milk, or nothing? H: Tea C: (continues in document drink2.asp)
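
The control flow behind this interaction (prompt, no-match message, re-prompt until the field is filled) can be simulated in Python. This is a sketch of the behaviour only, not of how a VoiceXML interpreter is implemented; `run_field` and its parameters are hypothetical, and speech recognition is replaced by a list of typed answers.

```python
PROMPT = "Would you like coffee, tea, milk, or nothing?"
OPTIONS = {"coffee", "tea", "milk", "nothing"}


def run_field(prompt, options, answers,
              nomatch="I did not understand what you said."):
    """Prompt until an answer matches one of the options.

    Returns (value, transcript); value is None if the answers run out.
    """
    transcript = []
    for answer in answers:
        transcript.append(("C", prompt))
        transcript.append(("H", answer))
        value = answer.strip().lower().rstrip(".")
        if value in options:
            return value, transcript
        transcript.append(("C", nomatch))
    return None, transcript


value, transcript = run_field(PROMPT, OPTIONS, ["Orange juice.", "Tea"])
for speaker, line in transcript:
    print(f"{speaker}: {line}")
print("field value:", value)
```

Once the field holds a value, the dialog would go on to submit it to the server script, as in the last line of the sample interaction.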

85 VoiceXML - Architectural Model [Diagram label: Web Server] The VoiceXML interpreter context may listen for a special escape phrase that takes the user to a high-level personal assistant, or for escape phrases that alter user preferences like volume or text-to-speech characteristics. The implementation platform generates events in response to user actions (e.g. spoken or character input received, disconnect) and system events (e.g. timer expiration).

86 86 Scope of VoiceXML b Output of synthesized speech (TTS) b Output of audio files. b Recognition of spoken input. b Recognition of DTMF input. b Recording of spoken input. b Control of dialog flow. b Telephony features such as call transfer and disconnect. The language provides means for collecting character and/or spoken input, assigning the input to document-defined request variables, and making decisions that affect the interpretation of documents written in the language. A document may be linked to other documents through Universal Resource Identifiers (URIs).

87 87 VoiceXML b Voice XML is intended to be analogous to graphical surfing. b There are limitations. b Excellent for menu applications. b Awkward for open dialog applications b There are other languages: VoXML, omniviewXML

88 88 Navigation b The user might be able to speak the word "follow" when she hears a hypertext link she wishes to follow. b The user could also interrupt the browser to request a short list of the relevant links.

89 89 Navigation example User: links? Browser: The links are: 1 company info 2 latest news 3 placing an order 4 search for product details. Please say the number now. User: 2 Browser: Retrieving latest news...
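
The numbered-link exchange can be sketched as two small functions: one that reads the links out, and one that resolves the spoken number. The function names are invented for this example, and a selection outside the presented range is reported as `None` so the caller can raise a selection-error event.

```python
LINKS = ["company info", "latest news", "placing an order",
         "search for product details"]


def list_links(links):
    """Build the spoken list of numbered links."""
    items = " ".join(f"{n} {caption}" for n, caption in enumerate(links, start=1))
    return f"The links are: {items} Please say the number now"


def choose_link(links, spoken):
    """Return the chosen caption, or None for non-numeric or out-of-range input."""
    try:
        number = int(spoken.strip())
    except ValueError:
        return None
    if 1 <= number <= len(links):
        return links[number - 1]
    return None


print(list_links(LINKS))
print("Retrieving", choose_link(LINKS, "2"), "...")
```

Keeping the list short matters more than in a graphical browser, since the user has to listen to every entry before choosing.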

90 90 Navigation through Headings b Another command could be used to request a list of the document's headings. This would allow users to browse an outline form of the document as a means to get to the section that interests them.

91 91 Navigation to Specific URLs b Graphical Browsers allow entering a wanted URL in the browser window b How is this supported in Voice Browsers? b Think: What problems do you anticipate? –Will we be able to transfer from any voice portal to any other? –How do we know where to go?

92 92 How Slow / Fast ? b If voice browsers are meant to replace human operator dialog, they must be fast in response. b Speech Recognition / Interpretation / Synthesis depend on implementation b When a user requests a certain document, several related documents can be downloaded for easier access.

93 93 Friendly vs. Annoying b How friendly do you want the service to be? b Friendly is sometimes time consuming. b What percentage of the time does the user talk and what percentage of the time is he listening? b What parameters can I control?

94 94 Voice and Graphics b Can I access the Voice Browser through my computer? –Some sites are authored only for voice. –Some will be for both. This leads to more difficulties which must be dealt with.

95 95 Inserted text b When a hypertext link is spoken by a speech synthesizer, the author may wish to insert text before and after the link's caption, to guide the user's response. b For example, a link with the caption "Driving instruction" may be offered by the voice browser using the following words: "For driving instructions press 1"

96 96 Inserted text b The words "For" and "press 1" were added to the text embedded in the anchor element. b At first glance it looks as if this 'wrapper' text should be left for the voice browser to generate, but on further examination you can easily find problems with this approach.

97 97 Inserted text b For example, the inserted text for the following element cannot be "For": Leave us a message. We need to say: "To leave us a message, press 5"

98 98 Inserted text b The CSS2 draft specification includes the means to provide "generated text" before and after element content. b For example: <A accesskey="5" style='cue-before: "To"; cue-after: ", press 5"' href=LeaveMessage.html>Leave us a message</A>
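
How a voice browser might assemble the spoken form of such a link from the author-supplied text can be sketched as below. The function `speak_link` is hypothetical and does no CSS parsing; it only shows the before/caption/after concatenation using the values from the slide.

```python
def speak_link(caption, before="", after=""):
    """Join the inserted text and the link caption into one spoken phrase."""
    spoken = f"{before} {caption}" if before else caption
    return spoken + after


print(speak_link("Leave us a message", before="To", after=", press 5"))
```

Because the author chooses the wrapper words per link, "To ... , press 5" and "For ... press 1" can coexist in the same document, which a single browser-generated template could not achieve.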

99 99 Handling Errors and Ambiguities b Users might easily enter unexpected or ambiguous input, or just pause, providing no input at all. b Some examples of errors which might generate events: –When presented with a numbered list of links, the user enters a number that is outside the range presented. –The phrase uttered by the user matches more than one template rule.

100 100 Handling Errors and Ambiguities –The phrase or sound uttered doesn't match a known command. –The user loses track and the browser needs to time out and offer assistance. –"Um"s and "Err"s b Authors will have control over the browser response to selection errors and timeouts. b Other errors might be dealt with by the browser or platform.

101 101 Some Nice Demos b Email assistant demo b Bank service demo (cough, ambiguity) b Financial Center Demo ("um"s) b Telectronics Demo

102 102 Who has implemented VoiceXML interpreters? b BeVocal Café b General Magic b HeyAnita's FreeSpeech Developer Network b IBM Voice Server SDK Beta Program, based on VoiceXML Version 1.0 b Motorola's Mobile Application Development Toolkit (MADK)

103 103 Who has implemented VoiceXML interpreters? b Nuance Developer Network b Open VXI VoiceXML interpreter b PIPEBEACH's speechWeb b Telera's DeVXchange b Tellme Studio b VoiceGenie

