
Mobile Multimodal Applications. Dr. Roman Englert, Gregor Glass, March 23rd, 2006.


1 Mobile Multimodal Applications. Dr. Roman Englert, Gregor Glass, March 23rd, 2006

2 Agenda:
- Motivation
- Potentials through Multimodality
- Use Cases: Map & Sound Logo
- Components & Modules for Mobile Multimodal Interaction
- User Perspective
- Challenges

3 Multimodality. Motivation.
Moore's Law (growth of technology): Technical capability will double approximately every 18 months.
Buxton's Law (growth of functionality): Technology designers promise functionality proportional to Moore's Law.
God's Law (growth of human capability, the complexity barrier): Human capacity is limited and does not increase over time! (Bill Buxton)
The challenge is how to deliver more functionality without breaking through the complexity barrier and making the systems so cumbersome as to be completely unusable.

4 Multimodality. Potentials through MMI.

5 Multimodality – New User Interfaces. Composite Usage Scenario: Map.
Example: The user selects a point of interest by clicking with a stylus while speaking, in order to focus it: "Zoom in here."

6 Multimodality – New User Interfaces. Composite Usage Scenario: SoundLogo.
Example: The user selects a sound logo by clicking on its title with a stylus while speaking, in order to hear it: "Play this sound logo."
(SoundLogo = personalized call connect signal.)

7 Multimodality – New User Interfaces. Components of a multimodal end-to-end connection.
Architecture layers: User – Client – Internet / Services (Server) – Content Back-End
- Input: voice, stylus, gesture, …
- Output: voice, text, graphics, video, …
- Types of multimodality: sequential, parallel
- Server components: dialog management, synchronisation management of the voice and data channels, media resource management (ASR/TTS)
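As a rough, purely illustrative sketch of this split (all class and field names here are assumptions, not from the slide), the client carries the input/output modes while the server hosts the management components:

from dataclasses import dataclass

@dataclass
class UserInput:
    mode: str      # "voice", "stylus", "gesture", ...
    payload: str   # e.g. an utterance or screen coordinates

class Server:
    """Stands in for dialog management, synchronisation management,
    and media resource management (ASR/TTS) on the server side."""
    def handle(self, inputs: list[UserInput]) -> str:
        # Dialog management would decide the next application step here;
        # ASR/TTS would convert between audio and text.
        return "response to " + ", ".join(f"{i.mode}:{i.payload}" for i in inputs)

# Sequential use sends one input at a time; parallel (composite) use several:
print(Server().handle([UserInput("voice", "zoom in here"),
                       UserInput("stylus", "17,54")]))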

8 Multimodality – New User Interfaces. Main modules for parallel interaction.
- Recognition: grammar-based recognizers for speech, ink, etc.; system-generated input; mouse/keyboard
- Interpretation: semantic interpretation of each recognized input
- Integration: integration processor and interaction manager combine the per-mode interpretations; results are exchanged as EMMA
- Context: system and environment, application functions, session component
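A minimal sketch of how these three stages could be chained, assuming invented function names and toy data (the slide names only the stages and the EMMA format passed between them):

# Illustrative pipeline for the three stages named above; all function
# names and data shapes are assumptions, not from the slides.

def recognize(raw: dict) -> dict:
    # Recognition: e.g. speech via a grammar, ink via a gesture recognizer.
    return {"mode": raw["mode"], "tokens": raw["signal"].split()}

def interpret(recognized: dict) -> dict:
    # Interpretation: map tokens to an application-level meaning
    # (the semantic interpretation that would be serialized as EMMA).
    if recognized["mode"] == "speech" and "zoom" in recognized["tokens"]:
        return {"action": "zoom_in"}
    if recognized["mode"] == "ink":
        x, y = recognized["tokens"]
        return {"point": (int(x), int(y))}
    return {}

def integrate(interpretations: list[dict]) -> dict:
    # Integration: the integration processor merges the per-mode
    # interpretations into one combined result for the interaction manager.
    combined: dict = {}
    for part in interpretations:
        combined.update(part)
    return combined

speech = recognize({"mode": "speech", "signal": "zoom in here"})
ink = recognize({"mode": "ink", "signal": "17 54"})
print(integrate([interpret(speech), interpret(ink)]))
# -> {'action': 'zoom_in', 'point': (17, 54)}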

9 Multimodality – New User Interfaces. User Perspective.
Feedback in a nutshell from diverse previous innovation projects ("Give us speech control"), based on a composite-interaction prototype for customer self-service with 2 campaigns (SMS & personalized call connect signal):
- Need to actively communicate the possibilities & advantages of the new multimodal interaction paradigm to users
- Real appreciation of speech control & good acceptance of "push-to-talk" mode
- Expectation: symmetry & consistency between the interaction modes. BUT: How do users really want to speak to the machine? How to provide feedback? How to correct input errors?
- Great for context-dependent service interaction. BUT: Which mode is most suitable for which task? For whom? Under which circumstances?

10 Multimodality – New User Interfaces. Challenges:
- Sequential vs. parallel I/O
- Unique interpretation of multimodal hypotheses
- Discourse phenomena such as anaphora resolution and generation
- Input correction loops
- Encapsulation of I/O tools to achieve a generic front end
- Model Driven Architecture

11 Thank you for your attention!

12 Multimodality – New User Interfaces. Sequential and Parallel Input.
Sequential input:
- Multimodal applications may let the user choose between different input modalities, e.g. to speak or to click on a button
- Only one input channel is interpreted at a time, i.e. the user may speak or click on a button
- Multiple input channels are interpreted sequentially, as defined by the application
- Example: select a field and then speak "My number is …"; then click only on a button; afterwards, navigate "Back to main menu"
Parallel input (also known as composite input):
- Multimodal applications allow multiple input modes to be used at nearly the same time, e.g. the user may speak and tap on the screen
- The multimodal application combines the multiple inputs and interprets them together
- Example: the user navigates in a map and speaks "Zoom in here"
=> Parallel input needs additional platform or application capabilities in order to combine (integrate) and interpret multiple inputs; a sketch of one such capability follows.
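One common way to provide that capability, sketched here with assumed names and an assumed 1.5-second window (the slides do not specify concrete values), is to group inputs that arrive close together in time:

from dataclasses import dataclass

WINDOW_SECONDS = 1.5  # assumed value, not from the slides

@dataclass
class InputEvent:
    mode: str         # "speech" or "stylus"
    timestamp: float  # seconds since some reference point
    payload: str

def group_composite(events: list[InputEvent]) -> list[list[InputEvent]]:
    # Events within the window of the previous event join its group;
    # singleton groups are sequential input, larger groups composite.
    events = sorted(events, key=lambda e: e.timestamp)
    groups: list[list[InputEvent]] = []
    for event in events:
        if groups and event.timestamp - groups[-1][-1].timestamp <= WINDOW_SECONDS:
            groups[-1].append(event)
        else:
            groups.append([event])
    return groups

stream = [InputEvent("speech", 10.0, "zoom in here"),
          InputEvent("stylus", 10.4, "17,54"),
          InputEvent("stylus", 20.0, "back to main menu")]
print([[e.mode for e in g] for g in group_composite(stream)])
# -> [['speech', 'stylus'], ['stylus']]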

13 Multimodality – New User Interfaces. Example: Composite input for voice and stylus.
The user speaks and clicks on the screen: "Zoom in."
(Recognition → Interpretation → Integration, as in the module diagram on slide 8.)

14 Multimodality – New User Interfaces. Example: Composite input for voice and stylus.
Semantic interpretation of the speech input: action = zoom_in, location = (x, y) from stylus.
Interpretation: zoom_in

15 Multimodality – New User Interfaces. Example: Composite input for voice and stylus.
The user clicks on the map while speaking: x = 17, y = 54.

16 Multimodality – New User Interfaces. Example: Composite input for voice and stylus.
Interpretation of the stylus input: point 17 54

17 Multimodality – New User Interfaces. Example: Composite input for voice and stylus.
Integration of both interpretations into one EMMA result:

<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation>
    <!-- the element names inside the interpretation were lost in
         extraction; <action> and <point> are reconstructions -->
    <action>zoom_in</action>
    <point>17 54</point>
  </emma:interpretation>
</emma:emma>
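For illustration, such an EMMA result could be read with a standard XML parser; a minimal sketch, assuming the reconstructed <action> and <point> element names above:

# Minimal sketch of reading the integration result; the <action>/<point>
# element names follow the reconstruction above and are assumptions.
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"

doc = """<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation>
    <action>zoom_in</action>
    <point>17 54</point>
  </emma:interpretation>
</emma:emma>"""

root = ET.fromstring(doc)
interp = root.find(f"{{{EMMA_NS}}}interpretation")
action = interp.find("action").text
x, y = (int(v) for v in interp.find("point").text.split())
print(action, x, y)  # -> zoom_in 17 54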

18 Multimodality – New User Interfaces. Methods and functionalities: Interaction manager.
Interaction manager (application-specific tasks):
- Checking the input data: integrated input? speech only? ink/stylus only?
- Checking the suitability of the integration results: is the input data compatible? (e.g. does the actual number of stylus inputs, say 2, match the expected value?)
- Mapping recognition results from different modalities, e.g.:
  - speech recognition error but stylus correct
  - speech recognition OK but stylus incorrect
  - confidence OK and stylus OK
- Deciding on the error-handling output: graphical, audio, prompt, TTS
- Handling redundant information and creating the related user reaction: prioritisation of the input modalities
A sketch of the mapping step follows.
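A minimal sketch of that mapping and error-handling decision, with an assumed confidence threshold and invented names (the slide lists the cases but not concrete logic):

# Sketch of the interaction manager's mapping/error-handling decision.
# The confidence threshold and all names are illustrative assumptions.
CONFIDENCE_THRESHOLD = 0.6

def decide(speech_conf: float, stylus_ok: bool) -> str:
    speech_ok = speech_conf >= CONFIDENCE_THRESHOLD
    if speech_ok and stylus_ok:
        return "execute integrated command"
    if not speech_ok and stylus_ok:
        return "reprompt speech (e.g. TTS: 'Please repeat the command')"
    if speech_ok and not stylus_ok:
        return "reprompt stylus (e.g. highlight the map graphically)"
    return "restart input with a combined graphical + audio prompt"

print(decide(0.9, True))   # confidence OK and stylus OK
print(decide(0.3, True))   # speech recognition error but stylus correct
print(decide(0.9, False))  # speech OK but stylus incorrect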


