Didier Perroud, Raynald Seydoux, Frédéric Barras
Abstract
Objectives
Modalities
◦ Project modalities
◦ CASE/CARE
Implementation
◦ VICI, iPhone, Voice recognition, Network
Demonstration
Conclusion
Coordination between two persons to move a ball through a labyrinth
Rotation possible about the x and y axes
Gates can be opened with vocal and gestural commands
Coordinate the following technologies:
◦ Augmented reality with tags
◦ Gesture detection (with iPhone accelerometers)
◦ Voice recognition (words)
◦ Collaborative environments
◦ Physics engine
Inputs
◦ Hand rotation about the x and y axes (one axis per player): direct manipulation of the labyrinth board
◦ Hand pumping to open gates
◦ Voice recognition (words) to select the gate to open and to start the game
Outputs
◦ Image on the projector
◦ iPhone vibrations
CASE
◦ Semantic level of abstraction
CARE
◦ Gesture orientation: assignment
◦ Gesture pumping / voice selection: complementary, to open a gate
◦ Voice commands: assignment
Decision-level fusion
Fission: image, vibration
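The complementary use of voice selection and the pumping gesture can be illustrated with a small decision-level fusion sketch. It is not the project's code: the class name, the method names, and the 2-second fusion window are assumptions made for illustration, since the slides do not state how long a voice selection stays valid.

```python
from typing import Optional

# Assumed value: the slides do not give a fusion time window.
FUSION_WINDOW_S = 2.0


class GateFusion:
    """Fuse two already-recognized events (decision level):
    a voice selection picks the gate, a pump gesture opens it."""

    def __init__(self, window_s: float = FUSION_WINDOW_S) -> None:
        self.window_s = window_s
        self.selected_gate: Optional[int] = None
        self.selected_at = 0.0

    def on_voice_select(self, gate: int, t: float) -> None:
        # Remember which gate was named and when (t in seconds).
        self.selected_gate = gate
        self.selected_at = t

    def on_pump(self, t: float) -> Optional[int]:
        # A pump only opens a gate if a voice selection is still fresh.
        gate = self.selected_gate
        if gate is not None and 0.0 <= t - self.selected_at <= self.window_s:
            self.selected_gate = None  # consume the selection
            return gate
        return None
```

Neither modality opens a gate alone in this sketch: a pump without a recent selection returns `None`, which is what distinguishes complementarity from assignment in the CARE sense.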
Blocks
◦ Webcam, tag detection
◦ OpenGL, physics engine
Multimodality management
◦ State machine
Augmented reality application
◦ Event based
Messages from the gateway
◦ Voice events
◦ Gesture events (orientation X and Y, shake)
Messages to the gateway
◦ Vibration events
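A state machine driven by voice events, as listed above, might look like the following minimal sketch. The state names (`idle`, `playing`, `paused`, `stopped`) and the rule that "pause" toggles between playing and paused are assumptions; the slides only say that a state machine manages the modalities.

```python
# Hypothetical transition table: (current state, voice event) -> next state.
TRANSITIONS = {
    ("idle", "new game"): "playing",
    ("playing", "pause"): "paused",
    ("paused", "pause"): "playing",     # assumed: "pause" toggles
    ("paused", "new game"): "playing",
}


class GameStateMachine:
    def __init__(self) -> None:
        self.state = "idle"

    def handle(self, event: str) -> str:
        """Apply one voice event and return the resulting state."""
        if self.state == "stopped":
            return self.state           # terminal state
        if event == "exit":
            self.state = "stopped"
        else:
            # Unknown (state, event) pairs leave the state unchanged.
            self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state

    def accepts_gestures(self) -> bool:
        # Assumed policy: orientation and shake events matter only in play.
        return self.state == "playing"
```

Keeping the transitions in a table makes it easy to see which voice events are ignored in which state.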
Handle the UIAccelerometer interface
Generate a motionEvent when shaking
Messages to the gateway
◦ Orientations (X or Y)
◦ Shake
Messages from the gateway
◦ Vibrate
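The two quantities sent to the gateway, an orientation and a shake, can both be derived from raw accelerometer samples. The sketch below is language-neutral logic written in Python, not the iPhone code itself. It assumes samples in units of g, a device that reports roughly (0, 0, -1) when lying flat face up, and a shake threshold of 2 g; none of these values come from the slides.

```python
import math

# Assumed threshold: total acceleration above 2 g counts as a shake.
SHAKE_THRESHOLD_G = 2.0


def is_shake(samples, threshold_g=SHAKE_THRESHOLD_G):
    """Return True if any (x, y, z) sample exceeds the threshold magnitude."""
    return any(
        math.sqrt(x * x + y * y + z * z) > threshold_g
        for x, y, z in samples
    )


def tilt_degrees(a_axis, a_z):
    """Tilt along one axis, from that axis' reading and the z reading.

    Assumes z is about -1 g when the device lies flat, so a flat
    device yields 0 degrees.
    """
    return math.degrees(math.atan2(a_axis, -a_z))
```

With one axis per player, each phone would only need to send the tilt for its own axis.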
Windows Speech API SDK
Features:
◦ API definition files
◦ Runtime component
◦ Control Panel applet
◦ Text-To-Speech engines in multiple languages
◦ Speech Recognition engines in multiple languages
◦ Redistributable components
◦ Sample application code
◦ Sample engines
◦ Documentation
Our System
A speech recognition engine
A grammar
<grammar xmlns=" xmlns:xsi=" xsi:schemaLocation=" xml:lang="en-EN" version="1.0">
◦ New game
◦ Pause
◦ Exit
◦ Open gate one
◦ Open gate two
◦ Close gate one
◦ Close gate two
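Once the engine has recognized one of the grammar phrases, the application still has to turn the text into a command. A minimal sketch of that step follows; the command names and the `(action, gate)` tuple format are hypothetical, and only the seven phrases are taken from the grammar above.

```python
# Phrases from the grammar, mapped to hypothetical (action, gate) commands.
PHRASES = {
    "new game": ("new_game", None),
    "pause": ("pause", None),
    "exit": ("exit", None),
    "open gate one": ("open_gate", 1),
    "open gate two": ("open_gate", 2),
    "close gate one": ("close_gate", 1),
    "close gate two": ("close_gate", 2),
}


def parse_phrase(text):
    """Normalize case and whitespace, then look the phrase up.

    Returns None for anything outside the grammar.
    """
    key = " ".join(text.lower().split())
    return PHRASES.get(key)
```

For example, `parse_phrase("Open gate two")` yields `("open_gate", 2)`, while an out-of-grammar word yields `None`.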
Recognition comparison before training / after training
Live Videos
Problems with the physics engine
◦ Coordination between user moves and physics moves
Voice recognition OK
High-level programming
Heterogeneity not a problem
Functional prototype
Thank you