CS 338: Graphical User Interfaces. Dario Salvucci, Drexel University. Lecture 10: Advanced Input.

GUI Input Modalities
How can a user provide input to the GUI?
– keyboard
– mouse, touch pad
– joystick
– touch screen
– pen / stylus
– speech
– other…
Further down the list: more error-prone! harder! (?)

Pen input
Pen-based interfaces in mobile computing

Pen input
Handwriting
– very general, well-developed human skill; thus, make use of what users can already do!
– but hard to recognize (for people & machines)
Gestures
– gesture alphabets, e.g., Palm Pilot Graffiti
– editing gestures
– easier to recognize

Speech input
Limited speech recognition
– only allow small sets of words/phrases, e.g., “one” through “nine” for phone menus
Full speech recognition
– again, general, well-developed skill
– full standard vocabulary (1000-10000+ words); American English: 600,000 words
– specialized vocabularies (research, medical, …)
– “editing” vocabularies (back, delete, …)

Handwriting & Speech
Common issues
– vocabulary size
– individual variability: speaker dependent, adaptive, independent
– signal segmentation: isolated words, continuous
Let’s look at handwriting as an example, but almost all concepts apply to speech too, not to mention other inputs, such as eye movements!

Off-line vs. on-line recognition
Off-line recognition
– examine static output of handwriting, i.e., the end result of the writing
On-line recognition
– examine dynamic movement of handwriting, i.e., the strokes and pen ups/downs involved
Which is more “informed”? more useful?

Recognition techniques
Neural networks
– neurally-inspired computational models
– input: bitmap, or “vectorized” strokes
– output: probable characters
– best for off-line recognition
Hidden Markov models (HMMs)
– powerful probabilistic models
– input: vectorized strokes
– output: full recognition of chars, words, etc.
– best for on-line recognition
And hybrid methods!
Consider HMM-based on-line recognition…

On-line feature extraction
On-line strokes → feature vectors
– basic features: pen up/down, direction, velocity
– useful features: curvature, reversal, ...
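A minimal sketch of turning pen samples into feature vectors, assuming each sample is a hypothetical (x, y, t, pen_down) tuple. The function name, the sample layout, and the use of the turn angle between segments as a stand-in for curvature are all assumptions for illustration, not the course's actual feature set.

```python
import math

def extract_features(samples):
    """Turn pen samples (x, y, t, pen_down) into one feature dict per segment."""
    features = []
    prev_direction = None
    for (x0, y0, t0, _), (x1, y1, t1, down) in zip(samples, samples[1:]):
        dx, dy, dt = x1 - x0, y1 - y0, t1 - t0
        direction = math.atan2(dy, dx)  # radians; 0 means moving along +x
        velocity = math.hypot(dx, dy) / dt if dt > 0 else 0.0
        if prev_direction is None:
            curvature = 0.0  # no previous segment to compare against
        else:
            # Signed turn angle between segments, wrapped into [-pi, pi].
            turn = direction - prev_direction
            curvature = math.atan2(math.sin(turn), math.cos(turn))
        features.append({
            "pen_down": down,
            "direction": direction,
            "velocity": velocity,
            "curvature": curvature,
        })
        prev_direction = direction
    return features

# Hypothetical stroke: move right, then turn 90 degrees and move along +y.
stroke = [(0, 0, 0.0, True), (1, 0, 1.0, True), (1, 1, 2.0, True)]
features = extract_features(stroke)
```

A reversal could then be flagged wherever the absolute turn angle comes close to pi.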

On-line recognition
Hidden Markov models (HMMs)
– probabilistic models for dynamic behavior
Set of N states with
– a(i,j) = probability of state transition i → j
– b(o,i) = probability of seeing o in state i; can be discrete or continuous prob. distributions
[Figure: three-state left-to-right HMM with states s1, s2, s3; self-transitions a(1,1), a(2,2), a(3,3); forward transitions a(1,2), a(2,3); observation distributions b(o,1), b(o,2), b(o,3)]
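One way to write such a model down, as a sketch: a three-state left-to-right HMM with discrete observations stored in plain dictionaries. The state names follow the slide, but every probability and the "up"/"down" observation symbols are made up for illustration.

```python
# Hypothetical 3-state left-to-right HMM; all numbers are illustrative only.
a = {  # a[i][j] = probability of state transition i -> j
    "s1": {"s1": 0.6, "s2": 0.4},
    "s2": {"s2": 0.7, "s3": 0.3},
    "s3": {"s3": 1.0},
}
b = {  # b[i][o] = probability of seeing observation o in state i (discrete case)
    "s1": {"up": 0.9, "down": 0.1},
    "s2": {"up": 0.5, "down": 0.5},
    "s3": {"up": 0.1, "down": 0.9},
}

def is_valid_hmm(a, b, tol=1e-9):
    """Check that every transition row and every emission row sums to 1."""
    rows = list(a.values()) + list(b.values())
    return all(abs(sum(row.values()) - 1.0) <= tol for row in rows)

valid = is_valid_hmm(a, b)
```

Transitions that are missing from a row (for example s1 to s3) have probability 0, which is what makes the model left-to-right.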

Hidden Markov models
Let’s say we have
– M = HMM representing predicted behavior
– O = observation vector sequence
Three problems
– evaluation: find Pr(O|M)
– decoding: find the state sequence Q that maximizes Pr(O,Q|M)
– training: adjust parameters of M to increase Pr(O|M)

Hidden Markov models
HMM evaluation
– find Pr(O|M)
– evaluate O = …
– can we do this efficiently?
[Figure: example HMM whose states emit x or y with probabilities (x = .8, y = .2), (x = .5, y = .5), (x = .2, y = .8), (x = .6, y = .4); transition probabilities .4, .6, 1, .2, .8]
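On the efficiency question: summing over every possible state sequence takes time exponential in the length of O, while the standard forward algorithm computes the same Pr(O|M) in time proportional to N² times the length of O. The sketch below uses a hypothetical two-state model (not the one from the slide figure); the names pi, A, and B are assumptions for the initial, transition, and emission probabilities.

```python
def forward(pi, A, B, obs):
    """Return Pr(O | M) for a discrete HMM using the forward algorithm."""
    n = len(pi)
    # alpha[i] = Pr(observations so far, current state = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return sum(alpha)

# Hypothetical model: state 0 mostly emits x, state 1 mostly emits y.
pi = [1.0, 0.0]                      # always start in state 0
A = [[0.6, 0.4],                     # A[i][j] = a(i, j)
     [0.0, 1.0]]
B = [{"x": 0.8, "y": 0.2},           # B[i][o] = b(o, i)
     {"x": 0.2, "y": 0.8}]

likelihood = forward(pi, A, B, ["x", "y"])
```

For this model and O = x, y the result is approximately 0.352. Real recognizers usually work with scaled or log probabilities, since the raw products underflow on long sequences.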

Hidden Markov models
HMM decoding (Viterbi algorithm)
– find the best state sequence through the HMM, i.e., the one with the highest probability
– Given the written word “Hello”: score the possible paths through the model by their probability of being actual words, and keep the most probable one.
It’s less probable that I meant: “Hallo” or “Hejjo”
It is more probable that I meant:
– Case 1, word matching: “Jello”
– Case 2, letter matching (vector strokes): “HeIIo”
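A minimal sketch of Viterbi decoding for a discrete HMM. The two-state model and the names pi, A, and B are hypothetical, chosen only to keep the example small; a handwriting recognizer would run the same recurrence over stroke, letter, and word states.

```python
def viterbi(pi, A, B, obs):
    """Return (best state sequence, its joint probability with obs)."""
    n = len(pi)
    # delta[i] = probability of the best path so far that ends in state i
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        step, new_delta = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] * A[i][j])
            step.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][o])
        backpointers.append(step)
        delta = new_delta
    # Trace the best path backwards from the most probable final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for step in reversed(backpointers):
        path.append(step[path[-1]])
    path.reverse()
    return path, delta[last]

# Hypothetical model: state 0 mostly emits x, state 1 mostly emits y.
pi = [1.0, 0.0]
A = [[0.6, 0.4],
     [0.0, 1.0]]
B = [{"x": 0.8, "y": 0.2},
     {"x": 0.2, "y": 0.8}]

path, prob = viterbi(pi, A, B, ["x", "y", "y"])
```

For the observations x, y, y this model's best path is state 0 followed by state 1 twice. As with evaluation, practical implementations sum log probabilities instead of multiplying raw ones.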

Hidden Markov models
HMM training (Baum-Welch / EM algorithm)
– re-adjust a(i,j), b(o,i) to increase Pr(O|M)
– iterative procedure
– allows for fine-tuning of HMM parameters for particular observation sets
Every time I write “HeIIo”, I really mean “Hello”.
– (increase in the vector-stroke probability)
– susceptible to just my behaviour: the machine learns the probability of my “Hello”, not your “Hello”.
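The re-adjustment step can be sketched as one Baum-Welch (EM) update for a discrete HMM trained on a single observation sequence. The model, the observation string, and the names pi, A, and B are hypothetical; the sketch also skips the probability scaling and multi-sequence averaging that a practical trainer would need.

```python
def forward_backward(pi, A, B, obs):
    """Return (alpha, beta, Pr(O | M)) for a discrete HMM."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    for i in range(n):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
    return alpha, beta, sum(alpha[T - 1])

def baum_welch_step(pi, A, B, obs):
    """One EM update; returns re-estimated (pi, A, B)."""
    n, T = len(pi), len(obs)
    alpha, beta, likelihood = forward_backward(pi, A, B, obs)
    # gamma[t][i] = Pr(state i at time t | O, M)
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)] for t in range(T)]
    new_pi = gamma[0][:]
    new_A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        visits = sum(gamma[t][i] for t in range(T - 1))
        for j in range(n):
            # Expected number of i -> j transitions given O and M.
            moves = sum(
                alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                for t in range(T - 1)
            ) / likelihood
            new_A[i][j] = moves / visits if visits > 0 else A[i][j]
    symbols = list(B[0])
    new_B = []
    for i in range(n):
        visits = sum(gamma[t][i] for t in range(T))
        new_B.append({
            s: (sum(gamma[t][i] for t in range(T) if obs[t] == s) / visits
                if visits > 0 else B[i][s])
            for s in symbols
        })
    return new_pi, new_A, new_B

# Hypothetical starting model and training sequence.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [{"x": 0.6, "y": 0.4}, {"x": 0.3, "y": 0.7}]
obs = list("xxyyxyyy")
for _ in range(5):  # each pass should not decrease Pr(O | M)
    pi, A, B = baum_welch_step(pi, A, B, obs)
```

Because the update fits whatever sequences it is given, training on one writer's samples tunes the model to that writer, which is the speaker-dependence point made above.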

Hidden Markov models
Composing HMMs
– we can add “sub-” HMMs into larger HMMs, creating a model hierarchy at different levels
For instance, we can create three levels
– strokes
– letters
– words

On-line recognition
Stroke HMMs with states
– up-down loop
  s1: up, + curvature, hi velocity
  s2: down, + curvature, hi velocity
– up-down cusp
  s1: up, + curvature, hi velocity
  s2: 0 velocity
  s3: down, + curvature, hi velocity
– up-down ramphoid
  s1: up, – curvature, hi velocity
  s2: 0 velocity
  s3: down, + curvature, hi velocity

On-line recognition
Letter HMMs based on stroke HMMs
[Figure: letter HMMs built by chaining stroke HMMs (ramphoid, cusp, loop, down-up loop); candidate letters shown: ‘o’ (?), ‘w’ (?), ‘g’ (?)]
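Building a letter model out of stroke models can be sketched as simple concatenation of left-to-right chains. In this sketch each stroke state is a hypothetical (label, stay probability) pair; the stay probabilities, the compose function, and the choice of which strokes make up a letter are assumptions for illustration, not the actual letter definitions.

```python
# Hypothetical stroke models: each state is (label, probability of staying in it).
LOOP = [
    ("up, + curvature, hi velocity", 0.8),
    ("down, + curvature, hi velocity", 0.8),
]
CUSP = [
    ("up, + curvature, hi velocity", 0.8),
    ("0 velocity", 0.5),
    ("down, + curvature, hi velocity", 0.8),
]

def compose(*models):
    """Chain left-to-right sub-HMMs; return (state labels, transition matrix)."""
    states = [state for model in models for state in model]
    n = len(states)
    # One extra absorbing "end" state collects the exit from the last sub-HMM.
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i, (_, stay) in enumerate(states):
        A[i][i] = stay            # self-transition
        A[i][i + 1] = 1.0 - stay  # move on, possibly into the next sub-HMM
    A[n][n] = 1.0
    labels = [label for label, _ in states] + ["end"]
    return labels, A

# Hypothetical letter made of a cusp followed by a loop.
labels, A = compose(CUSP, LOOP)
```

The same function can chain letter models into a word model, which gives the strokes, letters, words hierarchy.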

On-line recognition
Word HMMs based on letter HMMs
– basic idea is straightforward
– but it’s deceptively tricky: why??
[Figure: word HMMs for HELLO and PLATYPUS, each a chain of letter HMMs]

On-line recognition
Putting it all together
– compacting states
– taking word frequencies into account
– where do frequencies come from? :-)
[Figure: letter-by-letter model (H, E, L, …; I, …; A, …) with branch probabilities .07, .43, .31, .04, .27. Note: probabilities aren’t real…]
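Taking word frequencies into account can be sketched as weighting each word's HMM likelihood Pr(O|word) by a prior Pr(word) estimated from counts in some text corpus. The likelihoods, the counts, and the rank_words function below are all made up for illustration.

```python
def rank_words(likelihoods, counts):
    """Rank candidate words by Pr(O | word) * Pr(word), normalized to sum to 1."""
    total = sum(counts.values())
    scores = {
        word: likelihood * counts.get(word, 0) / total
        for word, likelihood in likelihoods.items()
    }
    norm = sum(scores.values())
    if norm == 0:
        return []  # no candidate has both a nonzero likelihood and a nonzero count
    ranked = [(word, score / norm) for word, score in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical numbers: the strokes fit "jello" slightly better,
# but "hello" is far more frequent in the (made-up) corpus.
likelihoods = {"hello": 0.020, "jello": 0.025}
counts = {"hello": 900, "jello": 100}
ranked = rank_words(likelihoods, counts)
```

With these numbers "hello" comes out on top even though the stroke evidence alone slightly favors "jello".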

Handwriting problems
Speed-accuracy tradeoff
– as people speed up, their handwriting degrades (uh, no duh!)
Printed vs. handwritten?
– often some combination of the two!!
Dotting i’s, crossing t’s, … for on-line recog’n (minding your p’s and q’s?)
Mixing language with graphics & gestures

Speech recognition
Same basic ideas for recognition
– convert to recognizable signal (transforms)
– recognize using hybrid methods and a hierarchy of phonemes, words, etc.
Many similar / analogous problems
– individual variability (esp. female/male voices)
– mixing “real input” with command input
– speed-accuracy tradeoff (?)

Eye-movement recognition (?!)
– Yet again, same idea: translate noisy signal to what people actually intended
– Example: “Eye-typing” system

Discussion
Have you used handwriting/speech systems?
What are the benefits of these systems?
What is handwriting/speech good for?
When is it easier to use standard input?
