
Using CTW as a language modeler in Dasher
Martijn van Veen, 05-02-2007
Signal Processing Group, Department of Electrical Engineering, Eindhoven University of Technology

2/21 Overview
– What is Dasher, and what is a language model?
– What is CTW, and how to implement it in Dasher
– Decreasing the model costs
– Conclusions and future work

3/21 Dasher
– Text input method
– Continuous gestures
– Language model
Let’s give it a try!

4/21 Dasher: Language model
– Conditional probability for each alphabet symbol, given the previous symbols
– Similar to compression methods
– Requirements: sequential, fast, adaptive
– Model is trained
– Better compression -> faster text input
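
A minimal sketch of what "language model" means in this setting: a sequential, adaptive predictor queried symbol by symbol. The class and method names are illustrative, not Dasher's actual API.

```cpp
#include <cstdint>
#include <vector>

// Illustrative interface for a Dasher-style language model (names assumed).
class LanguageModel {
public:
    virtual ~LanguageModel() = default;
    // Sequential: conditional distribution over the alphabet,
    // given the symbols seen so far.
    virtual std::vector<double> Predict() const = 0;
    // Adaptive: update the statistics with the symbol actually entered.
    virtual void Update(uint32_t symbol) = 0;
};
```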

5/21 Dasher: Language model
PPM: Prediction by Partial Match
– Predictions by models of different order
– Weight factor for each model
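
A simplified sketch of the PPM idea: predictions from context models of several orders are blended with a weight per order. Real PPM derives these weights from escape probabilities; the explicit weight vector here is an assumption for illustration.

```cpp
#include <cstddef>
#include <vector>

// Blend per-order predictions P(x | order-k context) with one weight
// per order; the weights are assumed to sum to 1.
double BlendedProbability(const std::vector<double>& perOrderProb,
                          const std::vector<double>& weight) {
    double p = 0.0;
    for (std::size_t k = 0; k < perOrderProb.size(); ++k)
        p += weight[k] * perOrderProb[k];
    return p;
}
```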

6/21 Dasher: Language model
– Asymptotically, PPM reduces to a fixed-order context model
– But the incomplete model works better!

7/21 CTW: Tree model
– Source structure in the model, parameters memoryless
– KT estimator: a = number of zeros, b = number of ones
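
The estimator formula itself did not survive the transcript. The standard Krichevsky-Trofimov estimate assigns the next zero a probability of (a + 1/2) / (a + b + 1); a minimal sketch:

```cpp
#include <vector>

// KT estimator for a memoryless binary source: after a zeros and
// b ones, the probability that the next bit is 0.
double KTProbabilityOfZero(int a, int b) {
    return (a + 0.5) / (a + b + 1.0);
}

// Sequential block probability of a bit sequence under the KT estimator.
double KTBlockProbability(const std::vector<int>& bits) {
    double p = 1.0;
    int a = 0, b = 0;
    for (int bit : bits) {
        double p0 = KTProbabilityOfZero(a, b);
        p *= (bit == 0) ? p0 : (1.0 - p0);
        if (bit == 0) ++a; else ++b;
    }
    return p;
}
```

For example, KTBlockProbability({0, 0, 1}) = 1/2 · 3/4 · 1/6 = 1/16.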

8/21 CTW: Context tree
Context-Tree Weighting: combine all possible tree models up to a maximum depth
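
A sketch of the standard CTW weighting recursion: an internal node mixes its own KT estimate with the product of its children's weighted probabilities, Pw(s) = 1/2 Pe(s) + 1/2 Pw(0s) Pw(1s), and at maximum depth Pw(s) = Pe(s). A real implementation updates these quantities incrementally along the context path instead of recomputing recursively; the node layout here is illustrative.

```cpp
struct Node {
    double pe = 1.0;                      // KT block probability at this node
    Node* child[2] = {nullptr, nullptr};  // deeper contexts
};

double WeightedProbability(const Node* s, int depth, int maxDepth) {
    if (s == nullptr) return 1.0;         // context never seen: probability 1
    if (depth == maxDepth) return s->pe;  // leaf: estimator only
    double p0 = WeightedProbability(s->child[0], depth + 1, maxDepth);
    double p1 = WeightedProbability(s->child[1], depth + 1, maxDepth);
    return 0.5 * s->pe + 0.5 * p0 * p1;   // weight node model vs. split
}
```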

9/21 CTW: Tree update

10/21 CTW: Implementation
Current implementation:
– Ratio of block probabilities stored in each node
– Efficient, but patented
Develop a new implementation:
– Use only integer arithmetic, avoid divisions
– Represent both block probabilities as fractions
– Make the denominators equal by cross-multiplication
– Store the numerators, scale if necessary
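
A hedged sketch of the fraction bookkeeping these bullets describe; the field names, 64-bit widths, and scaling threshold are assumptions, not the actual implementation:

```cpp
#include <cstdint>

// Both block probabilities of a node, kept as integer numerators over a
// shared (implicit) denominator.
struct NodeProbs {
    uint64_t numPe;  // numerator of the estimated block probability
    uint64_t numPw;  // numerator of the weighted block probability
};

// Bring fractions a/b and c/d onto the common denominator b*d by
// cross-multiplication; only the numerators a*d and c*b are kept.
NodeProbs CrossMultiply(uint64_t a, uint64_t b, uint64_t c, uint64_t d) {
    return {a * d, c * b};
}

// Halve both numerators when they grow too large; the ratio of the two
// probabilities, which is what CTW needs, is preserved.
void ScaleDown(NodeProbs& n) {
    while (n.numPe >= (1ull << 32) || n.numPw >= (1ull << 32)) {
        n.numPe = (n.numPe >> 1) | 1;  // keep the numerators nonzero
        n.numPw = (n.numPw >> 1) | 1;
    }
}
```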

11/21 CTW for text
– Binary decomposition
– Adjust the zero-order estimator
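
A sketch of binary decomposition: each symbol is predicted bit by bit, and each bit position, given the bits already decided, is handled by its own context tree. The tree-selection scheme and the helper GetCtwPrediction are hypothetical placeholders:

```cpp
#include <cstdint>

// Placeholder: a real implementation queries the context tree selected
// by 'treeIndex' for the probability that the next bit is 0.
double GetCtwPrediction(uint32_t treeIndex) {
    (void)treeIndex;
    return 0.5;
}

// Probability of an 8-bit symbol as the product of its bit probabilities.
double SymbolProbability(uint8_t symbol) {
    double p = 1.0;
    uint32_t prefix = 1;  // path in the decomposition tree; root = 1
    for (int i = 7; i >= 0; --i) {
        int bit = (symbol >> i) & 1;
        double pZero = GetCtwPrediction(prefix);  // P(bit = 0 | context)
        p *= (bit == 0) ? pZero : (1.0 - pZero);
        prefix = (prefix << 1) | bit;  // descend to the chosen subtree
    }
    return p;
}
```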

12/21 Results
Comparing PPM and CTW language models:

Single file:

  Input file   CTW     PPM     Difference
  Book 2       2.632   2.876   8.48 %
  NL           4.356   5.014   13.12 %

Model trained with English text:

  Input file   CTW     PPM     Difference
  GB           2.847   3.051   6.69 %
  Book 2       2.380   2.543   6.41 %
  Book 2       2.295   2.448   6.25 %

Model trained with English text and user input:

  Input file   CTW     PPM     Difference
  Book 2       1.979   2.177   9.10 %
  NL           2.364   2.510   5.82 %

13/21 CTW: Model costs
What are model costs? The extra code length CTW pays because the actual tree model is not known in advance and all models are weighted together.

14/21 CTW: Model costs
Actual model and alphabet size are fixed -> optimize the weight factor alpha:
– Per tree: not enough parameters
– Per node: not enough adaptivity
– Optimize alpha per depth of the tree (see the sketch below)
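
A sketch of the alpha-per-depth idea: the fixed 1/2 in the weighting recursion is replaced by a depth-dependent factor alpha[d]. The exact parameterization used in the thesis may differ.

```cpp
#include <vector>

struct Node {  // same layout as in the weighting sketch above
    double pe = 1.0;
    Node* child[2] = {nullptr, nullptr};
};

// Pw(s) = alpha[d] * Pe(s) + (1 - alpha[d]) * Pw(0s) * Pw(1s),
// with one alpha per tree depth d (alpha.size() >= maxDepth).
double WeightedProbabilityAlpha(const Node* s, int depth, int maxDepth,
                                const std::vector<double>& alpha) {
    if (s == nullptr) return 1.0;
    if (depth == maxDepth) return s->pe;
    double p0 = WeightedProbabilityAlpha(s->child[0], depth + 1, maxDepth, alpha);
    double p1 = WeightedProbabilityAlpha(s->child[1], depth + 1, maxDepth, alpha);
    return alpha[depth] * s->pe + (1.0 - alpha[depth]) * p0 * p1;
}
```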

15/21 CTW: Model costs
– Exclusion: only use the betas of the actual model
– Iterative process: is it convergent?
– Approximation: to find the actual model, use alpha = 0.5

16/21 CTW: Model costs
Compression of an input sequence:
– Model costs are significant, especially for short sequences
– No decrease from optimizing alpha per depth?

  Symbols    Alpha 0.5   Alpha after exclusion   Without model costs
  100        5.73        5.21                    4.94
  1,000      4.22        4.07                    3.68
  10,000     3.12        3.07                    2.77
  100,000    2.33        2.32                    2.13
  600,000    1.95        1.95                    1.83

17/21 CTW: Model costs
Maximize the probability in the root, instead of the probability per depth:
– Exclusion based on alpha = 0.5 is almost optimal

  Symbols    Alpha 0.5   Alpha after exclusion   Max. probability in root   Without model costs
  100        0.8437      0.8117                  0.8113                     0.7022
  1,000      0.6236      0.6213                  0.6209                     0.5330
  10,000     0.3830      0.3792                  0.3794                     0.3276
  100,000    0.2661      0.2652                  0.2647                     0.2389
  600,000    0.2248      0.2242                  0.2241                     0.2098

18/21 CTW: Model costs
Results in the Dasher scenario:

Trained model: negative effect if no user text is available

  Language   Alpha 0.5   Alpha after exclusion
  GB         2.01        2.04
  NL         4.34        4.36

Trained with concatenated user text: small positive effect if user text is added to the training text and is very similar to it

  Language   Alpha 0.5   Alpha after exclusion
  GB         2.30        2.28
  NL         4.12        4.13

19/21 Conclusions
New CTW implementation:
– Only integer arithmetic
– Avoids patented techniques
– New decomposition tree structure
Dasher language model based on CTW:
– Predictions about 6 percent more accurate than PPM-D
Decreasing the model costs:
– Only an insignificant decrease is possible with our method

20/21 Future work
Make CTW suitable for MobileDasher:
– Decrease memory usage
– Decrease the number of computations
Combine language models:
– Select the locally best model, or weight models together
Combine languages in one model:
– Do the models differ in structure or in parameters?

21/21 Thank you for your attention
Ask away!

22/21 CTW: Implementation
Store the numerators of the block probabilities

