Machine Learning, Language Rules, and Statistical Strategies for Language Translation
Andrew Runge
Computer Systems Lab 2009-2010
Abstract
Goal: Create an efficient, accurate Latin translator
Methods: Language Rules, Machine Learning, Statistical Translation
Language: Python
Introduction
Language Translators
Rule-based Strategies
Machine Learning
N-grams
Statistical Translation
Background
N-grams in Statistical Translation
Generating theses
Word tagging
  Tagging for word class, case, etc.
Machine Learning
Figure 1: Tree of words sorted by sentence role from the assorted works of Cicero generated by the methods of McMahon and Smith
Discussion
Dictionary creation
  Dictionary keys and values
Initial work on Machine Learning
Word Tagging
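The key and value layout mentioned above could look something like the sketch below. The slides do not specify the project's actual dictionary format, so the pipe-separated input lines, the function name build_dictionary, and the "english" and "class" fields are all assumptions made for illustration. The three sample words are taken from the translation example in the Results slide.

```python
def build_dictionary(lines):
    """Build a Latin-to-English dictionary from 'latin | english | class' lines.

    The line format is hypothetical; it is not taken from the project itself.
    """
    dictionary = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        latin, english, word_class = (field.strip() for field in line.split("|"))
        # Key: the Latin word. Value: its English gloss and a word-class tag.
        dictionary[latin] = {"english": english, "class": word_class}
    return dictionary


# Hypothetical sample entries.
sample = [
    "curro | run | verb",
    "ab | by | preposition",
    "puella | girl | noun",
]
entries = build_dictionary(sample)
print(entries["puella"])
```

Storing the word class next to the gloss is one way to let a later tagging step reuse the same lookup; a real dictionary would also need to handle inflected forms and case.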
Goals
Second Quarter
  Tagging
  Initial Translation
Third Quarter
  Continued Translation
  Apply Statistical Strategies
http://www.grinningplanet.com/2003/roman-digest/ancient-rome-soldier-copyright1.gif
Figure 2: Demonstration of n-gram generation for determining word order in a sentence. Generated by Chen et al.
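As a rough illustration of how n-grams can inform word order, the sketch below counts bigrams in a tiny corpus and picks the ordering of a word set whose adjacent pairs were seen most often. This is not the method of Chen et al.; the corpus, the function names, and the brute-force search over permutations are assumptions chosen to keep the example small.

```python
from collections import Counter
from itertools import permutations


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bigram_counts(corpus):
    """Count how often each adjacent word pair occurs in the corpus."""
    counts = Counter()
    for sentence in corpus:
        counts.update(ngrams(sentence.split(), 2))
    return counts


def best_order(words, counts):
    """Pick the permutation of words with the highest total bigram count.

    Brute force over all orderings, so only practical for short sentences.
    """
    return max(
        permutations(words),
        key=lambda order: sum(counts[pair] for pair in ngrams(order, 2)),
    )


# Hypothetical toy corpus of English sentences.
corpus = ["the girl runs", "the girl sings", "a girl runs"]
counts = bigram_counts(corpus)
print(best_order(["runs", "the", "girl"], counts))
```

A practical system would use smoothed probabilities over a much larger corpus rather than raw counts, since unseen bigrams score zero here.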
Results
Current Results:
  Read from the dictionary
  VERY BASIC TRANSLATIONS
  Ex: curro ab puella => run by girl
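The word-by-word lookup behind the example above can be sketched as follows. The dictionary contents come from that example; the function name and the choice to pass unknown words through unchanged are assumptions, not details from the project.

```python
# Toy dictionary holding only the three words from the example.
LATIN_TO_ENGLISH = {"curro": "run", "ab": "by", "puella": "girl"}


def translate_word_by_word(sentence, dictionary):
    """Replace each word with its dictionary entry, keeping the original order.

    Words missing from the dictionary are left untranslated.
    """
    words = sentence.lower().split()
    return " ".join(dictionary.get(word, word) for word in words)


print(translate_word_by_word("curro ab puella", LATIN_TO_ENGLISH))
# prints: run by girl
```

Because the lookup ignores inflection and word order, the output stays as rough as the example shows; the tagging and statistical steps listed under Goals are what would refine it.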