Machine Learning, Language Rules, and Statistical Strategies for Language Translation
Andrew Runge
Computer Systems Lab 2009-2010



Abstract
Goal: Create an efficient, accurate Latin translator
Methods: Language Rules, Machine Learning, Statistical Translation
Language: Python

Introduction
Language Translators
Rule-based Strategies
Machine Learning
N-grams
Statistical Translation

Background
N-grams in Statistical Translation
Generating theses
Word tagging
Tagging for word class, case, etc.
Machine Learning

Figure 1: Tree of words sorted by sentence role from the assorted works of Cicero generated by the methods of McMahon and Smith

Discussion
Dictionary creation
Dictionary keys and values
Initial work on Machine Learning
Word Tagging
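The slides do not show how the dictionary is laid out, so the following is only a minimal sketch of one possible arrangement: keys are Latin word forms and values hold an English gloss plus a word-class tag. The names `DICTIONARY` and `tag_sentence`, the field names, and the three entries are all hypothetical.

```python
# Hypothetical dictionary layout (an assumption, not the project's actual format):
# key = Latin word form, value = English gloss and word class.
DICTIONARY = {
    "curro": {"gloss": "run", "word_class": "verb"},
    "ab": {"gloss": "by", "word_class": "preposition"},
    "puella": {"gloss": "girl", "word_class": "noun"},
}


def tag_sentence(sentence):
    """Pair each word with its word class, or "unknown" if it has no entry."""
    tagged = []
    for word in sentence.lower().split():
        entry = DICTIONARY.get(word)
        word_class = entry["word_class"] if entry else "unknown"
        tagged.append((word, word_class))
    return tagged


print(tag_sentence("curro ab puella"))
```

A lookup-based tagger like this only covers forms that are listed verbatim; tagging for case would need inflection handling that this sketch leaves out.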

Goals
Second Quarter: Tagging, Initial Translation
Third Quarter: Continued Translation, Apply Statistical Strategies
http://www.grinningplanet.com/2003/roman-digest/ancient-rome-soldier-copyright1.gif

Figure 2: Demonstration of n-gram generation for determining word order in a sentence. Generated by Chen et al.
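As a rough illustration of the idea behind Figure 2, the sketch below counts bigrams in a tiny corpus and uses those counts to pick the most plausible ordering of a set of words. The corpus, the scoring rule (a plain sum of bigram counts), and the names `ngrams`, `order_score`, and `best_order` are assumptions made for this example; they are not taken from Chen et al. or from the project code.

```python
from collections import Counter
from itertools import permutations


def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical toy corpus of English sentences.
corpus = [
    "the girl runs".split(),
    "the boy runs".split(),
    "the girl gives roses".split(),
]
bigram_counts = Counter(bg for sent in corpus for bg in ngrams(sent, 2))


def order_score(tokens):
    """Sum the corpus counts of every bigram in this ordering."""
    return sum(bigram_counts[bg] for bg in ngrams(tokens, 2))


def best_order(words):
    """Try every permutation and keep the highest-scoring one."""
    return max(permutations(words), key=order_score)


print(best_order(["runs", "girl", "the"]))
```

Trying every permutation is only practical for very short sentences; a real system would prune candidate orderings rather than enumerate them all.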

Results
Current Results: Read from the dictionary
Very basic translations
Ex: curro ab puella => run by girl
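The example above can be reproduced with a plain word-by-word dictionary lookup. This is a sketch under stated assumptions, not the project's code: the dictionary `LATIN_TO_ENGLISH` and the function `translate_word_by_word` are hypothetical, and the three entries simply mirror the slide's example.

```python
# Hypothetical toy dictionary; the entries mirror the example on the slide.
LATIN_TO_ENGLISH = {
    "curro": "run",
    "ab": "by",
    "puella": "girl",
}


def translate_word_by_word(sentence, dictionary=LATIN_TO_ENGLISH):
    """Replace each Latin word with its gloss; unknown words pass through unchanged."""
    words = sentence.lower().split()
    return " ".join(dictionary.get(word, word) for word in words)


print(translate_word_by_word("curro ab puella"))
```

The lookup keeps Latin word order and ignores inflection, which is why the output reads "run by girl" and not a grammatical English sentence.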