Machine Translation with Scarce Resources The Avenue Project.


1 Machine Translation with Scarce Resources The Avenue Project

2 Scarce Resources
Not much text in electronic form.
Very few linguists who can write computational rules.
No standard orthography:
– Kudaw, kusaw (work) (Mapudungun, Chile)
– Not even sure of pronunciation: EH-nvelope, AH-nvelope (envelope) (English, US, not a language with scarce resources)

3 Our Approach
Learn rules from a controlled corpus.
Corpus is elicited from bilingual speakers.
The informant only needs to translate and align words.
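As a rough illustration of what "translate and align words" could produce, here is a minimal sketch of one elicited record. The class name `ElicitedExample`, its field names, and the index-pair representation are assumptions made for illustration, not the AVENUE data format. The sentence pair comes from the elicitation corpus in this deck; the specific word alignment is a hypothetical reading, not something the slides state.

```python
from dataclasses import dataclass


@dataclass
class ElicitedExample:
    """One elicited sentence pair plus the informant's word alignment (hypothetical format)."""
    source: list[str]                 # tokens of the prompt sentence
    target: list[str]                 # tokens of the informant's translation
    alignment: set[tuple[int, int]]   # (source index, target index) pairs

    def aligned_pairs(self) -> list[tuple[str, str]]:
        """Return aligned word pairs, ordered by source position."""
        return [(self.source[i], self.target[j]) for i, j in sorted(self.alignment)]


# Sentence pair from the corpus; the alignment indices are an assumed example.
example = ElicitedExample(
    source=["la", "piedra", "cayó"],
    target=["trani", "chi", "kura"],
    alignment={(0, 1), (1, 2), (2, 0)},
)

for src_word, tgt_word in example.aligned_pairs():
    print(f"{src_word} -> {tgt_word}")
```

Storing the alignment as index pairs rather than word pairs keeps repeated words distinguishable and allows one-to-many links.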

4 AVENUE Project
New Ideas:
– Use machine learning to learn translation rules from native speakers who are not trained in linguistics or computer science.
– Multi-Engine translation architecture can flexibly take advantage of whatever resources are available.
– Research partnerships with indigenous communities in Latin America and Alaska: Mapudungun (Chile), Siona (Colombia), Inupiaq (Alaska).
Carnegie Mellon University, Language Technologies Institute: L. Levin, J. Carbonell, A. Lavie, R. Brown
Impact:
– Rapid and low-cost development of machine translation for languages with scarce resources.
– Policy makers can get input from indigenous people.
– Indigenous people can participate in government and internet.
Schedule:
– Year 1: Seeded Version Space learning, first version.
– Year 2: Example-Based Machine Translation of Mapudungun (Chile).
– Year 3: Multi-Engine Mapudungun system (EBMT and partially learned transfer rules).
– Interface for data elicitation.

5 Elicitation Interface

6 Elicitation Corpus: example
English: I fell.
Spanish: Caí
Mapudungun: Tranün
English: I am falling.
Spanish: Estoy cayendo
Mapudungun: Tranmeken

7 Elicitation Corpus: example
English: You (John) fell.
Spanish: Tú (Juan) caíste
Mapudungun: Eymi tranimi (Kuan)
English: You (Mary) fell.
Spanish: Tú (María) caíste
Mapudungun: Eymi tranimi (Maria)
English: The rock fell.
Spanish: La piedra cayó
Mapudungun: Trani chi kura
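The corpus examples come in near-identical pairs that vary one feature at a time (tense on slide 6, the subject's gender on slide 7). One way to read such pairs is to compare the two translations token by token. The sketch below is only an illustration of that idea, not the project's Seeded Version Space learner, and the function name `token_diff` is hypothetical.

```python
def token_diff(a: str, b: str) -> list[tuple[str, str]]:
    """Return the token pairs that differ between two equal-length translations."""
    tokens_a, tokens_b = a.split(), b.split()
    if len(tokens_a) != len(tokens_b):
        raise ValueError("this sketch only compares translations of equal length")
    return [(x, y) for x, y in zip(tokens_a, tokens_b) if x != y]


# Slide 7: only the subject's name changes, so the verb form is unaffected by gender here.
print(token_diff("Eymi tranimi (Kuan)", "Eymi tranimi (Maria)"))

# Slide 6: changing the tense/aspect of the prompt changes the verb form.
print(token_diff("Tranün", "Tranmeken"))
```

A real system would need to handle translations of different lengths and word order, which this whitespace-based comparison deliberately ignores.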


