The Chinese Room: Understanding and Correcting Machine Translation

Josh Albrecht, Rebecca Hwa, and G. Elisabeta Marai
{jsa8,hwa,marai}@cs.pitt.edu
Department of Computer Science, University of Pittsburgh

This work has been supported by NSF Grant IIS-0612791.

What is Machine Translation?

Machine Translation (MT) is the process of automatically converting text from one human language to another (for example, Chinese to English).
MT is performed by algorithms that extract statistical translation rules from millions of human-generated translation pairs (sentences with the same meaning in both Chinese and English).

Uses of MT:
People who want to read text in an unknown foreign language
People who are barely proficient in a language and can use MT to learn it
Businesses that want to translate documents into other languages

We focus on the first case, though our work could be extended to the other cases as well.

The Problem With Machine Translation

Machine-translated sentences are often difficult or impossible to understand.
Example machine translation: He utter eyes and not the slightest attention As leakage.
Intended meaning: His eyes were wide apart; nothing in their field of vision escaped.
Errors are caused by the machine's lack of world knowledge and its inability to form coherent sentences or understand ideas.

Related Work

DerivTool: an interface for observing the inner workings of a specific MT system. It requires knowledge of both languages and an in-depth knowledge of how MT works. [DeNeefe et al., 2005]
Design cues come from systems such as TreeJuxtaposer [Munzner, 2003] and from Envisioning Information [Tufte, 1990].

Solution: The Chinese Room

Idea: We propose a collaborative approach between users, who have good world knowledge and writing skills, and the machine, which is good at processing large amounts of data into useful linguistic resources.
We have created an interactive visualization of these linguistic resources that enables the user to explore alternative translations in order to better understand and correct machine translations.
By interacting with the various components, users can better understand the original meaning of the Chinese text.
The design was based on iterative improvement with expert users.

[Figure: Screenshot of the Chinese Room]

Visualization:

Chinese Text: Displays the original characters, automatically segmented into approximate words.
Word Alignments: Displays the mapping (given by the MT system) between Chinese and English words. English words are clustered together based on these alignments.
Chinese Syntax Tree: Shows the automatically generated grammatical structure of the source sentence. Colors correspond to different parts of speech (blue for verbs, red for nouns, etc.).
English Text: Displays the English translation generated by the MT system for the selected Chinese sentence.
Translation Dictionary: Shows definitions for words (first column) and individual Chinese characters (second column). Definitions are aligned horizontally with the word or character that they define.
Additional Resources: Other resources are displayed as text in the rightmost pane:
N-Best Re-Translations: A list of candidate English sentences (or phrases) that the MT system (in this case, Google) was considering for the phrase selected by the user.
Document View: Every sentence in the document can be seen at once, giving a better sense of the meaning in the context of the document.
Edit Area: The English translation can be edited in a small text area so that users can quickly edit and annotate the sentence.
Example Search: Search results are displayed in the rightmost column, with the matches shown in pink, and are sorted by relevance.

Interaction:

Clicking on English words allows them to be edited.
Dragging English words allows the user to visually experiment with different word orders.
Mousing over the definitions highlights the corresponding Chinese character or word.
Clicking on the Chinese Syntax Tree lines causes that section of the sentence to collapse (or to expand if clicked again), allowing the user to focus on difficult parts of the sentence.
Clicking and dragging selects a Chinese phrase and begins the search for similar example translations.
Clicking on an example search result puts that sentence in the main view for more detailed inspection.
Clicking on the translation tab requests N-Best translations.
Clicking on a sentence in the document view selects it as the current sentence.
Clicking on the edit tab allows the user to type and directly modify the translated text.

Conclusions

"The Chinese Room" is an interface that allows users to explore and interact with linguistic resources as they attempt to understand poor automatic translations.
Our tool can manageably expose a variety of resources and a huge amount of data to the user, allowing monolingual speakers to determine the most likely translation without any knowledge of the foreign language.
A pilot study gave promising preliminary results.
Many challenges remain, including integrating other forms of information and exploiting uncertain sources of information.

Future Work

Further applications of this basic collaborative approach (language education, end-user understanding, commercial translation processes, MT design, and more)
Extending the tool to other language pairs (the poster shows it working with Arabic)
Further efforts in usability and ease of use could be very beneficial
Other resources (manually created translation rules, incorporation of translation memory) might be helpful to the user
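
The Word Alignments panel described above clusters English words according to the Chinese words they map to. The poster does not say how the tool does this, so the following is only a minimal sketch under stated assumptions: the function name `cluster_english_by_alignment`, the `(chinese_index, english_index)` pair format, and the toy alignment links are all hypothetical and are not taken from the Chinese Room implementation.

```python
from collections import defaultdict


def cluster_english_by_alignment(english_words, alignments):
    """Group English words by the Chinese word they are aligned to.

    english_words: list of English tokens in sentence order.
    alignments: iterable of (chinese_index, english_index) pairs.
        This pair format is an assumption made for illustration.
    Returns a dict mapping each Chinese word index to the list of
    English words aligned to it, kept in English sentence order.
    Unaligned English words do not appear in any cluster.
    """
    clusters = defaultdict(list)
    # Sort so each cluster lists its English words in sentence order.
    for zh_idx, en_idx in sorted(alignments):
        clusters[zh_idx].append(english_words[en_idx])
    return dict(clusters)


# Toy data: the English tokens come from the poster's "intended meaning"
# sentence; the alignment links are made up for this example.
english = ["his", "eyes", "were", "wide", "apart"]
links = [(0, 0), (1, 1), (2, 3), (2, 4)]
clusters = cluster_english_by_alignment(english, links)
```

In this toy example, "wide" and "apart" fall into one cluster because both links point at the same (hypothetical) Chinese word, which is the kind of grouping an interface could draw as a single block.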

