1 May 31, 2002, HCIL Open House
Supporting Interactive Exploration of Foreign Language Collections
Jianqiang Wang and Douglas W. Oard
College of Information Studies and Institute for Advanced Computer Studies, University of Maryland
(wangjq,oard)@glue.umd.edu

2 Why Find Foreign-Language Information?
Former U.S. President Jimmy Carter told Cubans in a nationally broadcast speech on Tuesday that the U.S. should take the first step in reconciliation with Cuba. But he also pointed out to Cubans that the country's economic problems cannot all be blamed on the U.S. trade embargo against Cuba.
- CNN NEWS 05/14/2002
What’s the reaction of Cubans?
–You may still want to know about it even if you don’t know Spanish (the official language of Cuba)

3 Why Interactive?
Sample search results for “Jimmy Carter visits Cuba” returned by AltaVista (language specified as Spanish):
But do you need all of these?

4 Why Translation?
Part of the second web page found:
…Cualquiera de estas áreas problema podrían frustrar los esfuerzos para crear un tribunal, dicen los líderes defensores de los derechos humanos. Muchos de los problemas tienen que ver con cuestiones de soberanía, a la que la mayoría de los países detestan renunciar. Esto resulta especialmente cierto en Estados Unidos, según el ex Presidente Jimmy Carter, uno de los que proponen el tribunal, quien, de todos modos, se muestra escéptico de que sea aprobado. Estados Unidos no ha ratificado varios tratados importantes en relación con los derechos humanos, dice, por una serie de razones, entre ellas temas de soberanía y desacuerdos sobre la pena de muerte.
Now imagine that you don’t know Spanish …

5 Does Translation Help Selection?
English translation by SYSTRAN:
… Anyone of these areas problem could frustrate the efforts to create a court, say the defending leaders of the human rights. Many of the problems have to do with sovereignty questions, to which most of the countries they detest to resign. This is specially certain in the United States, according to ex- President Jimmy Carter, one of whom proposes the court, who, anyway, is to skeptic from whom is approved. The United States has not ratified several important treaties in relation to the human rights, says, by a series of reasons, among them subjects of sovereignty and discords on the capital punishment.
But what if a machine translation system is not available?

6 Two Translation Techniques
Term-for-term gloss translation (Gloss)
–Easily built for a wide range of language pairs (days)
    • Look up terms in a bilingual term list
    • Select the translation that occurs most often
    • Back off to stems when the surface form is untranslatable
–Low accuracy & fluency
Machine translation (MT)
–Developing new language pairs is expensive (years)
    • SYSTRAN is only available for 9 language pairs
–Syntactic/semantic constraints improve accuracy & fluency
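
The three gloss steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the function names, the toy French-English term list, the frequency counts, and the crude plural-stripping stemmer are all hypothetical, and "occurs most often" is assumed here to mean the candidate with the highest count in some target-language frequency table.

```python
def simple_stem(word):
    """Crude stand-in for a real stemmer (hypothetical): strip a trailing 's'."""
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def gloss_translate(tokens, term_list, target_freq, stem=simple_stem):
    """Term-for-term gloss: lookup, pick most frequent candidate, back off to stems."""
    # Index the term list by stem so untranslatable surface forms can back off.
    stem_index = {}
    for source_term, translations in term_list.items():
        stem_index.setdefault(stem(source_term), []).extend(translations)

    output = []
    for token in tokens:
        key = token.lower()
        # Step 1: look up the surface form; step 3: back off to the stem.
        candidates = term_list.get(key) or stem_index.get(stem(key))
        if not candidates:
            output.append(token)  # no translation known: pass the term through
            continue
        # Step 2: keep the candidate with the highest (assumed) frequency.
        output.append(max(candidates, key=lambda t: target_freq.get(t, 0)))
    return output


# Toy data, invented purely for illustration.
TERM_LIST = {
    "incendie": ["fire", "blaze"],
    "conférence": ["conference", "lecture"],
    "près": ["near", "close"],
}
TARGET_FREQ = {"fire": 120, "blaze": 8, "conference": 60,
               "lecture": 25, "near": 90, "close": 70}

print(gloss_translate(["Incendies", "près", "Sydney"], TERM_LIST, TARGET_FREQ))
```

Because each term is translated independently, nothing in this sketch reorders words or resolves ambiguity from context, which is consistent with the low accuracy and fluency the slide attributes to gloss translation.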

7 Research Questions
Could the effectiveness of document selection achieved using MT be achieved using Gloss?
Is document selection based on Gloss better than random document selection?

8 Experiment Design
Searcher tasks
–Read the description of one of the four search topics
    • 2 broad topics, e.g., Conference on birth control
    • 2 narrow topics, e.g., Bush fire near Sydney
–Make a relevance judgment for each document in a ranked list
    • Newswire stories from a French newspaper
    • Nominated by an automatic retrieval system
    • Translated into English
Evaluation
–Comparison of searcher judgments with official judgments
–Effectiveness measure: the more officially relevant documents and the fewer officially non-relevant documents a searcher selects, the higher the effectiveness
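
The slide does not name the exact effectiveness measure. One common way to score a searcher's selections against official judgments, assumed here purely for illustration, is precision, recall, and their weighted harmonic mean (F). The function name, the `alpha` weighting parameter, and the document identifiers below are hypothetical.

```python
def selection_effectiveness(selected, relevant, alpha=0.5):
    """Return (precision, recall, F) for a searcher's selected documents.

    selected: documents the searcher judged relevant
    relevant: documents the official judgments mark as relevant
    alpha:    weight on precision (0.5 gives the balanced F measure)
    """
    selected, relevant = set(selected), set(relevant)
    hits = len(selected & relevant)
    # Precision falls as more officially non-relevant documents are selected.
    precision = hits / len(selected) if selected else 0.0
    # Recall rises as more officially relevant documents are selected.
    recall = hits / len(relevant) if relevant else 0.0
    if precision == 0.0 or recall == 0.0:
        return precision, recall, 0.0
    f_measure = 1.0 / (alpha / precision + (1.0 - alpha) / recall)
    return precision, recall, f_measure


# Hypothetical document IDs, not taken from the study.
p, r, f = selection_effectiveness({"d1", "d2", "d3"}, {"d1", "d3", "d4", "d5"})
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

A measure of this shape rewards both halves of the slide's description: selecting more officially relevant documents raises recall, and selecting fewer officially non-relevant ones raises precision.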

10 Experiment Results: MT vs. Gloss
MT is almost always better
–Significant with a one-tailed t-test on the mean (p<0.05)
Gloss-based selection is usually better than random selection
[Figure: results chart with two panels, Broad topics and Narrow topics]

11 Takeaway Messages
Machine translation would be preferred
–If available for your language pair, and
–If you’re willing to pay for it
Gloss translation can help document selection
–Very fast and inexpensive to build
–Easily plugged into your existing system

12 Next Steps
Our next user study is next week!
–And we’re still looking for participants!!!
–Contact wangjq@glue.umd.edu
Three questions:
–Query translation assistance
–Document selection
–Iterative search strategies
