
1 Cross Language Information Exploitation of Arabic
Dr. Elizabeth D. Liddy
Center for Natural Language Processing
School of Information Studies
Syracuse University

2 Why Cross-Language Systems Matter
There are approximately 4,500 living languages
32 million Americans switch from English to another language when they get home from work (U.S. Census 1990)
Internationally, some who wish to do harm to the US communicate in other languages
There are too few intel analysts who know the languages of interest

3 Internet Language Statistics http://global-reach.biz/globstats/

4 Internet Language Statistics (2) http://www.glreach.com/globstats/evol.html

5 How Cross-Language Retrieval Works
A user who speaks just one language asks their question of the system in that language
Cross-language retrieval system:
  – Will have indexed documents (e.g. foreign reports, emails, message traffic) written in other languages
  – Translates user query into language of the documents
  – Matches translated query against document index
  – Produces a ranked list of relevant documents that are automatically translated into user’s language
User then reads documents in their own language
User can now make more fully informed decisions
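The retrieval steps on this slide can be illustrated with a minimal Python sketch. It is not the system described in the talk: the tiny English-French dictionary, the three sample documents, the TF-IDF style scoring, and the word-by-word gloss that stands in for automatic translation are all assumptions chosen to keep the example small, and every function name is hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy dictionary (English -> French). A real system would rely
# on a full translation resource; this is only an illustration.
EN_TO_FR = {
    "patent": ["brevet"],
    "engine": ["moteur"],
    "water": ["eau"],
    "filter": ["filtre"],
}
FR_TO_EN = {fr: en for en, frs in EN_TO_FR.items() for fr in frs}

# Hypothetical foreign-language document collection.
DOCS = {
    "d1": "brevet pour un moteur",
    "d2": "filtre pour eau",
    "d3": "moteur et filtre pour eau",
}


def tokenize(text):
    return text.lower().split()


def build_index(docs):
    """Index step: map each document id to its term frequencies."""
    return {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}


def translate_query(query, dictionary):
    """Translate the query term by term; unknown terms are dropped."""
    translated = []
    for term in tokenize(query):
        translated.extend(dictionary.get(term, []))
    return translated


def rank(query_terms, index):
    """Match step: score documents by a simple TF-IDF sum, best first."""
    n_docs = len(index)
    scores = {}
    for doc_id, tf in index.items():
        score = 0.0
        for term in query_terms:
            df = sum(1 for other in index.values() if term in other)
            if df and term in tf:
                score += tf[term] * math.log(1 + n_docs / df)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


def gloss(text, dictionary):
    """Stand-in for document translation: word-by-word gloss."""
    return " ".join(dictionary.get(tok, tok) for tok in tokenize(text))


def search(query):
    """Run the pipeline: translate query, match, rank, gloss results."""
    index = build_index(DOCS)
    terms = translate_query(query, EN_TO_FR)
    return [(doc_id, gloss(DOCS[doc_id], FR_TO_EN)) for doc_id in rank(terms, index)]


if __name__ == "__main__":
    for doc_id, english_gloss in search("patent engine"):
        print(doc_id, "->", english_gloss)
```

Dictionary-based query translation is only one possible design; it is used here because it keeps each step of the slide (index, translate, match, rank, translate back) visible as a separate function.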

6 SU’s Cross-Language Retrieval Research
Have produced systems for French, Spanish, Japanese
  – DARPA, Intel, & corporate funding of $3.5 million
Currently working in Dutch and Chinese
  – 2nd demo is of cross-language English-Chinese on a patent database from China
Today’s funding announcement will enable us to specialize our current cross-language retrieval capabilities for Arabic
Future work on information extraction and visualization in Arabic is of keen interest

7 1. LIVIA – English/English IR System
Accepts users’ natural language expressions of complex information needs
Provides precise retrieval against government-compiled documents about terrorist activities
Core technology funded by DARPA and Syracuse Research Corporation
Demo’d by Ozgur Yilmazel

10 2. English-Chinese Retrieval Demo
Cross-language retrieval of English queries against a Chinese patent database
  – Development funded by Unilever Corp, a multinational corporation which owns 140 companies in more than 100 countries
Jiang Ping Chen
  – PhD student in School of Information Studies

11 Look to the Future
Incorporate the next level of sophistication in Information Exploitation into Arabic
Here seen in English
  – Adds Information Extraction as next step to Information Retrieval
Seek your ongoing support for its extension into Arabic

16 Thank You!
Questions?
Care to try a query on LIVIA?

