Far-reaching Impact of MMT. Zhendong Dong, Center of Computer Language Information Engineering, CAS. Panel on MMT, MT Summit X, Phuket, Thailand
What I wrote in 2000 … Chinese source text of my article: [Chinese text not preserved in this transcript] English translation by a commercial MT system in China (with no post-editing): During this period, China joined the collaborative project researched and developed by the machine translation of the Five countries in Asia which Japan initiates. Domestic unit nearly 10 has participated in this 7 -year long international project. The large-scale cooperation this time, for training talents, spread technology, accumulating resources (such as dictionary), and make the machine translation of China study and go to the world, all have far-reaching influence.
What I wrote recently … In the late 80s, during Makoto Nagao's frequent academic visits to China, I translated for him and learned a great deal from his rich experience in building semantic dictionaries, especially the principles for semantic classification of nouns and verbs. From 1987 to 1992, when I was the chief technical leader of the Chinese team participating in the machine translation project between Japan and four other Asian countries, I learned quite a lot from various kinds of MT dictionaries of Japanese IT companies and labs, especially from Japan's EDR concept dictionary.
A training course for NLP researchers: fundamental and comprehensive. Dictionary construction (Fujitsu / Toshiba); text analysis (NEC); text generation (Hitachi); input and output systems (Sharp / Oki); interlingua
An MT laboratory: dictionary making; C programming; technologies covering most NLP phases; full parsing (cf. partial parsing); full phrases (cf. base NP)
An accumulation of NLP resources
An incubator of MT industry (1) Directly nurtured: China National Software & Service Co. Ltd. -- E=C, C-J products; CCID Chinese Information Processing Labs -- E=C products; GE-soft Co., Ltd. -- multilingual machine-aided products; Nanjing University -- J=C products
An incubator of MT industry (2) Encouraged: Huajian Group, China's biggest MT company -- invited speech by Dr. Heyan Huang; MT labs, Harbin University of Technology -- exhibition; Xiamen University; Institute of Automation -- invited speech by Dr. Bo Xu
An approach to international area of R&D
What should we learn? The resources and techniques of classical MT are useful; it would be dangerous and harmful to give them up; be on the alert to the recent clamor over SMT; be careful not to depict a new MT beauty to users (an old ugly grandma again in the end!)
Strengths of SMT: aligned resources are easy to share; it is easy and fairly fast to shift to a new language pair; it is easy to accumulate extra-linguistic knowledge
Weaknesses of SMT: it is not so easy to meet the wide demands of commercialization; high-quality translation remains impossible for the foreseeable future; speed and computing power
Translation quality. Source text: SEOUL, South Korea -- North Korea has accepted the idea of working toward restraint in its missile program, U.S. officials said Tuesday, citing progress on a critical issue dividing the two countries as they explore reconciliation after 50 years. Secretary of State Madeleine Albright ended her historic talks with North Korean leader Kim Jong Il struck by the improbability of it all -- a cordial visit to a Stalinist land that the United States until recently called a rogue state.
Translation by SMT: [Chinese output not preserved in this transcript; only the fragments "50" and "improbability" survive]
Translation by classical MT
One more interesting example -- probably SMT's deadly point. Source text: … Translation by an SMT system: The young athletes of China … said the head of Singapore sports committee.
Technically on MMT: interlingua is not completely successful; interlingua for vocabulary is not successfully dealt with in MMT; interlingua in terms of relations among words is OK, but not sufficient in MMT; interlingua is difficult for multilingual processing