Overcoming language barriers in patent information search Sep. 2010, Geneva Daeshik Jeh Director General, Information Policy Bureau Korean Intellectual.


1 Overcoming language barriers in patent information search
Sep. 2010, Geneva
Daeshik Jeh, Director General, Information Policy Bureau, Korean Intellectual Property Office (KIPO)

2 Contents
1. Introduction
2. KIPO's Activities
3. Global Efforts
4. Conclusion

3 1. Introduction
Background
"Convertibility of information based on automatic translation or interpretation may shake up everything from employment and the organization of the office, to the role of literacy in daily life…"
- Power Shift by Alvin Toffler

4 1. Introduction
Background
As the world continues to come together in bodies such as the UN, WTO, WIPO, EU, BRICs, NAFTA, and APEC, it has become increasingly important to exchange, convert, and analyze information across languages.
The EU secretariat has approximately 4,000 translators and interpreters on its payroll, at a cost of around 800 million euros in […]. This amounts to 1% of its total budget and 40% of its administrative budget.
Despite all this effort, difficulties remain in multilingual translation (e.g., Finnish–English–Hungarian).
* Source: EU website

5 1. Introduction
Necessities – Patent examination
Patent Application
- Number of patent applications: a 26% increase from 2001 to 2007
- Number of patent applications by non-residents: continuously increasing; reached 43.3% of the total number of applications filed in 2007
[Chart: annual patent application totals with the non-resident share]
* Source: WIPO website

6 1. Introduction
Necessities – Patent examination
PCT Application
- PCT applications: a 48% increase from 2001 to 2007
- PCT applications in non-native English speaking countries: gradually increasing
- The PCT's official languages now include English, French, German, Japanese, Russian, Chinese, Spanish, Arabic, Korean, and Portuguese
[Chart: annual PCT applications, grouped as English (US, EP, GB, CA, AU) and non-English (JP, KR, CN, DE, RU)]
* Source: WIPO website

7 1. Introduction
Necessities – Patent examination
Patent Application
- Patent applications: a 26% increase from 2001 to 2007
- Patent applications by non-residents: continuously increasing, reaching 43.3% of the total number of applications filed in 2007
PCT Application
- PCT applications: a 48% increase from 2001 to 2007
- PCT applications in non-native English speaking countries: gradually increasing
- The PCT's official languages now include English, French, German, Japanese, Russian, Chinese, Spanish, Arabic, Korean, and Portuguese
Consequently, patent examination now requires citing and referring to foreign documents as much as domestic ones.

8 1. Introduction
Necessities – R&D
As technologies develop further, they become globalized beyond an enterprise's nationality and the conventional characteristics of a region.
Improve R&D projects
- Make prior art searches of patent databases mandatory in the planning and evaluation of R&D projects
Patent information should be widely used in R&D activities, and the recent advent of Open Innovation has made it more necessary than ever to refer to foreign patent information.

9 1. Introduction
How to overcome language barriers
Study the language of the target country
- Merits: faster, high-quality prior art searches
- Demerits: takes a long time to learn and become fluent in a foreign language
Hire multilingual search personnel
- Merits: more understandable translations and flexible management of human resources
- Demerits: poor prior art searches due to such personnel's lack of expert knowledge
Use a machine translation system (fast and cost-effective!)
- Merits: many translations in a short time
- Demerits: low-quality translations; a large initial investment is required

10 1. Introduction
Commercial Machine Translation Services
Many commercial MT services, including Google, are available to the public. They offer diverse services such as web page translation and translation toolbars.
Google (57 languages)
- Service: free online service; translation of web pages; cross-lingual retrieval system; Google toolbar and translator toolkit
- Remark: statistics-based translation service; convenient user feedback
Yahoo BABEL FISH (12 languages)
- Service: free online service (max. 150 words); translation of web pages; Yahoo toolbar
- Remark: technologies offered by SYSTRAN; based on English and French
SYSTRAN (52 languages)
- Service: fee-based service; translation service for use in multinational corporations
- Remark: available at the USPTO and web portals such as Yahoo, Lycos, and Altavista; based on English and French
World Lingo (33 languages)
- Service: free online translation services; fee-based service: web sites, translation API
- Remark: available at the EPO; machine translation services for enterprises including Microsoft

11 1. Introduction
Use of Commercial Machine Translation Services
Merits
- Since commercial MT services are continuously being extended to cover more languages, almost all patent documents in the world can be translated through them.
- Many free services are available to the public.
- As they cover general sentences, they can be applied to both patent and non-patent literature.

12 1. Introduction
Use of Commercial Machine Translation Services
Demerits
- Prior art searches through commercial MT services make it inconvenient to edit search queries. Moreover, search queries and results have to be copied and pasted one by one.
- Since commercial services are designed to support broad areas, they may be inefficient for a specialized area like patents.
Many IPOs, including KIPO, EPO, and JPO, use either customized commercial translation engines or engines developed in-house.

13 1. Introduction
Machine Translation Service Status of Some Major Countries in Asia
- Patent-specific MT services targeting non-native English speaking countries such as China, Japan, and Korea
- KIPO and JPO have customized commercial translation engines, while SIPO's was developed in-house.
Sirius (commercial service provider)
- Languages supported: Korean–English, Japanese–Korean
- K-PION: Korean patent and utility model gazettes and examination information in English
- KOMPASS: English/Japanese documents in Korean, targeting KIPO examiners
- KIPRIS: overseas documents for the Korean public (English/Japanese into Korean)
Toshiba (commercial service provider)
- Languages supported: Japanese–English, Japanese–Chinese
- AIPN: Japanese patent information in English, targeting overseas examiners
- IPDL: Japanese patent information in English for the public
Chinese Patent Information Center
- Languages supported: Chinese–English
- CPMT (China Patent Machine Translation): free public service for translating specifications and claims of gazettes into English

14 Contents
1. Introduction
2. KIPO's Activities
   2.1 MT Services
   2.2 Patent Information Search
3. Global Efforts
4. Conclusion

15 2. KIPO's Activities – MT Services
Status of KIPO's MT Services
J2K (Japanese to Korean) Translation Service
- Launched in 2000
- PL/NPL written in Japanese, for KIPO's examiners
- PL written in Japanese, for the general public

16 2. KIPO's Activities – MT Services
Status of KIPO's MT Services
K2E (Korean to English) Translation Service
- Launched in 2005
- For examiners of foreign IPOs and KIPO
- Korean patent documents
- K-PION service: 37 IPOs

17 2. KIPO's Activities – MT Services
Status of KIPO's MT Services
E2K (English to Korean) Translation Service
- Launched in 2008
- PL/NPL written in English, for KIPO's examiners
- PL written in English, for the general public

18 2. KIPO's Activities – MT Services
Specialized Machine Translation Services for Patent Documents
To improve the quality of machine translation engines, the following issues have been considered:
Linguistic features
- Word order: Korean and Japanese share the order Subject + Object + Verb phrase, while Chinese and English use Subject + Verb phrase + Object.
- Letters: English, German, and French use characters derived from the Latin alphabet, while Korean, Japanese, and Chinese have their own characters.
Digitization of patent documents
- The accuracy of digitizing patent documents through OCR greatly influences the quality of machine translations.

19 2. KIPO's Activities – MT Services
Specialized Machine Translation Services for Patent Documents
To improve the quality of machine translation engines, the following issues have been considered:
- Building of a patent-specific terminology dictionary
- Use of markup documents such as XML (e.g., KIPO has published patent gazettes in XML since February […])

Service type    ~ […]        […]        Total
K2E             3,200,000    300,000    3,500,000
E2K             3,000,000    300,000    3,300,000
J2K             1,200,000    300,000    1,500,000

20 2. KIPO's Activities – MT Services
Methods of improving translation quality
Features of patent documents
- Abstract: usually a single long sentence, and therefore highly prone to errors when machine translated
- Specification: the brief explanation of the drawings is written in simple sentences, and the other parts in general descriptive sentences
- Claims: a hierarchical tree structure of independent and dependent claims, written as noun phrases
[Figure: Korean Patent Gazette]
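
The hierarchical tree structure of claims mentioned on this slide can be pictured with a small sketch. This is only an illustration: the claim texts, the "according to claim N" wording, and the function name `build_claim_tree` are assumptions, not KIPO's actual data format or method.

```python
import re
from collections import defaultdict

# Hypothetical claim texts; a real gazette would supply these from its claims section.
CLAIMS = {
    1: "A display device comprising a panel and a driver.",
    2: "The display device according to claim 1, wherein the panel is flexible.",
    3: "The display device according to claim 2, wherein the driver is integrated.",
    4: "A method of manufacturing a display device.",
}


def build_claim_tree(claims):
    """Map each parent claim number to its dependent claims (0 = independent claims)."""
    tree = defaultdict(list)
    for number, text in sorted(claims.items()):
        # A dependent claim is assumed to cite its parent as "claim N".
        match = re.search(r"claim (\d+)", text)
        parent = int(match.group(1)) if match else 0
        tree[parent].append(number)
    return dict(tree)


print(build_claim_tree(CLAIMS))
```

Knowing which claims depend on which lets an engine translate a shared noun phrase once and reuse it down the tree, which is one plausible reason the slide singles out this structure.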

21 2. KIPO's Activities – MT Services
Methods of improving translation quality
Features of patent documents
- In XML documents, the tags help users identify the different sections described in the previous slide.
- Different translation protocols are applied depending on the tag information of the patent gazette.
[Figure labels: Name; Abstract, Summary; Description; Drawings; Claims; Others]

22 2. KIPO's Activities – MT Services
Example – Korean Patent Gazette
XML of Korean Patent Gazette: REQ_HNM_KE, REQ_KE, REQ_ABS_KE, REQ_DRDES_KE, REQ_CLAIM_KE
Sample content: Oh Eun Young …; This invention…; 1..; Drawing 1 is a…..; Methodology of…
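
A minimal sketch of choosing a translation protocol from the tag of each gazette section follows. Only the tag names come from the slide. The sample XML layout, the reading of `REQ_ABS_KE`, `REQ_DRDES_KE`, and `REQ_CLAIM_KE` as abstract, drawing description, and claims, and the protocol names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical gazette fragment; the <gazette> wrapper and texts are invented.
SAMPLE = """<gazette>
  <REQ_ABS_KE>This invention relates to a display device.</REQ_ABS_KE>
  <REQ_DRDES_KE>Drawing 1 is a sectional view.</REQ_DRDES_KE>
  <REQ_CLAIM_KE>A display device comprising a panel.</REQ_CLAIM_KE>
</gazette>"""

# Assumed mapping from section tag to translation protocol.
PROTOCOL_BY_TAG = {
    "REQ_ABS_KE": "long-sentence",      # abstracts: one long sentence
    "REQ_DRDES_KE": "simple-sentence",  # drawing descriptions: simple sentences
    "REQ_CLAIM_KE": "noun-phrase",      # claims: noun phrases
}


def plan_translation(xml_text):
    """Return (tag, protocol, text) per section, falling back to 'general'."""
    root = ET.fromstring(xml_text)
    return [
        (child.tag, PROTOCOL_BY_TAG.get(child.tag, "general"), (child.text or "").strip())
        for child in root
    ]


for tag, protocol, text in plan_translation(SAMPLE):
    print(f"{tag}: {protocol}: {text}")
```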

23 2. KIPO's Activities – MT Services
Applicability to Patent Documents Produced by Other IPOs
- A consistent pattern depending on each item
- Patterns distinguished in markup documents such as XML
EPO, USPTO
- Abstract: short sentences of fewer than 150 words
- Description: low possibility of errors when translated, since it consists of short sentences and general statements
JPO
- Abstract: summarized in fewer than 400 words
- Description: the brief description of drawings is written in short sentences; the entire Description consists of general statements
SIPO
- Abstract: concise statement with a single sentence or described respectively
- Description: the brief description of drawings is written in short noun phrases without commas or periods; the other parts of the Description are written in general statements
Claims (all offices): tree structure with independent and dependent claims, written in noun phrases or clauses

24 2. KIPO's Activities – Patent Information Search
Patent Information Search using MT engines
To use MT engines for patent information search, the following issues have been considered:
- Target users and objectives of MT services: internal examiners or foreign examiners
- Building of a database: original documents or machine-translated documents
- Formulation of search queries (e.g., operators, terminology dictionary)
- Screen layout / organization
[Diagram 1: Users, Machine Translator, DB (original docs.)]
[Diagram 2: Users, DB (machine-translated docs.), Machine Translator, DB (original docs.)]
* In terms of cost-benefit analysis, the former is better when foreign documents are used infrequently, while the latter is better when they are used frequently.
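
The trade-off between the two database designs on this slide can be sketched roughly as below. The class names, the romanized toy lexicon, and `toy_translate` are hypothetical stand-ins for a real MT engine and document store. The counters simply show that translation work grows with usage in the first design and is paid once in the second.

```python
class OnDemandSearch:
    """Former design: search the originals, translate each hit at query time."""

    def __init__(self, originals, translate):
        self.originals = originals
        self.translate = translate
        self.translation_calls = 0

    def search(self, source_keyword):
        hits = [doc for doc in self.originals if source_keyword in doc]
        self.translation_calls += len(hits)
        return [self.translate(doc) for doc in hits]


class PreTranslatedSearch:
    """Latter design: translate every document once, then search the translated DB."""

    def __init__(self, originals, translate):
        self.translated = [translate(doc) for doc in originals]
        self.translation_calls = len(originals)

    def search(self, target_keyword):
        return [doc for doc in self.translated if target_keyword in doc]


def toy_translate(doc):
    # Stand-in for a real MT engine: a fixed word-for-word lookup.
    lexicon = {"denchi": "battery", "gamen": "screen"}
    return " ".join(lexicon.get(word, word) for word in doc.split())


docs = ["denchi pack", "gamen panel", "denchi cell"]
on_demand = OnDemandSearch(docs, toy_translate)
pre_built = PreTranslatedSearch(docs, toy_translate)
print(on_demand.search("denchi"), on_demand.translation_calls)
print(pre_built.search("battery"), pre_built.translation_calls)
```

In this sketch the pre-translated design also lets users query in their own language, while the on-demand design still needs a source-language keyword.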

25 2. KIPO's Activities – Patent Information Search
KOMPASS (Korean Multifunctional Patent Search System)
KOMPASS targets KIPO examiners and supports patent information searches in English and Japanese. It offers separate integrated searches in Korean, English, and Japanese.
- The Korean integrated search function targets Korean and Japanese documents.
- The English integrated search function targets all kinds of data retrieved from English documents, and the search results can be translated into Korean.
- The Japanese integrated search function targets all kinds of data retrieved from Japanese documents, and the search results can be translated into Korean (only for patents and utility models).
(Japanese documents: database built from machine-translated documents)

26 2. KIPO's Activities – Patent Information Search
KOMPASS (Korean Multifunctional Patent Search System)
[Diagram: Users, Machine Translator, DB (original docs.)]
- Previously, Japanese gazettes were searchable through machine translation.
- As use by KIPO examiners increased rapidly, the search speed became slower.

27 2. KIPO's Activities – Patent Information Search
KOMPASS (Korean Multifunctional Patent Search System)
[Diagram: Users, DB (machine-translated docs.), Machine Translator, DB (original docs.)]
- In 2009, for faster searching, all Japanese gazettes were machine-translated and used to build a database.
- KIPO examiners' convenience has been greatly improved.

28 2. KIPO's Activities – Patent Information Search
KOMPASS (Korean Multifunctional Patent Search System)
Korean Search: Korean keyword search of Japanese documents (using the J2K database)

29 2. KIPO's Activities – Patent Information Search
K-PION (Korean Patent Information Online Network)
K-PION is a free search service that helps foreign examiners better understand Korean patent information (examinations, gazettes, etc.). It also supports an English keyword search service.
Patent information retrieval (search results can be translated into English):
- Retrieval of Korean patent and utility model gazettes and examination information from original and machine-translated documents
- English keyword search service for KPAs
- Service for Korean industrial designs and trademarks, including PCT-related documents
- English keyword search service for Korean patent and utility model gazettes
English keyword search flow (diagram labels: Applicant, Foreign Examiners):
- Input English keywords
- Keywords are automatically translated into Korean
- Extended to Korean synonyms
- Search Korean gazettes
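
The English keyword search flow on this slide (input English keywords, translate into Korean, extend to synonyms, search gazettes) might look like the following in outline. All data here is invented: the romanized Korean terms, the gazette IDs, and the function name `search_korean_gazettes` are assumptions, and a real service would rely on an MT engine and a patent-specific terminology dictionary rather than these tiny lookup tables.

```python
# Toy lookup tables standing in for the MT engine and synonym dictionary.
E2K = {"battery": "jeonji"}
SYNONYMS = {"jeonji": ["jeonji", "baeteori"]}

# Hypothetical gazette index: ID -> searchable text.
GAZETTES = {
    "KR-A": "baeteori pack cooling structure",
    "KR-B": "jeonji electrode material",
    "KR-C": "display panel driver",
}


def search_korean_gazettes(english_keyword):
    """English keyword -> Korean keyword -> synonyms -> matching gazette IDs."""
    korean = E2K.get(english_keyword.lower())
    if korean is None:
        return []
    terms = SYNONYMS.get(korean, [korean])
    return sorted(
        doc_id
        for doc_id, text in GAZETTES.items()
        if any(term in text.split() for term in terms)
    )


print(search_korean_gazettes("Battery"))
```

Without the synonym step, the query would miss the gazette that uses the alternative term, which is the point of the extension stage in the diagram.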

30 Contents
1. Introduction
2. KIPO's Activities
3. Global Efforts
   3.1 IP5 Foundation Project on Mutual Machine Translation
   3.2 Cross-Lingual Information Retrieval
4. Conclusion

31 3. Global Efforts
IP5 Foundation Project on Mutual Machine Translation
The IP5 offices will improve the quality of machine translation (MT) services and harmonize MT services among themselves.
Achieved by:
Improvement of the quality of MT
- Joint quality review of non-English-to-English MT by English-speaking offices
- MT system upgrades based on the quality review results
- Reduction of errors in original documents
Harmonization of MT services
- Harmonization of the contents of MT services
Regarding searches, this project will help each office better understand the prior art documents of other offices and use them in citations.

32 3. Global Efforts
WIPO's CLIR (Cross-Lingual Information Retrieval)
- CLIR has been newly added to PATENTSCOPE, and the beta version is currently being tested by the public.
- When searching PCT and national application data, input keywords can be extended into other languages such as English, French, German, Japanese, and Spanish.
- Linked to the Google translation service; search results are available in all the languages it supports.
- Covers over 1.7 million published international patent applications (PCT), and more than 3 million documents when regional and national collections are included.
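
As a rough illustration of extending an input keyword into other languages, the sketch below builds a single OR query from a small term table. The slides do not describe how PATENTSCOPE performs this expansion, so `TERM_TABLE`, `expand_query`, and the quoted OR syntax are all assumptions.

```python
# Hypothetical multilingual term table keyed by English keyword.
TERM_TABLE = {
    "battery": {"fr": "batterie", "de": "Batterie", "es": "batería"},
}


def expand_query(keyword, languages):
    """Build an OR query from the keyword and its known translations."""
    translations = TERM_TABLE.get(keyword.lower(), {})
    terms = [keyword]
    for lang in languages:
        term = translations.get(lang)
        # Skip languages with no entry and avoid exact duplicate terms.
        if term and term not in terms:
            terms.append(term)
    return " OR ".join(f'"{t}"' for t in terms)


print(expand_query("battery", ["fr", "de", "ja"]))
```

A language with no entry in the table is silently skipped here, so the query falls back to the original keyword alone when nothing is known.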

33 Contents
1. Introduction
2. KIPO's Activities
3. Global Efforts
4. Conclusion

34 4. Conclusion
- Considering the tremendous amount of global patent information, machine translation services will be the most practical and efficient way to search the patent information of other IPOs.
- There are many ways to implement a patent search system using an MT engine. In selecting a specific methodology, each IPO should consider frequency of use, budget, and linguistic features.
- To improve the performance of MT and search systems, each IPO may consider options such as building a machine-translated database, building a patent-specific terminology dictionary, and adopting technologies such as XML.
- International cooperation among IPOs is very important for improving MT quality.
- KIPO has done its utmost to overcome language barriers and enable non-Korean speakers to better access Korean patent information, and will continue to collaborate with other IPOs in this regard.
