The PATENTSCOPE search system: CLIR. February 2013. Sandrine Ammann, Marketing & Communications Officer.

Presentation transcript:


Welcome to the PATENTSCOPE search system webinar: CLIR

Agenda
CLIR definition
History
Search with CLIR
Usefulness
Golden rules
Technicalities
Q & A session

CLIR: Cross-Lingual Information Retrieval
Finds synonyms in different domains
Translates the synonyms found, plus the original query, into different languages
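
The two steps above (find synonyms, then translate them together with the original query) can be sketched in Python. This is only an illustration: the synonym and translation tables are small hand-made assumptions, not PATENTSCOPE data, and the output is a generic Boolean string rather than the system's actual query syntax.

```python
# Hypothetical lookup tables, for illustration only.
SYNONYMS = {"container": ["receptacle", "vessel"]}
TRANSLATIONS = {
    "container": {"fr": ["conteneur"], "de": ["Behälter"]},
    "receptacle": {"fr": ["récipient"]},
    "vessel": {"de": ["Gefäß"]},
}


def expand_query(term, languages):
    """Build an OR query from the term, its synonyms and their translations."""
    # Step 1: original term plus its synonyms.
    terms = [term] + SYNONYMS.get(term, [])
    expanded = list(terms)
    # Step 2: add the translations of every term for each requested language.
    for t in terms:
        for lang in languages:
            expanded.extend(TRANSLATIONS.get(t, {}).get(lang, []))
    # Remove duplicates while keeping the original order.
    unique = list(dict.fromkeys(expanded))
    return " OR ".join(f'"{t}"' for t in unique)


print(expand_query("container", ["fr", "de"]))
```

The expanded query remains readable and editable, which mirrors the "no black box" point made later in the presentation.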

CLIR: 12 languages available
Non-Asian: Dutch, English, French, German, Italian, Portuguese, Russian, Spanish, Swedish
Asian: Chinese, Japanese, Korean

History

Lower language barriers in patent search
First language tool developed in-house

CLIR: the interface

CLIR: precision vs recall
Precision = the ability to retrieve only results that are relevant to the query. Trying to find only precisely relevant items (high precision) risks missing important items because they do not use quite the same vocabulary.
Recall = the ability to retrieve as many documents as possible that match or are related to a query. Trying to find all the relevant items (high recall) often returns a lot of junk.
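
The trade-off can be made concrete with the standard information-retrieval definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The document identifiers below are invented for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a result list and a set of relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: four documents are truly relevant.
relevant = {"D1", "D2", "D3", "D4"}
narrow_search = ["D1", "D2"]  # few results, all relevant
broad_search = ["D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8"]  # everything, plus junk

print(precision_recall(narrow_search, relevant))  # (1.0, 0.5)
print(precision_recall(broad_search, relevant))   # (0.5, 1.0)
```

The narrow search misses half of the relevant documents, while the broad search finds them all at the cost of four irrelevant results.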

CLIR: precision vs recall

Example: precision

Example: recall

Example: ARM

CHIP

CLIR: supervised mode
2 modes: automatic and supervised
Automatic: 1 step
Supervised: 4 steps

Cross-Lingual Expansion (CLIR)

Result: the query from “container” to:

Supervised mode: 1 of 4 steps

Supervised mode: 2 of 4 steps

Supervised mode: 3 of 4 steps

Crowdsourcing "is the practice of obtaining needed services, ideas, or content by soliciting contributions from a large group of people and especially from the online community rather than from traditional employees or suppliers. […] Crowdsourcing is different from an ordinary outsourcing since it is a task or problem that is outsourced to an undefined public rather than a specific body." source:

Supervised mode: 4 of 4 steps

First: select languages

Second: select parameters

Stemming
A process that removes common endings from words, using the English Porter algorithm
electric¦al = electric
electric¦ity = electric
electron¦ics = electron
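
The idea of suffix stripping can be shown with a toy function. This is not the Porter algorithm, which applies several ordered rule sets with conditions on the remaining stem and may cut these words further; the suffix list here is an assumption chosen only to reproduce the three examples above.

```python
# Toy suffix list (an assumption for this example, not Porter's rule set).
SUFFIXES = ("ity", "ics", "al")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for word in ("electrical", "electricity", "electronics"):
    print(word, "->", toy_stem(word))
```

With stemming switched on, a search for one form of a word can also match documents that use the other forms sharing the same stem.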

Third: check variants

Second: check variants

Editing

Checking: IPC

Supervised mode: results

Search examples: clothes for sport
Entering “sports clothing” in the Simple search interface will return 168 results
Entering “sports clothing” in the CLIR interface (in automatic mode) will return 5,449 results
Entering “sports clothing” in the CLIR interface (in supervised mode) will return 1,023 results

Why use CLIR?
A) Search full-text collections simultaneously in many foreign languages
B) Significantly increase the number of relevant results without significantly increasing the number of irrelevant results
485 results in English titles or abstracts for “sports clothing”
575 results obtained with CLIR searching in titles or abstracts in all languages
C) Have confidence in your searches: no black box. Users have access to the CLIR-generated Boolean queries (albeit complex) and have full control over them
D) Have a responsive system even for complex queries

Golden rules
Expansion modes
Very specific keyword with only one meaning: AUTOMATIC
For any other query, SUPERVISED is recommended
Variants/synonyms
Select words that you would like to appear in your search results
If you have too much noise in the result list, remove generic variants

Golden rules
Parameters
1. Title and abstract: unconstrained distance
2. Claims: sentence/paragraph distance
3. Description: sentence/paragraph distance
Stemming recommended

Technicalities
Compilation of a long list of titles in language pairs
Creation of in-house extraction methodology
Tool learns statistical bilingual dictionaries of titles
(languages shown on the slide: EN, FR, ZH, DE, KO, ES)
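
The slides do not describe the in-house extraction methodology. As a rough illustration of how a bilingual dictionary can be learned statistically from paired titles, the sketch below uses simple co-occurrence counts scored with the Dice coefficient. Both the technique and the four title pairs are assumptions made for this example, not the actual WIPO method or data.

```python
from collections import Counter
from itertools import product

# Hypothetical parallel titles (English, French), for illustration only.
TITLE_PAIRS = [
    ("plastic container", "conteneur en plastique"),
    ("container lid", "couvercle de conteneur"),
    ("plastic lid", "couvercle en plastique"),
    ("metal container", "conteneur en métal"),
]


def learn_dictionary(pairs):
    """Map each source word to the target word it co-occurs with most strongly."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        # Count every source/target word pair seen in the same title pair.
        joint.update(product(src_words, tgt_words))
    best = {}
    for (s, t), n in joint.items():
        # Dice coefficient: 2 * joint count / (source count + target count).
        dice = 2 * n / (src_count[s] + tgt_count[t])
        if dice > best.get(s, ("", 0.0))[1]:
            best[s] = (t, dice)
    return {s: t for s, (t, _) in best.items()}


print(learn_dictionary(TITLE_PAIRS))
```

Even with four title pairs, frequent function words such as "en" score lower than the true translations because they also appear in titles that do not contain the source word. More title pairs sharpen these statistics.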

Technicalities
Quality of dictionaries: no human intervention
The more titles available, the better the coverage
Chinese, Korean, Dutch, English, Portuguese, Italian, French, Russian, Swedish, German, Spanish, Japanese

Technicalities
Disambiguation: the process of identifying the sense of a word in a sentence.
Disambiguation is applied to keywords:
1. Technical domains based on the IPC
2. Synonyms selection
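
Domain-based disambiguation can be pictured as a lookup keyed by technical domain. In the sketch below, the keyword "chip" receives different variants depending on the selected IPC subclass. The sense table is a made-up assumption for illustration; H01L (semiconductor devices) and A23L (foods) are real IPC subclasses, but the variants attached to them here are not taken from PATENTSCOPE.

```python
# Hypothetical sense table: keyword -> IPC subclass -> variants.
SENSES = {
    "chip": {
        "H01L": ["integrated circuit", "microchip"],
        "A23L": ["crisp", "potato chip"],
    }
}


def variants_for(keyword, ipc_domain):
    """Return the keyword plus the variants that fit the chosen technical domain."""
    domain_variants = SENSES.get(keyword, {}).get(ipc_domain, [])
    return [keyword] + domain_variants


print(variants_for("chip", "H01L"))
print(variants_for("chip", "A23L"))
```

Choosing the domain first keeps food-related variants out of an electronics search, and the reverse.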

Future plans
Improve terminology coverage of already supported languages
Add other languages: over 200’000 titles and abstracts with associated high-quality translations in English

Slides and recording +

Thank you