The SLING project is funded by the European Commission within Research Infrastructures of the FP7 Capacities Specific Programme, grant agreement number.

Presentation transcript:

Non-Redundant Patent Sequence Databases
Ana Richart de la Torre & Irina Benediktovich

The SLING project is funded by the European Commission within Research Infrastructures of the FP7 Capacities Specific Programme, grant agreement number (Integrating Activity).

Current Situation: 500 identical results. Too much to analyze! The search process needs to be accelerated.

Why can we have 500 identical hits?

The same sequence can appear multiple times in the database because:
1) The same invention is filed multiple times in different offices.
2) Different inventors use the same sequence in different contexts.

[Figure: identical sequences from Invention A and Invention B; the publications of one invention across offices (EP, WO, US, JP) form a simple family. Accession labels are only partly recoverable (CS017585, ACQ13114, AAR79155, DD649656).]

International Cooperation

The Trilateral patent offices exchange and publish their biological sequences through the public database providers (INSDC). We expect more redundancy in the near future, since other national offices will participate in the data exchange.

[Figure labels: USPTO, JPO, KIPO, EPO, KOBIC, NO]

PROJECT OVERVIEW

DATA CAPTURE: Architecture of the sequence data capture application

DATA CAPTURE

Sequence detection algorithm: Detects the presence of sequences in the patent application, using a multi-scanning process with different detection levels.

Data management workflows: Increase the database coverage without creating more redundancy.

PROJECT OVERVIEW

Non-Redundant Patent Sequence Databases: 2 types of NR databases (statistics Sept 2010)

NR Database | Abbreviation | Coverage | Number of entries | Redundancy before NR
NR Patent Nucleotides Level 1 | NRNL1 | EMBL-Bank patents (17,526,371 entries) | 10,077,... | ...
NR Patent Nucleotides Level 2 | NRNL2 | EMBL-Bank patents (17,526,371 entries) | 14,612,... | ...
NR Patent Proteins Level 1 | NRPL1 | EPO+JPO+KIPO+USPTO (4,947,423 entries) | 2,124,798 | 2.33
NR Patent Proteins Level 2 | NRPL2 | EPO+JPO+KIPO+USPTO (4,947,423 entries) | 3,372,114 | 1.47
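The "Redundancy before NR" figures for the protein databases are consistent with a simple ratio: entries in the source collection divided by entries in the non-redundant database. The slides do not state this formula, so the sketch below treats it as an assumption and only checks it against the two protein rows; the function name is illustrative.

```python
def redundancy(source_entries: int, nr_entries: int) -> float:
    """Average number of source entries collapsed into one NR entry.

    Assumption: redundancy = source entries / non-redundant entries.
    """
    return source_entries / nr_entries


# Protein figures quoted in the statistics (Sept 2010).
PROTEIN_SOURCE_ENTRIES = 4_947_423
NRPL1_ENTRIES = 2_124_798
NRPL2_ENTRIES = 3_372_114

nrpl1_redundancy = round(redundancy(PROTEIN_SOURCE_ENTRIES, NRPL1_ENTRIES), 2)
nrpl2_redundancy = round(redundancy(PROTEIN_SOURCE_ENTRIES, NRPL2_ENTRIES), 2)

print(nrpl1_redundancy)  # 2.33
print(nrpl2_redundancy)  # 1.47
```

Under this reading, a Level 1 protein entry stands for a little over two original entries on average, and a Level 2 entry for about one and a half.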

1) We calculate a "fingerprint" (checksum) per sequence, since it is faster to compare checksums than sequences.
2) We merge into the same entry all the sequences with the same "fingerprint" that belong to the same invention (simple family).

[Figure: three identical sequences (caggc....gatcc), A) from Umbrella Corp., B) from SuperGen Ltd. and C) from GeneTech S.A., share a single checksum (shown only in part: 00003f38f...).]
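The two steps above can be sketched as follows. The slides name MD5 as the checksum; everything else here is an assumption made for illustration: the record fields (`accession`, `family_id`, `sequence`), the accession values, and the normalization (removing whitespace, upper-casing) applied before hashing.

```python
import hashlib
from collections import defaultdict


def fingerprint(sequence: str) -> str:
    """Return the MD5 hex digest of a sequence.

    Assumption: whitespace is removed and case is ignored before hashing.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def merge_by_family(records):
    """Group accessions that share both a fingerprint and a simple family."""
    clusters = defaultdict(list)
    for rec in records:
        key = (fingerprint(rec["sequence"]), rec["family_id"])
        clusters[key].append(rec["accession"])
    return dict(clusters)


# Hypothetical records: A1 and A2 come from the same family, B1 from another.
records = [
    {"accession": "A1", "family_id": "FAM1", "sequence": "caggcgatcc"},
    {"accession": "A2", "family_id": "FAM1", "sequence": "CAGGC GATCC"},
    {"accession": "B1", "family_id": "FAM2", "sequence": "caggcgatcc"},
]

clusters = merge_by_family(records)
for (checksum, family), members in clusters.items():
    print(family, checksum[:8], members)
```

All three records produce the same checksum, but only A1 and A2 are merged, because B1 belongs to a different family. Comparing fixed-length digests avoids comparing full sequences pairwise.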

[Figure: L2 entry: links to family members, earliest priority in family, earliest PD in family. L1 entry: all families, cluster members (from SEQ-DB).]

PROJECT OVERVIEW

Correction of Publication Numbers and Kind Codes

Identical sequences stemming from the same invention (same family) very often have different annotations. In the NR databases at Level 2, we have merged all the annotations into a single record while keeping the links to the original entries.
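A minimal sketch of such a merge, assuming each original entry carries an accession, a publication date and a set of annotation strings. The class names, field names and sample values are hypothetical and are not taken from the actual NR database format.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    accession: str
    publication_date: str  # ISO 8601 date, so string order equals date order
    annotations: set


@dataclass
class MergedRecord:
    members: list              # links back to the original entries
    annotations: set           # union of all member annotations
    earliest_publication: str  # earliest publication date among members


def merge_cluster(entries):
    """Collapse the entries of one cluster into a single record."""
    annotations = set()
    for entry in entries:
        annotations |= entry.annotations
    return MergedRecord(
        members=[entry.accession for entry in entries],
        annotations=annotations,
        earliest_publication=min(e.publication_date for e in entries),
    )


# Hypothetical cluster of two identical sequences from the same family.
cluster = [
    Entry("X1", "2008-05-14", {"organism: Homo sapiens"}),
    Entry("X2", "2007-11-02", {"feature: CDS", "organism: Homo sapiens"}),
]
merged = merge_cluster(cluster)
print(merged.members, merged.earliest_publication, sorted(merged.annotations))
```

The merged record holds every distinct annotation once, while `members` preserves the path back to each original entry.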

Final Result

Example: The user would have to analyze 5 entries. Only 1 entry has to be checked with the non-redundant database!

[Figure: a non-redundant entry showing the earliest PR, the first publication in the sequence databases, the biological annotations, the sequence and checksum (MD5), and 5 cluster members with publication corrections.]

The Non-Redundant databases are publicly available through the EBI

For more Information:

CONCLUSIONS

Similarity and homology sequence searches against a non-redundant database are faster and more sensitive, since fewer entries need to be scanned in the search process.

These databases are the first non-redundant collection that takes both sequence and family concepts into consideration.

The publication data corrections significantly increase the data quality.

The availability of the earliest publication date provides a direct link to track the patent history.

The collation of all the biological features in a single record significantly improves the understanding of the biological context in which the sequence is being used.

The joint efforts and collaboration of the patent offices and the applicants in providing sequences with high-quality biological annotations are beneficial for all the users of the public services.

Thank you
Ana Richart de la Torre & Irina Benediktovich