The Protein Identifier Cross-Reference (PICR) service.


1 The Protein Identifier Cross-Reference (PICR) service

2 Overview (EBI Roadshow Rotterdam, 12 June 2012; Juan A. Vizcaíno, juan@ebi.ac.uk)
- The problem…
- What is PICR?
- Access via web and web services

3 The problem…
Protein list A (DB search vs. IPI): IPI000001, IPI000002, IPI000003, …
Protein list B (DB search vs. UniProt): P00001, P00002, P00003, …
- The two groups reported their results against different protein databases.
- The results therefore cannot be compared directly.

4 The problem…
Protein list A (DB search vs. IPI) and protein list B (DB search vs. UniProt) cannot be compared directly, because the two groups reported their results against different protein databases. A tool such as PICR is needed to map both lists to the same identifiers before a direct comparison can be made.
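As a rough illustration of the comparison this slide is asking for, the sketch below maps list A into the identifier space of list B and then compares the two as sets. The mapping table `ipi_to_uniprot` is invented for the example (the slides do not give any real mappings), and the helper `to_common_space` is a hypothetical name; in practice the table would be the output of a mapping service such as PICR.

```python
# Hypothetical cross-reference table: IPI accession -> UniProt accession.
# These pairs are invented for illustration only; real mappings would
# come from an identifier-mapping service such as PICR.
ipi_to_uniprot = {
    "IPI000001": "P00001",
    "IPI000002": "P00009",
    "IPI000003": "P00003",
}

list_a = ["IPI000001", "IPI000002", "IPI000003"]  # reported against IPI
list_b = ["P00001", "P00002", "P00003"]           # reported against UniProt


def to_common_space(accessions, mapping):
    """Translate accessions through `mapping`; also return the ones with no mapping."""
    mapped = {mapping[acc] for acc in accessions if acc in mapping}
    unmapped = [acc for acc in accessions if acc not in mapping]
    return mapped, unmapped


mapped_a, unmapped_a = to_common_space(list_a, ipi_to_uniprot)
set_b = set(list_b)

shared = mapped_a & set_b   # proteins reported by both groups
only_a = mapped_a - set_b   # reported only by group A
only_b = set_b - mapped_a   # reported only by group B

print("shared:", sorted(shared))
print("only in A:", sorted(only_a))
print("only in B:", sorted(only_b))
print("unmapped from A:", unmapped_a)
```

Keeping the unmapped accessions in a separate list matters in practice: obsolete or deleted identifiers should be reported rather than silently dropped from the comparison.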

5 Databases are all different

6 Databases evolve

7 Why do you need ID mapping?
- Merging datasets to a common identifier space
- Finding all aliases/synonyms for an identifier (data integration – submissions!)
- Mapping from secondary IDs to more recent primary IDs (data “freshness”)
- Preparing data sets for specific tools
- Querying in various primary databases (data format requirements)

8 Protein identifier mapping is hard
The basic problem: the same protein sequence is referred to by multiple accession numbers assigned by multiple databases.
- No universal identifier scheme
- Redundant databases: multiple identifiers for the same sequence in the same database
- Unstable identifiers (e.g. GI numbers)
- Obsolete and deleted identifiers (hypothetical proteins)
- Different production cycles for major databases
- Tools exist, but are limited in their database and species coverage and in their usability and availability.
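To make the basic problem concrete, here is a minimal sketch that groups accessions from different databases by identical sequence, so that every accession can be looked up to find its aliases. It is not a description of how PICR works internally (the slides do not say); the records, sequences, and the helper name `sequence_key` are all invented for the example.

```python
import hashlib
from collections import defaultdict

# Invented records: (accession, source database, protein sequence).
records = [
    ("IPI000001", "IPI", "MKTAYIAKQR"),
    ("P00001", "UniProt", "MKTAYIAKQR"),   # same sequence, different accession
    ("P00002", "UniProt", "MSDNELLQTA"),
]


def sequence_key(sequence):
    """Return a digest of the sequence after removing whitespace and upper-casing."""
    normalized = "".join(sequence.split()).upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


# Collect all accessions that share the same sequence digest.
groups = defaultdict(list)
for accession, _database, sequence in records:
    groups[sequence_key(sequence)].append(accession)

# For each accession, the sorted list of accessions with the same sequence.
aliases = {
    accession: sorted(members)
    for members in groups.values()
    for accession in members
}

print(aliases["IPI000001"])
```

Exact sequence identity is the simplest possible criterion. It does not address unstable or obsolete identifiers, which need version history from the source databases.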

9 PICR Home Page
- Submit accessions or sequences (FASTA); the interactive form is limited to 500 entries (no limit for batch submissions)
- Select the output format
- Select one or many databases to map to in one request
- Limit the search by taxonomy (pessimistic)
- Choose to return all mappings or only active ones
- Run the search
- BLAST functionality for protein fragments

10 PICR Result Page – simple view
- Logical xref (hyperlinked)
- Inactive xref
- Secondary identifier
- Active xref (hyperlinked)

11 PICR Result Page – detailed view

12 PICR Result Page – XLS view

13 PICR services
PICR offers both SOAP and REST web service interfaces. Documentation is available online:
- SOAP: http://www.ebi.ac.uk/Tools/picr/WSDLDocumentation.do
- REST: http://www.ebi.ac.uk/Tools/picr/RESTDocumentation.do
Sample client code and URL examples are available on the PICR website.
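The slides do not show what a REST request looks like, so the following is only a sketch of how a client might assemble one. The base URL is taken from the documentation links above; the endpoint path `rest/getUPIForAccession`, the parameter names `accession` and `database`, and the database labels are assumptions that should be checked against the REST documentation before use. The code only builds the URL and does not contact the service.

```python
from urllib.parse import urlencode

# Base URL taken from the documentation links on the slide.
PICR_BASE = "http://www.ebi.ac.uk/Tools/picr"

# ASSUMPTION: endpoint path and parameter names are illustrative and
# must be verified against the PICR REST documentation.
ENDPOINT = "rest/getUPIForAccession"


def build_query_url(accession, databases, base=PICR_BASE, endpoint=ENDPOINT):
    """Build a GET URL asking for mappings of one accession to several databases."""
    params = [("accession", accession)]
    # Repeat the 'database' parameter once per target database.
    params += [("database", name) for name in databases]
    return f"{base}/{endpoint}?{urlencode(params)}"


# 'SWISSPROT' and 'IPI' are placeholder database labels for the example.
url = build_query_url("P00001", ["SWISSPROT", "IPI"])
print(url)
```

Passing a list of pairs to `urlencode` keeps the repeated `database` parameter in order and escapes any characters that are not URL-safe.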

14 Do you want to know more?
Wein et al., NAR, 2012

15 Questions?

