InterPro An Introduction


1 InterPro An Introduction
2nd IMPACT workshop 5-6 May, 2010 InterPro An Introduction European Bioinformatics Institute Wellcome Trust Genome Campus 2/24/2019

2 Overview
- What InterPro is
- Where it came from
- What the vision was
- Has it evolved in line with that vision?
- Is it still fit for purpose?

3 What is InterPro?
According to the User Manual: “InterPro is an integrated documentation resource for protein families, domains & sites. InterPro combines a number of databases that use different methodologies & a varying degree of biological information on well-characterised proteins to derive protein signatures. By uniting the member databases, InterPro capitalises on their individual strengths, producing a powerful integrated database & diagnostic tool.”

4 Where did it come from?
The concept of an integrated protein family database emerged almost 20 years ago!
- at the 1991 BCA spring meeting in Sheffield
- Amos Bairoch had a poster on PROSITE
- I had one on a ‘fingerprint’ database…
We recognised that our approaches were underpinned by similar philosophies
- to provide meaningful biological information
- to provide high quality manual annotation

5 Where did it come from?

6 Where did it come from?

7 Where did it come from?

8 Where did it come from?
PROSITE & PRINTS were different but somehow also the same…
- most importantly, they were complementary
In combination, we gain powerful structural & functional insights

9 Where did it come from?
So where next?
- we had created 30 family fingerprints
- PROSITE documented 375 families & functional sites
- PROSITE was way ahead! We were still on the starting blocks…
Nevertheless, we decided to apply for an EU grant to unite the databases
- …seemed like a good idea at the time!

10 What was the vision?
Naïvely, we wanted to make life easier! We aimed to
- simplify & rationalise protein family analysis, ensuring that entries & their linked signatures pointed to related information on the same biological object
- centralise & streamline the annotation process
- reduce manual annotation burdens
- facilitate automatic functional annotation of uncharacterised proteins

11 How has it evolved?
- The EU proposal was submitted in 1992 and was promptly declined!
- Later, in 1995, the EBI was established at Hinxton
- Visiting Fellowship in 1997, to help integrate my work more closely with that of the EBI
- Rolf, Amos & I decided to try again for an EU grant
- by then, Profiles, ProDom & Pfam had also been created, so it made sense to include them too
- With the bigger picture, the grant succeeded - InterPro was born!

12 How has it evolved?
Release 0.1 beta was made in October 1999
It contained 2,423 entries
- 1,370 PROSITE entries
- 1,465 Pfam entries
- 1,157 PRINTS entries
- 241 preliminary profiles
Based on Swiss-Prot 38 & TrEMBL 11
[Diagram labels: PROSITE, ProDom, PRINTS, Profiles, Pfam, InterPro]

13 How has it evolved?
“Various factors rendered a step-wise approach to the development of InterPro desirable. First, the scale of the task of amalgamating just the first 3 databases was immense. The rational merging of apparently equivalent database entries that in fact simultaneously define a specific family, domains within that family, or even repeats within those domains, presented an enormous challenge.”

14 How has it evolved?
[Figure labels: domain, family, super-family, families, sub-families]
Unravelling the biological relationships is vital!

15 How has it evolved?
Clearly, the task of integration was hard
- understanding the biological relationships being represented within member databases, let alone between them, was proving to be a significant challenge
Rather than making our lives easier, it was probably making them much harder!
- …& that was just with 3 databases!
Today, with 11 sources, life is harder still…

16 How has it evolved?
Release 26.0, March 2010
It contains 20,329 entries
- 1,023 Gene3D entries
- 620 HAMAP entries
- 2,234 Panther entries
- 2,744 PIRSF entries
- 1,975 PRINTS entries
- 1,291 PROSITE regexes
- 836 PROSITE profiles
- 11,056 Pfam entries
- 803 SMART entries
- 1,095 SUPERFAMILY entries
- 3,689 TIGRFAMs
Release 0.1 beta was made in October 1999
It contained 2,423 entries
- 1,370 PROSITE entries
- 1,465 Pfam entries
- 1,157 PRINTS entries
- 241 preliminary profiles
Based on Swiss-Prot 38 & TrEMBL 11
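The two release snapshots above can be compared with a little arithmetic. The sketch below is only illustrative: the counts are transcribed from the slide, while the dictionary layout, the function names, and the reading of each member-database line as a count of integrated signatures are assumptions made for this example, not anything InterPro itself provides.

```python
# Counts transcribed from the release slide. Treating each member-database
# line as "signatures integrated from that source" is an assumption.
RELEASE_0_1_BETA = {
    "entries": 2423,
    "signatures": {
        "PROSITE": 1370,
        "Pfam": 1465,
        "PRINTS": 1157,
        "preliminary profiles": 241,
    },
}

RELEASE_26_0 = {
    "entries": 20329,
    "signatures": {
        "Gene3D": 1023,
        "HAMAP": 620,
        "Panther": 2234,
        "PIRSF": 2744,
        "PRINTS": 1975,
        "PROSITE regexes": 1291,
        "PROSITE profiles": 836,
        "Pfam": 11056,
        "SMART": 803,
        "SUPERFAMILY": 1095,
        "TIGRFAMs": 3689,
    },
}


def total_signatures(release: dict) -> int:
    """Sum the per-source counts listed for one release."""
    return sum(release["signatures"].values())


def signatures_per_entry(release: dict) -> float:
    """Average number of listed signatures per InterPro entry."""
    return total_signatures(release) / release["entries"]


def fold_growth(old: dict, new: dict) -> float:
    """Ratio of entry counts between two releases."""
    return new["entries"] / old["entries"]


if __name__ == "__main__":
    for name, release in [("0.1 beta", RELEASE_0_1_BETA), ("26.0", RELEASE_26_0)]:
        print(
            f"Release {name}: {release['entries']} entries, "
            f"{total_signatures(release)} listed signatures, "
            f"{signatures_per_entry(release):.2f} per entry"
        )
    print(f"Entry growth: {fold_growth(RELEASE_0_1_BETA, RELEASE_26_0):.2f}-fold")
```

On these figures the entry count grows roughly 8.4-fold between the two releases. In both releases the listed per-source counts sum to more than the number of entries (4,233 against 2,423, and 27,366 against 20,329), which is what one would expect if several signatures are merged into a single entry.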

17 Is InterPro still fit for purpose?
The database has grown almost 10-fold in ~11 years
Why was it created in the first place?
- to simplify & rationalise protein family analysis, ensuring that entries & their linked signatures pointed to related information on the same biological object
- to centralise & streamline the annotation process & reduce manual annotation burdens
- to facilitate automatic functional annotation of uncharacterised proteins
- to make life easier!!

18 Is InterPro still fit for purpose?

19 Is InterPro still fit for purpose?

20 Is InterPro still fit for purpose?

21 Is InterPro still fit for purpose?

22 Is InterPro still fit for purpose?
Why separate out structurally & functionally relevant information?
Remember this?

23 What is InterPro?
A reminder: “InterPro is an integrated documentation resource for protein families, domains & sites. InterPro combines a number of databases that use different methodologies & a varying degree of biological information on well-characterised proteins to derive protein signatures. By uniting the member databases, InterPro capitalises on their individual strengths, producing a powerful integrated database & diagnostic tool.”

24 Is InterPro still fit for purpose?
Integration = greater than the sum of the parts - a perfect example…
This integrated view is incredibly powerful & informative!

25 Is InterPro still fit for purpose?

26 Is InterPro still fit for purpose?

27 Is InterPro still fit for purpose?
What does it mean?

28 Is InterPro still fit for purpose?
They’re not the same!
Let’s see what the alignments actually look like - consider just the first 3 TM domains…
They’re still not the same!

29 Is InterPro still fit for purpose?
In the process of growing bigger, InterPro has grown massively in complexity
Its internal convolutions now challenge us to ask, “What does it mean?”
- what does it all mean to end users?!
- & what does it all mean to computers?!

30 Has it evolved in line with its vision?
With IMPACT, yes, InterPro has an opportunity to realise its original vision
- it can rationalise protein family analysis
- it can help to streamline the annotation process
- it can facilitate functional annotation of proteins
- it can make life easier
but it can only do these things if we’re prepared to empathise, collectively, with its growing pains!
That’s why this workshop is important

31 Is InterPro still fit for purpose?
“There is a tremendous amount of information regarding evolutionary history and biochemical function implicit in each sequence and the number of known sequences is growing explosively. We feel it is important to collect this significant information, correlate it into a unified whole and interpret it.”
Margaret O. Dayhoff to C. Berkley, February 27th, 1967
That is still InterPro’s unique opportunity!
“To kill an error is as good a service as, and sometimes even better than, the establishing of a new truth or fact.”
Charles Darwin, 1879
This remains IMPACT’s imperative!

32 A workshop 5-6 May, 2010
Day 1
- Registration
- Domestic
- InterPro, an introduction (Terri)
- Single-motif signatures: pros, cons & added-value to InterPro (Nicolas)
- Multiple-motif signatures: pros, cons & added-value to InterPro (Alex)
- Coffee
- Domain-based signatures: pros, cons & added-value to InterPro (Rob)
- Structural annotation: pros, cons & added-value to InterPro (Corin)
- InterPro today [including GO mapping] (Sarah)
- Lunch
- How InterPro is used to add functional annotation to UniProt (Claire)
- Hands-on examples
- Coffee
- Open discussion/feedback
- Dinner

33 A workshop 5-6 May, 2010
Day 2
- Issues with integrating different signatures: domains
- Issues with integrating different signatures: families and subfamilies
- Meaningful terms to group signatures and name entries
- Coffee
- 11:30-12:00 Unexpected sequences in match lists & how to reconcile them
- Improving InterPro’s interface to better visualise, integrate & maintain data
- Open discussions
- Lunch
- ???
- Format/outline/organisation of November outreach event
- Future funding
- Reviewer feedback
- Review of EoY deliverables – status report & action plan
- AOB

