Corrections. N-linked glycosylation (GlcNAc): Look at the Swiss-Prot annotation (in a random ‘glycosylated’ entry)


1 Corrections

2

3 N-linked glycosylation (GlcNAc): Look at the Swiss-Prot annotation (in a random ‘glycosylated’ entry)

4 Query: annotation:(type:carbohyd "N-linked (GlcNAc...)" confidence:experimental) reviewed:yes

5 Taxonomic distribution

6 TPNLINDTME

7 Multiple alignment (ClustalW) -[LAPIQ]-N-[HAYRCS]-[ST]-[KLESGM]
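Slide 7 derives a pattern from the alignment by listing, for each column, the residues observed there. A minimal sketch of that step, assuming gap-free aligned windows of equal length; the three windows below are hypothetical and are not the sequences aligned in the slides.

```python
def pattern_from_alignment(aligned):
    """Build a PROSITE-style pattern from equal-length, gap-free aligned peptides."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    elements = []
    for column in zip(*aligned):
        residues = sorted(set(column))
        if len(residues) == 1:
            elements.append(residues[0])          # conserved position
        else:
            elements.append("[" + "".join(residues) + "]")  # alternatives
    return "-".join(elements)


# Hypothetical 5-residue windows centred on a glycosylated Asn (illustration only).
windows = ["LNHSK", "ANYTL", "PNASE"]
derived = pattern_from_alignment(windows)
print(derived)
```

A pattern built this way only accepts what was seen in the alignment, so it tends to be too strict when the alignment is small.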

8

9

10

11 N-glycosylation is rare in Bacteria (only a few species are known to have such a system): these hits are most likely false positives!

12 301 proteins (within the set of 1000 proteins) are N-glycosylated according to the UniProtKB annotation!

13

14 Scan PROSITE with the official pattern. The official pattern also matches bacterial sequences (false positives).
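The official PROSITE N-glycosylation pattern (PS00001) is N-{P}-[ST]-{P}. The sketch below scans a sequence for it with a regular expression; it is only an illustration of why such a short pattern matches almost anywhere, including in bacterial proteins, and it is not the ScanProsite implementation. The helper name `find_sequons` is made up for this example.

```python
import re

# PROSITE PS00001 (ASN_GLYCOSYLATION): N-{P}-[ST]-{P}
# The lookahead allows overlapping sequons to be reported.
SEQUON = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sequons(sequence):
    """Return (1-based position of the Asn, matched tetrapeptide) for each hit."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]


# The peptide shown on slide 6.
hits = find_sequons("TPNLINDTME")
print(hits)
```

On the slide 6 peptide, only the second Asn is followed by an [ST] in the right position, so a single hit is reported.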

15

16

17 PRATT pattern with 20 sequences: D-K-T-G-T-[IL]-T-x(3)-[ILMV]-x-[FILV]
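A PROSITE-style pattern such as the PRATT result can be translated into a regular expression for quick local tests. This is a minimal sketch that handles only plain residues, x, [..], {..} and repeat counts; anchors and other syntax are not covered. The test sequence is hypothetical.

```python
import re

# One pattern element: residue, x, [alternatives] or {exclusions}, plus optional (n) or (n,m).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression string."""
    parts = []
    for element in pattern.split("-"):
        m = ELEMENT.fullmatch(element.strip())
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = m.groups()
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"      # excluded residues
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


PRATT = "D-K-T-G-T-[IL]-T-x(3)-[ILMV]-x-[FILV]"
regex = prosite_to_regex(PRATT)
print(regex)

# Hypothetical sequence fragment, used only to show a match.
print(re.search(regex, "AADKTGTLTQNRMTFGG") is not None)
```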

18

19 AT31_HUMAN: SIMILARITY: Belongs to the cation transport ATPase (P-type) family. Type V subfamily. The pattern is a discriminator for the cation-transporting ATPase family.

20

21

22

23

24

25

26

27 C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H

28 Pattern scan

29

30

31 The pattern misses some zinc fingers within the same protein, e.g. Q24174. (Figure labels: Pattern; Profile; Not found with the pattern.)

32 The pattern C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H should also include: YRCVLCGTVAKSRNSLHSHMSrQHRGIST. Adding A to the fifth element gives: C-x(2,4)-C-x(3)-[LIVMFYWCA]-x(8)-H-x(3,5)-H
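The effect of adding A to the fifth element can be checked directly on the missed finger. In this sketch the two patterns are written by hand as regular expressions, and the lowercase r of the slide sequence is upper-cased before matching.

```python
import re

# C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H and the variant that also allows A.
ORIGINAL = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")
EXTENDED = re.compile(r"C.{2,4}C.{3}[LIVMFYWCA].{8}H.{3,5}H")

# The zinc finger from slide 32 that the original pattern misses.
FINGER = "YRCVLCGTVAKSRNSLHSHMSrQHRGIST".upper()

for name, regex in (("original", ORIGINAL), ("extended", EXTENDED)):
    m = regex.search(FINGER)
    print(name, m.group() if m else "no match")
```

The original pattern fails because the residue three positions after the second Cys is an Ala; the extended pattern accepts it.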

33

34 Yes! But the pattern becomes less restrictive: you get more sequences that should not be there. (As the results are limited to 1000, the number of hits is not the same…)
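The trade-off between a strict and a relaxed pattern can be put in numbers as precision and recall. A small sketch, assuming the true family members are known; all identifiers and sets below are hypothetical and do not come from the scan in the slides.

```python
def precision_recall(hits, true_members):
    """Precision = fraction of hits that are real; recall = fraction of real members found."""
    hits, true_members = set(hits), set(true_members)
    true_positives = len(hits & true_members)
    precision = true_positives / len(hits) if hits else 0.0
    recall = true_positives / len(true_members) if true_members else 0.0
    return precision, recall


# Hypothetical data: four real zinc fingers, two unrelated sequences.
true_members = {"ZF1", "ZF2", "ZF3", "ZF4"}
strict_hits = {"ZF1", "ZF2", "ZF3"}                                # misses ZF4
relaxed_hits = {"ZF1", "ZF2", "ZF3", "ZF4", "OTHER1", "OTHER2"}    # finds ZF4, adds noise

print("strict :", precision_recall(strict_hits, true_members))
print("relaxed:", precision_recall(relaxed_hits, true_members))
```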

35 Discriminators (signatures, descriptors) for the zinc finger C2H2-type domain can be found in PROSITE (pattern and profile) and Pfam (HMM).

36

37 Step 1: Scan UniProtKB/Swiss-Prot with the pattern. Use the ‘scanprosite’ tool at http://www.expasy.org/tools/scanprosite/

38

39 Step 2: Retrieve the matched human entries at UniProt (go to the end of the ScanProsite result page and click on ‘Matched UniProtKB entries’).

40 Step 3: Retrieve the sequences annotated as being ‘phosphorylated on a Thr’

41 -> 19 candidates to be manually checked…

42

43 InterPro scan results

44 InterPro: other schema (graphical view from UniProtKB)

45 InterPro schema: Pfam graphical view

46 PROSITE graphical view

47 BLAST at NCBI against Swiss-Prot. NCBI: color key for alignment scores

48 The Swiss-Prot set at NCBI does not contain the alternative (isoform) sequences (e.g. P28175-2). Note that NCBI appends the ‘version number’ of the Swiss-Prot sequence (e.g. Q8BU25.2).
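When comparing hit lists from NCBI and UniProt, it helps to split an identifier into its accession, isoform number and sequence version first. A minimal sketch, assuming identifiers of the form ACCESSION, ACCESSION-N or ACCESSION.V as in the two examples on the slide; the function and class names are made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class SequenceId(NamedTuple):
    accession: str
    isoform: Optional[int]   # e.g. 2 in P28175-2 (UniProt isoform)
    version: Optional[int]   # e.g. 2 in Q8BU25.2 (sequence version shown by NCBI)


ID_PATTERN = re.compile(r"([A-Z0-9]+)(?:-(\d+))?(?:\.(\d+))?")


def parse_sequence_id(text):
    """Split an identifier into accession, isoform number and sequence version."""
    m = ID_PATTERN.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"unrecognised identifier: {text!r}")
    accession, isoform, version = m.groups()
    return SequenceId(
        accession,
        int(isoform) if isoform else None,
        int(version) if version else None,
    )


print(parse_sequence_id("Q8BU25.2"))
print(parse_sequence_id("P28175-2"))
```

Comparing on the `accession` field alone avoids treating Q8BU25 and Q8BU25.2 as different proteins.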

49 UniProt: color code for identity scores (not alignment scores!)

50

51

52 ProDom database: list of proteins sharing at least one common domain…

53

54 1) BLAST at www.uniprot.org

55

56

57

58 2) PROSITE tools

59

60 You are lucky: it is rare for a domain not to be annotated in the different domain/family databases!

61 3) Construct a profile with MyHits at SIB. Use PSI-BLAST.

62 Do a PSI-BLAST against UniProtKB

63

64 Select the sequences with an E-value < 0.001 and run a second cycle
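In PSI-BLAST, the hits carried into the next iteration are those whose E-value is below the inclusion threshold, since a smaller E-value means a more significant match. A minimal sketch of that selection step; the hit identifiers and E-values are hypothetical, and 0.001 is used as the threshold as on the slide.

```python
def select_for_next_round(hits, threshold=1e-3):
    """Keep the hits whose E-value is below the inclusion threshold."""
    return [accession for accession, evalue in hits if evalue < threshold]


# Hypothetical (identifier, E-value) pairs from a first PSI-BLAST round.
first_round = [
    ("HYPO_1", 2e-30),
    ("HYPO_2", 4e-5),
    ("HYPO_3", 0.02),
    ("HYPO_4", 3.1),
]

selected = select_for_next_round(first_round)
print(selected)
```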

65 Look at the MSA

66

67 Construct a profile with the MSA
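To make the idea of a profile concrete: each alignment column is turned into a set of per-residue scores, so a candidate sequence can be scored position by position instead of being accepted or rejected outright. The sketch below is a simplified illustration, assuming a uniform background and a flat pseudocount; it is not the weighting scheme used by the MyHits or PROSITE tools, and the alignment is hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(msa, pseudocount=1.0):
    """Return one dict per column: residue -> log2-odds score vs. a uniform background."""
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for column in zip(*msa):
        counts = Counter(r for r in column if r in AMINO_ACIDS)  # gaps are ignored
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, peptide):
    """Sum the column scores of a peptide of the same length as the profile."""
    return sum(column[aa] for column, aa in zip(profile, peptide))


# Hypothetical gap-free alignment, used only for illustration.
msa = ["CKTC", "CRSC", "CKSC"]
profile = build_profile(msa)
print(round(score(profile, "CKSC"), 2), round(score(profile, "AAAA"), 2))
```

Unlike a pattern, the profile still gives a (lower) score to a sequence with an unseen residue at one position, which is why it can find fingers that the pattern misses.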

68

69

70

71 The profile

72 The profile hits

73 Construct an HMM with the MSA

74 The HMM

75 The HMM hits

76 - Look at the GoLoco data in InterPro. How many proteins (and/or hits) are found by the different methods?

77 http://www.ebi.ac.uk/interpro/

78 According to InterPro, the GoLoco domain is described by at least one of the different methods (Pfam, PROSITE, SMART). Pfam: 167 proteins; PROSITE: 192 proteins; SMART: 1 protein. These different numbers are a consequence of the interval between the releases of the different databases (including the sequence database, UniProtKB). They may also be due to the different methods used (HMM, profile…).

79 Look for the HMM for the GoLoco domain in Pfam

80

81 Download the HMM matrix

82 The HMM matrix

83

