Predicting immune response from peptide microarrays


1 Predicting immune response from peptide microarrays
Mitja Luštrek 1 (2), Peter Lorenz 2, Felix Steinbeck 2, Georg Füllen 2, Hans-Jürgen Thiesen 2
1 Department of Intelligent Systems, Jožef Stefan Institute
2 University of Rostock

2 Outline
1. Introduction
2. Immune response prediction
3. Interpretation

4 Peptide = part of protein = short sequence of amino acids Image taken from EMBL website

5 Peptide = part of protein = short sequence of amino acids SNDIVLT = string of letters from 20-letter alphabet (1 letter = 1 amino acid, 20 standard amino acids) Image taken from EMBL website

6 Epitope Antigen protein Antibody binding Antibody

7 Epitope Antibody binding Antibody Epitope Antigen protein

8 Epitope Peptide Antigen protein

9 Epitope Antigen protein

10 Epitope Antigen protein Antibody binding Antibody

13 Epitope Antigen protein

15 Peptide arrays Peptide array Peptides (15 amino acids) Glass slide

16 Peptide arrays Peptide array IVIg antibody mixture Peptides (15 amino acids) Glass slide

17 Peptide arrays Peptide array IVIg antibody mixture Red = epitopes (bind antibodies) Black = non-epitopes Peptides (15 amino acids) Glass slide

18 Peptide arrays Red = epitopes (bind antibodies) Black = non-epitopes Peptide Antibody Antibody against antibody + dye Glass slide

19 Peptide arrays
Red = epitopes (bind antibodies)
Black = non-epitopes
Peptide          Class
PGIGFPGPPGPKGDQ  non-ep.
PNMVFIGGINCANGK  non-ep.
DGIGGAMHKAMLMAQ  non-ep.
REDNLTLDISKLKEQ  non-ep.
TPLAGRGLAERASQQ  non-ep.
DQVHPVDPYDLPPAG  non-ep.
...
RRMISRMPIFYLMSG  epitope
LPPGFKRFTCLSIPR  epitope
EFSQMESYPEDYFPI  epitope
...

20 Outline
1. Introduction
2. Immune response prediction
3. Interpretation

21 Our task
Peptide
RRKGGLEEPQPPAEQ
SEDLENALKAVINDK
EDHVKLVNEVTEFAK
GEKIIQEFLSKVKQM
ILVSRSLKMRGQAFV
YTCQCRAGYQSTLTR
...

22 Our task
Input: the unlabeled peptides from the previous slide
Machine learning
Output:
Peptide          Class
RRKGGLEEPQPPAEQ  non-ep.
SEDLENALKAVINDK  non-ep.
EDHVKLVNEVTEFAK  non-ep.
GEKIIQEFLSKVKQM  non-ep.
ILVSRSLKMRGQAFV  epitope
YTCQCRAGYQSTLTR  epitope
...

23 Our task
Input: unlabeled peptides; output: the same peptides labeled non-ep. / epitope (via machine learning)
Training set: 13,638 peptides (3,420 epitopes)
Test set: 13,640 peptides (3,421 epitopes)
Balanced until the final testing
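
The slide says the data were balanced until the final testing but does not say how. As a rough illustration only, the sketch below assumes random undersampling of the larger class to a 1 : 1 ratio; the function name, the seed, and the `(peptide, label)` tuple layout are all hypothetical.

```python
import random

def balance_by_undersampling(examples, seed=0):
    """Return a 1:1 epitope / non-epitope sample (assumed strategy, not confirmed by the slides)."""
    rng = random.Random(seed)
    epitopes = [e for e in examples if e[1] == "epitope"]
    others = [e for e in examples if e[1] != "epitope"]
    n = min(len(epitopes), len(others))
    # Draw n examples from each class without replacement, then mix them.
    balanced = rng.sample(epitopes, n) + rng.sample(others, n)
    rng.shuffle(balanced)
    return balanced

toy = [("ILVSRSLKMRGQAFV", "epitope")] * 2 + [("RRKGGLEEPQPPAEQ", "non-ep.")] * 6
balanced = balance_by_undersampling(toy)
print(len(balanced))
```

Oversampling the epitope class would be an equally plausible reading of "balanced".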

24 Machine learning Peptide Class PGIGFPGPPGPKGDQ non-ep. / epitope

25 Machine learning
Peptide          Class
PGIGFPGPPGPKGDQ  non-ep. / epitope
Attribute representation:
Attribute 1  Attribute 2  ...  Class
value 1      value 2      ...  non-ep. / epitope

26 Machine learning
Peptide          Class
PGIGFPGPPGPKGDQ  non-ep. / epitope
Attribute representation:
Attribute 1  Attribute 2  ...  Class
value 1      value 2      ...  non-ep. / epitope
ML
Classifier
Probability for epitope p

28 Machine learning
Peptide          Class
PGIGFPGPPGPKGDQ  non-ep. / epitope
Attribute representation 1 ... Attribute representation 8
ML
Classifier 1 ... Classifier 8

29 Machine learning
Peptide          Class
PGIGFPGPPGPKGDQ  non-ep. / epitope
Attribute representation 1 ... Attribute representation 8
ML
Classifier 1 ... Classifier 8
Probabilities for epitope        Class
p1 p2 p3 p4 p5 p6 p7 p8          non-ep. / epitope
ML
Meta classifier
Final probability for epitope p

30 Machine learning
Same pipeline as on the previous slide: 8 attribute representations, 8 classifiers, probabilities p1 ... p8, meta classifier, final probability for epitope p
Algorithms: SVM (SMO), logistic regression, linear regression
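
The meta-level table on these slides (p1 ... p8 plus the class) can be sketched as follows. This is only an illustration of how such rows might be assembled: `meta_rows` and the two stand-in base classifiers are hypothetical, and the slides do not say which algorithm is used at which level or how the base probabilities were obtained.

```python
from typing import Callable, List, Sequence, Tuple

# A base classifier is modelled here as any function: peptide -> probability of "epitope".
BaseClassifier = Callable[[str], float]

def meta_rows(peptides: Sequence[str],
              labels: Sequence[str],
              base_classifiers: Sequence[BaseClassifier]) -> List[Tuple[List[float], str]]:
    """Build one meta-level row per peptide: ([p1, ..., pk], class)."""
    rows = []
    for peptide, label in zip(peptides, labels):
        probabilities = [classify(peptide) for classify in base_classifiers]
        rows.append((probabilities, label))
    return rows

# Two made-up stand-ins for trained base classifiers (the deck uses eight).
stand_ins = [
    lambda p: p.count("F") / len(p),  # hypothetical: fraction of phenylalanine
    lambda p: 0.5,                    # hypothetical: uninformative classifier
]
rows = meta_rows(["FY", "AA"], ["epitope", "non-ep."], stand_ins)
print(rows)
```

A meta classifier would then be trained on these rows in the same way a base classifier is trained on an attribute representation.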

31 Attribute representation 1
Amino-acid counts
RRMISRMPIFYLMSG
Count of  A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y
                      1  1     2     1  3     1     3  2           1
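
A minimal sketch of this representation, assuming the attributes are simply the 20 counts in alphabetical order of the one-letter codes; the function name is illustrative.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one letter each

def amino_acid_counts(peptide: str) -> list:
    """Return one count per standard amino acid, in alphabetical order."""
    counts = Counter(peptide)
    return [counts[aa] for aa in AMINO_ACIDS]

vector = amino_acid_counts("RRMISRMPIFYLMSG")
nonzero = {aa: n for aa, n in zip(AMINO_ACIDS, vector) if n}
print(nonzero)
```

The non-zero entries for the example peptide are F, G, I, L, M, P, R, S and Y, matching the counts shown on the slide.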

32 Attribute representation 2
Amino-acid count differences
RRMISRMPIFYLMSG
Difference in counts of  F–G  F–I  F–L  F–M  F–P  F–R  F–S  F–Y  G–F  G–I  ...
                         0    –1   0    –2   0    –2   –1   0    0    –1
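
The differences can be sketched as below. The slide lists both F–G and G–F, so ordered pairs are used; whether the full representation covers all 380 ordered pairs or only some of them is an assumption here.

```python
from collections import Counter
from itertools import permutations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def count_differences(peptide: str) -> dict:
    """Map each ordered pair (a, b) of distinct amino acids to count(a) - count(b)."""
    counts = Counter(peptide)
    return {(a, b): counts[a] - counts[b] for a, b in permutations(AMINO_ACIDS, 2)}

diffs = count_differences("RRMISRMPIFYLMSG")
print(diffs[("F", "I")], diffs[("F", "M")], diffs[("G", "F")])
```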

33 Attribute representation 3
Subsequence counts
RRMISRMPIFYLMSG
Count of  RR  RM  MI  ...  RRM  RMI  MIS  ...  ACDE  ...  ACDEF  ...
          1   2   1        1    1    1         0          0
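
A short sketch of counting contiguous subsequences. The slide shows examples of length 2 to 5, so those bounds are used as defaults; the actual range used in the work is not stated, and the function name is illustrative.

```python
from collections import Counter

def subsequence_counts(sequence: str, min_len: int = 2, max_len: int = 5) -> Counter:
    """Count every contiguous subsequence with length between min_len and max_len."""
    counts = Counter()
    for k in range(min_len, max_len + 1):
        for start in range(len(sequence) - k + 1):
            counts[sequence[start:start + k]] += 1
    return counts

counts = subsequence_counts("RRMISRMPIFYLMSG")
print(counts["RR"], counts["RM"], counts["RRM"], counts["ACDE"])
```

Subsequences that never occur, such as ACDE, read as 0 because `Counter` returns 0 for missing keys.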

34 Attribute representation 4
Amino-acid class counts
Size classes:    l l l l t l l s l l l l l t t
Peptide:         R R M I S R M P I F Y L M S G
Charge classes:  b b n n n b n n n n n n n n n
Count of  tiny  small  large  basic  acidic  neutral  ...
          3     1      11     3      0       12
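
A sketch of the class mapping and counts. The two dictionaries contain only the amino acids that appear in the example peptide, with the classes read off the slide (t = tiny, s = small, l = large, b = basic, n = neutral); classes for the remaining amino acids are not given in the transcript and are deliberately left out.

```python
from collections import Counter

# Partial mappings, restricted to the residues of the example peptide.
SIZE_CLASS = {"R": "l", "M": "l", "I": "l", "F": "l", "Y": "l", "L": "l",
              "P": "s", "S": "t", "G": "t"}
CHARGE_CLASS = {"R": "b", "M": "n", "I": "n", "F": "n", "Y": "n", "L": "n",
                "P": "n", "S": "n", "G": "n"}

def to_classes(peptide: str, class_of: dict) -> str:
    """Rewrite a peptide as a string of class letters."""
    return "".join(class_of[aa] for aa in peptide)

def class_counts(peptide: str, class_of: dict) -> Counter:
    """Count how many residues fall into each class."""
    return Counter(to_classes(peptide, class_of))

peptide = "RRMISRMPIFYLMSG"
print(to_classes(peptide, SIZE_CLASS), dict(class_counts(peptide, SIZE_CLASS)))
print(to_classes(peptide, CHARGE_CLASS), dict(class_counts(peptide, CHARGE_CLASS)))
```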

35 Attribute representation 5
Amino-acid class subsequence counts
Size classes:    l l l l t l l s l l l l l t t
Peptide:         R R M I S R M P I F Y L M S G
Charge classes:  b b n n n b n n n n n n n n n
Count of  ll  lt  tl  ls  sl  tt  ...  bb  bn  nb  nn  ...
          8   2   1   1   1   1        1   2   1   10
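
The same subsequence counting can be applied to the class strings. The sketch below takes the class strings from the slide as input and counts subsequences of length 2, which is the only length the slide shows; longer lengths may also have been used.

```python
from collections import Counter

def class_subsequence_counts(class_string: str, k: int = 2) -> Counter:
    """Count contiguous length-k subsequences in a string of class letters."""
    return Counter(class_string[i:i + k] for i in range(len(class_string) - k + 1))

size_counts = class_subsequence_counts("lllltllsllllltt")
charge_counts = class_subsequence_counts("bbnnnbnnnnnnnnn")
print(dict(size_counts))
print(dict(charge_counts))
```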

36 Attribute representation 6
Amino-acid pair counts
Rationale: antibodies may bind in two places due to their two-chain structure.
(Figure: antibody, peptide)

37 Attribute representation 6
Amino-acid pair counts
Rationale: antibodies may bind in two places due to their two-chain structure.
RRMISRMPIFYLMSG
Count of pairs at distance  (R,R) at 1  (R,M) at 2  (R,I) at 3  ...  (A,C) at 1  (A,C) at 2  ...
                            1           1           2                0           0
(Figure: antibody, peptide)
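
A possible way to compute these attributes is sketched below. A pair (a, b) at distance d is counted whenever b occurs exactly d positions after a; the maximum distance is a hypothetical parameter, since the slide only shows distances up to 3.

```python
from collections import Counter

def pair_counts_at_distance(peptide: str, max_distance: int = 3) -> Counter:
    """Count ordered pairs (first, second, distance) of residues `distance` positions apart."""
    counts = Counter()
    for distance in range(1, max_distance + 1):
        for i in range(len(peptide) - distance):
            counts[(peptide[i], peptide[i + distance], distance)] += 1
    return counts

pairs = pair_counts_at_distance("RRMISRMPIFYLMSG")
print(pairs[("R", "R", 1)], pairs[("R", "M", 2)], pairs[("R", "I", 3)])
```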

38 Attribute representation 7
Amino acids at distances from first + first amino acid
Rationale: antibodies may bind in two places; the first amino acid is the most accessible on the peptide array.
(Figure: antibody, peptide)

39 Attribute representation 7
Amino acids at distances from first + first amino acid
Rationale: antibodies may bind in two places; the first amino acid is the most accessible on the peptide array.
R RMISRMPIFYLMSG
Count of at distance  ...  R at 1  ...  M at 2  ...  A at 3  C at 3  ...  First
                           1            1            0       0            R
(Figure: antibody, peptide)
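
One reading of this representation is sketched below: an indicator for each (amino acid, distance) combination measured from the first residue, plus the first residue itself. The function name and the dictionary layout are illustrative choices.

```python
def residues_at_distance_from_first(peptide: str) -> dict:
    """Map (amino acid, distance from the first residue) -> 1 for every later residue."""
    return {(aa, distance): 1 for distance, aa in enumerate(peptide[1:], start=1)}

peptide = "RRMISRMPIFYLMSG"
first = peptide[0]
features = residues_at_distance_from_first(peptide)
# Combinations that do not occur are treated as 0.
print(first, features.get(("R", 1), 0), features.get(("M", 2), 0), features.get(("A", 3), 0))
```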

40 Attribute representation 8
Average amino-acid properties
RRMISRMPIFYLMSG
Hydrophobicity  Size   Polarity  Flexibility  Accessibility  ...
0.448           0.596  0.306     0.231        0.376

41 Attribute representation 9 (not used)
Amino-acid counts with a difference
RRMISRMPIFYLMSG
RRMISRMPIWYLMSG
Equivalent for epitope prediction?

42 Attribute representation 9 (not used)
Amino-acid counts with a difference
RRMISRMPIFYLMSG
RRMISRMPIWYLMSG
Equivalent for epitope prediction?
Count F as: 1 F, 0.8 W, 0.4 Y, ...
Count W as: 1 W, 0.7 F, 0.3 Y, ...

43 Attribute representation 9 (not used)
Amino-acid substitution matrix
   A  C  D  ...  F    W    Y
A  1
C     1
D        1
F                1    0.8  0.4
W                0.7  1    0.3
Y                          1

44 Attribute representation 9 (not used)
Amino-acid substitution matrix (as on the previous slide)
Optimize with a genetic algorithm to maximize classification accuracy
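
The weighted counting from slide 42 can be sketched as follows, using only the weights that appear on the slides (the rows for F and W); every other amino acid is assumed to count only as itself, as in the identity rows of the matrix. The elided weights ("...") are not filled in, and the genetic-algorithm optimization is not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Rows of the substitution matrix that are visible on the slides.
SUBSTITUTION = {
    "F": {"F": 1.0, "W": 0.8, "Y": 0.4},
    "W": {"W": 1.0, "F": 0.7, "Y": 0.3},
}

def weighted_counts(peptide: str, substitution: dict) -> dict:
    """Each residue adds its substitution weights to the counts of the listed amino acids."""
    counts = dict.fromkeys(AMINO_ACIDS, 0.0)
    for aa in peptide:
        # Residues without a row in the matrix count fully as themselves.
        for target, weight in substitution.get(aa, {aa: 1.0}).items():
            counts[target] += weight
    return counts

with_f = weighted_counts("RRMISRMPIFYLMSG", SUBSTITUTION)
with_w = weighted_counts("RRMISRMPIWYLMSG", SUBSTITUTION)
print(with_f["F"], with_f["W"], with_f["Y"])
print(with_w["F"], with_w["W"], with_w["Y"])
```

With these weights the two example peptides get similar, though not identical, F, W and Y counts, which is the point of the "equivalent for epitope prediction?" question.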

45 Results – training set
Attribute representation                   AUC    Accuracy
Amino-acid counts                          0.870  80.7 %
Amino-acid count differences               0.868  80.3 %
Subsequence counts                         0.867  80.5 %
Amino-acid class counts                    0.873  81.2 %
Amino-acid class subsequence counts        0.866  80.5 %
Amino-acid pair counts                     0.865  80.6 %
Amino acids at distances from the first    0.873  81.2 %
Average amino-acid properties              0.863  80.3 %

46 Results – training set
Attribute representation                   AUC    Accuracy
Amino-acid counts                          0.870  80.7 %
Amino-acid count differences               0.868  80.3 %
Subsequence counts                         0.867  80.5 %
Amino-acid class counts                    0.873  81.2 %
Amino-acid class subsequence counts        0.866  80.5 %
Amino-acid pair counts                     0.865  80.6 %
Amino acids at distances from the first    0.873  81.2 %
Average amino-acid properties              0.863  80.3 %
Combined                                   0.881  83.3 %

47 Results – test set
Attribute representation / dataset         AUC    Accuracy
Best single / training set                 0.873  81.2 %
Combined / training set                    0.881  83.3 %
Combined / test set                        0.883  83.7 %

48 Results – test set
Attribute representation / dataset         AUC    Accuracy
Best single / training set (balanced)      0.873  81.2 %
Combined / training set (balanced)         0.881  83.3 %
Combined / test set (balanced)             0.883  83.7 %
Combined / test set (original)             0.884  85.9 %
Balanced: epitope : non-epitope = 1 : 1
Original: epitope : non-epitope = 1 : 3

49 Results – test set
Attribute representation / dataset         AUC    Accuracy
Best single / training set (balanced)      0.873  81.2 %
Combined / training set (balanced)         0.881  83.3 %
Combined / test set (balanced)             0.883  83.7 %
Combined / test set (original)             0.884  85.9 %
EL-Manzalawy / test set (balanced)         0.868  82.0 %
EL-Manzalawy / test set (original)         0.874  83.9 %
State of the art: SVM + string kernel (EL-Manzalawy et al., 2008), trained and tested on our data.

50 Results – test set (AUC / accuracy)
Our results
Balanced: 0.883 / 83.7 %
Original: 0.884 / 85.9 %
EL-Manzalawy
Balanced: 0.868 / 82.0 %
Original: 0.874 / 83.9 %

51 Outline
1. Introduction
2. Immune response prediction
3. Interpretation

52 Rules
Interpretable classifier:
Interpretable attributes (frequencies, properties of amino acids)
RIPPER (JRip) to induce rules

53 Rules
Interpretable classifier:
Interpretable attributes (frequencies, properties of amino acids)
RIPPER (JRip) to induce rules
Property     Low/high  Applies to peptides
Aromaticity  High      53.8 %
If a peptide has a high aromaticity, it binds antibodies. This applies to 53.8 % of peptides that bind antibodies.
(Aromaticity is the percentage of aromatic amino acids in the peptide.)
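
Aromaticity as defined on this slide can be computed as in the sketch below. The set of aromatic amino acids is an assumption (F, W and Y; histidine is sometimes included as well), and the threshold that makes aromaticity "high" in the induced rule is not given in the slides, so none is applied.

```python
AROMATIC = set("FWY")  # assumed set; some definitions also include H

def aromaticity(peptide: str) -> float:
    """Percentage of aromatic amino acids in the peptide."""
    aromatic_residues = sum(1 for aa in peptide if aa in AROMATIC)
    return 100.0 * aromatic_residues / len(peptide)

value = aromaticity("RRMISRMPIFYLMSG")
print(round(value, 1))
```

For the example epitope, two of the 15 residues (F and Y) are aromatic.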

54 Rules
Property                 Low/high  Applies to peptides
Aromaticity              High      53.8 %
Polarity                 Low       27.7 %
Frequency of tyrosine    High      26.2 %
Hydrophobicity           Low       22.5 %
Frequency of arginine    High      19.7 %
Summary factor 2         High      16.7 %
Acidity                  Low       11.4 %
Preference for β-sheets  Low       4.3 %
Summary factor 5         High      3.0 %

55 Epitope propensity Frequency in peptides with epitopes, divided by frequency in peptides without epitopes
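
The propensity defined on this slide can be sketched per amino acid as below. "Frequency" is interpreted here as the relative frequency of a residue among all residues of a group of peptides, which is an assumption; amino acids absent from the non-epitope group get `None` instead of a division by zero.

```python
from collections import Counter
from typing import Dict, Optional, Sequence

def residue_frequencies(peptides: Sequence[str]) -> Dict[str, float]:
    """Relative frequency of each amino acid over all residues in the given peptides."""
    counts = Counter("".join(peptides))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def epitope_propensity(epitopes: Sequence[str],
                       non_epitopes: Sequence[str]) -> Dict[str, Optional[float]]:
    """Frequency in epitope peptides divided by frequency in non-epitope peptides."""
    in_epitopes = residue_frequencies(epitopes)
    in_others = residue_frequencies(non_epitopes)
    propensity = {}
    for aa in sorted(set(in_epitopes) | set(in_others)):
        if in_others.get(aa, 0.0) == 0.0:
            propensity[aa] = None  # undefined: never seen in non-epitopes
        else:
            propensity[aa] = in_epitopes.get(aa, 0.0) / in_others[aa]
    return propensity

# Tiny made-up peptides, only to show the arithmetic.
toy = epitope_propensity(["FY", "FA"], ["AA", "FA"])
print(toy)
```

A propensity above 1 means the amino acid is over-represented in epitopes, below 1 under-represented.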

56 Epitope propensity Aromatic

57 Epitope propensity Non-polar

58 Epitope propensity Tyrosine

59 (Un)classifiable peptides
Simplified classifier:
Interpretable attributes (frequencies, properties of amino acids)
Logistic regression to train the classifier
Peptides  AUC    Accuracy
All       0.860  83.0 %

60 (Un)classifiable peptides
Simplified classifier:
Interpretable attributes (frequencies, properties of amino acids)
Logistic regression to train the classifier
Peptides  AUC    Accuracy
All       0.860  83.0 %
Classifiable = classified correctly
Unclassifiable = classified incorrectly
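
The split into classifiable and unclassifiable peptides amounts to comparing each prediction with the true class. A minimal sketch, with hypothetical names and made-up predictions:

```python
def split_by_correctness(peptides, labels, predictions):
    """Separate peptides the classifier got right (classifiable) from those it got wrong."""
    classifiable, unclassifiable = [], []
    for peptide, label, predicted in zip(peptides, labels, predictions):
        target = classifiable if predicted == label else unclassifiable
        target.append((peptide, label))
    return classifiable, unclassifiable

peptides = ["RRMISRMPIFYLMSG", "PGIGFPGPPGPKGDQ", "LPPGFKRFTCLSIPR"]
labels = ["epitope", "non-ep.", "epitope"]
predictions = ["epitope", "non-ep.", "non-ep."]  # made-up classifier output
classifiable, unclassifiable = split_by_correctness(peptides, labels, predictions)
print(len(classifiable), len(unclassifiable))
```

Each group keeps its original labels, so a new classifier can be trained and evaluated within a group, as the following slides do.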

61 (Un)classifiable peptides
Simplified classifier:
Interpretable attributes (frequencies, properties of amino acids)
Logistic regression to train the classifier
Peptides        AUC    Accuracy
All             0.860  83.0 %
Classifiable    0.999  98.8 %   (expected)
Unclassifiable  0.956  91.5 %   (strange?)

62 (Un)classifiable – rules
                        Classifiable     Unclassifiable
Attribute               L/h   Applies    L/h   Applies
Aromaticity             High  74.3 %     Low   53.3 %
Polarity                Low   58.7 %     High  27.5 %
Frequency of arginine   High  31.5 %     Low   34.0 %
Frequency of tyrosine   High  20.7 %     Low   16.9 %
Summary factor 5        High  15.1 %     Low   15.2 %
Antigenicity            High  7.3 %      Low   8.7 %
Hydrophobicity          Low   4.7 %      High  6.5 %
Rows with a single entry (column not preserved in the transcript):
Frequency of histidine        Low   3.9 %
Frequency of cysteine         Low   10.4 %
Preference for reverse turns  High  10.4 %
Occurrence in turns           Low   10.4 %
Frequency of alanine          High  8.7 %

63 (Un)classifiable – rules
Same table as on the previous slide, annotated with the values for all peptides:
Aromaticity, all: 53.8 %
Polarity, all: 27.7 %

64 (Un)classifiable – epitope propensity

65 (Un)classifiable peptides
Simplified classifier:
Interpretable attributes (frequencies, properties of amino acids)
Logistic regression to train the classifier
Peptides        AUC    Accuracy
All             0.860  83.0 %
Classifiable    0.999  98.8 %
Unclassifiable  0.956  91.5 %
Strange? Not really!
Inevitable or does it mean something?

66 2nd degree (un)classifiable peptides
Unclassifiable peptides only
Simplified classifier
Peptides            AUC    Accuracy
All unclassifiable  0.956  91.5 %

67 2nd degree (un)classifiable peptides
Unclassifiable peptides only
Simplified classifier
Peptides            AUC    Accuracy
All unclassifiable  0.956  91.5 %
Classifiable unclassifiable = classified correctly
Unclassifiable unclassifiable = classified incorrectly

68 2nd degree (un)classifiable peptides
Unclassifiable peptides only
Simplified classifier
Peptides                       AUC    Accuracy
All unclassifiable             0.956  91.5 %
Classifiable unclassifiable    0.992  97.8 %
Unclassifiable unclassifiable  0.683  65.0 %

69 2nd degree (un)classifiable peptides
Peptides                       AUC    Accuracy
All unclassifiable             0.956  91.5 %
Classifiable unclassifiable    0.992  97.8 %
Unclassifiable unclassifiable  0.683  65.0 %
(Un)classifiable peptides
Peptides                       AUC    Accuracy
All                            0.860  83.0 %
Classifiable                   0.999  98.8 %
Unclassifiable                 0.956  91.5 %
Inevitable or does it mean something? Not inevitable!

70 2 nd degree (un)cl. – epitope propensity

71 Conclusions Epitopes have common characteristics

72 Conclusions
Epitopes have common characteristics
– Epitopes are parts of antigens that bind antibodies
Our peptides mostly did not come from known antigens
Probably partly general and partly antibody-specific binding

73 Conclusions
Epitopes have common characteristics
– Epitopes are parts of antigens that bind antibodies
Epitope characteristics are not unexpected
Our peptides mostly did not come from known antigens
Probably partly general and partly antibody-specific binding

74 Conclusions
Epitopes have common characteristics
– Epitopes are parts of antigens that bind antibodies
Epitope characteristics are not unexpected
Two groups of epitopes:
– around 80 % “typical” (classifiable)
– around 20 % “atypical” (unclassifiable)
Our peptides mostly did not come from known antigens
Probably partly general and partly antibody-specific binding

75 Conclusions
Epitopes have common characteristics
– Epitopes are parts of antigens that bind antibodies
Epitope characteristics are not unexpected
Two groups of epitopes:
– around 80 % “typical” (classifiable)
– around 20 % “atypical” (unclassifiable)
Our peptides mostly did not come from known antigens
Probably partly general and partly antibody-specific binding
Mostly general-purpose antibodies?
Mostly antigen-specific antibodies?

