Immunological bioinformatics Ole Lund, Center for Biological Sequence Analysis (CBS) Denmark.


1 Immunological bioinformatics Ole Lund, Center for Biological Sequence Analysis (CBS) Denmark.

2 World-wide Spread of SARS Status as of July 11, 2003: 8437 Infected, 813 Dead

3 SARS First severe infectious disease to emerge in the post-genomic era. Modern societies are vulnerable to epidemics. Classical containment strategies have been successful in controlling the epidemic, but – SARS may resurface (e.g. be seasonal) – The suggested existence of an animal reservoir could compromise the containment strategy. Need to develop a vaccine strategy. Biotechnology has provided new tools to analyze genome/proteome information and guide vaccine development. The causative virus, the SARS coronavirus (SARS CoV), has been isolated and its full-length genome sequenced.

4 Main scientific achievements Discovery of causative agent Genome(s) 3D Structure of main proteinase Seromics – Fragments->Antibodies T cell epitopes VSV-SARS(spike) Co-transfection – Pseudovirus-light when entry

5 Main scientific achievements Discovery of causative agent Genome(s) 3D Structure of main proteinase Origin – Similar virus found in Himalayan palm civets and other animals, including a raccoon-dog, and in humans working at an animal market in Guangdong, China (Guan et al., Sep 4, 2003). Himalayan (Masked) palm civet Ferret-Badger Raccoon-dog http://biobase.dk/~david-c/uk-dk-mammmal-list.htm

6 Discovery of causative agent Random Arbitrarily Primed (RAP) PCR method. Koch's postulates (modified by Rivers for viruses, 1937) fulfilled: 1. Isolation of virus from diseased hosts 2. Cultivation in host cells 3. Proof of filterability 4. Infection causes a similar disease in the original or a related host species (macaques) 5. Re-isolation of virus 6. Detection of specific immune response to the virus. Clinical samples – SARS CoV found in 75% of SARS patients – Human metapneumovirus found in 12% of patients – Other agents only sporadically found. Source: Albert Osterhaus, Beijing June, 2003; Nature May 2003; Lancet July, 2003

7 Transmission – No symptoms: - – Early period: + – Very ill: +++ – 10 days post fever: - 41 flights -> 25 transmitted cases Prevention – Early tracking of contacts Origin – Seroconversion (Guan, 2003): Animal traders 40%, Vegetable traders 5% Source: Claus Stohr, Beijing June, 2003

8 New corona viruses 1978: Porcine Epidemic Diarrhea Virus (PEDV), probably from humans 1984: Porcine Respiratory Coronavirus 1987: Porcine Reproductive and Respiratory Syndrome (PRRS) 1993: Bovine corona virus 2003: SARS Source: Michael Buchmeier, Beijing June, 2003

9 Will it be back? When? – Every year? Like the flu. – Every few years? Like measles used to. – Sporadic? Like Ebola. – Never? Lab safety: The patient, a 27-year-old virologist, worked on the West Nile virus in a biosafety level 3 lab at the Environmental Health Institute, where the SARS coronavirus was also studied (Enserink, 2003)

10 How does the immune system “see” a virus?

11 The immune system The innate immune system – Found in animals and plants – Fast response – Complement, Toll-like receptors The adaptive immune system – Found in vertebrates – Stronger response 2nd time – B lymphocytes: produce antibodies (Abs) that recognize 3D shapes; neutralize virus/bacteria outside cells – T lymphocytes: Cytotoxic T lymphocytes (CTLs) - MHC class I – Recognize foreign protein sequences in infected cells – Kill infected cells; Helper T lymphocytes (HTLs) - MHC class II – Recognize foreign protein sequences presented by immune cells – Activate cells

12 SARS, a corona virus

13

14 Vaccine concerns Enhancement – Inactivated RSV and measles vaccines can lead to a more severe disease – Infection with one (of four, 40-65% identity) serotype of Dengue virus leads to more severe disease if later infected with another serotype Test – Erasmus University monkey model – "Mobile" vaccine efficacy clinical test protocols that can move to the site of an outbreak

15 The SARS Genome 29,736 nt Single-stranded, positive-sense (+) RNA genome

16 Weight matrices (Hidden Markov models) YMNGTMSQV GILGFVFTL ALWGFFPVV ILKEPVHGV ILGFVFTLT LLFGYPVYV GLSPTVWLS WLSLLVPFV FLPSDFFPS CVGGLLTMV FIAGNSAYE A2 Logo

17 Protein sequence information content Entropy – Average uncertainty in the random variable – $H = -\sum_i p_i \log_2 p_i$, range: 0 to $\log_2(20) \approx 4.3$ – Logo height $I = \log_2(20) - H$ Relative entropy (Kullback-Leibler distance) – $D = \sum_i p_i \log_2(p_i/q_i)$, range: 0 to infinity Mutual information – Reduction in uncertainty due to knowledge of another random variable (corresponds to correlation) – $M = \sum_{i,j} p_{ij} \log_2\left(p_{ij}/(p_i p_j)\right)$
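The entropy and logo-height definitions on this slide can be illustrated with a short sketch. It assumes the eleven 9-mer peptides listed on the weight-matrix slide as input and uses raw column frequencies, with no pseudocounts or sequence weighting. The function names are illustrative and are not taken from the presentation.

```python
import math
from collections import Counter

# Peptides copied from the "Weight matrices" slide (all 9-mers).
PEPTIDES = [
    "YMNGTMSQV", "GILGFVFTL", "ALWGFFPVV", "ILKEPVHGV", "ILGFVFTLT",
    "LLFGYPVYV", "GLSPTVWLS", "WLSLLVPFV", "FLPSDFFPS", "CVGGLLTMV",
    "FIAGNSAYE",
]

def column_entropy(column):
    """Shannon entropy H = -sum(p_i * log2(p_i)) of one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def logo_heights(peptides):
    """Information content I = log2(20) - H for every position."""
    max_entropy = math.log2(20)
    return [max_entropy - column_entropy(col) for col in zip(*peptides)]

if __name__ == "__main__":
    for position, height in enumerate(logo_heights(PEPTIDES), start=1):
        print(f"position {position}: {height:.2f} bits")
```

With so few sequences the raw frequencies overestimate the information content, which is one reason real weight-matrix methods add pseudocounts.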

18 Prediction of MHC binding specificity Simple Motifs – Allowed (non allowed) amino acids Extended motifs – Amino acid preferences Structural models – Limitations: precision of force field, and speed of calculations Neural networks – Can take correlations into account

19 Log odds ratios Used for scoring alignments (BLAST), HMMs, matrix methods Odds ratio of observing given amino acids – Relative probability of observing amino acid i in motif position j – $O_j = p(\mathrm{aa}_i \text{ at pos } j)/p(\mathrm{aa}_i)$ Assumption of independence => – Odds for observing sequence = $O_1 O_2 \cdots O_n$ Log odds ratio – $LO = \log(O_1 O_2 \cdots O_n) = \log(O_1) + \log(O_2) + \dots + \log(O_n)$ – LO in half bits = $2\,LO/\log(2)$
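A minimal sketch of log-odds scoring with a weight matrix follows. It assumes a uniform background of 1/20 per amino acid and a pseudocount of 1 per residue; both are illustrative choices, not values stated in the presentation, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_log_odds_matrix(peptides, pseudocount=1.0):
    """Per-position log-odds scores in half bits: 2 * log(O) / log(2)."""
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    denominator = len(peptides) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for column in zip(*peptides):
        counts = Counter(column)
        row = {}
        for aa in AMINO_ACIDS:
            p = (counts[aa] + pseudocount) / denominator
            row[aa] = 2.0 * math.log(p / background) / math.log(2)
        matrix.append(row)
    return matrix

def score_peptide(matrix, peptide):
    """Independence assumption: the log-odds of a sequence is a sum over positions."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match the matrix length")
    return sum(row[aa] for row, aa in zip(matrix, peptide))

if __name__ == "__main__":
    training = ["YMNGTMSQV", "GILGFVFTL", "ALWGFFPVV", "ILKEPVHGV",
                "ILGFVFTLT", "LLFGYPVYV", "GLSPTVWLS", "WLSLLVPFV",
                "FLPSDFFPS", "CVGGLLTMV", "FIAGNSAYE"]
    matrix = build_log_odds_matrix(training)
    print(round(score_peptide(matrix, "LLFGYPVYV"), 2))
```

Because the positions are treated as independent, the product of odds becomes a sum of logs, which is what makes matrix scoring fast enough for proteome-wide scans.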

20 A F C G

21 Evaluation of prediction accuracy Coverage = TP/actual_positive Reliability = TP/predicted_positive
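The two measures on this slide can be computed directly from paired predictions and measurements. The sketch below is an illustration only; the function name and the toy labels are invented for the example.

```python
def coverage_and_reliability(predicted, actual):
    """Return (coverage, reliability) for paired boolean predictions and labels.

    coverage    = TP / actual_positive     (also called sensitivity or recall)
    reliability = TP / predicted_positive  (also called precision)
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    true_positives = sum(1 for p, a in zip(predicted, actual) if p and a)
    actual_positive = sum(1 for a in actual if a)
    predicted_positive = sum(1 for p in predicted if p)
    coverage = true_positives / actual_positive if actual_positive else float("nan")
    reliability = (true_positives / predicted_positive
                   if predicted_positive else float("nan"))
    return coverage, reliability

if __name__ == "__main__":
    # Toy data, not taken from the presentation.
    predicted = [True, True, False, False, True]
    actual = [True, False, True, False, True]
    print(coverage_and_reliability(predicted, actual))
```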

22 A*1101 performance 154 peptides, 9 Binders

23 The MHC gene region. From Bill Paul, "Fundamental Immunology", 4th Ed

24 Human Leukocyte Antigen (HLA = MHC in humans) polymorphism - alleles A total of 229 HLA-A, 464 HLA-B and 111 HLA-C class I alleles have been named. A total of 2 HLA-DRA, 364 HLA-DRB, 22 HLA-DQA1, 48 HLA-DQB1, 20 HLA-DPA1 and 96 HLA-DPB1 class II sequences have also been assigned. As of October 2001 (http://www.anthonynolan.com/HIG/index.html)

25 HLA polymorphism - supertypes Each HLA molecule within a supertype essentially binds the same peptides. Nine major HLA class I supertypes have been defined: HLA-A1, A2, A3, A24, B7, B27, B44, B58, B62. Sette et al, Immunogenetics (1999) 50:201-212

26 HLA polymorphism - frequencies. Phenotype frequencies (Sette et al, Immunogenetics (1999) 50:201-212)
Supertypes | Caucasian | Black | Japanese | Chinese | Hispanic | Average
A2, A3, B27 | 83 % | 86 % | 88 % | 88 % | 86 % | 86 %
+A1, A24, B44 | 100 % | 98 % | 100 % | 100 % | 99 % | 99 %
+B7, B58, B62 | 100 % | 100 % | 100 % | 100 % | 100 % | 100 %

27

28

29

30

31

32 Conclusions We suggest – splitting some of the alleles in the A1 supertype into a new A26 supertype – splitting some of the alleles in the B27 supertype into a new B39 supertype – that the B8 alleles may define their own supertype – that the specificities of the class II molecules can be clustered into nine classes, which only partly correspond to the serological classification. Lund O, Nielsen M, Kesmir C, Petersen AG, Lundegaard C, Worning P, Sylvester-Hvid C, Lamberth K, Roder G, Justesen S, Buus S, Brunak S. Definition of supertypes for HLA molecules using clustering of specificity matrices. Immunogenetics. 2004 Feb 13 [Epub ahead of print]

33 MHC class I binding of SARS peptides Predictions for all supertypes – Broad population coverage Allele-specific neural networks – Peptides with associated measured binding affinity – A1 (A0101), A2 (A0204), A3 (A1101+A0301), B7 (B0702) Weight matrices – Peptides from public databases (SYFPEITHI, MHCpep) – A24, B27, B44, B58 and B62

34 Supertype weight matrices: B27, B62, B58, B44

35 Proteasomal cleavage

36

37 Epitope predictions Binding to MHC class I High probability for C-terminal proteasomal cleavage No sequence variation

38

39 Inside out: 1.Position in RNA 2.Translated regions (blue) 3.Observed variable spots 4.Predicted proteasomal cleavage 5.Predicted A1 epitopes 6.Predicted A*0204 epitopes 7.Predicted A*1101 epitopes 8.Predicted A24 epitopes 9.Predicted B7 epitopes 10.Predicted B27 epitopes 11.Predicted B44 epitopes 12.Predicted B58 epitopes 13.Predicted B62 epitopes

40 SARS - Experimental validation (Christina Sylvester-Hvid, University of Copenhagen, July, 2003) – Peptides are synthesized; first one received May 8 – Peptide preparation (5400 tubes) – Peptides are validated for purity and correct sequence – Peptides are analyzed for peptide binding affinity; method used: the quantitative ELISA technique – Calculation of binding affinity, K_D

41 Strategy for the quantitative ELISA assay (C. Sylvester-Hvid, et al., Tissue Antigens, 2002; 59:251). Christina Sylvester-Hvid, University of Copenhagen, July, 2003. Diagram labels: Development; β2m; Heavy chain; peptide; Incubation; Peptide-MHC complex. Step I: Folding of MHC class I molecules in solution. Step II: Detection of de novo folded MHC class I molecules by ELISA.

42 Summary of peptide binding assays
Supertype | # tested | # binding <500 nM
A1 | 15 | 13
A2 | 15 | 12
A3 | 15 | 14
A24 | 0 | -
B7 | 15 | 10
B27 | 13 | 2
B44 | 0 | -
B58 | 15 | 13
B62 | 14 | 12
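As a quick arithmetic check on the binding-assay counts, the sketch below totals the tested and binding peptides per supertype. The numbers are copied from this slide; the variable and function names are illustrative. Supertypes with no peptides tested yet (A24, B44) are left out.

```python
# (tested, binding with affinity better than 500 nM) per supertype, from the slide.
BINDING_ASSAYS = {
    "A1": (15, 13),
    "A2": (15, 12),
    "A3": (15, 14),
    "B7": (15, 10),
    "B27": (13, 2),
    "B58": (15, 13),
    "B62": (14, 12),
}

def summarize(assays):
    """Return (total tested, total binders, overall binder fraction)."""
    tested = sum(t for t, _ in assays.values())
    binders = sum(b for _, b in assays.values())
    return tested, binders, binders / tested

if __name__ == "__main__":
    tested, binders, fraction = summarize(BINDING_ASSAYS)
    print(f"{binders} of {tested} peptides bind ({fraction:.0%})")
    for supertype, (t, b) in BINDING_ASSAYS.items():
        print(f"{supertype}: {b}/{t} = {b / t:.0%}")
```

The overall fraction, 76 of 102, is consistent with the roughly 75% quoted in the summary of the SARS study.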

43 Initial polytope (19 HIV epitopes): New epitopes: 12; Poor C-term cleavage: 8; Cleavage within: 31; Linker length: 12

44 Optimized polytope: New epitopes: 1; Weak C-term cleavage: 3; Cleavage within: 7; Linker length: 37

45

46

47 MHC class II Molecule

48 Virtual matrices HLA-DR molecules sharing the same pocket amino acid pattern are assumed to have identical amino acid binding preferences.

49 MHC Class II binding Virtual matrices – TEPITOPE: Hammer, J., Current Opinion in Immunology 7, 263-269, 1995 – PROPRED: Singh H, Raghava GP, Bioinformatics 2001 Dec;17(12):1236-7 Web interface: http://www.imtech.res.in/raghava/propred Prediction Results

50 MHC class II prediction Complexity of problem – Peptides of different length – Weak motif signal Alignment crucial Gibbs Monte Carlo sampler RFFGGDRGAPKRG YLDPLIRGLLARPAKLQV KPGQPPRLLIYDASNRATGIPA GSLFVYNITTNKYKAFLDKQ SALLSSDITASVNCAK PKYVHQNTLKLAT GFKGEQGPKGEP DVFKELKVHHANENI SRYWAIRTRSGGI TYSTNEIDLQLSQEDGQTIE
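The alignment problem described here, finding a shared 9-residue core in peptides of different lengths, can be sketched with a small Monte Carlo search. This is a simplified, Metropolis-style illustration in the spirit of a Gibbs sampler, not the method used in the presentation. The scoring function (relative entropy against a uniform background), the pseudocount, the temperature and the step count are all assumptions made for the example.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Peptides copied from the slide; each is at least 9 residues long.
PEPTIDES = [
    "RFFGGDRGAPKRG", "YLDPLIRGLLARPAKLQV", "KPGQPPRLLIYDASNRATGIPA",
    "GSLFVYNITTNKYKAFLDKQ", "SALLSSDITASVNCAK", "PKYVHQNTLKLAT",
    "GFKGEQGPKGEP", "DVFKELKVHHANENI", "SRYWAIRTRSGGI",
    "TYSTNEIDLQLSQEDGQTIE",
]

def alignment_score(peptides, offsets, core_len=9, pseudocount=1.0):
    """Sum over core positions of the relative entropy vs. a uniform background."""
    total = 0.0
    denominator = len(peptides) + pseudocount * len(AMINO_ACIDS)
    for pos in range(core_len):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for peptide, offset in zip(peptides, offsets):
            counts[peptide[offset + pos]] += 1
        for aa in AMINO_ACIDS:
            p = counts[aa] / denominator
            total += p * math.log2(p * len(AMINO_ACIDS))
    return total

def sample_alignment(peptides, core_len=9, steps=2000, temperature=0.2, seed=0):
    """Randomly shift one peptide's core at a time; accept by the Metropolis rule."""
    rng = random.Random(seed)
    offsets = [rng.randrange(len(p) - core_len + 1) for p in peptides]
    score = alignment_score(peptides, offsets, core_len)
    best_offsets, best_score = list(offsets), score
    for _ in range(steps):
        i = rng.randrange(len(peptides))
        previous = offsets[i]
        offsets[i] = rng.randrange(len(peptides[i]) - core_len + 1)
        new_score = alignment_score(peptides, offsets, core_len)
        delta = new_score - score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            score = new_score
            if score > best_score:
                best_offsets, best_score = list(offsets), score
        else:
            offsets[i] = previous  # reject the move
    return best_offsets, best_score

if __name__ == "__main__":
    offsets, score = sample_alignment(PEPTIDES)
    for peptide, offset in zip(PEPTIDES, offsets):
        print(peptide[offset:offset + 9])
```

Ten peptides carry very little signal, so the cores found by this toy run should not be read as a real binding motif; the point is only to show how sampling over core offsets replaces a conventional multiple alignment.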

51 Class II binding motif RFFGGDRGAPKRG YLDPLIRGLLARPAKLQV KPGQPPRLLIYDASNRATGIPA GSLFVYNITTNKYKAFLDKQ SALLSSDITASVNCAK PKYVHQNTLKLAT GFKGEQGPKGEP DVFKELKVHHANENI SRYWAIRTRSGGI TYSTNEIDLQLSQEDGQTI Random ClustalW Gibbs sampler Alignment by Gibbs sampler

52 MHC class II predictions Allele DRB1_0401 Accuracy

53

54

55 Epitope based genetic vaccines Advantages: Epitope vaccines can – be controlled better – induce subdominant epitopes, for example against tumour antigens where there is tolerance against dominant epitopes – target multiple conserved epitopes in rapidly mutating pathogens like HIV and HCV – be analogued to break tolerance – be protective in animal models (Rodriguez, 2001) Ishioka (1999)

56 Genetic vaccines Stimulate synthesis only in cells Advantages – Stimulate cellular immune responses – Standardized method of production Disadvantages – Needs boosting Ellis, RW 1999

57 Polytope construction NH2 COOH Epitope Linker M C-terminal cleavage Cleavage within epitopes New epitopes cleavage

58 Summary of SARS study We have combined bioinformatics and immunology to perform a proteome-wide scan for cytotoxic T cell epitopes directed against SARS and restricted to one of the nine human HLA supertypes (covering >99% of all major human populations). For each HLA supertype, the 15 top candidates were tested in biochemical binding assays. 75% of the epitopes tested thus far bind with an affinity better than 500 nM. More than 112 potential vaccine candidates have been identified thus far. They may be tested in SARS survivors and then included in future vaccine design.

59

60

61 Prediction of antibody epitopes Linear – Hydrophilicity scales (average in ~7 window): Hopp and Woods (1981), Kyte and Doolittle (1982), Parker et al. (1986) – Other scales & combinations: Pellequer and van Regenmortel, Alix Discontinuous – Protrusion (Novotny, Thornton, 1986) Neural networks (In preparation)
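The sliding-window averaging behind the linear scale methods can be sketched as follows. The example uses the Kyte and Doolittle (1982) hydropathy values, on which hydrophilic residues score low, so candidate surface regions are windows with a low average. The window size of 7 follows the slide; the function names are illustrative.

```python
# Kyte-Doolittle hydropathy index (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def window_average(sequence, window=7, scale=KYTE_DOOLITTLE):
    """Average the scale over every full window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [scale[aa] for aa in sequence]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def most_hydrophilic_window(sequence, window=7):
    """Return (start index, average) of the window with the lowest hydropathy."""
    averages = window_average(sequence, window)
    start = min(range(len(averages)), key=averages.__getitem__)
    return start, averages[start]

if __name__ == "__main__":
    peptide = "DVFKELKVHHANENI"  # one of the peptides shown earlier in the deck
    start, value = most_hydrophilic_window(peptide)
    print(peptide[start:start + 7], round(value, 2))
```

Swapping in a different scale only requires another dictionary with the same twenty keys. For a hydrophilicity scale such as Hopp and Woods, the selection would look for the highest rather than the lowest average.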

62 Secondary structure in epitopes
Sec struct | H | T | B | E | S | G | I | .
Log odds ratio | -0.19 | 0.30 | 0.21 | -0.27 | 0.24 | -0.04 | 0.00 | 0.17
H: Alpha-helix (hydrogen bond from residue i to residue i+4) G: 3-10 helix (hydrogen bond from residue i to residue i+3) I: Pi helix (hydrogen bond from residue i to residue i+5) E: Extended strand B: Beta bridge (one-residue short strand) S: Bend (five-residue bend centered at residue i) T: H-bonded turn (3-turn, 4-turn or 5-turn) .: Coil

63 Amino acids in epitopes Amino Acid GAVLIMPFWS e/E 0.09 0.070.050.080.040.020.060.030.010.08.0.070.080.070.100.060.030.05 0.020.07 Amino acid CTQNHYEDKR e/E0.030.080.04 0.020.040.060.07 0.04.0.030.060.040.050.020.030.04 0.050.04 Fre

64 Dihedral angles in epitopes Z-scores for number of dihedral angle combinations in epitopes vs. non epitopes Phi\Psi123456789101112 1-0.470.44-0.580.450.460.00 -0.73-0.790.00-0.831.42 2-0.01-0.12-1.820.521.750.00 1.42-0.820.00 31.82-2.26-1.570.480.100.00-0.770.451.770.00-0.820.99 41.761.15-0.340.750.00 0.970.160.381.030.00 5-0.850.45-1.090.570.00 0.131.520.001.02-0.79 60.601.281.301.730.00 1.32-0.89-0.760.00 70.27-0.911.67-0.510.00 -1.02-1.090.00 80.931.21-0.23-3.630.490.00 -0.190.31-0.82 90.000.28-0.670.330.01-0.830.00 0.870.230.00 100.000.951.71-0.700.00 1.291.080.001.000.00 110.00 1.020.00 0.86-0.750.00 120.420.830.281.680.00 1.03-0.21-0.790.93

65 Immunological bioinformatics Classical experimental research – Few data points – Data recorded by pencil and paper/spreadsheet New experimental methods – Sequencing – DNA arrays – Proteomics Need to develop new methods for handling these large data sets Immunological Bioinformatics/Immunoinformatics

66 Acknowledgements CBS, Technical University of Denmark Søren Brunak (Director of CBS) Morten Nielsen (Epitope prediction) Peder Worning (Genome atlases) Claus Lundegaard (Data bases) Mette Børgesen (CTL prediction) Jesper Schantz (Polytope optimization) IMMI, University of Copenhagen Søren Buus (Professor) Christina Sylvester-Hvid (Experimental coordinator) Kasper Lamberth (Peptide bank, Quality control) Erland Johansson, Jeanette Nielsen (Preparations of peptides) Hanne Møller (ELISA binding assay)

