Immunological bioinformatics Ole Lund, Center for Biological Sequence Analysis (CBS) Denmark.



Worldwide spread of SARS
Status as of July 11, 2003: 8437 infected, 813 dead

SARS
– First severe infectious disease to emerge in the post-genomic era
– Modern societies are vulnerable to epidemics
– Classical containment strategies have been successful in controlling the epidemic, but
  – SARS may resurface (e.g. be seasonal)
  – The suggested existence of an animal reservoir could compromise the containment strategy
– Need to develop a vaccine strategy
– Biotechnology has provided new tools to analyze genome/proteome information and guide vaccine development
– The causative virus, the SARS coronavirus (SARS CoV), has been isolated and its full-length genome sequenced

Main scientific achievements
– Discovery of causative agent
– Genome(s)
– 3D structure of main proteinase
– Seromics
  – Fragments -> antibodies
– T cell epitopes
– VSV-SARS (spike) co-transfection
  – Pseudovirus: light when entry

Main scientific achievements
– Discovery of causative agent
– Genome(s)
– 3D structure of main proteinase
– Origin
  – Similar virus found in Himalayan palm civets and other animals, including a raccoon-dog, and in humans working at an animal market in Guangdong, China (Guan et al., Sep 4, 2003)
[Pictures: Himalayan (masked) palm civet, ferret-badger, raccoon-dog]

Discovery of causative agent
– Random Arbitrarily Primed (RAP) PCR method
– Koch's postulates (modified by Rivers for viruses, 1937) fulfilled
  1. Isolation of virus from diseased hosts
  2. Cultivation in host cells
  3. Proof of filterability
  4. Infection causes similar disease in original or related host species (macaques)
  5. Re-isolation of virus
  6. Detection of specific immune response to the virus
– Clinical samples
  – SARS CoV found in 75% of SARS patients
  – Human metapneumovirus found in 12% of patients
  – Other agents only sporadically found
Source: Albert Osterhaus, Beijing June, 2003; Nature May 2003; Lancet July, 2003

Transmission
– No symptoms: -
– Early period: +
– Very ill: +++
– 10 days post fever: -
– 41 flights -> 25 transmitted cases
Prevention
– Early tracking of contacts
Origin
– Seroconversion (Guan, 2003): animal traders 40%, vegetable traders 5%
Source: Claus Stohr, Beijing June, 2003

New coronaviruses
– 1978: Porcine epidemic diarrhea virus (PEDV), probably from humans
– 1984: Porcine respiratory coronavirus
– 1987: Porcine reproductive and respiratory syndrome (PRRS)
– 1993: Bovine coronavirus
– 2003: SARS
Source: Michael Buchmeier, Beijing June, 2003

Will it be back? When?
– Every year? Like the flu.
– Every few years? Like measles used to.
– Sporadic? Like Ebola.
– Never?
Lab safety: The patient, a 27-year-old virologist, worked on the West Nile virus in a biosafety level 3 lab at the Environmental Health Institute, where the SARS coronavirus was also studied (Enserink, 2003)

How does the immune system “see” a virus?

The immune system
The innate immune system
– Found in animals and plants
– Fast response
– Complement, Toll-like receptors
The adaptive immune system
– Found in vertebrates
– Stronger response the 2nd time
– B lymphocytes
  – Produce antibodies (Abs) that recognize 3D shapes
  – Neutralize virus/bacteria outside cells
– T lymphocytes
  – Cytotoxic T lymphocytes (CTLs), MHC class I
    – Recognize foreign protein sequences in infected cells
    – Kill infected cells
  – Helper T lymphocytes (HTLs), MHC class II
    – Recognize foreign protein sequences presented by immune cells
    – Activate cells

SARS, a coronavirus

Vaccine concerns
– Enhancement
  – Inactivated RSV and measles vaccines can lead to a more severe disease
  – Infection with one (of four, 40-65% identity) serotypes of Dengue virus leads to more severe disease if later infected with another serotype
– Test
  – Erasmus University monkey model
  – "Mobile" vaccine efficacy clinical test protocols that can move to the site of an outbreak

The SARS genome
– 29,736 nt
– Single-stranded RNA(+) genome

Weight matrices (Hidden Markov models)
YMNGTMSQV GILGFVFTL ALWGFFPVV ILKEPVHGV ILGFVFTLT LLFGYPVYV GLSPTVWLS WLSLLVPFV FLPSDFFPS CVGGLLTMV FIAGNSAYE
[Figure: A2 logo]
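To make the weight-matrix idea concrete, here is a minimal sketch that builds a position-specific log-odds matrix from the eleven 9-mers on the slide. The flat background (1/20 per amino acid), the pseudocount of 1, and the function names are assumptions made for illustration; they are not taken from the slides, and the method behind the A2 logo may differ.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# The eleven 9-mer peptides listed on the slide.
PEPTIDES = [
    "YMNGTMSQV", "GILGFVFTL", "ALWGFFPVV", "ILKEPVHGV", "ILGFVFTLT",
    "LLFGYPVYV", "GLSPTVWLS", "WLSLLVPFV", "FLPSDFFPS", "CVGGLLTMV",
    "FIAGNSAYE",
]


def build_weight_matrix(peptides, pseudocount=1.0):
    """Return one dict per position mapping amino acid -> log2 odds score.

    Assumption: flat background frequency of 1/20 for every amino acid.
    """
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(len(peptides[0])):
        counts = Counter(p[pos] for p in peptides)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        matrix.append(column)
    return matrix


def score(matrix, peptide):
    """Sum the per-position scores (positions are treated as independent)."""
    return sum(column[aa] for column, aa in zip(matrix, peptide))


matrix = build_weight_matrix(PEPTIDES)
```

With these peptides, leucine gets the highest weight at position 2, which matches the well-known anchor preference of HLA-A2.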

Protein sequence information content
– Entropy
  – Average uncertainty in the random variable
  – $H = -\sum_i p_i \log_2 p_i$, range: 0 to $\log_2(20) \approx 4.3$
  – Logo height: $I = \log_2(20) - H$
– Relative entropy (Kullback-Leibler distance)
  – $D = \sum_i p_i \log_2(p_i / q_i)$, range: 0 to infinity
– Mutual information
  – Reduction in uncertainty due to knowledge of another random variable (corresponds to correlation)
  – $M = \sum_{i,j} p_{ij} \log_2\left(p_{ij} / (p_i p_j)\right)$
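The three quantities can be computed directly from observed frequencies. The sketch below is illustrative only: the function names are hypothetical, probabilities are estimated as raw frequencies with no pseudocounts or sequence weighting, and the logo height is taken as $\log_2(20)$ minus the entropy, so that a fully conserved column reaches the maximum height.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """H = -sum_i p_i log2 p_i for the residues observed in one alignment column."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())


def information_content(column, alphabet_size=20):
    """Logo height: maximal entropy minus observed entropy."""
    return math.log2(alphabet_size) - shannon_entropy(column)


def relative_entropy(p, q):
    """Kullback-Leibler distance D = sum_i p_i log2(p_i / q_i)."""
    return sum(pi * math.log2(pi / q[a]) for a, pi in p.items() if pi > 0)


def mutual_information(pairs):
    """M = sum_ij p_ij log2(p_ij / (p_i p_j)) for a list of (residue, residue) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return sum(
        (c / n) * math.log2((c / n) / ((left[a] / n) * (right[b] / n)))
        for (a, b), c in joint.items()
    )
```

For example, `information_content("LLLL")` gives the full $\log_2(20)$ bits, while a column containing each of the 20 amino acids once gives 0.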

Prediction of MHC binding specificity
– Simple motifs
  – Allowed (non-allowed) amino acids
– Extended motifs
  – Amino acid preferences
– Structural models
  – Limitations: precision of force field, and speed of calculations
– Neural networks
  – Can take correlations into account

Log odds ratios
– Used for scoring alignments (BLAST), HMMs, matrix methods
– Odds ratio of observing given amino acids
  – Relative probability of observing amino acid $i$ in motif position $j$
  – $O_j = p(\mathrm{aa}_i \text{ at pos } j) / p(\mathrm{aa}_i)$
– Assumption of independence =>
  – Odds for observing sequence $= O_1 O_2 \cdots O_n$
– Log odds ratio
  – $LO = \log(O_1 O_2 \cdots O_n) = \log(O_1) + \log(O_2) + \cdots + \log(O_n)$
  – $LO$ in half bits $= 2\,LO / \log(2)$
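The scoring rule above amounts to a short loop. In this sketch the function name and the toy two-letter probabilities in the usage note are made up for illustration; only the formula itself comes from the slide.

```python
import math


def log_odds_half_bits(peptide, position_probs, background):
    """Score a peptide as 2 * LO / log(2), with LO the summed natural-log odds.

    position_probs[j][aa] is p(aa at position j); background[aa] is p(aa).
    Positions are assumed independent, so the per-position logs are added.
    """
    lo = 0.0
    for j, aa in enumerate(peptide):
        odds = position_probs[j][aa] / background[aa]
        lo += math.log(odds)
    return 2.0 * lo / math.log(2.0)
```

For a toy two-position motif with `position_probs = [{"A": 0.5, "L": 0.5}, {"A": 0.1, "L": 0.9}]` and a background of 0.25 for both letters, the peptide `"AL"` has odds 2 and 3.6, so its score is twice the base-2 logarithm of 7.2.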


Evaluation of prediction accuracy
– Coverage = TP / actual_positive
– Reliability = TP / predicted_positive
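As a small worked example, the two measures can be computed from predicted scores and measured binder labels. The function name, the `>=` threshold convention, and the choice to return 0.0 for an empty denominator are assumptions for this sketch.

```python
def coverage_and_reliability(scores, is_binder, threshold):
    """Return (coverage, reliability) when peptides scoring >= threshold are called binders.

    coverage    = TP / actual_positive     (also known as sensitivity or recall)
    reliability = TP / predicted_positive  (also known as precision)
    """
    tp = sum(1 for s, b in zip(scores, is_binder) if s >= threshold and b)
    predicted_positive = sum(1 for s in scores if s >= threshold)
    actual_positive = sum(1 for b in is_binder if b)
    coverage = tp / actual_positive if actual_positive else 0.0
    reliability = tp / predicted_positive if predicted_positive else 0.0
    return coverage, reliability
```

With scores `[0.9, 0.8, 0.3, 0.1]`, labels `[True, False, True, False]` and a threshold of 0.5, one of the two true binders is found and one of the two predictions is correct, so both measures are 0.5.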

A*1101 performance 154 peptides, 9 Binders

The MHC gene region
From Bill Paul, "Fundamental Immunology", 4th Ed.

Human leukocyte antigen (HLA = MHC in humans) polymorphism - alleles
– Class I: a total of 229 HLA-A, 464 HLA-B and 111 HLA-C alleles have been named
– Class II: a total of 2 HLA-DRA, 364 HLA-DRB, 22 HLA-DQA1, 48 HLA-DQB1, 20 HLA-DPA1 and 96 HLA-DPB1 sequences have also been assigned
As of October 2001

HLA polymorphism - supertypes
– Each HLA molecule within a supertype essentially binds the same peptides
– Nine major HLA class I supertypes have been defined
  – HLA-A1, A2, A3, A24, B7, B27, B44, B58, B62
Sette et al, Immunogenetics (1999) 50:

HLA polymorphism - frequencies
Phenotype frequencies

Supertypes       Caucasian  Black  Japanese  Chinese  Hispanic  Average
A2, A3, B27      83 %       86 %   88 %      88 %     86 %      86 %
+A1, A24, B44    100 %      98 %   100 %     100 %    99 %      99 %
+B7, B58, B62    100 %      100 %  100 %     100 %    100 %     100 %

Sette et al, Immunogenetics (1999) 50:

Conclusions
We suggest
– splitting some of the alleles in the A1 supertype into a new A26 supertype
– splitting some of the alleles in the B27 supertype into a new B39 supertype
– that the B8 alleles may define their own supertype
– that the specificities of the class II molecules can be clustered into nine classes, which only partly correspond to the serological classification
Lund O, Nielsen M, Kesmir C, Petersen AG, Lundegaard C, Worning P, Sylvester-Hvid C, Lamberth K, Roder G, Justesen S, Buus S, Brunak S. Definition of supertypes for HLA molecules using clustering of specificity matrices. Immunogenetics Feb 13 [Epub ahead of print]

MHC class I binding of SARS peptides
– Predictions for all supertypes
  – Broad population coverage
– Allele-specific neural networks
  – Peptides with associated measured binding affinity
  – A1 (A0101), A2 (A0204), A3 (A1101 + A0301), B7 (B0702)
– Weight matrices
  – Peptides from public databases (SYFPEITHI, MHCpep)
  – A24, B27, B44, B58 and B62

Supertype weight matrices
[Figure: logos for B27, B62, B58 and B44]

Proteasomal cleavage

Epitope predictions
– Binding to MHC class I
– High probability for C-terminal proteasomal cleavage
– No sequence variation
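The three criteria can be combined as a simple filter over candidate peptides. This is a sketch under stated assumptions: the `Candidate` record, its field names and the 0.5 cleavage cut-off are hypothetical; the 500 nM binding threshold is the one used elsewhere in this talk for the binding assays.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    peptide: str
    start: int             # 0-based start position in the source protein
    affinity_nm: float     # predicted MHC class I binding affinity in nM
    cterm_cleavage: float  # predicted probability of cleavage after the last residue


def select_epitopes(candidates, variable_positions,
                    max_affinity_nm=500.0, min_cleavage=0.5):
    """Keep candidates that bind, are cleaved at the C-terminus, and are conserved."""
    selected = []
    for c in candidates:
        covered = range(c.start, c.start + len(c.peptide))
        conserved = not any(pos in variable_positions for pos in covered)
        binds = c.affinity_nm < max_affinity_nm
        cleaved = c.cterm_cleavage >= min_cleavage
        if binds and cleaved and conserved:
            selected.append(c)
    # Strongest predicted binders (lowest nM) first.
    return sorted(selected, key=lambda c: c.affinity_nm)
```

A candidate is dropped as soon as any one criterion fails, so the filter is conservative; a weighted combination of the three scores would be an alternative design.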

Inside out:
1. Position in RNA
2. Translated regions (blue)
3. Observed variable spots
4. Predicted proteasomal cleavage
5. Predicted A1 epitopes
6. Predicted A*0204 epitopes
7. Predicted A*1101 epitopes
8. Predicted A24 epitopes
9. Predicted B7 epitopes
10. Predicted B27 epitopes
11. Predicted B44 epitopes
12. Predicted B58 epitopes
13. Predicted B62 epitopes

SARS - Experimental validation
– Peptides are synthesized, first one received May 8
– Peptide preparation (5400 tubes)
– Peptides are validated for purity and correct sequence
– Peptides are analyzed for binding affinity; method used: the quantitative ELISA technique
– Calculation of binding affinity, K_D
Christina Sylvester-Hvid, University of Copenhagen, July, 2003

Strategy for the quantitative ELISA assay
– Step I: Folding of MHC class I molecules in solution (heavy chain + β2m + peptide, incubation -> peptide-MHC complex)
– Step II: Detection of de novo folded MHC class I molecules by ELISA (development)
C. Sylvester-Hvid, et al., Tissue Antigens, 2002; 59:251

Summary of peptide binding assays

Supertype  #tested  #binding <500 nM
A1         15       13
A2         15       12
A3         15       14
A24        0        -
B7         15       10
B27        13       2
B44        0        -
B58
B62        14       12

Initial polytope (19 HIV epitopes)
– New epitopes: 12
– Poor C-term cleavage: 8
– Cleavage within: 31
– Linker length: 12

Optimized polytope
– New epitopes: 1
– Weak C-term cleavage: 3
– Cleavage within: 7
– Linker length: 37

MHC class II Molecule

Virtual matrices
HLA-DR molecules sharing the same pocket amino acid pattern are assumed to have identical amino acid binding preferences.

MHC class II binding
Virtual matrices
– TEPITOPE: Hammer, J., Current Opinion in Immunology 7, , 1995
– PROPRED: Singh H, Raghava GP, Bioinformatics 2001 Dec;17(12):
[Screenshots: web interface, prediction results]

MHC class II prediction
– Complexity of problem
  – Peptides of different length
  – Weak motif signal
– Alignment crucial
– Gibbs Monte Carlo sampler
RFFGGDRGAPKRG YLDPLIRGLLARPAKLQV KPGQPPRLLIYDASNRATGIPA GSLFVYNITTNKYKAFLDKQ SALLSSDITASVNCAK PKYVHQNTLKLAT GFKGEQGPKGEP DVFKELKVHHANENI SRYWAIRTRSGGI TYSTNEIDLQLSQEDGQTIE
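To illustrate how a Gibbs sampler can align peptides of different length, the sketch below resamples the offset of a 9-residue core in each peptide, one peptide at a time, using a profile built from all the others. It is a simplified textbook-style motif sampler, not the method behind the results in this talk: the core length of 9, the flat background, the pseudocount, the fixed iteration count and all names are assumptions, and refinements such as sequence weighting or temperature annealing are left out.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# The ten peptides listed on the slide.
PEPTIDES = [
    "RFFGGDRGAPKRG", "YLDPLIRGLLARPAKLQV", "KPGQPPRLLIYDASNRATGIPA",
    "GSLFVYNITTNKYKAFLDKQ", "SALLSSDITASVNCAK", "PKYVHQNTLKLAT",
    "GFKGEQGPKGEP", "DVFKELKVHHANENI", "SRYWAIRTRSGGI",
    "TYSTNEIDLQLSQEDGQTIE",
]


def column_profile(cores, pseudocount=1.0):
    """Per-position amino acid probabilities estimated from aligned cores."""
    total = len(cores) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for pos in range(len(cores[0])):
        counts = Counter(core[pos] for core in cores)
        profile.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return profile


def gibbs_align(peptides, core_len=9, iterations=200, seed=0):
    """Return one core start offset per peptide after Gibbs sampling."""
    rng = random.Random(seed)
    background = 1.0 / len(AMINO_ACIDS)
    offsets = [rng.randrange(len(p) - core_len + 1) for p in peptides]
    for _ in range(iterations):
        for i, pep in enumerate(peptides):
            # Build the profile from every peptide except the one being resampled.
            others = [p[o:o + core_len]
                      for j, (p, o) in enumerate(zip(peptides, offsets)) if j != i]
            profile = column_profile(others)
            # Weight each candidate window by its odds under that profile.
            weights = []
            for start in range(len(pep) - core_len + 1):
                window = pep[start:start + core_len]
                log_odds = sum(math.log(profile[k][aa] / background)
                               for k, aa in enumerate(window))
                weights.append(math.exp(log_odds))
            # Sample the new offset in proportion to the weights.
            offsets[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return offsets
```

Calling `gibbs_align(PEPTIDES)` returns ten offsets; slicing each peptide at its offset gives the aligned 9-mer cores from which a weight matrix or logo can then be built.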

Class II binding motif
RFFGGDRGAPKRG YLDPLIRGLLARPAKLQV KPGQPPRLLIYDASNRATGIPA GSLFVYNITTNKYKAFLDKQ SALLSSDITASVNCAK PKYVHQNTLKLAT GFKGEQGPKGEP DVFKELKVHHANENI SRYWAIRTRSGGI TYSTNEIDLQLSQEDGQTI
[Figure: motifs from random alignment, ClustalW and Gibbs sampler. Alignment by Gibbs sampler]

MHC class II predictions
[Figure: accuracy for allele DRB1_0401]

Epitope-based genetic vaccines
Advantages: epitope vaccines can
– be controlled better
– induce subdominant epitopes, for example against tumour antigens where there is tolerance against dominant epitopes
– target multiple conserved epitopes in rapidly mutating pathogens like HIV and HCV
– be analogued to break tolerance
– be protective in animal models (Rodriguez, 2001)
Ishioka (1999)

Genetic vaccines
– Stimulate synthesis only in cells
– Advantages
  – Stimulate cellular immune responses
  – Standardized method of production
– Disadvantages
  – Need boosting
Ellis, RW 1999

Polytope construction
[Figure: polytope from NH2 to COOH terminus, starting with M, with epitopes joined by linkers; annotations: C-terminal cleavage, cleavage within epitopes, new epitopes]
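One of the quantities tracked during polytope construction, the number of new epitopes created at junctions, starts from a list of junction-spanning peptides. The sketch below enumerates every 9-mer in a polytope that is not contained in a single intended epitope; each of these would then be passed to an MHC binding predictor. The function name, the fixed 9-mer length and the example linker `"AAY"` are illustrative assumptions, and the leading methionine is left out for simplicity.

```python
def junction_peptides(epitopes, linker="", length=9):
    """List the length-mers of the polytope that span an epitope boundary or linker."""
    polytope = linker.join(epitopes)
    # Record the (start, end) span of each intended epitope inside the polytope.
    spans = []
    cursor = 0
    for epitope in epitopes:
        spans.append((cursor, cursor + len(epitope)))
        cursor += len(epitope) + len(linker)
    junctions = []
    for start in range(len(polytope) - length + 1):
        end = start + length
        inside_one_epitope = any(s <= start and end <= e for s, e in spans)
        if not inside_one_epitope:
            junctions.append(polytope[start:end])
    return junctions
```

For two 9-mer epitopes joined directly, 8 of the 10 possible 9-mers cross the junction; with a three-residue linker the count rises to 11, which is why linker choice and epitope order both matter when minimizing new epitopes.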

Summary of SARS study
– We have combined bioinformatics and immunology to perform a proteome-wide scan for cytotoxic T cell epitopes directed against SARS and restricted to one of the nine human HLA supertypes (covering >99% of all major human populations).
– For each HLA supertype, the 15 top candidates were tested in biochemical binding assays.
– 75% of epitopes tested thus far bind with an affinity better than 500 nM.
– More than 112 potential vaccine candidates have been identified thus far. They may be tested in SARS survivors and then included in future vaccine design.

Prediction of antibody epitopes
– Linear
  – Hydrophilicity scales (average in ~7-residue window)
    – Hopp and Woods (1981)
    – Kyte and Doolittle (1982)
    – Parker et al. (1986)
  – Other scales & combinations
    – Pellequer and van Regenmortel
    – Alix
– Discontinuous
  – Protrusion (Novotny, Thornton, 1986)
– Neural networks (in preparation)
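The scale-based approach reduces to a sliding-window average over per-residue values. The sketch below uses the Kyte and Doolittle hydropathy values as they are commonly tabulated (check them against the original paper before relying on them); on this scale hydrophobic residues are positive, so low window averages mark hydrophilic stretches. The function name and the restriction to odd window sizes are assumptions of this example.

```python
# Kyte-Doolittle hydropathy values as commonly tabulated (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_average(sequence, scale, window=7):
    """Return the mean scale value for each full window, centred on each residue.

    The first value corresponds to residue index window // 2; positions too
    close to either end to fill a whole window are skipped.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    half = window // 2
    profile = []
    for center in range(half, len(sequence) - half):
        segment = sequence[center - half:center + half + 1]
        profile.append(sum(scale[aa] for aa in segment) / window)
    return profile
```

Residues at the minima of `window_average(protein, KYTE_DOOLITTLE)` would be the candidates for linear epitopes under this simple scheme; other scales can be swapped in by passing a different dictionary.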

Secondary structure in epitopes
[Table: log odds ratio by secondary structure class: H, T, B, E, S, G, I, .]
– H: Alpha-helix (hydrogen bond from residue i to residue i+4)
– G: 3-10 helix (hydrogen bond from residue i to residue i+3)
– I: Pi helix (hydrogen bond from residue i to residue i+5)
– E: Extended strand
– B: Beta bridge (one-residue short strand)
– S: Bend (five-residue bend centered at residue i)
– T: H-bonded turn (3-turn, 4-turn or 5-turn)
– . : Coil

Amino acids in epitopes
[Table: e/E frequency per amino acid, in two rows: G A V L I M P F W S and C T Q N H Y E D K R]

Dihedral angles in epitopes Z-scores for number of dihedral angle combinations in epitopes vs. non epitopes Phi\Psi

Immunological bioinformatics
– Classical experimental research
  – Few data points
  – Data recorded by pencil and paper/spreadsheet
– New experimental methods
  – Sequencing
  – DNA arrays
  – Proteomics
– Need to develop new methods for handling these large data sets
– Immunological bioinformatics/immunoinformatics

Acknowledgements
CBS, Technical University of Denmark
– Søren Brunak (Director of CBS)
– Morten Nielsen (Epitope prediction)
– Peder Worning (Genome atlases)
– Claus Lundegaard (Databases)
– Mette Børgesen (CTL prediction)
– Jesper Schantz (Polytope optimization)
IMMI, University of Copenhagen
– Søren Buus (Professor)
– Christina Sylvester-Hvid (Experimental coordinator)
– Kasper Lamberth (Peptide bank, quality control)
– Erland Johansson, Jeanette Nielsen (Preparation of peptides)
– Hanne Møller (ELISA binding assay)