Phylogenetic Analysis of the SARS virus. Zhang Louxin, Dept of Math, NUS.


1 Phylogenetic Analysis of the SARS virus. Zhang Louxin, Dept of Math, NUS

2 Our Study
Genome-wide analysis of the SARS virus:
a) Comparison to other coronaviruses
b) Comparison between different strains of the SARS virus
Objective:
a) Search for clues to where the SARS virus came from
b) Provide a technique for tracking the step-by-step transmission of SARS

3 Genome Structure of the SARS virus (Marra et al., Rota et al., Ruan et al., 2003)
The RNA genome contains about 30k bps and has five major open reading frames (ORFs):
ORF1a and ORF1b: replicase polyprotein (13149 and 7887 bps)
S: spike glycoprotein (3768 bps)
E: small envelope protein (231 bps)
M: membrane glycoprotein (666 bps)
N: nucleocapsid protein (1269 bps)
and 7 ORFs of unknown function, the X's (2595 bps in total)

4 Part 1: Comparison to other coronaviruses
To compare the sequence of the SARS virus with each known coronavirus, we
a) computed a pairwise alignment between the genomes of SARS-CoV and the coronavirus;
b) assessed the quality of the alignment in a) using VISTA, which measures the match rate f(x) in the window [x-99, x].
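The window match rate f(x) in step b) can be illustrated with a short sketch. This is not VISTA's own code: the function name `window_match_rate`, the use of `-` as the gap character, and the choice to count gap columns as mismatches are all assumptions made for illustration.

```python
def window_match_rate(ref_aln, other_aln, window=100):
    """Match rate f(x) over the window [x-window+1, x] of a pairwise alignment.

    Both arguments are rows of the same alignment (equal length, '-' for gaps).
    Returns one rate per column x for which a full window exists,
    i.e. for x = window-1, ..., len-1 (0-based).
    Assumption: a column counts as a match only if both rows carry the
    same non-gap character.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")
    if window < 1:
        raise ValueError("window must be positive")

    matches = [int(a == b and a != "-") for a, b in zip(ref_aln, other_aln)]
    rates = []
    running = 0  # number of matches inside the current window
    for x, m in enumerate(matches):
        running += m
        if x >= window:
            running -= matches[x - window]  # drop the column that left the window
        if x >= window - 1:
            rates.append(running / window)
    return rates


if __name__ == "__main__":
    # Toy alignment with a window of 3 columns instead of 100.
    print(window_match_rate("ACGTAC", "ACGAAC", window=3))
```

With `window=100` the window covers columns x-99 through x, as on the slide. A region would then be reported as conserved when f(x) stays above a chosen level, such as the 50% or 60% conservation levels used in the plots.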

5 Cow vs SARS_HK
Reference: SARS_HK. Conservation level: 60%. Highlighted segments: ORF1a, ORF1b, S, M, N.

6 Reference: SARS_HK. Conservation level: 50%. Highlighted segments: ORF1a, ORF1b, S, M, N.
Panels: Human (229E), Pig, Turkey, Mouse, Cow, Bird.

7 Observations and Questions
The SARS coronavirus is not a recombinant between known coronaviruses (Marra et al., Rota et al., Ruan et al., 2003).
It probably evolved from an ancestor of the coronavirus family (in an unidentified host) for a long time before infecting humans in
Question 1: What is the host? Civet cat or cow?
Question 2: How did SARS-CoV evolve? Did it acquire genes from its host, and/or exchange genetic information with other viruses?

8 Bioinformatics Alone is Not Enough
Searches of GenBank and genomic databases (with BLAST) indicated that there are no significant sequence matches to ORF1a or to the other ORFs in the last third of the genome.
This motivates us to consider the other ORFs, X1-X7, which encode putative proteins and are located in the last third of the genome.
We conducted
a) BLAST searches;
b) an analysis of nucleotide base differences among different strains of SARS-CoV.
Our analysis shows clearly that X1-X7 are unique to SARS-CoV.

9 BLAST Searches: No significant hits.
X1:
gi| |ref|NM_ | Mus musculus slit homolog 2 (D
gi| |ref|XM_ | Mus musculus slit homolog 2 (D
gi| |gb|AF |AF Mus musculus SLIT2 (Slit
gi| |dbj|AK | Mus musculus 0 day neonate lung
gi| |gb|AF |AF Mus musculus neurogenic e
gi| |dbj|AK | Mus musculus 12 days embryo emb
X2:
gi| |gb|U |DMU23446 Drosophila melanogaster tri
gi| |emb|AL | Mouse DNA sequence from clone R
gi| |emb|AL | Mouse DNA sequence from clone
X3:
gi| |emb|AL | Mouse DNA sequence from clone R
gi| |gb|U | Caenorhabditis elegans cosmid D
gi| |ref|NM_ | Caenorhabditis elegans putativ
gi| |emb|AL | Human DNA sequence from clone
gi| |gb|AC | Homo sapiens chromosome 8, clone
gi| |emb|AL | Mouse DNA sequence from clone R
gi| |gb|AY | Rhingia campestris 28S ribosomal
gi| |gb|AY | Homo sapiens Kell blood group (K
gi| |gb|AC | Mus musculus clone RP24-456N19,
X4:
gi| |gb|AC | Homo sapiens BAC clone RP11-1N
gi| |ref|NM_ | Arabidopsis thaliana formin ho
gi| |gb|AC | Mus musculus chromosome 5 clone
gi| |gb|AC | Homo sapiens BAC clone RP11-402J
gi| |gb|AC | Homo sapiens BAC clone RP11-562F
gi| |gb|AC | Homo sapiens BAC clone RP11-438K
gi| |emb|AL |HS279F22 Homo sapiens chromosome
gi| |emb|AL | Human DNA sequence from clone R
gi| |emb|AL | Human DNA sequence from clone

10 BLAST Searches (Cont'd)
X5:
gi| |gb|AC | Mus musculus chromosome 6 clone
gi| |gb|AC | Dictyostelium discoideum chromos
gi| |gb|AC | Homo sapiens X BAC RP11-804N7 (R
gi| |gb|AY | Dictyostelium discoideum nucleot
gi| |ref|NM_ | Arabidopsis thaliana pyruvate
gi| |gb|AC | Equus caballus clone CH I
gi| |gb|AY | Arabidopsis thaliana clone
gi| |gb|AF | Homo sapiens chromosome X clone
gi| |gb|AY | Homo sapiens Cockayne syndrome
gi| |gb|AC | Mus musculus chromosome 10 clone
X6:
gi| |gb|AC |AC citb_188_b_12, complete s
gi| |gb|AE | Xanthomonas campestris pv. campe
gi| |emb|AL | Zebrafish DNA sequence from cl
gi| |gb|AC | Homo sapiens BAC clone RP11-588K
gi| |gb|AC | Homo sapiens BAC clone RP11-289J
gi| |gb|AC |AC Homo sapiens BAC clone RP
gi| |gb|AC |AC Homo sapiens chromosome
gi| |dbj|AK | Mus musculus 0 day neonate eyeb
gi| |emb|AL | Mouse DNA sequence from clone
X7:
gi| |gb|AE | Drosophila melanogaster chromoso
gi| |ref|XM_ | Rattus norvegicus similar to h
gi| |gb|AC | Mus musculus chromosome 7 clone
gi| |gb|AC | Drosophila melanogaster X BAC RP
gi| |gb|AC | Drosophila melanogaster X BAC RP
gi| |emb|AL |HSJ395C13 Human DNA sequence fro

11 Variations among the SARS-CoV strains: Nucleotide base differences
Vietnam_CDC     tagtggggtt cagtttcgtg ccattccgta ccct
Singapore_2774  tagcggggtt cagtctcgaa tcattttgta ctct
Toronto_02      tagcggggtt cagtctcgta ctatgctata ctct
HongKong        cagcaccgtt caagttcata ctattctgaa ttct
CUHK            tatcggggcc cagtcgtgtg ctactctgta ctcc
Beijing_01      tagcgggtct tcgtcgcgta ctgctctgtc cctt
Guangzhou_01    tagcgggtcc cagtcgcgta ctactctgta ctcc
Taiwan_01       tggcggggtt cagtctcgta ctattctgta ctct
Summary:
a) Differences occur in 34 positions;
b) There are 10 positions where the sequence variant appears in more than one sequence;
c) The base differences could be sequencing errors, mutational noise, or true mutational sites in the SARS genomes.
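The counts in the summary can be rechecked directly from the eight rows of the slide. The sketch below is one possible way to do so; the helper names (`variable_columns`, `shared_variant_columns`, `consensus`, `differences_from`) are hypothetical, and the majority-vote consensus is an assumption about how a reference sequence might be formed, not the authors' stated procedure.

```python
from collections import Counter

# The 34 variable positions per strain, copied from the slide (spaces removed).
STRAINS = {
    "Vietnam_CDC":    "tagtggggtt cagtttcgtg ccattccgta ccct",
    "Singapore_2774": "tagcggggtt cagtctcgaa tcattttgta ctct",
    "Toronto_02":     "tagcggggtt cagtctcgta ctatgctata ctct",
    "HongKong":       "cagcaccgtt caagttcata ctattctgaa ttct",
    "CUHK":           "tatcggggcc cagtcgtgtg ctactctgta ctcc",
    "Beijing_01":     "tagcgggtct tcgtcgcgta ctgctctgtc cctt",
    "Guangzhou_01":   "tagcgggtcc cagtcgcgta ctactctgta ctcc",
    "Taiwan_01":      "tggcggggtt cagtctcgta ctattctgta ctct",
}
STRAINS = {name: seq.replace(" ", "") for name, seq in STRAINS.items()}


def variable_columns(seqs):
    """Indices of columns that contain more than one distinct base."""
    return [i for i, col in enumerate(zip(*seqs.values())) if len(set(col)) > 1]


def shared_variant_columns(seqs):
    """Indices of columns where a non-majority base occurs in at least two sequences."""
    shared = []
    for i, col in enumerate(zip(*seqs.values())):
        counts = Counter(col).most_common()
        if any(n >= 2 for _, n in counts[1:]):
            shared.append(i)
    return shared


def consensus(seqs):
    """Majority base per column (assumed definition of the consensus)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs.values()))


def differences_from(ref, seqs):
    """Number of positions at which each sequence differs from ref."""
    return {name: sum(a != b for a, b in zip(s, ref)) for name, s in seqs.items()}


if __name__ == "__main__":
    print(len(variable_columns(STRAINS)), "variable positions")
    print(len(shared_variant_columns(STRAINS)), "positions with a shared variant")
    print(differences_from(consensus(STRAINS), STRAINS))
```

Variants seen in only one strain are the ones most easily explained by sequencing error or mutational noise, while variants shared by two or more strains carry the signal used for grouping strains.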

12 Simple Statistics
The number of nucleotide base differences (reference: their consensus sequence)
Strains      ORF1a  ORF1b  S  The rest
VietnamCDC   1      3      1  1
Singapore    0      3      1  0
Toronto      0      1      1  1
HongKong     14402
CUHK         3      3      1  1
Beijing      4      1      1  1
Guangzhou    3      1      1  1
Taiwan       1      0      0  0
Means
Deviations
All differences appear in 78 positions.

13 No Significant Differences in X's
Total length is 2595 bps (8.65%). The expected number of differences in these regions is about 5 to 6. Therefore, those X's could be useful as markers for inferring the evolutionary history of the SARS virus.
Region  Position  CDC  GIS  TOR  HK  CUHK  BJ  GZ  TW
X1      25298     g    g    a    g   g     g   g   g
X1      25569     t    t    t    a   t     t   t   t
X2
X3      27243     c    c    c    c   c     t   c   c
X4
X5
X6
X7

14 Part 2: Phylogenetic analysis in tracking the step-by-step transmission
One unusual property of viruses is that they evolve rapidly. This property is unfortunate from the perspective of creating a vaccine. However, their rapid evolution enables fine-scale phylogenetic analysis. For example, it has been used for
a) finding the origins of the HIV viruses (Gao et al. ’99, Hahn et al. ’00)
b) tracking man-to-man transmission of the HIV virus (Ou et al. ’92)
c) deriving the transmission model of influenza (Fitch et al. ’97, Bush et al. ’99)

15 Phylogenetic Analysis
A molecular phylogeny summarizes the genetic variation among the molecular sequences being studied (Marra et al., 2003).
The rationale behind disease tracking: if A infects B and then B infects C, then the virus in C is more similar to the virus in B than to the virus in A.

16 Phylogenetic Analysis of 8 Strains
a) Compute pairwise alignments over the ORF1b region;
b) Find pairwise distances using the Jukes and Cantor model;
c) Draw the phylogeny using the neighbor-joining method.
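Step b) can be sketched as follows. Under the Jukes and Cantor model, the evolutionary distance between two aligned sequences is d = -(3/4) ln(1 - 4p/3), where p is the fraction of aligned sites that differ. The function names, the `-` gap character, and the choice to skip gapped columns are assumptions for illustration; the slides do not say how gaps were handled.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites, ignoring columns with a gap in either row."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(a, b):
    """Jukes-Cantor distance d = -(3/4) * ln(1 - 4p/3)."""
    p = p_distance(a, b)
    if p == 0:
        return 0.0
    if p >= 0.75:
        return math.inf  # the correction is undefined at or beyond saturation
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(seqs):
    """Symmetric dict-of-dicts of Jukes-Cantor distances for named sequences."""
    names = list(seqs)
    d = {n: {n: 0.0} for n in names}
    for u, v in combinations(names, 2):
        d[u][v] = d[v][u] = jukes_cantor(seqs[u], seqs[v])
    return d


if __name__ == "__main__":
    toy = {"A": "acgtacgt", "B": "acgtacga", "C": "aagtacga"}
    for name, row in distance_matrix(toy).items():
        print(name, {k: round(v, 4) for k, v in row.items()})
```

The resulting matrix is the input that a distance-based tree-building method such as neighbor joining works from in step c).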

17 Phylogenetic Analysis of 8 Strains (Cont'd)
The phylogeny was constructed over the S-E-M-N regions using the parsimony method.
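The parsimony method prefers the tree that explains the sequences with the fewest base changes. As a minimal sketch of the scoring step only, the code below uses Fitch's algorithm to count the changes a given rooted binary tree requires; the slides do not say which parsimony program or tree search was used, and the nested-tuple tree encoding and function names here are assumptions.

```python
def fitch_site(tree, leaf_base):
    """Fitch pass for one site.

    tree is either a leaf name (str) or a pair (left, right) of subtrees.
    Returns (candidate_bases_at_this_node, minimum_number_of_changes).
    """
    if isinstance(tree, str):
        return {leaf_base[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, leaf_base)
    right_set, right_cost = fitch_site(right, leaf_base)
    common = left_set & right_set
    if common:
        # The children can agree on a base, so no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree, so one change is charged at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, seqs):
    """Total number of changes over all sites of equal-length aligned sequences."""
    n_sites = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: s[i] for name, s in seqs.items()})[1]
        for i in range(n_sites)
    )


if __name__ == "__main__":
    toy = {"A": "aa", "B": "aa", "C": "gc", "D": "gc"}
    print(parsimony_length((("A", "B"), ("C", "D")), toy))  # groups identical strains
    print(parsimony_length((("A", "C"), ("B", "D")), toy))  # splits them apart
```

A full parsimony analysis would compare this score across many candidate trees and keep the one with the lowest total.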

18 Phylogenetic Analysis of 8 Strains (Cont'd)
Our work indicates that it is possible to track the step-by-step transmission of SARS using molecular data. However, a lot of work remains to be done. For example, we need to study the effects of mutational noise on the phylogenetic analysis.

19 Conclusion
a) Our genome-wide analysis indicates that the regions that encode unknown proteins in the last third of the SARS genome could be useful as markers for studying the virus.
b) Phylogenetic analysis can be used for tracking the step-by-step transmission of SARS.
The work was discussed with Louis Chen, Choi Kwok Pui, David Chew, Phil Long, P. Kolatkar, Vega Vinsensius, Lin Chin-Yo. The work was done with Li Quan and Wu YongHui.

