Mapping Influenza A Virus Transmission Networks with Whole Genome Comparisons (Methods). Adrienne Breland

Presentation transcript:

1 Mapping Influenza A Virus Transmission Networks with Whole Genome Comparisons (Methods)‏ Adrienne Breland TTGTGGATTCTTGATCGTCTTTTCTTCAAATGTAT TTATCGTCGCCTTAAATACGGA AAGAACCTTTATGACAAGGTTCGACTACA GCTTAGGGATAATGCAAAGGAGCTGGT

2 Goal: to characterize global Influenza A Virus transmission as a complex network

3 Proposed global H3N2 circulation. Russell (2008) The global circulation of seasonal influenza A (H3N2) viruses

5 Outline: Motivation, Major Questions, Data, Genome Comparison Method

6 Outline: Motivation, Major Questions, Data, Genome Comparison Method

7 Motivation: Delineating real disease networks is difficult – Infection tracing: Detecting exact transmission links – Contact tracing: All potential transmission contacts – Diary Based: Subject records all contacts

8 Motivation: Infection tracing, Contact tracing, Diary Based. Keeling M & K Eames (2005) Networks and epidemic models. J. R. Soc. Interface 2:295-307

9 Motivation: Delineating real disease networks is very useful

10 Motivation: Delineating real disease networks is very useful – targeting an attack

11 Motivation: Delineating real disease networks is very useful. Error and attack tolerance of complex networks. Réka Albert, Hawoong Jeong and Albert-László Barabási

12 Motivation: Delineating real disease networks is very useful. http://prblog.typepad.com/strategic_public_relation/images/2007/06/22/simple_social_network.png

13 Motivation: Delineating real disease networks is very useful – correlation coefficients

14 Motivation: Delineating real disease networks is very useful – detecting more probable global routes

15 Motivation: Global routes

16 Motivation: Global routes. Breland A, S Nasser, K Schlauch, M Nicolescu, F Harris (2008) Efficient Influenza A Virus Origin Detection. Journal of Electronics and Computer Science, 10:1-12

17 Motivation: Delineating real disease networks is very useful – examine with other spatial data

18 Motivation: Spatial data

19 Motivation: Spatial data – VEGETATION

20 Motivation: Spatial data – POPULATION

21 Motivation: Spatial data – CLIMATE CHANGE

22 Outline: Motivation, Major Questions, Data, Genome Comparison Method

23 Major questions – Location and degree of host jumping – Underlying structure (small world, power law, ...) – Subtype independence – Re-assortment – Geographic routes

24 Outline: Motivation, Major Questions, Data, Genome Comparison Method

25 Data: http://www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html

26 Data: ≈ 4000 sequences, 1999-2009 – Global regions (e.g. China, U.S., Africa, India, ...) – All subtypes (e.g. H5N1, H1N1, ...) – All host species (Domestic Avian, Wild Avian, etc.)

27 Data: ≈ 374 per year

28 Data: Multiple host types

29 Data: Multiple subtypes

30 Outline: Motivation, Major Questions, Data, Genome Comparison Method

31 Genome Comparisons: Similarity matrix, N sequences: N(N-1)/2 comparisons. [Figure: upper-triangular N × N similarity matrix, rows and columns indexed 1..N, filled with example scores between 0 and 1]
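
The N(N-1)/2 count on this slide can be checked with a small sketch. The function names `pairwise_matrix` and `fraction_identical` are hypothetical and not from the talk, and the toy score merely stands in for whatever comparison is actually used.

```python
from itertools import combinations


def pairwise_matrix(sequences, compare):
    """Fill the upper triangle of an N x N matrix with one compare() call per pair."""
    n = len(sequences)
    matrix = [[None] * n for _ in range(n)]
    calls = 0
    for i, j in combinations(range(n), 2):  # every unordered pair with i < j
        matrix[i][j] = compare(sequences[i], sequences[j])
        calls += 1
    return matrix, calls


def fraction_identical(a, b):
    """Toy stand-in score (an assumption): share of matching positions."""
    length = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / length


seqs = ["ACGT", "ACGA", "TCGA", "TTTT"]
matrix, calls = pairwise_matrix(seqs, fraction_identical)
print(calls)  # 6, i.e. 4 * 3 / 2
```

Only the upper triangle is filled because the comparison is assumed to be symmetric, so the lower triangle would repeat the same values.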

32 Romanova J (2006) The fight against new types of influenza virus. Biotechnology J, 1:1381-1392

33 Genome Comparisons: 8 segments. [Figure: eight upper-triangular similarity matrices, one per segment] HA ≈ 1750 bp, NS ≈ 900 bp, M ≈ 1000 bp, NA ≈ 1300 bp, NP ≈ 1500 bp, PA ≈ 2100 bp, PB1 ≈ 2200 bp, PB2 ≈ 2300 bp

34 Genome Comparisons: 8 segments

35 Genome Comparisons: Alignment, $O(n^2)$, n = max sequence length. Example fragments: ...AAAACTTGAACC... and ...GGACTTGACCT...
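
To illustrate where the quadratic cost of alignment comes from, here is a minimal sketch of a standard global-alignment (Needleman-Wunsch style) score. The talk does not say which aligner is meant, so both the algorithm choice and the scoring values (`match`, `mismatch`, `gap`) are assumptions made for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Dynamic-programming alignment score; fills len(a) * len(b) cells."""
    cols = len(b) + 1
    prev = [j * gap for j in range(cols)]  # row 0: b aligned against gaps only
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * (cols - 1)
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]


# The two fragments shown on the slide
score = global_alignment_score("AAAACTTGAACC", "GGACTTGACCT")
print(score)
```

The nested loops visit every cell of the table once, so for two sequences of length about n the work grows as n squared, which is the cost quoted on the slide.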

36 Genome Comparisons: Alignment-free, k-mers, $O(n)$. $\Sigma = \{A, C, G, T/U\}$; $4^k$ possible k-mers, $k \ge 0$. [Table: columns k-word and frequency; rows AA, AC, AG, ..., TG, TT]
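
A minimal sketch of the k-mer counting behind this slide: one pass over the sequence, hence linear time. The function name `count_kmers` is hypothetical, and mapping U to T is an assumption made here so that RNA and DNA notation share one alphabet.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer in a single left-to-right pass."""
    if k < 1:
        raise ValueError("k must be at least 1")
    seq = sequence.upper().replace("U", "T")  # assumption: treat U as T
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


counts = count_kmers("AAACA", 2)
print(counts["AA"], counts["AC"], counts["CA"])  # 2 1 1
print(4 ** 2)  # 16 possible 2-mers over {A, C, G, T}
```

A sequence of length n contains n - k + 1 overlapping k-mers, so the table of counts is built in time proportional to n for fixed k.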

37 Genome Comparisons: Feature Frequency Profiles (FFP). $C_k = \langle c_1, \dots, c_K \rangle$, the counts of each of the $K = 4^k$ possible k-mers; $F_k = C_k / \sum_i c_i$. Sims GE, Jun SR, Wu GA, Kim SH (2009) Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions. Proc Natl Acad Sci U S A., 106(8):2677-82.
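
A feature frequency profile can be sketched as a k-mer count vector divided by its total. This is a simplified reading of the idea in Sims et al. (2009), not their implementation; the function name `ffp`, the fixed A/C/G/T ordering, and the choice to ignore k-mers containing other characters are all assumptions.

```python
from collections import Counter
from itertools import product


def ffp(sequence, k):
    """Return k-mer frequencies in a fixed order over the alphabet ACGT."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]  # all 4**k k-mers
    total = sum(counts[w] for w in words)  # ignores k-mers with other symbols
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[w] / total for w in words]


profile = ffp("AAACA", 2)
print(len(profile))  # 16
print(profile[0])    # frequency of "AA": 0.5
```

Because every profile has the same length and ordering, two sequences of different lengths can be compared entry by entry.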

38 Genome Comparisons: Jensen-Shannon Divergence (JS), compare($s_1$, $s_2$). $P_k = \mathrm{FFP}(s_1)$, $Q_k = \mathrm{FFP}(s_2)$, $M_k = (P_k + Q_k)/2$. $JS(P_k, Q_k) = \tfrac{1}{2} KL(P_k, M_k) + \tfrac{1}{2} KL(Q_k, M_k)$, where $KL(P, M) = \sum_i p_i \log(p_i / m_i)$.
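
The Jensen-Shannon comparison can be sketched as below, taking M as the average of the two profiles and KL as the standard Kullback-Leibler divergence. The logarithm base is not given on the slide; base 2 is assumed here so that the result lies between 0 and 1. The function names `kl` and `js` are hypothetical.

```python
from math import log2


def kl(p, m):
    """Kullback-Leibler divergence; terms with p_i == 0 contribute nothing."""
    return sum(pi * log2(pi / mi) for pi, mi in zip(p, m) if pi > 0)


def js(p, q):
    """Jensen-Shannon divergence between two frequency profiles."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # midpoint distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


print(js([0.5, 0.5], [0.5, 0.5]))  # 0.0 for identical profiles
print(js([1.0, 0.0], [0.0, 1.0]))  # 1.0 for profiles with no shared k-mers
```

Comparing against the midpoint M keeps every ratio finite, since an entry of M is zero only where both profiles are zero, and it makes the measure symmetric in its two arguments.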

39 Genome Comparisons: k = ?

40 Genome Comparisons: k = ? Choose k s.t. N(k) ≥ N(k+1); k ≈ 4

41 Genome Comparisons: Actual & Predicted times

42 Questions/Comments? Thanks

