1 A Framework for Modelling Short, High-Dimensional Multivariate Time Series: Preliminary Results in Virus Gene Expression Data Analysis
Paul Kellam (1), Xiaohui Liu (2), Nigel Martin (3), Christine Orengo (4), Stephen Swift (2), Allan Tucker (2)
(1) Dept of Immunology and Molecular Pathology, UCL, UK
(2) Dept of Information Systems and Computing, Brunel University, UK
(3) Dept of Computer Science, Birkbeck College, London, WC1E 7HX, UK
(4) Dept of Biochemistry and Molecular Biology, UCL, WC1E 6BT, UK

2 Framework
[Diagram: Expression Data -> Clustering Algorithms -> Clusters -> Cluster Fusion -> Robust Clusters -> Model Building -> Forecasts and Explanations]

3 Clustering Algorithms
- Hierarchical
- The Grouping Genetic Algorithm
- K-Means
- The Self Organising Map

4 Cluster Fusion (1)
[Diagram: Cluster Method 1, Cluster Method 2, ..., Cluster Method N -> Construct Agreement Matrix -> Clusterfusion]

5 The Agreement Matrix
[Figure: the matrix F, with axes labelled "From Gene" and "To Gene"; the entries are not recoverable from the transcript]
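The slides do not reproduce the entries of the agreement matrix, so the following is only a minimal sketch of one common construction: entry (i, j) counts how many clustering methods placed gene i and gene j in the same cluster. The function names, the diagonal convention, and the rule that a robust cluster needs agreement from every method are assumptions made for illustration, not details confirmed by the presentation.

```python
from itertools import combinations


def agreement_matrix(clusterings):
    """Count, for every gene pair, how many methods co-cluster the pair.

    clusterings: list of label lists, one per clustering method,
    each holding one cluster label per gene (hypothetical input layout).
    """
    n_genes = len(clusterings[0])
    agree = [[0] * n_genes for _ in range(n_genes)]
    for labels in clusterings:
        for i, j in combinations(range(n_genes), 2):
            if labels[i] == labels[j]:
                agree[i][j] += 1
                agree[j][i] += 1
    # Assumed convention: a gene always agrees with itself.
    for i in range(n_genes):
        agree[i][i] = len(clusterings)
    return agree


def robust_clusters(agree, n_methods):
    """Group genes that all methods placed together; singletons stay unassigned."""
    remaining = list(range(len(agree)))
    clusters, unassigned = [], []
    while remaining:
        seed = remaining[0]
        group = [g for g in remaining if agree[seed][g] == n_methods]
        remaining = [g for g in remaining if g not in group]
        if len(group) > 1:
            clusters.append(group)
        else:
            unassigned.extend(group)
    return clusters, unassigned


# Toy data: three methods, five genes (illustrative labels only).
methods = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
F = agreement_matrix(methods)
clusters, unassigned = robust_clusters(F, n_methods=len(methods))
```

Because full agreement across all methods is transitive, each gene falls into exactly one group. A strict rule like this would be expected to leave many genes unassigned and to produce small clusters.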

6 Viral Gene Expression Data
- Kaposi's Sarcoma-Associated Human Herpesvirus 8 (HHV8)
- 106 viral and human genes
- Induced with 12-O-tetradecanoylphorbol-13-acetate (TPA)
- 13 measurements over time
- Normalised expression levels

7 Evaluation
- Compare cluster similarity using Weighted-Kappa
- Compare clusters against biological domain knowledge
- Clusterfusion
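The presentation does not state how Weighted-Kappa is applied to clusterings. One plausible reading, sketched below under that assumption, is to classify every gene pair as "same cluster" or "different cluster" under each method, tabulate the two classifications against each other, and compute weighted kappa on the table. The linear disagreement weights and the function names are illustrative choices, not the authors' stated method.

```python
from itertools import combinations


def pair_table(labels_a, labels_b):
    """2x2 table over gene pairs: index 0 = different cluster, 1 = same cluster."""
    table = [[0, 0], [0, 0]]
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = int(labels_a[i] == labels_a[j])
        same_b = int(labels_b[i] == labels_b[j])
        table[same_a][same_b] += 1
    return table


def weighted_kappa(table):
    """Weighted kappa with linear disagreement weights |r - c| / (k - 1).

    Returns 1 for perfect agreement and about 0 for chance-level agreement.
    Undefined (division by zero) if the expected disagreement is zero.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[r][c] for r in range(k)) for c in range(k)]
    observed = 0.0
    expected = 0.0
    for r in range(k):
        for c in range(k):
            weight = abs(r - c) / (k - 1)
            observed += weight * table[r][c] / total
            expected += weight * row_tot[r] * col_tot[c] / total ** 2
    return 1.0 - observed / expected


# Toy clusterings of four genes (illustrative labels only).
identical = weighted_kappa(pair_table([0, 0, 1, 1], [0, 0, 1, 1]))
crossed = weighted_kappa(pair_table([0, 0, 1, 1], [0, 1, 0, 1]))
```

With only two categories per pair, the weighted statistic coincides with unweighted Cohen's kappa; the weights matter only if more agreement levels are distinguished.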

8 Weighted-Kappa Results
- Hx: Hierarchical Clustering with x Clusters
- Kx: K-Means Clustering with x Clusters
- Sx: Self Organising Map with x Clusters
- Gx: Grouping Genetic Algorithm with x Clusters

9 Domain Knowledge Results

10 Clusterfusion Results
- 48 out of 106 genes unassigned
- Mostly pairs or triples
- Only 3 of feature 2 are present!
- Although there are some interesting results, e.g. unknown function genes placed with those of known function

11 Modelling
- We have focussed on the Dynamic Bayesian Network
  - Models a temporal domain probabilistically
  - Consists of a graphical representation and conditional probability distributions
  - Facilitates the combining of expert knowledge and data
  - Models can be queried to investigate the relationships discovered from data
  - Requires data discretisation
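The slides note that the Dynamic Bayesian Network requires discretised data but do not say which scheme was used. As a sketch under that caveat, equal-frequency binning is one simple option: each gene's series is ranked and split into roughly equally sized groups. The function name and the choice of three states are assumptions for illustration.

```python
def discretise(series, n_states=3):
    """Map a numeric series to states 0..n_states-1 with equal-frequency bins.

    Tied values are ordered by position, so ties can straddle a bin boundary.
    """
    order = sorted(range(len(series)), key=lambda i: series[i])
    states = [0] * len(series)
    for rank, index in enumerate(order):
        states[index] = rank * n_states // len(series)
    return states


# Illustrative expression levels for one gene over six time points.
levels = [0.1, 0.9, 0.4, 0.5, 0.2, 0.8]
states = discretise(levels, n_states=3)
```

With only 13 measurements per gene, a small number of states keeps each conditional probability estimate supported by at least a few observations.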

12 Dynamic Bayesian Networks
[Figure: a DBN over genes g0 to g4, unrolled over time lags t-5, t-4, t-3, t-2, t-1 and t]
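To make the lagged structure concrete, the sketch below shows how a link from a parent gene at time t - lag to a child gene at time t can be turned into training cases, and how a conditional probability table can be estimated from them by counting. The data layout, gene names, and function names are hypothetical; the presentation does not describe its learning procedure at this level of detail.

```python
from collections import Counter, defaultdict


def lagged_cases(data, child, parents):
    """Pair each child state at time t with its parents' states at t - lag.

    data: dict mapping gene name to a list of discrete states over time.
    parents: list of (gene, lag) pairs with lag >= 1.
    """
    max_lag = max(lag for _, lag in parents)
    cases = []
    for t in range(max_lag, len(data[child])):
        parent_states = tuple(data[gene][t - lag] for gene, lag in parents)
        cases.append((parent_states, data[child][t]))
    return cases


def conditional_table(cases):
    """Estimate P(child state | parent states) by relative frequency."""
    counts = defaultdict(Counter)
    for parent_states, child_state in cases:
        counts[parent_states][child_state] += 1
    return {
        parent_states: {
            state: count / sum(counter.values())
            for state, count in counter.items()
        }
        for parent_states, counter in counts.items()
    }


# Illustrative discretised series for two genes over six time points.
data = {
    "g0": [0, 1, 1, 0, 1, 0],
    "g1": [1, 0, 1, 1, 0, 1],
}
cases = lagged_cases(data, child="g1", parents=[("g0", 1)])
table = conditional_table(cases)
```

Each unit of lag removes one usable case from the start of the series, so with short series the maximum lag directly limits how much data is left for estimation.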

13 Modelling Results
Example DBNs (compact representation without lags included):

14 Forecast Results

15 Explanation
- Apply inference given observations about certain nodes:
  - Insert observations into DBN
  - Apply inference back in time
  - Construct explanations using posterior probabilities
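The simplest instance of inference back in time is a single link from a parent at time t-1 to a child at time t: given an observed child state, Bayes' rule yields the posterior over the parent's earlier state. The sketch below shows only that one-link case, with made-up probabilities and hypothetical names; inference in a full DBN would propagate evidence through many such links.

```python
def posterior_over_parent(prior, cpt, observed_child):
    """Return P(parent at t-1 | child at t = observed_child) via Bayes' rule.

    prior: dict mapping parent state to P(parent state).
    cpt: dict mapping parent state to a dict of P(child state | parent state).
    """
    joint = {
        state: prior[state] * cpt[state].get(observed_child, 0.0)
        for state in prior
    }
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {state: value / evidence for state, value in joint.items()}


# Illustrative numbers only; they are not taken from the presentation.
prior = {1: 0.5, 2: 0.5}
cpt = {
    1: {1: 0.9, 2: 0.1},
    2: {1: 0.2, 2: 0.8},
}
posterior = posterior_over_parent(prior, cpt, observed_child=2)
# The most probable earlier state serves as the "explanation" of the observation.
explanation = max(posterior, key=posterior.get)
```

Repeating this step one lag at a time, and reporting the most probable state of each node with its posterior, gives an explanation as a chain of statements of the form "P(gene is state) = value".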

16 Explanation - Results
An example explanation using a discovered DBN
- P(C7 is 2) = 1.000
- P(H8 is 2) = 0.999
- P(B12 is 2) = 0.884
- P(C7 is 1) = 1.000
- P(B6 is 2) = 0.568
- P(A7 is 1) = 0.510
- P(B12 is 1) = 0.440

17 Conclusions
- Modelling gene expression data is a challenging task
- Introduced a framework for modelling such data
- Encouraging preliminary results when applied to viral gene expression data
- More rigorous testing on different datasets

18 Acknowledgements
- Biotechnology and Biological Sciences Research Council (BBSRC), UK
- The Engineering and Physical Sciences Research Council (EPSRC), UK

