1 Grouping Multivariate Time Series Variables: Applications to Chemical Process and Visual Field Data Allan Tucker- Birkbeck College Stephen Swift- Brunel.



2 Grouping Multivariate Time Series Variables: Applications to Chemical Process and Visual Field Data
Allan Tucker - Birkbeck College
Stephen Swift - Brunel University
Nigel Martin - Birkbeck College
Xiaohui Liu - Brunel University

3 Introduction
- Present a methodology to group Multivariate Time Series (MTS) variables
- An MTS is a series of observations of several variables recorded over time
- Test on two real-world applications
- Grouping: partitioning a set of objects into a number of mutually exclusive subsets
- Many, if not all, grouping problems are NP-hard

4 MTS Example

5 Grouping MTS - Introduction
- Desirable to model an MTS as a group of several lower-dimensional MTS
- Decompose the MTS into several lower-dimensional MTS based on dependencies in the data
- The number of possible dependencies is large because one variable may affect another after a certain time lag

6 Grouping MTS - Methodology
Input: one high-dimensional MTS (X)
1. Correlation Search (EP), producing Q: a list of Q_len triples (x_a, x_b, lag), (x_c, x_d, lag), ..., (x_e, x_f, lag)
2. Grouping Algorithm (GGA), producing G: a set of groups, e.g. {0,3} {1,4,5} {2}
Output: several lower-dimensional MTS
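The last step of the pipeline above, turning a grouping G into several lower-dimensional MTS, can be sketched as follows. This is only an illustration: the function name `split_by_groups`, the dictionary layout of X, and the data values are assumptions and do not come from the slides.

```python
from typing import Dict, List, Sequence


def split_by_groups(
    X: Dict[int, Sequence[float]], G: List[List[int]]
) -> List[Dict[int, Sequence[float]]]:
    """Return one lower-dimensional MTS per group in G.

    X maps a variable index to its series (hypothetical layout).
    G must be a partition of the variable indices in X.
    """
    seen = [v for group in G for v in group]
    if sorted(seen) != sorted(X):
        raise ValueError("G must be a partition of the variables in X")
    return [{v: X[v] for v in group} for group in G]


# Six variables with three made-up observations each.
X = {i: [float(i), float(i) + 1.0, float(i) + 2.0] for i in range(6)}
G = [[0, 3], [1, 4, 5], [2]]  # the example grouping from the slide

parts = split_by_groups(X, G)
print([sorted(p) for p in parts])
```

Each element of `parts` can then be modelled on its own as a smaller MTS.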

7 Correlation Search
- Spearman's Rank Correlation used
- Entire search space is too large
- Invalid triples:
  - Autocorrelations
  - Duplicates irrespective of direction where lag = 0, e.g. (x_i, x_j, 0) and (x_j, x_i, 0)
- Evolutionary Programming approach found to be the most efficient
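To make the triples concrete, here is a minimal pure-Python sketch of scoring one triple (x_i, x_j, lag) with Spearman's rank correlation and of filtering the invalid triples listed above. It does not reproduce the evolutionary programming search itself. The function names, the convention that a triple pairs x_i at time t with x_j at time t + lag, and the treatment of any triple with i = j as an autocorrelation are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Set, Tuple


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def lagged_spearman(x: Sequence[float], y: Sequence[float], lag: int) -> float:
    """Spearman correlation between x[t] and y[t + lag] (assumed convention)."""
    n = min(len(x), len(y)) - lag
    if lag < 0 or n < 2:
        raise ValueError("lag leaves fewer than two overlapping points")
    return pearson(ranks(x[:n]), ranks(y[lag:lag + n]))


def is_valid_triple(i: int, j: int, lag: int, seen: Set[Tuple[int, int, int]]) -> bool:
    """Reject autocorrelations and, at lag 0, mirrored duplicates."""
    if i == j:
        return False
    key = (min(i, j), max(i, j), 0) if lag == 0 else (i, j, lag)
    if key in seen:
        return False
    seen.add(key)
    return True


# Made-up data: y follows x squared with a delay of two time points.
x = [1, 3, 2, 5, 4, 6, 0, 7]
y = [9, 9, 1, 9, 4, 25, 16, 36]
print(lagged_spearman(x, y, 2))
```

Because Spearman's coefficient works on ranks, the monotone but non-linear relation in the example still scores as a perfect correlation at the right lag.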

8 Grouping Genetic Algorithm - Representation and Operators
- Previously compared and contrasted different GA representations and operators
- Falkenauer's Crossover & Mutation ensure Schema Theory holds for grouping problems
Example: Group 0 = {0, 3, 4}, Group 1 = {1, 2, 6}, Group 2 = {5, 7}
Chromosome: 0 1 1 0 0 2 1 2 : 0 1 2
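The example chromosome can be read as an item part (one group label per variable) followed, after the colon, by a group part listing the groups in use. A small sketch of decoding the item part into groups is given below; the function name `decode` is hypothetical, and the crossover and mutation operators are not shown.

```python
from typing import Dict, List


def decode(item_part: List[int]) -> Dict[int, List[int]]:
    """Map each group label to the variables assigned to it."""
    groups: Dict[int, List[int]] = {}
    for variable, label in enumerate(item_part):
        groups.setdefault(label, []).append(variable)
    return groups


item_part = [0, 1, 1, 0, 0, 2, 1, 2]  # chromosome from the slide, before the colon
groups = decode(item_part)
group_part = sorted(groups)            # the labels after the colon

print(groups)      # {0: [0, 3, 4], 1: [1, 2, 6], 2: [5, 7]}
print(group_part)  # [0, 1, 2]
```

Operators that work on the group part manipulate whole groups rather than individual labels, which is the motivation for carrying it alongside the item part.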

9 Grouping - The Grouping Metric Properties
- If Q is empty, then fitness is maximised when each variable is in a separate group
- If Q contains all pairings of variables (the entire search space), then fitness is maximised when all variables are in the same group
- If the data is from a mixed set of MTS, fitness is maximised when variables in the same group have as many correlations as possible in Q and variables in different groups have as few correlations as possible in Q
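The slides state the properties of the metric but not its formula. One simple function that satisfies all three properties, offered purely as an illustrative assumption and not as the authors' metric, adds 1 for every same-group pair that appears in Q and subtracts 1 for every same-group pair that does not.

```python
from itertools import combinations
from typing import List, Tuple


def grouping_fitness(G: List[List[int]], Q: List[Tuple[int, int, int]]) -> int:
    """Hypothetical metric: +1 per same-group pair found in Q, -1 otherwise."""
    linked = {frozenset((a, b)) for a, b, _lag in Q}
    score = 0
    for group in G:
        for a, b in combinations(group, 2):
            score += 1 if frozenset((a, b)) in linked else -1
    return score


singletons = [[0], [1], [2], [3]]
together = [[0, 1, 2, 3]]
paired = [[0, 1], [2, 3]]

empty_Q: List[Tuple[int, int, int]] = []
full_Q = [(a, b, 0) for a, b in combinations(range(4), 2)]
mixed_Q = [(0, 1, 2), (2, 3, 0)]  # made-up triples

print(grouping_fitness(singletons, empty_Q), grouping_fitness(together, empty_Q))
print(grouping_fitness(singletons, full_Q), grouping_fitness(together, full_Q))
print(grouping_fitness(paired, mixed_Q), grouping_fitness(together, mixed_Q))
```

With an empty Q the separate-groups partition scores highest, with a complete Q the single group scores highest, and with the mixed Q the partition that follows the correlated pairs wins. Splitting a correlated pair forfeits its +1, so correlations across groups are penalised implicitly.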

10 Oil Refinery Data
- Oil Refinery Process in Scotland
- Data recorded every minute
- Hundreds of variables
- Years of data available on repository
- Selected 50 interrelated variables over 10000 time points
- Large time lags (up to 120 minutes between some variables)

11 Visual Field Data
- The interval between tests is about 6 months
- Typically, 76 points are measured
- The number of tests can range between 10 and 44
- Values range between 60 (very good) and 0 (blind)
[Figure: grid of visual field test points for the right eye, each labelled with its nerve fibre bundle number; B marks the usual position of the blind spot]

12 Oil Refinery Data - Results (1)
- Very rapid generation of groups (seconds)
- 3 major groups discovered, 2 relating to the upper and lower trays of the column
- Most of the single-variable groups appear noisy
- Can be used as a method for pre-processing data before model building where time is short

13 Oil Refinery Data - Results (2)

14 Visual Field Data - Results (1) - Patient Group Comparison
- Patients are ordered by average sensitivity: Patient 1 has the lowest and Patient 82 the highest
- Graph goes from light (BRHC) to dark (TLHC)

15 Visual Field Data - Results (2)
- High sensitivity implies similar groups
  - Small groups in general
  - Points in the eye will be associated with similar nerve fibre bundles
- Low sensitivity implies dissimilar groups
  - Large groups in general
  - Different areas of the visual field may be deteriorating

16 Conclusions
- Decomposing large, high-dimensional MTS is a challenging problem
- Results from the proposed methodology are very encouraging
- Oil Refinery Data: 3 relatively independent sub-systems rapidly identified
- Visual Field Data: discovered groups offer an ideal starting point for modelling as a VAR process

17 Future Work
- Experimenting with new datasets
  - Gene Expression Data
  - EEG Data
- Determining the ideal parameters
  - e.g. Q_len is very influential on final groupings
- Combining the two stages, correlation search and grouping, into one incremental process

18 Acknowledgements
- Engineering and Physical Sciences Research Council, UK
- Moorfields Eye Hospital, UK
- Honeywell Technology Centre, USA
- Honeywell Hi-Spec Solutions, UK
- BP-Amoco, UK

