The Automatic Explanation of Multivariate Time Series (MTS) Allan Tucker.


1 The Automatic Explanation of Multivariate Time Series (MTS) Allan Tucker

2 The Problem - Data
Datasets which are Characteristically:
–High Dimensional MTS
–Large Time Lags
–Changing Dependencies
–Little or No Available Expert Knowledge

3 The Problem - Requirement
Lack of Algorithms to Assist Users in Explaining Events that:
–Model Complex MTS Data
–Are Learnable from Data with Little or No User Intervention
–Retain Transparency Throughout the Learning and Explaining Process, which is Vital

4 Contribution to Knowledge
Using a Combination of Evolutionary Programming (EP) and Bayesian Networks (BNs) to Overcome Issues Outlined
Extending Learning Algorithms for BNs to Dynamic Bayesian Networks (DBNs) with Comparison of Efficiency
Introduction of an Algorithm for Decomposing High Dimensional MTS into Several Lower Dimensional MTS

5 Contribution to Knowledge (Continued)
Introduction of New EP-Seeded GA Algorithm
Incorporating Changing Dependencies
Application to Synthetic and Real-World Chemical Process Data
Transparency Retained Throughout Each Stage

6 Framework
[Diagram labels: Real Data; Synthetic Data; Pre-processing; Data Preparation; Variable Groupings; Search Methods; Model Building; Changing Dependencies; Explanation; Evaluation]

7 Key Technical Points 1: Comparing Adapted Algorithms
New Representation
K2/K3 [Cooper and Herskovits]
Genetic Algorithm [Larrañaga]
Evolutionary Algorithm [Wong]
Branch and Bound [Bouckaert]
Log Likelihood / Description Length
Publications:
–International Journal of Intelligent Systems, 2001

8 Key Technical Points 2: Grouping
A Number of Correlation Searches
A Number of Grouping Algorithms
Designed Metrics
Comparison of All Combinations
Synthetic and Real Data
Publications:
–IDA99
–IEEE Trans System Man and Cybernetics 2001
–Expert Systems 2000

9 Key Technical Points 3: EP-Seeded GA
Approximate Correlation Search Based on the One Used in Grouping Strategy
Results Used to Seed Initial Population of GA
Uniform Crossover
Specific Lag Mutation
Publications:
–Genetic Algorithms and Evolutionary Computation Conference 1999 (GECCO99)
–International Journal of Intelligent Systems, 2001
–IDA2001
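The two GA operators named on this slide can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes a chromosome is a fixed-length list of (a, b, lag) triples, that uniform crossover swaps each gene between parents with probability 0.5, and that "lag mutation" redraws only the lag of a triple. The function names, the `rate` parameter and the example chromosomes are hypothetical.

```python
import random

def uniform_crossover(p1, p2, rng):
    """Swap each gene position between the two parents with probability 0.5."""
    c1, c2 = [], []
    for g1, g2 in zip(p1, p2):
        if rng.random() < 0.5:
            g1, g2 = g2, g1
        c1.append(g1)
        c2.append(g2)
    return c1, c2

def lag_mutation(chrom, max_lag, rate, rng):
    """With probability `rate` per gene, redraw only the lag of the triple."""
    out = []
    for a, b, lag in chrom:
        if rng.random() < rate:
            lag = rng.randint(1, max_lag)  # assumed lag range: 1..max_lag
        out.append((a, b, lag))
    return out

# Hypothetical parent chromosomes: lists of (a, b, lag) triples.
rng = random.Random(0)
p1 = [(0, 1, 2), (1, 2, 3), (2, 0, 4)]
p2 = [(3, 1, 5), (4, 2, 6), (0, 3, 7)]
c1, c2 = uniform_crossover(p1, p2, rng)
mutated = lag_mutation(p1, max_lag=10, rate=1.0, rng=rng)
```

Passing the random generator in explicitly keeps runs repeatable, which helps when comparing search methods on the same data.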

10 Key Technical Points 4: Changing Dependencies
Dynamic Cross Correlation Function for Analysing MTS
Extend Representation
Introduce a Heuristic Search - Hidden Controller Hill Climb (HCHC)
–Hidden Variables to Model State of the System
–Search for Structure and Hidden States Iteratively

11 Future Work
Parameter Estimation
Discretisation
Changing Dependencies
Efficiency
New Datasets
–Gene Expression Data
–Visual Field Data

12 DBN Representation
[Figure: DBN over time slices t-4, t-3, t-2, t-1, t with nodes a0(t), a1(t), a2(t), a3(t), a4(t) and lagged nodes a2(t-2), a3(t-2), a4(t-3), a3(t-4)]
Representation as triples: (3,1,4) (4,2,3) (2,3,2) (3,0,2) (3,4,2)
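The triple representation on this slide can be decoded with a short sketch. It assumes each triple (a, b, lag) denotes a link from variable a at time t-lag to variable b at time t; that reading is an inference, though it matches the lagged nodes listed on the slide. The helper names are hypothetical.

```python
from collections import defaultdict

# Triples copied from the slide; the (a, b, lag) reading is an assumption.
triples = [(3, 1, 4), (4, 2, 3), (2, 3, 2), (3, 0, 2), (3, 4, 2)]

def parents_from_triples(triples):
    """Map each child variable b to its list of lagged parents (a, lag)."""
    parents = defaultdict(list)
    for a, b, lag in triples:
        parents[b].append((a, lag))
    return dict(parents)

def max_lag(triples):
    """Largest time lag used by any link."""
    return max(lag for _, _, lag in triples)

parents = parents_from_triples(triples)
for child in sorted(parents):
    links = ", ".join(f"a{a}(t-{lag})" for a, lag in parents[child])
    print(f"a{child}(t) <- {links}")
```

A flat list of triples is convenient for evolutionary search because crossover and mutation can act on individual links without breaking the structure.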

13 Sample DBN Search Results
[Figure panels: N = 5, MaxT = 10; N = 10, MaxT = 60]

14 Grouping
One High Dimensional MTS (A)
1. Correlation Search (EP), giving a List of (a, b, lag)
2. Grouping Algorithm (GGA), giving groups G, e.g. {0,3} {1,4,5} {2}
Several Lower Dimensional MTS
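The flow from a correlation list to variable groups can be illustrated with a simple stand-in. The sketch below does not implement the grouping algorithm (GGA) named on the slide. It merges variables that share any discovered correlation, using a union-find structure, purely to show the input and output shapes. The correlation list is hypothetical and was chosen so that the result matches the example grouping on the slide.

```python
def group_variables(n_vars, correlations):
    """Group variables linked by any (a, b, lag) correlation (connected components)."""
    parent = list(range(n_vars))

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _lag in correlations:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = {}
    for v in range(n_vars):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

# Hypothetical correlation list of (a, b, lag) triples over six variables.
correlations = [(0, 3, 2), (1, 4, 1), (4, 5, 3)]
groups = group_variables(6, correlations)
print(groups)
```

A metric-driven grouping search would generally differ from this, since a single weak correlation is enough here to merge two groups.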

15 Sample Grouping Results
[Figure panels over variables 0 to 60: Original Synthetic MTS Groupings; Groupings Discovered from Synthetic Data; Sample of Variables from a Discovered Oil Refinery Data Group]

16 Parameter Estimation
Simulate Random Bag (Vary R, s and c, e)
Calculate Mean and SD for Each Distribution (the Probability of Selecting e from s)
Test for Normality (Lilliefors' Test)
Symbolic Regression (GP) to Determine the Function for Mean and SD from R, s and c (e will be Unknown)
Place Confidence Limits on the P(Number of Correlations Found  e)

17 EP-Seeded GA
[Diagram: EP yields the Final EPList, entries 0 to EPListSize, each a triple (a,b,l). The EPList feeds the Initial GAPopulation, entries 0 to GAPopsize, each a list of triples ((a,b,l),(a,b,l)…(a,b,l)). The GA yields the DBN.]

18 EP-Seeded GA Results
[Figure panels: N = 10, MaxT = 60; N = 20, MaxT = 60]

19 Varying the value of c

20 [Figure: Explanation against Time, over time slices t, t-1, t-11, t-13, t-16, t-20, t-60. Probabilities shown:]
P(TGF in state_0) = 1.0
P(TT in state_0) = 1.0
P(BPF in state_3) = 1.0
P(TT in state_1) = 0.446
P(TGF in state_3) = 1.0
P(SOT in state_0) = 0.314
P(C2% in state_0) = 0.279
P(T6T in state_0) = 0.347
P(RinT in state_0) = 0.565

21 Changing Dependencies
[Plot: Variable Magnitude against Time (Minutes) for A/M_GB and TGF]

22 Dynamic Cross-Correlation Function
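One plausible reading of a dynamic cross-correlation function is a cross-correlation recomputed over a sliding window, so that a change in the dominant lag becomes visible over time. The sketch below follows that reading. It is not taken from the slides: the Pearson estimator, the `win_len`, `win_jump` and `max_lag` parameters and the synthetic series are all assumptions made for illustration.

```python
import random
from math import sqrt

def pearson(u, v):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0

def dynamic_ccf(x, y, win_len, win_jump, max_lag):
    """One row per window; column `lag` holds corr(x(t - lag), y(t)) in that window."""
    rows = []
    for start in range(max_lag, len(y) - win_len + 1, win_jump):
        yw = y[start:start + win_len]
        rows.append([pearson(x[start - lag:start - lag + win_len], yw)
                     for lag in range(max_lag + 1)])
    return rows

# Synthetic example: y follows x at lag 2 for t < 200, then at lag 5.
rng = random.Random(1)
x = [rng.random() for _ in range(400)]
y = [x[max(t - 2, 0)] if t < 200 else x[t - 5] for t in range(400)]

ccf = dynamic_ccf(x, y, win_len=50, win_jump=50, max_lag=6)
best_lags = [row.index(max(row)) for row in ccf]
print(best_lags)
```

Reading the best lag per window gives a rough picture of when the dependency between two variables changes.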

23 Hidden Variable - OpState
[Figure: DBN over time slices t-4, t-3, t-2, t-1, t with nodes a2(t), OpState2, a2(t-1), a3(t-2), a0(t-4)]

24 Hidden Controller Hill Climb
Update Segment_Lists through Op_State Parameter Estimation
Update DBN_List through DBN Structure Search
Score

25 HCHC Results - Oil Refinery Data

26 HCHC Results - Synthetic Data
Generate Data from Several DBNs
Append each Section of Data Together to Form One MTS with Changing Dependencies
Run HCHC

27 [Figure: Explanation against Time, over time slices t, t-1, t-3, t-5, t-6, t-9. Probabilities shown:]
P(OpState1 is 0) = 1.0, P(a1 is 0) = 1.0, P(a0 is 0) = 1.0, P(a2 is 1) = 1.0
P(OpState1 is 0) = 1.0, P(a1 is 1) = 1.0, P(a0 is 0) = 1.0, P(a2 is 1) = 1.0
P(a2 is 0) = 0.758
P(a2 is 0) = 0.545
P(a0 is 0) = 0.968
P(a0 is 1) = 0.517
P(OpState0 is 0) = 0.519
P(a0 is 1) = 0.778
P(OpState0 is 0) = 0.720

28 [Figure: Explanation against Time, over time slices t, t-1, t-3, t-5, t-6, t-7, t-9. Probabilities shown:]
P(OpState1 is 4) = 1.0, P(a1 is 0) = 1.0, P(a0 is 0) = 1.0, P(a2 is 1) = 1.0
P(OpState1 is 4) = 1.0, P(a1 is 1) = 1.0, P(a0 is 0) = 1.0, P(a2 is 1) = 1.0
P(a2 is 1) = 0.570
P(a2 is 1) = 0.974
P(a0 is 0) = 0.506
P(a0 is 1) = 0.549
P(OpState2 is 3) = 0.210
P(a2 is 0) = 0.882
P(OpState2 is 4) = 0.222

29 Process Diagram TT T6T T36T RBT SOTT11 SOFT13 TGF BPF %C3 %C2 RINT FF PGM PGB AFT C11/3T

30 Typical Discovered Relationships TT T6T T36T RBT SOTT11 SOFT13 AFT TGF BPF %C3 %C2 RINT FF C11/3T PGM PGB

31 Parameters
DBN Search
            GA            EP
PopSize     100           10
MR          0.1           0.8
CR          0.8           ---
Gen         Based on FC   Based on FC
Correlation Search
c - Approx. 20% of s
R - Approx. 2.5% of s
Grouping GA
            Synth. 1   Synth. 2-6           Oil
PopSize     150        100                  150
CR          0.8        0.8                  0.8
MR          0.1        0.1                  0.1
Gen         150        100 (1000 for GPV)   150

32 Parameters
EP-Seeded GA
c - Approx. 20% of s
EPListSize - Approx. 2.5% of s
GAPopSize - 10
MR - 0.1
CR - 0.8
LMR - 0.1
Gen - Based on FC
HCHC
                 Oil       Synthetic
DBN_Iterations   1×10^6    5000
Win len          1000      200
Win jump         500       50

