Variable Penalty Dynamic Time Warping For Aligning Chromatography Data David Clifford Research Scientist June 2009.


1 Variable Penalty Dynamic Time Warping For Aligning Chromatography Data. David Clifford, Research Scientist. June 2009

2 CSIRO Issues in aligning multiple - MS spectra. Talk Outline: Gas Chromatography Mass Spectrometry, examples and properties. Dynamic time warping: origins in speech recognition. Uses in the 21st century: aligning GC-MS data. Central idea of the talk: variable penalty DTW, joint work with Glenn Stone. Results of alignment and how to do it.

3 Gas Chromatography: Separates a gas into its constituent parts. These elute from the machine over a period of 40 minutes. Measures quantity several times a second. Does not identify compounds. Gold standard in analytical chemistry. Slow process, expensive technology.

4 Uses of Gas Chromatography: wine chemistry, meat quality, metabolomic studies. The data format is similar to that of Liquid Chromatography-MS and related techniques.

5 Goal of this talk: How can we align two signals? How can we align many signals? Dynamic time warping: yes, but it overdoes the warping. Variable penalty DTW: balances warping with alignment needs. The VPdtw package is now available on CRAN.

6 Before and After Alignment

7 Calling for a taxi: The system matches what you say with a database of placenames. Dynamic time warping was invented in the late 1960s and early 1970s to do this kind of matching. DTW can expand or contract your words to match placenames. DTW is a natural choice for matching speech: speed of speech differs between individuals, and ums and ahs need to be cut out. DTW is a very fast algorithm and achieves a global optimum.
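The expand-or-contract matching described on this slide can be sketched as a small dynamic program. This is a minimal illustration in Python, not code from the talk: the function name `dtw_cost`, the absolute-difference distance and the symmetric step pattern are all assumptions made for the example.

```python
def dtw_cost(reference, query):
    """Minimal cumulative |r - q| cost over all monotone warping paths.

    Assumed for illustration: absolute-difference distance and the
    classic three-move step pattern (diagonal, vertical, horizontal).
    """
    n, m = len(reference), len(query)
    inf = float("inf")
    # cost[i][j] = best cost of aligning reference[:i] with query[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(reference[i - 1] - query[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # diagonal move: both signals advance
                cost[i - 1][j],      # query point reused (query expanded)
                cost[i][j - 1],      # reference point reused (query contracted)
            )
    return cost[n][m]


# A query that lingers on its first point still matches the reference exactly.
slow_start = dtw_cost([0, 1, 3, 1, 0], [0, 0, 1, 3, 1, 0])
```

Because every cell is filled from its three neighbours, the final entry is the global optimum over all monotone paths, at a cost proportional to the product of the two signal lengths.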

8 Dynamic Time Warping [figure: axes labelled REFERENCE and QUERY]

9 Dynamic Time Warping [figure: axes labelled REFERENCE and QUERY]

10 No alignment [figure: axes labelled REFERENCE and QUERY]

11 Alignment by Shift [figure: axes labelled REFERENCE and QUERY]

12 Linear Transformation (Shift and Stretch) [figure: axes labelled REFERENCE and QUERY]

13 Parametric Time Warping [figure: axes labelled REFERENCE and QUERY]

14 Symmetric Dynamic Time Warping [figure: axes labelled REFERENCE and QUERY]

15 Asymmetric Dynamic Time Warping [figure: axes labelled REFERENCE and QUERY]

16 Sakoe-Chiba DTW (bound on shift): a memory-efficient, faster variation of DTW [figure: axes labelled REFERENCE and QUERY]
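The bound on shift can be sketched by restricting the dynamic program to cells within a fixed distance of the diagonal. This is a Python illustration under assumptions: the name `banded_dtw_cost`, the parameter `max_shift` and the absolute-difference distance are hypothetical choices. For readability the sketch still allocates the full table; a memory-efficient version would store only the band.

```python
def banded_dtw_cost(reference, query, max_shift):
    """DTW cost where reference point i may only match query points j
    with |i - j| <= max_shift (a Sakoe-Chiba style band)."""
    n, m = len(reference), len(query)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only visit the cells inside the band around the diagonal.
        lo = max(1, i - max_shift)
        hi = min(m, i + max_shift)
        for j in range(lo, hi + 1):
            d = abs(reference[i - 1] - query[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    return cost[n][m]


# The query peak sits two samples later than the reference peak.
reference = [0, 0, 5, 0, 0, 0]
query = [0, 0, 0, 0, 5, 0]
too_narrow = banded_dtw_cost(reference, query, 1)   # band cannot reach the peak
wide_enough = banded_dtw_cost(reference, query, 2)  # band covers the 2-sample shift
```

With a band of width 1 the two peaks cannot be matched to each other, so the cost is the same as with no warping at all; widening the band to 2 lets the peaks line up.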

17 Dynamic Time Warping: guaranteed global optimum, but lots of non-diagonal moves [figure: axes labelled REFERENCE and QUERY]

18 Dynamic Time Warping [figure: axes labelled REFERENCE and QUERY]

19 DTW and GC-MS: DTW overdoes the warping. Let's examine the path [figure: axes labelled REFERENCE and QUERY]

20 Rotate our view: it's a complicated warp

21 Paths found with two different penalties

22 Why do we need to care about this? Analysis is based on peak area, and overwarping will affect peak shape and area. Overwarping introduces artificial features into the data. Overwarping occurs due to too many non-diagonal moves. Solution #1: penalise non-diagonal moves. Solution #2: use a variable penalty that depends on the size of the peaks.
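The effect of overwarping on peak area can be seen with a toy example. This is a hypothetical Python sketch: the helper `expand` and the five-point peak are invented for illustration, and area is approximated by a plain sum of samples.

```python
def expand(signal, index, times):
    """Repeat signal[index] `times` extra times, as an expanding
    (non-diagonal) warp at that position would."""
    return signal[:index + 1] + [signal[index]] * times + signal[index + 1:]


peak = [0, 1, 4, 1, 0]
area_before = sum(peak)        # 6
warped = expand(peak, 2, 2)    # [0, 1, 4, 4, 4, 1, 0]: flat-topped peak
area_after = sum(warped)       # 14
```

Two extra non-diagonal moves at the apex more than double the summed area and turn a sharp peak into a plateau, which is the kind of artificial feature the slide warns about.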

23 Variable penalty DTW: Minimise a cost function over warping paths w. Choose the penalty vector using a dilation of the signals, so that large peaks carry a large penalty. Minimise this function using dynamic programming. Easy to implement. How does it compare to DTW, constant penalty DTW, and parametric time warping?
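One way to picture the idea is a dynamic program in which every non-diagonal move pays a position-dependent penalty. The Python sketch below is an assumption-laden illustration, not the objective from the slide or from the VPdtw package: the name `penalised_dtw`, the symmetric step pattern, and the choice to index the penalty by reference position are all hypothetical. A constant penalty vector gives constant penalty DTW as a special case.

```python
def penalised_dtw(reference, query, penalty):
    """Return (cost, non_diagonal_moves) for DTW in which each
    non-diagonal move into reference position i costs penalty[i].

    Assumption: this cost function is illustrative only; the published
    VPdtw objective may differ in its details.
    """
    n, m = len(reference), len(query)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    moves = [[0] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(reference[i - 1] - query[j - 1])
            lam = penalty[i - 1]
            candidates = [
                (cost[i - 1][j - 1], moves[i - 1][j - 1]),      # diagonal, free
                (cost[i - 1][j] + lam, moves[i - 1][j] + 1),    # penalised
                (cost[i][j - 1] + lam, moves[i][j - 1] + 1),    # penalised
            ]
            # Ties in cost are broken in favour of fewer non-diagonal moves.
            best_cost, best_moves = min(candidates)
            cost[i][j] = d + best_cost
            moves[i][j] = best_moves
    return cost[n][m], moves[n][m]


reference = [0, 0, 5, 0, 0, 0]
query = [0, 0, 0, 0, 5, 0]
no_penalty = penalised_dtw(reference, query, [0] * 6)
constant = penalised_dtw(reference, query, [10] * 6)
variable = penalised_dtw(reference, query, [0, 0, 10, 0, 0, 0])
```

In this toy case a large constant penalty suppresses all warping and leaves the peaks misaligned, while a penalty that is large only at the reference peak still allows the shift to happen in the flat baseline.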

24 Key Ingredient for VPdtw: the penalty vector, which is proportional to a dilation of the signal. There is some subjectivity here in balancing the need for alignment against the effect on the raw signals.
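A dilation in the morphological sense is a running maximum, so a penalty built this way stays high across the whole width of a peak. The Python sketch below assumes that interpretation; the window definition (`span` points on each side), the `scale` factor and the helper name `penalty_from_signal` are hypothetical and need not match the `dilation` function in the VPdtw package.

```python
def dilation(signal, span):
    """Running maximum over a window of `span` points on each side."""
    n = len(signal)
    return [max(signal[max(0, i - span):min(n, i + span + 1)])
            for i in range(n)]


def penalty_from_signal(signal, span, scale):
    """Penalty vector proportional to the dilated signal."""
    return [scale * value for value in dilation(signal, span)]


signal = [0, 0, 5, 0, 0, 1, 0]
dilated = dilation(signal, 1)                  # [0, 5, 5, 5, 1, 1, 1]
penalty = penalty_from_signal(signal, 1, 0.5)  # [0.0, 2.5, 2.5, 2.5, 0.5, 0.5, 0.5]
```

The two tuning choices, window width and scale, are where the subjectivity mentioned on the slide enters: wider windows and larger scales protect peaks more strongly but leave less freedom to align.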

25 Before Alignment – can’t see detail but

26 Check Alignment #1

27 Check Alignment #2

28 Check Alignment #3

29 How far are points moved by alignment?

30 VPdtw package: now on CRAN, GPL 2. Functions: VPdtw, dilation, plot.VPdtw, print.VPdtw. Usage: result <- VPdtw(reference, query, penalty, maxshift = 350); print(result); plot(result, "Before"); plot(result, "After"); plot(result, "Shifts"); plot(result). Supports many queries with one penalty, or one query with many penalties. Reference can be NULL.

31 Comparisons – Time

32 Summary: Introduced GC-MS data. This talk is really about improving data quality. Improvement via alignment: without data reduction, without unnatural features, via fast computation. VPdtw is available on CRAN. Faster and better than available alternatives.

33 References. DTW: Vintsyuk, T. K. Kibernetika 1968, 4, 81-88; Sakoe, H.; Chiba, S. Proceedings of the International Congress on Acoustics, Budapest, Hungary, 1971; paper 20 C 13. Parametric Time Warping: Eilers, P. H. C. Anal. Chem. 2004, 76, 404-411. Variable Penalty DTW: Clifford, Stone, Montoliu, Rezzi, Martin, Guy, Bruce and Kochhar. Alignment Using Variable Penalty Dynamic Time Warping. Anal. Chem. 2009, 81 (3), 1000-1007.

34 Thank you. Statistical Bioinformatics - Agribusiness. David Clifford, Research Scientist, CSIRO Division of Mathematics, Informatics and Statistics. Phone: +61 2 9325 3210. Email: David.Clifford@csiro.au. Web: www.csiro.au/science/org/CMIS.html. Contact Us: Phone: 1300 363 400 or +61 3 9545 2176. Email: Enquiries@csiro.au. Web: www.csiro.au

35 VPdtw package: plot(result, "Before")

36 VPdtw package: plot(result, "After")

37 VPdtw package: print(result)
Reference is NULL.
Query column # 13 is chosen at random.
Query matrix is made up of 16 samples of length 5000.
Single Penalty vector supplied by user.
Max allowed shift is 150.
               Cost  Overlap  Max Obs Shift  # Diag Moves  # Expanded  # Dropped
Query #1:   1521.10     4994             51          4996          47          2
Query #2:   1708.30     4996             53          5000          49          0
Query #3:   1479.60     4998             59          5000          57          0
Query #4:   1302.30     4998             62          5000          60          0
Query #5:   1505.40     4996             61          5000          57          0
Query #6:   1296.80     4997             60          5000          57          0
Query #7:   1420.80     5000             61          5000          62          0
Query #8:   1484.20     5000             59          5000          60          0
Query #9:   1424.30     5000             51          5000          53          0
Query #10:  1306.30     4997             42          5000          39          0
Query #11:  1193.30     4994             29          4990          28          5
Query #12:   225.04     4999             13          4998          13          1
Query #13:     0.00     5000              0          5000           0          0
Query #14:   266.09     4944             56          4894           2         53
Query #15:   746.93     4937             63          4880           4         60
Query #16:   345.87     4914             86          4836           0         82

