Functional Data Analysis T-61.6030 Chapters 10,11,12 Markus Kuusisto.


1 Functional Data Analysis T-61.6030 Chapters 10,11,12 Markus Kuusisto

2 Topics 10 PCA of mixed data 11 Canonical Correlation Analysis 12 Functional linear models

3 PCA of mixed data
Data with both a functional part and a vector part: $(x_i, y_i)$.
Canadian temperature: the registration process finds a suitable shift for each curve.
- The vector part is the size of the shift
- The functional part is the shifted curve

4 Canadian temperature

5 Canadian temperature (shifted)

6 Using PCA, vector part $y_i$
- $y_i$ are nuisance parameters -> we ignore them
- $y_i$ are of marginal importance -> we ignore them when calculating the PCA, but afterwards we investigate connections between the PCA scores and $y_i$
- $y_i$ are of primary importance together with the functions $x_i$ -> we treat them as hybrid data $(x_i, y_i)$

7 The PCA of hybrid data
- PCA weight function: $(\xi, v)$
- PCA score of a particular observation: $\rho_i = \int x_i(s)\,\xi(s)\,ds + y_i' v$
- Inner product for $z_i = (x_i, y_i)$: $\langle z_1, z_2 \rangle = \int x_1 x_2 + y_1' y_2$
- To find the leading principal component, maximize the sample variance of $\langle (\xi, v), z_i \rangle$ subject to $\|(\xi, v)\| = 1$

8 Balance between functional and vector variation
The measurement units of the functional and vector parts are usually not comparable, so the vector part is weighted: $\langle z_1, z_2 \rangle = \int x_1 x_2 + C^2 y_1' y_2$
Choice of $C^2$:
- $C^2 = |T|$, where $T$ is the interval on which the functions $x_i$ are defined
- $C^2 = |T| / M$, where $M$ is the length of $y$
- $C^2 = \mathrm{Var}(x) / \mathrm{Var}(y)$
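The weighted inner product above can be illustrated on discretized curves. The following is a minimal sketch, not the book's implementation: it assumes the curves are sampled on a uniform grid `t`, approximates the integral by a Riemann sum, and uses $C^2 = |T| / M$ as the default weight. The function name `hybrid_pca` and all argument names are hypothetical.

```python
import numpy as np

def hybrid_pca(X, Y, t, C2=None, n_components=2):
    """PCA of hybrid data z_i = (x_i, y_i) on a uniform grid (illustrative sketch).

    X: (n, m) curves sampled at the m grid points in t
    Y: (n, M) vector parts
    C2: weight of the vector part in the inner product (assumed default |T| / M)
    """
    n, m = X.shape
    dt = t[1] - t[0]                      # uniform grid spacing (assumption)
    if C2 is None:
        C2 = (t[-1] - t[0]) / Y.shape[1]  # C^2 = |T| / M
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Rescale so that the ordinary dot product of two rows approximates
    # the hybrid inner product  int x1 x2 + C^2 y1' y2
    Z = np.hstack([Xc * np.sqrt(dt), Yc * np.sqrt(C2)])
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    xi = Vt[:n_components, :m] / np.sqrt(dt)   # functional part of the weights
    v = Vt[:n_components, m:] / np.sqrt(C2)    # vector part of the weights
    variances = s[:n_components] ** 2 / (n - 1)
    return scores, xi, v, variances
```

Each returned pair `(xi[k], v[k])` has unit norm in the weighted inner product, and the scores equal the (discretized) inner products of the weights with the centred observations. Changing `C2` shifts how much of the leading component is driven by the vector part.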

9 Incorporating smoothing
Roughness of $z = (x, y)$:
- $D^2 z = (D^2 x, 0)$
- $\|D^2 z\|^2 = \|D^2 x\|^2$
The calculation proceeds as in Chapter 9.

10 Canonical Correlation Analysis
- CCA is a way of measuring the linear relationship between two multidimensional variables
- Ordinary correlation analysis depends on the coordinate system in which the variables are described
- CCA finds the coordinate system in which the correlation is maximized

11 Definition of CCA
Consider the linear combinations $x = \mathbf{x}^T w_x$ and $y = \mathbf{y}^T w_y$.
The function to be maximized is the correlation $\rho$ between $x$ and $y$.
The maximum of $\rho$ with respect to $w_x$ and $w_y$ is the maximum canonical correlation.
The number of solutions is limited to the smaller of the dimensionalities of $\mathbf{x}$ and $\mathbf{y}$.
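The definition above can be sketched numerically. This is one common way to compute classical CCA (QR decomposition of the centred data followed by an SVD), shown here as an illustration rather than as the method used for the slides; it assumes more samples than variables and full column rank. The function name `cca` is hypothetical.

```python
import numpy as np

def cca(X, Y):
    """Classical CCA (illustrative sketch).

    X: (n, p) and Y: (n, q) data matrices, assumed to have full column rank
    with n > max(p, q). Returns the canonical correlations (descending) and
    the weight matrices whose columns are w_x1, w_x2, ... and w_y1, w_y2, ...
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centred data
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    # The singular values of Qx' Qy are the canonical correlations
    U, r, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    Wx = np.linalg.solve(Rx, U)      # map back to the original coordinates
    Wy = np.linalg.solve(Ry, Vt.T)
    return r, Wx, Wy
```

The number of returned correlations is `min(p, q)`, matching the limit on the number of solutions stated above. The weights are determined only up to scale and a joint sign change.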

12 Car marks example

13 CCA of car marks
Correlations: $r_1 = 0.9792$, $r_2 = 0.8851$

            w_x1      w_x2
Price    -0.4935    0.6887
Value     0.8697    0.7251

            w_y1      w_y2
Economy  -0.5471    0.4693
Service   0.2418    0.4496
Design    0.0060   -0.0097
Sport     0.5800   -0.0790
Safety    0.2817   -0.0117
Easy h.   0.4758    0.7558

14 $(\mathbf{x}^T w_{x2},\ \mathbf{y}^T w_{y2})$

15 Predicting by CCA

16 Learning
- $w_x$ corresponds to the output ($x$)
- $w_y$ corresponds to the 52 previous data points ($y$)
Learning:
- Find the maximum canonical correlation and its weights $w_x$, $w_y$
- Fit a straight line
Predicting: the output $x$ is predicted by projecting $y = \mathbf{y}^T w_y$ onto the fitted line.
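The learning and prediction steps on this slide might look roughly as follows. This is a sketch under stated assumptions, not the code behind the slides: the series is a one-dimensional array, the output $x$ is scalar (so the leading canonical weight $w_y$ is proportional to the least-squares coefficients, which is what the code computes), and the names `fit_cca_predictor` and `predict_recursive` are hypothetical.

```python
import numpy as np

def fit_cca_predictor(series, lags=52):
    """Learn w_y and a fitted line from a 1-D series (illustrative sketch)."""
    s = np.asarray(series, dtype=float)
    # Row k of Y holds the `lags` values that precede the target x[k]
    Y = np.array([s[k:k + lags] for k in range(len(s) - lags)])
    x = s[lags:]
    y_mean, x_mean = Y.mean(axis=0), x.mean()
    # For a scalar x, the canonical weight w_y is proportional to the
    # least-squares solution of (Y - mean) w = (x - mean)
    w_y, *_ = np.linalg.lstsq(Y - y_mean, x - x_mean, rcond=None)
    # Line fitting: regress the output on the projection y' w_y
    proj = (Y - y_mean) @ w_y
    slope = proj @ (x - x_mean) / (proj @ proj)
    return w_y, y_mean, x_mean, slope

def predict_recursive(series, model, steps=50):
    """Predict `steps` points ahead, feeding each prediction back as input."""
    w_y, y_mean, x_mean, slope = model
    lags = len(w_y)
    hist = list(np.asarray(series, dtype=float)[-lags:])
    out = []
    for _ in range(steps):
        proj = (np.array(hist[-lags:]) - y_mean) @ w_y
        nxt = x_mean + slope * proj
        out.append(nxt)
        hist.append(nxt)
    return np.array(out)
```

Because each prediction is appended to the history and reused, errors can accumulate over the 50 recursive steps.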

17 Predicting recursively next 50 data points

18 Functional canonical correlation analysis
Function to be maximized, subject to constraints
- It is always possible to find a perfect correlation
- The maximization does not produce a meaningful result

19 Unsmoothed canonical variate weight functions that attain perfect correlation.
A standard condition for classical CCA is $n > p + q + 1$, where
- $n$ is the number of samples
- $p$ is the length of $x_i$ and $q$ is the length of $y_i$
In the functional case $p$ and $q$ are infinite, so there is no unique solution: overfitting.
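The overfitting effect can be reproduced with finite-dimensional data. The sketch below is an illustration only: it draws `X` and `Y` independently (so there is no true relationship) and computes the leading sample canonical correlation as the dimension grows past what the condition $n > p + q + 1$ allows. The function name and the chosen sizes are assumptions.

```python
import numpy as np

def first_canonical_correlation(X, Y, tol=1e-10):
    """Leading sample canonical correlation, allowing rank-deficient data."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    def basis(A):
        # Orthonormal basis of the column space, dropping null directions
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        return U[:, s > tol * s[0]]

    # The largest singular value is the cosine of the smallest angle
    # between the two column spaces
    return np.linalg.svd(basis(Xc).T @ basis(Yc), compute_uv=False)[0]

rng = np.random.default_rng(0)
n = 20
for p in (2, 10, 30):
    X = rng.normal(size=(n, p))
    Y = rng.normal(size=(n, p))   # independent of X by construction
    print(p, first_canonical_correlation(X, Y))
```

With `n = 20`, the case `p = q = 2` satisfies the condition and gives a correlation below one, while `p = q = 10` and `p = q = 30` violate it and give a correlation of one up to rounding, even though the data are unrelated.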

20 Smoothing
Smoothing is essential.
The choice of the smoothing parameter can be made
– subjectively
– by leave-one-out cross-validation, maximizing the squared correlation (11.3.3): ccorsq is calculated as above, but with the observation $(X_i, Y_i)$ omitted
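A leave-one-out choice of the smoothing parameter could be sketched as below. This is a simplified stand-in for the book's procedure, with several assumptions: both sets of curves are sampled on the same grid, roughness is measured by squared second differences instead of an integrated squared second derivative, a small ridge term is added for numerical stability, and the names `smoothed_cca`, `loo_ccorsq`, and `lam` are hypothetical.

```python
import numpy as np

def smoothed_cca(X, Y, lam):
    """Leading canonical weights with a second-difference roughness penalty.

    X, Y: (n, m) curves sampled on a common grid; lam: penalty strength.
    """
    n, m = X.shape
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    D2 = np.diff(np.eye(m), n=2, axis=0)   # discrete second derivative
    K = D2.T @ D2
    ridge = 1e-8 * np.eye(m)               # numerical safeguard (assumption)
    Lx = np.linalg.cholesky(Xc.T @ Xc / n + lam * K + ridge)
    Ly = np.linalg.cholesky(Yc.T @ Yc / n + lam * K + ridge)
    # Whiten the cross-covariance with respect to the penalized variances
    M = np.linalg.solve(Lx, Xc.T @ Yc / n) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M)
    a = np.linalg.solve(Lx.T, U[:, 0])
    b = np.linalg.solve(Ly.T, Vt[0])
    if a.sum() < 0:                        # fix the joint sign of (a, b)
        a, b = -a, -b
    return a, b

def loo_ccorsq(X, Y, lam):
    """Squared correlation of the held-out canonical variate pairs."""
    n = X.shape[0]
    sx, sy = np.empty(n), np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        a, b = smoothed_cca(X[keep], Y[keep], lam)
        sx[i] = (X[i] - X[keep].mean(axis=0)) @ a
        sy[i] = (Y[i] - Y[keep].mean(axis=0)) @ b
    return np.corrcoef(sx, sy)[0, 1] ** 2
```

In use, `loo_ccorsq` would be evaluated over a grid of `lam` values and the value giving the largest score kept.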

21 Smoothed canonical weight functions

22 Functional linear models
- Previously we explored the variability of a functional variable
- Now we explore how much of its variation is explained by other variables
- In classical statistics this is done with linear regression and the general linear model
- Now: functional linear models

23 Precipitation example
- Precipitation (= total rainfall) of a particular area, where $i$ indexes the 35 weather stations
- Does the precipitation depend on the temperature of that area?
- Overfitting without smoothing

24 A functional response and a functional independent variable
How does a precipitation profile depend on the associated temperature profile?
Concurrent: Precipitation now depends only on the temperature now
Annual: Precipitation now depends on the temperature of the whole year

25 Short-term feed-forward: For reasons of parsimony, precipitation now depends on the temperature over an interval back in time. Local influence: Precipitation now depends on the temperature over an interval back in time and the season (is it summer or winter ?)
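Of the four model types listed above, the concurrent one is the simplest to sketch. The code below fits $y_i(t) = \alpha(t) + \beta(t)\,x_i(t)$ by ordinary least squares separately at each grid point. It is an unsmoothed illustration under the assumption that all curves are sampled on a common grid; the function name `fit_concurrent` is hypothetical, and in practice the coefficient functions would also be smoothed.

```python
import numpy as np

def fit_concurrent(X, Y):
    """Pointwise least-squares fit of the concurrent model (illustrative).

    X, Y: (n, m) arrays; row i holds curve i (e.g. temperature and
    precipitation of station i) sampled on a common grid of m points.
    Returns alpha(t) and beta(t) evaluated on that grid.
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    # Simple linear regression across stations at each time point
    beta = (Xc * Yc).sum(axis=0) / (Xc ** 2).sum(axis=0)
    alpha = y_mean - beta * x_mean
    return alpha, beta
```

The annual, short-term feed-forward, and local-influence models differ in that the response at time $t$ depends on the covariate over a range of times, so they require a bivariate coefficient function instead of the pointwise fit used here.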

26 Predicting derivatives
Dynamic model: the model is designed to explain a derivative of some order
– homogeneous first-order linear differential equation
– nonhomogeneous: the temperature term in the equation is called the forcing function

27 References
Book: Functional Data Analysis, J.O. Ramsay, B.W. Silverman
– http://www.functionaldata.org/ (Matlab toolbox for FDA)
http://www.imt.liu.se/~magnus/cca/
– Classical Canonical Correlation Analysis
– A method for solving the blind source separation problem based on CCA
– Matlab functions: cca.m and ccabss.m
http://www.quantlet.com/mdstat/scripts/mva/htmlbook/mvahtmlnode95.html
– Car marks example
– The results presented here differ from those on the site above because the site lists the first and second canonical correlations in the opposite order.
http://www.estsp2007.org/
– Data for the example "Predicting by CCA"

