Published by Rylan Singleterry. Modified over 3 years ago.

1
Transformations Why transform the data? Data transformation is commonly used to linearise the relationship between two numerical variables. If the relationship is non-linear, as revealed by a curved scatterplot, data transformation can be used to make the relationship linear.

2
Transformations For example, consider the following data which gives the marks achieved by a group of students plotted against the number of hours they spent studying.

3
Transformations How do data transformations work? Data transformation works by stretching out or compressing the scale of measurement. As a result, a non-linear relationship can be made linear.

4
Transformations We will consider the following commonly used transformations:
log X: compresses the X-scale
X²: expands the X-scale
1/X: compresses the X-scale
log Y: compresses the Y-scale
Y²: expands the Y-scale
1/Y: compresses the Y-scale
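To make "compressing" and "expanding" concrete, here is a minimal sketch that applies three of these transformations to a handful of values and compares the gaps between neighbouring points. The values in `x_values` and the helper `gaps` are hypothetical, chosen only for illustration, and base-10 logarithms are an assumption (the slides do not state a base).

```python
import math

# Hypothetical X values (not from the slides), spread over a wide range.
x_values = [1, 10, 100, 1000]

# Base-10 log is an assumption; any base changes the scale by a constant factor only.
transforms = {
    "log X": math.log10,
    "X^2": lambda v: v ** 2,
    "1/X": lambda v: 1 / v,
}

def gaps(values):
    """Distances between neighbouring values, in order."""
    return [abs(b - a) for a, b in zip(values, values[1:])]

transformed = {name: [f(v) for v in x_values] for name, f in transforms.items()}

print("X", x_values, "gaps:", gaps(x_values))
for name, values in transformed.items():
    print(name, values, "gaps:", gaps(values))
```

In this sketch the gap between the two largest values shrinks under log X and 1/X (compression) and grows under X² (expansion).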

5
Transformations How do we know which transformation to use? In practice we look at the scatterplot and decide whether a ‘stretching’ or ‘compressing’ transformation is needed. When we do this, we will see that (in theory at least) more than one transformation will do the job.

6
Transformations The relationship between mark and study hours can be linearised by either stretching the Y-axis or compressing the X-axis. Appropriate transformations are thus Y², log X and 1/X.
[Figure: scatterplot annotated "Stretch Y axis" and "Compress X axis"]

7
Transformations Which is the best transformation in this case? We should try each of these suitable transformations. We should look at both the scatterplot and the residual plot for each, to evaluate the effect of the transformation. If the residual plot shows a clear pattern then the transformation has not been successful. If more than one residual plot is acceptable, then we should choose the transformation which results in the highest value of the coefficient of determination (r²).
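The comparison described above can be sketched in a few lines of plain Python. The data here are hypothetical, not the values from the slides: `marks` was generated from marks = 90 − 60/hours, so that the 1/X transformation linearises it exactly. The helper names `fit_line`, `residuals` and `r_squared` are also assumptions made for this sketch.

```python
import math

# Hypothetical data (not from the slides): marks = 90 - 60 / hours.
hours = [1, 2, 3, 4, 5, 6, 8, 10]
marks = [30, 60, 70, 75, 78, 80, 82.5, 84]

def fit_line(xs, ys):
    """Least squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residuals(xs, ys):
    """Observed minus fitted values; plot these against xs to look for a pattern."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def r_squared(xs, ys):
    """Coefficient of determination of the least squares line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum(r ** 2 for r in residuals(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Each candidate maps to the (transformed X, transformed Y) pair to be fitted.
candidates = {
    "Y^2": (hours, [m ** 2 for m in marks]),
    "log X": ([math.log10(h) for h in hours], marks),
    "1/X": ([1 / h for h in hours], marks),
}

results = {name: r_squared(xs, ys) for name, (xs, ys) in candidates.items()}
best_name = max(results, key=results.get)

for name, value in results.items():
    print(f"{name}: r^2 = {value:.3f}")
print("Highest r^2:", best_name)
```

The r² ranking only applies among transformations whose residual plots are acceptable, so in practice the output of `residuals` should be inspected for each candidate before comparing r² values.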

8
Transformations Try Y². The Y² transformation has improved the situation, but the residual plot shows a clear pattern. Here, r² = 0.799.
[Figure: scatterplot and residual plot]

9
Transformations Try log X. The log X transformation has proved more effective than the Y² transformation, but the residual plot still shows some structure. Here, r² = 0.887.
[Figure: scatterplot and residual plot]

10
Transformations Try 1/X. The 1/X transformation appears to have linearised the relationship, and this is confirmed by the residual plot, which shows no apparent pattern. Here, r² = 0.936.
[Figure: scatterplot and residual plot]

11
Transformations Which is the best transformation in this case? Here it is clear that the 1/X transformation is the most appropriate of the three. We can now fit a least squares line to the data, giving us the relationship:
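After a 1/X transformation, the fitted line has the general form mark = a + b × (1/hours), where a and b are the least squares intercept and slope. The sketch below shows how such a line might be fitted and then used for prediction. It assumes hypothetical data (generated from marks = 90 − 60/hours, not the slide's values), and the names `fit_line` and `predict_mark` are illustrative only.

```python
# Hypothetical data (not from the slides): marks = 90 - 60 / hours.
hours = [1, 2, 3, 4, 5, 6, 8, 10]
marks = [30, 60, 70, 75, 78, 80, 82.5, 84]

def fit_line(xs, ys):
    """Least squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Fit the line against the transformed variable 1/hours, not hours itself.
reciprocal_hours = [1 / h for h in hours]
intercept, slope = fit_line(reciprocal_hours, marks)

def predict_mark(study_hours):
    """Apply the same 1/X transformation before using the fitted line."""
    return intercept + slope * (1 / study_hours)

print(f"mark = {intercept:.1f} + ({slope:.1f}) x (1/hours)")
print("Predicted mark for 12 hours:", round(predict_mark(12), 1))
```

The key design point is that any new X value must be transformed in the same way as the training data before it is substituted into the fitted equation.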
