Statistics: Correlation and regression


1 Statistics Correlation and regression

2 Introduction. Some methods involve one variable, e.g. is Treatment A as effective in relieving arthritic pain as Treatment B? Correlation and regression are used to investigate relationships between variables, most commonly linear relationships between two variables, e.g. is BMD related to dietary calcium level?

3 Contents. Coefficients of correlation: meaning, values, role, significance. Regression: line of best fit, prediction, significance.

4 Introduction. Correlation measures the strength of the linear relationship between two variables. Regression analysis determines the nature of the relationship. Example: is there a relationship between the number of units of alcohol consumed and the likelihood of developing cirrhosis of the liver?

5 Pearson’s coefficient of correlation, r. Measures the strength of the linear relationship between one dependent and one independent variable; curvilinear relationships need other techniques. Values lie between +1 and -1: perfect positive correlation, r = +1; perfect negative correlation, r = -1; no linear relationship, r = 0.
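The three reference values of r can be reproduced with a minimal sketch. The function name `pearson_r` and the data are illustrative assumptions, not taken from the slides, which leave the calculation to SPSS.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])  # points on a rising straight line
r_negative = pearson_r(x, [10, 8, 6, 4, 2])  # points on a falling straight line
r_curved = pearson_r(x, [4, 1, 0, 1, 4])     # symmetric curve, no linear trend

print(r_positive, r_negative, r_curved)
```

The third data set follows an exact curve, yet r is zero, which is why curvilinear relationships need other techniques.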

6 Pearson’s coefficient of correlation. Example scatter plots for r = +1, r = -1, r = 0.6 and r = 0.

7 Scatter plot. The dependent variable (BMD), which we make inferences about, is plotted against the independent variable (calcium intake), which we make inferences from and which is controlled in some cases.

8 Non-Normal data

9 Normalised

10 Calculating r. The value and significance of r are calculated by SPSS.

11 SPSS output: scatter plot

12 SPSS output: correlations

13 Interpreting correlation. A large r does not necessarily imply a strong correlation: the significance of r depends on sample size, so a large r from a small sample can arise by chance. Nor does it imply cause and effect: a strong correlation between the number of televisions sold and the number of cases of paranoid schizophrenia does not mean that watching TV causes paranoid schizophrenia; it may be due to an indirect relationship.
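The effect of sample size can be sketched with the usual t statistic for testing r against zero, t = r * sqrt((n - 2) / (1 - r^2)) on n - 2 degrees of freedom. This test is an assumption here, since the slides do not say which test SPSS applies, and the function name is hypothetical.

```python
import math

def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing a correlation of zero."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# The same r gives a larger t, and so stronger evidence, as the sample grows
for n in (5, 20, 50):
    print(n, round(t_statistic(0.6, n), 2))
```

Converting t to a p-value needs the t distribution, which is the part normally left to the statistics package.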

14 Interpreting correlation. Variation in the dependent variable is due to its relationship with the independent variable (proportion r²) and to random factors (proportion 1 - r²). r² is the Coefficient of Determination. E.g. r = 0.661 gives r² = 0.661² ≈ 0.44, so less than half of the variation in the dependent variable is due to the independent variable.
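The arithmetic for the example value r = 0.661 can be checked directly; the variable names are illustrative.

```python
r = 0.661
r_squared = r ** 2           # proportion of variation explained by the relationship
unexplained = 1 - r_squared  # proportion attributed to random factors

print(f"r^2 = {r_squared:.2f}")
print(f"1 - r^2 = {unexplained:.2f}")
```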

16 Agreement. Correlation should never be used to determine the level of agreement between repeated measures, whether comparing measuring devices, users or techniques. It measures only the degree of linear relationship: 1, 2, 3 and 2, 4, 6 are perfectly positively correlated, yet no pair of values agrees.
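The example of 1, 2, 3 against 2, 4, 6 can be worked through in a short sketch. The helper `pearson_r` is a hypothetical name, and the mean paired difference is used only as a simple illustration of disagreement, not as a method recommended by the slides.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for paired samples x and y."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

first = [1, 2, 3]
second = [2, 4, 6]

r = pearson_r(first, second)
# Paired differences show how far apart the two sets of readings are
differences = [b - a for a, b in zip(first, second)]
mean_difference = sum(differences) / len(differences)

print(r, differences, mean_difference)
```

The correlation is perfect although the second set of readings is always double the first.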

17 Assumptions. Errors are the differences between the predicted and actual values of Y. To ascribe significance to r: the distribution of errors is Normal; the variance is the same for all values of the independent variable X.

18 Non-parametric correlation. Make no distributional assumptions. Carried out on ranks. Spearman’s ρ is easy to calculate. Kendall’s τ has some advantages over ρ: its distribution has better statistical properties, and concordant / discordant pairs are easier to identify. Usually both lead to the same conclusions.
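Both rank-based coefficients can be sketched briefly. The assumptions here are that Spearman's coefficient is computed as Pearson's r on the ranks, and that Kendall's coefficient is the simple tau-a form with no correction for ties. The function names and data are illustrative only.

```python
import math

def ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson_r(x, y):
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's coefficient: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Illustrative (made-up) data with two pairs out of order
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
rho = spearman_rho(x, y)
tau = kendall_tau_a(x, y)
print(round(rho, 3), round(tau, 3))
```

The two coefficients differ in size for the same data but point in the same direction.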

19 Calculation of value and significance. Computer does it!

20 Role of regression. Shows how one variable changes with another by determining the line of best fit, which may be linear or curvilinear.

21 Line of best fit. Simplest case: linear. The line of best fit between the dependent variable Y (BMD) and the independent variable X (dietary intake of calcium) is Y = a + bX, where a is the value of Y when X = 0 and b is the change in Y when X increases by 1.
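A minimal sketch of fitting Y = a + bX follows, assuming the ordinary least-squares criterion, which the slides do not name explicitly. The function name `fit_line` and the data are made up for illustration.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("X must take at least two distinct values")
    b = sxy / sxx            # slope: change in Y when X increases by 1
    a = mean_y - b * mean_x  # intercept: value of Y when X = 0
    return a, b

# Illustrative (made-up) data lying exactly on Y = 1 + 2X
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```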

22 Role of regression. Used to predict the value of the dependent variable when the value of the independent variable(s) is known. Predict only within the range of the known data: extrapolation is risky (e.g. the relation between age and bone age). Does not imply causality.
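One way to guard against accidental extrapolation is to refuse predictions outside the observed range of X. This is a sketch of that design choice only; the function `predict`, its parameters and the numbers are hypothetical.

```python
def predict(a, b, x, x_min, x_max):
    """Predict Y = a + bX, refusing values of X outside the fitted range."""
    if not x_min <= x <= x_max:
        raise ValueError(
            f"X = {x} is outside the observed range [{x_min}, {x_max}]; "
            "extrapolation is risky"
        )
    return a + b * x

# Hypothetical fitted line Y = 1 + 2X, fitted on X values from 0 to 3
inside = predict(1.0, 2.0, 2.0, x_min=0.0, x_max=3.0)
print(inside)

try:
    predict(1.0, 2.0, 10.0, x_min=0.0, x_max=3.0)
except ValueError as error:
    print(error)
```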

23 SPSS output: regression

24 Assumptions. Needed only if statistical inferences are to be made about: the significance of the regression; the values of slope and intercept.

25 Assumptions. If values of the independent variable are randomly chosen then no further assumptions are necessary. Otherwise, as in correlation, the assumptions are based on the errors: they balance out (mean = 0); their variances are equal for all values of the independent variable; they are not related to the magnitude of the independent variable. If in doubt, seek advice / help.
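The conditions on the errors can be inspected informally from the residuals of a fitted line. The sketch below uses made-up data and a crude comparison of residual spread between the lower and upper halves of X; it is an informal check under those assumptions, not a formal test.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b

def variance(values):
    """Sample variance with the n - 1 denominator."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

# Illustrative (made-up) data, already sorted by X
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

a, b = fit_line(x, y)
# Errors: actual values of Y minus the values predicted by the line
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
mean_residual = sum(residuals) / len(residuals)

# Crude equal-variance check: spread of residuals for low X versus high X
half = len(x) // 2
variance_low = variance(residuals[:half])
variance_high = variance(residuals[half:])

print(round(mean_residual, 10), round(variance_low, 4), round(variance_high, 4))
```

For a least-squares line with an intercept the residuals always average to zero, so the informative part is whether their spread changes with X.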

26 Multivariate regression (more strictly termed multiple regression). More than one independent variable, e.g. BMD dependent on age, gender, calorific intake etc.
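With several independent variables the line of best fit becomes Y = a + b1*X1 + b2*X2 + ... A minimal sketch follows, assuming ordinary least squares solved through the normal equations. The function names and the data are made up and do not represent BMD measurements.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    # Work on an augmented copy so the inputs are not modified
    m = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("independent variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution from the last row upwards
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_multiple_regression(rows, y):
    """Return [a, b1, b2, ...] for Y = a + b1*X1 + b2*X2 + ..."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Illustrative (made-up) data generated exactly from Y = 1 + 2*X1 + 3*X2
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, 4, 6, 8]
coefficients = fit_multiple_regression(rows, y)
print([round(c, 6) for c in coefficients])
```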

27 Logistic regression. The dependent variable is binary (yes / no), e.g. predict whether a patient with Type 1 diabetes will undergo limb amputation given history of prior ulcer, time diabetic etc. The result is a probability. Can be extended to more than two categories, e.g. outcome after treatment: recovered, in remission, died.
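How a fitted logistic model turns predictor values into a probability can be sketched as follows. The coefficients below are hypothetical numbers chosen for illustration, not estimates from any patient data, and the function names are assumptions.

```python
import math

def logistic(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predicted_probability(intercept, coefficients, predictors):
    """Probability of the 'yes' outcome for one set of predictor values."""
    z = intercept + sum(c * v for c, v in zip(coefficients, predictors))
    return logistic(z)

# Hypothetical model, NOT fitted to real data:
# predictors are prior ulcer (1 = yes, 0 = no) and years diabetic
intercept = -3.0
coefficients = [1.2, 0.05]

p_with_ulcer = predicted_probability(intercept, coefficients, [1, 20])
p_without_ulcer = predicted_probability(intercept, coefficients, [0, 20])
print(round(p_with_ulcer, 3), round(p_without_ulcer, 3))
```

Because the output is a probability, a yes / no prediction needs a chosen cut-off; that choice is a separate decision and is not shown here.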

28 Summary. Correlation: strength of the linear relationship between two variables; Pearson’s is parametric, Spearman’s / Kendall’s are non-parametric. Interpret with care! Regression: line of best fit; prediction; multivariate; logistic.

