Crash Course in Statistics
Dr Kelvin Ng Kuan Huei MBBS MRCP, Specialist Registrar in CPT/GIM

‘There are three kinds of lies: lies, damned lies, and statistics.’ -- Benjamin Disraeli

Why understand statistics?
Statistics help us to see patterns
Bad statistics = bad decisions
If you don’t understand statistics, you can’t spot bad statistics

Quantitative vs Qualitative
Qualitative
–Complete, detailed description
–Researcher may only roughly know the endpoint
–Researcher is the data gathering instrument
–Data in the form of pictures, words or objects
–Subjective
–‘Rich’, but more time consuming and not generalizable
Quantitative
–Classify, count and analyse statistically
–Researcher knows what the endpoint is
–Researcher uses tools
–Data in the form of numbers
–Objective
–Efficient, allows hypothesis testing, but with loss of detail

‘Red apple was the favourite as it was sweeter, crunchier and tastier, but on the other hand green apple was more refreshing!’
‘The red apple was the favourite compared with the green apple, with P<0.05.’

Observational studies vs RCT
Experimental and quasi-experimental
Observational studies
–Easy, fast and relatively cheap
–Dependent on stratification to deal with eg. selection bias and covariates
RCT
–Balancing of confounding factors
–Limited generalisability, not always applicable, slow

Statistics
Descriptive statistics
–Describe or summarise data
Inferential statistics
–Make statistical inferences and draw conclusions
Estimation
–Confidence interval
–Parameter estimation
Hypothesis testing
–Null hypothesis

Descriptive statistics
Measures of central tendency
–Mean, mode, median
Measures of dispersion and variability
–Standard deviation, variance
Diagrams, eg. stem and leaf, box plots

Descriptive statistics
Sample
–9, 4, 5, 4, 7, 4, 2, 5
–Sorted: 2, 4, 4, 4, 5, 5, 7, 9
–Mean = 5
–Median = 4.5
–Mode = 4
–Standard deviation = 2 (dividing by n; dividing by n − 1 gives approximately 2.14)
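
The figures on this slide can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative. Note that a standard deviation of exactly 2 corresponds to the population formula (`pstdev`), while the sample formula (`stdev`) gives a slightly larger value.

```python
import statistics

# The eight observations from the slide
sample = [9, 4, 5, 4, 7, 4, 2, 5]

mean = statistics.mean(sample)             # arithmetic mean
median = statistics.median(sample)         # middle of the sorted values
mode = statistics.mode(sample)             # most frequent value
population_sd = statistics.pstdev(sample)  # divides by n
sample_sd = statistics.stdev(sample)       # divides by n - 1

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("population SD:", population_sd)
print("sample SD:", round(sample_sd, 2))
```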

Inferential statistics
Reach conclusions beyond the immediate data alone, ie. make inferences about the population based on a sample
True state of affairs + chance = sample
–Sampling error
–Central limit theorem, ie. the means of sufficiently large samples are approximately normally distributed
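
The central limit theorem can be illustrated with a small simulation. The sketch below is not from the slides: the exponential population, the sample size of 50 and the 2000 repetitions are all assumptions chosen for illustration. It draws repeated samples from a skewed population and checks that the sample means cluster around the true mean with a spread close to SD / sqrt(n).

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50   # observations per sample (assumed)
N_SAMPLES = 2000   # number of repeated samples (assumed)


def one_sample_mean(n):
    """Mean of n draws from a skewed exponential population (true mean 1, SD 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))


sample_means = [one_sample_mean(SAMPLE_SIZE) for _ in range(N_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = 1.0 / SAMPLE_SIZE ** 0.5  # population SD / sqrt(n)

print("mean of sample means:", round(mean_of_means, 3))
print("SD of sample means:", round(sd_of_means, 3))
print("expected standard error:", round(expected_se, 3))
```

Although individual observations are strongly skewed, a histogram of `sample_means` would look roughly bell shaped.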

Inferential statistics
Comparisons analysis
–Compares either means or medians between groups
Correlation analysis
–Correlation does not imply causation
Regression analysis
–Incorporates multiple covariates into an equation

Comparisons analysis
T-test
–Comparison of means between two groups
Mann Whitney U test (unpaired) and Wilcoxon matched pair test (paired)
–Comparison of medians
ANOVA and Kruskal Wallis test
–Comparison of means between three or more unrelated groups (ANOVA)
–Comparison of medians between three or more unrelated groups (Kruskal Wallis test)
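
As a rough illustration of what these tests compute, the sketch below calculates a t statistic (Welch's unequal-variance form, one common variant) and a Mann Whitney U statistic by hand for two small groups. The data and function names are hypothetical, and only the test statistics are computed; turning them into P values needs the t distribution or U tables, which a statistics package would normally supply.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic: difference in means divided by its standard error."""
    mean_diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return mean_diff / se


def mann_whitney_u(a, b):
    """U for group a: number of (a, b) pairs where a > b, ties counting 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Hypothetical measurements for two unrelated groups
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.4, 4.8, 4.1, 4.5]

t_stat = welch_t(group_a, group_b)
u_stat = mann_whitney_u(group_a, group_b)

print("t statistic:", round(t_stat, 2))
print("U statistic:", u_stat, "out of", len(group_a) * len(group_b), "pairs")
```

The t statistic depends on the actual values, whereas U depends only on their ordering, which is why the rank-based test does not need normally distributed data.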

Correlation analysis
Linear datasets?
Spearman rank correlation
–Ordinal data, with no need for a normal distribution
Pearson’s product moment correlation
–Interval data
Correlation does not imply cause and effect!
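
The difference between the two coefficients shows up on data that are monotonic but not linear. The sketch below implements both by hand with the standard library; the helper names and the cubic example data are assumptions for illustration. Spearman's coefficient is simply Pearson's coefficient applied to the ranks.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y rises with x, but along a curve rather than a line
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

r = pearson_r(x, y)
rho = spearman_rho(x, y)

print("Pearson r:", round(r, 3))
print("Spearman rho:", round(rho, 3))
```

Here the ranks agree perfectly, so Spearman's coefficient is 1, while Pearson's coefficient falls short of 1 because the relationship is not a straight line.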

Regression analysis
Does not assume normal sampling
Allows modelling of the dependence of one variable on one or more others
Binomial dataset
–Chi-squared (χ²) test
Linear regression
Multiple regression

Linear regression
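
A minimal sketch of simple linear regression by ordinary least squares follows, assuming hypothetical dose and response values (the slide's own figure is not available in this transcript). The function name `fit_line` is illustrative.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: response measured at five dose levels
dose = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(dose, response)

print("slope:", round(slope, 2))
print("intercept:", round(intercept, 2))
print("predicted response at dose 6:", round(intercept + slope * 6, 2))
```

The fitted line treats the response as dependent on the dose; swapping the two variables would generally give a different line.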

Multiple regression

Correlation vs regression
Correlation
–Makes no assumption about which variable depends on the other
–Tests for interdependence
Regression
–Assumes one variable is dependent on the covariates
–One-way causal relationship (in linear regression)

Correlation or regression analysis?

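The P value
It is not a measure of the hypothesis itself, ie. it is not the probability that the null hypothesis is true: the null hypothesis is not a random event!
It is the probability of obtaining a result at least as extreme as the one observed by chance alone, if the null hypothesis is true
P<0.05 means a less than 5% chance of obtaining such a result by chance alone
Pre-test probability
–Bayesian probability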
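
To make the definition concrete, the sketch below computes an exact two-sided P value for a taste test like the apple example earlier in the talk. The numbers are hypothetical: it assumes 20 tasters, 16 of whom prefer the red apple, and a null hypothesis of no preference (probability 0.5 each way).

```python
from math import comb


def two_sided_binomial_p(successes, n):
    """Exact two-sided P value under the null hypothesis of a 50:50 split.

    Sums the probabilities of every outcome at least as far from n/2
    as the observed count.
    """
    distance = abs(successes - n / 2)
    extreme = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= distance)
    return extreme / 2 ** n


# Hypothetical result: 16 of 20 tasters prefer the red apple
p_value = two_sided_binomial_p(16, 20)

print("P value:", round(p_value, 4))
```

The result is the chance of a split at least this lopsided if tasters really had no preference. It says nothing directly about how likely the null hypothesis itself is.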

The P value
High P value
–Study may be underpowered
–Limited clinical difference
Low P value
–With a large enough sample size, even trivial differences reach statistical significance
–Statistical significance does not equate to clinical significance
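
The effect of sample size on the P value can be sketched with a simple two-sample z-test. All figures are assumptions for illustration: a true difference of 0.1 units between groups, a known standard deviation of 10, and the observed difference taken to equal the true one.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_test_p(mean_diff, sd, n_per_group):
    """P value for a difference in means between two equal-sized groups (known SD)."""
    se = sd * math.sqrt(2 / n_per_group)
    return two_sided_p_from_z(mean_diff / se)


# Same trivial difference, very different sample sizes (all values assumed)
p_small = z_test_p(mean_diff=0.1, sd=10.0, n_per_group=100)
p_large = z_test_p(mean_diff=0.1, sd=10.0, n_per_group=1_000_000)

print("n = 100 per group:       P =", round(p_small, 3))
print("n = 1,000,000 per group: P =", p_large)
```

The difference is identical in both cases and would be clinically negligible either way; only the sample size changes whether it is statistically significant.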

P value is no replacement for common sense!

Type I and II errors
Type I error (α error)
–False positive, ie. rejecting the null hypothesis when it is true
Type II error (β error)
–False negative, ie. failing to reject the null hypothesis when it is false
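
Both error rates can be estimated by simulation. The sketch below is built on assumptions not found in the slides: normally distributed outcomes with a known SD of 1, 30 subjects per group, a significance level of 0.05, a true difference of 0.5 for the false-negative case, and a z-test in place of a t-test to keep the code short.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

ALPHA = 0.05       # significance level (assumed)
N_PER_GROUP = 30   # subjects per group (assumed)
N_TRIALS = 2000    # simulated studies per scenario (assumed)


def z_test_p(a, b, sd=1.0):
    """Two-sided P value for a difference in means, treating the SD as known."""
    diff = sum(b) / len(b) - sum(a) / len(a)
    se = sd * math.sqrt(1 / len(a) + 1 / len(b))
    return math.erfc(abs(diff / se) / math.sqrt(2))


def rejection_rate(true_diff):
    """Fraction of simulated studies that reject the null hypothesis."""
    rejections = 0
    for _ in range(N_TRIALS):
        control = [random.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
        treated = [random.gauss(true_diff, 1.0) for _ in range(N_PER_GROUP)]
        if z_test_p(control, treated) < ALPHA:
            rejections += 1
    return rejections / N_TRIALS


# Null hypothesis true: every rejection is a false positive
type_1_rate = rejection_rate(0.0)
# Null hypothesis false: every failure to reject is a false negative
type_2_rate = 1 - rejection_rate(0.5)

print("estimated type I error rate:", type_1_rate)
print("estimated type II error rate:", type_2_rate)
```

The type I rate sits near the chosen significance level by construction, while the type II rate depends on the effect size and sample size, which is what a power calculation addresses.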

Subgroup analysis
Usually not statistically powered
Multiple testing
Usually not adjusted for covariates
Predetermined endpoints
ISIS-2 and star signs (aspirin appeared ineffective in patients born under Gemini or Libra)
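
The multiple testing problem can be quantified with a short calculation. The sketch below assumes the tests are independent and each uses a significance level of 0.05; the Bonferroni correction shown is one common adjustment and is not mentioned in the slides.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall rate near alpha."""
    return alpha / n_tests


# Twelve subgroups, eg. one per star sign
fwer_12 = familywise_error(0.05, 12)
threshold_12 = bonferroni_threshold(0.05, 12)

print("chance of at least one false positive:", round(fwer_12, 3))
print("Bonferroni per-test threshold:", round(threshold_12, 4))
```

With twelve subgroups there is close to an even chance that at least one shows a spurious "significant" effect, even when the treatment works identically in all of them.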

Hazard ratio
–The risk (hazard) of an event, eg. death or a composite endpoint, in one group relative to that in a comparator group
–A value of 1 suggests no difference between the groups
–Usually reported with a 95% confidence interval

Relative vs absolute risk reduction
Beware of headline grabbing statements!
–If I buy two lottery tickets, I double my chances of winning (a 100% relative increase)
–If I buy two lottery tickets, I increase my chance of winning to 0.0001% (the absolute figure)
Significance of effect is dependent on incidence
Important in health economics assessments
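
The dependence on incidence can be shown with a short calculation. The risks below are hypothetical, and the number needed to treat (the reciprocal of the absolute risk reduction) is an addition not mentioned on the slide, included because it is often used in health economics.

```python
def risk_reduction(control_risk, treated_risk):
    """Return (relative risk reduction, absolute risk reduction, number needed to treat)."""
    arr = control_risk - treated_risk
    rrr = arr / control_risk
    nnt = 1 / arr
    return rrr, arr, nnt


# Hypothetical rare event: risk falls from 2% to 1%
rare = risk_reduction(0.02, 0.01)
# Hypothetical common event: risk falls from 40% to 20%
common = risk_reduction(0.40, 0.20)

for label, (rrr, arr, nnt) in (("rare event", rare), ("common event", common)):
    print(f"{label}: RRR = {rrr:.0%}, ARR = {arr:.0%}, NNT = {nnt:.0f}")
```

Both scenarios give the same relative risk reduction, yet the absolute benefit, and hence the number of patients who must be treated to prevent one event, differs twentyfold.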

Summary
Statistics
–Studies: qualitative vs quantitative, observation vs RCT
–Descriptive: central limit theorem, central measures, dispersion
–Inferential: comparison, correlation, regression
–Concepts: hazard ratios, errors, P-values, subgroup analysis

Questions?

‘There are three kinds of lies: lies, damned lies, and statistics.’ -- Benjamin Disraeli
