Applied statistics Katrin Jaedicke

What you will learn in this course
- Basic statistics terminology
- Using SPSS
- Summary statistics
- Cross-sectional and longitudinal comparisons of 2 and more samples
- Corrections for multiple comparisons
- Correlations
- Transformations
- Creating graphs in SPSS and SigmaPlot
- To be confident in using statistics!
The statistics presented in the lecture are correct (to the best of my knowledge), but this does not imply that all other statistical methods are wrong! (But be sure you know what you are doing if you are using other methods!)

Introduction to SPSS

Comparison of 2 groups (k = 2)

Independent samples
- Metric data: test for normal distribution (Shapiro-Wilk test).
  - Normally distributed: test for homogeneity of variances (Levene test). Variances homogeneous -> t-test for independent samples (Student's t-test); not homogeneous -> Mann-Whitney U-test.
  - Not normally distributed -> Mann-Whitney U-test.
- Categorical data -> Mann-Whitney U-test.

Dependent samples
- Metric data: test for normal distribution (Shapiro-Wilk test).
  - Normally distributed -> paired t-test.
  - Not normally distributed -> Wilcoxon test.
- Categorical data -> Wilcoxon test.
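The original course applies this decision tree in SPSS. Purely as an illustrative sketch (not part of the original material), the same logic for two independent samples could look like this in Python with scipy; the group names and data values are invented:

```python
import numpy as np
from scipy import stats

# Two independent samples of metric data (invented example values).
group_a = np.array([15.0, 14.2, 16.1, 15.5, 14.8, 15.9])
group_b = np.array([12.1, 13.0, 11.8, 12.6, 13.4, 12.2])

# Step 1: Shapiro-Wilk test for normal distribution in each group.
normal_a = stats.shapiro(group_a).pvalue > 0.05
normal_b = stats.shapiro(group_b).pvalue > 0.05

if normal_a and normal_b:
    # Step 2: Levene test for homogeneity of variances.
    if stats.levene(group_a, group_b).pvalue > 0.05:
        # Student's t-test for independent samples.
        result = stats.ttest_ind(group_a, group_b)
    else:
        # Variances not homogeneous: fall back to the non-parametric test.
        result = stats.mannwhitneyu(group_a, group_b)
else:
    # Not normally distributed: Mann-Whitney U-test.
    result = stats.mannwhitneyu(group_a, group_b)

# For dependent (paired) samples the same pattern applies with
# stats.ttest_rel (paired t-test) and stats.wilcoxon (Wilcoxon test).
print(result)
```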

Independent samples, dependent samples and replicates
[Figure: three panels illustrating the concepts with example body weights before and after starvation: a) independent samples, b) dependent (related) samples, c) replicates]

Exercise: independent samples, dependent samples and replicates
[Figure: cell culture scenarios A to E with Treatment 1, Treatment 2 and Treatment 3; samples taken at 0 h, 6 h and 24 h and measured by ELISA]

Metric and categorical data
Categorical data fall into a limited number of categories, for example age groups (child, teenager, adult).
Examples from the lab:
- Metric: ELISA, Bradford protein assay, cell proliferation, flow cytometry, real-time PCR
- Categorical: states of disease severity, cancer classifications, staining categories

Normal distribution
[Figure: histogram with height of each person on the x-axis and number of people on the y-axis; very few very small people, many people of average height, very few very tall people]

The Null Hypothesis
The Null Hypothesis is the question that you ask when doing a statistical test. It is important to know which question the test is asking in order to understand the result!
What we test in statistics: how big is the mistake that I make if I reject the Null Hypothesis (i.e. if I say the Null Hypothesis is wrong)?
The accepted mistake is (generally) set at 5 %:
- < 5 %: *p < 0.05 (small mistake)
- < 1 %: **p < 0.01 (even smaller mistake)
- < 0.1 %: ***p < 0.001 (very small mistake!)

The normal distribution test (Shapiro-Wilk test) asks the following question: do our data follow a normal distribution?
Answer to that question:
- Yes -> p > 0.05, i.e. the hypothesis holds and our data follow a normal distribution!
- No -> p < 0.05

Homogeneity of variance
How spread out are two different samples?
Null Hypothesis question: are the variances in both populations equal?
p > 0.05 = homogeneity of variance!

Null Hypothesis question for any test looking at differences between groups: there are no differences between the groups.
p < 0.05 = there is a significant difference between the groups

Comparison of more groups (k > 2)

Independent samples
- Metric data: test for normal distribution (Shapiro-Wilk test).
  - Normally distributed: pairwise t-tests with Bonferroni correction, or test homogeneity of variances (Levene test) and use one-way ANOVA if variances are homogeneous (Kruskal-Wallis test if not).
  - Not normally distributed: pairwise U-tests with Bonferroni correction, or the Kruskal-Wallis test.
- Categorical data: U-tests with Bonferroni correction / Kruskal-Wallis test.

Dependent samples
- Metric data: test for normal distribution (Shapiro-Wilk test).
  - Normally distributed: pairwise paired t-tests with Bonferroni correction, or test for sphericity (Mauchly's test) and use repeated measurement ANOVA if sphericity holds (Friedman test if not).
  - Not normally distributed: pairwise Wilcoxon tests with Bonferroni correction, or the Friedman test.
- Categorical data: Wilcoxon tests with Bonferroni correction / Friedman test.
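Again, the course does this in SPSS. As a hedged sketch of the metric-data branch for independent samples in Python with scipy (data values invented), one possible implementation is:

```python
import numpy as np
from scipy import stats

# Three independent groups of metric data (invented example values).
groups = [
    np.array([5.1, 4.8, 5.6, 5.3, 4.9]),
    np.array([6.2, 6.8, 5.9, 6.4, 6.6]),
    np.array([4.1, 3.8, 4.5, 4.2, 3.9]),
]

# Shapiro-Wilk test for normality in every group, Levene test for equal variances.
all_normal = all(stats.shapiro(g).pvalue > 0.05 for g in groups)
equal_variances = stats.levene(*groups).pvalue > 0.05

if all_normal and equal_variances:
    # One-way ANOVA for more than two independent, normally distributed groups.
    statistic, p_value = stats.f_oneway(*groups)
else:
    # Otherwise the non-parametric Kruskal-Wallis test.
    statistic, p_value = stats.kruskal(*groups)

print(statistic, p_value)
```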

Mauchly’s Test of Sphericity
Null Hypothesis question: is the variance between all group differences the same?
p > 0.05 = homogeneity of variance (sphericity)!
[Table: measurements for patients P1 to P5 at 0 h, 24 h and 48 h, and the pairwise differences 0 h-24 h, 0 h-48 h and 24 h-48 h whose variances are compared]
Note: if you want to know how to calculate variance, check here:
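SPSS reports Mauchly's test automatically when running a repeated measures ANOVA. The following Python sketch is not a full Mauchly's test; it only illustrates the idea by computing the variances of the pairwise time-point differences from an invented table like the one on the slide:

```python
import numpy as np

# Invented measurements for patients P1-P5 at 0 h, 24 h and 48 h (one row per patient).
data = np.array([
    [10.0, 12.1, 13.0],
    [ 9.5, 11.8, 12.4],
    [11.2, 13.5, 14.9],
    [10.8, 12.0, 13.3],
    [ 9.9, 12.6, 13.1],
])

# Pairwise differences between the time points for each patient.
differences = {
    "0 h - 24 h":  data[:, 0] - data[:, 1],
    "0 h - 48 h":  data[:, 0] - data[:, 2],
    "24 h - 48 h": data[:, 1] - data[:, 2],
}

# Sphericity (informally): the variances of these difference scores should be similar.
for label, diff in differences.items():
    print(label, np.var(diff, ddof=1))
```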

Post-hoc testing and the Bonferroni correction
5 Student's t-tests:
1. Control-A
2. Control-B
3. Control-C
4. A-C
5. B-C
Error of multiple testing -> Control and C are replicates!
Bonferroni correction: very small new p-values, with the risk of losing all significance, especially with a small sample size.
Bonferroni-Holm or Benjamini-Hochberg correction (Benjamini-Hochberg only for parametric data): stepwise correction (less conservative, more powerful).

Corrections for multiple comparisons (Bonferroni corrections)
ELISA example:
1. Control-A (p = )
2. Control-B (p = 0.003)
3. Control-C (p = 0.01)
4. A-C (p = 0.04)
5. B-C (p = 0.06)
As post-hoc testing, we do 5 comparisons, which give us 5 different p-values.
The exact same Control data are used 3 times -> replicates!
The exact same stimulation data C are used 3 times -> replicates!
We need to correct for the error of multiple testing, i.e. for the mistake of using replicates!
It does not matter whether we have used Student's t-test, the paired-samples t-test, the Mann-Whitney or the Wilcoxon test to get these p-values (for each of the 5 tests; do not mix different tests!) -> corrections should be done no matter which branch/side of the overview diagram you are on.

Exercise: Bonferroni-Holm
1. Put all the p-values, from the smallest to the highest, into the K column (0.003; 0.01; 0.04; ...).
2. Use the new p-values to define the level of significance (**).
Note: if fewer tests are done (e.g. 3 or 4) or if more tests are done (e.g. 6, 7, ...), delete or add cells in the Excel spreadsheet and change K accordingly.
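The exercise above uses an Excel spreadsheet. As an alternative sketch (not part of the original material), the same Holm correction can be obtained in Python with statsmodels; the first p-value below is a hypothetical stand-in for the missing Control-A value, the others are taken from the ELISA example:

```python
from statsmodels.stats.multitest import multipletests

# p-values from the 5 post-hoc comparisons; the Control-A value (0.0005) is hypothetical,
# the remaining four are the ones shown on the ELISA example slide.
pvals = [0.0005, 0.003, 0.01, 0.04, 0.06]

# Holm (step-down Bonferroni) correction; method="bonferroni" would give the classic correction.
reject, p_adjusted, _, _ = multipletests(pvals, alpha=0.05, method="holm")

for p_raw, p_adj, significant in zip(pvals, p_adjusted, reject):
    print(f"raw p = {p_raw:.4f}  adjusted p = {p_adj:.4f}  significant: {significant}")
```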

Transformations -> achieve parametric testing
[Figure: histogram with height of each person on the x-axis and number of people on the y-axis]
- To get non-normally distributed data into a normal distribution
- To get data which do not have equal variances into data which have equal variances
- After transformation, data have to be checked again for normal distribution and equality of variance
- Use the new data for statistics, but not for graphs! Graphs should be made with the original, untransformed data.
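A common example is the log transformation. The following Python sketch (again outside the original SPSS workflow) shows the re-check of normality after transforming; the data are simulated:

```python
import numpy as np
from scipy import stats

# Right-skewed data, invented for illustration (e.g. a concentration measurement).
rng = np.random.default_rng(42)
raw = rng.lognormal(mean=1.0, sigma=0.8, size=30)
print("raw data:        Shapiro-Wilk p =", stats.shapiro(raw).pvalue)

# Log transformation; the transformed values are used for the statistics only,
# graphs should still be made with the original, untransformed data.
transformed = np.log(raw)
print("log-transformed: Shapiro-Wilk p =", stats.shapiro(transformed).pvalue)
```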

Correlations

- Metric data: test for normal distribution (Shapiro-Wilk test).
  - Normally distributed and not a small sample size -> Pearson correlation.
  - Not normally distributed, or small sample size -> Spearman's rank correlation.
- Categorical data -> Spearman's rank correlation.

Correlations + Chi square

Correlations
- Significant p -> draw line
- Correlation coefficient between 0 and 1 (in absolute value)
- < 0.3: weak correlation
- > 0.75: strong correlation

Chi square
- Only yes-no answers exist
- For example: comparison of gender, races, blood groups...
- Important to test whether patient groups are matched
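For illustration only (the course itself uses SPSS), a Python sketch of both approaches with invented data:

```python
import numpy as np
from scipy import stats

# Invented paired metric measurements.
x = np.array([1.2, 2.4, 3.1, 4.8, 5.0, 6.3, 7.1, 8.4])
y = np.array([2.0, 2.9, 3.5, 5.2, 5.1, 6.8, 7.0, 9.1])

# Pearson correlation for normally distributed metric data,
# Spearman's rank correlation for non-normal data, small samples or categorical data.
print("Pearson: ", stats.pearsonr(x, y))
print("Spearman:", stats.spearmanr(x, y))

# Chi-square test for a 2x2 table of yes-no answers (invented counts,
# e.g. gender vs. patient group, to check whether groups are matched).
table = np.array([[12, 8],
                  [10, 15]])
chi2, p, dof, expected = stats.chi2_contingency(table)
print("Chi square:", chi2, p)
```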

The “grey” areas of statistics
Q: How important is the normal distribution?
A: The “big” tests such as ANOVA and repeated measures ANOVA, but also the t-tests for larger sample sizes, can “cope” with having only an approximate normal distribution.
Q: How important is the equality of variance?
A: Very! A violation of equality of variances potentially changes test results and may also reduce statistical power.
Q: What is a small and what is a large sample size?
A: There is no “definition” of small and large sample size; it depends on the field of research what is commonly used. Rule of thumb: a sample size of n = 4 is the minimum at which parametric testing can be done; anything less should be tested non-parametrically.
Q: Do I always have to correct for multiple comparisons?
A: No, but you have stronger results if your p-values are still significant after correction, and they are less open to the criticism of being a “chance” finding.

Mean and Median
Mean -> normally distributed data
Add all numbers of the analysed samples together and divide by n (sample size).
For example: 1, 2, 4, 6, 12 -> 1 + 2 + 4 + 6 + 12 = 25; mean: 25/5 = 5
Median -> data are not normally distributed
Find the middle number of the analysed samples.
For example:
- Odd number of values: 3, 9, 15, 17, 44 -> middle number -> median: 15
- Even number of values: 3, 6, 8, 12, 17, 44 -> add the 2 middle numbers and divide by 2 -> median: (8+12)/2 = 10
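The same examples computed with numpy, as a small sketch outside the SPSS/Excel workflow of the course:

```python
import numpy as np

values = np.array([1, 2, 4, 6, 12])
print(np.mean(values))   # 5.0

odd = np.array([3, 9, 15, 17, 44])
even = np.array([3, 6, 8, 12, 17, 44])
print(np.median(odd))    # 15.0
print(np.median(even))   # 10.0, i.e. (8 + 12) / 2
```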

Standard deviation, standard error and interquartile range
Standard deviation and standard error -> normally distributed data
Standard deviation: how much variation there is around the mean
- Small standard deviation: data points are spread closely around the mean
- Large standard deviation: data points are spread widely around the mean
- In Excel: =STDEV
Standard error: the standard deviation of the sample mean, i.e. a measure of how accurately the mean is estimated -> does not add valuable information to the data, do not use!
Interquartile range -> data are not normally distributed
- First quartile (Q1) or lower quartile: 25th percentile
- Second quartile (Q2) or median: 50th percentile
- Third quartile (Q3) or upper quartile: 75th percentile
- Interquartile range: Q3 - Q1
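A corresponding Python sketch (values taken from the median example above, functions from numpy and scipy):

```python
import numpy as np
from scipy import stats

data = np.array([3.0, 6.0, 8.0, 12.0, 17.0, 44.0])

# Sample standard deviation (ddof=1 matches Excel's =STDEV).
print("standard deviation:", np.std(data, ddof=1))

# Standard error of the mean (shown for completeness only).
print("standard error:", stats.sem(data))

# Interquartile range: Q3 - Q1.
q1, q3 = np.percentile(data, [25, 75])
print("interquartile range:", q3 - q1)
```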

Box plot
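The original slide shows a box plot figure, which is not reproduced in this transcript. As a hedged sketch of how such a plot could be produced outside SPSS/SigmaPlot, using matplotlib with invented data:

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented example data for two groups.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=10, scale=2, size=30)
group_b = rng.normal(loc=13, scale=3, size=30)

# A box plot shows the median, the interquartile range (box) and the overall spread (whiskers).
plt.boxplot([group_a, group_b])
plt.xticks([1, 2], ["Group A", "Group B"])
plt.ylabel("Measurement")
plt.show()
```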