SPSS Instructions for Introduction to Biostatistics Larry Winner Department of Statistics University of Florida.


SPSS Windows
Data View
–Used to display data
–Columns represent variables
–Rows represent individual units or groups of units that share common values of variables
Variable View
–Used to display information on variables in dataset
–TYPE: Allows for various styles of displaying
–LABEL: Allows for longer description of variable name
–VALUES: Allows for longer description of variable levels
–MEASURE: Allows choice of measurement scale
Output View
–Displays results of analyses/graphs

Data Entry Tips I
For variables that are not identifiers (such as name, county, school, etc.), use numeric values for levels and use the VALUES option in VARIABLE VIEW to give their levels. Some procedures require numeric labels for levels. SPSS will print the VALUES on output.
For large datasets, use a spreadsheet such as EXCEL, which is more flexible for data entry, and import the file into SPSS.
Give a descriptive LABEL to variable names in the VARIABLE VIEW.
Keep in mind that columns are variables; you don’t want multiple columns with the same variable.

Data Entry/Analysis Tips II
When re-analyzing previously published data, it is often possible to have only a few outcomes (especially with categorical data), with many individuals sharing the same outcomes (as in contingency tables)
For ease of data entry:
–Create one line for each combination of factor levels
–Create a new variable representing a COUNT of the number of individuals sharing this “outcome”
When analyzing data, click on:
–DATA → WEIGHT CASES → WEIGHT CASES BY
–Click on the variable representing COUNT
–All subsequent analyses treat that outcome as if it occurred COUNT times
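
The effect of weighting cases by COUNT can be sketched outside SPSS as expanding each aggregated line into COUNT identical individual records. This is only an illustration in Python; the data values, the field names, and the helper `expand_weighted` are hypothetical and not part of SPSS.

```python
# Hypothetical aggregated data: one line per combination of factor levels,
# plus a COUNT of individuals sharing that combination.
rows = [
    {"exposure": 1, "outcome": 1, "count": 3},
    {"exposure": 1, "outcome": 2, "count": 2},
    {"exposure": 2, "outcome": 1, "count": 1},
    {"exposure": 2, "outcome": 2, "count": 4},
]


def expand_weighted(rows, weight_key="count"):
    """Return one record per individual, repeating each line COUNT times."""
    expanded = []
    for row in rows:
        record = {k: v for k, v in row.items() if k != weight_key}
        expanded.extend(dict(record) for _ in range(row[weight_key]))
    return expanded


cases = expand_weighted(rows)
print(len(cases))  # total number of individuals represented
```

An analysis run on `cases` sees each outcome as many times as its COUNT, which is the behavior the slide describes for weighted cases.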

Example Grapefruit Juice Study
To import an EXCEL file, click on: FILE → OPEN → DATA, then change FILES OF TYPE to EXCEL (.xls)
To import a TEXT or DATA file, click on: FILE → OPEN → DATA, then change FILES OF TYPE to TEXT (.txt) or DATA (.dat)
You will be prompted through a series of dialog boxes to import the dataset

Descriptive Statistics-Numeric Data
After importing your dataset and providing names to variables, click on: ANALYZE → DESCRIPTIVE STATISTICS → DESCRIPTIVES
Choose any variables to be analyzed and place them in box on right
Options include:

Example Grapefruit Juice Study

Descriptive Statistics-General Data
After importing your dataset and providing names to variables, click on: ANALYZE → DESCRIPTIVE STATISTICS → FREQUENCIES
Choose any variables to be analyzed and place them in box on right
Options include (For Categorical Variables):
–Frequency Tables
–Pie Charts, Bar Charts
Options include (For Numeric Variables):
–Frequency Tables (Useful for discrete data)
–Measures of Central Tendency, Dispersion, Percentiles
–Pie Charts, Histograms

Example Smoking Status

Vertical Bar Charts and Pie Charts
After importing your dataset and providing names to variables, click on:
GRAPHS → BAR… → SIMPLE (Summaries for Groups of Cases) → DEFINE
Bars Represent N of Cases (or % of Cases)
Put the variable of interest as the CATEGORY AXIS
GRAPHS → PIE… (Summaries for Groups of Cases) → DEFINE
Slices Represent N of Cases (or % of Cases)
Put the variable of interest as the DEFINE SLICES BY

Example Antibiotic Study

Histograms
After importing your dataset and providing names to variables, click on: GRAPHS → HISTOGRAM
Select Variable to be plotted
Click on DISPLAY NORMAL CURVE if you want a normal curve superimposed (see Chapter 3).

Example Drug Approval Times

Side-by-Side Bar Charts
After importing your dataset and providing names to variables, click on: GRAPHS → BAR… → CLUSTERED (Summaries for Groups of Cases) → DEFINE
Bars Represent N of Cases (or % of Cases)
CATEGORY AXIS: Variable that represents groups to be compared (independent variable)
DEFINE CLUSTERS BY: Variable that represents outcomes of interest (dependent variable)

Example Streptomycin Study

Scatterplots
After importing your dataset and providing names to variables, click on: GRAPHS → SCATTER → SIMPLE → DEFINE
For Y-AXIS, choose the Dependent (Response) Variable
For X-AXIS, choose the Independent (Explanatory) Variable

Example Theophylline Clearance

Scatterplots with 2 Independent Variables
After importing your dataset and providing names to variables, click on: GRAPHS → SCATTER → SIMPLE → DEFINE
For Y-AXIS, choose the Dependent Variable
For X-AXIS, choose the Independent Variable with the most levels
For SET MARKERS BY, choose the Independent Variable with the fewest levels

Example Theophylline Clearance

Contingency Tables for Conditional Probabilities
After importing your dataset and providing names to variables, click on: ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS
For ROWS, select the variable you are conditioning on (Independent Variable)
For COLUMNS, select the variable you are finding the conditional probability of (Dependent Variable)
Click on CELLS
Click on ROW Percentages
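
The row percentages requested above are estimated conditional probabilities of the column variable given the row variable. A minimal Python sketch of that arithmetic follows; the observations and the helper `row_percentages` are hypothetical.

```python
from collections import Counter

# Hypothetical (row, column) observations: the variable being conditioned on
# comes first, the outcome second.
pairs = [
    ("drinker", "died"), ("drinker", "alive"), ("drinker", "alive"),
    ("drinker", "alive"), ("abstainer", "died"), ("abstainer", "alive"),
]


def row_percentages(pairs):
    """Return {row: {column: percent}}; each row's percentages sum to 100."""
    cell_counts = Counter(pairs)
    row_totals = Counter(r for r, _ in pairs)
    table = {}
    for (r, c), n in cell_counts.items():
        table.setdefault(r, {})[c] = 100.0 * n / row_totals[r]
    return table


table = row_percentages(pairs)
print(table["drinker"]["died"])  # estimate of P(died | drinker), in percent
```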

Example Alcohol & Mortality

Independent Sample t-Test
After importing your dataset and providing names to variables, click on: ANALYZE → COMPARE MEANS → INDEPENDENT SAMPLES T-TEST
For TEST VARIABLE, Select the dependent (response) variable(s)
For GROUPING VARIABLE, Select the independent variable. Then define the names of the 2 levels to be compared (this can be used even when the full dataset has more than 2 levels for the independent variable).
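
For reference, the equal-variance (pooled) form of the two-sample t statistic can be computed by hand as sketched below. This is an illustration in Python with hypothetical data and a hypothetical helper `pooled_t`; it returns only the statistic and degrees of freedom, not a P-value.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / se, df


group_a = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical responses, level 1
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]  # hypothetical responses, level 2
t_stat, df = pooled_t(group_a, group_b)
print(t_stat, df)
```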

Example Levocabastine in Renal Patients

Wilcoxon Rank-Sum/Mann-Whitney Tests
After importing your dataset and providing names to variables, click on: ANALYZE → NONPARAMETRIC TESTS → 2 INDEPENDENT SAMPLES
For TEST VARIABLE, Select the dependent (response) variable(s)
For GROUPING VARIABLE, Select the independent variable. Then define the names of the 2 levels to be compared (this can be used even when the full dataset has more than 2 levels for the independent variable).
Click on MANN-WHITNEY U
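
The Mann-Whitney U statistic counts, over all pairs of one observation from each group, how often the first group's value exceeds the second's. A small Python sketch with hypothetical data (the helper `mann_whitney_u` is illustrative, and which of the two U values a given package reports is not assumed here):

```python
def mann_whitney_u(sample1, sample2):
    """Return (U1, U2); a tie between the two groups counts as 0.5."""
    u1 = 0.0
    for x in sample1:
        for y in sample2:
            if x > y:
                u1 += 1.0
            elif x == y:
                u1 += 0.5
    u2 = len(sample1) * len(sample2) - u1
    return u1, u2


group_a = [1.1, 2.3, 3.0]        # hypothetical
group_b = [2.0, 4.5, 5.2, 6.1]   # hypothetical
u1, u2 = mann_whitney_u(group_a, group_b)
print(u1, u2)
```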

Example Levocabastine in Renal Patients

Paired t-test
After importing your dataset and providing names to variables, click on: ANALYZE → COMPARE MEANS → PAIRED SAMPLES T-TEST
For PAIRED VARIABLES, Select the two dependent (response) variables (the analysis will be based on first variable minus second variable)
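
Because the analysis is based on the first variable minus the second, the paired t statistic is a one-sample t statistic on the differences. A Python sketch with hypothetical measurements (the helper `paired_t` is illustrative):

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) computed from the differences first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


first = [10, 12, 9, 11]   # hypothetical: condition 1, one value per subject
second = [8, 11, 9, 8]    # hypothetical: condition 2, same subjects
t_stat, df = paired_t(first, second)
print(t_stat, df)
```

Swapping the order of the two variables changes only the sign of t.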

Example C max in SRC&IRC Codeine

Wilcoxon Signed-Rank Test
After importing your dataset and providing names to variables, click on: ANALYZE → NONPARAMETRIC TESTS → 2 RELATED SAMPLES
For PAIRED VARIABLES, Select the two dependent (response) variables (be careful in determining the order in which the differences are obtained; it will be clear on the output)
Click on WILCOXON Option

Example t 1/2 SS in SRC&IRC Codeine

Relative Risks and Odds Ratios
After importing your dataset and providing names to variables, click on: ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS
For ROWS, Select the Independent Variable
For COLUMNS, Select the Dependent Variable
Under STATISTICS, Click on RISK
Under CELLS, Click on OBSERVED and ROW PERCENTAGES
NOTE: You will want to code the data so that the outcome present (Success) category has the lower value (e.g. 1) and the outcome absent (Failure) category has the higher value (e.g. 2). Similar for Exposure present category (e.g. 1) and exposure absent (e.g. 2). Use Value Labels to keep output straight.
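
With the coding in the NOTE (exposure present in the first row, outcome present in the first column), the relative risk and odds ratio follow directly from the four cell counts. A Python sketch with hypothetical counts; the helper `risk_measures` is illustrative:

```python
def risk_measures(a, b, c, d):
    """2x2 table with rows = exposure present/absent, columns = outcome present/absent.

    a, b: exposed with / without the outcome
    c, d: unexposed with / without the outcome
    Returns (relative risk, odds ratio).
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed, (a * d) / (b * c)


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
rr, odds_ratio = risk_measures(20, 80, 10, 90)
print(rr, odds_ratio)
```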

Example Pamidronate Study

Example Lip Cancer

Fisher’s Exact Test
After importing your dataset and providing names to variables, click on: ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS
For ROWS, Select the Independent Variable
For COLUMNS, Select the Dependent Variable
Under STATISTICS, Click on CHI-SQUARE
Under CELLS, Click on OBSERVED and ROW PERCENTAGES
NOTE: You will want to code the data so that the outcome present (Success) category has the lower value (e.g. 1) and the outcome absent (Failure) category has the higher value (e.g. 2). Similar for Exposure present category (e.g. 1) and exposure absent (e.g. 2). Use Value Labels to keep output straight.
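
For a 2x2 table, Fisher's exact P-value comes from the hypergeometric distribution of the first cell with all margins fixed. The Python sketch below uses one common two-sided definition (sum the probabilities of all tables no more likely than the observed one); that choice, the counts, and the helper `fisher_exact_two_sided` are assumptions for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so tables tied with the observed one are included.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


p_value = fisher_exact_two_sided(3, 1, 1, 3)  # hypothetical counts
print(p_value)
```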

Example Antiseptic Experiment

McNemar’s Test
After importing your dataset and providing names to variables, click on: ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS
For ROWS, Select the outcome for condition/time 1
For COLUMNS, Select the outcome for condition/time 2
Under STATISTICS, Click on MCNEMAR
Under CELLS, Click on OBSERVED and TOTAL PERCENTAGES
NOTE: You will want to code the data so that the outcome present (Success) category has the lower value (e.g. 1) and the outcome absent (Failure) category has the higher value (e.g. 2). Similar for Exposure present category (e.g. 1) and exposure absent (e.g. 2). Use Value Labels to keep output straight.
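
McNemar's test depends only on the two discordant cells of the paired table. The sketch below computes the exact two-sided P-value from the Binomial(n, 0.5) distribution of the discordant pairs; the counts and the helper `mcnemar_exact` are hypothetical, and the exact (rather than chi-square) form is an assumption about which version is wanted.

```python
from math import comb


def mcnemar_exact(b, c):
    """Exact two-sided McNemar P-value.

    b: pairs with outcome present at time 1 only
    c: pairs with outcome present at time 2 only
    """
    n = b + c
    k = min(b, c)
    # Lower-tail probability of at most k discordant pairs of one type.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p_value = mcnemar_exact(2, 8)  # hypothetical discordant counts
print(p_value)
```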

Example Report of Implant Leak P-value

Cochran Mantel-Haenszel Test
After importing your dataset and providing names to variables, click on: ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS
For ROWS, Select the Independent Variable
For COLUMNS, Select the Dependent Variable
For LAYERS, Select the Strata Variable
Under STATISTICS, Click on COCHRAN’S AND MANTEL-HAENSZEL STATISTICS
NOTE: You will want to code the data so that the outcome present (Success) category has the lower value (e.g. 1) and the outcome absent (Failure) category has the higher value (e.g. 2). Similar for Exposure present category (e.g. 1) and exposure absent (e.g. 2). Use Value Labels to keep output straight.

Example 5.7 Smoking/Death by Age

Chi-Square Test
After importing your dataset and providing names to variables, click on: ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS
For ROWS, Select the Independent Variable
For COLUMNS, Select the Dependent Variable
Under STATISTICS, Click on CHI-SQUARE
Under CELLS, Click on OBSERVED, EXPECTED, ROW PERCENTAGES, and ADJUSTED STANDARDIZED RESIDUALS
NOTE: Large ADJUSTED STANDARDIZED RESIDUALS (in absolute value) show which cells are inconsistent with the null hypothesis of independence. A common rule of thumb is to check whether any cells have values greater than 3 in absolute value.
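
The quantities requested above (expected counts, the Pearson chi-square statistic, and adjusted standardized residuals) can be reproduced from the observed table alone. A Python sketch with a hypothetical table; the helper `chi_square` is illustrative and returns no P-value.

```python
from math import sqrt


def chi_square(table):
    """Return (statistic, df, adjusted standardized residuals) for an r x c table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    adj_resid = []
    for i, row in enumerate(table):
        resid_row = []
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n  # expected count under independence
            stat += (obs - exp) ** 2 / exp
            denom = sqrt(exp * (1 - row_tot[i] / n) * (1 - col_tot[j] / n))
            resid_row.append((obs - exp) / denom)
        adj_resid.append(resid_row)
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df, adj_resid


stat, df, adj_resid = chi_square([[30, 10], [10, 30]])  # hypothetical counts
print(stat, df, adj_resid[0][0])
```

In this hypothetical table every adjusted residual exceeds 3 in absolute value, so all four cells would be flagged by the rule of thumb.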

Example Marital Status & Cancer

Goodman & Kruskal’s γ / Kendall’s τ-b
After importing your dataset and providing names to variables, click on: ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS
For ROWS, Select the Independent Variable
For COLUMNS, Select the Dependent Variable
Under STATISTICS, Click on GAMMA and KENDALL’S τ-b

Examples 5.9,10 - Nicotine Patch/Exhaustion

Kruskal-Wallis Test
After importing your dataset and providing names to variables, click on: ANALYZE → NONPARAMETRIC TESTS → k INDEPENDENT SAMPLES
For TEST VARIABLE, Select Dependent Variable
For GROUPING VARIABLE, Select Independent Variable, then define range of levels of variable (Minimum and Maximum)
Click on KRUSKAL-WALLIS H

Example Antibiotic Delivery
Note: This statistic makes the adjustment for ties. See Hollander and Wolfe (1973), p. 140.

Cohen’s κ
After importing your dataset and providing names to variables, click on: ANALYZE → DESCRIPTIVE STATISTICS → CROSSTABS
For ROWS, Select Rater 1
For COLUMNS, Select Rater 2
Under STATISTICS, Click on KAPPA
Under CELLS, Click on TOTAL Percentages to get the observed percentages in each cell (the first number under observed count in Table 5.17).
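
Kappa compares the observed agreement (the diagonal of the total-percentage table) with the agreement expected by chance from the two raters' marginal proportions. A Python sketch with a hypothetical ratings table; the helper `cohens_kappa` is illustrative.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table (rows = Rater 1, columns = Rater 2)."""
    n = sum(sum(r) for r in table)
    p_obs = sum(table[i][i] for i in range(len(table))) / n
    row_prop = [sum(r) / n for r in table]
    col_prop = [sum(c) / n for c in zip(*table)]
    # Chance agreement: both raters pick the same category independently.
    p_exp = sum(r * c for r, c in zip(row_prop, col_prop))
    return (p_obs - p_exp) / (1 - p_exp)


kappa = cohens_kappa([[20, 5], [10, 15]])  # hypothetical counts
print(kappa)
```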

Example Siskel & Ebert

1-Factor ANOVA - Independent Samples (Parallel Groups)
After importing your dataset and providing names to variables, click on: ANALYZE → COMPARE MEANS → ONE-WAY ANOVA
For DEPENDENT LIST, Click on the Dependent Variable
For FACTOR, Click on the Independent Variable
To obtain Pairwise Comparisons of Treatment Means:
–Click on POST HOC
–Then TUKEY and BONFERRONI (among many other choices)
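
The F statistic in the one-way ANOVA table is the ratio of the between-groups mean square to the within-groups mean square. A Python sketch with hypothetical data; the helper `one_way_anova` is illustrative and does not perform the post hoc comparisons.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical responses for three treatment groups.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
```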

Examples 6.1,2 - HIV Clinical Trial

Kruskal-Wallis Test
After importing your dataset and providing names to variables, click on: ANALYZE → NONPARAMETRIC TESTS → k INDEPENDENT SAMPLES
For TEST VARIABLE, Select Dependent Variable
For GROUPING VARIABLE, Select Independent Variable, then define range of levels of variable (Minimum and Maximum)
Click on KRUSKAL-WALLIS H

Example 6.2(a) - Thalidomide and HIV-1

Randomized Block Design - F-test
After importing your dataset and providing names to variables, click on: ANALYZE → GENERAL LINEAR MODEL → UNIVARIATE
Assign the DEPENDENT VARIABLE
Assign the TREATMENT variable as a FIXED FACTOR
Assign the BLOCK variable as a RANDOM FACTOR
Click on MODEL, then CUSTOM, under BUILD TERMS choose MAIN EFFECTS, move both factors to MODEL list
Click on POST HOC and select the TREATMENT factor for POST HOC TESTS and BONFERRONI and TUKEY (among many choices)
For PLOTS, Select the BLOCK factor for HORIZONTAL AXIS and the TREATMENT factor for SEPARATE LINES, click ADD

Example Theophylline Clearance

Randomized Block Design - Friedman’s test
After importing your dataset and providing names to variables, click on: ANALYZE → NONPARAMETRIC TESTS → k RELATED SAMPLES
For TEST VARIABLES, select the variables representing the treatments (each line is a subject/block)
Click on FRIEDMAN

Example Absorption of Valproate Depakote
Note: This makes an adjustment for ties. See Hollander and Wolfe (1973), p. 140.

2-Way ANOVA
After importing your dataset and providing names to variables, click on: ANALYZE → GENERAL LINEAR MODEL → UNIVARIATE
Assign the DEPENDENT VARIABLE
Assign the FACTOR A variable as a FIXED FACTOR
Assign the FACTOR B variable as a FIXED FACTOR
Click on MODEL, then CUSTOM, select FULL FACTORIAL
Click on POST HOC and select both factors for POST HOC TESTS and BONFERRONI and TUKEY (among many choices)
For PLOTS, Select FACTOR B for HORIZONTAL AXIS and FACTOR A for SEPARATE LINES, click ADD

Example Nortriptyline Clearance

Linear Regression
After importing your dataset and providing names to variables, click on: ANALYZE → REGRESSION → LINEAR
Select the DEPENDENT VARIABLE
Select the INDEPENDENT VARIABLE(S)
Click on STATISTICS, then ESTIMATES, CONFIDENCE INTERVALS, MODEL FIT
For histogram of residuals, click on PLOTS, and HISTOGRAM under STANDARDIZED RESIDUAL PLOTS
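
For a single independent variable, the least squares estimates behind the ESTIMATES output have a closed form. The Python sketch below computes the intercept, slope, and raw residuals for hypothetical data; the helper `least_squares` is illustrative and omits confidence intervals and model-fit statistics.

```python
from statistics import mean


def least_squares(x, y):
    """Return (intercept, slope, residuals) for simple linear regression of y on x."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


x = [1, 2, 3, 4]            # hypothetical independent variable
y = [2.1, 3.9, 6.2, 7.8]    # hypothetical dependent variable
intercept, slope, residuals = least_squares(x, y)
print(intercept, slope)
```

The residuals returned here are unstandardized; a histogram of them has the same shape as the standardized residual histogram, only on a different scale.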

Examples Gemfibrozil Clearance

Example TB/Thalidomide in HIV

Useful Regression Plots
Scatterplot with Fitted (Least Squares) Line
–GRAPHS → INTERACTIVE → SCATTERPLOT
–Select DEPENDENT VARIABLE for UP/DOWN AXIS
–Select INDEPENDENT VARIABLE for RIGHT/LEFT AXIS
–Click on FIT Tab, then REGRESSION for METHOD
–NOTE: Be certain both variables are SCALE in VARIABLE VIEW under MEASURE
Partial Regression Plots (Multiple Regression) to observe association of each Independent Variable with Y, controlling for all others
–Fit REGRESSION model with all Independent Variables
–Click PLOTS, then PRODUCE ALL PARTIAL PLOTS

Example Gemfibrozil Scatterplot

Logistic Regression
After importing your dataset and providing names to variables, click on: ANALYZE → REGRESSION → BINARY LOGISTIC
Select the DEPENDENT VARIABLE
Select the INDEPENDENT VARIABLE(S) as COVARIATES
For a 95% CI for the odds ratio, click on OPTIONS, then CI for exp(B)
Declare any CATEGORICAL COVARIATES (Independent variables whose levels are categorical, not numeric)

Example Navelbine Toxicity
Omnibus test for all regression coefficients (like F in linear regression)

Example CHD, BP, Cholesterol

Nonlinear Regression
After importing your dataset and providing names to variables, click on: ANALYZE → REGRESSION → NONLINEAR
Select the DEPENDENT VARIABLE
Define the MODEL EXPRESSION as a function of the INDEPENDENT VARIABLE(s) and unknown PARAMETERS
Define the PARAMETERS and give them STARTING VALUES (this may take several attempts)

Example MK-639 in AIDS Patients
[SPSS output: Nonlinear Regression Summary Statistics for Dependent Variable RNACHNG. Table of Source, DF, Sum of Squares, Mean Square with rows Regression, Residual, Uncorrected Total, (Corrected Total); R squared = 1 - Residual SS / Corrected SS; Estimate, Asymptotic Std. Error, and Asymptotic 95% Confidence Interval (Lower, Upper) for Parameters A, B, C. Values not shown.]

Survival Analysis - Kaplan-Meier Estimates and Log-Rank Test
After importing your dataset and providing names to variables, click on: ANALYZE → SURVIVAL → KAPLAN-MEIER
Select the variable representing the survival TIME of individual
Select the variable representing the STATUS of individual (whether or not event has occurred). NOTE: If the variable is an indicator that the observation was CENSORED, then a value of 0 for that variable will mean the event has occurred.
Select the variable representing the FACTOR containing the groups to be compared
Click on COMPARE FACTOR, select LOG-RANK, and POOL ACROSS STRATA
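
The Kaplan-Meier (product-limit) estimate multiplies, at each distinct event time, the fraction of individuals at risk who survive that time. The Python sketch below handles a single group with hypothetical data; the helper `kaplan_meier` is illustrative, and it assumes the status is coded 1 = event occurred, 0 = censored (the reverse of a CENSORED indicator).

```python
def kaplan_meier(times, events):
    """Return [(time, cumulative survival)] at each distinct event time.

    events[i] is 1 if the event occurred at times[i], 0 if censored (assumed coding).
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


times = [2, 3, 3, 5, 8]    # hypothetical survival times
events = [1, 1, 0, 1, 0]   # hypothetical status: 1 = event, 0 = censored
curve = kaplan_meier(times, events)
print(curve)
```

Censored observations never lower the curve directly; they only reduce the number at risk for later event times.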

Examples Navelbine and Taxol in Mice
[SPSS output: Survival Analysis for TIME, listed separately for Factor REGIMEN = 1 and Factor REGIMEN = 2, with columns Time, Status, Cumulative Survival, Standard Error, Cumulative Events, Number Remaining. Values not shown.]

Examples Navelbine and Taxol in Mice
[SPSS output: Test Statistics for Equality of Survival Distributions for REGIMEN, with columns Statistic, df, Significance for the Log Rank test. Values not shown.]
The Log Rank statistic is the square of the Z-statistic in the text, and is a chi-square statistic

Relative Risk Regression (Cox Model)
After importing your dataset and providing names to variables, click on: ANALYZE → SURVIVAL → COX REGRESSION
Select the variable representing the survival TIME of individual
Select the variable representing the STATUS of individual (whether or not event has occurred). NOTE: If the variable is an indicator that the observation was CENSORED, then a value of 0 for that variable will mean the event has occurred.
Select the variable(s) representing the COVARIATES (Independent Variables in Model)
Identify any CATEGORICAL COVARIATES including Dummy/Indicator variables
K-M PLOTS can be obtained, with separate SURVIVAL curves by categories

Example MP vs Placebo