Statistics for clinicians l Biostatistics course by Kevin E. Kip, Ph.D., FAHA Professor and Executive Director, Research Center University of South Florida,

1 Statistics for clinicians l Biostatistics course
by Kevin E. Kip, Ph.D., FAHA
Professor and Executive Director, Research Center, University of South Florida, College of Nursing
Professor, College of Public Health, Department of Epidemiology and Biostatistics
Associate Member, Byrd Alzheimer’s Institute, Morsani College of Medicine
Tampa, FL, USA

2 SECTION 1.1 Module Overview and Introduction Introduction to biostatistics, descriptive statistics, SPSS, and Power Point.

3 SECTION 1.4 Introduction to SPSS

4 Introduction to SPSS
• Database structure
• Data view and variable view
• Variable names, labels, and formats
• Interactive menus
• SPSS syntax generated from interactive analyses

5 SECTION 1.5 Summarizing Data in Charts

6 Summarizing Data – Charts
1. One categorical, >1 proportion/percentage
   (i) Bar chart
   (ii) Stacked bar chart
   (iii) Stacked bar chart (100%)
2. One categorical, >1 continuous variable
   (i) Box plot
   (ii) High-low
   (iii) Line
   (iv) Kernel-density plots
3. Two continuous variables
   (i) X-Y scatter
   (ii) Histogram (can be used for 1 variable)

7 1. One categorical, >1 proportion/percentage
(i) Bar chart
• Rectangular bars with lengths proportional to the values that they represent.
• Bars can be plotted vertically or horizontally.

8 1. One categorical, >1 proportion/percentage
(ii) Stacked bar chart
• Can show counts or percentages.
• The stacks do not have to sum to a specified value (unlike the 100% stacked bar chart).
[Figure: stacked bar chart of % obese by age group]

9 1. One categorical, >1 proportion/percentage
(iii) Stacked bar chart (100%)
Bar Charts and Stacked Bar Charts
• Important to choose between row and column percentages.
• Example: Race and blood pressure classification
• Usually, the row variable is the “predictor” and the column variable is the “outcome”.
SPSS: Analyze > Descriptive Statistics > Crosstabs

10 Bar Charts and Stacked Bar Charts
Column Percentage:
SPSS:
  CROSSTABS
    /TABLES=SCR_RACECAT3 BY SCR_BP_CLASS4
    /FORMAT=AVALUE TABLES
    /CELLS=COUNT COLUMN
    /COUNT ROUND CELL
    /BARCHART.

Race * BP classification Crosstabulation
(% within BP classification; counts not shown)

Race    Normal   Prehypertensive   Hypertensive Stage 1   Hypertensive Stage 2   Total
White   65.2%    58.3%             49.8%                  38.0%                  54.4%
Black   30.9%    38.5%             46.6%                  59.6%                  42.3%
Other    4.0%     3.2%              3.6%                   2.4%                   3.4%
Total  100.0%   100.0%            100.0%                 100.0%                 100.0%

11 Difficult to identify trends

12 Bar Charts and Stacked Bar Charts
Row Percentage:
SPSS:
  CROSSTABS
    /TABLES=SCR_RACECAT3 BY SCR_BP_CLASS4
    /FORMAT=AVALUE TABLES
    /CELLS=COUNT ROW
    /COUNT ROUND CELL
    /BARCHART.
Use row percentages in stacked bar chart (PP)
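The difference between row and column percentages can be sketched outside SPSS. The following is a minimal Python illustration, not the course's method; the counts are hypothetical (the original cell counts are not available here), and the function names are made up for this example.

```python
# Hypothetical counts (NOT the course data): rows = race, columns = BP class.
counts = {
    "White": {"Normal": 60, "Hypertensive": 40},
    "Black": {"Normal": 30, "Hypertensive": 70},
}

def row_percentages(table):
    """Each cell as a percent of its row total (outcome distribution per predictor level)."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: 100.0 * n / total for col, n in cells.items()}
    return result

def column_percentages(table):
    """Each cell as a percent of its column total (predictor mix within each outcome)."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        row: {col: 100.0 * n / col_totals[col] for col, n in cells.items()}
        for row, cells in table.items()
    }

row_pct = row_percentages(counts)
col_pct = column_percentages(counts)
print(round(row_pct["Black"]["Hypertensive"], 1))  # percent of Black subjects who are hypertensive
print(round(col_pct["Black"]["Hypertensive"], 1))  # percent of hypertensive subjects who are Black
```

Row percentages answer "what share of each race group falls in each BP class", which is why they suit a 100% stacked bar chart with the predictor on the axis.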

13 Power Point Chart Column 100% Stacked Column

14 Power Point Chart (Practice)
Column - 100% Stacked Column
Display Quality of Life (QOL) from Poor to Excellent by Gender
• Column Percentages for QOL
• Row Percentages for QOL

15 Power Point Chart Column 100% Stacked Column

16 Power Point Chart Column 100% Stacked Column

17 2. One categorical, >1 continuous variable
(i) Box plot
• Also known as a box-and-whisker diagram.
• Displays 5 summary statistics: minimum, lower quartile (Q1), median (Q2), upper quartile (Q3), and maximum.
• Makes no assumptions about the underlying statistical distribution (non-parametric).
SPSS: Graphs > Chart Builder > Boxplot
Example: HDL cholesterol (continuous) distribution by gender (categorical)
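As a rough illustration of the five statistics a box plot displays, here is a minimal Python sketch. The HDL values are hypothetical, and the "inclusive" quartile method is an assumption; statistical packages use several quartile conventions, so SPSS output may differ slightly.

```python
import statistics

def five_number_summary(values):
    """Minimum, Q1, median (Q2), Q3, and maximum of a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}

# Hypothetical HDL cholesterol values (mg/dl), for illustration only.
hdl = [30, 38, 40, 44, 46, 50, 56, 60, 86]
summary = five_number_summary(hdl)
print(summary)
```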

18 2. One categorical, >1 continuous variable
(i) Box plot
Question: Are HDL cholesterol levels positively or negatively skewed?
Run the SPSS Frequencies procedure.

19 2. One categorical, >1 continuous variable
(i) Box plot
Question: Are triglycerides positively or negatively skewed?
Run the SPSS Frequencies procedure.
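One way to answer a skewness question numerically is to compare the mean with the median and look at the sign of a skewness statistic. The sketch below uses a simple moment-based skewness and hypothetical triglyceride values; SPSS may apply a small-sample adjustment, so its reported value can differ somewhat.

```python
import statistics

def moment_skewness(values):
    """Simple moment-based skewness; a value > 0 indicates a longer right tail."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical triglyceride values (mg/dl), for illustration only.
triglycerides = [80, 95, 100, 110, 120, 130, 150, 200, 400]
mean = statistics.fmean(triglycerides)
median = statistics.median(triglycerides)
skew = moment_skewness(triglycerides)
direction = "positive" if skew > 0 else "negative" if skew < 0 else "none"
print(f"mean={mean:.1f} median={median} skew direction={direction}")
```

A mean well above the median, together with a positive skewness value, points to positive (right) skew.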

20 2. One categorical, >1 continuous variable
(i) Box plot (Practice)
Draw a box plot of the distribution of HDL cholesterol by ethnicity:
Hispanic:     Min=30, Q1=40, Q2=46, Q3=56, Max=86
Non-Hispanic: Min=21, Q1=46, Q2=56, Q3=66, Max=131
Example:

21 2. One categorical, >1 continuous variable
(i) Box plot (Practice)
Draw a box plot of the distribution of HDL cholesterol by ethnicity:
Hispanic:     Min=30, Q1=40, Q2=46, Q3=56, Max=86
Non-Hispanic: Min=21, Q1=46, Q2=56, Q3=66, Max=131
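Before drawing the practice box plots by hand, it can help to work out the box and whisker lengths. The sketch below uses the five statistics given in the exercise; the function name is made up for this example, and reading a longer upper whisker as a sign of positive skew is an informal rule of thumb rather than a formal test.

```python
# Five-number summaries taken from the practice exercise.
hispanic = {"min": 30, "q1": 40, "median": 46, "q3": 56, "max": 86}
non_hispanic = {"min": 21, "q1": 46, "median": 56, "q3": 66, "max": 131}

def box_plot_geometry(s):
    """Lengths of the box (IQR) and of the two whiskers drawn from min to max."""
    return {
        "iqr": s["q3"] - s["q1"],
        "lower_whisker": s["q1"] - s["min"],
        "upper_whisker": s["max"] - s["q3"],
    }

for label, stats in (("Hispanic", hispanic), ("Non-Hispanic", non_hispanic)):
    geometry = box_plot_geometry(stats)
    # Informal reading: a longer upper whisker suggests a longer right tail.
    longer_tail = "upper" if geometry["upper_whisker"] > geometry["lower_whisker"] else "lower"
    print(label, geometry, "longer tail:", longer_tail)
```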

22 2. One categorical, >1 continuous variable
(ii) High-low
• Can “trick” Power Point into using an open-high-low-close chart (normally used for financial data) to show distributions of continuous variables.
• The upper and lower ends (high-low) can represent any percentiles, such as the 5th and 95th percentiles.

24 [Figure: high-low chart of total cholesterol (mg/dl) by self-reported race (White, Black) and by admixture-defined group (EU>85%, EU>40%, EU>25%, EU<40%, EU<25%). N: (753) (464) (753) (68) (201) (195). P=0.003; P trend=0.009]
The filled rectangles depict the interquartile range (25th to 75th percentile). The lower and upper limits of the vertical lines depict the 5th and 95th percentiles, respectively.

25 [Figure: high-low chart of total cholesterol (mg/dl). N=594, N=546, N=80, N=111]
U.S. Black vs. Ghana Urban: P=
U.S. Black vs. Ghana Rural: P<
Ghana Urban vs. Ghana Rural: P<
The filled rectangles depict the interquartile range (25th to 75th percentile). The lower and upper limits of the vertical lines depict the 5th and 95th percentiles, respectively.

26 Total Cholesterol (mg/dl): Practice in Power Point (first draw by hand)
The filled rectangles depict the interquartile range (25th to 75th percentile). The lower and upper limits of the vertical lines depict the 5th and 95th percentiles, respectively.
          5%    25%    75%    95%
Male
Female

27 Total Cholesterol (mg/dl): Practice in Power Point
The filled rectangles depict the interquartile range (25th to 75th percentile). The lower and upper limits of the vertical lines depict the 5th and 95th percentiles, respectively.
          5%    25%    75%    95%
Male
Female
“Trick” Power Point (column order for the open-high-low-close chart):
          Open   High   Low   Close
          25%    95%    5%    75%
Male
Female
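The reordering of percentiles into open-high-low-close columns can be sketched as follows. This is an illustration under stated assumptions: the cholesterol values are hypothetical, the function name is made up, and the mapping Open = 25th, High = 95th, Low = 5th, Close = 75th follows the caption (box = interquartile range, line = 5th to 95th percentile).

```python
import statistics

def percentile_ohlc(values):
    """Arrange the 5th/25th/75th/95th percentiles in open-high-low-close order."""
    # n=20 yields the 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    p5, p25, p75, p95 = cuts[0], cuts[4], cuts[14], cuts[18]
    # Assumed mapping: the "body" (open to close) is the IQR, the "wick" is 5th to 95th.
    return {"Open": p25, "High": p95, "Low": p5, "Close": p75}

# Hypothetical total cholesterol values (mg/dl): every integer from 100 to 300.
cholesterol = list(range(100, 301))
row = percentile_ohlc(cholesterol)
print(row)
```

Each group (for example, Male and Female) would get one such row in the chart's data sheet.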

28 2. One categorical, >1 continuous variable
(iii) Line chart
• Typically represents a trend in data over intervals of time (i.e., a time series).
• Often used to show repeated health outcome measurements over time.
[Figure: line chart of prevalence of use (%) of Crohn’s disease medications]

29 In this example, the “categorical” variable is individual subject nested within each treatment arm of the trial

30 2. One categorical, >1 continuous variable
(iv) Kernel density plots
• Like a histogram, but constructs a “smooth” probability density function.
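To show what "smooth" means here, the sketch below builds a kernel density estimate by placing a small Gaussian curve on every observation and averaging them. The Gaussian kernel, the bandwidth of 5, and the HDL values are all assumptions chosen for illustration; plotting software typically picks the bandwidth automatically.

```python
import math

def gaussian_kde(values, bandwidth):
    """Return a function giving the estimated density at any point x."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian bumps centred on each observation.
        return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return density

# Hypothetical HDL cholesterol values (mg/dl), for illustration only.
hdl = [35, 40, 42, 45, 46, 50, 55, 56, 60, 75]
density = gaussian_kde(hdl, bandwidth=5.0)

# Evaluate on a grid from 0 to 120 and check that the curve encloses an area of about 1.
step = 0.5
grid = [i * step for i in range(241)]
area = sum(density(x) for x in grid) * step
print(round(area, 3))
```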

31 3. Two continuous variables
(i) X-Y scatter
• Shows the relationship between two sets of continuous data.
• Also called a scatter chart, scattergram, scatter diagram, or scatter graph.
[Figure: scatter plot of body density and body mass index]

32 3. Two continuous variables
(ii) Histogram(s)
• Probability distribution of one or more continuous variables displayed over discrete intervals (bins).
• The bins contain frequency counts, or can be normalized to display relative frequencies (i.e., the proportion of cases that fall into each bin, with total area = 1.0).
[Figure: histogram; y-axis: # subjects]

33 3. Two continuous variables
(ii) Histogram(s)
• Probability distribution of one or more continuous variables displayed over discrete intervals (bins).
• The bins contain frequency counts, or can be normalized to display relative frequencies (i.e., the proportion of cases that fall into each bin, with total area = 1.0).
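The two normalizations mentioned above can be sketched as follows: relative frequencies (bar heights that sum to 1) and density heights (bar areas that sum to 1). The BMI values and bin settings are hypothetical, and the function name is made up for this example.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall in each bin [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - start) // width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

# Hypothetical body mass index values, for illustration only.
bmi = [21, 23, 24, 26, 27, 27, 29, 31, 33, 38]
width = 5
counts = histogram_counts(bmi, start=20, width=width, n_bins=4)

n = sum(counts)
relative = [c / n for c in counts]           # proportions: heights sum to 1
density = [c / (n * width) for c in counts]  # densities: height * width sums to 1
print(counts, relative, density)
```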

34 SECTION 1.6 SPSS Data Manipulation

35 SPSS Data Manipulation and Syntax Editor
1. Recode continuous variable into arbitrarily-defined or pre-defined categories
2. Visual binning of continuous variable
3. Transform a skewed variable
4. Using the SPSS Data Editor

36 SPSS Data Manipulation and Syntax Editor
1. Recode continuous variable into arbitrarily-defined or pre-defined categories
Example: Define age into 3 categories (arbitrary): [...], [...], and 65 and older
SPSS: Transform > Recode into Different Variables
• Input variable is age
• Output variable
    Name: age_cat
    Label: Age in 3 categories
• Click on Old and New Values
• Range – specify explicitly:
    [...] = value 1
    [...] = value 2
    65 and older = value 3
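The recoding logic can be sketched in a few lines of Python. Only the "65 and older" category is stated in the example above; the lower cutpoint of 40 used here is a hypothetical placeholder, and the function name is made up for illustration.

```python
def recode_age(age, lower_cut=40, upper_cut=65):
    """Recode age in years into categories 1, 2, 3.

    upper_cut=65 comes from the slide ("65 and older = value 3");
    lower_cut=40 is a HYPOTHETICAL cutpoint chosen only for illustration.
    """
    if age is None:
        return None  # keep missing values missing
    if age < lower_cut:
        return 1
    if age < upper_cut:
        return 2
    return 3

ages = [25, 39, 40, 64, 65, 80, None]
age_cat = [recode_age(a) for a in ages]
print(age_cat)
```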

37 SPSS Data Manipulation and Syntax Editor
2. Visual binning of continuous variable
Example: Body mass index
• Put in output name for binned variable
• Make cutpoints: equal percentiles based on scanned cases
• Put in labels for frequency display in bar chart
SPSS code: Visual Binning.
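Binning on equal percentiles amounts to computing quantile cutpoints and assigning each case to the interval it falls in. The sketch below is an approximation of that idea, not SPSS's exact algorithm: the BMI values are hypothetical, the "inclusive" quantile method is an assumption, and a value equal to a cutpoint is placed in the lower bin by choice.

```python
import statistics
from bisect import bisect_left

def equal_percentile_cutpoints(values, n_bins):
    """Cutpoints that split the data into n_bins groups of roughly equal size."""
    return statistics.quantiles(values, n=n_bins, method="inclusive")

def assign_bin(value, cutpoints):
    """Bin number from 1 to len(cutpoints) + 1; a value equal to a cutpoint goes in the lower bin."""
    return bisect_left(cutpoints, value) + 1

# Hypothetical body mass index values, for illustration only.
bmi = [19, 21, 22, 24, 25, 27, 28, 30, 31, 33, 35, 40, 42]
cutpoints = equal_percentile_cutpoints(bmi, n_bins=4)
bmi_bin = [assign_bin(v, cutpoints) for v in bmi]
print(cutpoints)
print(bmi_bin)
```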

38 SPSS Data Manipulation and Syntax Editor
3. Transform a skewed variable
Descriptive statistics for triglycerides in natural scale:
• Mean, median, SD, min, max, skewness, kurtosis
• Chart = histogram with normal curve superimposed
Triglycerides are skewed. Use a transformation to create a new variable and reduce the skew in triglycerides.
SPSS: Compute Variable
  Target Variable: LOG_TRIG
  Numeric Expression: lg10(LAB_TRIG_VAP)
SPSS syntax:
  COMPUTE log_trig=lg10(LAB_TRIG_VAP).
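The effect of the base-10 log transformation on skew can be checked with a small Python sketch. The triglyceride values are hypothetical and the skewness statistic is the simple moment-based version, so the numbers will not match SPSS output exactly.

```python
import math
import statistics

def moment_skewness(values):
    """Simple moment-based skewness (third central moment over SD cubed)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical triglyceride values (mg/dl), for illustration only.
lab_trig = [80, 95, 100, 110, 120, 130, 150, 200, 400]

# Same operation as COMPUTE log_trig=lg10(LAB_TRIG_VAP).
log_trig = [math.log10(x) for x in lab_trig]

skew_raw = moment_skewness(lab_trig)
skew_log = moment_skewness(log_trig)
print(f"skewness raw={skew_raw:.2f} log10={skew_log:.2f}")
```

In this made-up sample the log scale pulls in the long right tail, so the skewness drops while remaining positive.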

39 SPSS Data Manipulation and Syntax Editor
4. Using the SPSS Data Editor
SPSS: File > New (syntax); save the file with a new name.
1. Select males only (scr_sex=1): Data > Select Cases > If scr_sex=1
     USE ALL.
     COMPUTE filter_$=(SCR_SEX=1).
     VARIABLE LABELS filter_$ 'SCR_SEX=1 (FILTER)'.
     VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
     FORMATS filter_$ (f1.0).
     FILTER BY filter_$.
     EXECUTE.
2. Run descriptives for age
3. Copy code and repeat for females (scr_sex=2)

40 SPSS Data Manipulation and Syntax Editor
4. Using the SPSS Data Editor
  USE ALL.
  COMPUTE filter_$=(SCR_SEX=1).
  VARIABLE LABELS filter_$ 'SCR_SEX=1 (FILTER)'.
  VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
  FORMATS filter_$ (f1.0).
  FILTER BY filter_$.
  EXECUTE.
  DESCRIPTIVES VARIABLES=SCR_AGE
    /STATISTICS=MEAN STDDEV MIN MAX.

  USE ALL.
  COMPUTE filter_$=(SCR_SEX=2).
  VARIABLE LABELS filter_$ 'SCR_SEX=2 (FILTER)'.
  VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
  FORMATS filter_$ (f1.0).
  FILTER BY filter_$.
  EXECUTE.
  DESCRIPTIVES VARIABLES=SCR_AGE
    /STATISTICS=MEAN STDDEV MIN MAX.
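The filter-then-describe pattern in the syntax above can be mirrored in Python for comparison. This is a sketch with hypothetical records; the variable names SCR_SEX and SCR_AGE and the codes 1 = male, 2 = female follow the slides, while the function name is made up for this example.

```python
import statistics

# Hypothetical records, for illustration only (SCR_SEX: 1 = male, 2 = female).
records = [
    {"SCR_SEX": 1, "SCR_AGE": 50},
    {"SCR_SEX": 2, "SCR_AGE": 40},
    {"SCR_SEX": 1, "SCR_AGE": 60},
    {"SCR_SEX": 2, "SCR_AGE": 45},
    {"SCR_SEX": 1, "SCR_AGE": 70},
    {"SCR_SEX": 2, "SCR_AGE": 65},
    {"SCR_SEX": 2, "SCR_AGE": 70},
]

def describe_age(rows, sex_code):
    """Mean, sample SD, min, and max of age for one sex, like FILTER plus DESCRIPTIVES."""
    ages = [r["SCR_AGE"] for r in rows if r["SCR_SEX"] == sex_code]
    return {
        "n": len(ages),
        "mean": statistics.fmean(ages),
        "stddev": statistics.stdev(ages),  # sample SD (n - 1 denominator)
        "min": min(ages),
        "max": max(ages),
    }

males = describe_age(records, sex_code=1)
females = describe_age(records, sex_code=2)
print(males)
print(females)
```

Passing the sex code as a parameter plays the role of copying the syntax block and changing `SCR_SEX=1` to `SCR_SEX=2`.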

