Introduction to STATA for Clinical Researchers Jay Bhattacharya August 2007.

1 Introduction to STATA for Clinical Researchers Jay Bhattacharya August 2007

2 What is STATA?
- A general purpose statistical analysis package used by
  – epidemiologists, demographers, clinical researchers, social scientists, many others
- Tool to graphically display data
  – Good for data exploration
  – Also good for publishing in journals

3 Why STATA?
- Easy to learn
- Powerful
- It will help you produce papers

4 Anatomy of a Clinical Research Project
- Collect (the data)
- Clean
- Explore
- Analyze
- Submit (for publication)
- Revise

5 Collect the Data
- STATA is good for analyzing
  – large secondary databases
  – smaller home grown data
- Store the data as a relational database (or maybe as a spreadsheet)
  – It’s easy to convert to STATA format from SAS and other formats

6 Clean the Data
- Merge in other sources of data
  – STATA does merges of all types, including match merge, table-lookup, and more complicated merging
- Recode variables
- Hunt for outliers
- Apply inclusion/exclusion criteria
- Treat missing values consistently
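
The cleaning steps above might look roughly like the following in a do-file. This is a sketch only: the file names (demographics.dta, body_measures.dta), the variable names (id, age, weight_kg, height_cm), the 999 "unknown" code, and the cutoffs are all hypothetical. The `merge 1:1` syntax requires Stata 11 or later, which postdates these slides.

```stata
* Hypothetical files and variables throughout; adjust to your own data.
use demographics.dta, clear

* Match merge on a unique person identifier (Stata 11+ syntax)
merge 1:1 id using body_measures.dta
tabulate _merge              // check how many records matched
keep if _merge == 3          // keep only records found in both files
drop _merge

* Treat missing values consistently: turn a numeric "unknown" code into missing
mvdecode weight_kg height_cm, mv(999)

* Hunt for outliers: inspect extremes, then list suspicious records
summarize weight_kg height_cm, detail
list id weight_kg if weight_kg > 250 & !missing(weight_kg)

* Apply an inclusion criterion (here: adults with a known age)
keep if age >= 20 & !missing(age)

* Recode a variable into categories
recode age (20/39 = 1) (40/59 = 2) (60/max = 3), generate(age_group)
```

Because Stata treats missing values as larger than any number, conditions such as `weight_kg > 250` need the explicit `!missing()` guard shown here.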

7 Explore the Data
- Make a data codebook
- Examine univariate statistics
  – mean, standard deviation, percentiles
- Explore bivariate relationships
  – correlations, conditional means, etc.
- Examine the data graphically
  – STATA has powerful graphics capabilities (with a simple GUI)
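
As an illustration, each exploration step above maps onto a short built-in command. The variable names (weight_kg, height_cm, sex) are hypothetical, and the commands assume a dataset is already loaded.

```stata
* Hypothetical variable names; assumes a dataset is already in memory.
codebook, compact                        // one-line summary of every variable

* Univariate statistics: mean, SD, and percentiles
summarize weight_kg height_cm, detail

* Bivariate relationships: correlations and conditional means
correlate weight_kg height_cm
tabstat weight_kg, by(sex) statistics(mean sd p50 n)

* Graphical examination
histogram weight_kg
twoway scatter weight_kg height_cm
```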

8 Analyze the Data
- STATA is a powerful all-purpose statistical package with most common statistical computations built in
- STATA is extensible for uncommon statistical computations
  – You can share the tools you develop with the rest of the STATA community
  – Built-in and user-written commands have a common interface
  – The STATA community is vibrant and helpful

9 Built-In Commands
- Linear models (ANOVA, regressions)
- Nonlinear models (logit, Poisson regression)
- Failure time models (KM curves, Cox models)
- Time-series models
- R-like matrix processing tools
- Bootstrap
- Robust statistics
  – Standard error corrections for clustering
  – Accounting for complex survey design
- Powerful and easy-to-use macro language to automate commands
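
A few of the command families listed above are sketched below to show their shared syntax (command, outcome, covariates, options). Every variable name here (sbp, dbp, age, bmi, diabetes, visits, years, followup_years, died, treated, clinic_id, psu, stratum, sampwt) is hypothetical, and the models are placeholders rather than recommended specifications.

```stata
* Hypothetical variables; assumes a dataset is already in memory.

* Linear and nonlinear models
regress sbp age bmi
logit diabetes age bmi
poisson visits age, exposure(years)

* Failure time models: declare the survival data, then KM curves and Cox
stset followup_years, failure(died)
sts graph, by(treated)
stcox treated age

* Robust statistics: cluster-corrected standard errors
regress sbp age bmi, vce(cluster clinic_id)

* Bootstrap standard errors
bootstrap, reps(500) seed(12345): regress sbp age bmi

* Complex survey design: declare it once, then prefix estimation commands
svyset psu [pweight = sampwt], strata(stratum)
svy: regress sbp age bmi

* Macro language: repeat the same model for several outcomes
foreach y of varlist sbp dbp {
    regress `y' age bmi
}
```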

10 Submit for Publication
- With STATA, you can make a wide variety of publishable-quality graphs
- You can automatically generate tables of results that are easy to edit in your favorite word processor
  – These are commands added to STATA by the user community
  – LaTeX support
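
One possible workflow for the graph and table steps above is sketched here. The slide does not name a specific table command; this sketch assumes the user-written estout package (which provides esttab) from the SSC archive as one example. Variable and file names are hypothetical.

```stata
* Hypothetical variables and file names.

* Save a graph in a format journals commonly accept
twoway scatter bmi height_cm
graph export bmi_height.eps, replace

* Install a user-written table package once (assumed choice: estout from SSC)
ssc install estout, replace

* Fit a model, store it, and write a results table
regress bmi age
estimates store model1
esttab model1 using results.rtf, se replace   // opens in a word processor
esttab model1 using results.tex, se replace   // LaTeX version
```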

11 Revise
- STATA has a nice, intuitive GUI for interactive data exploration
  – Don’t use it too much!
- STATA commands can be stored in a text (.do) file, edited, and re-run
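
A minimal do-file might look like the sketch below. The file names (analysis.do, analysis.log, nhanes_clean.dta) and variables are hypothetical. Keeping every step in the file means a revision can be rerun from start to finish after a small edit.

```stata
* analysis.do (hypothetical file name)
clear all
set more off
capture log close                      // close any log left open earlier
log using analysis.log, replace text   // keep a record of the output

use nhanes_clean.dta, clear            // hypothetical cleaned dataset
regress bmi age

log close
```

Run it from the Stata command line with `do analysis.do`.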

12 An Example
- Body mass index is weight (kg) divided by height (m) squared
- Why squared?
  – Presumably to make BMI independent of height: BMI should mean the same thing for a short man and a tall woman
- But does it?
  – And is the triceps skinfold test height-independent?
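
One simple way to probe these questions, not necessarily the approach taken in the talk, is to check whether BMI still correlates with height and to estimate the exponent directly. If weight scales as height raised to some power p, then weight divided by height to the power p does not vary with height, and the slope from regressing log weight on log height estimates p. The variable names below (weight_kg, height_cm, triceps_mm) are hypothetical, and the sketch ignores survey weights.

```stata
* Hypothetical variable names; assumes adult records are already loaded.
generate height_m = height_cm / 100
generate bmi = weight_kg / height_m^2

* If BMI were height independent, these correlations would be near zero
correlate bmi height_m
correlate triceps_mm height_m

* Estimate the exponent p in weight = c * height^p
generate ln_weight = ln(weight_kg)
generate ln_height = ln(height_m)
regress ln_weight ln_height    // the coefficient on ln_height estimates p
```

A coefficient close to 2 would support squaring height; a clearly different value would suggest BMI retains some dependence on height.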

13 NHANES data
- National Health and Nutrition Examination Survey (NHANES)
  – 2001-2002 edition
- Publicly available version can be downloaded from the National Center for Health Statistics
  – Includes anthropometric measurements
  – Plus lots of other covariates
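
The public NHANES files are distributed in SAS transport (XPORT) format. As a sketch, a recent Stata (version 16 or later, long after these slides) can read such a file directly with `import sasxport5`. The file name BMX_B.XPT is an assumption about how the 2001-2002 body measures file is named once downloaded; the output file name is hypothetical.

```stata
* Assumed file name for the downloaded 2001-2002 body measures file
import sasxport5 "BMX_B.XPT", clear

describe                                  // list the variables and their labels
save nhanes_body_measures.dta, replace    // hypothetical output file name
```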

15 Comparing SAS and STATA
- Pro:
  – STATA is easier to learn and at least as powerful
  – STATA is substantially cheaper
  – STATA tends to be faster
  – STATA has better help facilities
- Con:
  – “Live” data management and report generation are easier with SAS
  – Simple analyses with datasets larger than memory are possible with SAS

