Basics of Biostatistics for Health Research, Session 1 – February 7th, 2013. Dr. Scott Patten, Professor of Epidemiology, Department of Community Health Sciences.


Basics of Biostatistics for Health Research, Session 1 – February 7th, 2013. Dr. Scott Patten, Professor of Epidemiology, Department of Community Health Sciences & Department of Psychiatry

Objective 1: Upon completion, students will be (more) able to read, understand, and critically interpret the statistical portions of articles in the medical literature.

Objective 2: Given a dataset, students will be able to select appropriate statistical procedures for basic analyses and implement these analyses using typical statistical software (we will use Stata).

Objective 3: Upon completion, students will be (more) able to define and interpret specialized parameters found in the clinical epidemiology literature, for example: –Sensitivity –Specificity –Predictive values
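As a reminder of what these parameters measure, the standard definitions can be written in terms of a 2×2 table of test result against true disease status. The symbols TP, FP, FN and TN (true/false positives and negatives) are notation assumed here, not taken from the slides.

```latex
% TP, FP, FN, TN: counts of true positives, false positives,
% false negatives and true negatives (notation assumed for this sketch)
\begin{align*}
\text{Sensitivity} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{Positive predictive value} &= \frac{TP}{TP + FP} \\
\text{Negative predictive value} &= \frac{TN}{TN + FN}
\end{align*}
```

Each of these is a proportion, so the confidence interval methods introduced in this session apply to them as well.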

Topics for Session 1: Why do we need statistics? Calculating a 95% confidence interval for a proportion

Why Do We Need Statistics? We don’t always need statistics. However, statistics are the most powerful tools for answering questions in medicine, for example: –Determining whether treatments work –Comparing different treatments –Identifying the causes of diseases

The Power of Statistics Where does it come from? –Fundamentally, from the laws of probability A familiar example: –Flipping one coin versus flipping many coins

Coin Flipping: First, I’ll flip a coin and you can try to guess what I got. Then, I’ll ask you to flip many coins and I’ll guess how many heads you get.

Coin Tossing Simulator

The Power of Statistics A set of observations can allow us to make statements of a sort that we generally cannot make based on a single observation –E.g. how well does a treatment work? Larger and larger sets of observations allow us to make stronger and stronger statements
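The coin-flipping idea can also be tried in Stata itself. The following is a minimal sketch, assuming a fair coin; the seed, the number of flips and the variable name flip are arbitrary choices made for illustration.

```stata
* A minimal sketch: simulate fair coin flips
* (seed, number of flips and variable name are arbitrary assumptions)
clear
set seed 20130207
set obs 1000
generate byte flip = runiform() < 0.5   // 1 = heads, 0 = tails

* One coin: a single flip tells us very little
list flip in 1

* Many coins: the proportion of heads (the mean of flip) becomes
* more and more predictable as the number of flips grows
summarize flip in 1/10
summarize flip in 1/100
summarize flip
```

Comparing the mean reported by the three summarize commands shows how larger sets of observations support stronger statements.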

Formal Terminology: The source of the observations is a sample. The sample is a subset of a population. The observations are data. The collection of observations is a dataset.

Inference: Generally, “A conclusion reached on the basis of evidence and reasoning.” Statistically, “Making a statement about a population based on observations from a sample (a dataset).”

Stata’s Graphical Interface

Let’s do a Study! We’ll select a sample of half the class, tabulate the frequency of male/female, and estimate the proportion of women.

Select a Sample: We’ll consider the class of ‘N’ students as our population. The first step in obtaining a sample is to have a sampling frame – a list of the population. Let’s make one in Stata. For notation, I’ll type Stata commands in red. These go into the command window. To execute a command, press Enter.

Command Menus (an alternative to the command window): I’ll use screen captures and add red numbers if things need to be done in more than one step.

1. Use this drop-down menu to select the new variable. 2. Click OK.

Let’s create a sampling frame in Stata. In the command window, tell Stata that we want to create a list with N rows (instead of 30, we’ll use the # in the class):
    set obs 30
    generate id = _n

Let’s create a sampling frame in Stata. We’ll start by typing into the command window.

In the command window, tell Stata that we want to create a list with N* rows:
    set obs 30
    generate id = _n
* instead of 30, we’ll use the # in the class

The data viewer

Now, let’s sample half of these:
    sample 50
Click on the data viewer to see our sample.
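Put together, the sampling steps so far might look like the sketch below. The set seed line is an addition not shown on the slides: it is assumed here so that the same random sample is drawn each time the commands are run, and the seed value itself is arbitrary.

```stata
* A minimal sketch of the sampling frame and the 50% sample
clear
set obs 30          // N = class size; 30 is the placeholder used on the slides
generate id = _n    // _n is the observation number, so id runs from 1 to N
set seed 12345      // assumption: any fixed seed makes the draw repeatable
sample 50           // keep a random 50% of the observations, drop the rest
count               // number of students remaining in the sample
list id             // the ids of the sampled students
```

Note that sample drops the unsampled observations from memory, so the sampling frame itself is gone unless it was saved first.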

Data Collection: From each member of our sample, we’ll record the person’s sex (Male = 0, Female = 1). Let’s create a variable called “sex” in which to enter our data:
    generate sex = .

The data viewer Look at the Dataset!

The data editor Enter the Data

Highlight a cell (click on it) and start entering data!

Closing the Data Editor Click Exit

Making a Table: At this point, we could make a table to show the frequency of men and women in our sample.

1. Use this drop-down menu to select the new variable. 2. Click OK.

A few things to note: Our table doesn’t look so great. The command that our menus created is executed by Stata (see the “. tab var2” in the output window). We can do the same thing by typing the following in the command line:
    tab var2

Command Line

Our Table is Still Very Ugly (not exactly, but something like this)

Renaming a Variable

The Variables Manager: 1. Select “var2” (click). 2. Type “sex” here, under Name.

Using the Command Window: Another way to do it is just to type into the command window:
    rename var2 sex
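To try the rename and the table without the class data, a small stand-in dataset can be typed in with input. The five rows below are hypothetical values invented for this sketch, not data from the session.

```stata
* A minimal sketch with hypothetical data (not the class sample)
clear
input id var2
 3 1
 8 0
14 1
21 1
27 0
end

rename var2 sex     // same effect as the Variables Manager steps
tabulate sex        // "tab" is the abbreviation used on the slides
```

The table now carries the variable name sex, but the rows still show the bare codes 0 and 1.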

Our Table is Still Very Ugly (not exactly, but something like this)

Creating a Label

Click Here

Creating a Label: In Stata, you need to give your label a name. Our values are 0 and 1. Our labels are men and women. Click Here

Creating a Label: After adding women, add a second value label for men. Our labels are men and women.

Attaching the Label

Assigning the Label
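The same labelling can be done from the command window in two steps: define a named set of value labels, then attach it to the variable. In this sketch the label name sexlbl and the five data rows are hypothetical choices; the codes (0 = men, 1 = women) follow the slides.

```stata
* A minimal sketch with hypothetical data (not the class sample)
clear
input id sex
 3 1
 8 0
14 1
21 1
27 0
end

label define sexlbl 0 "men" 1 "women"   // create the value label (name is assumed)
label values sex sexlbl                 // attach it to the variable
tabulate sex                            // rows now read men / women
```

Defining and attaching are separate steps because one value label can be reused for several variables.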

A Good Looking Table

Saving a Dataset Click Here To Save
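Saving can also be done by command. This is a minimal sketch; the file name class_sample.dta is a hypothetical choice, and the file is written to Stata's current working directory.

```stata
* A minimal sketch: save a dataset and load it again
clear
set obs 30
generate id = _n

save "class_sample.dta", replace   // replace overwrites an existing file of that name
use "class_sample.dta", clear      // reload it; clear discards the data in memory
describe
```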

Let’s do Statistics! We need to enter the Statistics menu.

Entering the Command

Our Output What is the 95% confidence interval? What does it mean? What kind of statement can be made about the population (our class)? Is the statement true?
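For readers who want to reproduce this kind of output without the class data, Stata's immediate command cii computes the interval directly from counts. The counts below (9 women in a sample of 15) are hypothetical. The syntax shown is the one used from Stata 14 onward, which is an assumption about the reader's version; the older forms are given in comments.

```stata
* A minimal sketch: exact binomial 95% CI from hypothetical counts
* (15 sampled students, 9 of them women)
cii proportions 15 9, exact

* With the 0/1 variable in memory, the equivalent would be:
*     ci proportions sex, exact
* In Stata 13 and earlier the corresponding commands are:
*     cii 15 9
*     ci sex, binomial
```

A 95% confidence interval comes from a procedure that, over many repeated samples, produces intervals containing the true population proportion about 95% of the time.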

Introducing the “do file” editor

Executing a “do” file

Something more Realistic Go to “ Scroll to the bottom. Right click to download the two files described as being “for PGME Students” Save them on your desktop

Open the Datafile

Explore the Datafile: Click on the data browser in Stata. Type describe into the command bar. Open the data documentation file. Note that sex is not labeled properly and that it is coded differently than in our example.

Recode the Sex Variable as 0/1: Let’s use the command window:
    generate female = sex
    recode female 1=0 2=1
Double check you’ve done it right:
    tab female sex

Your Task: Create a good label for this new variable. Make a good table of the new variable. Create a 95% exact binomial confidence interval for the proportion of females in Framingham. Interpret what this 95% confidence interval means. Create a do file that will do all of these steps automatically.

Creating a Log File

Additional Tasks Create a log file for your calculation of the proportion of women in Framingham, and an associated 95% confidence interval.
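One way the Framingham steps could be collected into a do file that also writes a log is sketched below. It is not the course's own solution. The file names framingham_task.log and framingham.dta and the label name femalelbl are hypothetical; substitute the name of the data file you downloaded. The version line is also an assumption and should match your installed Stata.

```stata
* framingham_task.do  (hypothetical file name)
version 12                        // assumption: adjust to your Stata version
capture log close                 // close any log left open by an earlier run
log using "framingham_task.log", replace text

use "framingham.dta", clear       // hypothetical name for the downloaded data file

* Recode sex (1/2) into a 0/1 indicator, as on the earlier slide
generate female = sex
recode female 1=0 2=1
tabulate female sex               // check the recode

* Label the new variable and make the table
label define femalelbl 0 "men" 1 "women"
label values female femalelbl
tabulate female

* Exact binomial 95% confidence interval for the proportion of females
* (Stata 14 and later also accept: ci proportions female, exact)
ci female, binomial

log close
```

Running the do file reproduces every step and leaves a plain-text log of the commands and their output.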

Additional Tasks Calculate an estimate of the proportion of people in Framingham with greater than high school education (and 95% confidence interval) – generate and save a log file that shows this calculation.