Exporting Data for Analysis. Michael A. Kohn, MD, MPP. 16 August 2012
Lab 4 (8/23) uses REDCap. You need a REDCap logon. REDCap is a web-based research data collection system developed at Vanderbilt. Available free through UCSF Academic Research Systems. You are both the Principal Investigator and User 1.
Final Project: Part A. Send in or Demonstrate Your Study Database. Due 9/20/2012. Send in a copy of your research study database*. We prefer a database that you are currently using or will use for a research study. However, a demonstration or pilot database is acceptable. *If you are unable to package your database in a file to e-mail, you can send us a link or work out another way to review your database.
Final Project: Part A. Send in or Demonstrate Your Study Database. Due 9/20/2012. If you are doing secondary analysis of data collected by someone else, obtain the data collection forms* used in the original data collection, and set up a new database that you would use for a follow-up study. *Often easily obtained by doing a Google search or e-mailing the author of the original study.
Final Project: Part B. Submit Your Data Management Plan. Due 9/20/2012. Contents: General description of database; Data collection and entry; Error checking and data validation; Analysis (e.g., export to Stata); Security/confidentiality; Back up.
Final Project. Due 9/20/2012. Start thinking about this now. Build your own study database as you work through the labs. Use extra time in lab to work on your study database. Set up appointments with course faculty early.
Normalization -- Lab Results (from last week). Occasionally, the subjects (in the Infant Jaundice Study) had blood tests. Robert had a CBC on 1/30/2010. Helen had a CBC on 1/30/2010, LFTs on 2/28/2010, and a CD-4 count on 3/31/2010.
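The one-to-many layout described here (one subject, zero or more lab results) can be sketched in Python with the built-in sqlite3 module. This is only an illustration: the table names Subject and LabResult come from the quiz slides, but the column names (SubjectID, FName, LabDate, LabTest) are assumptions, not the actual fields of the course database.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical column names; one row per subject, one row per lab test.
conn.executescript("""
CREATE TABLE Subject (
    SubjectID INTEGER PRIMARY KEY,
    FName     TEXT NOT NULL
);
CREATE TABLE LabResult (
    LabResultID INTEGER PRIMARY KEY,
    SubjectID   INTEGER NOT NULL REFERENCES Subject(SubjectID),
    LabDate     TEXT NOT NULL,   -- ISO date string
    LabTest     TEXT NOT NULL
);
""")

conn.executemany(
    "INSERT INTO Subject (SubjectID, FName) VALUES (?, ?)",
    [(1, "Robert"), (2, "Helen")],
)
conn.executemany(
    "INSERT INTO LabResult (SubjectID, LabDate, LabTest) VALUES (?, ?, ?)",
    [
        (1, "2010-01-30", "CBC"),
        (2, "2010-01-30", "CBC"),
        (2, "2010-02-28", "LFTs"),
        (2, "2010-03-31", "CD-4 count"),
    ],
)

# Count lab results per subject; subjects with no labs would show 0.
labs_per_subject = conn.execute("""
    SELECT s.FName, COUNT(l.LabResultID)
    FROM Subject AS s
    LEFT JOIN LabResult AS l ON l.SubjectID = s.SubjectID
    GROUP BY s.SubjectID
    ORDER BY s.SubjectID
""").fetchall()
print(labs_per_subject)
```

Because each lab test is its own row, a subject can have any number of results without adding columns to the Subject table.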
Lab Results. Amy had maximum daily T bili as follows: 1/13/ (DOB) 1/14/ /15/ /16/ /17/. Demonstration: Enter Amy’s T bili results.
Quiz: Field(s) Storing Amy’s T Bili Results. Which table? SubjectMeds; LabResult; Exam; Subject; None of the above.
Quiz: Fields for Birth Weight and Gestational Age. Which table? SubjectMeds; LabResult; Exam; Subject; None of the above.
Quiz: Field for Parental Education (Any College?). Which table? SubjectMeds; LabResult; Exam; Subject; None of the above.
Assignment 3. Lab 3: Exporting and Analyzing Data (8/16/2012). Determine if neonatal jaundice was associated with the 5-year IQ scores and create a table or paragraph appropriate for the “Results” section of a manuscript summarizing the association. Extra Credit: Write a sentence or two for the “Methods” or “Results” section on inter-rater reliability. (Use Bland and Altman, BMJ 1996; 313:744.)
Newman T et al. N Engl J Med 2006;354:
Essential Elements: Sample size (N_1 jaundiced, N_0 non-jaundiced). Indication of effect size (report both means, or the difference between them). Get the direction of effect right. Indication of variability (sample SDs, SEs of means*, CIs of means, or CI of difference between means). *Not my favorite.
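A rough Python sketch of how these elements could be computed for two groups. The scores below are made-up illustration values, not study data, and the function name is hypothetical. The 95% CI uses the large-sample normal multiplier 1.96 with an unpooled standard error, which is an assumption; for small groups a t quantile (as Stata's ttest uses) would be more appropriate.

```python
import math
import statistics

def summarize_two_groups(jaundiced, non_jaundiced):
    """Return N, mean, and SD per group plus the difference in means with a 95% CI."""
    n1, n0 = len(jaundiced), len(non_jaundiced)
    mean1, mean0 = statistics.mean(jaundiced), statistics.mean(non_jaundiced)
    sd1, sd0 = statistics.stdev(jaundiced), statistics.stdev(non_jaundiced)
    # Direction of effect: jaundiced minus non-jaundiced.
    diff = mean1 - mean0
    # Unpooled standard error of the difference between means.
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd0 ** 2 / n0)
    return {
        "n1": n1, "mean1": mean1, "sd1": sd1,
        "n0": n0, "mean0": mean0, "sd0": sd0,
        "diff": diff,
        "ci_low": diff - 1.96 * se_diff,   # normal approximation
        "ci_high": diff + 1.96 * se_diff,
    }

# Hypothetical IQ scores, for illustration only.
summary = summarize_two_groups(
    [98.0, 102.0, 105.0, 95.0, 100.0],
    [103.0, 99.0, 106.0, 97.0, 100.0],
)
print(summary)
```

Stating in the text which group was subtracted from which makes the sign of the difference unambiguous.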
Browner on Figures: Figures should have a minimum of four data points. A figure that shows that the rate of colon cancer is higher in men than in women, or that diabetes is more common in Hispanics than in whites or blacks, [or that jaundiced babies had lower/higher IQs at age 5 years than non-jaundiced babies,] is not worth the ink required to print it. Use text instead. (Browner WS. Publishing and Presenting Clinical Research. Williams and Wilkins; 1999. Pg. 90.)
Takes the prize for ugliest figure.
Figure 1: In N_1 infants with neonatal jaundice, the average IQ scores were xxxxer compared to the N_0 non-jaundiced infants when evaluated at age 5 (p=xxxx).
Box Plot: Line at the median. Box extends from the 25th to the 75th percentile. Whiskers extend to the upper and lower adjacent values. Adjacent values = the most extreme observations within 1.5 x IQR (interquartile range) above the 75th percentile or below the 25th percentile. Values outside the adjacent values are graphed individually. It would be nice if the area of the box were proportional to sample size (N). In some box plots the width of the box is proportional to log N, but not in Stata.
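The pieces of a box plot can be computed directly, as in this Python sketch. It takes the adjacent values to be the most extreme observations lying within 1.5 x IQR of the box (Tukey's definition). The data are made up, the function name is hypothetical, and the quartile rule is Python's "inclusive" method (Python 3.8+), which may differ slightly from the percentile rule Stata uses.

```python
import statistics

def box_plot_stats(values):
    """Return the quartiles, IQR, adjacent values, and outside values."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Adjacent values: most extreme observations still inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_adjacent": min(inside),
        "upper_adjacent": max(inside),
        # Values beyond the fences are plotted as individual points.
        "outside": [x for x in data if x < lower_fence or x > upper_fence],
    }

# Hypothetical values with one obvious outlier.
example = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(example)
```

Here the upper whisker stops at 9, the largest observation inside the upper fence, and 30 is drawn as a separate point.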
Extra Credit: Report within-subject SD as a measure of reliability. Calculate repeatability. Bland-Altman plot with mean difference and 95% limits of agreement.
Methods/Results. Methods: We assessed inter-rater reliability of the IQ test by having different examiners re-test some of the children. We calculated the within-subject standard deviation and repeatability. (Bland and Altman, BMJ 1996; 313:744) Results: Different examiners re-tested N_retest children. The within-subject standard deviation was s_w, so the “repeatability” was 2.77 × s_w, meaning that two examiners of the same subject would score within 2.77 × s_w points of each other 95 percent of the time. (Bland and Altman, BMJ 1996; 313:744)
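A minimal Python sketch of the within-subject SD and repeatability, assuming exactly two scores per re-tested child. With pairs, the within-subject variance is the sum of squared differences divided by 2n, and the 2.77 multiplier is 1.96 × √2. The scores and the function name are hypothetical.

```python
import math

def within_subject_sd(pairs):
    """Within-subject SD for subjects measured exactly twice.

    pairs: list of (score_a, score_b) tuples, one tuple per subject.
    """
    n = len(pairs)
    sum_sq_diff = sum((a - b) ** 2 for a, b in pairs)
    # With two measurements per subject, s_w^2 = sum(d^2) / (2n).
    return math.sqrt(sum_sq_diff / (2 * n))

def repeatability(s_w):
    """95% of repeat-measurement differences fall within 1.96 * sqrt(2) * s_w."""
    return 1.96 * math.sqrt(2) * s_w

# Hypothetical paired scores (examiner A, examiner B), illustration only.
retest_pairs = [(100.0, 104.0), (95.0, 93.0), (110.0, 110.0), (102.0, 98.0)]
s_w = within_subject_sd(retest_pairs)
rep = repeatability(s_w)
print(round(s_w, 2), round(rep, 2))
```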
N = N_S&R (children examined by both Satcher and Richmond). Mean Difference = 0.49 (95% CI – 1.38). 95% Limits of Agreement: – 11.0
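The quantities reported on this slide could be computed along these lines in Python. The scores are invented for illustration and do not reproduce the numbers above; the function and variable names are hypothetical. The limits of agreement are the mean difference ± 1.96 SD of the differences, and the CI of the mean difference uses the normal multiplier 1.96, which is an assumption that suits large samples.

```python
import math
import statistics

def bland_altman(scores_a, scores_b):
    """Mean difference (a - b), its 95% CI, and 95% limits of agreement."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    half_ci = 1.96 * sd_diff / math.sqrt(n)   # normal approximation
    return {
        "n": n,
        "mean_diff": mean_diff,
        "ci_low": mean_diff - half_ci,
        "ci_high": mean_diff + half_ci,
        "lower_limit": mean_diff - 1.96 * sd_diff,
        "upper_limit": mean_diff + 1.96 * sd_diff,
    }

# Hypothetical scores from two examiners on the same four children.
richmond = [100.0, 95.0, 110.0, 102.0]
satcher = [104.0, 93.0, 110.0, 98.0]
agreement = bland_altman(richmond, satcher)
print(agreement)
```

The limits of agreement describe the spread of individual differences, so they are much wider than the confidence interval of the mean difference.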
N = 142 (examined by both Satcher and Richmond) Mean difference = Limits of agreement (LLA - ULA)
Bland-Altman in Stata:
ssc install batplot
batplot richmondscore satcherscore, notrend title("Agreement between Richmond and Satcher") ytitle("Difference (Richmond - Satcher)") xtitle("Average of Richmond and Satcher")