DATA MANAGEMENT Using EpiData and SPSS.


References
- Public domain (PDF) book on data management: Bennett, et al. (2001). Data Management for Surveys and Trials: A Practical Primer Using EpiData. The EpiData Documentation Project. http://www.epidata.dk/downloads/dmepidata.pdf
- EpiData Association website: http://www.epidata.dk/
- Importing raw data into SPSS: http://www.ats.ucla.edu/stat/spss/modules/input.htm

Data Management
- Planning data needs
- Data collection
- Data entry and control
- Validation and checking
- Data cleaning and variable transformation
- Data backup and storage
- System documentation
- Other

Types of Database Management Systems (DBMSs)
- Spreadsheets (e.g., Excel, SPSS Data Editor)
  - Prone to error, data corruption, and mismanagement
  - Lack data controls, limited programmability
  - Suitable only for small and didactic projects
  - Also good for last-step data cleaning
- Commercial DBMS programs (e.g., Oracle, Access)
  - Limited data control, good programmability
  - Slow and expensive
  - Powerful and widely available
- Public domain programs (e.g., EpiData, Epi Info)
  - Controlled data entry, good programmability
  - Suitable for research and field use

We will use two platforms:
- EpiData: controlled data entry, data documentation, export ("write") data
- SPSS: import ("read") data, analysis, reporting

What is EpiData?
- EpiData is a computer program (small in size, 1.2 MB) for simple or programmed data entry and data documentation
- It is highly reliable
- It runs on Windows computers
- It runs on Macs and Linux only with emulator software
- Interface: pull-down menus and a work bar

History of EpiInfo & EpiData
- 1976–1995: EpiInfo (DOS program) created by CDC in the wake of the swine flu epidemic
  - Small, fast, reliable, 100,000+ users worldwide
- 1995–2000: DOS dies a slow, painful death
- 2000: CDC releases EpiInfo2000
  - Based on Microsoft Jet (Access) data engine
  - Large, slow, unreliable (resembled EpiInfo in name only)
- 2001: Loyal EpiInfo user group decides it needs a real "EpiInfo for Windows"
  - Creates open source public domain program
  - Calls program "EpiData"

Goal: Create & Maintain Error-Free Datasets
- Two types of data errors
  - Measurement error (i.e., information bias): discussed over the last couple of weeks
  - Processing errors (errors that occur during data handling): discussed this week
- Examples of data processing errors
  - Transpositions (91 instead of 19)
  - Copying errors (O instead of 0)
  - Additional processing errors described on p. 18.2

Avoiding Data Processing Errors
- Manual checks (e.g., handwriting legibility)
- Range and consistency checks* (e.g., do not allow hysterectomy dates for men)
- Double entry and validation*
  - Operator 1 enters data
  - Operator 2 enters data in a separate file
  - Check files for inconsistencies
- Screening during analysis (e.g., look for outliers)
* covered in lab
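The double entry procedure can be sketched in a few lines of Python. This is only an illustration of the idea, not how EpiData implements validation; the function name `compare_entries`, the `id` key, and the sample records are assumptions made for the example.

```python
def compare_entries(first, second, key="id"):
    """Compare two independent entry passes.

    Returns a list of (record key, field, value in pass 1, value in pass 2)
    for every disagreement, including records present in only one pass.
    """
    first_by_key = {rec[key]: rec for rec in first}
    second_by_key = {rec[key]: rec for rec in second}
    problems = []
    for rec_key, rec in first_by_key.items():
        other = second_by_key.get(rec_key)
        if other is None:
            problems.append((rec_key, None, "present", "missing"))
            continue
        for field, value in rec.items():
            if value != other.get(field):
                problems.append((rec_key, field, value, other.get(field)))
    for rec_key in second_by_key:
        if rec_key not in first_by_key:
            problems.append((rec_key, None, "missing", "present"))
    return problems


# Hypothetical data: operator 2 transposed the age of record 1 (91 instead of 19).
operator1 = [{"id": 1, "age": "19", "sex": "1"}, {"id": 2, "age": "40", "sex": "2"}]
operator2 = [{"id": 1, "age": "91", "sex": "1"}, {"id": 2, "age": "40", "sex": "2"}]

mismatches = compare_entries(operator1, operator2)
print(mismatches)
```

Each reported mismatch is then resolved by going back to the paper form; agreement between the two passes does not by itself prove the value is right, only that it was keyed the same way twice.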

Controlled Data Entry
- Criteria for accepting and rejecting data
- Types of data controls
  - Range checks (e.g., restrict AGE to reasonable range)
  - Value labels (e.g., SEX: 1 = male, 2 = female)
  - Jumps (e.g., if "male," jump to Q8)
  - Consistency checks (e.g., if "sex = male," do not allow "hysterectomy = yes")
  - Must enters
  - etc.
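To make the types of control concrete, here is a minimal Python sketch of a record checker. It is not EpiData .CHK syntax; the field names, the 0 to 120 age range, and the function `check_record` are assumptions chosen for illustration.

```python
# Value labels from the slide: SEX 1 = male, 2 = female.
SEX_LABELS = {1: "male", 2: "female"}


def check_record(rec):
    """Return a list of human-readable problems found in one record."""
    errors = []
    # Must enter: age may not be left blank.
    if rec.get("age") is None:
        errors.append("age: must enter")
    # Range check: assumed plausible range of 0 to 120 years.
    elif not 0 <= rec["age"] <= 120:
        errors.append("age: out of range 0-120")
    # Value labels double as a list of legal values.
    if rec.get("sex") not in SEX_LABELS:
        errors.append("sex: not a legal value")
    # Consistency check: males cannot have a hysterectomy recorded.
    if rec.get("sex") == 1 and rec.get("hysterectomy") == "yes":
        errors.append("hysterectomy: inconsistent with sex = male")
    return errors


print(check_record({"age": 45, "sex": 2, "hysterectomy": "yes"}))
print(check_record({"age": 450, "sex": 1, "hysterectomy": "yes"}))
```

In a controlled entry system these rules run while the operator types, so a bad value is rejected at the keyboard rather than discovered during analysis.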

Data Processing Steps
1. File naming conventions
2. Variable types and names
3. QES (questionnaire) development
4. Convert .QES file to .REC (record) file
5. Add .CHK file
6. Enter data in .REC file
7. Validate data (double entry procedure)
8. Document data (codebook)
9. Export data to SPSS
10. Import data into SPSS

File Naming and File Management
- General form: c:\path\filename.ext
- A web address is a good example of a filename, e.g., http://www2.sjsu.edu/faculty/gerstman/StatPrimer/data.ppt
- Some systems are case sensitive (Unix); others are not (Windows)
- Always be aware of
  - Physical location (local, removable, network)
  - Path (folders and subfolders)
  - Filename (proper)
  - Extension
- Demo: Windows Network Explorer (right-click Start Bar > Explore)
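The parts of a full filename can be pulled apart with Python's standard `pathlib` module, which also shows the difference in case sensitivity between Windows-style and Unix-style paths. The path below is the generic pattern from the slide, not a real file.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Split the generic pattern c:\path\filename.ext into its components.
p = PureWindowsPath(r"c:\path\filename.ext")
parts = {
    "drive": p.drive,         # physical location
    "folder": str(p.parent),  # path (folders and subfolders)
    "name": p.stem,           # filename proper
    "extension": p.suffix,    # extension, including the dot
}
print(parts)

# Windows-style paths compare without regard to case; Unix-style paths do not.
windows_same = PureWindowsPath("DEMO.REC") == PureWindowsPath("demo.rec")
unix_same = PurePosixPath("DEMO.REC") == PurePosixPath("demo.rec")
print(windows_same, unix_same)
```

The "pure" path classes only manipulate the text of a path, so the example runs on any operating system without touching the disk.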

File extensions you should know

Extension   Software program
.qes        EpiInfo/EpiData questionnaire
.rec        EpiInfo/EpiData records (data)
.chk        EpiInfo/EpiData check (controls & labels)
.not        EpiData notes (data documentation)
.sav        SPSS permanent data file
.sps        SPSS syntax file (program)
.txt        Generic (flat) text data
.htm        Web browser
.doc        Microsoft Word
.xls        Microsoft Excel

Selected EpiData Variable Types

Type                  Examples
Text                  _   <A >
Numeric               #   ##.#
Date                  <mm/dd/yyyy>   <dd/mm/yyyy>
Auto ID               <IDNUM>
Soundex (sanitized)   <S >
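A Soundex field stores a phonetic code in place of a name, which is why it counts as sanitized: the code groups similar-sounding surnames but cannot be turned back into the name. The sketch below implements the standard American Soundex rules in Python. Whether EpiData uses exactly this variant or output format is an assumption, so treat it as an illustration of the idea only.

```python
# Standard Soundex digit groups; vowels, H, W and Y have no digit.
GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}


def soundex(name):
    """Return the four-character standard Soundex code for a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(c, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


for surname in ("Snow", "Orwell", "Robert", "Rupert"):
    print(surname, soundex(surname))
```

Different names can share one code (Robert and Rupert both give R163), which is what makes the code useful for matching records without storing the identifier itself.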

EpiData Variable Names
- The variable name is based on the text that occurs before the variable type indicator code
- EpiData variable naming defaults vary depending on installation
- Create variable names exactly as specified
- To be safe, denote variable names in {curly brackets}
- For example, to create a two-digit numeric variable called age, use the question: What is your {age}? ##
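The curly-bracket convention can be illustrated with a small Python parser that reads one questionnaire line and reports the variable name and field width. This is a hypothetical helper written for the example, not EpiData's own parser, and it only handles the numeric `#` indicator.

```python
import re


def parse_question(line):
    """Extract the {bracketed} variable name and the numeric field width.

    Returns None if the line has no bracketed name or no # field.
    """
    name = re.search(r"\{(\w+)\}", line)
    field = re.search(r"#+(?:\.#+)?", line)
    if name is None or field is None:
        return None
    # Width counts every character of the indicator, including a decimal point.
    return {"name": name.group(1), "width": len(field.group(0))}


print(parse_question("What is your {age}? ##"))
print(parse_question("Body {temp}erature ##.#"))
```

Marking the name explicitly means the result does not depend on which automatic naming default a given installation uses.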

Demo / Work Along
1. Create QES file [demo.qes]
2. Convert QES to REC [demo.rec]
3. Create CHK file [demo.chk]
4. Create double entry file [demo2.rec]
5. Enter data
6. Validate data

Fname    Lname    DOB         SEX   DEATHAGE
John     Snow     3/15/1813   1     45
George   Orwell   6/25/1903         46

We will stop here and pick up the second part of the lecture next week. "Stay tuned."

Codebooks
- Contain information that helps users decipher data file content and structure
- Include:
  - Filename(s)
  - File location(s)
  - Variable names
  - Coding schemes
  - Units
  - Anything else you think might be useful
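As a rough picture of what a codebook with descriptive statistics holds, the Python sketch below builds one from a list of records. It does not reproduce the output of EpiData's codebook generator; the function `build_codebook`, the record layout, and the use of `None` for a missing value are assumptions for the example.

```python
def build_codebook(records, value_labels=None):
    """Summarize each variable: counts, range for numeric data, value labels."""
    value_labels = value_labels or {}
    codebook = {}
    for name in records[0]:
        values = [rec[name] for rec in records if rec[name] is not None]
        entry = {"n": len(values), "missing": len(records) - len(values)}
        # Report a range only when every non-missing value is numeric.
        if values and all(isinstance(v, (int, float)) for v in values):
            entry["min"], entry["max"] = min(values), max(values)
        if name in value_labels:
            entry["labels"] = value_labels[name]
        codebook[name] = entry
    return codebook


# Demo records from the work-along slide; SEX is missing for the second record.
records = [
    {"fname": "John", "lname": "Snow", "sex": 1, "deathage": 45},
    {"fname": "George", "lname": "Orwell", "sex": None, "deathage": 46},
]
codebook = build_codebook(records, {"sex": {1: "male", 2: "female"}})
for variable, entry in codebook.items():
    print(variable, entry)
```

Even this small summary flags the missing SEX value, which is one reason a full codebook is worth generating before analysis.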

EpiData codebook generators

File Structure Codebook Full codebook contains descriptive statistics (demo)

Full Codebook Notice descriptive statistics

Conversion of Data File
- Requires a common intermediate file format
- Examples of common intermediate files
  - .TXT = plain text
  - .DBF = dBase program
  - .XLS = Excel
- Steps
  1. Export .REC file to .TXT file
  2. Import .TXT file into SPSS
  3. Save permanent .SAV file

Current Export Formats Supported by EpiData

Plain ("raw") TXT data
- Plain ASCII data format
- No column demarcations
- No variable names
- No labels
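Because a raw text file has no column demarcations or variable names, the reader must supply the column positions, normally taken from the codebook. The Python sketch below shows the idea; the layout (names and column positions) and the sample line are invented for illustration and are not taken from tox-samp.txt.

```python
# Hypothetical layout: (variable name, start column, end column), zero-based,
# end exclusive. In practice these positions come from the codebook.
LAYOUT = [
    ("fname", 0, 10),
    ("lname", 10, 20),
    ("dob", 20, 30),
    ("sex", 30, 31),
    ("deathage", 31, 33),
]


def parse_line(line):
    """Slice one fixed-width line into named fields, trimming padding."""
    return {name: line[start:end].strip() for name, start, end in LAYOUT}


# Build a sample raw line the way an export would pad it.
raw = "John".ljust(10) + "Snow".ljust(10) + "03/15/1813" + "1" + "45"
print(repr(raw))

record = parse_line(raw)
print(record)
```

Every value comes back as text; converting to numbers or dates, and attaching labels, is a separate step that the syntax file handles on the SPSS side.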

TXT file with codebook: tox-samp.txt (data) and tox-samp.not (notes)

SPSS Data Export / Import
[Diagram: REC is exported as TXT (raw data) plus SPS (syntax); these are imported to create SAV]

Top of tox-samp.sps
- Lines beginning with * are comments (ignored by the command interpreter)
- The next set of commands shows file location and structure via SPSS command syntax

Bottom part of tox-samp.sps file
- Labels being imported into SPSS
- Delete the * if you want this command to run

Opening the SPS (command) file

Running the SPS file

Ethics of Data Keeping
- Confidentiality (sanitized files, free of identifiers)
- Beneficence
- Equipoise
- Informed consent (to what extent?)
- Oversight (IRB)