Data Entry, Coding & Cleaning SPSS Training Thomas Joshua, MS July, 2008.



Lecture Overview
Data Entry
Data Coding – the Variable View in SPSS
Data Cleaning

Data Entry & Coding
Before describing the process for defining variables, an important distinction should be made between two terms that are often confused: variable and value.
– A variable is a measure or classification scheme that can have several values.
– Values are the numbers or categorical classifications representing individual instances of the variable being measured.

Data Entry
You may create a data file using one of your favorite text editors or word processing packages (e.g., WordPerfect, MS Word). Files created using word processing software should be saved in text format before you try to read them into an SPSS session.
You may enter your data into a spreadsheet or database program (e.g., Lotus 1-2-3, Excel, dBASE) and read it directly into SPSS for Windows.
Finally, you may enter the data directly into the spreadsheet-like Data Editor of SPSS for Windows.
– In this document we examine one data entry method: using the Data Editor of SPSS for Windows.

The Variable View The Data View

Define Information – The Variable View
Name
– Each variable name must be unique; duplication is not allowed.
– Start with a letter.
– May have up to 64 characters (older versions of SPSS allowed only 8), including letters, numbers, and the symbols #, _, and $.
– Variable names cannot end with a period.

The Variable View (cont’d)
Name (cont’d)
– Variable names that end with an underscore should be avoided.
– Certain keywords are reserved and may not be used as variable names, e.g. “and”, “by”, “with” and so forth.
– Ex. “Subject_ID”, but not “subject-ID” and not “Subject ID”.
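The naming rules above can be checked mechanically before any variables are defined. The following Python sketch is only an illustration of those rules, not part of SPSS: the function name `name_problems`, the small `RESERVED` sample, the case-insensitive duplicate check, and the default length limit of 64 are all assumptions (the limit is a parameter because it differs between SPSS versions).

```python
import re

# A few sample reserved keywords (an assumption; consult the SPSS
# documentation for the complete list).
RESERVED = {"ALL", "AND", "BY", "NOT", "OR", "TO", "WITH"}

# A letter first, then letters, digits, or the symbols #, _, $ and period.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9#_$.]*")


def name_problems(name, existing=(), max_len=64):
    """Return a list of rule violations for a proposed variable name."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("must start with a letter and contain only "
                        "letters, digits, #, _, $ or .")
    if len(name) > max_len:
        problems.append(f"longer than {max_len} characters")
    if name.endswith("."):
        problems.append("ends with a period")
    if name.endswith("_"):
        problems.append("ends with an underscore (discouraged)")
    if name.upper() in RESERVED:
        problems.append("is a reserved keyword")
    # Duplicates are compared without regard to case (an assumption).
    if name.upper() in {e.upper() for e in existing}:
        problems.append("duplicates an existing name")
    return problems


# The three examples from the slide: only the first has no problems.
for candidate in ["Subject_ID", "subject-ID", "Subject ID"]:
    print(candidate, name_problems(candidate))
```

An empty list means the name passed every check; anything else lists what to fix.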

The Variable View (cont’d)
Type
– Basic types: numeric and string.
– The maximum width for numeric variables is 40 characters; the maximum number of decimal positions is 16.
– String variables may contain letters or numbers. For string values, a blank is considered a valid value.
– Numeric operations on string variables are NOT allowed, e.g. finding the mean, variance, standard deviation, etc.

– If you select a string variable, you can tell SPSS how much “room” to leave in memory for each value by indicating the number of characters to be allowed for data entry in this string variable.

The Variable View (cont’d)
Width
– The number of characters SPSS will allow to be entered for the variable.
– For a numerical value with decimals, this total width has to include a spot for each decimal, as well as one for the decimal point.
Decimals
– If more decimals have been entered or computed by SPSS, the additional information will be retained internally but not displayed on screen.
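The width arithmetic and the difference between stored and displayed decimals can be illustrated with a short Python sketch. This is an analogy using Python's own number formatting, not SPSS behavior; the helper names `required_width` and `displayed` are hypothetical, and a leading minus sign is ignored for simplicity.

```python
def required_width(integer_digits, decimals):
    """Total width: integer digits, plus the decimal point, plus each decimal."""
    if decimals == 0:
        return integer_digits
    return integer_digits + 1 + decimals


def displayed(value, decimals):
    """Round only for display; the stored value is left untouched."""
    return f"{value:.{decimals}f}"


stored = 3.14159                 # full precision is kept
print(required_width(3, 2))      # width needed for a value such as 123.45
print(displayed(stored, 2))      # what a two-decimal display would show
print(stored)                    # the stored value still has every digit
```

So a value such as 123.45 needs a width of 6: three digits, one decimal point, and two decimals.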

The Variable View (cont’d)
Label
– A string to identify in detail what a variable represents.
– Is limited to 255 characters.
– May contain spaces and punctuation.

The Variable View (cont’d)
Values
– Indicate how the numbers are assigned for categorical data.
– Instead of typing into the computer the full answer to each question, codes are typed in (e.g. 1 if the respondent is female, 2 if male).
– Codes are usually numerical, because this is what most statistical software expects, and using only numerical codes makes data entry faster.
– Short numerical codes are also easier to remember, and therefore tend to have lower error rates.

The Variable View (cont’d)
Values (cont’d)
– Categorical variables are coded in numeric format.
– The Value Labels are used to attach a descriptive label to each code.

The Variable View (cont’d)
The labels can be seen in the Data View by clicking on the “toe tag” icon in the tool bar, which switches between the numeric values and their labels.
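The idea of numeric codes with attached value labels, and of switching the display between the two, can be sketched in Python. This is an illustration only: the dictionary `SEX_LABELS` uses the female/male codes from the Values slide, while the function `show` and the sample column are made up for the example.

```python
# Codes as entered, with a label attached to each code.
SEX_LABELS = {1: "Female", 2: "Male"}


def show(values, labels, use_labels):
    """Return a column as raw codes or as labels, like the toolbar toggle."""
    if not use_labels:
        return list(values)
    # A code with no defined label is shown unchanged.
    return [labels.get(v, v) for v in values]


column = [1, 2, 2, 1]                   # made-up data
print(show(column, SEX_LABELS, False))  # numeric codes
print(show(column, SEX_LABELS, True))   # value labels
```

The data themselves stay numeric in both cases; only the presentation changes.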

The Variable View (cont’d)
Missing
– Signals to SPSS which data should be treated as missing.
– System-missing data: SPSS displays a single period.

The Variable View (cont’d)
Columns
– How wide the column should be for each variable.
– Columns affect only the display of values in the Data Editor. Changing the column width does not change the defined width of a variable.
Align

The Variable View (cont’d)
Measure
– Indicates the level of measurement.
– Since SPSS does NOT differentiate between interval and ratio levels of measurement, both of these quantitative variable types are lumped together as “Scale”.
– Nominal and ordinal levels of measurement ARE differentiated.

Type of Measurement
The answers to the "numerical questions" are real numbers, not just arbitrary codes. There are four types of scales: nominal scales, ordinal scales, interval scales, and ratio scales.
Scale
– A ratio scale is one in which the answers are real numbers, and an answer of zero means what it says. "What age are you?" - "How tall are you?" - "How many children do you have?"
– An interval scale (meaning equal-interval): if there’s a zero point, it’s arbitrary, but the difference between two successive possible answers is the same. For example, temperature in degrees Celsius or Fahrenheit.
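A small worked example may make the arbitrary zero point concrete. The Python sketch below assumes temperature measured in Celsius and Fahrenheit as the interval scale; the readings are made up. Equal differences stay equal after converting units, but the ratio of two readings changes, which is why "twice as hot" is not a meaningful statement on an interval scale.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


low, mid, high = 10.0, 20.0, 30.0    # made-up readings in Celsius

# Differences: equal steps in Celsius are still equal steps in Fahrenheit.
step_1 = c_to_f(mid) - c_to_f(low)
step_2 = c_to_f(high) - c_to_f(mid)

# Ratios: 20 is "twice" 10 in Celsius, but not after changing units.
ratio_c = mid / low
ratio_f = c_to_f(mid) / c_to_f(low)

print(step_1, step_2)
print(ratio_c, ratio_f)
```

On a ratio scale such as height, changing units (centimeters to inches) would leave the ratio of two measurements unchanged.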

Type of Measurement (cont’d)
Ordinal
– Frequently, categorical data responses represent more than two possible outcomes, and often these possible outcomes take on some inherent ordering.
– There is no information about the relative distances between the levels.
– For example: low – medium – high; 50% – 75% – 100% – 200%; strongly agree – agree – neutral – disagree – strongly disagree.

Type of Measurement (cont’d)
Nominal
– A nominal scale isn’t really a scale at all, but an arbitrary code value to distinguish the different groups.
– No inherent ordering to the categories.
– For example, “Do you prefer the beach, mountains, or lake for a vacation?” “Which color is your favorite?”

Data Cleaning
What most data entry programs will not do is warn the user when unlikely (but possible) codes occur. For example, if a respondent’s age is shown as 99, this may be true, but it may also be a mistake.
Therefore it’s not only wild values that need to be checked. The first frequencies check from a program needs to be looked at very carefully to detect this kind of mistake.
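A first frequencies check of this kind can be sketched in a few lines of Python. This is an illustration outside SPSS: the ages are made-up data, the plausible range of 18 to 90 is an assumption chosen for the example, and the function names are hypothetical. Flagged values are candidates to verify against the original questionnaire, not values to delete automatically.

```python
from collections import Counter


def frequencies(values):
    """Return a frequency table as (value, count) pairs sorted by value."""
    return sorted(Counter(values).items())


def flag_unlikely(values, low, high):
    """Return (row index, value) pairs that fall outside a plausible range."""
    return [(i, v) for i, v in enumerate(values) if not low <= v <= high]


ages = [34, 27, 99, 45, 27, 5, 61]        # made-up data

for value, count in frequencies(ages):
    print(value, count)

print(flag_unlikely(ages, low=18, high=90))
```

Reading the frequency table from both ends is usually the quickest way to spot entries such as 5 or 99 in an adult sample.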

Data Cleaning (cont’d)
Check missing values: if the question was "Which sex are you, male or female?" and the possible answers are 1 for male and 2 for female, these should be the only values for that variable, except perhaps for a few blanks for the missing values.
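The check described above amounts to comparing every entered value against the list of valid codes. A minimal Python sketch follows, assuming blanks are represented as `None`; the function name `find_invalid` and the sample column are made up for illustration.

```python
def find_invalid(values, valid_codes):
    """Map each invalid code to the row numbers (starting at 1) where it occurs."""
    bad = {}
    for row, value in enumerate(values, start=1):
        if value is None:            # a blank is allowed: it marks a missing answer
            continue
        if value not in valid_codes:
            bad.setdefault(value, []).append(row)
    return bad


sex = [1, 2, 2, None, 1, 3, 22, 3]   # made-up data: 1 = male, 2 = female
print(find_invalid(sex, valid_codes={1, 2}))
```

Reporting row numbers along with the offending codes makes it easier to go back to the source form and correct the entry.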

Data Cleaning (cont’d)
There are two types of missing values in SPSS: system-missing and user-defined.
System-missing values are assigned by SPSS when, for example, you perform an illegal function, like dividing a number by zero. System-missing values can also be assigned in an input data set.
User-defined missing values are numeric values that you can specify and SPSS will consider to be missing. For example, you may define a particular code to be a missing value.

Data Cleaning (cont’d)
You can assign many different missing values to a given variable, perhaps using the different values to indicate different reasons for the data point to be missing. For example, for an item on a survey, one value might indicate that the respondent skipped the item, another might indicate that the item was not answered because it was part of a skip pattern, and a third might indicate that a note was written in the margin instead of a standard response.
You can specify up to three unique values for each variable. User-defined missing values can also be a range, such as 5 to 10. This is useful when you want to include only half of a scale, for example.
String values can also be used as missing values, including a series of blanks (i.e., a null string).
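To see why missing-value codes must be declared before analysis, consider the Python sketch below. It is an analogy, not SPSS syntax: `None` stands in for a system-missing value, the codes -7, -8 and -9 are hypothetical user-defined codes chosen for the example, and the function names are made up.

```python
from statistics import mean


def is_missing(value, codes=(), value_range=None):
    """True if a value is blank, a declared missing code, or inside a missing range."""
    if len(codes) > 3:
        raise ValueError("at most three discrete missing codes are allowed")
    if value is None:                 # stand-in for system-missing
        return True
    if value in codes:
        return True
    if value_range is not None:
        low, high = value_range
        return low <= value <= high
    return False


def valid_mean(values, codes=(), value_range=None):
    """Mean of the values that are not missing, or None if none remain."""
    kept = [v for v in values if not is_missing(v, codes, value_range)]
    return mean(kept) if kept else None


MISSING_CODES = (-7, -8, -9)          # hypothetical codes for three reasons
scores = [4, 5, -9, 3, None, -8, 4]   # made-up data

print(valid_mean(scores, MISSING_CODES))            # codes excluded
print(mean(v for v in scores if v is not None))     # codes wrongly included
print(valid_mean([1, 2, 7, 9], value_range=(5, 10)))
```

If the codes are not declared, they are averaged in as if they were real answers and distort the result.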