Presentation on theme: "Housekeeping: Variable labels, value labels, calculations and recoding"— Presentation transcript:
1 Housekeeping: Variable labels, value labels, calculations and recoding
Session 2
2 Review
You have used Stata, largely through the menus and dialogues, but also with a few commands.
We hope you found it (surprisingly?) easy.
Discuss what you liked, and any difficulties so far.
3 Housekeeping tasks
By housekeeping, we mean the small jobs to organise and add labels to the data. They make life easier later.
This includes:
labelling and adding notes to datasets;
labelling variables;
labelling categories (or values) taken by the variable;
recoding variables and dealing with codes for missing values;
using log files to keep a record of what you have done.
4 Labels and notes
Open the file named E_HouseholdComposition.dta.
Use Data > Labels > Label dataset.
5 Dialogue for labelling data set
Type the label into the dialogue, or use the command:
    label data "Young Lives Study……"
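As a minimal sketch of labelling a dataset and attaching a note from the command line, the lines below build a tiny made-up dataset rather than opening E_HouseholdComposition.dta. The variable values, the label text and the note text are all assumptions for illustration.

```stata
* Hypothetical toy data; not the real E_HouseholdComposition.dta
clear
input childid relcare
1 1
2 3
end

* Attach a label to the dataset as a whole (label text is illustrative)
label data "Toy household data (illustrative label)"

* Add a free-text note to the dataset
notes: Created for illustration only

* Check the result: the data label appears in the header, the note below
describe, short
notes
```

The label and notes are only kept permanently once the dataset is saved.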
6 Labelling variables
Use the menu sequence Data > Labels > Label variable, or type the command:
    label variable relcare "What is your relationship to child?"
7 Defining value labels
Use Data > Labels > Label values > Define or modify value labels and complete the dialogue box that follows.
The corresponding commands show that two steps are needed to label the values.
First, a label must be defined, e.g.
    label define sexlabel 1 "male" 2 "female"
Then this label is attached to the variable, e.g. for the variable called sex use the command
    label values sex sexlabel
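The two-step process can be tried on a small made-up dataset, as in this sketch. The data values and the variable label text "Sex of child" are assumptions for illustration; the label name sexlabel and its codes follow the slide.

```stata
* Hypothetical toy data with a coded categorical variable
clear
input id sex
1 1
2 2
3 1
end

* Label the variable itself (label text is illustrative)
label variable sex "Sex of child"

* Step 1: define a value label; step 2: attach it to the variable
label define sexlabel 1 "male" 2 "female"
label values sex sexlabel

* Tables now show "male"/"female" in place of the codes 1/2
tabulate sex
label list sexlabel
```

Because the value label is defined separately from the variable, the same label can be attached to several variables that share a coding scheme.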
8 Your turn
Work through Section 4.1 of the Stata Guide.
Note down any difficulties you have and clarify them with a resource person.
9 Recoding a variable
Use Data > Create or change variables > Other variable transformation commands > Recode categorical variable.
Also use the options to define a new variable.
10 Information on the recoded variable
It is always safer to recode into a new variable, e.g. seedad2.
The effect of the recoding can be seen by typing
    codebook seedad2
If seedad is later no longer needed, it can be dropped.
Use File > Save to save information on the new variable in the data set.
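A sketch of recoding into a new variable is given below. The coding of seedad used here (1, 2, 3 and 99 for "not known") and the new category labels are assumptions made up for the example; the real variable may be coded differently.

```stata
* Hypothetical toy data; the codes for seedad are assumed, not from the study
clear
input id seedad
1 1
2 2
3 3
4 99
end

* Recode into a NEW variable, leaving seedad untouched:
* 1 and 2 become 1, 3 becomes 2, and 99 becomes the missing code .a
recode seedad (1 2 = 1 "at least weekly") (3 = 2 "less often") (99 = .a), generate(seedad2)

* Compare old and new codes, including missing values
tabulate seedad seedad2, missing
codebook seedad2

* Only once the recoding has been checked:
* drop seedad
```

Keeping the original variable until the cross-tabulation has been checked makes it easy to spot a mistake in the recoding rules.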
11 Your turn again
Work through Section 4.2 of the Stata Guide.
Note down any difficulties you have and clarify them with a resource person.
12 Missing values
Symbols for missing values in Stata: . and .a, .b, .c and so on, up to .z
These are used to distinguish between the different reasons for values to be missing.
When making calculations, comparisons or sorting, the following rules are observed:
all non-missing numbers are less than .
. is less than .a
.a is less than .b, and so on, up to .z
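One practical consequence of these ordering rules is that a condition such as "greater than 200" also picks up missing values. The sketch below shows this on made-up data; the variable name income and its values are assumptions for illustration.

```stata
* Hypothetical toy data with one system missing (.) and one extended missing (.a)
clear
input id income
1 100
2 250
3 .
4 .a
end

* Missing values count as larger than any number, so this counts 3
count if income > 200

* Excluding missing values explicitly counts only the one real value
count if income > 200 & income < .

* missing() is true for . and for .a to .z, so this counts 2
count if missing(income)

* Sorting places non-missing values first, then ., then .a
sort income
list
```

Writing the condition as income < . works because every extended missing code is larger than the system missing value.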
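13 Memory
The initial memory in Stata is 1 megabyte. This can be changed, but first type clear to clear memory.
To increase the current memory to 20 megabytes, type
    set memory 20m
To set the memory permanently, use
    set memory 20m, permanently
If you have problems processing large datasets, use the compress command.
(These memory settings apply to Stata 11 and earlier. From Stata 12 onwards memory is allocated automatically and set memory is not needed.)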
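As a rough sketch of what compress does, the lines below create a made-up variable stored as double although it only holds small whole numbers, then let Stata pick a smaller storage type. The variable name age and the number of observations are assumptions for illustration.

```stata
* Hypothetical data: 1000 observations of small whole numbers stored as double
clear
set obs 1000
generate double age = mod(_n, 90)

* Storage type before compressing
describe age

* compress switches each variable to the smallest type that holds its values
* without loss of information
compress

* Storage type after compressing
describe age
```

compress never changes the values themselves, so it is safe to run before saving a large dataset.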
14 Log files
To keep a record of the output while using Stata, open a log file by clicking on the Log icon.
This opens a dialogue in your working directory, so you can name the log file.
It suggests the extension smcl, which stands for Stata Markup and Control Language.
Log files in Stata record both commands and output.
15 Remarks
You can change the extension to "log" to produce a simple ASCII file.
Other packages use the idea of a log file to record just the commands, not the output as well.
You can do this in Stata (but not from the menus).
Notice that the command Stata used for its log file was
    . log using "name of file"
Do the same again, but using
    . cmdlog using "name of file"
If at a later stage you need to append to or replace this file, add the option append or replace after a comma at the end of the above commands, e.g. log using "name of file", append
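The two kinds of log can be tried together, as in this sketch. The file names session2_log and session2_cmd.txt are hypothetical, and the files are written to the current working directory.

```stata
* Close any logs left open from earlier work (no error if none is open)
capture log close
capture cmdlog close

* Full log: records commands and output, in SMCL format by default
log using "session2_log", replace

* Command log: records only the commands typed, as plain text
cmdlog using "session2_cmd.txt", replace

* Anything run from here on is recorded
display 2 + 2

* Close both logs when finished
cmdlog close
log close
```

Using append in place of replace adds the new session to the end of an existing file.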
16 Your turn
Practice the above ideas by working through Sections 4.6, 4.7 and 4.8 of the Stata Guide.
Then either read your own data into Stata and perform some simple analyses using methods covered so far, or use a dataset suggested by the resource persons.
17 So if you have a dataset…
Open, within Stata, the data file in Stata format that you created in the previous session.
Identify the key variables in your data set and set up labels for each of these variables.
Identify any categorical variables in your data set. Then define and attach value labels that describe the levels of each categorical variable.
Finally, re-save your data file.