
1 Max Perez Leon Quinoso and Brian Fried, StatLab

2 Create a folder named IntroStata on the desktop. Let's put all the files in that folder. Transferring data is very simple: we can use StatTransfer (which usually comes with Stata) or export data directly using Stata.

3 Working space
    cd
Changing the working space to the IntroStata folder (the exact path will be different for each user):
    cd "C:\Users\MPLQ\Desktop\IntroStata"
Stata always has a default working directory.

4 Different ways to call a dataset
    use "C:\Users\MPLQ\Desktop\IntroStata\example1.dta"
If we have defined the working directory, we do not need to specify the full path. Notice that we are also using the command "clear".
    clear
    use example1.dta
    clear
    insheet using example1.csv

5 Browsing/Describing data
    browse
    br
    br patient department
    edit
    list
    list patient age
    desc
    desc age
    desc score*
    codebook
    tab patient
    tab age
    tab department
    tab depart

6
    tab age survey
    sum score2000
    sum score2000, detail
    sum score2000, det
    sum score*
    br *t
    sort score2000
    gsort survey
    gsort - survey
    gsort -survey score2000

7 Using "in" and "if"
    browse in 1/4
    browse if age<=20
Qualifiers
    br if age == 40
    br if (age == 40) & department == "FES"
    <     ==
    <=    != (not equal)
    >     & (and)
    >=    | (or)
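The qualifiers above can be tried on a small invented dataset. This is only a sketch: the variable names follow the slides, but the five observations are made up for illustration, and the combined condition is one possible way to mix | and &.

```stata
* Toy data: variable names follow the slides, values are invented
clear
input patient age str3 department
1 18 "FES"
2 25 "FES"
3 40 "ABC"
4 52 "FES"
5 19 "ABC"
end

* "in" restricts a command to a range of observation numbers
list in 1/3

* "if" restricts a command to observations meeting a condition
list if age <= 20

* Parentheses make the grouping of | (or) and & (and) explicit
list if (age <= 20 | age >= 40) & department == "FES"

* count leaves the number of matching observations in r(N)
count if (age <= 20 | age >= 40) & department == "FES"
display r(N)
```

Using count with the same qualifier is a quick way to check how many rows a condition selects before browsing or modifying them.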

8 It is not good practice to use the command window for our research. We should keep a do-file that stores all our commands and lets us rerun our procedures efficiently.
Important shortcuts:
    – "Ctrl+D" (output visible)
    – "Ctrl+R" (output invisible)
If we select some lines, the shortcuts run only the commands in those lines. If we do not select any lines, they run the entire do-file. Comments start with an asterisk.

9
    clear
    set mem 100m
    set more off
    cd "C:\Users\MPLQ\Desktop\IntroStata"
    use example1

10
    generate doubscore = score2000 * 2
    gen av_score = (score2000 + score2001 + score2002)/3
    gen ones = 1
    gen indicator = score2000<.5
Missing values. Be aware that missing values are different from zero:
    gen small = av_score if av_score<0.4
    tab small
    tab small, m
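One consequence of missing values is worth seeing on a toy example: Stata treats a numeric missing value (.) as larger than any number, so comparisons can silently include or exclude it. The sketch below uses three invented observations; indicator2 is a hypothetical name added for illustration.

```stata
* Toy data with one missing value (values are invented)
clear
input score2000
0.2
0.7
.
end

* Missing is not less than .5, so the indicator is 0 for that row
gen indicator = score2000 < .5

* A "greater than" comparison picks up the missing value as well:
* this counts 2 observations (0.7 and the missing one)
count if score2000 > .5

* Excluding missing values explicitly counts only 0.7
count if score2000 > .5 & !missing(score2000)

* Keep the indicator missing wherever the input is missing
gen indicator2 = score2000 < .5 if !missing(score2000)
list
```

Whether a missing input should yield 0 or missing depends on the analysis; the second form simply makes the choice explicit.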

11 Operations with missing values give missing values
    gen small_modify = small * 10
    replace small_modify = 0 if small_modify==.
    replace av_score = 1000 if av_score<0.5
Renaming variables
    rename doubscore score_double
Creating dummy variables
    tab survey, gen(Dsurvey)
    br
    br *survey*

12 Egen
    egen mscore2000 = max(score2000)
    br *score2000
    egen Dscore2000 = max(score2000), by(department)
    br score2000 department Dscore2000
Bysort
    bysort department: tab score2000
    bysort survey: sum score2001
    gen index = _n
    br
    bysort department: gen dep_index = _n
Collapse
    collapse (count) patient (mean) score2000 (sum) score2001 (sd) score2002, by(department)
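The difference between egen, bysort, and collapse is easier to see side by side: egen and bysort add columns and keep every row, while collapse replaces the data in memory with one row per group. A minimal sketch on invented data follows; dep_max, dep_size, n, and mean2000 are hypothetical names chosen for the example.

```stata
* Toy data: five patients in two departments (values are invented)
clear
input patient str3 department score2000
1 "FES" 0.2
2 "FES" 0.8
3 "ABC" 0.5
4 "ABC" 0.9
5 "ABC" 0.1
end

* egen with by(): the group maximum is repeated on every row of the group
egen dep_max = max(score2000), by(department)

* bysort with _n and _N: position within the group and group size
* (sorting by patient within department makes the index reproducible)
bysort department (patient): gen dep_index = _n
bysort department: gen dep_size = _N
list, sepby(department)

* collapse overwrites the data in memory, so preserve/restore
* is used here to get the original rows back afterwards
preserve
collapse (count) n=patient (mean) mean2000=score2000, by(department)
list
restore
```

Wrapping collapse in preserve and restore is one way to inspect group summaries without losing the patient-level data.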

13 Label a variable to convey more information
    label var survey "Patients Survey in 2010"
    desc
Label values of categorical variables
    label define ex_label 1 low 2 medium 3 high
    label values survey ex_label
    desc
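Value labels change only how a variable is displayed; the stored values stay numeric. The short sketch below, on three invented observations, shows that conditions are still written against the underlying numbers.

```stata
* Toy data (values are invented)
clear
input patient survey
1 1
2 3
3 2
end

label var survey "Patients Survey in 2010"
label define ex_label 1 "low" 2 "medium" 3 "high"
label values survey ex_label

* Tables and listings now show low/medium/high
tab survey
list

* The data are still numeric, so conditions use the numbers
count if survey == 3
list, nolabel
```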

14 Notice that the variable department is a string variable
    desc
Sometimes we would like to store a string variable as a numeric variable without losing the information contained in the strings.
    encode department, gen(dep_num)
    desc
We can convert the numeric variable back to a string
    decode dep_num, gen(department2)
    br department dep_num department2
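The round trip from string to labeled numeric and back can be checked on a toy example. The sketch below uses invented department codes; encode numbers the categories in alphabetical order of the strings and attaches the original text as value labels.

```stata
* Toy data (values are invented)
clear
input patient str3 department
1 "FES"
2 "ABC"
3 "FES"
end

* encode assigns 1, 2, ... in alphabetical order: ABC = 1, FES = 2
encode department, gen(dep_num)
list
list, nolabel

* decode turns the value labels back into a string variable
decode dep_num, gen(department2)
list department dep_num department2
```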

15 Long and wide format. Our original data is in wide format
    reshape long score, i(patient) j(year)
    reshape wide score, i(patient) j(year)
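To see what the two reshape commands do, it may help to run them on a tiny wide dataset. The sketch below assumes two invented patients with three yearly scores each; the stub "score" and the suffixes 2000 to 2002 follow the variable names in the slides.

```stata
* Wide format: one row per patient, one column per year (values invented)
clear
input patient score2000 score2001 score2002
1 0.2 0.3 0.4
2 0.8 0.7 0.6
end
list

* Long format: one row per patient-year, with the suffix stored in year
reshape long score, i(patient) j(year)
list, sepby(patient)

* Back to wide format: the year values become suffixes again
reshape wide score, i(patient) j(year)
list
```

In the long layout the data have six rows (two patients times three years), and the second reshape returns them to the original two rows.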

16 Open file example2
    use example2, clear
    use example1, clear
    append using example2
Open file example3. Watch out: we are saving and replacing example3, now sorted by patient identification number.
    use example3, clear
    sort patient
    save, replace
    use example1, clear
    merge patient using C:\Users\MPLQ\Desktop\IntroStata\example3.dta

17
    use example1, clear
    append using example2
    sort patient
    merge patient using C:\Users\MPLQ\Desktop\IntroStata\example3.dta
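The merge commands above use the older syntax, which relies on both files being sorted by the key. Stata 11 and later also accept a form that states the match type explicitly and does not require sorting. The sketch below illustrates it with two invented datasets held in temporary files; it does not use the example1 to example3 files from the slides.

```stata
* Master data (values are invented)
clear
input patient age
1 30
2 41
3 25
end
tempfile master
save `master'

* Using data: patient 3 is absent, patient 4 is extra
clear
input patient str3 department
1 "FES"
2 "ABC"
4 "FES"
end
tempfile extra
save `extra'

* One-to-one merge on patient; no prior sort is needed
use `master', clear
merge 1:1 patient using `extra'

* _merge records where each row came from:
* 1 = master only, 2 = using only, 3 = matched in both
tab _merge
list
```

Tabulating _merge after every merge is a cheap check that the number of matched and unmatched rows is what we expect.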

18
    clear
    set mem 100m
    set more off
    cd "C:\Users\MPLQ\Desktop\IntroStata"
    use example1
    capture log close
    log using history, replace text
    tab survey
    log close
    log using history, text append
    tab department
    log close

19 Be very careful when saving data. You could overwrite your original data and lose months of hard work. Always keep a copy of your original data in a separate folder.
    save final_database
    save final_database, replace
    use final_database, clear
    save, replace

20 Mean
    sum score2000, det
    return list
Correlations
    corr score2000 score2001 score2002
    corr score*
    corr score*, covar
Regression
    reg score2000 score2001 score2002
    reg score2000 score2001 score2002, noconstant
    reg score2000 score2001 score2002, robust
    ereturn list
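The return list and ereturn list commands show results that Stata keeps after a command, and those stored results can be used in later commands. A small sketch on four invented observations follows; centered is a hypothetical variable name.

```stata
* Toy data (values are invented)
clear
input score2000 score2001
1 2
2 3
3 7
4 8
end

* summarize leaves its results in r(): r(mean), r(sd), r(N), ...
sum score2000
return list
display r(mean)

* A stored result can be used directly in an expression
gen centered = score2000 - r(mean)

* regress leaves its results in e(); coefficients are available as _b[]
reg score2001 score2000
ereturn list
display _b[score2000]
display e(r2)
```

Stored results are overwritten by the next command of the same class, so it is safest to use or copy them right after the command that produced them.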

21 It is very useful to use Stata menus to obtain the command lines.
    scatter score2000 score2001
    graph matrix score*
    graph bar (count) patient, over(survey)

22 Text within brackets [] indicates optional restrictions or options. Underlined sections indicate acceptable abbreviations.
    help tab
    help
    help gen

23
Local-macro variables
Foreach command (loops)
Regular expressions (very useful if working with strings)
Commands
    – #delimit
    – return list
    – ereturn list
    – macro list
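As a first taste of the first two topics, the sketch below combines a local macro with a foreach loop. It is only an illustration on invented data: the macro name factor and the _x suffix are hypothetical, and local macros exist only while the do-file (or the selected block of lines) is running.

```stata
* Toy data (values are invented)
clear
input score2000 score2001 score2002
0.2 0.3 0.4
0.8 0.7 0.6
end

* A local macro stores a value that is referenced as `factor'
local factor = 10

* foreach repeats the commands in braces for every variable matching score*
* (the variable list is expanded once, before the loop starts)
foreach v of varlist score* {
    gen `v'_x = `v' * `factor'
    label var `v'_x "`v' multiplied by `factor'"
}

describe
macro list
```

Loops like this avoid copying the same gen line once per year, which also makes it harder to mistype a variable name.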

24 Stata’s YouTube channel: http://www.youtube.com/user/statacorp/featured
http://survey-design.com.au/tips.html
http://www.ats.ucla.edu/stat/stata/
http://data.princeton.edu/stata/
http://dss.princeton.edu/online_help/stats_packages/stata/

