Stata Intro Practice Exercises 2014 - Debby Kermer, George Mason University Libraries Data Services.

1 Stata Intro Practice Exercises Debby Kermer, George Mason University Libraries Data Services

2 Instructions
Create and run syntax to accomplish each task. Press the spacebar to see the next instruction, an answer, or a hint.
Open the Pew Social Trends Dataset:
    ___ ""   OR   File | Open…   [type in]
    Hint: use

3 Exercise 1: Using Help

4 1a Produce statistics about yrborn using the summarize command
    summarize yrborn                 (hint: summarize)
Open the help for that command
    help summarize
Modify the syntax to…
… use abbreviations
    sum yrborn   or   sum yr   or   su y
… display additional statistics
    sum yr, detail                   (hint: sum yr, _____)
Need to create yrborn?
    generate yrborn = age

5 1b summarize yrborn
… ignore those who refused to give their age
    sum yr if (age != 99)
    sum yr if (age < 99)             (hint: sum yr if (_______))
Now, summarize age, ignoring those who refused to answer
    sum age if (age < 99)
… and ALSO display additional statistics
    sum age if (age < 99), detail
Forgot which value meant refused?
    label list AGE
Your result should look like ↓
    Variable | Obs   Mean   Std. Dev.   Min   Max
      yrborn | [values not preserved in this transcript]
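The effect of the if qualifier is easiest to see on a handful of rows. The sketch below is only an illustration: the four ages are invented (they are not Pew data), with 99 standing in for the "refused" code as in the exercise.

```stata
* Toy data, invented for illustration; 99 is the "refused" code
clear
input age
23
35
67
99
end

* Without the qualifier, the 99 is treated as a real age (mean = 56)
summarize age

* With the qualifier, only the three real ages are used (mean is about 41.67)
summarize age if (age < 99)

* The option follows the qualifier, after the comma
summarize age if (age < 99), detail
```

Comparing the Obs and Max columns of the first two results is a quick way to confirm that the refusals were dropped.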

6 1c Extra Challenge
Compare average age by Region (cregion)
    tab cregion, sum(age)
Notice how this is a combination of both
    tab cregion     frequencies for categorical variables
    sum age         means for numeric variables
But summarize is used as an option, so the comma and parentheses are necessary.
Hint: see the help page we used as an example: help tab, then tabulate, summarize()

7 Exercise 2: Indicator Variables

8 2a Make a new variable "voted" indicating those who voted in the '04 election. Voters should have a 1, non-voters should have a 0.
First, get information about the variable you will use:
    codebook pvote04a                    (hint: codebook ________)
Then, create your variable:
    generate voted = (pvote04a == 1)     (hints: generate voted ___________, then generate voted = (________))
Check whether it is correct; your result should look like ↓
    tab pvote04a voted

9 2b If you want, this is how you can label the variable "voted"
    label variable voted "Voted in the '04 Election"
    label define yesno 1 "Yes" 0 "No"
    label values voted yesno
("yesno" is a made-up name; you may use anything)
Now, you try: label the variable "youth" appropriately
    lab var youth "Youth: age < 30"
    lab def under30 1 "< 30 yrs old" 0 "30 yrs and up"
    lab val youth under30
Need to create "youth"?
    generate youth = (age < 30)
    replace youth = . if (age == 99)
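To confirm that labels were attached as intended, it can help to run the whole sequence on a small example and inspect the result. This is a minimal sketch on invented ages (not the Pew data); the checking commands at the end are one possible way to verify, not part of the original exercise.

```stata
* Toy data, invented for illustration; 99 is the "refused" code
clear
input age
25
40
99
end

* Indicator: 1 if under 30, 0 otherwise, missing if age was refused
generate youth = (age < 30)
replace youth = . if (age == 99)

* Variable label, then define a value label and attach it
label variable youth "Youth: age < 30"
label define under30 1 "< 30 yrs old" 0 "30 yrs and up"
label values youth under30

* Checks: variable label, value-label contents, and labeled frequencies
describe youth
label list under30
tabulate youth, missing
```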

10 2c Extra Challenge
In one statement (i.e., one line of syntax), create a variable legal indicating only those of legal drinking age (n=2,842)
    gen legal = (age >= 21) if age < 99
    gen legal = (age >= 21) & age < 99
Although both of the above are good, the values generated by these two commands are not identical. How do they differ?
    & recodes 99's as 0
    if recodes 99's as missing
                                             Legal Drinker   Not Legal   No Age (99)
    gen legal = (age >= 21) & (age < 99)           1             0            0
    gen legal = (age >= 21) if (age < 99)          1             0            .
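The difference between the two statements can be checked directly on a few rows. The sketch below is an illustration under stated assumptions: the ages are invented (not Pew data), and the names legal_and and legal_if are hypothetical, chosen so both versions can exist side by side.

```stata
* Toy data, invented for illustration; 99 is the "refused" code
clear
input age
18
21
45
99
end

* "&" version: the age < 99 test is part of the expression,
* so a row with age 99 evaluates to false and is stored as 0
generate legal_and = (age >= 21) & (age < 99)

* "if" version: rows that fail the qualifier are never assigned
* and are left as missing (.)
generate legal_if = (age >= 21) if (age < 99)

* Side-by-side listing, then a crosstab that keeps missing values
list age legal_and legal_if
tabulate legal_and legal_if, missing
```

Which version is preferable depends on whether refusals should count as "not legal" or be excluded from later analyses.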

11 Exercise 3: Illustrating Relationships

12 3a Show the relationship between age group and voting rate
What variables can you use?
    youth and voted
What command can you use? Open help.
    help tab, then tabulate twoway
Construct your syntax
    tab youth voted ___________
Use options to include percentages, like this ↓
    [Output: two-way table of youth by voted (columns 0, 1, Total) with Pearson chi2(1), Pr = 0.000; cell values not preserved in this transcript]

13 3b Show the relationship between age group and voting rate
    tab youth voted, row nofreq chi2
    [Output: two-way table of youth by voted (columns 0, 1, Total) with Pearson chi2(1) and Pr; values not preserved in this transcript]
So, is there a relationship between age and voting?
Among those younger than 30, 52% voted. But among those 30 or older, 81% voted.
Youth were less likely to have voted (p < .001).

14 3c Extra Challenge
What are the 4 ways the tabulate command can be written?
    tab youth                   1-way, frequencies
    tab youth voted             2-way, crosstab / contingency table
    tab youth voted cregion     too many variables
    tab1 y vote cr           →  tab y + tab vote + tab cr
    tab2 y vote cr           →  tab y vote + tab vote cr + tab y cr
    tab y, sum(vote)         →  tab y + sum vote          Means by Group
    tab y cr, sum(vote)      →  tab y cr + sum vote       Pivot Table
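These forms can be tried without the Pew file. The sketch below swaps in the auto example dataset that ships with Stata, so the variables (foreign, rep78, headroom, price) are stand-ins chosen for illustration, not the ones used in the exercises.

```stata
* Example dataset shipped with Stata, used here as a stand-in
sysuse auto, clear

* 1-way frequencies
tabulate foreign

* 2-way crosstab
tabulate foreign rep78

* tab1: a separate 1-way table for each listed variable
tab1 foreign rep78

* tab2: a 2-way table for every pair of listed variables
tab2 foreign rep78 headroom

* summarize() option: means of price by group (one grouping variable)
tabulate foreign, summarize(price)

* summarize() option with two grouping variables: a table of cell means
tabulate foreign rep78, summarize(price)
```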

15 That's All! Thanks for trying the Stata Exercises. If you have any questions about using Stata contact Debby Kermer at or see our online resources at:
