Published by Ῥούθ Κομνηνός. Modified over 5 years ago.
1
POL 604 Student R Tutorial
November 26, 2019. Miami University. Dr. Rachel Blum
2
Prompt 1 Investigating GGPlot2
By Kelleigh Beatty
3
Basic steps
1. Import your data and the packages you will need. Then use the library() function to tell R to use them.
2. Pick your dependent and independent variable(s). Then bind them to your data.
3. Make basic plots using the ggplot2 package. Then use the wonderful internet to find ways to make them prettier!
4
Basic steps
1. Import your data and the packages you will need. Then use the library() function to tell R to use them.
2. Pick your dependent and independent variable(s). Then recode them, if necessary (they are already bound to the dataset they came from).
3. Make basic plots using the ggplot2 package. Then use the wonderful internet to find ways to make them prettier!
5
Code for binding variables
Import:
> View(TeaParty_and_Congress)
> library(ggplot2)
> library(ggthemes)
> library(sjlabelled)
Code for binding variables:
> Trump2016Share <- TeaParty_and_Congress$Trump2016Share
> TPGroups.RepCD <- TeaParty_and_Congress$TPGroups.RepCD
> TPGroups.DemCD <- TeaParty_and_Congress$TPGroups.DemCD
6
Code for binding variables
Import:
> View(TeaParty_and_Congress)
> View(TPC)
> library(ggplot2)
> library(ggthemes)
> library(sjlabelled)
Code for binding variables:
> Trump2016Share <- TeaParty_and_Congress$Trump2016Share
> TPGroups.RepCD <- TeaParty_and_Congress$TPGroups.RepCD
> TPGroups.DemCD <- TeaParty_and_Congress$TPGroups.DemCD
7
Basic plots
> ggplot(TeaParty_and_Congress, aes(x=Trump2016Share, y=TPGroups.RepCD)) +
    geom_point() +
    theme_classic() +
    labs(x="Trump Vote Share", y="Number of Tea Party Groups in Rep. Districts")
8
Pretty plots
> ggplot(TeaParty_and_Congress, aes(x=Trump2016Share, y=TPGroups.RepCD)) +
    geom_point() +
    theme_classic() +
    labs(x="Trump Vote Share", y="Number of Tea Party Groups in Rep. Districts")
9
This code uses a template for the style The Economist uses for some of its graphics. I think it's pretty cool.
The end!
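The transcript does not show which theme call produced the Economist look. One way to get it, as a sketch only, is theme_economist() from the ggthemes package that the earlier slides load. The data frame below is made up for illustration and stands in for TeaParty_and_Congress.

```r
library(ggplot2)
library(ggthemes)  # provides theme_economist()

# Made-up stand-in for TeaParty_and_Congress (assumption, not the real data)
set.seed(1)
demo <- data.frame(
  Trump2016Share = runif(50, 20, 80),
  TPGroups.RepCD = rpois(50, 2)
)

# Same plot as on the slides, with theme_classic() swapped for theme_economist()
p <- ggplot(demo, aes(x = Trump2016Share, y = TPGroups.RepCD)) +
  geom_point() +
  theme_economist() +
  labs(x = "Trump Vote Share",
       y = "Number of Tea Party Groups in Rep. Districts")
print(p)
```

Swapping the theme line is the only change from the basic plot, so other ggthemes themes can be tried the same way.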
10
Prompt 2 identifying and fixing errors in code
By Sara Carnahan
11
Step-by-step guide
1. Insert Packages
   library(ggplot2)
   library(ggthemes)
   library(viridis)
2. Make dataframe
   TeaParty_and_Congress$CD <- as.data.frame(TeaParty_and_Congress$CD)
3. Do Not Bind Variables with the $
   CD <- TeaParty_and_Congress
   TP <- TPGroups
   Romney <- Romney2012Share
4. Trial and Error
   geomjitter() -> geom_jitter()
   theme_mnnimal() -> theme_minimal()
12
Initial code (with errors)
ggplot(CD, aes(y=TP, x=Romney, color=TP)) +
  geomjitter() +
  theme_mnnimal() +
  scale_color_viridis(option="inferno", direction=-1) +
  labs(title="Tea Party groups clustered in districts that went Republican in 2012" x = "2012 GOP presidential vote share", y="Groups per district")+
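A corrected version of the initial code might look like the sketch below. It applies the two fixes named in the step-by-step guide, adds the comma that is missing after the title string, and drops the dangling `+` at the end. Two things here are assumptions: the data frame is simulated, and ggplot2's built-in scale_color_viridis_c() is swapped in for viridis::scale_color_viridis() so that only one package is needed.

```r
library(ggplot2)

# Simulated stand-in for the district data (assumption, not the real dataset)
set.seed(2)
CD <- data.frame(Romney = runif(100, 20, 80))
CD$TP <- rpois(100, lambda = CD$Romney / 20)

p <- ggplot(CD, aes(y = TP, x = Romney, color = TP)) +
  geom_jitter() +                                        # was geomjitter()
  theme_minimal() +                                      # was theme_mnnimal()
  scale_color_viridis_c(option = "inferno", direction = -1) +
  labs(title = "Tea Party groups clustered in districts that went Republican in 2012",
       x = "2012 GOP presidential vote share",           # comma added after title
       y = "Groups per district")                        # trailing + removed
print(p)
```

R reports these problems one at a time, so rerunning after each fix is the "trial and error" the guide describes.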
14
Helpful extra code
Vertical Line: geom_vline(xintercept=60, color="red", size=1)
Horizontal Line: geom_hline(yintercept=?)
Edit Title of Plot: theme(plot.title = element_text(color="green", size=10, face="italic"))
Edit Background Panel: theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
Edit Font and text: #extrafont, #extrafontdb, #Rttf2pt1 #font_import()
theme(text = element_text(family="Lato Heavy", size=10), axis.text = element_text(color="green"), axis.title = element_text(color="green"))
15
Prompt 3 fun with color scales
By Eunice Laryea
16
Color palettes in R include
Viridis color scales: > install.packages("viridis")
ColorBrewer scales: > install.packages("RColorBrewer")
Grey color scales: > install.packages("ggplot2")
17
Viridis color palettes
> scale_color_viridis()  Change the color of points, lines and texts
> scale_fill_viridis()  Change the fill color of areas (box plot, bar plot, etc.)
Note: scale_color_viridis() and scale_fill_viridis() have an argument called option, a character string indicating the color map to use. Four options are available: "magma" (or "A"), "inferno" (or "B"), "plasma" (or "C"), and "viridis" (or "D", the default option).
ggplot(TeaParty_and_Congress, aes(y=TPGroups, x=Romney2012Share)) +
  geom_jitter() +
  scale_fill_viridis(option = "D") +
  theme_minimal() +
  geom_segment(aes(x = 65, y = 0, xend = 65, yend = 9.5), color="grey60", size=0.5, linetype="solid") +
  labs(title="Tea Party groups clustered in districts that went Republican in 2012", x = "\n 2012 GOP presidential vote share", y="Groups per district \n")
18
Scatterplots with color palettes: Viridis and RColorBrewer
19
Prompt 4 design your own model
By John Kirchoefer
20
DESIGN YOUR OWN MODEL. Use Trump2016 as your dependent variable. Then:
- Choose at least two independent variables. Recode them if you need to.
- Specify an OLS model using these variables.
- Show your results in an attractive TABLE.
- Document the steps you took as you would explain them to someone who's new to R.
summary(TeaParty_and_Congress)
View(TeaParty_and_Congress)
t.test(TeaParty_and_Congress$Trump2016Share)
Trump = TeaParty_and_Congress$Trump2016Share
Romney = TeaParty_and_Congress$Romney2012Share
Party = TeaParty_and_Congress$Party
TEAGROUP = TeaParty_and_Congress$TPGroups
table(Romney, Trump)
table(TEAGROUP, Trump)
21
plot(Romney, Trump)
# In a formula, the variable left of ~ goes on the y-axis, so Trump comes first to match the axis labels
plot(Trump~Romney, type="p", xlab = "Vote Share for Romney in 2012", ylab = "Vote Share for Trump in 2016", pch=18, col="red", bty="l")
abline(lm(Trump~Romney), col = "blue")
plot(Trump~TEAGROUP, xlab = "Tea Party Groups per District", ylab = "Vote Share for Trump in 2016", pch=18, col="red", bty="l")
abline(lm(Trump~TEAGROUP), col = "blue")
OLSTRUMP = lm(formula = Trump ~ Romney + TEAGROUP)
summary(OLSTRUMP)
OLSTRUMP$fitted.values
OLSTRUMP$residuals
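The prompt asks for the results in a table, which the slide code stops short of. A minimal sketch using only base R is to pull the coefficient matrix out of summary() and round it. The data here are simulated under assumed coefficients, so the numbers say nothing about the real dataset.

```r
# Simulated stand-ins for the three variables (assumption, not the real data)
set.seed(3)
n <- 200
Romney   <- runif(n, 20, 80)
TEAGROUP <- rpois(n, 2)
Trump    <- 5 + 0.9 * Romney + 0.5 * TEAGROUP + rnorm(n, sd = 4)

# Same model specification as on the slide
OLSTRUMP <- lm(Trump ~ Romney + TEAGROUP)

# Coefficient table: estimate, standard error, t value, p value
coef_table <- round(coef(summary(OLSTRUMP)), 3)
print(coef_table)
```

The result is an ordinary matrix with one row per term, so it can be passed on to whatever table-formatting package you prefer.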
22
Prompt 5 descriptive statistics
By Lauren Strope
23
Boxplot
pdf("Plot1Trump.pdf", width=4, height=6)
plot(TPC$Party, TPC$Trump2016Share, xlab="Party", ylab="Trump Vote Share", pch=18, col="gray", bty="l", main="2016 District Vote Share by Party")
dev.off()
24
Scatterplot with line of best fit
pdf("Plot3TrumpRomney.pdf", width=4, height=6)
plot(TPC$Trump2016Share, TPC$Romney2012Share, xlab="Trump 2016 Vote Share", ylab="Romney 2012 Vote Share", pch=18, col="gray", bty="l", main="Trump & Romney Vote Share")
# The regression is y ~ x, so Romney (y-axis) comes first to match the plot
abline(lm(TPC$Romney2012Share ~ TPC$Trump2016Share))
dev.off()
25
Density plot
d <- density(TPC$Trump2016Share)
plot(d, main="% of District Voting for Trump in 2016")
26
Crosstabs
install.packages("gmodels")
library(gmodels)
CrossTable(TPC$Party, TPC$Congress)
27
Prompt 6 descriptive statistics & plots for continuous variables
By Morgan McCracken
28
Where to start
Determine the continuous variables in your data set: a continuous variable has a potentially infinite number of values. Order matters.
From the TeaParty_and_Congress.csv, I chose the continuous variables of:
- TPGroups: Number of local Tea Party groups that were active in that district between 2009 and 2015.
- Romney2012Share: Percentage of that district that voted for Mitt Romney in the 2012 presidential election.
- Trump2016Share: Percentage of that district that voted for Donald Trump in the 2016 presidential election.
29
Creating tables
Rename the dataset to make it easier on yourself! (read_csv() comes from the readr package.)
newname <- read_csv("dataset.csv")
R will generate summary statistics for you.
summary(dataset$variable)
This returns the minimum, 1st quartile, median, mean, 3rd quartile, and maximum.
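As a small worked example of the two commands above, the sketch below writes a made-up CSV to a temporary file, reads it back, and summarizes one column. Base R's read.csv() is used in place of readr's read_csv() so that no extra package is needed. The file and column names are placeholders.

```r
# Write a tiny made-up dataset to a temporary CSV (placeholder data)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(variable = c(2, 4, 4, 5, 7, 9, 11)), tmp, row.names = FALSE)

# Read it in under a short name, as the slide suggests
newname <- read.csv(tmp)

# Six-number summary of one column
s <- summary(newname$variable)
print(s)
# Min. 2, 1st Qu. 4, Median 5, Mean 6, 3rd Qu. 8, Max. 11
```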
30
Creating plots
Plots that can show the distribution of data: histograms, box plots, density plots,…
The command for density plots:
d <- density(variable)
plot(d, main="name", xlab="variable", col="black")
It is up to you, however, to determine which plot best describes your data.
31
Creating plots
Plots that can show the distribution of data: histograms, box plots, density plots,…
The command for box plots:
boxplot(name of variable, main="main title", col="color of boxes", border="border color", horizontal=TRUE, ylab="y-axis", xlab="x-axis", cex=.5)
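With the placeholders filled in, the density and box plot templates might look like the sketch below. The vote-share values are simulated for illustration, and the titles and colors are arbitrary choices.

```r
# Simulated vote shares (assumption, not the real dataset)
set.seed(5)
share <- rnorm(300, mean = 50, sd = 12)

# Density plot: estimate the density first, then plot the estimate
d <- density(share)
plot(d, main = "Simulated vote share", xlab = "Vote share (%)", col = "black")

# Box plot of the same variable, drawn horizontally
b <- boxplot(share, main = "Simulated vote share", col = "lightgray",
             border = "black", horizontal = TRUE, xlab = "Vote share (%)")
```

boxplot() also returns the five numbers it draws (in b$stats), which is handy for checking the picture against summary().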
32
Prompt 7 DECIPHERING CODE TO MAKE MAPS
By Quinn Riley
33
Steps
- Installing appropriate packages AND setting them as libraries (i.e. what packages do I need that go beyond basic plotting functions?)
- Setting a working directory (i.e. R won't know where the data is that you want it to be working with unless you tell it.)
- Recoding variables (i.e. should R be reading a variable as a factor? A character?)
- Using ggplot and other packages to make a map, paying close attention to variable names and "fill" (i.e. what variable(s) should the map be considering? How can I use colors to differentiate the density of my variable in different states, regions, or congressional districts?)
34
Recoded variables
35
Map code
36
Errors encountered and resolved
- Issue is with component called "fill"
- Google error message → read
- Look up ggplot2 → read (a website like R-statistics.co has complete tutorials)
- Look at the codebook → what am I trying to do here? Ah! TP groups by congressional districts.
- Therefore, the solution is… fill = TPGroups
38
Prompt 8 summarize variation across states
By Megan Burtis
39
Steps to aggregate data
1. Import data
2. Install and load any relevant packages that you may need to produce graphs
3. Inspect variables and decide which ones are needed: is(variable), table(variable), State=StateAbbr$TeaParty
4. Need to aggregate data as states have multiple entries
5. Make a table with the two variables we need: Table1=table(State,TPGroups)
6. Need to add groups for each state: rowSums(Table1)

Output of Table1 (partially captured):
State    0    1    2    7    8    9
AK       0    0    0    0    0    0
AL       2    4    1    0    0    0
AR       0    3    0    1    0    0
AZ       2    1    3    2    0    0
CA      28   17
CO       2    2    0    3    0    0
CT       3    2    0    0    0    0
DC       1    0    0    0    0    0
DE       0    1    0    0    0    0
FL      12    7    7    1    0    0
GA       3    7    3    0    1    0

Output of rowSums(Table1) (partially captured):
AK AL AR AZ CA CO CT DC DE FL
1 5 4
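One thing worth checking when aggregating this way: table(State, TPGroups) counts rows, so rowSums() on it gives the number of rows (districts) per state, not the number of groups. If the goal is the total number of groups per state, one option is tapply() with sum. The sketch below assumes each row is one district and uses made-up values.

```r
# Made-up district-level data (assumption): one row per district
State    <- c("AL", "AL", "AL", "AZ", "AZ", "DE")
TPGroups <- c(0, 1, 2, 0, 7, 1)

# Cross-tab of state by group count: each cell is a number of districts
Table1 <- table(State, TPGroups)

# Row sums of the cross-tab: how many districts each state has
districts_per_state <- rowSums(Table1)
print(districts_per_state)   # AL 3, AZ 2, DE 1

# Sum of TPGroups within each state: how many groups each state has
groups_per_state <- tapply(TPGroups, State, sum)
print(groups_per_state)      # AL 3, AZ 7, DE 1
```

The two results only agree by coincidence (AL and DE here), so it is worth deciding which quantity the plot title should describe.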
40
Barplots
counts = rowSums(Table1)
barplot(counts, main="Number of TP Groups by State", xlab="State Abbreviation", ylab="Number of TP Groups", col="color", horiz=TRUE, names.arg=names(counts))
Replace "color" with a color name such as "gray". names(counts) supplies one label per bar.
41
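Scatterplot
plot(counts, type="p", xlab="State", ylab="Number of TP Groups", main="Tea Party Groups by State", col="color", pch="", font="")
text(counts, labels=names(counts), cex=0.8, font=2, col="red")
Fill in col, pch, and font with your own values. names(counts) keeps each label matched to its point.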
42
Prompt 9 data merging By Joe Humenick
43
Prompt
1. Find one that also contains an "icpsr" ID for each member (look for the 113th Congress), and also contains another variable about Congress that could be interesting.
2. Download it and merge the two datasets using the icpsr variable. Look into the dplyr package and focus on commands like "select," "merge," and so forth.
3. Merge the datasets.
44
Steps 1 and 2
Searched databases for unique data from the 113th Congress. I found Google to be helpful, as ICPSR can have sparse amounts of relevant data.
Obtained the data in .xls form, then altered the column titles within the data set to match the Tea Party data set.
45
Step 3
3. Used the merge() command with a valid key variable in each data set to merge the data. by.x names the key in the first data set and by.y names the key in the second, allowing R to match rows and combine them:
merge(dataset1, dataset2, by.x = "variable1", by.y = "variable2")
merge(Unity113, TeaParty_and_Congress, by.x = "Idno", by.y = "Name")
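To see what merge() does with a shared ID, here is a minimal sketch on two made-up data frames keyed by icpsr. All column names other than icpsr and Party, and all values, are invented for illustration.

```r
# Two made-up data frames that share an "icpsr" ID column (placeholder values)
votes   <- data.frame(icpsr = c(101, 102, 103), unity = c(0.91, 0.85, 0.97))
members <- data.frame(icpsr = c(102, 103, 104), Party = c("R", "D", "R"))

# Default merge keeps only IDs found in both data frames (102 and 103)
merged <- merge(votes, members, by = "icpsr")
print(merged)

# all.x = TRUE keeps every row of the first data frame; unmatched cells become NA
merged_all <- merge(votes, members, by = "icpsr", all.x = TRUE)
print(merged_all)

# When the key has a different name in each data frame, use by.x and by.y
members2 <- data.frame(id = c(102, 103), Party = c("R", "D"))
merged_xy <- merge(votes, members2, by.x = "icpsr", by.y = "id")
```

Comparing nrow() before and after the merge is a quick way to spot members who were dropped because their IDs did not match.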
46
Prompt 10 subsetting By Hannah Specogna
47
How to subset
1. Establish which main variable you would like to subset on; in this case it is Party.
   TPC$Party=as.factor(TPC$Party)
2. Pick which value of that variable you would like to keep; in this case it's Republicans.
   TPR <- subset(TPC, Party=="R")
3. To double check you followed the steps correctly, create a table on the established variable:
   table(TPR$Party)
   The output should show D - 0, with every row counted under R.
4. Use this subset to identify a subgroup in another variable, Tea Party Caucus Members.
   TeaPC <- subset(TPR, TPCaucus=="1")
*Remember to use the double equal sign (==) when testing whether a variable equals a value*
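The steps above can be tried end to end on a small made-up data frame, as in the sketch below. The NonTeaPC line is a guess at how the non-caucus subset used on the next slide was built; the transcript does not show its definition.

```r
# Made-up stand-in for TPC (assumption, not the real data)
TPC <- data.frame(
  Party          = c("R", "R", "D", "R", "D"),
  TPCaucus       = c(1, 0, 0, 1, 0),
  Trump2016Share = c(60, 55, 30, 65, 35)
)

# Step 1: treat Party as a factor
TPC$Party <- as.factor(TPC$Party)

# Step 2: keep only Republicans
TPR <- subset(TPC, Party == "R")

# Step 3: check the result; D should be 0 because the factor level is kept
print(table(TPR$Party))

# Step 4: split Republicans by Tea Party Caucus membership
TeaPC    <- subset(TPR, TPCaucus == 1)
NonTeaPC <- subset(TPR, TPCaucus == 0)  # hypothetical definition, not shown in the source
```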
48
How to create plots
plot(jitter(NonTeaPC$TPGroups, 8), NonTeaPC$Trump2016Share,
     main="Non Tea Party Caucus Members Support for Trump by Tea Party Group Membership",
     xlab="Tea Party Groups", ylab="Percentage of Vote for Trump", pch=19)
49
How to create plots
plot(jitter(TeaPC$TPGroups, 2), TeaPC$Trump2016Share,
     main="Tea Party Caucus Member Support for Trump by Tea Party Group Membership",
     xlab="Tea Party Groups", ylab="Percentage of Vote for Trump", pch=19)
50
Prompt 11 Working with themes and scales on a scatterplot
By Meghan Brandabur
51
What I started with:
Packages needed:
> library(ggplot2)
> library(ggthemes)
> library(viridis)
52
Steps I took:
ggplot(data, aes(y=YourDV, x=YourIV, color=NameofDV)) +    # Use the ggplot() command. Fill in using dataset
  geom_jitter() +
  theme_minimal() +
  scale_color_viridis(option="inferno", alpha=1, begin=0, end=.9, direction=-1, name="Groups", breaks=seq(0, 10, 2)) +    # Then options
  labs(title="name of plot", x="name of x axis", y="name of y axis") +    # Then you need to label
  geom_vline(xintercept = 50, color="blue", size=1) +    # Create the lines
  geom_vline(xintercept = 60, size=1) +
  theme(plot.title = element_text(color="blue", size=10),    # Make title stand out
        panel.grid.major = element_blank(), panel.grid.minor = element_blank())    # Get rid of the grid lines
53
What I ended with:
ggplot(TeaParty_and_Congress, aes(y=TPGroups, x=Romney2012Share, color=TPGroups)) +
  geom_jitter() +
  theme_minimal() +
  scale_color_viridis(option="inferno", alpha=1, begin=0, end=.9, direction=-1, name="Groups", breaks=seq(0, 10, 2)) +
  labs(title="Tea Party groups clustered in districts that went Republican in 2012", x = "2012 GOP presidential vote share", y="Groups per district") +
  geom_vline(xintercept = 50, color="blue", size=1) +
  geom_vline(xintercept = 60, size=1) +
  theme(plot.title = element_text(color="blue", size=10)) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())