Published by Devin Massey. Modified about 1 year ago.

1
Outline
Research Question: What determines height?
Data Input
Look at One Variable
Compare Two Variables
Children’s Height and Parents’ Height
Children’s Height and Gender
Graphic Packages: ggplot2

2
What factors are most responsible for height?

3
Galton’s Family Height Dataset (columns X1, X2, X3, Y)

4
Galton’s Notebook on Families & Height

5
> getwd()
[1] "C:/Users/johnp_000/Documents"
> setwd()

6
Dataset Input
h <- read.csv("GaltonFamilies.csv")
Object: h. Function: read.csv(). Filename: "GaltonFamilies.csv".

7
Data Types: Numbers and Factors/Categorical
str()
summary()
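
As a rough sketch of what str() and summary() report, the block below runs them on a small toy data frame. The values are made up for illustration, and the column names (father, mother, gender, childHeight) are an assumption about the layout of GaltonFamilies.csv. Note that in R 4.0 and later, read.csv() keeps text columns as character unless stringsAsFactors = TRUE is passed, so gender may need that argument to arrive as a factor.

```r
# Toy stand-in for the Galton data; values are made up, column names are assumed.
h <- data.frame(
  father      = c(78.5, 75.5, 75.0, 74.0, 73.0, 72.7),
  mother      = c(67.0, 66.5, 64.0, 68.0, 65.0, 69.0),
  gender      = factor(c("male", "female", "male", "female", "male", "female")),
  childHeight = c(73.2, 69.2, 71.0, 68.5, 70.5, 66.0)
)

str(h)      # one line per column: its type and the first few values
summary(h)  # quartiles and mean for numbers, counts per level for factors

# Split the column names by data type.
num_cols <- names(h)[sapply(h, is.numeric)]
fac_cols <- names(h)[sapply(h, is.factor)]
print(num_cols)
print(fac_cols)
```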

8
Steps: Histogram, Scatter, Boxplot
Type: Continuous. Variable: Child’s Height
Type: Continuous. Variable: Dad’s Height, Mom’s Height
Type: Categorical. Variable: Gender

9
Frequency Distribution, Histogram
hist(h$childHeight)

10
Density Plot (Area = 1)
plot(density(h$childHeight))

11
Mode, Bimodal
hist(h$childHeight, freq = FALSE, breaks = 25, ylim = c(0, 0.14))
curve(dnorm(x, mean = mean(h$childHeight), sd = sd(h$childHeight)), col = "red", add = TRUE)

12
Grammar of Graphics: Seven Components
ggplot2 is built using the grammar of graphics approach.
Diagram labels: formations, Legend, Axes

13
Hadley Wickham and ggplot2
Asst. Professor of Statistics at Rice University
Packages: ggplot2, plyr, reshape, rggobi, profr
http://ggplot2.org/

14
ggplot2 Plot
In ggplot2, a plot is made up of layers.
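
One way to see the layering idea is the sketch below: a base object holds only the data and the aesthetic mapping, and each geom added with + becomes one more layer. The data are simulated (not Galton's), and the name childHeight is only an assumed column name. after_stat() assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Simulated heights, for illustration only.
set.seed(1)
toy <- data.frame(childHeight = rnorm(200, mean = 67, sd = 3.5))

# Base object: data + aesthetic mapping, no layers yet.
base <- ggplot(toy, aes(x = childHeight))

# Each geom_*() call adds one layer on top of the base.
p <- base +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1, fill = "grey80") +
  geom_density(colour = "red") +
  labs(x = "Height", y = "Density")

print(p)
```

Printing base on its own draws empty axes, which makes it easy to check the mapping before any layer is added.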

15
ggplot2
library(ggplot2)
h.gg <- ggplot(h, aes(childHeight))
h.gg + geom_histogram(binwidth = 1) + labs(x = "Height", y = "Frequency")
h.gg + geom_density()

16
ggplot2
h.gg <- ggplot(h, aes(childHeight)) + theme(legend.position = "right")
h.gg + geom_density() + labs(x = "Height", y = "Density")
h.gg + geom_density(aes(fill = factor(gender)), size = 2)

17
Box Plot

18
Children’s Height vs. Gender
boxplot(childHeight ~ gender, data = h, col = c("pink", "lightblue"), main = "Children's Height by Gender", xlab = "Gender", ylab = "")

19
Descriptive Stats: Box Plot

20
Subset Males
men <- subset(h, gender == 'male')

21
Subset Females
women <- subset(h, gender == 'female')

22
Children’s Height: Males
hist(men$childHeight)

23
Children’s Height: Females
hist(women$childHeight)

24
ggplot2
library(ggplot2)
h.bb <- ggplot(h, aes(factor(gender), childHeight))
h.bb + geom_boxplot()
h.bb + geom_boxplot(aes(fill = factor(gender)))

25
Steps: Histogram, Scatter, Boxplot
Type: Continuous. Variable: Child’s Height (Y)
Type: Continuous. Variable: Dad’s Height (X1), Mom’s Height (X2)
Type: Categorical. Variable: Gender (X3)

26
Correlation

27
?cor
cor(h$father, h$childHeight)
[1] 0.2660385
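
To make clear what cor() returns by default, the sketch below recomputes Pearson's r from its definition and compares it with the built-in result. The six pairs of heights are made up for illustration and are not taken from the Galton data.

```r
# Made-up father/child heights in inches, for illustration only.
father <- c(78.5, 75.5, 75.0, 74.0, 73.0, 72.7)
child  <- c(73.2, 69.2, 71.0, 68.5, 70.5, 66.0)

# Built-in Pearson correlation (the default method of cor()).
r_builtin <- cor(father, child)

# Same quantity from the definition:
# sum of cross-products of deviations / sqrt(product of sums of squared deviations).
dx <- father - mean(father)
dy <- child - mean(child)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

print(c(builtin = r_builtin, manual = r_manual))
```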

28
Scatterplot Matrix: pairs()
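
A minimal pairs() sketch, assuming the numeric columns are named father, mother and childHeight; the values below are made up for illustration. The matching table of correlations is printed alongside the plot.

```r
# Toy stand-in for the numeric columns; values are made up.
h <- data.frame(
  father      = c(78.5, 75.5, 75.0, 74.0, 73.0, 72.7),
  mother      = c(67.0, 66.5, 64.0, 68.0, 65.0, 69.0),
  childHeight = c(73.2, 69.2, 71.0, 68.5, 70.5, 66.0)
)

num <- h[, c("father", "mother", "childHeight")]

# One scatterplot for every pair of columns.
pairs(num, main = "Scatterplot matrix (toy data)")

# Pairwise Pearson correlations, rounded for reading.
cm <- round(cor(num), 2)
print(cm)
```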

29
Correlations Matrix
library(car)
scatterplotMatrix(heights)

30
ggplot2

31
Analytics & History: 1st Regression Line
The first “Regression Line”
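
A regression line of child's height on father's height could be fitted along these lines. This is a sketch on simulated data: the intercept, slope and noise level used to generate the numbers are arbitrary choices, not estimates from Galton's families, and the column names are assumed.

```r
# Simulated data; the generating values 40, 0.4 and sd = 3 are arbitrary.
set.seed(42)
father      <- rnorm(100, mean = 69, sd = 2.5)
childHeight <- 40 + 0.4 * father + rnorm(100, sd = 3)
toy <- data.frame(father, childHeight)

# Least-squares fit of child's height on father's height.
fit <- lm(childHeight ~ father, data = toy)
print(coef(fit))

# Scatterplot with the fitted line drawn on top.
plot(childHeight ~ father, data = toy,
     xlab = "Father's height", ylab = "Child's height")
abline(fit, col = "red")

# The fitted slope equals r * sd(y) / sd(x), which ties it to the correlation.
slope_from_r <- cor(toy$father, toy$childHeight) *
  sd(toy$childHeight) / sd(toy$father)
```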

32
Steps: Histogram, Scatter, Boxplot
Type: Continuous. Variable: Child’s Height
Type: Continuous. Variable: Dad’s Height, Mom’s Height
Type: Categorical. Variable: Gender

33
Appendix

34
What software do you use for creating charts or data visualizations? (May 2013, N=172)
.NET BIRT cytoscape flot gephi gnuplot graphite iDashboards Incanter Java JMP Javascript: Raphael Highcharts Arbor jfreecharts BI Tools Spotfire Cognos MicroStrategy LogiXML MDX Mondrian octave openlayers OpenViz PHP Powerpoint precog Prezi processing Ptotobi Silverlight splunk SSRS talend webGL Wijmo WPF Xcelsius XLMiner

35
Easy to Use Interactive Standard Visualizations Steep Learning Curve Visualization and Reporting

37
BI Software: Tableau
http://public.tableausoftware.com/views/PapelbonPitchFX/PapelbonPitchFX

38
http://rcharts.io/gallery/

39
https://plot.ly/r/

40
http://shiny.rstudio.com/gallery/movie-explorer.html

41
The next data visual was produced with about 150 lines of R code

42
Data Viz Tutorials
