Data Visualization with R (II)


1 Data Visualization with R (II)
Dr. Jieh-Shan George YEH

2 Outline
Data Visualization with R
Visualizing Different Types of Data
   Univariate
   Univariate Categorical
   Bivariate Categorical
   Bivariate: Continuous vs Categorical
   Bivariate: Continuous vs Continuous
   Bivariate: Continuous vs Time

3 Data Visualization with R
Both anecdotally, and per Google Trends, R is the language and tool most closely associated with creating data visualizations.

4 Google Trend on R & Data Visualization

5 Google Trend on R & Data Visualization

6 Graphs for Data Mining

7 Hierarchical Clustering
hc <- hclust(dist(mtcars))
plot(hc)
rect.hclust(hc, k=4)
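The rectangles drawn by rect.hclust() only mark the four clusters on the dendrogram. As a minimal sketch (the variable name groups is an illustrative choice, not from the slides), cutree() can return the same four-way split as a vector of labels:

```r
# Cluster the built-in mtcars data as on the slide
hc <- hclust(dist(mtcars))

# cutree() returns a cluster label (1..k) for every car;
# k = 4 matches the rect.hclust(hc, k = 4) call
groups <- cutree(hc, k = 4)

# Number of cars in each of the four clusters
print(table(groups))

# Names of the cars assigned to the first cluster
print(names(groups)[groups == 1])
```

The labels can then be used like any other grouping variable, for example to color points in a later plot.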

8 Decision Tree
require(rpart)
require(rpart.plot)
rp1 <- rpart(factor(cyl)~mpg, data=mtcars)
prp(rp1)
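To see how well the plotted tree separates the cylinder classes, one option is to compare its predictions with the actual values. This is a sketch that assumes only the rpart package; the name pred is illustrative:

```r
library(rpart)

# Fit the same classification tree as on the slide
rp1 <- rpart(factor(cyl) ~ mpg, data = mtcars)

# Predicted class for each car in the training data
pred <- predict(rp1, type = "class")

# Cross-tabulate actual vs. predicted number of cylinders
print(table(actual = mtcars$cyl, predicted = pred))
```

Because the table is built from the training data itself, it shows how the tree fits these 32 cars, not how it would perform on new data.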

9 OTHERS

10 Financial Time Series
Quantitative Financial Modeling Framework
require(quantmod)
getSymbols("YHOO",src="google") # from google finance
getSymbols("YHOO", from=" ")
chartSeries(YHOO)

11
barChart(YHOO)
candleChart(YHOO,multi.col=TRUE,theme="white")
chartSeries(to.weekly(YHOO),up.col='white',dn.col='blue')

12 ggplot2

13 ggplot2
The ggplot2 package, created by Hadley Wickham, offers a powerful graphics language for creating elegant and complex plots. Originally based on Leland Wilkinson's The Grammar of Graphics, ggplot2 allows you to create graphs that represent both univariate and multivariate numerical and categorical data in a straightforward manner. Grouping can be represented by color, symbol, size, and transparency. The creation of trellis plots (i.e., conditioning) is relatively simple.
qplot() (for quick plot) hides much of this complexity when creating standard graphs.

14 qplot()
The qplot() function can be used to create the most common graph types. While it does not expose ggplot2's full power, it can create a very wide range of useful plots. The format is:
qplot(x, y, data=, color=, shape=, size=, alpha=, geom=, method=, formula=, facets=, xlim=, ylim=, xlab=, ylab=, main=, sub=)
Notes:
At present, ggplot2 cannot be used to create 3D graphs or mosaic plots.
Use I(value) to indicate a specific value. For example, size=z makes the size of the plotted points or lines proportional to the values of a variable z. In contrast, size=I(3) sets every point or line to a fixed size of 3.
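The difference between mapping a variable and setting a fixed value with I() can be sketched as follows. The object names are illustrative, and the ggplot() lines show what is assumed to be the equivalent full-syntax form. Note that qplot() is marked as deprecated in recent ggplot2 releases, so it may emit a warning:

```r
library(ggplot2)

# Mapping: point size varies with the values of wt, and a legend is drawn
p_mapped <- qplot(hp, mpg, data = mtcars, size = wt)

# Setting: I(3) fixes every point at size 3, with no legend
p_fixed <- qplot(hp, mpg, data = mtcars, size = I(3))

# Assumed equivalents in full ggplot() syntax:
# a mapping goes inside aes(), a fixed value goes outside it
g_mapped <- ggplot(mtcars, aes(hp, mpg)) + geom_point(aes(size = wt))
g_fixed  <- ggplot(mtcars, aes(hp, mpg)) + geom_point(size = 3)

print(p_mapped)
print(p_fixed)
```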

15 Customizing ggplot2 Graphs
Unlike base R graphs, ggplot2 graphs are not affected by many of the options set in the par() function. They can be modified using the theme() function, and by adding graphic parameters within the qplot() function. For greater control, use ggplot() and other functions provided by the package. ggplot2 functions can be chained with "+" signs to generate the final plot.
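As a minimal sketch of chaining with "+" and customizing through theme(), the following builds a scatterplot layer by layer. The titles and theme settings are illustrative choices, not taken from the slides:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +   # data and aesthetic mappings
  geom_point(size = 3) +                      # scatterplot layer
  labs(title = "MPG by Weight",
       x = "Weight", y = "Miles per Gallon") +  # titles and axis labels
  theme_bw() +                                # a complete built-in theme
  theme(plot.title = element_text(face = "bold"))  # tweak one theme element

print(p)
```

Each "+" adds one component to the plot object, so a plot can be stored in a variable and extended later.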

16

17 Example
# ggplot2 examples
library(ggplot2)
# create factors with value labels
mtcars$gear <- factor(mtcars$gear, levels=c(3,4,5),
   labels=c("3gears","4gears","5gears"))
mtcars$am <- factor(mtcars$am, levels=c(0,1),
   labels=c("Automatic","Manual"))
mtcars$cyl <- factor(mtcars$cyl, levels=c(4,6,8),
   labels=c("4cyl","6cyl","8cyl"))

18
# Kernel density plots for mpg
# grouped by number of gears (indicated by color)
qplot(mpg, data=mtcars, geom="density", fill=gear, alpha=I(.5),
   main="Distribution of Gas Mileage", xlab="Miles Per Gallon",
   ylab="Density")

19
# Scatterplot of mpg vs. hp for each combination of gears and cylinders
# in each facet, transmission type is represented by shape and color
qplot(hp, mpg, data=mtcars, shape=am, color=am,
   facets=gear~cyl, size=I(3),
   xlab="Horsepower", ylab="Miles per Gallon")

20
# Separate regressions of mpg on weight for each number of cylinders
qplot(wt, mpg, data=mtcars, geom=c("point", "smooth"),
   method="lm", formula=y~x, color=cyl,
   xlab="Weight", ylab="Miles per Gallon",
   main="Regression of MPG on Weight")
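Newer ggplot2 versions may ignore the method= and formula= arguments of qplot() with a warning and fall back to the default smoother. A sketch of the same per-cylinder regression in full ggplot() syntax, where the method is passed directly to geom_smooth(), is shown below. It works on a copy named cars (an illustrative name) so that mtcars itself is left unchanged:

```r
library(ggplot2)

# Work on a copy and label the cylinder counts as on the earlier slide
cars <- mtcars
cars$cyl <- factor(cars$cyl, levels = c(4, 6, 8),
                   labels = c("4cyl", "6cyl", "8cyl"))

# Points plus one linear fit per cylinder group;
# color = cyl in aes() splits both layers by group
p <- ggplot(cars, aes(x = wt, y = mpg, color = cyl)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  labs(title = "Regression of MPG on Weight",
       x = "Weight", y = "Miles per Gallon")

print(p)
```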

21
# Boxplots of mpg by number of gears
# observations (points) are overlaid and jittered
qplot(gear, mpg, data=mtcars, geom=c("boxplot", "jitter"),
   fill=gear, main="Mileage by Gear Number",
   xlab="", ylab="Miles per Gallon")

22 To learn more, see the ggplot reference site

