Graphical Display 1 Pictures of Data. Edward Tufte Visual Display of Quantitative Information Using graphics for –Data –Concepts –Instruction –Finding.


1 Graphical Display 1 Pictures of Data

2 Edward Tufte, The Visual Display of Quantitative Information. Using graphics for –Data –Concepts –Instruction –Finding patterns and answering questions

3

4 Simple Plots One variable or one variable plus a grouping variable Reveals shape of the distribution Distribution of cases over a categorical variable Evidence that metric variable follows a normal distribution

5 Tufte’s Rules Above all else show data Maximize the data/ink ratio Erase non-data-ink Erase redundant data-ink Revise and edit

6 Pie Charts One variable, nominal scale, percentages A few categories (3-6) Exploded for emphasis Best in groups (2-4) Best replaced by other graphic displays

7 Rcmdr pie diagram of language family in the CA Indians data set

8 3D Pies Strongly discouraged, but plotrix has them: –install.packages("plotrix") –library(plotrix) –pie3D(table(CAIndians$Language), radius=.85, labels = levels(CAIndians$Language), explode=.1)

9

10 Bar Charts One variable, nominal scale, counts or percentages More categories than pie charts (4-15) Color or shaded Can be stacked, set side-by-side or rotated Can compare 2-3 groups

11

12 barplot Rcmdr uses this command: –barplot(table(CAIndians$Language), xlab="Language", ylab="Frequency") Just insert the following before the final right parenthesis to get it in color (rainbow_hcl() comes from the colorspace package): –, col=rainbow_hcl(6)

13

14 Barplots with 2 groups Side by side or stacked bar plots make it possible to compare two categories at a time Load MissFauna and extract the Lilbourn samples: –Lilbourn <- as.matrix(MissFauna[5:7,]) –LilbournPct <- prop.table(Lilbourn, 1)*100

15 Lilbourn Barplots
par(mfrow=c(2,2))
barplot(LilbournPct)
barplot(t(LilbournPct))
barplot(LilbournPct, beside=TRUE, legend=TRUE)
barplot(t(LilbournPct), beside=TRUE, legend=TRUE)
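The row-percentage step used for the Lilbourn samples can be checked on a small table before plotting. The sketch below is only an illustration: the count matrix, sample names and taxon names are made up and are not taken from MissFauna.

```r
# Hypothetical counts: rows are samples, columns are taxa (not MissFauna data)
counts <- matrix(c(10, 30, 40, 20,
                    5, 15, 20, 10,
                   25, 25, 25, 25),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("S1", "S2", "S3"),
                                 c("Deer", "Fish", "Bird", "Turtle")))

# margin = 1 divides each row by its row total, giving row percentages
pct <- prop.table(counts, 1) * 100
row_totals <- rowSums(pct)   # each row should total 100

# barplot() draws one group of bars per column, so transpose to group by sample
barplot(t(pct), beside = TRUE, legend.text = TRUE,
        xlab = "Sample", ylab = "Percent")
```

Transposing decides what the bars are grouped by: with t(pct) each sample gets its own cluster of taxon bars, while pct clusters the samples within each taxon.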

16

17 Dot Chart Dot charts plot the amounts in each group along a common scale so they are more easily comparable: –dotchart(as.vector(table(CAIndians$Language)), pch=16, labels = levels(CAIndians$Language), xlab = "Frequency")

18

19 Stem and Leaf Plot Proposed by Tukey as a way of looking at the distribution of a numeric variable With a small sample, it can preserve the original data while showing its shape

20 > stem.leaf(DartPoints$Length, unit=1, na.rm=TRUE)
1 | 2: represents 12
 leaf unit: 1
            n: 55
    7    3* | 1223334
   13    3. | 556788
   25    4* | 011222333344
  (11)   4. | 55777888999
   19    5* | 224
   16    5. | 55679
   11    6* | 01144
    6    6. | 567
    3    7* | 4
    2    7. | 8
    1    8* | 4
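If the stem.leaf() function is not available, base R's stem() gives a plainer display of the same kind, without the cumulative counts. A minimal sketch, assuming a made-up vector of lengths (these are not the DartPoints values):

```r
# Hypothetical lengths in mm (not the DartPoints data)
len <- c(31, 32, 35, 38, 40, 41, 42, 45, 47, 48, 52, 55, 56, 61, 64)

# stem() is in base R; scale = 2 makes the display roughly twice as long
# as the default, which spreads the leaves over more stems
stem(len, scale = 2)

# Capture the printed display as text, for example to paste into a report
display <- capture.output(stem(len, scale = 2))
```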

21 Histograms Like a bar plot but the x-axis is a continuous measurement. Shape of data distribution is shown, but number of bars can change the shape.
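The effect of the number of bars can be seen by drawing the same data with two sets of break points. This is a sketch with made-up values (not from DartPoints); the break points are chosen only for illustration.

```r
# Hypothetical measurements (not from the DartPoints data)
x <- c(31, 32, 35, 38, 40, 41, 42, 45, 47, 48,
       52, 55, 56, 61, 64, 67, 74, 78, 84)

# Explicit break points: wide bins versus narrow bins over the same range
wide   <- hist(x, breaks = seq(30, 90, by = 20), plot = FALSE)
narrow <- hist(x, breaks = seq(30, 90, by = 5),  plot = FALSE)

# Draw the two versions side by side
par(mfrow = c(1, 2))
plot(wide, main = "3 bins", xlab = "x")
plot(narrow, main = "12 bins", xlab = "x")
par(mfrow = c(1, 1))
```

Both histograms contain every observation; only the grouping differs, so any bumps or gaps that appear in one version and not the other should be read with caution.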

22

23 Boxplot Also called box-and-whisker plots, boxplots show the quartiles and outlier points Multiple boxplots let you compare groups
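The numbers behind a boxplot can be inspected directly with boxplot.stats(). The sketch below uses a made-up data frame; the group labels "A" and "B" and all values are hypothetical.

```r
# Hypothetical lengths for two groups (names and values are made up)
dat <- data.frame(
  length = c(31, 35, 38, 40, 42, 45, 47, 80,
             52, 55, 56, 58, 61, 64, 67, 70),
  type   = rep(c("A", "B"), each = 8)
)

# boxplot.stats() returns the five numbers the box is drawn from, plus outliers
s <- boxplot.stats(dat$length[dat$type == "A"])
s$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
s$out    # points more than 1.5 box lengths beyond the hinges

# One box per group for comparison
boxplot(length ~ type, data = dat, xlab = "Type", ylab = "Length")
```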

24

25

26 Stripchart A stripchart plots the actual values along the y-axis It hides less information, but is not as familiar to many people
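A minimal stripchart sketch in base R, assuming a made-up data frame (group labels and values are hypothetical). Note that stripchart() draws horizontally by default, so vertical = TRUE is needed to put the values on the y-axis, and method = "jitter" keeps tied values from overplotting.

```r
# Hypothetical lengths for two groups (names and values are made up)
dat <- data.frame(
  length = c(31, 35, 38, 40, 42, 45, 47, 80,
             52, 55, 56, 58, 61, 64, 67, 70),
  type   = rep(c("A", "B"), each = 8)
)

# Number of points that should appear in each strip
n_by_type <- table(dat$type)

# One strip per group, values on the y-axis, points jittered sideways
stripchart(length ~ type, data = dat, method = "jitter",
           vertical = TRUE, pch = 16, xlab = "Type", ylab = "Length")
```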

27

28 Kernel Density Plot Uses the data to approximate a smooth distribution; the result varies according to the bandwidth A normal (or other) distribution is centered on each point and then the distributions are summed plot(density(DartPoints$Length))

29 Example
x <- c(53, 59, 62, 63, 65, 67, 69, 71, 72, 77)
hist(x, col="blue", las=1, cex.axis=1.5, cex.lab=1.5)

30 plot(density(x), main="Kernel Density Plot")
rug(x)
xi <- seq(45, 90, .5)
for (i in 1:10) lines(xi, dnorm(xi, mean=x[i], sd=3)/10)
d <- dnorm(x[1], mean=x[1], sd=3)/10
matlines(rbind(x, x), rbind(rep(0,10), rep(d, 10)), lty=3, lwd=2, col="dark gray")
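The "place a normal on each point and sum" description can be checked numerically. The sketch below reuses the ten example values of x and compares a hand-computed average of normal curves with the output of density(). It assumes the default Gaussian kernel and the default bandwidth rule bw.nrd0(), which for these values should come out at roughly 3.5; the grid limits are the ones used for xi.

```r
x <- c(53, 59, 62, 63, 65, 67, 69, 71, 72, 77)  # example values from the slides

bw   <- bw.nrd0(x)             # density()'s default bandwidth rule
grid <- seq(45, 90, by = 0.5)  # same grid as xi in the slides

# Average of one normal curve per data point, each with sd = bw
manual <- sapply(grid, function(g) mean(dnorm(g, mean = x, sd = bw)))

# density() evaluated at the same grid points
d <- density(x, bw = bw, from = 45, to = 90, n = length(grid))

plot(d, main = "density() versus summed kernels")
lines(grid, manual, lty = 2, col = "red")

# Largest gap between the two curves; expected to be tiny because
# density() uses a binned approximation rather than the exact sum
max_diff <- max(abs(d$y - manual))
```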

31

32 # Density plots
# Load DartPoints.RData
par(mfrow=c(2, 2))
# Vary bandwidth
plot(density(DartPoints$Length), cex.main=.8)
plot(density(DartPoints$Length, bw=2), cex.main=.8)
plot(density(DartPoints$Length, bw=6), cex.main=.8)
plot(density(DartPoints$Length, bw=8), cex.main=.8)
# To compare two distributions
par(mfrow=c(1,1))
a <- density(DartPoints$Length[DartPoints$Name=="Darl"])
b <- density(DartPoints$Length[DartPoints$Name=="Pedernales"])
plot(a, main="Dart Points", xlab="Length", ylab="Density", xlim=c(min(a$x, b$x), max(a$x, b$x)), ylim=c(0, max(a$y, b$y)), col="red")
lines(b, col="blue")
legend("topright", levels(DartPoints$Name), lty=1, col=c("red", "blue"))

33

34

35 Violin Plot Combine box plot and kernel density plot (package vioplot) with(DartPoints, vioplot(Length[Name=="Darl"], Length[Name=="Pedernales"], names=levels(Name)))

36

37 Beanplot Add strip plot to violin plot (package beanplot) with(DartPoints, beanplot(Length[Name=="Darl"], Length[Name=="Pedernales"], names=levels(Name)))

38

